Incorporation of phonetic constraints in acoustic-to-articulatory inversion

Blaise Potard, Yves Laprie, Slim Ouni
2008 Journal of the Acoustical Society of America  
This study investigates the use of constraints upon articulatory parameters derived from standard phonetic knowledge in the context of acoustic-to-articulatory inversion. These speaker-independent "phonetic" constraints are introduced and investigated in an existing inversion framework. The validity of these constraints is assessed by comparing synthetic vocal tract shapes and real vocal tract shapes obtained from X-ray images. Beyond the scope of phonetic constraints, this study also provides an extensive exploration of the acoustical properties of Maeda's articulatory model.

Acoustic-to-articulatory inversion remains an open challenge in speech analysis. Although there is a wide range of potential applications, there is as yet no clear answer to whether inversion is possible for all the sounds of speech [19]. However, numerical simulations exist that cover both the articulatory and the acoustical phenomena involved in speech production and that enable the synthesis of artificial acoustic signals close to natural speech. These tools, especially those generating a speech spectrum, are often used to perform inversion. Indeed, most of the existing approaches to acoustic-to-articulatory inversion are analysis-by-synthesis methods. The key difficulty is that infinitely many vocal tract shapes can produce any given spectrum. In order to reduce the number of inverse solutions, methods of acoustic-to-articulatory inversion incorporate explicit or implicit constraints. Sorokin [20], for instance, presents seven possible kinds of constraints: limitations in the contractive force of the muscles involved in speech production; the anatomy of the vocal tract or, equivalently, the ranges of the articulatory parameters; interdependencies between muscles, i.e. interdependent variations of the articulatory parameters; the interdependency between the transversal and mid-sagittal dimensions of the vocal tract; aerodynamic constraints with respect to the kinds of sound produced; the level of acoustical deviation tolerated between analyzed and resynthesized sounds according to the style and rate of speech; and lastly, a constraint concerning the complexity of planning and programming of the articulatory control.
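The non-uniqueness described above, and the way a constraint prunes the solution set, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the forward model is a toy uniform tube with two hypothetical section lengths (`back_cm`, `front_cm`), not Maeda's articulatory model, and the constraint on the front section is invented, not one of the phonetic constraints studied in the paper.

```python
# Toy analysis-by-synthesis inversion (illustrative only; not Maeda's model).
SPEED_OF_SOUND_CM_S = 35000.0  # approximate speed of sound in warm, moist air


def synthesize_f1(back_cm, front_cm):
    """Hypothetical forward model: first resonance (Hz) of a uniform tube
    closed at one end, whose length is the sum of two section lengths."""
    return SPEED_OF_SOUND_CM_S / (4.0 * (back_cm + front_cm))


def invert(target_f1_hz, tol_hz, candidates, constraint=None):
    """Keep every candidate whose synthesized F1 lies within tol_hz of the
    target and which, if a constraint is given, also satisfies it."""
    solutions = []
    for back_cm, front_cm in candidates:
        if abs(synthesize_f1(back_cm, front_cm) - target_f1_hz) > tol_hz:
            continue
        if constraint is not None and not constraint(back_cm, front_cm):
            continue
        solutions.append((back_cm, front_cm))
    return solutions


# Candidate grid: each section length from 5.0 to 12.0 cm in 0.5 cm steps.
lengths = [5.0 + 0.5 * i for i in range(15)]
grid = [(b, f) for b in lengths for f in lengths]

# "Observed" acoustics: F1 of a 17 cm tube, split here as 9 cm + 8 cm.
target = synthesize_f1(9.0, 8.0)

# Without constraints, every split of the same total length is a solution.
unconstrained = invert(target, tol_hz=5.0, candidates=grid)

# Assumed constraint (invented for this sketch): front section under 7 cm.
constrained = invert(target, 5.0, grid, constraint=lambda b, f: f < 7.0)

print(len(unconstrained), len(constrained))
```

Because the toy forward model depends only on the total length, many parameter pairs reproduce the same resonance; restricting one parameter's range leaves a smaller but still non-unique set, which is the role the constraints listed above play in a realistic inversion.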
Some of these constraints, those upon articulatory parameters and, to a certain extent, those upon the transversal dimension estimated from the mid-sagittal profile of the vocal tract, can be incorporated directly in the analyzing model, in the form of an articulatory model. They can rely only on pure geometrical primitives, like those

* Electronic address: Yves.Laprie@loria.fr
† Electronic address: Blaise.Potard@loria.fr
doi:10.1121/1.2885747 pmid:18397035 fatcat:sw6on7b4jbflnou7mlxj4ckmte