An analysis-by-synthesis approach to vocal tract modeling for robust speech recognition

Ziad Al Bawab
2012 Qatar Foundation Annual Research Forum Proceedings  
Words cannot describe how grateful I am to my mentor, Richard M. Stern, for the support and confidence he provided me on this long journey. Rich believed in my ability to perform genuine research from day one. He assigned a very challenging research problem to me, and after six years we produced this thesis together. Rich was more than an advisor to me. I enjoyed learning from his wisdom and experience in life as much as I enjoyed learning and deeply understanding the basic issues related to signal processing and speech recognition from him. I enjoyed meeting with Rich on a weekly basis and discussing progress on research. In addition, I enjoyed discussing many other subjects with Rich, including academia, life, careers, politics, and cultures. He was always available to chat. Bhiksha Raj has been a big brother to me, one I turned to for inspiration and motivation at all times. Most of the ideas you see in this thesis have stemmed from discussions with him. This work could not have become a reality without Bhiksha's help. Bhiksha's positive attitude to life and research problems is unique. Whenever I felt down, Bhiksha was there to motivate me to pursue my ideas. Bhiksha was always available for discussions. We started this work over Skype and PowerPoint while Bhiksha was in Boston working for Mitsubishi Electric Research Labs (MERL). I am very grateful for Bhiksha's flexibility and availability. He is a main architect of this thesis. Bhiksha Raj and Rita Singh have been a great help to me on the speech recognition problem I wanted to solve. Together they are an encyclopedia of ideas related to speech recognition and have contributed to this field for more than a decade now. I feel lucky to have had them around at Carnegie Mellon University (CMU) during my stay. I am also grateful to Lorenzo Turicchia and to Sankaran Panchapagesan (Panchi) for their collaboration. Their expertise on articulatory synthesis and geometric modeling was very helpful to this thesis. Panchi's help in forwarding the articulatory synthesis package was the basic starting point for this work. In addition to Rich, the chair of my PhD committee, and to Bhiksha, I would also like to thank the other members of the committee, Alan W. Black and Tsuhan Chen, for their time and collaboration on this piece of work. I have enjoyed the time I spent at CMU with my robust speech lab colleagues. I am thankful to
doi:10.5339/qfarf.2012.aesnp6 fatcat:awqcncfewvaytnkfstmvehwcvm