Rake, Peel, Sketch: The Signal Processing Pipeline Revisited

Robin Scheibler
2017
There was a table set out under a tree in front of the house, and the March Hare and the Hatter were having tea at it: a Dormouse was sitting between them, fast asleep, and the other two were using it as a cushion, resting their elbows on it, and talking over its head.
Alice's Adventures in Wonderland, Lewis Carroll

Martin, thank you for the amazing journey. I came for the freedom to conduct my own research and was not disappointed! I am grateful for your supervision and the trust you gave me.
Even if at times I felt like I was fumbling in the dark, you had new insights and long-term visions to put me back on track whenever we met. Thank you for tolerating my short attention span and my multiple side projects. And, finally, thanks for bringing together such great people in the lab. It was an honor and a pleasure to be part of it!

Next I would like to thank my thesis committee, Laurent Daudet, Nobutaka Ono, Nisheeth Vishnoi, and its president Olivier Lévêque, for taking the time to thoroughly read my thesis and provide critical feedback, as well as new ideas.

Serendipity, and sometimes a gentle push from Martin, allowed me to collaborate with a number of brilliant researchers without whom this thesis would be much blander. Thank you Saeid, I was lucky to meet you early on. The quiet strength with which you conduct research is an example for me. Then, I have to thank Ivan for counterbalancing this quietness by bringing his wild and contagious enthusiasm, in research, brewing, and everything else, to our common endeavors. René, thank you for partnering on the crazy microphone array project. There is a lot for me to learn from your meticulous approach to embedded systems and your experience supervising student projects. Thanks Hanjie for an infinite rate of FRI goodness, great breakfasts in Brisbane, and for occasionally indulging my love of hamburgers. Thank you Eric for believing in the demo and bringing so much energy and passion to it.

Next, there are a few mentors I need to thank for their guidance. Amina for introducing me to the LCAV way of doing research, teaching me how to write and make presentations, and generally showing me what it means to care for one's students. Dirk Schröder for his help and deep wisdom in acoustics and experiments, his strong and honest opinions, and the ski and hiking trips.

One characteristic of life at LCAV is the constant stimulation provided by its collection of singular but very likable characters.
Thank you Juri, after a great start in Zürich, it was great fun to meet again and spend some time in the office next door. Thanks Marta for always bringing a fresh perspective and teaching me that Spanish chiringuitos are better than Swiss ones (obvious in hindsight)! Mihailo for the late night drinks and dinners (not before 10pm) and all the stories. Paolo for sharing advice on LaTeX and spare bicycle parts. Thanks Frederike for joining us and the robot, and then deciding to stick around. Hesam, my longest running office mate, for improving our shared space with beautiful plants and interesting discussions. And thank you to everyone else who made this journey so enjoyable: Thanks for all the good times, the ICASSP dinners, the parties, and the coffee breaks. While it is sad to part ways, I also know that wherever I go, there will always be a friend close by!

One of the highlights of my PhD time was to be able to work with many great students on a variety of fun projects. A big thank you to Sidney, Ivan, Thomas, Juan, Basile, and Corentin. Your hard work and dedication made it possible to create fantastic microphone arrays and a great demo!

During the realization of the microphone arrays and the demonstration hardware, I was very lucky to benefit from the invaluable help and experience of André Guignard, André Badertscher, and Peter Brühlmeier.

Thank you Sachiko for inviting me to collaborate on the Biodesign for the Real World project. It was great fun to step out of my comfort zone to work with these pesky biological organisms. Thanks also to all the students who trusted us enough to spend a semester on this strange looking project. Thanks to the Hackuarium community for welcoming Biodesign and being such a great place to hang out and dream new projects.

At home, I cannot thank Risa enough. For following me here and bearing life without karaoke and izakaya for four years.
For being a hundred percent supportive of anything I do, and always understanding of my antics. Thanks for being here with me. For the last two years, we had the addition of the fantastic Rémi. Thank you for brightening our life and making sure there is not a single boring day. Special thanks to my mother-in-law Tokuko who kindly hosted me in snowy Tohoku while I was writing this thesis.

Last but not least, I would like to thank my family. My fearless older sister Sophie for showing the way one conquers the world. My parents Josiane and André for their unconditional love and support, for nurturing my curiosity and sense of wonder, and teaching me that the unknown is not to be feared. Thank you for everything.

Abstract

The prototypical signal processing pipeline can be divided into four blocks: representation of the signal in a basis suitable for processing; enhancement of the meaningful part of the signal and noise reduction; estimation of important statistical properties of the signal; and adaptive processing to track and adapt to changes in the signal statistics. This thesis revisits each of these blocks and proposes new algorithms, borrowing ideas from information theory, theoretical computer science, or communications.

First, we revisit the Walsh-Hadamard transform (WHT) for the case of a signal sparse in the transformed domain, that is, one with only K ≤ N non-zero coefficients. We show that an efficient algorithm exists that can compute these coefficients in O(K log₂(K) log₂(N/K)) time using only O(K log₂(N/K)) samples. This algorithm relies on a fast hashing procedure that computes small linear combinations of transformed domain coefficients. A bipartite graph is formed with linear combinations on one side, and non-zero coefficients on the other. A peeling decoder is then used to recover the non-zero coefficients one by one. A detailed analysis of the algorithm based on error correcting codes over the binary erasure channel is given.
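As a rough illustration of the hashing step only (not the thesis's full algorithm, and without the peeling decoder), the sketch below checks the aliasing property that makes such hashing cheap: taking the WHT of just the first B time-domain samples sums the spectrum coefficients whose indices agree modulo B. The subsampling pattern, the function names `fwht` and `hash_bins`, and the toy sizes are assumptions chosen for this example.

```python
def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform (natural/Hadamard order)."""
    a = list(x)
    n = len(a)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                u, v = a[i], a[i + h]
                a[i], a[i + h] = u + v, u - v  # butterfly
        h *= 2
    return a


def hash_bins(x, num_bins):
    """Hash the spectrum of x into num_bins bins from num_bins samples.

    Assumption for this sketch: we subsample by keeping the first
    num_bins samples. Bin j then equals the sum of all spectrum
    coefficients X[k] with k % num_bins == j.
    """
    n = len(x)
    sub = x[:num_bins]
    # The small transform carries a factor num_bins / n; undo it.
    return [v * n / num_bins for v in fwht(sub)]


# Toy example: N = 16, two non-zero coefficients in the WHT domain.
N, B = 16, 4
X = [0.0] * N
X[3], X[9] = 5.0, -2.0

x = [v / N for v in fwht(X)]  # inverse WHT gives the time-domain signal
bins = hash_bins(x, B)        # uses only B = 4 of the 16 samples
print(bins)
```

Here index 3 falls in bin 3 and index 9 in bin 1, so each non-zero coefficient sits alone in its bin. A bin that receives several non-zero coefficients holds their sum, which is the situation a peeling decoder resolves by combining several different hashes.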
The second chapter is about beamforming. Inspired by the rake receiver from wireless communications, we recognize that echoes in a room are an important source of extra signal diversity. We extend several classic beamforming algorithms to take advantage of echoes and also propose new optimal formulations. We explore formulations both in time and frequency domains. We show theoretically and in numerical simulations that the signal-to-interference-and-noise ratio increases proportionally to the number of echoes used. Finally, beyond objective measures, we show that echoes also directly improve speech intelligibility as measured by the perceptual evaluation of speech quality (PESQ) metric.

Next, we attack the problem of estimating the direction of arrival of acoustic sources, to which we apply a robust finite rate of innovation reconstruction framework. FRIDA, the resulting algorithm, exploits wideband information coherently, works at very low signal-to-noise ratio, and can resolve very close sources. The algorithm can use either raw microphone signals or their cross-correlations. While the former lets us work with correlated sources, the latter creates a quadratic number of measurements that allows locating many sources with few microphones. Thorough experiments on simulated and recorded data show that FRIDA compares favorably with the state of the art.

We continue by revisiting the classic recursive least squares (RLS) adaptive filter with ideas borrowed from recent results on sketching least squares problems. The exact update of RLS is replaced by a few steps of conjugate gradient descent. We then propose two different preconditioners, obtained by sketching the data, to accelerate the convergence of the gradient descent. Experiments on artificial as well as natural signals show that the proposed algorithm has a performance very close to that of RLS at a lower computational burden.
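To make the idea of replacing the exact RLS update with a few conjugate gradient steps more concrete, here is a minimal pure-Python sketch. It is not the thesis's algorithm: it omits the sketched preconditioners entirely, and the function names, the forgetting factor, the regularization, and the number of steps per sample are all assumptions made for illustration. At each sample it updates the exponentially weighted normal equations R w = p and runs a couple of warm-started conjugate gradient steps instead of solving them exactly.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(A, v):
    return [dot(row, v) for row in A]


def cg_steps(A, b, w, n_steps):
    """Run at most n_steps of conjugate gradient on A w = b, starting at w."""
    w = list(w)
    r = [bi - ci for bi, ci in zip(b, matvec(A, w))]  # residual
    p = list(r)                                       # search direction
    rs = dot(r, r)
    for _ in range(n_steps):
        if rs < 1e-30:  # already converged
            break
        Ap = matvec(A, p)
        alpha = rs / dot(p, Ap)
        w = [wi + alpha * pi for wi, pi in zip(w, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return w


def adaptive_cg_filter(inputs, desired, order, forget=0.99, n_steps=2, reg=1e-3):
    """Track the weighted least squares filter with a few CG steps per sample."""
    R = [[reg if i == j else 0.0 for j in range(order)] for i in range(order)]
    p = [0.0] * order
    w = [0.0] * order
    buf = [0.0] * order  # most recent input samples, newest first
    for u, d in zip(inputs, desired):
        buf = [u] + buf[:-1]
        for i in range(order):
            p[i] = forget * p[i] + d * buf[i]
            for j in range(order):
                R[i][j] = forget * R[i][j] + buf[i] * buf[j]
        w = cg_steps(R, p, w, n_steps)  # warm start from the previous filter
    return w


def fir(inputs, taps):
    """Filter inputs with the given FIR taps (same buffer convention as above)."""
    buf = [0.0] * len(taps)
    out = []
    for u in inputs:
        buf = [u] + buf[:-1]
        out.append(dot(taps, buf))
    return out


# Identify a hypothetical 4-tap system from white noise, without observation noise.
random.seed(0)
true_w = [0.5, -0.3, 0.2, 0.1]
u = [random.gauss(0.0, 1.0) for _ in range(500)]
d = fir(u, true_w)
w_hat = adaptive_cg_filter(u, d, order=len(true_w))
print(w_hat)
```

In this plain form every sample still costs a quadratic number of operations in the filter length because R is updated densely. The sketch only shows the structure of the iteration (warm start plus a small, fixed number of steps) and makes no claim about the computational savings reported in the thesis.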
The fifth and final chapter is dedicated to the software and hardware tools developed for this thesis. We describe the pyroomacoustics Python package that contains routines for the evaluation of audio processing algorithms and reference implementations of popular algorithms. We then give an overview of the microphone arrays developed and used for the experimental validation of FRIDA. We use this as an opportunity to start a discussion on the challenges of reproducible research at a global level. We conclude with a modest proposal.
doi:10.5075/epfl-thesis-7651 fatcat:36ckogcv3nf4fdz3lx2zzihdre