946 Hits in 4.7 sec

Speaker Tracking and Identifying Based on Indoor Localization System and Microphone Array

Xiaojie Chen, Yuanchun Shi, Wenfeng Jiang
<span title="">2007</span> <i title="IEEE"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/6fvrx6ji2nauffetsxyi4abwam" style="color: black;">21st International Conference on Advanced Information Networking and Applications Workshops (AINAW&#39;07)</a> </i> &nbsp;
This paper presents a novel multimodal system to track the participants and identify the active speaker in the smart meeting room.  ...  Integrating these two cues from Cicada and microphone arrays into a statistical framework, results in a more robust solution to speaker localization and identification.  ...  The proposed system Our multimodal system is based on indoor localization system called Cicada [5] and microphone array.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/ainaw.2007.341">doi:10.1109/ainaw.2007.341</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/aina/ChenSJ07.html">dblp:conf/aina/ChenSJ07</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/rki3ukx3t5e3lotq3fu3uu7lt4">fatcat:rki3ukx3t5e3lotq3fu3uu7lt4</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20170810071932/http://media.cs.tsinghua.edu.cn/~pervasive/paper/2007/2007AINA-CHEN%20Xiaojie.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/d6/85/d6859848006e7102a443599168e72581395335ff.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/ainaw.2007.341"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> ieee.com </button> </a>

Non-Field-of-View Acoustic Target Estimation in Complex Indoor Environment [chapter]

Kuya Takami, Tomonari Furukawa, Makoto Kumon, Gamini Dissanayake
<span title="">2016</span> <i title="Springer International Publishing"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/lirc5icuifhy7ln5rynpn4ak2e" style="color: black;">Springer Tracts in Advanced Robotics</a> </i> &nbsp;
In this approach, microphones are fixed sparsely in the indoor environment of concern.  ...  This paper presents a new approach which acoustically localizes a mobile target outside the Field-of-View (FOV), or the Non-Field-of-View (NFOV), of an optical sensor, and its implementation to complex  ...  Conclusions This paper has presented a new approach which uses a set of microphones to localize and track a mobile NFOV target, and its applicability and implementation in complex indoor environments.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/978-3-319-27702-8_38">doi:10.1007/978-3-319-27702-8_38</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/kzlydqu6bnfrvj4tvuzn3fpwqe">fatcat:kzlydqu6bnfrvj4tvuzn3fpwqe</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20190220142339/http://pdfs.semanticscholar.org/24d4/56080e5a009ca75b5c16531d3428b94853d4.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/24/d4/24d456080e5a009ca75b5c16531d3428b94853d4.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/978-3-319-27702-8_38"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> springer.com </button> </a>

Microphones' directivity for the localization of sound sources

Piervincenzo Rizzo, Mahdi Tajari, Antonino Spada, Edward M. Carapezza
<span title="2011-05-13">2011</span> <i title="SPIE"> Unattended Ground, Sea, and Air Sensor Technologies and Applications XIII </i> &nbsp;
The method relies on the use of unidirectional microphones and amplitude-based signals' features to extract information about the direction of the incoming sound.  ...  Marzani, and J. Vipperman, "Localization of Sound Sources by Means of Unidirectional Microphones, Meas. Sci.  ...  (b) Photo of the microphone array. (c) Photo of the room and the array-Indoor test. Speaker at location S1.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1117/12.884626">doi:10.1117/12.884626</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/ivrnopx6kbhidm5eqncy4vai3i">fatcat:ivrnopx6kbhidm5eqncy4vai3i</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20190430063426/https://iris.unipa.it/retrieve/handle/10447/78998/78664/8046-2.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/ef/17/ef179742e34e9ab8bc7560e2db75118773367636.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1117/12.884626"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> Publisher / doi.org </button> </a>

CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings [article]

Shinji Watanabe, Michael Mandel, Jon Barker, Emmanuel Vincent, Ashish Arora, Xuankai Chang, Sanjeev Khudanpur, Vimal Manohar, Daniel Povey, Desh Raj, David Snyder, Aswin Shanmugam Subramanian (+9 others)
<span title="2020-05-02">2020</span> <i > arXiv </i> &nbsp; <span class="release-stage" >pre-print</span>
enhancement, speaker diarization, and speech recognition modules.  ...  The new challenge revisits the previous CHiME-5 challenge and further considers the problem of distant multi-microphone conversational speech diarization and recognition in everyday home environments.  ...  After training, run.sh finally calls the inference script (local/decode.sh), which performs speech enhancement, SAD, speaker diarization, and speech recognition based on the trained models.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2004.09249v2">arXiv:2004.09249v2</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/ks7gqko5v5htjmx2fgiiyvg6su">fatcat:ks7gqko5v5htjmx2fgiiyvg6su</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200826070947/https://arxiv.org/pdf/2004.09249v2.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/ea/d6/ead6323f137c2f99ef0ffcfa34fa6eb1c6eca3c6.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener" href="https://arxiv.org/abs/2004.09249v2" title="arxiv.org access"> <button class="ui compact blue labeled icon button serp-button"> <i class="file alternate outline icon"></i> arxiv.org </button> </a>

Daredevil

Ionut Constandache, Sharad Agarwal, Ivan Tashev, Romit Roy Choudhury
<span title="2014-06-03">2014</span> <i title="Association for Computing Machinery (ACM)"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/btnjvwlwczgibccn2pceyhyytu" style="color: black;">ACM SIGMOBILE Mobile Computing and Communications Review</a> </i> &nbsp;
In our system called Daredevil, smartphones emit sound at particular times and frequencies, which are received by microphone arrays.  ...  In this paper, we propose a novel approach using sound source localization (SSL) with microphone arrays to determine where in a room a smartphone is located.  ...  Our system uses sound source localization to calculate the angle between a phone and each microphone array, and then triangulation to calculate the position.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/2636242.2636245">doi:10.1145/2636242.2636245</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/u4a3cl7zdrdhnnwxmk4yxb6omy">fatcat:u4a3cl7zdrdhnnwxmk4yxb6omy</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20160417111623/http://research.microsoft.com:80/pubs/216653/MC2R2014.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/2e/8c/2e8c00a7382226bf6ccb337ae7f772549004c71a.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/2636242.2636245"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> acm.org </button> </a>

Geometry calibration of multiple microphone arrays in highly reverberant environments

Axel Plinge, Gernot A. Fink
<span title="">2014</span> <i title="IEEE"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/rprrdgqig5b43pamdd3yiddohu" style="color: black;">2014 14th International Workshop on Acoustic Signal Enhancement (IWAENC)</a> </i> &nbsp;
Microphone arrays can be used for a number of applications such as speaker diarization and tracking. For these, it is necessary to calibrate their geometry with good precision.  ...  It does not require speakers at the nodes and works well in high reverberation. It was evaluated with real recordings in a smart room.  ...  INTRODUCTION For applications such as speaker diarization and tracking [1], multiple distributed microphone arrays are utilized, whose geometry has to be known.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/iwaenc.2014.6954295">doi:10.1109/iwaenc.2014.6954295</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/iwaenc/PlingeF14.html">dblp:conf/iwaenc/PlingeF14</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/5afle64sqbev3l3vcszybwjicm">fatcat:5afle64sqbev3l3vcszybwjicm</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200321100700/http://patrec.cs.tu-dortmund.de/pubs/papers/Plinge2014-GCM.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/b3/69/b369906ddbd3fba36421bf8cd7325d726946e4a6.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/iwaenc.2014.6954295"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> ieee.com </button> </a>

A System for Multimodal Context-Awareness

Georgios Galatas, Fillia Makedon
<span title="">2013</span> <i title="The Science and Information Organization"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/2yzw5hsmlfa6bkafwsibbudu64" style="color: black;">International Journal of Advanced Computer Science and Applications</a> </i> &nbsp;
More specifically, we use skeletal tracking conducted on the depth images and sound source localization conducted on the audio signals captured by the Kinect sensors to accurately localize and track multiple  ...  The unintrusive devices used are RFID and 3-D audio-visual information from 2 Kinect sensors deployed at various locations of a simulated apartment to continuously track and identify its occupants, thus  ...  It is based on the PrimeSensor design [25] and it incorporates a color camera, a depth sensor and a microphone array. Depth images are acquired using the structured light technique.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.14569/ijacsa.2013.040921">doi:10.14569/ijacsa.2013.040921</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/v7pcminxqbaqxft6jv7n7us5gq">fatcat:v7pcminxqbaqxft6jv7n7us5gq</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20170815211841/http://thesai.org/Downloads/Volume4No9/Paper_21-A_System_for_Multimodal_Context-Awareness.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/a2/5f/a25f1b02c63857482dcaa621f3a52e2b34d8b022.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.14569/ijacsa.2013.040921"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="unlock alternate icon" style="background-color: #fb971f;"></i> Publisher / doi.org </button> </a>

Audio-visual multi-person tracking and identification for smart environments

Keni Bernardin, Rainer Stiefelhagen
<span title="">2007</span> <i title="ACM Press"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/lahlxihmo5fhzpexw7rundu24u" style="color: black;">Proceedings of the 15th international conference on Multimedia - MULTIMEDIA &#39;07</a> </i> &nbsp;
In parallel, speech segmentation, sound source localization and speaker identification are performed using several far-field microphones and arrays.  ...  This paper presents a novel system for the automatic and unobtrusive tracking and identification of multiple persons in an indoor environment.  ...  Speaker localization and identification cues are delivered by a combination of Kalman filter tracking and GMM-based ID using microphone arrays and tabletop microphones.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/1291233.1291388">doi:10.1145/1291233.1291388</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/mm/BernardinS07.html">dblp:conf/mm/BernardinS07</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/nhskt4j34nbq7kbqt4ill6gwvi">fatcat:nhskt4j34nbq7kbqt4ill6gwvi</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20080424163450/http://isl.ira.uka.de/~stiefel/papers/MM07_p661.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/62/3f/623ff17f08e4da832b078f0cb655d2384cfdd0e5.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1145/1291233.1291388"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> acm.org </button> </a>

Multimodal identity tracking in a smart room

Keni Bernardin, Hazim Kemal Ekenel, Rainer Stiefelhagen
<span title="2007-06-14">2007</span> <i title="Springer Nature"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/yubpzzxtazfhzllyyxylnnu7ru" style="color: black;">Personal and Ubiquitous Computing</a> </i> &nbsp;
It relies on a set of fixed and active cameras to track the users and get closeups of their faces for identification, and on several microphone arrays to determine active speakers and steer the attention  ...  The automatic detection, tracking, and identification of multiple people in intelligent environments is an important building block on which smart interaction systems can be designed.  ...  Our system uses a fixed camera for tracking, several T-shaped microphone arrays for speech localization and two active cameras for person identification, and realizes an online, incremental identification  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/s00779-007-0175-y">doi:10.1007/s00779-007-0175-y</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/7d7wg57ecfcrxgoyvkygpzyy3u">fatcat:7d7wg57ecfcrxgoyvkygpzyy3u</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20070824105916/http://isl.ira.uka.de/~stiefel/papers/Bernardin_AIAI2006.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/2a/82/2a8252babd99c20dcb03c6edd1f4ed7419b74c05.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/s00779-007-0175-y"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> springer.com </button> </a>

Multimodal identity tracking in a smart room

Keni Bernardin, Hazim Kemal Ekenel, Rainer Stiefelhagen
<span title="2009-01-08">2009</span> <i title="Springer Nature"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/yubpzzxtazfhzllyyxylnnu7ru" style="color: black;">Personal and Ubiquitous Computing</a> </i> &nbsp;
It relies on a set of fixed and active cameras to track the users and get closeups of their faces for identification, and on several microphone arrays to determine active speakers and steer the attention  ...  The automatic detection, tracking, and identification of multiple people in intelligent environments is an important building block on which smart interaction systems can be designed.  ...  Our system uses a fixed camera for tracking, several T-shaped microphone arrays for speech localization and two active cameras for person identification, and realizes an online, incremental identification  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/s00779-008-0216-1">doi:10.1007/s00779-008-0216-1</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/ycd6z2chafgwtn43qfqscwat3u">fatcat:ycd6z2chafgwtn43qfqscwat3u</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20070824105916/http://isl.ira.uka.de/~stiefel/papers/Bernardin_AIAI2006.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/2a/82/2a8252babd99c20dcb03c6edd1f4ed7419b74c05.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/s00779-008-0216-1"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> springer.com </button> </a>

Towards Computer Understanding of Human Interactions [chapter]

Iain McCowan, Daniel Gatica-Perez, Samy Bengio, Darren Moore, Hervé Bourlard
<span title="">2003</span> <i title="Springer Berlin Heidelberg"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/2w3awgokqne6te4nvlofavy5a4" style="color: black;">Lecture Notes in Computer Science</a> </i> &nbsp;
Based on this view, this article presents an approach in which relevant information content of a meeting is identified from a variety of audio and visual sensor inputs and statistical models of interacting  ...  We also comment on current developments and the future challenges in automatic meeting analysis.  ...  Our approach We have developed a principled method for speaker tracking, fusing information coming from multiple microphones and uncalibrated cameras [22] , based on Sequential Monte Carlo (SMC) methods  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/978-3-540-39863-9_18">doi:10.1007/978-3-540-39863-9_18</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/rbihvoyvafd2pfmfyx4csgd6xi">fatcat:rbihvoyvafd2pfmfyx4csgd6xi</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20170921210155/http://www.idiap.ch/ftp/reports/2003/rr03-45.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/54/df/54df9e7043e1856dee2b2596503bdbfa287fad69.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/978-3-540-39863-9_18"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> springer.com </button> </a>

Towards Computer Understanding of Human Interactions [chapter]

Iain McCowan, Daniel Gatica-Perez, Samy Bengio, Darren Moore, Hervé Bourlard
<span title="">2005</span> <i title="Springer Berlin Heidelberg"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/2w3awgokqne6te4nvlofavy5a4" style="color: black;">Lecture Notes in Computer Science</a> </i> &nbsp;
Based on this view, this article presents an approach in which relevant information content of a meeting is identified from a variety of audio and visual sensor inputs and statistical models of interacting  ...  We also comment on current developments and the future challenges in automatic meeting analysis.  ...  Our approach We have developed a principled method for speaker tracking, fusing information coming from multiple microphones and uncalibrated cameras [22] , based on Sequential Monte Carlo (SMC) methods  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/978-3-540-30568-2_6">doi:10.1007/978-3-540-30568-2_6</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/pxwis6jtbneylf5abpvky2wtbm">fatcat:pxwis6jtbneylf5abpvky2wtbm</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20170921210155/http://www.idiap.ch/ftp/reports/2003/rr03-45.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/54/df/54df9e7043e1856dee2b2596503bdbfa287fad69.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1007/978-3-540-30568-2_6"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> springer.com </button> </a>

Hierarchical audio-visual cue integration framework for activity analysis in intelligent meeting rooms

Shankar T. Shivappa, Mohan M. Trivedi, Bhaskar D. Rao
<span title="">2009</span> <i title="IEEE"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/ilwxppn4d5hizekyd3ndvy2mii" style="color: black;">2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops</a> </i> &nbsp;
Tasks such as person localization and tracking, speaker ID, focus of attention detection, speech recognition and affective state recognition are among them.  ...  The system performs the tasks of person tracking, head pose estimation, beamforming, speaker ID and speech recognition using audio and visual cues.  ...  In [12] , the authors explain the benefits of using video based tracking to enhance speaker localization for effective beamforming.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/cvprw.2009.5204224">doi:10.1109/cvprw.2009.5204224</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/cvpr/ShivappaTR09.html">dblp:conf/cvpr/ShivappaTR09</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/wywnweppbfahhbalu4wh3hzbri">fatcat:wywnweppbfahhbalu4wh3hzbri</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20130617074446/http://cvrr.ucsd.edu/publications/2009/shivappa_CVPR09.pdf" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/0f/00/0f005b09ff392a2ae83e8cc5080a9fdda03a19f9.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/cvprw.2009.5204224"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> ieee.com </button> </a>

Audio-visual sensing from a quadcopter: dataset and baselines for source localization and sound enhancement

Lin Wang, Ricardo Sanchez-Matilla, Andrea Cavallaro
<span title="">2019</span> <i title="IEEE"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/dmucnarmarh2fj6syg5jyqs7ny" style="color: black;">2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)</a> </i> &nbsp;
The dataset was collected using a small circular array with 8 microphones and a camera mounted on the quadcopter.  ...  The dataset includes a scenario for source localization and sound enhancement with up to two static sources, and a scenario for source localization and tracking with a moving sound source.  ...  Based on the role of the onboard microphones and sensors used, these algorithms can be categorized as unsupervised or supervised approaches.  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/iros40897.2019.8968183">doi:10.1109/iros40897.2019.8968183</a> <a target="_blank" rel="external noopener" href="https://dblp.org/rec/conf/iros/0009SC19.html">dblp:conf/iros/0009SC19</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/eownwp3xnjgfbnblvmtjb4rswi">fatcat:eownwp3xnjgfbnblvmtjb4rswi</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200506201758/https://qmro.qmul.ac.uk/xmlui/bitstream/handle/123456789/59302/Wang%20Audio-visual%20sensing%202019%20Accepted.pdf;jsessionid=FCC687508CF03F9822EAFB06D565C0AA?sequence=2" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/62/8d/628d454160f6c2da49c96d3b1a7ea5bc2e821b61.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1109/iros40897.2019.8968183"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="external alternate icon"></i> ieee.com </button> </a>

A Bayesian Approach for Sound Source Estimation

Krittameth Teachasrisaksakul, Surapa Thiemjarus, Chantri Polprasert
<span title="1970-01-01">1970</span> <i title="ECTI"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/25z3g4hfevb2rf4fiw3mnyn47q" style="color: black;">ECTI Transactions on Computer and Information Technology</a> </i> &nbsp;
Based on an experiment with various parameter settings in an indoor environment, different factors that affect the classification accuracy have been analyzed.  ...  Based on time delay of arrival (TDOA) features, a Bayesian approach to sound source direction estimation is proposed.  ...  ASL is an active research topic and has been applied in various speaker-tracking applications, ranging from source localization micro-sensor [16] , camera pointing system for video conferencing [17]  ... 
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.37936/ecti-cit.201372.54370">doi:10.37936/ecti-cit.201372.54370</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/hkccxi2ibve5zk7etqpqmq5gq4">fatcat:hkccxi2ibve5zk7etqpqmq5gq4</a> </span>
<a target="_blank" rel="noopener" href="https://web.archive.org/web/20200421125929/https://ph01.tci-thaijo.org/index.php/ecticit/article/download/54370/45144" title="fulltext PDF download" data-goatcounter-click="serp-fulltext" data-goatcounter-title="serp-fulltext"> <button class="ui simple right pointing dropdown compact black labeled icon button serp-button"> <i class="icon ia-icon"></i> Web Archive [PDF] <div class="menu fulltext-thumbnail"> <img src="https://blobs.fatcat.wiki/thumbnail/pdf/22/bf/22bf03507ea35b93c92d0e6e0d21af1cc24293db.180px.jpg" alt="fulltext thumbnail" loading="lazy"> </div> </button> </a> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.37936/ecti-cit.201372.54370"> <button class="ui left aligned compact blue labeled icon button serp-button"> <i class="unlock alternate icon" style="background-color: #fb971f;"></i> Publisher / doi.org </button> </a>
Showing results 1–15 out of 946 results