Using AI to Hack IA: A New Stealthy Spyware Against Voice Assistance Functions in Smart Phones [article]

Rongjunchen Zhang, Xiao Chen, Jianchao Lu, Sheng Wen, Surya Nepal, Yang Xiang
2018 arXiv   pre-print
We suggest to revise the activation logic of voice assistant to be resilient to the speaker based attack.  ...  We propose an attacking framework, which records the activation voice of the user, and launch the attack by playing the activation voice and attack commands via the built-in speaker.  ...  We apply real-time speech recognition to synthesize the target activation keywords.  ... 
arXiv:1805.06187v1 fatcat:5puavld4bvbkjb2qxyky6sk2b4

Speech Assistant System With Local Client and Server Devices to Guarantee Data Privacy

Hans-Günter Hirsch
2022 Frontiers in Computer Science  
Users of speech assistant systems have reservations about the distributed approach of these systems.  ...  Besides activating a client by a sensor that detects approaching people, the recognition of a spoken wake-up word is the usual way for activation.  ...  In our application, where we want to apply the keyword detection for the activation of a home automation system, we prioritized lowering the FAR.  ... 
doi:10.3389/fcomp.2022.778367 dblp:journals/fcomp/Hirsch22 fatcat:4wokitax3bfltj5fe6dq5iw3ay

Keyword Spotting for Hearing Assistive Devices Robust to External Speakers

Iván López-Espejo, Zheng-Hua Tan, Jesper Jensen
2019 Interspeech 2019  
Keyword spotting (KWS) is experiencing an upswing due to the pervasiveness of small electronic devices that allow interaction with them via speech.  ...  For applications like KWS for hearing assistive devices this is unacceptable, as only the user must be allowed to handle them.  ...  This technology has become a popular research topic as it is considered a keystone for voice-based activation of virtual assistants (e.g., smart speakers) by means of keywords or wake-up-words [1] .  ... 
doi:10.21437/interspeech.2019-2010 dblp:conf/interspeech/Lopez-EspejoT019 fatcat:6s3vne6vubdihpf27u6tq67xqm

Keyword Spotting Using Human Electrocorticographic Recordings

Griffin Milsap, Maxwell Collard, Christopher Coogan, Qinwan Rabbani, Yujing Wang, Nathan E. Crone
2019 Frontiers in Neuroscience  
of modern voice-activated AI assistant technologies.  ...  Neural vocal activity detection (VAD) was used to identify utterance times and a discriminative classifier was used to determine if these utterances were the keyword or non-keyword speech.  ...  Kanas et al. (2014) used high frequency content of speech-active areas of the brain to perform voice-activity-detection, or VAD, segmenting periods of speech from non-speech periods.  ... 
doi:10.3389/fnins.2019.00060 pmid:30837823 pmcid:PMC6389788 fatcat:m6ya26ctq5g4zjboebw466zlaa

Aware: Intuitive Device Activation Using Prosody for Natural Voice Interactions

Xinlei Zhang, Zixiong Su, Jun Rekimoto
2022 CHI Conference on Human Factors in Computing Systems  
Figure 1: A demonstration of Aware. a) A person is introducing a smart speaker "Alexa" to another person. b) From the prosody patterns, the device knows that he is not calling it so no responses are provided  ...  . c) The person is calling the device. d) The device determined that he is calling him this time, so voice responses are given.  ...  We want to give special thanks to Ryotaro Miura for helping with the early exploration of this study, and Wanhui Li for helping with the data analysis.  ... 
doi:10.1145/3491102.3517687 fatcat:ngn65dgcw5hm7dpxausyxzqmmm

An End-to-End Architecture for Keyword Spotting and Voice Activity Detection [article]

Chris Lengerich, Awni Hannun
2016 arXiv   pre-print
We propose a single neural network architecture for two tasks: on-line keyword spotting and voice activity detection.  ...  keyword spotting and voice activity detection without retraining.  ...  Voice activity detection (VAD) requires detecting human speech in the signal, often for the purpose of endpointing in a large vocabulary speech recognition system.  ... 
arXiv:1611.09405v1 fatcat:wquj37wolnhcxg62y5vxf5b45y

Speech Processing for Digital Home Assistants: Combining signal processing with deep-learning techniques

Reinhold Haeb-Umbach, Shinji Watanabe, Tomohiro Nakatani, Michiel Bacchiani, Bjorn Hoffmeister, Michael L. Seltzer, Heiga Zen, Mehrez Souden
2019 IEEE Signal Processing Magazine  
The purpose of this tutorial article is to describe, in a way amenable to the non-specialist, the key speech processing algorithms that enable reliable fully hands-free speech interaction with digital  ...  , high-quality speech synthesis, as well as sophisticated statistical models for speech and language, learned from large amounts of heterogeneous training data.  ...  In one approach that has been proposed in [46] , a voice activity detection (VAD) is used in a first step to reduce computation, so that the search for a keyword is only conducted if speech has been detected  ... 
doi:10.1109/msp.2019.2918706 fatcat:h4hfdjvqk5cglint4246weob2i

A Voice-Activated Switch for Persons with Motor and Speech Impairments: Isolated-Vowel Spotting Using Neural Networks

Shanqing Cai, Lisie Lillianfeld, Katie Seaver, Jordan R. Green, Michael P. Brenner, Philip C. Nelson, D. Sculley
2021 Conference of the International Speech Communication Association  
Severe speech impairments limit the precision and range of producible speech sounds.  ...  Preliminary user testing indicates the vowel spotter has the potential to be a useful and flexible emergency communication channel for motor- and speech-impaired individuals.  ...  For the call-for-assistance use case, this is particularly important at nighttime.  ... 
doi:10.21437/interspeech.2021-330 dblp:conf/interspeech/CaiLSGBNS21 fatcat:zojx4cwqezd7djo6thsvqadbue

Deep Spoken Keyword Spotting: An Overview

Ivan Lopez-Espejo, Zheng-Hua Tan, John Hansen, Jesper Jensen
2021 IEEE Access  
VOICE ACTIVATION OF VOICE ASSISTANTS The flagship application of (deep) KWS is the activation of C.  ...  [158], this has not been the case for KWS. 10 For example, activation of voice assistants typically takes place at home Regardless of the chosen approach, the DNN front-end and  ... 
doi:10.1109/access.2021.3139508 fatcat:i4pfpfxcpretlkbefp7owtxcti

End-to-End Transformer-Based Open-Vocabulary Keyword Spotting with Location-Guided Local Attention

Bo Wei, Meirong Yang, Tao Zhang, Xiao Tang, Xing Huang, Kyuhong Kim, Jaeyun Lee, Kiho Cho, Sung-Un Park
2021 Conference of the International Speech Communication Association  
Secondly, we calculate the existence probability of keyword by fusing the located keyword speech segment and text with local attention.  ...  Open-vocabulary keyword spotting (KWS) aims to detect arbitrary keywords from continuous speech, which allows users to define their personal keywords.  ...  We argue that the speech segment containing keyword contributes most to the detection of the existence of keyword. Other irrelevant parts of the speech may introduce interference information.  ... 
doi:10.21437/interspeech.2021-1335 dblp:conf/interspeech/WeiYZTHKLCP21 fatcat:qjlsmvsb2zhgfh4plkwkyjuuqu

Multilevel structured convolution neural network for speech keyword location and recognition: MSS‐Net

Yongliang Feng
2021 The Journal of Engineering  
When valid audio is detected, the second level is activated: the keyword detection module.  ...  The first level is the voice start and end detection module, which is responsible for detecting the start and end positions of valid audio.  ...  For example, the voice assistant "Siri" of Apple mobile phones has different research directions for voice signals in different application scenarios.  ... 
doi:10.1049/tje2.12066 fatcat:23vpttz37vd2hnnmjb32xlwlya

Call-Center Virtual Assistant Using Natural Language Processing and Speech Recognition

Andrei Vasilateanu, Razvan Ene
2018 Journal of ICT Design Engineering and Technological Science  
Call center assistance is one of the many domains of activity that could be enabled by Artificial Intelligence. Enter Customer Recommended Interaction Software (CRIS).  ...  The dashboard contains elements such as call-category, based on detected keywords, and sentiment analysis.  ...  DIRECTION FOR FUTURE RESEARCH The next step is choosing a couple of existing knowledge bases for call-center-specific activity domain keywords (i.e., a knowledge base for each identified call nature from  ... 
doi:10.33150/jitdets-2.2.3 fatcat:5uiq6vmusbdk5ddfdoqr2hu3uq

Self-talk Discrimination in Human–Robot Interaction Situations for Supporting Social Awareness

Jade Le Maitre, Mohamed Chetouani
2013 International Journal of Social Robotics  
Being aware of the presence, activities and is fundamental for Human-Robot Interaction and assistive applications.  ...  In this paper, we describe (1) designing triadic situations for cognitive stimulation for elderly users; (2) characterizing social signals that describe social context: system directed speech (SDS) and  ...  Acknowledgements The authors would like to thank the Broca hospital for their work: Ya-Huei Wu, Christine Fassert, Victoria Cristancho-Lacroix and Anne-Sophie Rigaud.  ... 
doi:10.1007/s12369-013-0179-x fatcat:yh63fosbsna4lmwsgtz5wugggq

An end-to-end eChronicling System for Mobile Human Surveillance

Gopal Pingali, Ying-Li Tian, Shahram Ebadollahi, Jason Pelecanos, Mark Podlaseck, Harry Stavropoulos
2007 2007 IEEE Conference on Computer Vision and Pattern Recognition  
The authors would like to acknowledge the input and influence of the rest of the people in the EC-ASSIST project team at IBM, Georgia Tech, MIT, and UC Irvine.  ...  The authors specially thank Milind Naphade for providing the models trained for LSCOMLite ontology.  ...  For example, detection of an "explosive sound" or "visible fire" is an elemental event in our terminology while detection of a "fuel gas incident of type propane involving toxic release with high severity  ... 
doi:10.1109/cvpr.2007.383527 dblp:conf/cvpr/PingaliTEPPS07 fatcat:k6jbjrmltnc7hdqqzoubi2otyu

Development of a framework for a collaborative and personalised voice assistant

Sangeetha Manoharan, Parth Natu
2021 Electronic Government, an International Journal  
Based on the type of intent, the NLP unit passes to one of the two services Google Assistant or Amazon Alexa.  ...  There is a need for seamless integration of high level general purpose voice assistants such as Google Assistant and Amazon Alexa under a single framework.  ...  The algorithm explaining how a hotword detection works is given as follows: Speech detection algorithm The speech detection unit uses a speech detection algorithm, where the speech signal is sampled  ... 
doi:10.1504/eg.2021.112935 fatcat:7quvbtl7v5hwbie4fud2snwfty