305 Hits in 4.6 sec

Multi-Task Self-Supervised Learning for Disfluency Detection [article]

Shaolei Wang, Wanxiang Che, Qi Liu, Pengda Qin, Ting Liu, William Yang Wang
2020 arXiv   pre-print
To tackle the training data bottleneck, we investigate methods for combining multiple self-supervised tasks, i.e., supervised tasks where data can be collected without manual labeling.  ...  First, we construct large-scale pseudo training data by randomly adding or deleting words from unlabeled news data, and propose two self-supervised pre-training tasks: (i) a tagging task to detect the added  ...  Acknowledgments We thank the anonymous reviewers for their valuable comments.  ... 
arXiv:1908.05378v2 fatcat:ypeblqfxiffg3ltdehvadwkige
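The pseudo-data construction described in this entry (randomly adding or deleting words in fluent text, then tagging the added words) can be illustrated with a minimal sketch. The function name, tag scheme, and probabilities below are illustrative assumptions, not the authors' released code.

import random

def make_pseudo_disfluent(tokens, vocab, p_add=0.1, p_del=0.1):
    """Build a pseudo-disfluent sentence from a fluent one.

    Returns (noisy_tokens, tags): tag "ADD" marks randomly inserted words
    (the targets of the tagging pre-training task), "KEEP" marks original
    words; randomly deleted words simply vanish from the output.
    """
    noisy, tags = [], []
    for tok in tokens:
        if random.random() < p_add:      # insert a word drawn from the corpus vocabulary
            noisy.append(random.choice(vocab))
            tags.append("ADD")
        if random.random() < p_del:      # drop the current word
            continue
        noisy.append(tok)
        tags.append("KEEP")
    return noisy, tags

# Toy usage with a made-up vocabulary and sentence.
vocab = ["the", "market", "report", "today", "sharply"]
print(make_pseudo_disfluent("stocks rose sharply after the report".split(), vocab))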

Multi-Task Self-Supervised Learning for Disfluency Detection

Shaolei Wang, Wanxiang Che, Qi Liu, Pengda Qin, Ting Liu, William Yang Wang
2020 Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)
To tackle the training data bottleneck, we investigate methods for combining multiple self-supervised tasks, i.e., supervised tasks where data can be collected without manual labeling.  ...  First, we construct large-scale pseudo training data by randomly adding or deleting words from unlabeled news data, and propose two self-supervised pre-training tasks: (i) a tagging task to detect the added  ...  Acknowledgments We thank the anonymous reviewers for their valuable comments.  ... 
doi:10.1609/aaai.v34i05.6456 fatcat:wnay3lhrbnejrosv4uckclq2oi

Combining Self-Training and Self-Supervised Learning for Unsupervised Disfluency Detection [article]

Shaolei Wang, Zhongyuan Wang, Wanxiang Che, Ting Liu
2020 arXiv   pre-print
There have been several proposals to alleviate this issue with, for instance, self-supervised learning techniques, but they still require human-annotated corpora.  ...  Most existing approaches to disfluency detection rely heavily on human-annotated corpora, which are expensive to obtain in practice.  ...  Acknowledgments We thank the anonymous reviewers for their valuable comments.  ... 
arXiv:2010.15360v1 fatcat:a7nsj33ezbahbbxwkmlhsdivji
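As a rough illustration of the self-training half of this combination: a base classifier is fit on the few labeled examples, pseudo-labels the unlabeled examples it is confident about, and is refit on the enlarged set. The sketch uses scikit-learn's generic SelfTrainingClassifier on synthetic features; the features, threshold, and base model are assumptions, not the paper's actual setup.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

# Synthetic stand-in features; in the paper's setting these would come from
# a self-supervised pre-trained encoder over sentences (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))
y = (X[:, 0] > 0).astype(int)

# Mask most labels to mimic the low-resource regime; -1 marks unlabeled data.
y_semi = y.copy()
y_semi[50:] = -1

# Self-training: pseudo-label unlabeled points above the confidence threshold
# and refit the base classifier on the enlarged training set.
model = SelfTrainingClassifier(LogisticRegression(), threshold=0.9)
model.fit(X, y_semi)
print("accuracy on all data:", model.score(X, y))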

In the Zone: Towards Detecting Student Zoning Out Using Supervised Machine Learning [chapter]

Joanna Drummond, Diane Litman
2010 Lecture Notes in Computer Science  
Standard supervised machine learning techniques were used to create classification models, built on prosodic and lexical features.  ...  This paper explores automatically detecting student zoning out while performing a spoken learning task.  ...  Schunn and Dr. J. Moss for our data, M. Lipschultz and ITSPOKE group for comments, and NSF grant #0631930.  ... 
doi:10.1007/978-3-642-13437-1_53 fatcat:wnq4xtlcm5g4dfrms2n4t75wqa
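The supervised setup sketched in this abstract (classification models built on prosodic and lexical features) might look roughly like the following; the particular features, classifier, and synthetic labels are assumptions for illustration only.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical per-turn features: prosodic (pitch, energy, duration) and
# lexical (word count, filler count). Values and labels here are synthetic.
rng = np.random.default_rng(1)
X = np.column_stack([
    rng.normal(200, 30, 300),    # mean pitch (Hz)
    rng.normal(60, 10, 300),     # mean energy (dB)
    rng.normal(2.5, 1.0, 300),   # turn duration (s)
    rng.integers(1, 20, 300),    # word count
    rng.integers(0, 4, 300),     # filler count ("um", "uh")
])
y = rng.integers(0, 2, 300)      # 1 = zoned out, 0 = attentive (placeholder)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
print("cross-validated accuracy:", cross_val_score(clf, X, y, cv=5).mean())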

Semi-Supervised Disfluency Detection

Feng Wang, Wei Chen, Zhen Yang, Qianqian Dong, Shuang Xu, Bo Xu
2018 International Conference on Computational Linguistics  
While disfluency detection has achieved notable success in recent years, it still suffers severely from data scarcity.  ...  To tackle this problem, we propose a novel semi-supervised approach which can utilize large amounts of unlabelled data.  ...  Acknowledgements The research work is supported by the National Key Research and Development Program of China under Grant No. 2017YFB1002102 and the NSFC project 61702514.  ... 
dblp:conf/coling/WangCYDXX18 fatcat:hjq25zin5zhpbdyrvvxagvxqei

Machine Learning for Stuttering Identification: Review, Challenges and Future Directions [article]

Shakeel Ahmad Sheikh and Md Sahidullah and Fabrice Hirsch and Slim Ouni
2022 arXiv   pre-print
In this paper, we comprehensively review acoustic features and statistical and deep learning based stuttering/disfluency classification methods.  ...  Stuttering identification is an interesting interdisciplinary research problem which involves pathology, psychology, acoustics, and signal processing, which makes it hard and complicated to detect  ...  Experiments presented in this paper were carried out using the Grid'5000 testbed, supported by a scientific interest group hosted by Inria and including CNRS, RENATER and several universities as well as  ... 
arXiv:2107.04057v4 fatcat:ifgl6z5f45getdmzp3rve6bvce

Multimodal explainable AI predicts upcoming speech behavior in adults who stutter

Arun Das, Jeffrey Mock, Farzan Irani, Yufei Huang, Peyman Najafirad, Edward Golob
2022 Frontiers in Neuroscience  
An explainable self-supervised multimodal architecture learned the temporal dynamics of both EEG and facial muscle movements during speech preparation in AWS, and predicted fluent or stuttered speech at  ...  The self-supervised architecture successfully identified multimodal activity that predicted upcoming behavior on a trial-by-trial basis.  ...  Currently, self-supervised learning is used to learn temporal correspondences in videos (Tschannen et al., 2020) , disfluency detection from text to improve annotation (Wang et al., 2019) , and has many  ... 
doi:10.3389/fnins.2022.912798 fatcat:3x43ijcjafctjb4w6magwkd5qu

Analysing the potential of seq-to-seq models for incremental interpretation in task-oriented dialogue [article]

Dieuwke Hupkes, Sanne Bouwmeester, Raquel Fernández
2018 arXiv   pre-print
We investigate how encoder-decoder models trained on a synthetic dataset of task-oriented dialogues process disfluencies, such as hesitations and self-corrections.  ...  Using visualisation and diagnostic classifiers, we analyse the representations that are incrementally built by the model, and discover that models develop little to no awareness of the structure of disfluencies  ...  The presence of editing terms is not reliably identifiable given the hidden layer activations of a model (37.3% and 55.7% precision for self-corrections and restarts, respectively), which is surprising  ... 
arXiv:1808.09178v1 fatcat:aj27r34qgrd2rb6iez2qo6f26u

Analysing the potential of seq-to-seq models for incremental interpretation in task-oriented dialogue

Dieuwke Hupkes, Sanne Bouwmeester, Raquel Fernández
2018 Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP  
We investigate how encoder-decoder models trained on a synthetic dataset of task-oriented dialogues process disfluencies, such as hesitations and self-corrections.  ...  Using visualisations and diagnostic classifiers, we analyse the representations that are incrementally built by the model, and discover that models develop little to no awareness of the structure of disfluencies  ...  The presence of editing terms is not reliably identifiable given the hidden layer activations of a model (37.3% and 55.7% precision for self-corrections and restarts, respectively), which is surprising  ... 
doi:10.18653/v1/w18-5419 dblp:conf/emnlp/HupkesBF18 fatcat:liwfuimz4zgsrovlj2ejomafqm
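The diagnostic classifiers referred to in both versions of this paper are lightweight probes trained on a frozen model's hidden activations to predict a linguistic property (here, whether a token belongs to an editing term). Below is a generic probe sketch with placeholder activations and labels, not the authors' code or data.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score
from sklearn.model_selection import train_test_split

# Placeholder hidden states (e.g., one 512-d decoder activation per token)
# and binary labels marking editing-term tokens.
rng = np.random.default_rng(2)
H = rng.normal(size=(1000, 512))
y = rng.integers(0, 2, 1000)

H_tr, H_te, y_tr, y_te = train_test_split(H, y, test_size=0.2, random_state=0)

# The probe: a linear classifier over frozen activations. Low probe precision
# (as reported in the snippet) indicates the property is not readily
# recoverable from the hidden layer.
probe = LogisticRegression(max_iter=1000).fit(H_tr, y_tr)
print("probe precision:", precision_score(y_te, probe.predict(H_te), zero_division=0))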

Multi-View Semi-Supervised Learning for Dialog Act Segmentation of Speech

U. Guz, S. Cuendet, D. Hakkani-Tur, G. Tur
2010 IEEE Transactions on Audio, Speech, and Language Processing  
We especially focus on two semi-supervised learning approaches, namely, self-training and co-training.  ...  Furthermore, we propose another method, called self-combined, which is a combination of self-training and co-training.  ...  Zimmerman, and M. Magimai Doss for many helpful discussions.  ... 
doi:10.1109/tasl.2009.2028371 fatcat:wpldl6le25g67e6ey4unc2a6ma
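Co-training, one of the two approaches this paper focuses on, trains a separate classifier on each feature view (e.g., prosodic vs. lexical) and lets each classifier pseudo-label its most confident unlabeled examples for the other. The sketch below is a generic two-view loop on synthetic data; the view contents, classifier, and selection size are assumptions, not the paper's configuration.

import numpy as np
from sklearn.linear_model import LogisticRegression

def cotrain(Xa, Xb, y, labeled, rounds=5, k=10):
    """Minimal co-training over two feature views Xa and Xb.

    `labeled` is a boolean mask of initially labeled examples; each round,
    the classifier of each view pseudo-labels the k unlabeled examples it is
    most confident about, and those labels become available to both views.
    """
    y = y.copy()
    clf_a = clf_b = None
    for _ in range(rounds):
        clf_a = LogisticRegression(max_iter=1000).fit(Xa[labeled], y[labeled])
        clf_b = LogisticRegression(max_iter=1000).fit(Xb[labeled], y[labeled])
        for clf, X in ((clf_a, Xa), (clf_b, Xb)):
            unlabeled = np.flatnonzero(~labeled)
            if unlabeled.size == 0:
                return clf_a, clf_b
            conf = clf.predict_proba(X[unlabeled]).max(axis=1)
            pick = unlabeled[np.argsort(conf)[-k:]]     # most confident examples
            y[pick] = clf.predict(X[pick])              # pseudo-labels
            labeled[pick] = True
    return clf_a, clf_b

# Synthetic two-view data standing in for prosodic vs. lexical features.
rng = np.random.default_rng(3)
Xa, Xb = rng.normal(size=(300, 8)), rng.normal(size=(300, 8))
y = (Xa[:, 0] + Xb[:, 0] > 0).astype(int)
labeled = np.zeros(300, dtype=bool)
labeled[:30] = True
cotrain(Xa, Xb, y, labeled.copy())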

Alzheimer's Dementia Recognition From Spontaneous Speech Using Disfluency and Interactional Features

Shamila Nasreen, Morteza Rohanian, Julian Hough, Matthew Purver
2021 Frontiers in Computer Science  
Detecting diagnostic biomarkers that are noninvasive and cost-effective is of great value not only for clinical assessments and diagnostics but also for research purposes.  ...  Our feature analysis comprised two sets: disfluency features, including indicators such as self-repairs and fillers, and interactional features, including overlaps, turn-taking behavior, and distributions  ...  Deletes: We automatically annotated self-repairs using a deep-learning-driven model of incremental disfluency detection developed by Hough and Schlangen (2017). It consists of a deep learning
doi:10.3389/fcomp.2021.640669 fatcat:pt5q3didpvaj5jpjkfa7iurtia
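As a toy illustration of the disfluency feature set mentioned above (fillers and self-repairs), the sketch below counts fillers from a small word list and approximates self-repairs with bracketed reparandum markup. The filler list and markup convention are assumptions; the paper instead annotates self-repairs automatically with an incremental deep-learning detector.

import re

FILLERS = {"um", "uh", "er", "erm"}

def disfluency_features(utterance: str) -> dict:
    """Count simple disfluency indicators in one transcribed utterance.

    Self-repairs are approximated by "[ reparandum + repair ]" markup, a common
    transcription convention; filler counting is a plain dictionary lookup.
    """
    tokens = utterance.lower().split()
    return {
        "fillers": sum(tok.strip(",.") in FILLERS for tok in tokens),
        "self_repairs": len(re.findall(r"\[[^\]]*\+[^\]]*\]", utterance)),
        "words": len(tokens),
    }

print(disfluency_features("I went to the, um, [ to the + to a ] shop yesterday"))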

Analyzing the Robustness of Unsupervised Speech Recognition [article]

Guan-Ting Lin, Chan-Jan Hsu, Da-Rong Liu, Hung-Yi Lee, Yu Tsao
2022 arXiv   pre-print
Unsupervised speech recognition (unsupervised ASR) aims to learn an ASR system from non-parallel speech and text corpora only.  ...  Wav2vec-U has shown promising results in unsupervised ASR by coupling self-supervised speech representations with Generative Adversarial Network (GAN) training, but the robustness of the unsupervised ASR  ...  We thank National Center for High-performance Computing (NCHC) of National Applied Research Laboratories (NARLabs) in Taiwan for providing computational and storage resources.  ... 
arXiv:2110.03509v5 fatcat:vixfqifhn5d75n22dniimdqmeu

Unpaired Cross-lingual Image Caption Generation with Self-Supervised Rewards [article]

Yuqing Song, Shizhe Chen, Yida Zhao, Qin Jin
2019 arXiv   pre-print
In this paper, we propose to generate cross-lingual image captions with self-supervised rewards in a reinforcement learning framework to alleviate these two types of errors.  ...  We employ self-supervision from a mono-lingual corpus in the target language to provide a fluency reward, and propose a multi-level visual semantic matching model to provide both sentence-level and concept-level  ...  [25] except that we use self-supervision with respect to fluency and relevancy as rewards for model learning.  ... 
arXiv:1908.05407v1 fatcat:6s56mckd35d2bjhpm7o2itdjla

Introducing ECAPA-TDNN and Wav2Vec2.0 Embeddings to Stuttering Detection [article]

Shakeel Ahmad Sheikh, Md Sahidullah, Fabrice Hirsch, Slim Ouni
2022 arXiv   pre-print
The adoption of advanced deep learning (DL) architectures in stuttering detection (SD) tasks is challenging due to the limited size of the available datasets.  ...  After extracting the embeddings, we benchmark with several traditional classifiers, such as k-nearest neighbor, Gaussian naive Bayes, and a neural network, for the stuttering detection task.  ...  Wav2Vec2.0 contextual embeddings The Wav2Vec2.0 model is a self-supervised representation learning framework for raw audio, and comprises three modules including a feature encoder f : X → Z, a contextual
arXiv:2204.01564v1 fatcat:rx6re6p5erh7tl7co7w4rzpyuy
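The embedding-plus-classifier pipeline this entry describes can be sketched roughly as follows: clip-level embeddings are mean-pooled from a frozen Wav2Vec2.0 encoder and fed to simple classifiers. The checkpoint name, pooling choice, and the synthetic audio and labels are assumptions for illustration, not the paper's exact configuration.

import numpy as np
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Frozen self-supervised encoder; the checkpoint name is an assumption.
extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base").eval()

def embed(waveform, sr=16000):
    """Mean-pool Wav2Vec2.0 frame representations into one clip embedding."""
    inputs = extractor(waveform, sampling_rate=sr, return_tensors="pt")
    with torch.no_grad():
        frames = encoder(**inputs).last_hidden_state   # shape (1, T, 768)
    return frames.mean(dim=1).squeeze(0).numpy()

# Synthetic 1-second clips standing in for stuttered / fluent speech segments.
rng = np.random.default_rng(4)
X = np.stack([embed(rng.normal(size=16000).astype(np.float32)) for _ in range(8)])
y = np.array([0, 1, 0, 1, 0, 1, 0, 1])                 # placeholder class labels

for clf in (KNeighborsClassifier(n_neighbors=3), GaussianNB()):
    print(type(clf).__name__, clf.fit(X, y).score(X, y))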

Automated Curriculum Learning for Turn-level Spoken Language Understanding with Weak Supervision [article]

Hao Lang, Wen Wang
2019 arXiv   pre-print
Furthermore, considering the diversity of problem complexity, we explore automated curriculum learning (CL) for weak supervision to accelerate exploration and learning.  ...  We propose a learning approach for turn-level spoken language understanding, which allows a user to speak one or more utterances compositionally in a turn to complete a task (e.g., voice ordering  ...  Traditionally, SLU is performed on sentences generated by voice activity detection on user queries.  ... 
arXiv:1906.04291v1 fatcat:jro2hpn2gjatrlimyqcwv7xlne
Showing results 1 — 15 out of 305 results