Improving Non-native Word-level Pronunciation Scoring with Phone-level Mixup Data Augmentation and Multi-source Information [article]

Kaiqi Fu, Shaojun Gao, Kai Wang, Wei Li, Xiaohai Tian, Zejun Ma
2022 arXiv   pre-print
In this paper, we propose a phone-level mixup, a simple yet effective data augmentation method, to improve the performance of word-level pronunciation scoring.  ...  Moreover, we utilize multi-source information (e.g., MFCC and deep features) to further improve the scoring system performance.  ...  Effect of Multi-Source Information for Scoring: Then, we examined word-level pronunciation scoring performance over different inputs, e.g., MFCC, deep, and multi-source features.  ... 
arXiv:2203.01826v1 fatcat:rqbkvjltmvel5aoelngfxrucpm

NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation [article]

Kaustubh D. Dhole, Varun Gangal, Sebastian Gehrmann, Aadesh Gupta, Zhenhao Li, Saad Mahamood, Abinaya Mahendiran, Simon Mille, Ashish Srivastava, Samson Tan, Tongshuang Wu, Jascha Sohl-Dickstein (+113 others)
2021 arXiv   pre-print
Data augmentation is an important component in the robustness evaluation of models in natural language processing (NLP) and in enhancing the diversity of the data they are trained on.  ...  In this paper, we present NL-Augmenter, a new participatory Python-based natural language augmentation framework which supports the creation of both transformations (modifications to the data) and filters  ...  Each word with a to negative sentiment data. For non-labelled data, common misspelling is replaced by the version it adds neutral smileys.  ... 
arXiv:2112.02721v1 fatcat:uqizuxc4wzgxnnfsc6azh6ckpq

A Roadmap for Big Model [article]

Sha Yuan, Hanyu Zhao, Shuai Zhao, Jiahong Leng, Yangxiao Liang, Xiaozhi Wang, Jifan Yu, Xin Lv, Zhou Shao, Jiaao He, Yankai Lin, Xu Han (+88 others)
2022 arXiv   pre-print
We introduce 16 specific BM-related topics in those four parts: Data, Knowledge, Computing System, Parallel Training System, Language Model, Vision Model, Multi-modal Model, Theory&Interpretability  ...  In this paper, we cover not only the BM technologies themselves but also the prerequisites for BM training and applications with BMs, dividing the BM review into four parts: Resource, Models, Key Technologies  ...  benchmark framework and multi-level scoring strategy.  ... 
arXiv:2203.14101v4 fatcat:rdikzudoezak5b36cf6hhne5u4

E-LEARNING IN 21st CENTURY [article]

Editor-Wakil Kumal Yadab, Editor-Jitendranath Gorai, Editor-M R. Rahul, Editor-Ms Seema Shukla, Editor-Mr. Mridul C Mrinal, Editor-Mr. Mujammil Pasha, Editor-Mrs. K. P. Jaiyasri
2021 Zenodo  
The author of this book is solely responsible and liable for its content including but not limited to the views, representations, descriptions, statements, information, opinions and references [“Content  ...  This book has been published with all reasonable efforts taken to make the material error-free after the consent of the author.  ...  Parents: Due to lockdown, people working in the informal sector lost their jobs, and the lack of any source of income leads to non-payment of fees.  ... 
doi:10.5281/zenodo.5506446 fatcat:xcbnkhip25eq7epb2s7jupuu2q

Deep audio-visual speech recognition

Pingchuan Ma, Maja Pantic, Honda Gijutsu Kenkyūjo
We then propose the addition of prediction-based auxiliary tasks to a VSR model and highlight the importance of hyper-parameter optimisation and appropriate data augmentations.  ...  Specifically, we train a ResNet+Conformer model to predict acoustic features from unlabelled visual speech, and find that this pre-trained model can be leveraged towards word-level and sentence-level lip-reading  ...  In this chapter, four commonly-used data augmentation techniques are examined, including Random Cropping, Flipping, Mixup, and Time Masking.  ... 
doi:10.25560/99007 fatcat:7fctszjpujb5ji3euzbqlwmxam

The Seventh International Conference on Creative Content Technologies

Hans-Werner Sehring, Jaime Mauri, Jalel Ben-Othman, Zhou Su (Waseda University, Japan), Lorena Parra, Javier Quevedo-Fernandez (+43 others)
Forward The Seventh International Conference on Creative Content Technologies (CONTENT 2015), held between   unpublished
Special processing challenges occur when dealing with social, graphic content, animation, speech, voice, image, audio, data, or image contents.  ...  Multi-cast and uni-cast content distribution, content localization, on-demand or following customer profiles are common challenges for content producers and distributors.  ...  ACKNOWLEDGMENT We would like to thank Ahlia University for providing the support and our students and their English teacher for working on developing and testing some of our research ideas.  ... 

Making Things Together: The Island & The Valley, Selves & Software, Here & There

Kaiton Williams
This dissertation explores the work of startup tech entrepreneurs in Jamaica, and how, through intertwined strategies, they craft software and self using design and development methods understood as emerging  ...  I also offer a profound thank you to my other committee members, Marina Welker and Steven Jackson, who provided not just critical reorientations but new directions and opportunities for my scholarship,  ...  Her unwavering dedication to me and to my work often eclipsed my own. My route has been outside the usual and  ... 
doi:10.7298/x4rr1wfn fatcat:cx722yzk35drvacaxjsiu2uvbi