Morphological Analysis of the Corpus of Spontaneous Japanese
2004
IEEE Transactions on Speech and Audio Processing
This paper describes two methods for detecting word segments and their morphological information in a Japanese spontaneous speech corpus, and a method for accurately tagging a large spontaneous speech ...
In this paper, we show that by using semi-automatic analysis we can expect a precision of over 99% for detecting and tagging short words and 97% for long words; the two types of words comprising the corpus ...
INTRODUCTION The "Spontaneous Speech: Corpus and Processing Technology" project is sponsoring the construction of a large spontaneous Japanese speech corpus, Corpus of Spontaneous Japanese (CSJ) [1]. ...
doi:10.1109/tsa.2004.828700
fatcat:vycubdew4vd7pnqzldkv5xldry
Morphological analysis of a large spontaneous speech corpus in Japanese
2003
Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - ACL '03
This paper describes two methods for detecting word segments and their morphological information in a Japanese spontaneous speech corpus, and describes how to tag a large spontaneous speech corpus accurately ...
the corpus. ...
Introduction The "Spontaneous Speech: Corpus and Processing Technology" project is sponsoring the construction of a large spontaneous Japanese speech corpus, Corpus of Spontaneous Japanese (CSJ) (Maekawa ...
doi:10.3115/1075096.1075157
dblp:conf/acl/UchimotoNYSI03
fatcat:sqgwohghwnaopbsdnjqvuwifx4
Morphological analysis of the spontaneous speech corpus
2002
Proceedings of the 19th international conference on Computational linguistics -
unpublished
We use a morphological analysis system based on a maximum entropy model, which is independent of the domain of corpora. ...
In this paper we show the tagging accuracy achieved by using the model and discuss problems in tagging the spontaneous speech corpus. ...
Tagging the spontaneous speech corpus with morphological information such as word segmentation and parts-of-speech is one of the goals of the project. ...
doi:10.3115/1071884.1071903
fatcat:apxcquur4zfzhnfn4xmhadlbvu
Grammatically Coded Corpus Of Spoken Lithuanian: Methodology And Development
2017
Zenodo
The development of the Corpus of Spoken Lithuanian has led to the constant increase in studies on spontaneous communication, and various papers have dealt with a distribution of parts of speech, use of ...
The paper deals with the main issues of methodology of the Corpus of Spoken Lithuanian which was started to be developed in 2006. ...
Thus, when creating a balanced corpus, it was decided to collect the data of spontaneous private speech and prepared public speech, since the analysis of such data is informative and revealing not only ...
doi:10.5281/zenodo.1129916
fatcat:6422uxpygvbbpemypvu2fdwtcy
Corpora of spoken Lithuanian
2009
Eesti Rakenduslingvistika Ühingu Aastaraamat
The data are transcribed and coded according to the requirements of CHILDES. The second part of the paper presents a corpus based analysis and provides preliminary results. ...
The data of adult-directed speech, child-directed speech and child speech are analysed to reveal the frequency distribution of parts of speech. ...
However, until the end of 2006 there was no corpus of Lithuanian adult speech to provide for spontaneous adult speech analysis. ...
doi:10.5128/erya5.05
fatcat:zi66kgtrtfalbo6om4zzqgoeui
Syntactically Coded Corpus of Spoken Lithuanian: Developmental Issues and Pilot Studies
2016
Studies About Languages
First, we consider a methodology of development of the Corpus as well as the principles of transcribing and coding Lithuanian speech data. ...
Generally, we believe that future systematic corpus-based research of spontaneous spoken language will give more possibilities to identify, evaluate, and elaborate the development of the Lithuanian language ...
Later on, the Corpus was supplemented by a new data of spontaneous speech and expanded by a data of prepared speech, and thus renamed the Corpus of Spoken Lithuanian 2 . ...
doi:10.5755/j01.sal.0.28.15131
fatcat:h6wswk35tjejtpaxx5x2v2pyr4
Linguistic and Logical Tools for an Advanced Interactive Speech System in Spanish
[chapter]
2001
Lecture Notes in Computer Science
The research here presented shows work on the development of a restricted-domain spontaneous speech dialogue system in Spanish. ...
Following the morphological, syntactic and semantic analysis, the module generates a structured representation with the content of the user's intervention. ...
The morphological analysis is carried out by means of MACO+ (Morphological Analyzer Corpus Oriented [11]), which has been adapted for the task domain. ...
doi:10.1007/3-540-45517-5_58
fatcat:h3tfqtj375dvpb5hbtify5xp7y
Morphology-based investigation of differences between spoken and written isiZulu
2021
Journal of the Digital Humanities Association of Southern Africa (DHASA)
In this paper, we present a quantitative investigation into such differences by considering the morphology of tokens in a transcribed spoken isiZulu corpus and a written isiZulu corpus. ...
This analysis presents information that could inform the development of voice-enabled computer applications for isiZulu. ...
considered as spontaneous speech. ...
doi:10.55492/dhasa.v3i01.3860
fatcat:znp34ybwqna7fbzfje7dlgwjpy
Recent Results in Speech Recognition for the Tatar Language
[chapter]
2017
Lecture Notes in Computer Science
In this paper we describe an approach to the creation of automatic speech recognition systems for the Tatar language. ...
We developed a speech analysis platform to work with under-resourced languages and used this tool to create a baseline speech recognition system. ...
Speech corpus and acoustic models The creation of the multi-speaker speech corpus for the Tatar language is currently in progress. ...
doi:10.1007/978-3-319-64206-2_21
fatcat:zgdfv3add5b33pggrah4uxjiqy
Experiments on Detection of Voiced Hesitations in Russian Spontaneous Speech
2016
Journal of Electrical and Computer Engineering
Experimental results on the mixed and quality diverse corpus of spontaneous Russian speech indicate the efficiency of the techniques for the task in question, with SVM outperforming other methods. ...
The development and popularity of voice-user interfaces made spontaneous speech processing an important research field. ...
The third part is the corpus of scientific reports from seminar devoted to analysis of conversational speech held at SPIIRAS in 2011. ...
doi:10.1155/2016/2013658
fatcat:o6ar2z7kbfhltnxmkh7e5m7btq
Affixation effects on word-final coda deletion in spontaneous Seoul Korean speech
2016
Phonetics and Speech Sciences
The Korean Corpus of Spontaneous Speech (Yun et al., 2016) showed high percentages of labeling consistency for the analysis of the present study. ...
For more details on the Korean Corpus of Spontaneous Speech, please see the corpus manual (Yun et al., 2015). ...
doi:10.13064/ksss.2016.8.4.009
fatcat:oo4fvtuk3nghffje3dnxnmw6ae
Gossip is More than Just Story Telling Topic Modelling and Quantitative Analysis on a Spontaneous Speech Corpus
2018
European Conference on Information Retrieval
In this paper, we describe a quantitative approach to identify gossip in a large corpus containing spontaneous talk with LDA topic modeling and quantitative analysis. ...
We also analyze the topics to distinguish gossiping and storytelling by dividing gossip and non-gossip texts in our large spontaneous speech corpora. ...
For our analysis we used a unique corpus of Hungarian language which consists of approximately 550 hours of spontaneous speech. ...
dblp:conf/ecir/PapayKG18
fatcat:5jnlcmymdrhllfalzununwfhae
Spoken Tunisian Arabic Corpus "STAC": Transcription and Annotation
2015
Research in Computing Science
This paper presents the "STAC" corpus (Spoken Tunisian Arabic Corpus) of spontaneous Tunisian Arabic speech. We present our method used for the collection and the transcription of this corpus. ...
Then, we detail the different stages done to enrich the corpus with necessary linguistic and speech annotations that makes it more useful for many NLP applications. ...
The iterative procedure starts by dividing our corpus to 10 folders according to the number of sentences. We begin with a morphological analysis of the first folder of the corpus. ...
doi:10.13053/rcs-90-1-9
fatcat:vhrhp7aobna7fnb2bgqumlhysu
The present status, progress, and usage of speech databases in Japan
2005
Acoustical Science and Technology
of spontaneous speech data is available. ...
The present status, progress, and usage of Japanese speech databases have been described. The database project in Japan started in the early 1980s. ...
doi:10.1250/ast.26.62
fatcat:f4rtbn2oovboxognl52ccktfde
The Corpus of Lithuanian Children Language: Development and application for modern studies in language acquisition
2019
Kalbotyra
The longitudinal data (conversations between the target children and their caretakers) compiled according to the requirement of natural observation includes transcribed and morphologically annotated speech ...
First of all, the procedure of data collection for the Corpus is discussed. ...
Table 3. The structure and size of the dialogue sub-corpus
Table 6. Results of the comparative analysis among the cohorts ...
doi:10.15388/kalbotyra.2018.1
fatcat:quof4nej65ef7lkn7hiyk5s46a