A Comparative Study of Parameter Estimation Methods for Statistical Natural Language Processing

Jianfeng Gao, Galen Andrew, Mark Johnson, Kristina Toutanova
2007 Annual Meeting of the Association for Computational Linguistics  
This paper presents a comparative study of five parameter estimation algorithms on four NLP tasks.  ...  Our experiments show that across tasks, three of the estimators (ME estimation with L1 or L2 regularization, and AP) are in a near statistical tie for first place.  ...  While recent studies claim advantages for L1 regularization, this study is the first of which we are aware to systematically compare it to a range of estimators on a diverse set of NLP tasks.  ... 
dblp:conf/acl/GaoAJT07 fatcat:wkd6dgqsovbohmjlxac4tyuyjy

Comparison Of Arrival Models In The Context Of Urban Transport

Endri Raço, Shpëtim Leka
2012 Social and Natural Sciences Journal  
To evaluate the performance of these models visual and analytical methods are used in this study. The simulation of these processes is made possible using the power of R language.  ...  The process of modeling a random process requires a careful analysis and a correct interpretation of the behavior of the process.  ...  Computer simulation of processes was done using estimated parameters for distributions.  ... 
doi:10.12955/snsj.v6i0.312 fatcat:lekobxtpp5amnktqoihbv3w4ku

Hierarchical pitman-yor language model for information retrieval

Saeedeh Momtazi, Dietrich Klakow
2010 Proceeding of the 33rd international ACM SIGIR conference on Research and development in information retrieval - SIGIR '10  
The Pitman-Yor process creates a power-law distribution which is one of the statistical properties of word frequency in natural language.  ...  In this paper, we propose a new application of Bayesian language model based on Pitman-Yor process for information retrieval. This model is a generalization of the Dirichlet distribution.  ...  Yor language model for the document retrieval task, and compare this approach with the state-of-the-art smoothing methods widely studied for language model-based information retrieval.  ... 
doi:10.1145/1835449.1835619 dblp:conf/sigir/MomtaziK10 fatcat:kcqypr4rrbbyvaxt6e77whra5y

LOW FOOTPRINT HIGH INTELLIGIBILITY MALAY SPEECH SYNTHESIZER BASED ON STATISTICAL DATA

Yong
2014 Journal of Computer Science  
Statistical parametric method was utilized in this study. The database was constructed to be balanced with all the phonetic sample appeared in Malay language.  ...  In conclusion, a Malay language speech synthesizer was designed using statistical parametric method with hidden Markov model. The output speech was verified to be good in quality.  ...  So in this study, we built a Malay language speech synthesizer based on statistical parametric method (Zen et al., 2009) using Hidden Markov Model (HMM).  ... 
doi:10.3844/jcssp.2014.316.324 fatcat:6o3df5m2yzhvvdj5vcksoy2nr4

Is the Best Better? Bayesian Statistical Model Comparison for Natural Language Processing [article]

Piotr Szymański, Kyle Gorman
2020 arXiv   pre-print
Recent work raises concerns about the use of standard splits to compare natural language processing models.  ...  We propose a Bayesian statistical model comparison technique which uses k-fold cross-validation across multiple data sets to estimate the likelihood that one model will outperform the other, or that the  ...  Acknowledgments We would like to thank Steve Bedrick for previous work on this topic.  ... 
arXiv:2010.03088v1 fatcat:lzjhrfxse5dgdcojgcqzux4ezy

Statistical Parametric Speech Synthesis of Malay Language using Found Training Data

Lau Chee Yong, Tan Tian Swee
2014 Research Journal of Applied Sciences Engineering and Technology  
Statistical parametric speech synthesis method applying Hidden Markov Model (HMM) has been used. To test the reliability of synthetic speech, perceptual test has been conducted.  ...  The preparation of training data for statistical parametric speech synthesis can be sophisticated. To ensure the good quality of synthetic speech, high quality low noise recording must be prepared.  ...  ACKNOWLEDGMENT The authors would like to thank IJN for their professional opinions and involvement, Ministry of Higher Education (MOHE), Universiti Teknologi Malaysia (UTM) and UTM Research Management  ... 
doi:10.19026/rjaset.7.910 fatcat:r736g6ancbedvn6nhyoh2umnhe

Re-evaluating phoneme frequencies [article]

Jayden L. Macklin-Cordes, Erich R. Round
2020 Frontiers in Psychology   accepted
We compare these new insights the kinds of causal processes that affect the evolution of phonemic inventories over time, and identify a potential account for why, despite there being an important role  ...  We infer the fit of power laws and three alternative distributions to 166 Australian languages, using a maximum likelihood framework.  ...  et al., 2009, 680). Maximum likelihood estimation (MLE) is a method for estimating the parameters in a statistical model, given some set of observations by finding the set of parameter values, θ, that  ... 
doi:10.3389/fpsyg.2020.570895 pmid:33329209 pmcid:PMC7714923 arXiv:2006.05206v2 fatcat:2pcni4vcqra5tdrs6oj5mcdlny

Estimating Predictive Rate–Distortion Curves via Neural Variational Inference

Michael Hahn, Richard Futrell
2019 Entropy  
Based on the results, we argue that the Predictive Rate–Distortion curve is more useful than the usual notion of statistical complexity for characterizing highly complex processes such as natural language  ...  Existing estimation methods for this curve work by clustering finite sequences of observations or by utilizing analytically known causal states.  ...  Estimating Predictive Rate-Distortion for Natural Language We consider the problem of estimating rate-distortion for natural language.  ... 
doi:10.3390/e21070640 pmid:33267354 pmcid:PMC7515133 fatcat:kbnidve5v5aknmz5yumzdrl34i

Information Theory and Language

Łukasz Dębowski, Christian Bentz
2020 Entropy  
Human language is a system of communication. Communication, in turn, consists primarily of information transmission [...]  ...  Acknowledgments: We express our thanks to the authors of the above contributions, the reviewers for their feedback on the manuscripts, and to the journal Entropy and MDPI for their support during this  ...  Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/e22040435 pmid:33286209 fatcat:me5ui7eginbsfl4663jyrprzle

Post-Processing Using Speech Enhancement Techniques for Unit Selection and Hidden Markov Model Based Low Resource Language Marathi Text-to-Speech System

Sangramsing Kayte, Monica Mundada
2018 The 6th Intl. Workshop on Spoken Language Technologies for Under-Resourced Languages  
In this paper, we first introduce the background and the details of the proposed method for low resource Marathi language.  ...  The primary aim of the study is to improve the quality of speech after synthesizing voice employing USS and HMM methods for building low resource Marathi TTS using speech enhancement techniques.  ...  Bharti Gawali Department of Computer Science and IT, Dr. Babasaheb Ambedkar Marathwada University, Aurangabad for providing the guidance for the research.  ... 
doi:10.21437/sltu.2018-20 dblp:conf/sltu/KayteM18 fatcat:wvwmwm6ywjdizgfbom7xj7hbsa

Page 52 of Library & Information Science Abstracts Vol. , Issue 6 [page]

1995 Library & Information Science Abstracts  
Part of a special issue devoted to the integration of natural language and vision processing.  ...  Suggests a taxonomy of user characteristics for such studies, in order to make results comparable.  ... 

Investigation of Effects of Different Synthesis Unit to the Quality of Malay Synthetic Speech

Lau Chee Yong, Tan Tian Swee, Mohd Nizam Mazenan
2014 Research Journal of Applied Sciences Engineering and Technology  
In this study, another type of synthesis unit is introduced which is letter. In Malay language, the unit size of letter is smaller than phoneme.  ...  Using letter as the synthesis unit is recommended because it excludes the dependency of linguist and expands the idea of language independent front end text processing.  ...  ACKNOWLEDGMENT The authors would like to thank IJN for their professional opinions and involvement, Ministry of Higher Education (MOHE), Universiti Teknologi Malaysia (UTM) and UTM Research Management  ... 
doi:10.19026/rjaset.7.737 fatcat:lqb4qw24kjbdjnfzobufffbzza

A study of Poisson query generation model for information retrieval

Qiaozhu Mei, Hui Fang, ChengXiang Zhai
2007 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '07  
Many variants of language models have been proposed for information retrieval.  ...  In this paper, we propose and study a new family of query generation models based on Poisson distribution.  ...  ACKNOWLEDGMENTS We thank the anonymous SIGIR 07 reviewers for their useful comments.  ... 
doi:10.1145/1277741.1277797 dblp:conf/sigir/MeiFZ07 fatcat:kma7msknn5g2zc6nfnvvplhnni

Hidden Markov Model based Speech Synthesis: A Review

Sangramsing Kayte, Monica Mundada, Jayesh Gujrathi
2015 International Journal of Computer Applications  
A Text-to-speech (TTS) synthesis system is the artificial production of human system.  ...  The HTS is based on the generation of an optimal parameter sequence from subword HMMs. The quality of HTS framework relies on the accurate description of the phoneset.  ...  Currently, the statistical parametric speech synthesis has been the most rigorously studied approach for speech synthesis.  ... 
doi:10.5120/ijca2015906965 fatcat:xrxlqnbgqzad3dxdboft4vysmy

Review of Stochastic POS Tagging Techniques used in Bengali

Abul Kalam Md. Rajib Hasan
2014 International Journal of Computer Applications  
In this paper, we describe different stochastic methods or techniques used for POS tagging of Bengali language. We have shown a generalized stochastic model for POS tagging in Bengali.  ...  We reviewed kinds of corpus and number of tags used for tagging methods. In the study it is found that as many as 45 useful tags existed in the literature.  ...  In Bengali large training corpora is rare. So in near future we must build large Bengali corpus for Natural Language Processing (NLP) task.  ... 
doi:10.5120/17838-8724 fatcat:g7qoks57j5ghlon3inzqpr5b6u
Showing results 1 to 15 out of 263,137 results