331 Hits in 2.8 sec

Stochastic Grammatical Inference with Multinomial Tests [chapter]

Christopher Kermorvant, Pierre Dupont
2002 Lecture Notes in Computer Science  
An improvement over the classical stochastic grammatical inference algorithm is shown on artificial data.  ...  We present a new statistical framework for stochastic grammatical inference algorithms based on a state-merging strategy.  ...  Introduction The aim of stochastic regular grammatical inference is to learn a stochastic regular language from examples, mainly through learning the structure of a stochastic finite state automaton and  ... 
doi:10.1007/3-540-45790-9_12 fatcat:qzy76uv6dzccbh35azggixmnna
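
A minimal sketch of the kind of multinomial compatibility test that drives such state-merging algorithms: two states are merge candidates when their outgoing-symbol counts are plausibly drawn from the same multinomial. The chi-square statistic, function names, and significance level below are illustrative assumptions, not the paper's exact test.

```python
# Hypothetical multinomial compatibility test for state merging.
import numpy as np
from scipy.stats import chi2_contingency

def states_compatible(counts_a, counts_b, alpha=0.05):
    """H0: both states emit symbols from the same multinomial distribution."""
    table = np.array([counts_a, counts_b])
    table = table[:, table.sum(axis=0) > 0]   # drop symbols unseen in both states
    _, p_value, _, _ = chi2_contingency(table)
    return p_value > alpha                    # cannot reject H0 -> candidates for merging

# Outgoing-symbol counts of two candidate states over alphabet {a, b, end}:
print(states_compatible([30, 12, 8], [28, 15, 7]))   # similar -> True (merge)
print(states_compatible([30, 2, 18], [5, 40, 5]))    # dissimilar -> False (keep apart)
```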

Improvement of the State Merging Rule on Noisy Data in Probabilistic Grammatical Inference [chapter]

Amaury Habrard, Marc Bernard, Marc Sebban
2003 Lecture Notes in Computer Science  
In this paper we study the influence of noise in probabilistic grammatical inference. We paradoxically bring out the idea that specialized automata deal better with noisy data than more general ones.  ...  We then propose to replace the statistical test of the Alergia algorithm with a more restrictive merging rule based on a proportion-comparison test.  ...  Acknowledgements The authors wish to thank Christopher Kermorvant for his help and for having allowed us to easily compare our work with MAlergia.  ... 
doi:10.1007/978-3-540-39857-8_17 fatcat:s7ajqxsfn5cslfa2lxopf2sxi4
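
The snippet's proportion-comparison rule can be pictured as a two-proportion z-test on the relative frequency of a symbol out of two states; the statistic and critical value below are illustrative stand-ins for the paper's rule.

```python
# Illustrative two-proportion z-test as a (more restrictive) merging rule.
import math

def proportions_differ(k1, n1, k2, n2, z_crit=1.96):
    """Two-sided test of H0: p1 == p2, given k successes out of n trials each."""
    p_pool = (k1 + k2) / (n1 + n2)            # pooled estimate under H0
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    if se == 0:
        return False                          # identical degenerate proportions
    z = (k1 / n1 - k2 / n2) / se
    return abs(z) > z_crit                    # reject H0 -> refuse the merge

# Frequency of symbol 'a' leaving two states:
print(proportions_differ(40, 100, 35, 90))    # close proportions -> False
print(proportions_differ(40, 100, 10, 90))    # distant proportions -> True
```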

Some Classes of Regular Languages Identifiable in the Limit from Positive Data [chapter]

François Denis, Aurélien Lemay, Alain Terlutte
2002 Lecture Notes in Computer Science  
Snippet (volume table of contents): Tree Automata for Multi-relational Data Mining (p. 120); On Sufficient Conditions to Identify Classes of Grammars from Polynomial Time and Data (p. 134); Stochastic Grammatical Inference with Multinomial Tests (p. 149); Learning Languages with Help (p. 161); Incremental Learning of Context Free Grammars (p. 174); Estimating Grammar Parameters Using Bounded Memory (p. 185); Stochastic k-testable Tree Languages  ... 
doi:10.1007/3-540-45790-9_6 fatcat:nmlknwqoyfbybhb6rpomqrn7qy

HDP-HMM-SCFG: A Novel Model for Trajectory Representation and Classification

Weiguang Xu, Yafei Zhang, Jianjiang Lu, Jiabao Wang
2011 Procedia Engineering  
Trajectories are represented by a stochastic grammar, where trajectory segments are treated as observations emitted by the grammar terminals, which are attached to HMMs.  ...  The label of the class with the maximum likelihood of generating the test trajectory is assigned to the test trajectory. An experiment on the ASL dataset is carried out to validate our approach.  ...  Representation, Inference and Classification In our approach, an HMM-SCFG model, combining HMMs with a stochastic grammar, is proposed to represent trajectory clusters.  ... 
doi:10.1016/j.proeng.2011.08.117 fatcat:e3welr5lurebpainfdkwjrv7di
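
The classification rule quoted above is just an argmax of per-class likelihoods. A toy stand-in (the paper's per-class models are HDP-HMM-SCFGs; here a unit-variance Gaussian plays that role):

```python
import math

class ToyModel:
    """Stand-in for a per-class generative model of trajectories."""
    def __init__(self, mean):
        self.mean = mean
    def log_likelihood(self, xs):
        # i.i.d. unit-variance Gaussian log-likelihood of a 1-D "trajectory".
        return sum(-0.5 * (x - self.mean) ** 2 - 0.5 * math.log(2 * math.pi)
                   for x in xs)

def classify(trajectory, models):
    """Assign the label of the class most likely to generate the trajectory."""
    return max(models, key=lambda label: models[label].log_likelihood(trajectory))

models = {"wave": ToyModel(0.0), "circle": ToyModel(3.0)}
print(classify([2.8, 3.1, 2.9], models))   # -> "circle"
```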

Grammatical inference as a principal component analysis problem

Raphaël Bailly, François Denis, Liva Ralaivola
2009 Proceedings of the 26th Annual International Conference on Machine Learning - ICML '09  
One of the main problems in probabilistic grammatical inference consists in inferring a stochastic language, i.e. a probability distribution, in some class of probabilistic models, from a sample of strings  ...  Hence, a first step in the grammatical inference process can consist in identifying the subspace V*_p.  ...  This iterative decision relies on a statistical test with a known drawback: as the structure grows, the test relies on fewer and fewer examples.  ... 
doi:10.1145/1553374.1553379 dblp:conf/icml/BaillyDR09 fatcat:6ltujjnyyfacdhh2obhhnnpr5m
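
A compressed sketch of the spectral idea: arrange empirical string probabilities in a Hankel-style matrix indexed by prefixes and suffixes, then read the dimension of the target subspace off its singular values. The sample, prefix/suffix sets, and rank threshold are illustrative; the paper develops the statistics properly.

```python
from collections import Counter
import numpy as np

sample = ["ab", "aab", "b", "abb", "ab", "aabb", "b", "ab"]
freq = Counter(sample)
total = sum(freq.values())
p = lambda s: freq[s] / total                 # empirical string probability

prefixes = ["", "a", "aa", "b", "ab"]
suffixes = ["", "b", "bb", "ab"]
H = np.array([[p(u + v) for v in suffixes] for u in prefixes])

# Singular values above a noise threshold estimate the dimension of the
# subspace (hence the number of states of a minimal probabilistic model).
singular_values = np.linalg.svd(H, compute_uv=False)
rank_estimate = int((singular_values > 1e-2).sum())
print(singular_values.round(3), rank_estimate)
```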

Unsupervised Prediction of Acceptability Judgements

Jey Han Lau, Alexander Clark, Shalom Lappin
2015 Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)  
We use a test set generated from the British National Corpus (BNC) containing both grammatical sentences and sentences with a variety of syntactic infelicities introduced by round-trip machine translation  ...  We trained a variety of unsupervised language models on the original BNC and tested them to see the extent to which they could predict mean speakers' judgements on the test set.  ...  To train the model, we use stochastic gradient descent combined with backpropagation through time.  ... 
doi:10.3115/v1/p15-1156 dblp:conf/acl/LauCL15 fatcat:my2xcxiwzrffdpq6sn4svzxteu
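
One common way to turn a language model into an acceptability score, close in spirit to the measures the paper compares, is a length-normalized log-probability. The toy bigram model with add-one smoothing below is our own illustration, not one of the paper's models.

```python
from collections import Counter
import math

train = "the cat sat on the mat . the dog sat on the rug .".split()
bigrams = Counter(zip(train, train[1:]))
unigrams = Counter(train)
V = len(unigrams)

def logprob(sentence):
    toks = sentence.split()
    return sum(math.log((bigrams[(w1, w2)] + 1) / (unigrams[w1] + V))  # add-1 smoothing
               for w1, w2 in zip(toks, toks[1:]))

def acceptability(sentence):
    n = max(len(sentence.split()) - 1, 1)     # normalize so length is not penalized
    return logprob(sentence) / n

print(acceptability("the cat sat on the mat ."))
print(acceptability("cat the on sat mat the ."))   # scrambled -> lower score
```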

Page 1087 of Mathematical Reviews Vol. , Issue 81C [page]

1981 Mathematical Reviews  
Witting, A Chernoff-Savage theory for correlation rank statistics with applications to sequential testing (pp. 85-125); C. B. Bell, F.  ...  Pflug, Stochastic approximation in time series regression (pp. 303-313); Josef Štěpán, Probability measures with given expectations (pp. 315-320); Jan Ámos Víšek, A note on the assumptions used to find  ... 

A Margin-based Loss with Synthetic Negative Samples for Continuous-output Machine Translation

Gayatri Bhat, Sachin Kumar, Yulia Tsvetkov
2019 Proceedings of the 3rd Workshop on Neural Generation and Translation  
Neural models that eliminate the softmax bottleneck by generating word embeddings (rather than multinomial distributions over a vocabulary) attain faster training with fewer learnable parameters.  ...  We follow Kumar and Tsvetkov (2019) in using the standard development (tst2013 and tst2014) and test (tst2015 and tst2016) sets associated with the parallel corpora and in processing the data; the train, development and test splits contain roughly 200K, 2,300 and 2,200 parallel sentences each.  ... 
doi:10.18653/v1/d19-5621 dblp:conf/emnlp/BhatKT19 fatcat:w2ncvqcfsnhg7plpxoyodh2fme
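
A margin loss with a synthetic negative can be sketched as: push the predicted embedding closer to the gold target than to a perturbed copy of it, by at least a margin. The cosine similarity, Gaussian perturbation, and margin value are assumptions for illustration; the paper studies several negative-sampling schemes.

```python
import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def margin_loss(pred, target, margin=0.5, noise_scale=0.3, rng=None):
    rng = rng or np.random.default_rng(0)
    negative = target + noise_scale * rng.standard_normal(target.shape)  # synthetic negative
    # Hinge: zero loss once pred is closer to target than to the negative by >= margin.
    return max(0.0, margin - cosine(pred, target) + cosine(pred, negative))

rng = np.random.default_rng(1)
target = rng.standard_normal(8)
print(margin_loss(target + 0.05 * rng.standard_normal(8), target))  # near gold -> small loss
print(margin_loss(rng.standard_normal(8), target))                  # far from gold -> larger loss
```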

A semiparametric generative model for efficient structured-output supervised learning

Fabrizio Costa, Andrea Passerini, Marco Lippi, Paolo Frasconi
2008 Annals of Mathematics and Artificial Intelligence  
The main algorithmic idea is to replace the parameters of an underlying generative model (such as a stochastic grammar) with input-dependent predictions obtained by (kernel) logistic regression.  ...  We present a semiparametric generative model for supervised learning with structured outputs.  ...  This experiment shows that improvement can be achieved even when dealing with specific ambiguous grammatical phenomena rather than tackling the whole parsing task.  ... 
doi:10.1007/s10472-009-9137-6 fatcat:twnw6o5osrcozcl6hwtovvwwbu
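
The "input-dependent parameters" idea can be pictured with plain logistic regression predicting which grammar rule fires given input features (the paper uses a kernel version; the features, rules, and data below are made up):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Training data: input features -> which grammar rule fired at this step.
X = np.array([[0.1, 1.0], [0.2, 0.9], [2.0, 0.1], [1.9, 0.2]])
rules = np.array(["NP->Det N", "NP->Det N", "NP->Pron", "NP->Pron"])

clf = LogisticRegression().fit(X, rules)
# Input-dependent rule probabilities replacing fixed multinomial parameters:
probs = clf.predict_proba(np.array([[0.15, 0.95]]))[0]
print(dict(zip(clf.classes_, probs.round(3))))
```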

Parsing Social Network Survey Data from Hidden Populations Using Stochastic Context-Free Grammars

Art F. Y. Poon, Kimberly C. Brouwer, Steffanie A. Strathdee, Michelle Firestone-Cruz, Remedios M. Lozada, Sergei L. Kosakovsky Pond, Douglas D. Heckathorn, Simon D. W. Frost, Alison P. Galvani
2009 PLoS ONE  
Methodology/Principal Findings: Here, we develop a new methodology based on stochastic context-free grammars (SCFGs), which are well suited to modeling the tree-like structure of the RDS recruitment process  ...  Such structure, which has no representation in Markov chain-based models, can interfere with the estimation of the composition of hidden populations if left unaccounted for, raising critical implications  ...  Our custom software implementation of SCFGs enables the user to: (i) evaluate the likelihood of any MBP model of recruitment that can be expressed in grammatical form; (ii) infer unobserved quantities  ... 
doi:10.1371/journal.pone.0006777 pmid:19738904 pmcid:PMC2734164 fatcat:5ugb264xj5g7vnfpb2wzvewomq
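
For strings, the likelihood an SCFG assigns can be computed with the inside algorithm; a minimal version for a grammar in Chomsky normal form follows. The paper applies SCFGs to RDS recruitment trees with custom software, so the toy grammar and probability values here are purely illustrative.

```python
from collections import defaultdict

# Rules: (A, (B, C)) binary and (A, (w,)) lexical, each with a probability.
rules = {
    ("S", ("A", "B")): 1.0,
    ("A", ("a",)): 1.0,
    ("B", ("B", "B")): 0.4,
    ("B", ("b",)): 0.6,
}

def inside(tokens):
    n = len(tokens)
    chart = defaultdict(float)                 # (i, j, A) -> inside probability
    for i, w in enumerate(tokens):             # lexical rules fill spans of length 1
        for (A, rhs), prob in rules.items():
            if rhs == (w,):
                chart[(i, i + 1, A)] += prob
    for span in range(2, n + 1):               # binary rules combine subspans
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (A, rhs), prob in rules.items():
                    if len(rhs) == 2:
                        B, C = rhs
                        chart[(i, j, A)] += prob * chart[(i, k, B)] * chart[(k, j, C)]
    return chart[(0, n, "S")]

print(inside(["a", "b", "b"]))                 # P("abb" | grammar) = 0.144
```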

SenGen: Sentence Generating Neural Variational Topic Model [article]

Ramesh Nallapati, Igor Melnyk, Abhishek Kumar, Bowen Zhou
2017 arXiv   pre-print
The stochastic samples, on the other hand, are not grammatically well formed, but do contain topical words.  ...  This is clearly a simplifying assumption that makes inference tractable, and may need to be relaxed in the future.  ... 
arXiv:1708.00308v1 fatcat:wyigm7b3nvh5zhvfxxyypmp7e4

STDP Installs in Winner-Take-All Circuits an Online Approximation to Hidden Markov Model Learning

David Kappel, Bernhard Nessler, Wolfgang Maass, Henning Sprekeler
2014 PLoS Computational Biology  
In fact, one can show that these motifs endow cortical microcircuits with the functional properties of a hidden Markov model, a generic model for solving such tasks through probabilistic inference.  ...  We investigate the possible performance gain that can be achieved with this more accurate learning method for an artificial grammar task.  ...  To classify grammatical against non-grammatical sequences, the one-sample approximation of the log-likelihood (Eq. 32) was computed for all test sequences.  ... 
doi:10.1371/journal.pcbi.1003511 pmid:24675787 pmcid:PMC3967926 fatcat:fy7nvgo235hytdkkflp72fnlva
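
The underlying decision rule is standard: score each sequence by its log-likelihood under the HMM (here via the exact forward recursion, whereas the paper's Eq. 32 uses a one-sample approximation) and threshold it. The transition and emission tables and the threshold are made-up examples.

```python
import math

A  = [[0.7, 0.3], [0.2, 0.8]]                        # state-transition probabilities
B  = [{"x": 0.9, "y": 0.1}, {"x": 0.2, "y": 0.8}]    # emission probabilities
pi = [0.5, 0.5]                                      # initial state distribution

def log_likelihood(seq):
    alpha = [pi[s] * B[s][seq[0]] for s in range(2)]
    for obs in seq[1:]:                              # forward recursion
        alpha = [sum(alpha[s] * A[s][t] for s in range(2)) * B[t][obs]
                 for t in range(2)]
    return math.log(sum(alpha))

threshold = -3.0   # in practice tuned on held-out grammatical sequences
for seq in (["x", "x", "y", "y"], ["y", "x", "y", "x"]):
    ll = log_likelihood(seq)
    print(seq, round(ll, 2), "grammatical" if ll > threshold else "non-grammatical")
```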

Efficient Pruning of Probabilistic Automata [chapter]

Franck Thollard, Baptiste Jeudy
2008 Lecture Notes in Computer Science  
Applications of probabilistic grammatical inference are limited by time and space constraints.  ...  In statistical language modeling, for example, large corpora are now available and lead to managing automata with millions of states.  ...  For example, alergia [3], acyclic-infer [2], MDI [4], DDSM [5], and multinomial-infer [6] have worst-case quadratic complexity.  ... 
doi:10.1007/978-3-540-89689-0_11 fatcat:oxfavoabwzedle2yqrzx5zy4ai
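
The pruning the title refers to can be caricatured as: drop states whose probability mass falls below a threshold, along with their transitions. The encoding and threshold are illustrative; the paper's criterion is more careful, and a full implementation would renormalize the surviving transition probabilities.

```python
def prune(transitions, state_mass, threshold=0.01):
    """transitions: (state, symbol) -> (next_state, prob);
    state_mass: state -> probability of reaching that state."""
    kept = {s for s, mass in state_mass.items() if mass >= threshold}
    return {(s, a): (t, p) for (s, a), (t, p) in transitions.items()
            if s in kept and t in kept}

transitions = {("q0", "a"): ("q1", 0.6), ("q0", "b"): ("q2", 0.4),
               ("q1", "a"): ("q3", 0.1), ("q2", "b"): ("q0", 0.9)}
state_mass = {"q0": 1.0, "q1": 0.6, "q2": 0.4, "q3": 0.006}
print(prune(transitions, state_mass))   # q3 and its incoming edge are dropped
```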

Modeling Child Divergences from Adult Grammar

Sam Sahakian, Benjamin Snyder
2013 Transactions of the Association for Computational Linguistics  
Our corpus consists of child sentences with corrected adult forms.  ...  We bridge the gap between these forms with a discriminatively reranked noisy channel model that translates child sentences into equivalent adult utterances.  ...  With the sole exception of word insertions, the distributions are parameterized and learned during training. Our model consists of 217 multinomial distributions, with 6,718 free parameters.  ... 
doi:10.1162/tacl_a_00215 fatcat:ygkwdk5jyvbdtd2gwu4msn4fki
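
Schematically, a noisy-channel translator picks the adult sentence maximizing log P(child | adult) + log P(adult); the paper adds discriminative reranking on top of this. The probability tables below are hypothetical stand-ins for its learned multinomials.

```python
import math

language_model = {"I want a cookie": -2.0, "me want cookie": -9.0}   # log P(adult)
channel = {("me want cookie", "I want a cookie"): -3.0,              # log P(child | adult)
           ("me want cookie", "me want cookie"): -1.0}

def decode(child_sentence, candidates):
    # argmax over adult candidates of log P(child | adult) + log P(adult)
    return max(candidates,
               key=lambda adult: channel.get((child_sentence, adult), -math.inf)
                                 + language_model.get(adult, -math.inf))

print(decode("me want cookie", ["I want a cookie", "me want cookie"]))
```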

Topic modelling with morphologically analyzed vocabularies

Marcus Spies
2017 Scientific Publications of the State University of Novi Pazar Series A Applied Mathematics Informatics and mechanics  
The output of morphological analysis is a string for each token composed of a sequence of base lemmata together with their grammatical analysis and POS tags.  ... 
• Each of the k mixture component distributions is a multinomial topic-term distribution (defined independently of documents).
• LDA relies on parametric Bayesian inference, generating the document-topic  ... 
doi:10.5937/spsunp1701001s fatcat:2w423ygnsjdkpnygup6m6f7ioe
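
The two bullet points describe the LDA generative process; a compact sketch follows. Vocabulary, hyperparameters, and sizes are illustrative, with lemma+POS strings standing in for the paper's morphologically analyzed terms.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["haus_N", "gehen_V", "schnell_ADJ", "und_CONJ"]   # lemma_POS terms
k, alpha, beta = 2, 0.5, 0.1

# k multinomial topic-term distributions, shared across all documents.
topic_term = rng.dirichlet([beta] * len(vocab), size=k)

def generate_document(n_tokens):
    theta = rng.dirichlet([alpha] * k)          # this document's topic proportions
    words = []
    for _ in range(n_tokens):
        z = rng.choice(k, p=theta)              # draw a topic
        words.append(vocab[rng.choice(len(vocab), p=topic_term[z])])  # draw a term
    return words

print(generate_document(8))
```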
Showing results 1 — 15 out of 331 results