2,089 Hits in 4.5 sec

Discriminative Log-Linear Grammars with Latent Variables

Slav Petrov, Dan Klein
2007 Neural Information Processing Systems  
We demonstrate that log-linear grammars with latent variables can be practically trained using discriminative methods.  ...  On full-scale treebank parsing experiments, the discriminative latent models outperform both the comparable generative latent models and the discriminative non-latent baselines.  ...  Conclusions and Future Work: We have presented a hierarchical pruning procedure that allows efficient discriminative training of log-linear grammars with latent variables.  ... 
dblp:conf/nips/PetrovK07 fatcat:sw2dimtpjvgy3jbwswwnl3ncdu
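
As background for this and several entries below: a log-linear grammar with latent variables is trained discriminatively by maximising the conditional likelihood of the observed (coarse) trees with the latent annotations marginalised out. In standard notation (a sketch, not quoted from the paper), with tree t, annotation a, and sentence w:

$$ L(\theta) = \sum_i \log \sum_{a} p_\theta(t_i, a \mid w_i), \qquad p_\theta(t, a \mid w) = \frac{\exp\big(\theta^\top f(t, a, w)\big)}{\sum_{t', a'} \exp\big(\theta^\top f(t', a', w)\big)} $$

The inner sum over annotations makes the objective non-convex, which is why the hierarchical pruning procedure mentioned in the conclusion matters for practical training.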

A Discriminative Latent Variable Model for Statistical Machine Translation

Phil Blunsom, Trevor Cohn, Miles Osborne
2008 Annual Meeting of the Association for Computational Linguistics  
We present a translation model which models derivations as a latent variable, in both training and decoding, and is fully discriminative and globally optimised.  ...  We argue that a principal reason for this failure is not dealing with multiple, equivalent translations.  ...  This method has been demonstrated to be effective for (non-convex) log-linear models with latent variables (Clark and Curran, 2004; Petrov et al., 2007).  ... 
dblp:conf/acl/BlunsomCO08 fatcat:bz4bfv33cfcizhadcgs37bvy3u
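
In outline (standard notation, not the paper's own), the derivation d is the latent variable: the model optimises the probability of a translation e given a source f marginalised over all derivations that yield it,

$$ p_\theta(\mathbf{e} \mid \mathbf{f}) = \sum_{\mathbf{d}\,:\,\mathrm{yield}(\mathbf{d}) = \mathbf{e}} \frac{\exp\big(\theta^\top f(\mathbf{d}, \mathbf{e}, \mathbf{f})\big)}{Z(\mathbf{f})} $$

so probability mass spread across multiple, equivalent derivations of the same translation is aggregated rather than fragmented, which is exactly the failure mode the abstract points to.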

Self-Training with Products of Latent Variable Grammars

Zhongqiang Huang, Mary P. Harper, Slav Petrov
2010 Conference on Empirical Methods in Natural Language Processing  
We study self-training with products of latent variable grammars in this paper.  ...  Our generative self-trained grammars reach F scores of 91.6 on the WSJ test set and surpass even discriminative reranking systems without self-training.  ...  In this paper, we investigate self-training with products of latent variable grammars.  ... 
dblp:conf/emnlp/HuangHP10 fatcat:hm7lgw4b55hnzcvfu33qyk7mam
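
The product-of-grammars combination in the title can be sketched as follows (generic notation, not the paper's own): K latent variable grammars are trained independently from different random seeds, and a tree is scored by the product of their posteriors,

$$ p(t \mid w) \;\propto\; \prod_{k=1}^{K} p_k(t \mid w), $$

with self-training then re-training grammars on unlabeled text parsed by this product model.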

Dependency Grammar Induction with a Neural Variational Transition-Based Parser

Bowen Li, Jianpeng Cheng, Yang Liu, Frank Keller
2019 Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19)  
In this work, we propose a neural transition-based parser for dependency grammar induction, whose inference procedure utilizes rich neural features with O(n) time complexity.  ...  Dependency grammar induction is the task of learning dependency syntax without annotated training data.  ...  This model includes (1) a discriminative RNNG as the encoder for mapping the input sentence into a latent variable, which for the grammar induction task is a sequence of parse actions for building the  ... 
doi:10.1609/aaai.v33i01.33016658 fatcat:tpxashvunvgkdbcerowjl2zimi
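
To see why transition-based inference is O(n) where chart-based grammar induction is typically O(n^3), consider a bare-bones arc-standard transition system. The sketch below is illustrative only: score_actions is a hypothetical stand-in for the paper's rich neural features, and this is not the authors' implementation.

```python
# Minimal arc-standard dependency parser: one greedy decision per step.
# A sentence of n words is parsed in about 2n actions, hence O(n) time
# (assuming each action is scored in constant time).
def parse(words, score_actions):
    stack, buffer, arcs = [], list(range(len(words))), []
    while buffer or len(stack) > 1:
        actions = []
        if buffer:
            actions.append("SHIFT")
        if len(stack) >= 2:
            actions += ["LEFT-ARC", "RIGHT-ARC"]
        act = max(actions, key=lambda a: score_actions(stack, buffer, a))
        if act == "SHIFT":
            stack.append(buffer.pop(0))
        elif act == "LEFT-ARC":   # second-from-top becomes dependent of top
            dep = stack.pop(-2)
            arcs.append((stack[-1], dep))   # (head, dependent)
        else:                     # RIGHT-ARC: top becomes dependent of second
            dep = stack.pop()
            arcs.append((stack[-1], dep))
    return arcs
```

In the variational setting of the paper, the action sequence itself is the latent variable, so the encoder proposes actions and the decoder scores the resulting structure.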

Dependency Grammar Induction with a Neural Variational Transition-based Parser [article]

Bowen Li, Jianpeng Cheng, Yang Liu, Frank Keller
2018 arXiv   pre-print
In this work, we propose a neural transition-based parser for dependency grammar induction, whose inference procedure utilizes rich neural features with O(n) time complexity.  ...  Dependency grammar induction is the task of learning dependency syntax without annotated training data.  ...  This model includes (1) a discriminative RNNG as the encoder for mapping the input sentence into a latent variable, which for the grammar induction task is a sequence of parse actions for building the  ... 
arXiv:1811.05889v1 fatcat:pi73lhqbabfbfg44meeftpigcq

Unsupervised Discriminative Induction of Synchronous Grammar for Machine Translation

Xinyan Xiao, Deyi Xiong, Yang Liu, Qun Liu, Shouxun Lin
2012 International Conference on Computational Linguistics  
We present a global log-linear model for synchronous grammar induction, which is capable of incorporating arbitrary features.  ...  Using learned synchronous grammar rules with millions of features that contain rule level, word level and translation boundary information, we significantly outperform a competitive hierarchical phrase-based  ...  In contrast, our interest lies in using a latent variable model to learn synchronous grammar directly from sentence pairs.  ... 
dblp:conf/coling/XiaoXLLL12 fatcat:s6lx7zlmrbhmzhp4c5j7b3udoi
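
Schematically (standard log-linear notation, not quoted from the paper), a latent derivation d of an observed sentence pair (f, e) is scored by features that decompose over its synchronous rules r:

$$ p_\theta(\mathbf{d} \mid \mathbf{f}, \mathbf{e}) \;\propto\; \exp\Big(\sum_{r \in \mathbf{d}} \theta^\top f(r, \mathbf{f}, \mathbf{e})\Big) $$

This is what lets the model incorporate arbitrary rule-level, word-level and boundary features while learning the grammar directly from sentence pairs, with the derivation treated as latent.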

Training Factored PCFGs with Expectation Propagation

David Hall, Dan Klein
2012 Conference on Empirical Methods in Natural Language Processing  
Using purely latent variable annotations, we can efficiently train and parse with up to 8 latent bits per symbol, achieving F1 scores up to 88.4 on the Penn Treebank while using two orders of magnitude  ...  Our method works with linguistically motivated annotations, induced latent structure, lexicalization, or any mix of the three.  ...  Critically, for a latent variable parser with M annotation bits, the exact algorithm takes time exponential in M, while this approximate algorithm takes time linear in M.  ... 
dblp:conf/emnlp/HallK12 fatcat:nmbqvpn4kfb5vfsvjhesjmevjm
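
The complexity contrast in the last fragment can be made concrete (a generic sketch of expectation propagation over factored annotations, not the paper's exact formulation): M annotation bits give 2^M refined substates per symbol, so exact inference over the conjunction of all annotation layers is exponential in M. EP instead maintains one tractable approximating term per layer,

$$ q(t) \;\propto\; p_0(t) \prod_{m=1}^{M} \tilde{f}_m(t), $$

refitting each approximating factor in turn against its exact counterpart, so the cost of a pass grows only linearly in M.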

A Generative Parser with a Discriminative Recognition Algorithm [article]

Jianpeng Cheng, Adam Lopez, Mirella Lapata
2017 arXiv   pre-print
We propose a framework for parsing and language modeling which marries a generative model with a discriminative recognition model in an encoder-decoder setting.  ...  Generative models defining joint distributions over parse trees and sentences are useful for parsing and language modeling, but impose restrictions on the scope of features and are often outperformed by discriminative  ...  Consider a simple directed graphical model with discrete latent variables a (e.g., a is the transition action sequence) and observed variables x (e.g., x is the sentence).  ... 
arXiv:1708.00415v2 fatcat:njuk5i77drgx7bvmvr6274cxm4
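
The snippet's graphical model leads directly to the variational lower bound that such encoder-decoder parsers optimise (standard ELBO notation): with action sequence a latent and sentence x observed,

$$ \log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(a \mid x)}\big[\log p_\theta(x, a) - \log q_\phi(a \mid x)\big], $$

where q_\phi is the discriminative recognition model and p_\theta the generative parser.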

A Generative Parser with a Discriminative Recognition Algorithm

Jianpeng Cheng, Adam Lopez, Mirella Lapata
2017 Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)  
We propose a framework for parsing and language modeling which marries a generative model with a discriminative recognition model in an encoder-decoder setting.  ...  Generative models defining joint distributions over parse trees and sentences are useful for parsing and language modeling, but impose restrictions on the scope of features and are often outperformed by discriminative  ...  Acknowledgments We thank three anonymous reviewers and members of the ILCC for valuable feedback, and Muhua Zhu and James Cross for help with data preparation.  ... 
doi:10.18653/v1/p17-2019 dblp:conf/acl/ChengLL17 fatcat:aymzjbsfprdwfjrkhbf3wr5nyy

Latent-Variable PCFGs: Background and Applications

Shay Cohen
2017 Proceedings of the 15th Meeting on the Mathematics of Language  
Latent-variable probabilistic context-free grammars are latent-variable models that are based on context-free grammars.  ...  Nonterminals are associated with latent states that provide contextual information during the top-down rewriting process of the grammar.  ...  Petrov and Klein (2008) extended L-PCFGs to log-linear latent grammars - this means that the rule weights learned are no longer constrained to be probabilities.  ... 
doi:10.18653/v1/w17-3405 dblp:conf/mol/Cohen17 fatcat:lojpfu5x5nathibmwb3nxle53e
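
Concretely, in standard L-PCFG notation each nonterminal A carries a latent state h, and a tree's probability multiplies refined rule probabilities of the form

$$ p\big(A[h] \rightarrow B[h_1]\,C[h_2] \,\big|\, A[h]\big), $$

with the latent states threading contextual information through the top-down rewriting process; the log-linear extension of Petrov and Klein (2008) mentioned in the snippet drops the constraint that these rule weights be probabilities.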

Latent-Variable PCFGs: Background and Applications

Shay Cohen
2017 Zenodo  
Latent-variable probabilistic context-free grammars are latent-variable models that are based on context-free grammars.  ...  Nonterminals are associated with latent states that provide contextual information during the top-down rewriting process of the grammar.  ...  Petrov and Klein (2008) extended L-PCFGs to log-linear latent grammars - this means that the rule weights learned are no longer constrained to be probabilities.  ... 
doi:10.5281/zenodo.827288 fatcat:ysyzmqs2irhcxm5kruq34prawy

Bayesian estimation of a multilevel multidimensional item response model using auxiliary variables method: an exploration of the correlation between multiple latent variables and covariates in hierarchical data

Jiwei Zhang, Jing Lu, Jian Tao
2019 Statistics and its Interface  
The developed Gibbs sampling algorithm based on auxiliary variables can accurately estimate the correlations among multidimensional latent traits, along with the correlation between person- and school-level covariates and latent traits.  ...  The student's latent traits are considered to be the latent outcome variables of the multilevel regression model.  ... 
doi:10.4310/sii.2019.v12.n1.a4 fatcat:uw6yx7e2mzh47j3s3cff3xxmli
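
A generic two-level sketch of this model family (my notation, not the authors'): a logistic measurement model links item responses to multidimensional latent traits, and a school-level regression structures the traits,

$$ \mathrm{logit}\,P(y_{ijk} = 1 \mid \boldsymbol{\theta}_{ij}) = \mathbf{a}_k^\top \boldsymbol{\theta}_{ij} - b_k, \qquad \boldsymbol{\theta}_{ij} = B\,\mathbf{x}_{ij} + \Gamma\,\mathbf{w}_j + \boldsymbol{\epsilon}_{ij}, $$

for student i in school j answering item k; augmenting the logistic likelihood with auxiliary variables (e.g. Pólya-Gamma-style schemes) is what makes the Gibbs conditional updates tractable.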

Joint Assessment of the Differential Item Functioning and Latent Trait Dimensionality of Students' National Tests [article]

Michela Gnaldi, Francesco Bartolucci, Silvia Bacci
2012 arXiv   pre-print
To this aim, we rely on an extended class of multidimensional latent class IRT models characterised by: (i) a two-parameter logistic parameterisation for the conditional probability of a correct response, (ii) latent traits represented through a random vector with a discrete distribution, and (iii) the inclusion of (uniform) DIF to account for students' gender and geographical area.  ...  and (ii) it is based on latent variables that have a discrete distribution.  ... 
arXiv:1212.0378v1 fatcat:vivpcvo3mrhnrpqbltimei2iea
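
In the two-parameter logistic parameterisation the snippet refers to, uniform DIF enters as a group-specific shift in item difficulty (a generic formulation, with g_i a dummy for gender or geographical area):

$$ \mathrm{logit}\,P(y_{ij} = 1 \mid \theta_i) = a_j\,(\theta_i - b_j - \delta_j g_i), $$

so item j functions differently across groups whenever \delta_j \neq 0, uniformly across the latent trait.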

StructVAE: Tree-structured Latent Variable Models for Semi-supervised Semantic Parsing

Pengcheng Yin, Chunting Zhou, Junxian He, Graham Neubig
2018 Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)  
STRUCTVAE models latent MRs not observed in the unlabeled data as tree-structured latent variables.  ...  Experiments on semantic parsing on the ATIS domain and Python code generation show that with extra unlabeled data, STRUCTVAE outperforms strong supervised models.  ...  tree-structured latent variables (Fig. 1).  ... 
doi:10.18653/v1/p18-1070 dblp:conf/acl/NeubigZYH18 fatcat:tfvfebqokve3dhszzhvreexkdu
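
The semi-supervised objective implied by the abstract can be sketched in standard semi-supervised VAE form (my notation): labeled utterance-MR pairs contribute a supervised term, and unlabeled utterances contribute an evidence lower bound with the tree-structured MR z as the latent variable,

$$ \mathcal{J} = \sum_{(x,z) \in \mathcal{L}} \log q_\phi(z \mid x) \;+\; \alpha \sum_{x \in \mathcal{U}} \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z) + \log p(z) - \log q_\phi(z \mid x)\big], $$

with the inference network q_\phi doubling as the semantic parser.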

Simple, Distributed, and Accelerated Probabilistic Programming [article]

Dustin Tran, Matthew Hoffman, Dave Moore, Christopher Suter, Srinivas Vasudevan, Alexey Radul, Matthew Johnson, Rif A. Saurous
2018 arXiv   pre-print
In particular, we distill probabilistic programming down to a single abstraction: the random variable.  ...  For both a state-of-the-art VAE on 64x64 ImageNet and Image Transformer on 256x256 CelebA-HQ, our approach achieves an optimal linear speedup from 1 to 256 TPUv2 chips.  ...  Random Variables Are All You Need: We outline probabilistic programs in Edward2. They require only one abstraction: a random variable.  ... 
arXiv:1811.02091v2 fatcat:gzfjqqs4ujfzllyskht4v3al64
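
To make "the random variable as the single abstraction" concrete, here is a minimal sketch in the style of the Edward2 library (hedged: the package layout may differ across versions, and this is not code from the paper):

```python
# In Edward2, a probabilistic program is an ordinary Python function;
# every modelling statement constructs a random variable.
import edward2 as ed

def linear_model(x):
    w = ed.Normal(loc=0., scale=1., name="w")   # latent slope
    b = ed.Normal(loc=0., scale=1., name="b")   # latent intercept
    # Random variables overload arithmetic, so they compose like tensors.
    return ed.Normal(loc=w * x + b, scale=0.1, name="y")
```

Because the model is just a function over random variables, the same program can be traced, conditioned, and sharded across accelerators, which is what the reported TPU speedups rely on.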
Showing results 1–15 out of 2,089 results