Cluster Analysis for Internet Public Sentiment in Universities by Combining Methods

Na Zheng, Jie Yu Wu
2018 International Journal of Recent Contributions from Engineering, Science & IT  
A clustering method based on the Latent Dirichlet Allocation and the VSM model to compute the text similarity is presented.  ...  The Latent Dirichlet Allocation subject models and the VSM vector space model weights strategy are used respectively to calculate the text similarity.  ...  The Latent Dirichlet Allocation is a kind of the most common model.  ... 
doi:10.3991/ijes.v6i3.9670 fatcat:nc7vv2aywbgnnoredlmyol23y4

Topic modeling

Hanna M. Wallach
2006 Proceedings of the 23rd international conference on Machine learning - ICML '06  
Dirichlet bigram language model.  ...  On two data sets, each of 150 documents, the new model exhibits better predictive accuracy than either a hierarchical Dirichlet bigram language model or a unigram topic model.  ...  Acknowledgments Thanks to Phil Cowans, David MacKay and Fernando Pereira for useful discussions. Thanks to Andrew Suffield for providing sparse matrix code.  ... 
doi:10.1145/1143844.1143967 dblp:conf/icml/Wallach06 fatcat:qm2xh73naveubnenm4li44xvkq

A Nonparametric Bayesian Technique for High-Dimensional Regression [article]

Subharup Guha, Veerabhadran Baladandayuthapani
2016 arXiv   pre-print
Poisson-Dirichlet processes are utilized to detect lower-dimensional latent clusters of covariates.  ...  This paper proposes a nonparametric Bayesian framework called VariScan for simultaneous clustering, variable selection, and prediction in high-throughput regression settings.  ...  The data are allowed to direct the choice between a class of PDPs and their special case, a Dirichlet process, for selecting a suitable allocation scheme for the covariates.  ... 
arXiv:1604.03615v1 fatcat:ve3qbix3ujfetfqfg6cfsphnuq

Hierarchical Latent Word Clustering [article]

Halid Ziya Yerebakan, Fitsum Reda, Yiqiang Zhan, Yoshihisa Shinagawa
2016 arXiv   pre-print
This paper presents a new Bayesian non-parametric model by extending the usage of Hierarchical Dirichlet Allocation to extract tree structured word clusters from text data.  ...  Acknowledgments We would like to thank Ferit Akova for insightful suggestions.  ...  Latent Dirichlet Allocation (LDA) [2] is an important milestone in this area. LDA uses a Dirichlet prior on Probabilistic Latent Semantic Indexing (PLSI) [4] to avoid overfitting.  ... 
arXiv:1601.05472v1 fatcat:4lzd3g6as5hkfaq6ydxssc2jye

Latent IBP Compound Dirichlet Allocation

Cedric Archambeau, Balaji Lakshminarayanan, Guillaume Bouchard
2015 IEEE Transactions on Pattern Analysis and Machine Intelligence  
We believe that our sampler is simpler than previously proposed samplers for sparse topic models.  ...  We are currently empirically evaluating the performance (perplexity, sparsity, number of topics for the nonparametric version) of the different benchmark data sets including the 20 Newsgroups and Reuters  ...  Latent IBP compound Dirichlet Allocation (LIDA) We obtain the latent IBP compound Dirichlet allocation (LIDA) model by replacing the Dirichlet prior in LDA by a truncated IBP compound Dirichlet prior.  ... 
doi:10.1109/tpami.2014.2313122 pmid:26353244 fatcat:hz323b2vtffz3ht65gcipmu2im

Nonparametric Variable Selection, Clustering and Prediction for High-Dimensional Regression [article]

Subharup Guha, Veerabhadran Baladandayuthapani
2016 arXiv   pre-print
The data are permitted to direct the choice of a suitable cluster allocation scheme, choosing between PDPs and their special case, a Dirichlet process.  ...  We propose an efficient, nonparametric framework for simultaneous variable selection, clustering and prediction in high-throughput regression settings with continuous or discrete outcomes, called VariScan  ...  The data are allowed to direct the choice between a class of PDPs and their special case, a Dirichlet process, for selecting a suitable allocation scheme for the covariates.  ... 
arXiv:1407.5472v3 fatcat:fjd7sy7rarbargn27vdlrchr4e

TOPIC MODELING IN CLINICAL REPORTS - A SURVEY

Ponmalar R, Ponnarasi D, Sangeetha A, Kingsy Grace R
2020 International journal of advanced information and communication technology  
Topic modeling is also a frequently used text-mining tool for discovery of hidden semantic structures in a text body.  ...  It may be loosely characterized as the process of analyzing text to extract information that is useful for particular purposes.  ...  The machine learning tool Latent Dirichlet Allocation (LDA) is used for topic modeling.  ... 
doi:10.46532/ijaict-2020002 fatcat:6wrgf4l6bbbmben345oxiipnqm

Supervised Bayesian Statistical Learning to Identify Prognostic Risk Factor Patterns from Population Data

Colin J Crooks
2020 Studies in Health Technology and Informatics  
Current methods for building risk models assume averaged uniform effects across populations.  ...  Bayesian statistical learning model - Survival Supervised Topic Modelling Blei et al developed the initial Latent Dirichlet allocation method for categorising documents based on associating word frequency  ...  To reduce this an asymmetric prior for document level topic distributions was learnt from the word topic allocations [5].  ... 
doi:10.3233/shti200195 pmid:32570419 fatcat:z5hokogcerct7c5zpnn3thwc7m

Predicting Phenotypes from Brain Connection Structure [article]

Subharup Guha, Rex Jung, David Dunson
2022 arXiv   pre-print
A spike-and-slab prior for the cluster predictors strikes a balance between regression model parsimony and flexibility, resulting in improved inferences and test case predictions.  ...  The Bayesian Connectomics (BaCon) model class utilizes Poisson-Dirichlet processes to find a lower-dimensional, bidirectional (covariate, subject) pattern in the adjacency matrix.  ...  Latent vector elements. The PDP prior specification is completed by a base distribution in {0, 1} n for each binary latent vector.  ... 
arXiv:1910.02506v3 fatcat:sj6d4ezkrjdujhe2lvil36crk4

Latent Dirichlet learning for document summarization

Ying-Lang Chang, Jen-Tzung Chien
2009 2009 IEEE International Conference on Acoustics, Speech and Signal Processing  
The sentence-based latent Dirichlet allocation (SLDA) is accordingly established for document summarization.  ...  This paper presents a new hierarchical representation of words, sentences and documents in a corpus, and infers the Dirichlet distributions for latent topics and latent themes in word level and sentence  ...  The sentences with high index values were selected. Latent Dirichlet allocation More attractively, Blei et al. [1] presented the latent Dirichlet allocation (LDA) for document representation.  ... 
doi:10.1109/icassp.2009.4959927 dblp:conf/icassp/ChangC09 fatcat:i7iflhvfonbrrcwof5sbj7d6wq

Partially labeled topic models for interpretable text mining

Daniel Ramage, Christopher D. Manning, Susan Dumais
2011 Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '11  
In this paper, we present two new partially supervised generative models of labeled text, Partially Labeled Dirichlet Allocation (PLDA) and the Partially Labeled Dirichlet Process (PLDP).  ...  Effective text mining in this setting requires models that can flexibly account for the textual patterns that underlie the observed labels while still discovering unlabeled topics.  ...  j ∈ Λ d when selecting a latent topic for each word.  ... 
doi:10.1145/2020408.2020481 dblp:conf/kdd/RamageMD11 fatcat:iksnzhsbofctffqfy43l6ma2d4

Topic Modeling: A Comprehensive Review

Pooja Kherwa, Poonam Bansal
2018 EAI Endorsed Transactions on Scalable Information Systems  
It includes classification hierarchy, Topic modelling methods, Posterior Inference techniques, different evolution models of latent Dirichlet allocation (LDA) and its applications in different areas of  ...  It is a statistical technique for revealing the underlying semantic structure in large collection of documents.  ...  ) and LDA (Latent Dirichlet Allocation).  ... 
doi:10.4108/eai.13-7-2018.159623 fatcat:lu6al57vp5aahbytyejhqrlzry

A Zero-Inflated Latent Dirichlet Allocation Model for Microbiome Studies

Rebecca A. Deek, Hongzhe Li
2021 Frontiers in Genetics  
In this paper, we introduce a zero-inflated Latent Dirichlet Allocation model (zinLDA) for sparse count data observed in microbiome studies. zinLDA builds on the flexible Latent Dirichlet Allocation model  ...  and allows for zero inflation in observed counts.  ...  ACKNOWLEDGMENTS We thank the participants of the American Gut Project for sharing their data.  ... 
doi:10.3389/fgene.2020.602594 pmid:33552122 pmcid:PMC7862749 fatcat:jmvkixywhvhdfd2426uy4zgodi

Semantic Pattern Detection in COVID-19 Using Contextual Clustering and Intelligent Topic Modeling

Pooja Kherwa, Poonam Bansal
2022 International Journal of E-Health and Medical Communications (IJEHMC)  
For intelligent topic modeling, semantic collocations using pointwise mutual information (PMI) and log frequency biased mutual dependency (LBMD) are selected and latent Dirichlet allocation is applied.  ...  For contextual clustering, three level weights at term level, document level, and corpus level are used with latent semantic analysis.  ...  modeling technique called Latent Dirichlet Allocation and Latent semantic analysis are used.  ... 
doi:10.4018/ijehmc.20220701.oa7 fatcat:cao3h5jmzvcvdfh63233quzzli

GPLDA: A Generalized Poisson Latent Dirichlet Topic Model

Ibrahim Bakari Bala, Mohd Zainuri
2019 International Journal of Advanced Computer Science and Applications  
The earliest modification of Latent Dirichlet Allocation (LDA) in terms of words or document attributes is by relaxing its exchangeability assumption via the Bag-of-word (BoW) matrix.  ...  Generalized Poisson Latent Dirichlet Allocation Model (GPLDA) The GPLDA assumes the same structure as LDA except for the change in document length distribution.  ...  After the selection of appropriate prior hyperparameters and for a document , a conditional distribution of topics with parameter is formed and it is assumed to be multinomially distributed from the Dirichlet  ... 
doi:10.14569/ijacsa.2019.0101253 fatcat:mxr5lh3r6fcubdt6ypgkzlqofy