Improving Document Clustering for Short Texts by Long Documents via a Dirichlet Multinomial Allocation Model [chapter]

Yingying Yan, Ruizhang Huang, Can Ma, Liyang Xu, Zhiyuan Ding, Rui Wang, Ting Huang, Bowei Liu
2017 Lecture Notes in Computer Science  
the number of clusters; 3) separates discriminative words from irrelevant words for long documents to obtain high quality structural knowledge.  ...  In this paper, we propose a novel model, namely DDMAfs, which 1) improves the clustering performance of short texts by sharing structural knowledge of long documents to short texts; 2) automatically identifies  ...  In [9], Jin et al. proposed the Dual Latent Dirichlet Allocation (DLDA) model which enhances short text clustering by incorporating auxiliary long texts.  ... 
doi:10.1007/978-3-319-63579-8_47 fatcat:eqkjrwpez5cfxdchzo4skwdltq

TOPIC MODELING IN CLINICAL REPORTS - A SURVEY

Ponmalar R, Ponnarasi D, Sangeetha A, Kingsy Grace R
2020 International journal of advanced information and communication technology  
Topic modeling is a form of text mining, a way of identifying patterns in a corpus. The topics produced by topic modeling techniques are clusters of similar words that frequently occur together.  ...  Text mining is a process of converting unstructured data into meaningful data.  ...  Multilingual Topic Models for Unaligned Text [8] Latent Dirichlet Allocation models (LDA) quality Improves quality of translation 8.  ... 
doi:10.46532/ijaict-2020002 fatcat:6wrgf4l6bbbmben345oxiipnqm

Covert Video Classification by Codebook Growing Pattern

Liang Du, Haitao Lang, Ying-Li Tian, Chiu C. Tan, Jie Wu, Haibin Ling
2016 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)  
In this paper, we propose a novel descriptor, codebook growing pattern (CGP), which is derived from latent Dirichlet allocation (LDA) over optical flows.  ...  The task is very challenging due to the large intra-class variation and between-class similarity, since there is no limit in the content of a covert video and it may share very similar content with a regular  ...  Codebook Growing Pattern from Latent Dirichlet Allocation The HOF representation captures only short range motion information between consecutive video frames.  ... 
doi:10.1109/cvprw.2016.178 dblp:conf/cvpr/DuLTTWL16 fatcat:3nnqc32s3vcgli36xjehwf23pm

Summarizing Text for Indonesian Language by Using Latent Dirichlet Allocation and Genetic Algorithm

Silvia, Pitri Rukmana, Vivi Regina Aprilia, Derwin Suhartono, Rini Wongso, Meiliana
2014 Proceeding of the Electrical Engineering Computer Science and Informatics  
The algorithm is based on sentence features scoring by using Latent Dirichlet Allocation and Genetic Algorithm for determining sentence feature weights.  ...  Best F-measure value is 0.556926 (with precision of 0.53448 and recall of 0.58134) and summary ratio of 30%.  ...  of Latent Dirichlet Allocation and some modifications.  ... 
doi:10.11591/eecsi.v1.364 fatcat:w7or6bi7tbckjnznikafcelcqq

The CIST Summarization System at TAC 2011

Hongyan Liu, Ping'an Liu, Wei Heng, Lei Li
2011 Text Analysis Conference  
We introduce an extractive multi-document summarization method based on hierarchical topic model of hierarchical Latent Dirichlet Allocation (hLDA) and sentence compression. hLDA is a representative generative  ...  probabilistic model, which not only can mine latent topics from a large amount of discrete text data, but also can organize these topics into a hierarchy to achieve a deeper semantic analysis.  ...  Hierarchical Latent Dirichlet Allocation (hLDA) (D.  ... 
dblp:conf/tac/LiuLHL11 fatcat:ddzeevxm3ncy5f6shulifyc33i

Review and Comparative Analysis of Topic Identification Techniques

Deepti Sehrawat, Maharshi Dayanand University, Rohtak, Haryana (India)
2019 International Journal of Advanced Trends in Computer Science and Engineering  
Existing solutions include text clustering, latent semantic approach, probabilistic latent semantics approach, latent Dirichlet allocation approach, association rule-based approaches, document clustering  ...  This area is of great interest among researchers as its applications in the real world are very wide. This paper presents a review of topic identification techniques.  ...  Latent Dirichlet Allocation Approaches Authors in [15] proposed a model that addresses the limitations of pLSA and gives a new model known as Latent Dirichlet Allocation (LDA) model.  ... 
doi:10.30534/ijatcse/2019/71832019 fatcat:g46lyzxg7jcehlci4r62nxbtpe

Mining Divergent Opinion Trust Networks through Latent Dirichlet Allocation

N. Dokoohaki, M. Matskin
2012 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining  
To acquire feature sets from topics discussed in a discussion we use a very successful topic modeling technique, namely Latent Dirichlet Allocation (LDA).  ...  While the focus of trust research has been mainly on defining and modeling various notions of social trust, less attention has been given to modeling opinion trust.  ...  Latent Dirichlet Allocation models each group of tweets as a mixture of latent topics. Figure 3 shows the graphical notation for LDA. By default of the library, we have used multi-grams [31].  ... 
doi:10.1109/asonam.2012.158 dblp:conf/asunam/DokoohakiM12 fatcat:jqa5jbgrhfdhdo7wafjouevvny

Characterizing Artificial Intelligence Applications in Cancer Research using Latent Dirichlet Allocation (Preprint)

Bach Xuan Tran, Carl A. Latkin, Noha Sharafeldin, Katherina Nguyen, Giang Thu Vu, Wilson W.S. Tam, Ngai-Man Cheung, Huong Lan Thi Nguyen, Cyrus S.H. Ho, Roger C.M. Ho
2019 JMIR Medical Informatics  
Latent Dirichlet Allocation was used for classifying papers into corresponding topics.  ...  Noticeably, this classification has revealed topics examining the incremental effectiveness of AI applications, the quality of life, and functioning of patients receiving these innovations.  ...  Table 5. Ten research topics classified by Latent Dirichlet Allocation.  ... 
doi:10.2196/14401 pmid:31573929 pmcid:PMC6774235 fatcat:v3cnn7gu5ndpdj45c2qq4tedxi

Topic Modeling Based Extractive Text Summarization

2020 VOLUME-8 ISSUE-10, AUGUST 2019, REGULAR ISSUE  
In this paper, we propose a novel method to summarize a text document by clustering its contents based on latent topics produced using topic modeling techniques and by generating extractive summaries for  ...  each of the identified text clusters.  ...  the dataset. 2) Generation of Latent Dirichlet Allocation model: Latent Dirichlet Allocation (LDA) [11] is a generative probabilistic model used to determine the abstract topics that are present in  ... 
doi:10.35940/ijitee.f4611.049620 fatcat:to2izh7xb5aurgkgqhka2qqhku

Latent Dirichlet Allocation (LDA) and Topic modeling: models, applications, a survey [article]

Hamed Jelodar, Yongli Wang, Chi Yuan, Xia Feng, Xiahui Jiang, Yanchao Li, Liang Zhao
2018 arXiv   pre-print
There are various methods for topic modeling, among which Latent Dirichlet allocation (LDA) is one of the most popular.  ...  Topic modeling is one of the most powerful techniques in text mining for data mining, latent data discovery, and finding relationships among data, text documents.  ...  Acknowledgements This article has been awarded by the National Natural Science Foundation of China (61170035, 61272420, 81674099, 61502233), the Fundamental Research Fund for the Central Universities (  ... 
arXiv:1711.04305v2 fatcat:jzsx6owjyjfo3gkbohrc2ggkzq

Using Hashtag Graph-Based Topic Model to Connect Semantically-Related Words Without Co-Occurrence in Microblogs

Yuan Wang, Jie Liu, Yalou Huang, Xia Feng
2016 IEEE Transactions on Knowledge and Data Engineering  
., Latent Dirichlet Allocation [1] and Latent Semantic Analysis [2]) fail to learn high quality topic structures. Tweets are always showing up with rich user-generated hashtags.  ...  The shortness and informality of tweets lead to extremely sparse vector representations with a large vocabulary.  ...  Latent Dirichlet Allocation including Hashtags as Words, which does the same as Tag-LDA [25] does and treats each hashtag as a word in tweets. • LDAHGW, the variant Latent Dirichlet Allocation including  ... 
doi:10.1109/tkde.2016.2531661 fatcat:h4mtgomevbewfnr4c3kyfbblsq

GeoFolk

Sergej Sizov
2010 Proceedings of the third ACM international conference on Web search and data mining - WSDM '10  
We systematically evaluate the solution on a subset of Flickr data, in characteristic scenarios of tag recommendation, content classification, and clustering.  ...  We describe an approach for multi-modal characterization of social media by combining text features (e.g. tags as a prominent example of short, unstructured text labels) with spatial knowledge (e.g. geotags  ...  coined Latent Dirichlet Allocation (LDA) [4].  ... 
doi:10.1145/1718487.1718522 dblp:conf/wsdm/Sizov10 fatcat:uzdekir4bnbftcwvizbqnimi4q

Latent Dirichlet Allocation Based Semantic Clustering of Heterogeneous Deep Web Sources

Umara Noor, Ali Daud, Ayesha Manzoor
2013 2013 5th International Conference on Intelligent Networking and Collaborative Systems  
Allocation (LDA) for modeling content representative of deep web databases.  ...  In this paper, we propose a novel method DWSemClust to cluster deep web databases based on the semantic relevance found among deep web forms by employing a generative probabilistic model Latent Dirichlet  ...  For both the issues we propose a clustering approach based on directed probabilistic topic model i.e. latent dirichlet allocation (LDA).  ... 
doi:10.1109/incos.2013.28 dblp:conf/incos/NoorDM13 fatcat:xm3uq63b5zcnljam2uyorzkrxy

Topic Modeling for Short Texts via Word Embedding and Document Correlation

Feng Yi, Bo Jiang, Jianjun Wu
2020 IEEE Access  
In fact, each short text usually contains a limited number of topics, and understanding the semantic content of short text requires relevant background knowledge.  ...  Meanwhile, the method employs the clustering mechanism under document-to-topic distributions during the topic inference by using Gibbs Sampling Dirichlet Multinomial Mixture model.  ...  The former integrate a latent feature model into Latent Dirichlet Allocation model [7], and the latter rely on a one topic-per-document Dirichlet Multinomial Mixture model [34].  ... 
doi:10.1109/access.2020.2973207 fatcat:qrmkhfoxqjb4bcutcyicuwvpiy

Automatic Detection of the Topics in Customer Complaints with Artificial Intelligence

Sevinç İLHAN OMURCA, Ekin EKİNCİ, Enes YAKUPOĞLU, Emirhan ARSLAN, Berkay ÇAPAR
2021 Balkan Journal of Electrical and Computer Engineering  
Index Terms: Topic modelling, latent Dirichlet allocation, Gibbs sampling, Gibbs sampling for Dirichlet multinomial mixture, natural language processing.  ...  In this article, online text based customer complaints are analyzed with Latent Dirichlet Allocation (LDA), GenSim LDA, Mallet LDA and Gibbs Sampling for Dirichlet Multinomial Mixture model (GSDMM) and  ...  ACKNOWLEDGMENT Thanks to TÜBİTAK for their support to the project numbered 1919B011902805 within the scope of TÜBİTAK-2209-A University Students Research Projects Support Program 2019/2.  ... 
doi:10.17694/bajece.832274 fatcat:jhruhjbkufgvdpb33e26xihc3y