A sentence level probabilistic model for evolutionary theme pattern mining from news corpora

Shizhu Liu, Yuval Merhav, Wai Gen Yee, Nazli Goharian, Ophir Frieder
2009 Proceedings of the 2009 ACM symposium on Applied Computing - SAC '09  
To produce a more descriptive representation of the theme pattern, we not only give new representations of sentences and themes with named entities, but we also propose a sentence-level probabilistic model  ...  Some recent topic model-based methods have been proposed to discover and summarize the evolutionary patterns of themes in temporal text collections.  ...  CONCLUSIONS In this paper, we propose a sentence level probabilistic model to discover evolutionary theme patterns from temporal text collections.  ... 
doi:10.1145/1529282.1529672 dblp:conf/sac/LiuMYGF09 fatcat:zararnt37jgi7l2ooylgmhevdu

Tracking Events Using Time-dependent Hierarchical Dirichlet Tree Model [chapter]

Rumeng Li, Tao Wang, Xun Wang
2015 Proceedings of the 2015 SIAM International Conference on Data Mining  
Our model can aptly detect different levels of topic information across the corpus, and this structure is further used for sentence selection.  ...  Timeline Generation aims at summarizing news from different epochs and telling readers how an event evolves. It is a new challenge that combines salience ranking with novelty detection.  ...  Recently, Bayesian topic models such as LDA [6] or HDP [30] have shown their power in text mining owing to their clear probabilistic interpretation.  ... 
doi:10.1137/1.9781611974010.62 dblp:conf/sdm/LiWW15 fatcat:qv3faiqj7ndczc4fdkevdizyzy

Discovery of interactive graphs for understanding and searching time-indexed corpora

Ilija Subašić, Bettina Berendt
2009 Knowledge and Information Systems  
In this paper, we formulate the problem of discovering such stories as Evolutionary Theme Pattern Discovery, Summary and Exploration (ETP3).  ...  We propose a method and a visualisation tool for solving ETP3 by understanding, searching and interacting with such stories and their underlying documents.  ...  Temporal text mining Mei and Zhai [35] described evolutionary theme pattern discovery as one key subproblem of temporal text mining.  ... 
doi:10.1007/s10115-009-0227-x fatcat:qv42q347yfetxnqvsak2z3wtei

Temporal web dynamics and its application to information retrieval

Kira Radinsky, Fernando Diaz, Susan Dumais, Milad Shokouhi, Anlei Dong, Yi Chang
2013 Proceedings of the sixth ACM international conference on Web search and data mining - WSDM '13  
Operators for manipulating streams of interest: Filter, Link, Visualize. E. Adar, M. Dontcheva, J. Fogarty and D. Weld. Zoetrope: Interacting with the ephemeral web.  ...  Corpora: need to develop standard corpora for sociotemporal modeling.  ...  Using relevant documents for future event prediction [Amodeo, Blanco, Brefeld, CIKM '11]: based on publication dates of results, build a probabilistic model. Future Event Retrieval from social media  Predicting  ... 
doi:10.1145/2433396.2433500 dblp:conf/wsdm/RadinskyDDSDC13 fatcat:v3r5yqpnwjcezm35ux4v5eiyse

Thirty years of artificial intelligence in medicine (AIME) conferences: A review of research themes

Niels Peek, Carlo Combi, Roque Marin, Riccardo Bellazzi
2015 Artificial Intelligence in Medicine  
Conclusions: There has been a major shift from knowledge-based to data-driven methods while the interest for other research themes such as uncertainty management, image and signal processing, and natural  ...  Results: We identified 30 research topics across 12 themes. AIME was dominated by knowledge engineering research in its first decade, while machine learning and data mining prevailed thereafter.  ...  First, NLP researchers developed techniques for text mining, i.e., discovering new knowledge from text corpora in the form of previously unknown patterns and relations between concepts [40] .  ... 
doi:10.1016/j.artmed.2015.07.003 pmid:26265491 fatcat:bnzr2hwtifbd7dckguz5mgjhmq

Literature Explorer: effective retrieval of scientific documents through nonparametric thematic topic detection

Shaopeng Wu, Youbing Zhao, Farzad Parvinzamir, Nikolaos Th. Ersotelos, Hui Wei, Feng Dong
2019 The Visual Computer  
We propose a novel topic mining method that is able to uncover "thematic topics" from a scientific corpus.  ...  Comparisons are also made against the outcomes from the traditional topic modelling methods.  ...  Most of these methods are statistical based to mine text patterns and themes from a document corpus.  ... 
doi:10.1007/s00371-019-01721-7 fatcat:rgz3z4eiojbkdfia4xtdzyjl5a

Regional variation in probabilistic grammars: A multifactorial study of the English dative alternation

Melanie Röthlisberger, Benedikt Szmrecsanyi, Jason Grafmiller
2018 Zenodo  
More precisely, the research is concerned with the probabilistic constraints that influence the choice between a ditransitive (e.g. "Mary gives John the apple") and a prepositional dative (e.g.  ...  This thesis grew out of the project "Exploring probabilistic grammar(s) in varieties of English around the World" and explores the underlying constraints that shape syntactic variation in new varieties  ...  The test statistics point out that, overall, a model with a five-level predictor performs better than a model with a two-level predictor for theme and recipient complexity.  ... 
doi:10.5281/zenodo.4022349 fatcat:t5guypdb7jayznjemruwttlqei

Topic Modeling: A Comprehensive Review

Pooja Kherwa, Poonam Bansal
2018 EAI Endorsed Transactions on Scalable Information Systems  
Topic modelling is the new revolution in text mining. It is a statistical technique for revealing the underlying semantic structure in large collections of documents.  ...  After analysing approximately 300 research articles on topic modeling, a comprehensive survey on topic modelling has been presented in this paper.  ... 
doi:10.4108/eai.13-7-2018.159623 fatcat:lu6al57vp5aahbytyejhqrlzry

Intelligent and fuzzy systems applied to language & knowledge engineering

David Pinto, Vivek Singh
2019 Journal of Intelligent & Fuzzy Systems  
The call for papers of this special issue received an overwhelming response from the community.  ...  Experiments for 7 classifiers and 4 methods of linear regression on Russian Readability corpus demonstrated that ranking textbooks for native speakers is a much more difficult task than ranking examination  ...  Rodríguez-González et al. in their paper "Frequent Similar Pattern Mining using Non Boolean Similarity Functions" extend the similar frequent pattern mining by allowing the use of non Boolean similarity  ... 
doi:10.3233/jifs-179006 fatcat:hs76dvlsfnbpngmglwuq6aemda

The Voices of European Law: Legislators, Judges and Law Professors

Arthur Dyevre, Monika Glavina, Michal Ovádek
2021 German Law Journal  
articles from a leading EU law journal.  ...  Applying an unsupervised machine learning technique known as probabilistic topic modelling, we find that economic integration remains the focus of EU law, but that scholars tend to emphasize rights issues  ...  Methodology Our main text-mining method is known as probabilistic topic modelling. 28 Probabilistic topic modelling has been developed for the purpose of discovering and annotating large archives of documents  ... 
doi:10.1017/glj.2021.47 fatcat:75qnryd62jhu3fpxakwvnuz6c4

Scalable Topical Phrase Mining from Text Corpora [article]

Ahmed El-Kishky, Yanglei Song, Chi Wang, Clare Voss, Jiawei Han
2014 arXiv   pre-print
Our solution combines a novel phrase mining framework to segment a document into single and multi-word phrases, and a new topic model that operates on the induced document partition.  ...  While most topic modeling algorithms model text corpora with unigrams, human interpretation often relies on inherent grouping of terms into phrases.  ...  Sentence-LDA, is a generative model with an extra level generative hierarchy that assigns the same topic to all the words in a single sentence [13] .  ... 
arXiv:1406.6312v2 fatcat:umrmdntoabhntf4knzkvflywji

TextFlow: Towards Better Understanding of Evolving Topics in Text

Weiwei Cui, Shixia Liu, Li Tan, Conglei Shi, Yangqiu Song, Zekai Gao, Huamin Qu, Xin Tong
2011 IEEE Transactions on Visualization and Computer Graphics  
In this paper, we introduce TextFlow, a seamless integration of visualization and topic mining techniques, for analyzing various evolution patterns that emerge from multiple topics.  ...  Then a coherent visualization that consists of three new visual components is designed to convey complex relationships between them.  ...  ACKNOWLEDGMENTS The authors would like to thank Stephen Lin for proofreading the paper, Kwan-Liu Ma for his support, and the anonymous reviewers for their valuable comments.  ... 
doi:10.1109/tvcg.2011.239 pmid:22034362 fatcat:2xyiw4hxlbhrpfq7aj7kyfm4mm

Understanding text corpora with multiple facets

Lei Shi, Furu Wei, Shixia Liu, Li Tan, Xiaoxiao Lian, Michelle X. Zhou
2010 2010 IEEE Symposium on Visual Analytics Science and Technology  
In this paper, we propose a data model that can be used to represent most of the text corpora.  ...  To help people discover evolutionary and correlation patterns, we also develop several visual interaction methods that allow people to interactively analyze text by one or more facets.  ...  The authors also like to thank Xiaohua Sun from Tongji University, for her suggestions in the visualization and interaction design.  ... 
doi:10.1109/vast.2010.5652931 dblp:conf/ieeevast/ShiWLTLZ10 fatcat:6cnm7lp7uja65owcvho34sky44

Scalable topical phrase mining from text corpora

Ahmed El-Kishky, Yanglei Song, Chi Wang, Clare R. Voss, Jiawei Han
2014 Proceedings of the VLDB Endowment  
Our solution combines a novel phrase mining framework to segment a document into single and multi-word phrases, and a new topic model that operates on the induced document partition.  ...  While most topic modeling algorithms model text corpora with unigrams, human interpretation often relies on inherent grouping of terms into phrases.  ...  Sentence-LDA, is a generative model with an extra level generative hierarchy that assigns the same topic to all the words in a single sentence [13] .  ... 
doi:10.14778/2735508.2735519 fatcat:l5dgrmk3sngg3aghbnyrbahmli

A Probabilistic Approach in Historical Linguistics Word Order Change in Infinitival Clauses: from Latin to Old French [article]

Olga Scrivner
2020 arXiv   pre-print
Finally, I present a three-stage probabilistic model of word order change, which also conforms to traditional language change patterns.  ...  For this investigation, the data are extracted from annotated corpora spanning several centuries of Latin and Old French and from additional resources created by using computational linguistic methods.  ...  To evaluate the accuracy of the new trained model, I have selected 50 sentences from Gregory of Tours, a text that represents a different chronological period than the trained model.  ... 
arXiv:2011.08262v1 fatcat:ror5javry5fjvmqxjgvbfiuxf4