
Exploration of textual document archives using a fuzzy hierarchical clustering algorithm in the GAMBAL system

Vicenç Torra, Sadaaki Miyamoto, Sergi Lanau
2005 Information Processing & Management  
In this work we present an extension of the Gambal system for clustering and visualization of documents based on fuzzy clustering techniques.  ...  The Internet, together with the large amount of textual information available in document archives, has increased the relevance of information retrieval related tools.  ...  Acknowledgements Partial support by Generalitat de Catalunya (AGAUR, 2002XT 00111 and 2002BEAI400017) and by Grant-in-Aid for Scientific Research (c), Japan Society for the Promotion of Science no. 13680475  ... 
doi:10.1016/j.ipm.2004.01.001 fatcat:qaobvpcycbd7bbwih3s4yvt424

Learning a taxonomy from a set of text documents

Mari-Sanna Paukkeri, Alberto Pérez García-Plaza, Víctor Fresno, Raquel Martínez Unanue, Timo Honkela
2012 Applied Soft Computing  
The taxonomy is obtained by a hierarchical approach of Self-Organizing Map clustering of the concept definition documents.  ...  The third approach is the traditional tf-idf weighting scheme with commonly used rule-based stemming. Experiments are conducted for the English, Finnish, and Spanish languages.  ...  Somewhat similar approach is the growing Self-Organizing Map [48] and further the growing hierarchical Self-Organizing Map [49].  ... 
doi:10.1016/j.asoc.2011.11.009 fatcat:qxond2lfpzhehm3nt7ugcsqsge

Construction of supervised and unsupervised learning systems for multilingual text categorization

Chung-Hong Lee, Hsin-Chang Yang
2009 Expert systems with applications  
We selected support vector machines (SVM) as representative of supervised techniques as well as latent semantic indexing (LSI) and self-organizing maps (SOM) techniques as our selective ones of unsupervised  ...  , retrieve and categorize relevant information, in whatever language and form it may have been stored.  ...  Self-organizing maps (SOM) techniques provide document clustering and word clustering methods to group similar texts.  ... 
doi:10.1016/j.eswa.2007.12.052 fatcat:rwqnbwz2lra33o4nwvpge6peqi

Mapping weblog communities [article]

Juan-J. Merelo-Guervos, Beatriz Prieto, Fatima Rateb, Fernando Tricas
2003 arXiv   pre-print
In this paper, we will map a network of websites using Kohonen's self-organizing map (SOM), a neural-net like method generally used for clustering and visualization of complex data sets.  ...  Websites of a particular class form increasingly complex networks, and new tools are needed to map and understand them. A way of visualizing this complex network is by mapping it.  ...  Acknowledgements This paper has been funded in part by project TIC2003-09481-C04, of the Spanish ministry of science and technology, and a project awarded the Quality and Innovation department of the University  ... 
arXiv:cs/0312047v1 fatcat:os44kh7nkzbupbtshp7wqoxzmm

A new technique for building maps of large scientific domains based on the cocitation of classes and categories

Félix Moya-Anegón, Benjamín Vargas-Quesada, Victor Herrero-Solana, Zaida Chinchilla-Rodríguez, Elena Corera-Álvarez, Francisco J. Munoz-Fernández
2004 Scientometrics  
We propose a new technique that uses thematic classification (classes and categories) as entities of cocitation and units of measure, and demonstrate the viability of this methodology through the representation  ...  The main features of the maps obtained are discussed, and proposals are made for future improvements and applications.  ...  Lin, Soergel and Marchionini 29 develop a Self-Organizing Map (SOM) that represents the semantic relationships among documents and can be used as a bibliographic interface for the retrieval of online information  ... 
doi:10.1023/b:scie.0000037368.31217.34 fatcat:vw7f6v62mrcrtgnv3h4jqldna4

Content-Based Clustering for Tag Cloud Visualization

Arkaitz Zubiaga, Alberto P. García-Plaza, Víctor Fresno, Raquel Martínez
2009 2009 International Conference on Advances in Social Network Analysis and Mining  
We present a methodology to obtain and visualize a cloud of related tags based on the use of self-organizing maps, and where the relations among tags are established taking into account the textual content  ...  of tagged documents.  ...  ACKNOWLEDGMENT This work has been supported by the Research Network MAVIR(S-0505/TIC-0267), and by the Spanish Ministry of Science and Innovation project QEAVis-Catiex(TIN2007-67581-C02-01).  ... 
doi:10.1109/asonam.2009.19 dblp:conf/asunam/ZubiagaGFM09 fatcat:o5amfy7g6rgihl6hgqlqso42se


Mehran Sahami, Salim Yusufali, Michelle Q. W. Baldonaldo
1998 Proceedings of the third ACM conference on Digital libraries - DL '98  
It employs a combination of technologies that takes the results of queries to networked information sources and, in real-time, automatically retrieve, parse and organize these documents into coherent categories  ...  It also makes use of Bayesian classification techniques to classify new documents within an existing categorization scheme.  ...  Additional thanks go to Marti Hearst and the anonymous reviewers for providing valuable comments on this paper. We are indebted to Scott Hassan who implemented the web crawler module of SONIA.  ... 
doi:10.1145/276675.276697 dblp:conf/dl/SahamiYB98 fatcat:aifcsattj5bsbofxlzs7ktlske

A Self-Organizing Map Based Knowledge Discovery for Music Recommendation Systems [chapter]

Shankar Vembu, Stephan Baumann
2005 Lecture Notes in Computer Science  
In this paper, we present an approach for musical artist recommendation based on Self-Organizing Maps (SOMs) of artist reviews from Amazon web site.  ...  The Amazon reviews for the artists are obtained using the Amazon web service interface and stored in the form of textual documents that form the basis for the formation of the SOMs.  ...  Self-Organizing Maps The Self-Organizing Map (SOM) is an unsupervised learning algorithm used to visualize and interpret large high-dimensional data sets.  ... 
doi:10.1007/978-3-540-31807-1_9 fatcat:hang3gkd55fg5lg3ajm2c365ey

Global output of research on epidermal parasitic skin diseases from 1967 to 2017

Waleed M. Sweileh
2018 Infectious Diseases of Poverty  
Documents that specifically and explicitly discuss EPSD in animals, aquatic organisms, and birds were excluded. Results: In total, 4186 documents were retrieved.  ...  Methods: A bibliometric analysis methodology was used. The Scopus database was used to retrieve documents about EPSD for the study period .  ...  Results Types, languages, and subject areas of retrieved documents In total, 4186 documents were retrieved.  ... 
doi:10.1186/s40249-018-0456-x pmid:30078380 pmcid:PMC6091169 fatcat:dp7gr54sqfbf5icv5bevlfvaxm

Text mining tools [chapter]

A. Zanasi
2005 Text Mining and its Applications to Intelligence, CRM and Knowledge Management  
PolyAnalyst for Text™ performs semantic text analysis, record coding, identification and visualization of patterns and clusters of information, automated or manual taxonomy creation and editing, taxonomy-based  ...  forced or self-learning, IT requirements PolyAnalyst for  ...  Surfing among clusters, documents, or terms and propagating their effects. Use of a dynamic hierarchical cluster browser. Customizing term start lists, stop lists, and synonym lists.  ... 
doi:10.2495/978-1-85312-995-7/21 fatcat:2w5273jjobghfmdgrxnjal22xy

Multilingual document mining and navigation using self-organizing maps

Hsin-Chang Yang, Han-Wei Hsiao, Chung-Hong Lee
2011 Information Processing & Management  
In this approach, a self-organizing map is constructed to train each set of monolingual Web pages and obtain two feature maps, which reveal the relationships among Web pages and thematic keywords respectively  ...  However, such directories are generally constructed manually and may have disadvantages of narrow coverage and inconsistency.  ...  When a set of monolingual Web pages is input, we will first cluster them using self-organizing map algorithm.  ... 
doi:10.1016/j.ipm.2009.12.003 fatcat:lg26lu7menfkbl6qhsywpbpngy

Mapping urban tourism issues: analysis of research perspectives through the lens of network visualization

Marjan Hocevar, Tomaz Bartol
2021 International Journal of Tourism Cities  
Clustering is used to evaluate information retrieval (inclusivity or selectivity of the search query), publication patterns (journal articles), author keywords, terminology and to identify the respective  ...  More consistent use of terms would benefit authors in the field of urban tourism when searching for references (information retrieval) and, as a consequence, would allow better integration of the field  ...  In retrieval with the NQ, half of all articles were mapped to HLST.  ... 
doi:10.1108/ijtc-05-2020-0110 fatcat:vasuh5na55e6lgdhzfiss7gdca

Classifying Amharic webnews

Lars Asker, Atelach Alemu Argaw, Björn Gambäck, Samuel Eyassu Asfeha, Lemma Nigussie Habte
2009 Information retrieval (Boston)  
The first two sets of experiments investigated the use of Self-Organizing Maps (SOMs) for document classification.  ...  We discuss the issues of compiling and annotating a corpus of Amharic news articles from the Web. This corpus was then used in three sets of text classification experiments.  ...  Gashaw Kebede, Kibur Lisanu, and Meshesha Legesse at Addis Ababa University; and Gunnar Eriksson, Fredrik Olsson, and Dr. Magnus Sahlgren at the Swedish Institute of Computer Science.  ... 
doi:10.1007/s10791-008-9080-x fatcat:jexrlbut4jbsbaujngj4yq4zoe

MultiLingMine 2016: Modeling, Learning and Mining for Cross/Multilinguality [chapter]

Dino Ienco, Mathieu Roche, Salvatore Romeo, Paolo Rosso, Andrea Tagarelli
2016 Lecture Notes in Computer Science  
The paper presents a new framework for discrimination of Latin and Italian languages. The first phase maps the text in the given language into a uniformly coded text.  ...  This segmenter is based on Rhetorical Structure Theory (RST) for Spanish, and uses lexical and syntactic information to translate rules valid for Spanish into rules for Catalan.  ...  Zagorka Brodić, professor of French and Serbo-Croatian languages, for the helpful discussions about Latin and Italian languages.  ... 
doi:10.1007/978-3-319-30671-1_83 fatcat:znq74oljzfefrfhzdkpphzekz4

Towards a Linguistic Stylometric Model for the Authorship Detection in Cybercrime Investigations

Abdulfattah Omar, Aldawsari Bader Deraan
2019 International Journal of English Linguistics  
It is also clear that the use of a self-organizing map (SOM) led to better clustering performance because of its capacity to integrate two different linguistic levels for each author profile.  ...  The data analyzed is from a corpus of 12,240 tweets derived from 87 Twitter accounts. A self-organizing map (SOM) model was used to classify input patterns in the tweets that shared common features.  ...  For classification purposes, the self-organizing map (SOM) model is used.  ... 
doi:10.5539/ijel.v9n5p182 fatcat:2n5omcn7vfhwvh4js3thporcs4