Development of a multilingual text mining approach for knowledge discovery in patents

Chung-Hong Lee, Hsin-Chang Yang, Yi-Ju Li
2009 2009 IEEE International Conference on Systems, Man and Cybernetics  
These multilingual patent documents could then be mapped into the semantic vector space for evaluating their similarity by means of text clustering techniques.  ...  In this paper we describe our work on developing a novel technique for discovery of implicit knowledge about patents from multilingual patent information sources.  ...  among multilingual patent documents by means of text clustering techniques.  ... 
doi:10.1109/icsmc.2009.5345953 dblp:conf/smc/LeeYL09 fatcat:dabjuw7frveg7cm2s2tjkngaha

Towards Multilingual Information Discovery through a SOM based Text Mining approach

Chung-Hong Lee, Hsin-Chang Yang
2000 Pacific Rim International Conference on Artificial Intelligence  
This paper describes our approach for concept discovery from multilingual text collections through a text mining technique.  ...  The initial experiments show some interesting results and a couple of potential ways for future work towards the field of multilingual information discovery.  ...  For multilingual information discovery, we take advantage of the fact that the SOM based text clustering method is independent of languages used in the contents of the texts.  ... 
dblp:conf/pricai/LeeY00 fatcat:akjgyjwqpzdexmfs7oiusmmj4a

A multilingual text mining approach to web cross-lingual text retrieval

Rowena Chau, Chung-Hsing Yeh
2004 Knowledge-Based Systems  
Second, the multilingual concept-term relationships, in turn, are used to discover the conceptual content of the multilingual text, which is either a document containing potentially relevant information  ...  To enable concept-based cross-lingual text retrieval (CLTR) using multilingual text mining, our approach will first discover the multilingual concept-term relationships from linguistically diverse textual  ...  In terms of the functions they perform, CLTR facilitates multilingual information access while multilingual text mining enables knowledge discovery from multilingual texts.  ... 
doi:10.1016/j.knosys.2004.04.001 fatcat:xcvrtmnhkve67eauw6ohzjpky4

How Does Language Influence Documentation Workflow? Unsupervised Word Discovery Using Translations in Multiple Languages [article]

Marcely Zanon Boito, Aline Villavicencio, Laurent Besacier
2019 arXiv   pre-print
We translate the bilingual Mboshi-French parallel corpus (Godard et al. 2017) into four other languages, and we perform bilingual-rooted unsupervised word discovery.  ...  In this paper we investigate language-related impact in automatic approaches for computational language documentation.  ...  Table 1 shows some statistics for the produced Multilingual Mboshi parallel corpus. Bilingual Unsupervised Word Segmentation/Discovery Approach: We use the bilingual neural-based Unsupervised Word Segmentation  ... 
arXiv:1910.05154v1 fatcat:sdvmzcun5nejvkwxaz5daqjlw4

Unsupervised Spoken Term Discovery on Untranscribed Speech [article]

Man-Ling Sung
2020 arXiv   pre-print
Multilingual neural network with bottleneck layer is used for feature extraction.  ...  It can be further divided into two parts: Acoustic segment modelling (ASM) and unsupervised pattern discovery.  ...  Each session is treated as one document.  ... 
arXiv:2011.14060v1 fatcat:vqxrmzjq35codkddbza6fdd4a4

Breaking down language barriers through multilingual federated search

Abe Lederman, Walter Warnick, Brian Hitson, Lorrie Johnson, Marjorie M.K. Hlava
2011 Information Services and Use  
The WWS portal plays a leading role in bringing together the world's scientists to accelerate the discoveries needed to solve the planet's most pressing problems.  ...  Introduction: Discovery drives science.  ... 
doi:10.3233/isu-2010-0617 fatcat:rvvrpqcig5c2zbtinybnshhflu

Multilingual document mining and navigation using self-organizing maps

Hsin-Chang Yang, Han-Wei Hsiao, Chung-Hong Lee
2011 Information Processing & Management  
In this work, we will propose an approach that could automatically arrange multilingual Web pages into a multilingual Web directory to break the language barriers in Web navigation.  ...  Finally, a multilingual Web directory is constructed according to such associations.  ...  Two related clusters are merged into single cluster in multilingual hierarchy.  ... 
doi:10.1016/j.ipm.2009.12.003 fatcat:lg26lu7menfkbl6qhsywpbpngy

An Empirical Evaluation of Zero Resource Acoustic Unit Discovery [article]

Chunxi Liu, Jinyi Yang, Ming Sun, Santosh Kesiraju, Alena Rott, Lucas Ondel, Pegah Ghahremani, Najim Dehak, Lukas Burget, Sanjeev Khudanpur
2017 arXiv   pre-print
Acoustic unit discovery (AUD) is a process of automatically identifying a categorical acoustic unit inventory from speech and producing corresponding acoustic unit tokenizations.  ...  representations can be significantly improved by (i) performing linear discriminant analysis (LDA) in an unsupervised self-trained fashion, and (ii) leveraging resources of other languages through building a multilingual  ...  Table 1. AUD performance evaluated by NMI, same-different task, document classification and clustering on Switchboard.  ... 
arXiv:1702.01360v1 fatcat:3ydcq25fdvc2hpiimtepwfrheq

Editors' introduction special issue on multilingual knowledge management

Christopher C. Yang, Chih-Ping Wei, Hsinchun Chen
2008 Decision Support Systems  
centers and intelligence agencies, the Dark Web project has generated one of the largest databases in the world about extremist/terrorist-generated Internet contents (web sites, forums, and multimedia documents  ...  As a result, the target multilingual documents are indexed in the language-independent LSI space and monolingual document clustering technique can be utilized to cluster the target multilingual documents  ...  Wei et al. develop a Latent Semantic Indexing (LSI)-based multilingual document clustering technique.  ... 
doi:10.1016/j.dss.2007.07.002 fatcat:u3ph7x2vhndhfdyeaygoq7uyrq

Text Mining: Techniques, Applications and Issues

Ramzan Talib, Muhammad Kashif, Shaeela Ayesha, Fakeeha Fatima
2016 International Journal of Advanced Computer Science and Applications  
The discovery of appropriate patterns and trends to analyze the text documents from massive volume of data is a big issue.  ...  Text mining is a process of extracting interesting and nontrivial patterns from huge amount of text documents.  ...  Clustering Clustering is an unsupervised process to classify the text documents in groups by applying different clustering algorithms.  ... 
doi:10.14569/ijacsa.2016.071153 fatcat:y7owr443uneqhj457ops34m3xa

Cross-language information retrieval using PARAFAC2

Peter A. Chew, Brett W. Bader, Tamara G. Kolda, Ahmed Abdelali
2007 Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '07  
As a result, when using multilingual LSA, documents will in practice cluster by language, not by topic.  ...  From our results, we conclude that PARAFAC2 offers a very promising alternative to LSA not only for multilingual document clustering, but also for solving other problems in cross-language information retrieval  ...  In line with our expectations, we found that this was particularly true for multilingual document clustering.  ... 
doi:10.1145/1281192.1281211 dblp:conf/kdd/ChewBKA07 fatcat:xzt7b4wwhrellfycr5vpzjspum

Extraction of Code-mixed Aspect Topics in Semantic Representation

Kavita Sanjay Asnani, Jyoti D Pawar
2018 Journal of Computacion y Sistemas  
This results not only in retrieval of implicit aspects but also in clustering them together.  ...  In this paper we propose knowledge based language independent code-mixed semantic LDA (lcms-LDA) model, with an aim to improve the coherence of clusters.  ...  It should be noted that work in [1] only described the discovery of language independent aspects and did not include semantics for coherence improvement of aspect clusters.  ... 
doi:10.13053/cys-22-1-2771 fatcat:myj6nppbrzfhdj36qyokyo4o64

Role of Text Mining in Information Extraction and Information Management

M. Natarajan
2005 DESIDOC Bulletin of Information Technology  
The three stages of the Knowledge Discovery in Data and Data Mining (KDD) process are given with the applications of text mining.  ...  Applications of text mining vary from Information retrieval, bioinformatics, patent analysis, sorting gene expression, mining hospital records and multilingual approach to cross-lingual text retrieval  ...  In terms of the functions they perform, CLTR facilitates multilingual information access while multilingual text mining enables knowledge discovery from multilingual texts.  ... 
doi:10.14429/dbit.25.4.3663 fatcat:qfzdgomajrdbnmjorplfaullfq

Multilingual Word Sense Induction to Improve Web Search Result Clustering

Lorenzo Albano, Domenico Beneventano, Sonia Bergamaschi
2015 Proceedings of the 24th International Conference on World Wide Web - WWW '15 Companion  
In [12] a novel approach to Web search result clustering based on Word Sense Induction, i.e. the automatic discovery of word senses from raw text was presented; key to the proposed approach is the idea  ...  In this paper we give some preliminary ideas to exploit our multilingual Word Sense Induction method to Web search result clustering.  ...  all the documents written in different languages.  ... 
doi:10.1145/2740908.2743009 dblp:conf/www/AlbanoBB15 fatcat:nzjce2w4dfggnei6honeptvkwy

Construction of supervised and unsupervised learning systems for multilingual text categorization

Chung-Hong Lee, Hsin-Chang Yang
2009 Expert systems with applications  
This drives a convergence of numerous interests from diverse research communities focusing on the issues related to multilingual text categorization.  ...  In this work, we implemented and measured the performance of the leading supervised and unsupervised approaches for multilingual text categorization.  ...  The aim is to cluster the documents without additional knowledge or intervention such that documents within a cluster are more similar than documents between clusters.  ... 
doi:10.1016/j.eswa.2007.12.052 fatcat:rwqnbwz2lra33o4nwvpge6peqi