A multilingual text mining approach to web cross-lingual text retrieval

Rowena Chau, Chung-Hsing Yeh
2004 Knowledge-Based Systems  
This approach is employed for developing a multi-agent system to facilitate concept-based CLTR on the Web.  ...  To enable concept-based cross-lingual text retrieval (CLTR) using multilingual text mining, our approach will first discover the multilingual concept-term relationships from linguistically diverse textual  ...  An alternative to machine-readable dictionary is using a parallel corpus. A parallel corpus is a set of identical text written in multiple languages.  ... 
doi:10.1016/j.knosys.2004.04.001 fatcat:xcvrtmnhkve67eauw6ohzjpky4

The Use of Ontology in Retrieval: A Study on Textual, Multilingual and Multimedia Retrieval

Muhammad Nabeel Asim, Muhammad Wasim, Muhammad Usman Ghani Khan, Nasir Mahmood, Waqar Mahmood
2019 IEEE Access  
Ontological information retrieval systems retrieve data based on the similarity of semantics between the user query and the indexed data.  ...  Web contains a vast amount of data, which are accumulated, studied, and utilized by a huge number of users on a daily basis.  ...  They replaced machine readable dictionary (MRD) with multilingual ontology (MO) in order to provide better search engine for tourism domain.  ... 
doi:10.1109/access.2019.2897849 fatcat:ei2zxyxdjndbvgzzue2indwqy4

CRIE: An automated analyzer for Chinese texts

Yao-Ting Sung, Tao-Hsing Chang, Wei-Chun Lin, Kuan-Sheng Hsieh, Kuo-En Chang
2015 Behavior Research Methods  
Furthermore, the integration of linguistic features with machine learning models enables CRIE to provide leveling and diagnostic information for texts in language arts, texts for learning Chinese as a  ...  This article introduces a tool for the automated analysis of simplified and traditional Chinese texts, called the Chinese Readability Index Explorer (CRIE).  ...  On the next page, users can paste a single text directly or upload multiple texts compressed in a zip file.  ... 
doi:10.3758/s13428-015-0649-1 pmid:26424442 fatcat:kzb5okwr4nhcrbxdknafvtsmei

A Review Paper on Automatic Text Summarization in Indonesia Language

Nurul Khotimah (Computer Science Department, BINUS Graduate Program – Master of Computer Science, Bina Nusantara University, Jakarta, Indonesia 11480), Adi Wibowo P, Bryan Andreas, Abba Suganda Girsang
2021 International Journal of Emerging Technology and Advanced Engineering  
Text summarization is one problem in natural language processing that generates a brief version of the original document.  ...  This paper investigates some methods such as Statistical Based Approach, Graph Based Approach, Machine Learning Approach, Fuzzy Logic Approach, Algebraic Approach, and Hybrid Approach.  ...  based on their semantic features.  ... 
doi:10.46338/ijetae0821_11 fatcat:clqguhupejfklcztgzmyrnvxdq

Visualizing Bag-of-Features Image Categorization Using Anchored Maps

Gao Yi, Hsiang-Yun Wu, Kazuo Misue, Kazuyo Mizuno, Shigeo Takahashi
2014 Proceedings of the 7th International Symposium on Visual Information Communication and Interaction - VINCI '14  
The bag-of-features model is one of the most popular and promising approaches for extracting the underlying semantics from image databases.  ...  Voronoi partitioning has been also incorporated into our approach so that we can visually identify the image categorization based on support vector machine.  ...  A main idea behind the BoF model is to seek an analogy of methods for inferring text categorization based on the bag-of-words model, where each document is represented as a sparse vector of representative  ... 
doi:10.1145/2636240.2636858 dblp:conf/vinci/YiWMMT14 fatcat:ql25ypcezrffnjxoypgzt6hhwe


Shelly Gupta
2018 International Journal of Advanced Research in Computer Science  
The hybrid approach is using the concepts of dictionary-based approach and semantic-based approach i.e. matching words from the dictionary and assigning their sentimental value and also using some specific  ...  It is estimated that till 2025, most of the world's trade will be based on Data Mining [1]. There is vast availability of people's opinion data on Twitter for almost every product and service.  ...  [17] have proposed a Gini Index based feature selection method with Support Vector Machine (SVM) classifier for sentiment classification for large movie review data set.  ... 
doi:10.26483/ijarcs.v9i2.5581 fatcat:dpuqgpvnbrbktpj4qvtb7rlwji

Text Document Categorization using Enhanced Sentence Vector Space Model and Bi-Gram Text Representation Model Based on Novel Fusion Techniques

2020 New Media and Mass Communication  
The main objective of the study is to boost the accuracy of text classification by accounting for the features extracted from the text document.  ...  A word level neural network Bigram representation of text documents is proposed for effective capturing of semantic information present in the text data.  ...  A bigram is a pair of consecutive elements from a text or a sequence of tokens. For users, bi-grams are often easier to interpret than single words, and they reduce the computational complexity.  ... 
doi:10.7176/nmmc/93-03 fatcat:o2y7lh4ftnf2piqz5dgwjoiblu

Intelligent Information Access: A Survey

Nupur Choudhury, Rupesh Mandal, Vikas Sharma
2018 International Journal of Computer Applications  
It also describes various technological standards that are based on the Resource Description Framework, SPARQL, Web Ontology Language and different techniques that are related to Intelligent Information Access  ...  This paper primarily deals with the better understanding of the underlying working of the concept of Web 3.0 or the Semantic web.  ...  During the phase of feature extraction, the primary advantage of this method is its ability to generate a local dictionary for each different category.  ... 
doi:10.5120/ijca2018916393 fatcat:dl4hqs2e5vdi5oxvc47kp4beqm

A Comprehensive Framework for Ontology based Classifier using Unstructured Data

2019 International Journal of Engineering and Advanced Technology  
In this paper, the domain knowledge is presented as a knowledge graph, derived from the semantic data modeling.  ...  One major problem with text processing is that most data generated is unstructured and ambiguous, whereas data with a structure helps to identify meaningful patterns and eventually exhibit the latent knowledge.  ...  Text normalization [21] will convert human-readable data into machine-readable form.  ... 
doi:10.35940/ijeat.a2042.109119 fatcat:t2mh5jlxhfa6pky5ipmw43doza

ChemEx: information extraction system for chemical data curation

Atima Tharatipyakul, Somrak Numnark, Duangdao Wichadakul, Supawadee Ingsriswang
2012 BMC Bioinformatics  
Text annotator is able to extract compound, organism, and assay entities from text content while structure image recognition enables translation of chemical raster images to machine readable format.  ...  A user can view annotated text along with summarized information of compounds, organism that produces those compounds, and assay tests.  ...  Acknowledgements This work was supported by National Center for Genetic Engineering and Biotechnology (BIOTEC).  ... 
doi:10.1186/1471-2105-13-s17-s9 pmid:23282330 pmcid:PMC3521388 fatcat:quqcssngxvcr3f4w4z4d3mwwkm

The Semantic Data Dictionary – An Approach for Describing and Annotating Data

Sabbir M. Rashid, James P. McCusker, Paulo Pinheiro, Marcello P. Bax, Henrique Santos, Jeanette A. Stingone, Amar K. Das, Deborah L. McGuinness
2020 Data Intelligence  
While these documents are useful in helping an end-user properly interpret the meaning of a column in a data set, existing data dictionaries typically are not machine-readable and do not follow a common  ...  It is common practice for data providers to include text descriptions for each column when publishing data sets in the form of data dictionaries.  ...  We acknowledge the members of the Tetherless World Constellation (TWC) and the Institute for Data Exploration and Applications (IDEA) at Rensselaer Polytechnic Institute (RPI) for their contributions,  ... 
doi:10.1162/dint_a_00058 pmid:33103120 pmcid:PMC7583433 fatcat:bk5swnaoirbh5artljs7kixfxu

Extractive and Abstractive Text Summarization Techniques

2020 International journal of recent technology and engineering  
The manual summarization consumes a large amount of time and hence an automated text summarization model is required. The summarization can be performed from a single source or multiple sources.  ...  Text summarization generates an abstract version of information on a particular topic from various sources without modifying its originality.  ...  The NLP models are adopted in semantic based approaches that categorize the nouns and verbs from the dataset. A.  ... 
doi:10.35940/ijrte.a2235.059120 fatcat:4bfnvpyaxbbw7apxayo4zkuy7u

BioKB - Text Mining and Semantic Technologies for Biomedical Content Discovery

Maria Biryukov, Valentin Groues, Venkata Satagopam, Reinhard Schneider
2018 Figshare  
Extracted knowledge is stored in a knowledge base publicly available for both human and machine access, via web interface and SPARQL endpoint.  ...  We have implemented a pipeline which, by exploiting text mining and semantic technologies, helps researchers easily access semantic content of thousands of abstracts and full text articles from PubMed  ...  To allow both human and machine access to the knowledge base, SPARQL endpoint provides machine readable access while a web application allows users to browse the content of the knowledge base.  ... 
doi:10.6084/m9.figshare.6994121.v1 fatcat:7qncychbrjcj7ptwi7cpnpozxm

A Review of Machine Learning Algorithms for Text-Documents Classification

Baharum Baharudin, Lam Hong Lee, Khairullah Khan
2010 Journal of Advances in Information Technology  
that remain to be solved, focused mainly on text representation and machine learning techniques.  ...  This paper provides a review of the theory and methods of document classification and text mining, focusing on the existing literature.  ...  Based on ant colony optimization a new feature selection algorithm is presented in [18], to improve the text categorization.  ... 
doi:10.4304/jait.1.1.4-20 fatcat:nx23oqf3gbgiha45s2enn2hqqq

Knowledge Extraction and Semantic Annotation of Text from the Encyclopedia of Life

Anne E. Thessen, Cynthia Sims Parr, Luis M. Rocha
2014 PLoS ONE  
Numerous digitization and ontological initiatives have focused on translating biological knowledge from narrative text to machine-readable formats.  ...  One workflow tags text with DBpedia URIs based on keywords. Another workflow finds taxon names in text using GNRD for the purpose of building a species association network.  ...  Boston and the Boston Python Group for coding advice.  ... 
doi:10.1371/journal.pone.0089550 pmid:24594988 pmcid:PMC3940440 fatcat:aop64qgy6jhw3l3ni6ll55aujq