
Review implementation of linguistic approach in schema matching

Galih Hendro Martono, Azhari SN
2017 IJAIN (International Journal of Advances in Intelligent Informatics)  
The linguistic approach has long been applied to various problems, such as calculating entity similarity values across two or more schemas.  ...  The purpose of this paper was to provide an overview of previous studies related to the implementation of the linguistic approach in schema matching and to find gaps for the development of existing  ...  Some linguistic research has focused on calculating the similarity between two or more schemas. Entity similarity in a database is calculated with the help of a dictionary and a thesaurus.  ... 
doi:10.26555/ijain.v3i1.75 fatcat:fxun3y4aabgzdmmg7i5rlgbxgq


Iryna Dilay, Mykhailo Bilynskyi
We will also enumerate the applications of the positional string metric for the reversal of a number of dictionaries of synonyms.  ...  It is simple but requires a rich hierarchy and uses only the 'is-a' relation.  ...  It has been used as a user-friendly comprehensive resource to calculate the semantic similarity of the verbs in WordNet.  ... 
doi:10.30970/fpl.2016.129.593 fatcat:ttzngr5d2zggtnhvb7dxhnxhym

Combining Web-Based Searching with Latent Semantic Analysis to Discover Similarity Between Phrases [chapter]

Sean M. Falconer, Dmitri Maslov, Margaret-Anne Storey
2006 Lecture Notes in Computer Science  
Determining semantic similarity between words, concepts and phrases is important in many areas within Artificial Intelligence.  ...  We propose a new technique for the comparison of general expressions that combines web searching with Latent Semantic Analysis.  ...  Supekar and Mark A. Musen for use of their ontology matching data.  ... 
doi:10.1007/11914853_69 fatcat:nvuxamgi4fat5gvimbpeitlnuq

Extractive Summarization Using Structural Syntax, Term Expansion and Refinement

Mohamed Taybe Elhadi
2017 International Journal of Intelligence Science  
Calculated similarity was based on LCS of pairs of sentences that make up the document. A normalized score was calculated and used to rank sentences.  ...  [1] and a local thesaurus [2] in the selection of the most appropriate extractive text summarization for a particular document.  ...  A normalized score between 0 and 1 was calculated for each pair of sentences using the longest common subsequences to produce a final measure of similarity. An initial set of sentences was produced.  ... 
doi:10.4236/ijis.2017.73004 fatcat:nh5tbfmr5zhzdda5zo57zl7h4e
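
The LCS-based sentence score described in the snippet above can be illustrated with a minimal sketch. The snippet does not say how the score was normalized or how sentences were tokenized, so dividing by the longer sentence's token count and splitting on whitespace are assumptions made here for illustration, and the function names `lcs_length` and `lcs_similarity` are hypothetical rather than taken from the paper.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    # Rolling one-row dynamic programme: prev[j] holds the LCS length
    # of the items of `a` processed so far against b[:j].
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(s1, s2):
    """Score in [0, 1] for two sentences, based on their word-level LCS.

    Assumption: normalize by the longer sentence's token count.
    """
    a, b = s1.lower().split(), s2.lower().split()
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


if __name__ == "__main__":
    print(lcs_similarity("the cat sat on the mat", "the dog sat on a mat"))
```

Scores computed this way for every sentence pair could then feed a ranking step; other normalizations (for example, dividing by the shorter length or by the mean length) would be equally plausible readings of the snippet.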

First Token Algorithm for Searching Compound Terms Using Thesaurus Database

2012 Journal of Computer Science  
to find a specific word in a text, others to find a multi-word term (pattern matching) in a text.  ...  Searching for a term in a thesaurus database can be carried out using many search algorithms, such as the brute-force algorithm and others.  ...  A thesaurus is a list of very important terms (single-word or multi-word) in a given domain of knowledge and a set of related terms for each term in the list.  ... 
doi:10.3844/jcssp.2012.61.67 fatcat:ky5oq3d67zefppo2hiiybf4xou

A User-Oriented Special Topic Generation System for Digital Newspaper [chapter]

Xi Xu, Mao Ye, Zhi Tang, Jian-Bo Xu, Liang-Cai Gao
2015 Lecture Notes in Computer Science  
Lastly, organize and refine the special topic according to the similarity between the candidate news and the topic, and the density of topic-related terms.  ...  Secondly, remove semantically repetitive vector components by constructing a synonymy word map.  ...  The similarity is actually between the key words of the new "content" and the subject headings of the topic.  ... 
doi:10.1007/978-3-319-25207-0_45 fatcat:aox5yqyf5fgrnfuitkhvq7hicu

dh2loop 1.0: an open-source Python library for automated processing and classification of geological logs

Ranee Joshi, Kavitha Madaiah, Mark Jessell, Mark Lindsay, Guillaume Pirot
2021 Geoscientific Model Development  
Since this process can be tedious, we attempted to test the string matching with the comments, which resulted in a matching rate of 16 % (7870 successfully matched records out of 47 823 records).  ...  The contribution also addresses the subjective nature and variability of the nomenclature of lithological descriptions within and across different drilling campaigns by using thesauri and fuzzy string  ...  The research was carried out while the first author was in receipt of a Scholarship for International Research Fees (Australian Government Research Training Program Scholarship) and an Automated 3D Geology  ... 
doi:10.5194/gmd-14-6711-2021 fatcat:nqujxtgkkrgzxexpdlrrhyy3xe

Textual Similarity Measurement Approaches: A Survey (1)

Amira Abo-Elghit, Aya Al-Zoghby, Taher Hamza
2020 The Egyptian Journal of Language Engineering  
Finding the similarity between terms is an essential part of textual similarity and is then used as a major step toward sentence-level, paragraph-level, and script-level similarities.  ...  This paper aims to provide a general overview of textual similarity in the literature.  ...  It uses the statistical lexical similarity between the vectors of similar words (second-order word vectors) extracted from the corpus instead of relying on only word distribution similarity calculations  ... 
doi:10.21608/ejle.2020.42018.1012 fatcat:a2fhtkub7nazlkgzqewqbb7koi

Toward building recommender systems for the circular economy: Exploring the perils of the European Waste Catalogue

Guido van Capelleveen, Chintan Amrit, Henk Zijm, Devrim Murat Yazan, Asad Abdi
2021 Journal of Environmental Management  
We experiment with semantic enhancement (an EWC thesaurus) and the linguistic contexts of words (learned by Word2vec) for detecting term vector similarity in addition to direct term matching algorithms  ...  The growth in the number of industries aiming at more sustainable business processes is driving the use of the European Waste Catalogue (EWC).  ...  Another useful concept is the bag of words (BOW), which is a sequence of keywords extracted from a general text string.  ... 
doi:10.1016/j.jenvman.2020.111430 pmid:33075657 fatcat:pynbgk2ikvcslg7i7qfzddbzma

Automatic document indexing in large medical collections

Angelos Hliaoutakis, Kalliopi Zervanou, Euripides G.M. Petrakis, Evangelos E. Milios
2006 Proceedings of the international workshop on Healthcare information and knowledge management - HIKM '06  
National Library of Medicine (NLM). AMTEx combines MeSH, the terminological thesaurus resource of NLM, with a well-established method for extraction of domain terms, the C/NC-value method.  ...  Term extraction relates to extracting the most characteristic or important terms (words or phrases) in a document.  ...  Candidate Evaluation: The candidate set of Metathesaurus mappings is evaluated. The evaluation process computes the mapping strength between the candidate Metathesaurus string and the text string.  ... 
doi:10.1145/1183568.1183570 dblp:conf/hikm/HliaoutakisZPM06 fatcat:zcgnbn6jsvcgvly3q5xnwefzvu

Initial Exploitation of Natural Language Processing Techniques on NATO Strategy and Policies

Giavid Valiyev, Marcello Piraino, Arvid Kok, Michael Street, Ivana Ilic Mestric, Retzius Birger
2020 Information & Security An International Journal  
Acknowledgment: The authors wish to thank Eleanor Williams, Emilia Dettorres and Ivano Pennacchio, whose subject matter expertise and exceptional support made this work possible.  ...  In scenario number one we will use a glossary to tag multi-words in these sentences. The result of this process will give us the term "Metadata management" as a multi-word term.  ...  The role of multi-words is important in improving the final accuracy of similarity scores.  ... 
doi:10.11610/isij.4713 fatcat:ah7uybxug5hffdyfhrtini4qb4

Improving Translation of Organization Names Combining Translation Model and Web Mining

Bin Li, Yin Zhou, Ning Ma, Wuqi Liang, Lulu Dong
2016 International Journal of Database Theory and Application  
Then a method based on the frequency shift and adjacency information was used to extract candidate translation strings.  ...  string abstracts. Try to use linguistic characteristics (such as lexical category, grammar, and so on) as candidate unit models to improve the quality of candidate translation strings. Adopt more  ...  We judged whether the left and right adjoining words of a candidate string constructed a translation string of an ON by calculating R(s).  ... 
doi:10.14257/ijdta.2016.9.1.13 fatcat:jtvim7dubvaehpieecm4ezqeru

Constructing virtual documents for ontology matching

Yuzhong Qu, Wei Hu, Gong Cheng
2006 Proceedings of the 15th international conference on World Wide Web - WWW '06  
Basically, as a collection of weighted words, the virtual document of a URIref declared in an ontology contains not only the local descriptions but also the neighboring information to reflect the intended  ...  On the investigation of linguistic techniques used in ontology matching, we propose a new idea of virtual documents to pursue a cost-effective approach to linguistic matching in this paper.  ...  The first author of this paper is also supported by Ministry of Education of China under Grant NCET-04-0472.  ... 
doi:10.1145/1135777.1135786 dblp:conf/www/QuHC06 fatcat:t2yuffcrtja6vbwl5hjsxyxpfi

The AMTEx approach in the medical document indexing and retrieval application

Angelos Hliaoutakis, Kaliope Zervanou, Euripides G.M. Petrakis
2009 Data & Knowledge Engineering  
The performance evaluation of two AMTEx configurations is measured against the current state-of-the-art, the MetaMap Transfer (MMTx) method in four experiments, using two types of corpora: a subset of  ...  National Library of Medicine (NLM). AMTEx combines MeSH, the terminological thesaurus resource of NLM, with a well-established method for extraction of terminology, the C/NC-value method.  ...  Dr Makreas proposed a methodology for the intellectual evaluation of the PMC answer sets.  ... 
doi:10.1016/j.datak.2008.11.002 fatcat:htjrnws7rfhnzb5wvbwce4hgui

Survey of Text Plagiarism Detection

Ahmed Hamza Osman, Naomie Salim, Albaraa Abuobieda
2012 Computer Engineering and Applications Journal  
It was found that many of the proposed methods for plagiarism detection have weaknesses and fail to detect some types of plagiarized text.  ...  In this paper we review and list the advantages and limitations of the most effective techniques employed or developed in text plagiarism detection.  ...  The authors would like to thank International University of Africa (IUA) and Research Management Centre (RMC) Universiti Teknologi Malaysia for the support and incentive extended in making this study a  ... 
doi:10.18495/comengapp.v1i1.5 fatcat:xyvk24mevvhgpokhyn3bgo64ge