Pattern Based Term Extraction Using ACABIT System [article]

Koichi Takeuchi , Béatrice Daille
2009 arXiv   pre-print
In this paper, we propose a pattern-based term extraction approach for Japanese, applying ACABIT system originally developed for French.  ...  After extracting term candidates, ACABIT system filters out non-terms from the candidates based on log-likelihood.  ...  The output of the ACABIT system is a list of basic terms and extract basic terms and term variants using term variants.  ... 
arXiv:0907.2452v1 fatcat:cjf64lk4azfgdj2oqlyhrozv5i

Tools for Terminology Processing [article]

C. Enguehard, B. Daille, E. Morin
2014 arXiv   pre-print
Such processing may be statistically or linguistically based and produces terminology resources that can be used in a number of applications : indexing, information retrieval, technology watch, etc.  ...  They all take as input texts (or collection of texts) and reflect different states of terminology processing: term acquisition, term recognition and term structuring.  ...  For our work, we use a low-cost system (called: Promethee) which extracts and uses lexico-syntactic patterns to acquire semantic relations between terms.  ... 
arXiv:1412.4401v1 fatcat:ztfz7tktbnegxmpxcxw43ey63a

In vitro evaluation of a program for machine-aided indexing

Christian Jacquemin, Béatrice Daille, Jean Royauté, Xavier Polanco
2002 Information Processing & Management  
Human evaluation is divided into three parts: firstly the evaluation of controlled indexing, then free indexing and finally term variant extraction performed during controlled indexing.  ...  evaluation, Etienne Fleuret, former Director of INIST's Database Technical Department, and Vivianne Berthelier, head of the Biological Sciences and Agronomy section of that department, for allowing us  ...  The precision of the term extraction performed by ACABIT is evaluated using a set of 300 randomly chosen sentences.  ... 
doi:10.1016/s0306-4573(01)00050-4 fatcat:tyz5mivrwfgzzl26catu7p4zqi

Knowledge-poor and Knowledge-rich Approaches for Multilingual Terminology Extraction

Béatrice Daille, Helena Blancafort
2013 Research in Computing Science  
The third evaluation scenario compares both tools and demonstrates that a probabilistic term extraction approach, developed with minimal effort, achieves satisfactory results when compared to a rule-based  ...  We run an evaluation on six languages and two different domains using crawled comparable corpora and hand-crafted reference term lists.  ...  Related work To our knowledge, no previous research has been done to use a probabilistic method for term extraction based on POS induction.  ... 
doi:10.13053/rcs-70-1-13 fatcat:lfw4c44e75bwvnpfwrqab42bae

A Linguistic Model for Terminology Extraction based Conditional Random Fields [article]

Fethi Fkih, Mohamed Nazih Omri, Imen Toumia
2014 arXiv   pre-print
In this paper, we show the possibility of using a linear Conditional Random Fields (CRF) for terminology extraction from a specialized text corpus.  ...  ACABIT, a system developed by Beatrice Daille [3], applies primarily a morphosyntactic analysis to identify the different terminological variations.  ...  In general, the text is chunked into non-recursive chunks using a rules system (or grammar) to detect terms by identifying their components or their contexts, based on the morphosyntactic annotation [  ... 
arXiv:1210.0252v2 fatcat:kujkcoupjbdcdnrq4zgczabqbi

Terminology Extraction with Term Variant Detection

Damien Cram, Beatrice Daille
2016 Proceedings of ACL-2016 System Demonstrations  
We focus on the main components: UIMA Tokens Regex for defining term and variant patterns over word annotations, and the grouping component for clustering terms and variants that works both at morphological  ...  TermSuite follows the classic two steps of terminology extraction tools, the identification of term candidates and their ranking, but implements new features.  ...  Application to terminology extraction Example In TermSuite type system, the values of the feature category are the part-of-speech (POS) tags.  ... 
doi:10.18653/v1/p16-4003 dblp:conf/acl/CramD16 fatcat:cztaour7tnctrj6un2xlxzfkva

Projecting corpus-based semantic links on a thesaurus

Emmanuel Morin, Christian Jacquemin
1999 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
The termer used for multi-word term acquisition is ACABIT (Daille, 1996). It has produced 15,875 multi-word terms composed of 4,194 single words.  ...  The quality of the projected links resulting from corpus-based acquisition is compared with projected links extracted from a technical thesaurus. * The experiments presented in this paper were performed  ...  Iterative Acquisition of Hypernym Links We first present the system for corpus-based information extraction that produces hypernym links between single words.  ... 
doi:10.3115/1034678.1034739 dblp:conf/acl/Morin99 fatcat:qs6wlczjufgx7hmuwd2rjtgu34

Automatic Acquisition and Expansion of Hypernym Links

Emmanuel Morin, Christian Jacquemin
2004 Language Resources and Evaluation  
Hypernym links acquired through an information extraction procedure are projected on multi-word terms through the recognition of semantic variations.  ...  This paper proposes to bridge the gap between term acquisition and thesaurus construction by offering a framework for automatic structuring of multi-word candidate terms with the help of corpus-based links  ...  The Prométhée system cannot be used to extract patterns relative to the CJ:produce relation.  ... 
doi:10.1007/s10579-004-1926-2 fatcat:7jmqws745fhutglojqga2a7r3q

A Novel Method for Arabic Multi-Word Term Extraction

Hadni Meryem, Said Alaoui Ouatik, Abdelmonaime Lachkar
2014 International Journal of Database Management Systems  
The Linguistic Filter uses our proposed Part Of Speech (POS) Tagger and the Sequence identifier as patterns in order to extract candidate AMWTs.  ...  Once they are automatically extracted, they can be used to increase the performance of any Arabic Text Mining applications such as Categorization, Clustering, Information Retrieval System, Machine Translation  ...  to extract candidate MWTs based on syntactic patterns.  ... 
doi:10.5121/ijdms.2014.6304 fatcat:iv22zu7tkzd5dcnh3em7tjda3e

Detecting Noun Phrases in Biomedical Terminologies: The First Step in Managing the Evolution of Knowledge [chapter]

Adila Merabti, Lina F. Soualmia, Stéfan J. Darmoni
2014 Lecture Notes in Computer Science  
This step is based on patterns constructed from six main medical terminologies used in document indexing. The patterns are constructed by using a Tree Tagger.  ...  Currently, we are developing a comparison approach to extract similar and different elements between medical documents in French in order to identify any significant changes such as new medical terms or  ...  Detection and extraction of noun phrases using patterns based on different medical terminologies and Tree Tagger is the first step in the procedure.  ... 
doi:10.1007/978-3-319-06269-3_12 fatcat:24uir2ryijapdfy6akdpfka7ca

Using Biomedical Terminologies to extract Noun Phrases for managing knowledge evolution

Adila Merabti, Lina Fatima Soualmia, Stéfan Jacques Darmoni
2014 International Work-Conference on Bioinformatics and Biomedical Engineering  
This step is based on patterns constructed from six main medical terminologies used in document indexing. The patterns are constructed by using a Tree Tagger.  ...  Currently, we are developing a comparison approach to extract similar and different elements between medical documents in French in order to identify any significant changes such as new medical terms or  ...  Detection and extraction of noun phrases using patterns based on different medical terminologies and Tree Tagger is the first step in the procedure.  ... 
dblp:conf/iwbbio/MerabtiSD14 fatcat:s7zfunqi2zddvjrxpmaptr7v5m

TBXTools: A Free, Fast and Flexible Tool for Automatic Terminology Extraction

Antoni Oliver, Mercè Vàzquez
2015 Recent Advances in Natural Language Processing  
In this paper we present the main features of TBXTools along with evaluation results for term extraction, both using statistical and linguistic methodology, for several corpora.  ...  This paper presents TBXTools, a free automatic terminology extraction tool that implements linguistic and statistical methods for multiword term extraction.  ...  This significant difference between these two values (15.84 points) indicates that the simple approach to lemmatization based on morphological normalization using simple morphological patterns is not very  ... 
dblp:conf/ranlp/OliverV15 fatcat:woh2yx5hpffx7fts2x5qs6smde

Constructing and maintaining knowledge organization tools: a symbolic approach

Fidelia Ibekwe‐SanJuan
2006 Journal of Documentation  
Acknowledgements This research benefited from collaboration with Eric SanJuan, Lecturer in Computer sciences at the University of Metz (France) who implemented the TermWatch system.  ...  This tool extracts generic/specific term candidates incrementally. Acabit is a term extractor based on symbolic and statistical features.  ...  The system comprises three major components: a linguistic component which extracts terms and identifies relations between them, a clustering component that clusters terms based on the explicit linguistic  ... 
doi:10.1108/00220410610653316 fatcat:lzjnqzc2lvdcdipg25agvy2vw4

Automatic extraction of keywords from scientific publications for indexing and open data in agronomy

Mathieu Roche, Sophie Fortuno, Juan Antonio Lossio-Ventura, Amira Akli, Salim Belkebir, Thinhinan Lounis, Serigne Toure
2015 Agricultures  
Experiments conducted on CIRAD data show the validity of the approach used to extract new and relevant terms.  ...  This paper investigates the use and combination of text mining methodologies to highlight and publish the most appropriate terms from documents in open data systems.  ...  Table 1. Sample of keywords (in French) with different terminology extraction systems.  ... 
doi:10.1684/agr.2015.0773 fatcat:adhx6mpdlva4dlv3lex64flh4y

Learning To Order Terms: Supervised Interestingness Measures In Terminology Extraction

Jérôme Azé, Mathieu Roche, Yves Kodratoff, Michèle Sebag
2007 Zenodo  
Term Extraction, a key data preparation step in Text Mining, extracts the terms, i.e. relevant collocation of words, attached to specific concepts (e.g. genetic-algorithms and decision-trees are terms associated  ...  This ranking is optimized using genetic algorithms, maximizing the trade-off between the false positive and true positive rates (Area Under the ROC curve).  ...  ACKNOWLEDGMENT We thank Mary Felkin for her English review, Oriane Matte-Tailliez for the expertise of the terms in Molecular Biology, and PASCAL (Pattern Analysis, Statistical Modelling and Computational  ... 
doi:10.5281/zenodo.1333849 fatcat:heszncc5brbbxeuyamddncvnxq