Plug-Tagger: A Pluggable Sequence Labeling Framework Using Language Models [article]

Xin Zhou, Ruotian Ma, Tao Gui, Yiding Tan, Qi Zhang, Xuanjing Huang
2021 arXiv   pre-print
Specifically, for each task, a label word set is first constructed by selecting a high-frequency word for each class respectively, and then, task-specific vectors are inserted into the inputs and optimized  ...  In this work, we propose the use of label word prediction instead of classification to totally reuse the architecture of pre-trained models for sequence labeling tasks.  ...  We count the time it takes to complete the tasks for all sampled samples in order. The data for each task is sampled from CoNLL2003.  ... 
arXiv:2110.07331v1 fatcat:s2cve65b7vbzllqqzpwmideqlm

Collecting and POS-tagging a lexical resource of Japanese biomedical terms from a corpus

Carlos Herrero-Zorita, Leonardo Campillos Llanos, Antonio Moreno-Sandoval
2014 Revista de Procesamiento de Lenguaje Natural (SEPLN)  
In other words, the reliability of the current taggers for Automatic Term Recognition is very low.  ...  There are three widespread taggers in Japanese that we considered using for this task: the Juman, ChaSen, and Mecab. The main problem in this step was oversegmentation.  ...  Collecting and POS-tagging a lexical resource of Japanese biomedical terms from a corpus  ... 
dblp:journals/pdln/Herrero-ZoritaLM14 fatcat:enhu34ljvfcw7a2x46ecvynamu

Zero-Shot Adaptive Transfer for Conversational Language Understanding

Sungjin Lee, Rahul Jha
2019 PROCEEDINGS OF THE THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE TWENTY-EIGHTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE  
To tackle this, we introduce a novel Zero-Shot Adaptive Transfer method for slot tagging that utilizes the slot description for transferring reusable concepts across domains, and enjoys efficient training  ...  A massive amount of labeled data is required for training each new domain.  ...  domain adaptation with zero-shot models, we first construct a joint training dataset by combining the training datasets of size 2000 from all domains except for a target domain.  ... 
doi:10.1609/aaai.v33i01.33016642 fatcat:ovoi4zaeynf3znasjne7aomx54

A Computational Effective Document Semantic Representation

Robert Williams
2007 2007 Inaugural IEEE-IES Digital EcoSystems and Technologies Conference  
An implementation of such a technique is described, and sample output is presented.  ...  Thesaurus head word index numbers are placed in the appropriate document sentence clause slots to represent the meta level meaning of the sentences.  ...  The AEG system uses such a technique when comparing student essays against model an- The POS tagger, as with most taggers, does not accurately tag words in all cases, and so the chunking process does not  ... 
doi:10.1109/dest.2007.372007 fatcat:762zoxg6jzervnfzc7qry47i2q

The spoken corpus of Cameroon Pidgin English

GABRIEL OZÓN, MIRIAM AYAFOR, MELANIE GREEN, SARAH FITZGERALD
2017 World Englishes  
The project has also necessitated the development of a designated tagset for CPE, which has been adapted from the CLAWS5 tagset.  ...  The corpus consists of private and public dialogues and monologues, with mark-up and POS-tagging.  ...  As a result, the tagset adapted for CPE differs considerably from the CLAWS5 set in a number of ways.  ... 
doi:10.1111/weng.12280 fatcat:hwqhhh5nwnh4lmtubgsuebtnhe

Dependency Parsing Using Global Features [chapter]

Tetsuji Nakagawa
2010 Text, Speech and Language Technology  
In an extrinsic evaluation setup, ELMoLex ranked 7th for Event Extraction, Negation Resolution tasks and 11th for Opinion Analysis task by F1 score.  ...  In this paper, we present the details of the neural dependency parser and the neural tagger submitted by our team 'ParisNLP' to the CoNLL 2018 Shared Task on parsing from raw text to Universal Dependencies  ...  We also thank José Carlos Rosales Núñez for useful discussions. This work was funded by the ANR projects ParSiTi (ANR-16-CE33-0021) and SoSweet (ANR15-CE38-0011-01).  ... 
doi:10.1007/978-90-481-9352-3_5 fatcat:wxfl4um2efe5tigujw427ka3eu

A question answering system for project management applications

Jinxing Cheng, Bimal Kumar, Kincho H Law
2002 Advanced Engineering Informatics  
The usage of computer applications in the construction industry has steadily increased over the years, as has the complexity of many software applications.  ...  We explore the mechanisms of utilizing information in the knowledge base for question understanding.  ...  The system can answer various questions about project schedule, such as the start date, ACKNOWLEDGEMENTS This work is partially sponsored by a Stanford Graduate Fellowship and the Product Engineering  ... 
doi:10.1016/s1474-0346(03)00014-4 fatcat:ci7hr6o7rnagpdoxsojp3vbis4

Statistical MWE-aware parsing [chapter]

Mathieu Constant, Gülşen Eryiğit, Carlos Ramisch, Mike Rosner, Gerold Schneider
2019 Zenodo  
We discuss MWE representation in treebanks, pipeline and joint orchestrations, the integration of external lexicons and the evaluation of MWE-aware parsers, concluding with our suggestions for future research  ...  This chapter aims at presenting different strategies that have been designed to incorporate multiword expression (MWE) identification in the process of syntactic parsing using statistical approaches.  ...  In short, sometimes it is better to adapt statistical models (in this case, a domain-adapted tagger) rather than using lexical resources (in this case, an MWE gazetteer of the domain).  ... 
doi:10.5281/zenodo.2579042 fatcat:7gnmn55vg5dhhe2uhajd34jwmm

Statistical MWE-aware parsing [chapter]

Mathieu Constant, Gülşen Eryiğit, Carlos Ramisch, Mike Rosner, Gerold Schneider
2019 Zenodo  
We discuss MWE representation in treebanks, pipeline and joint orchestrations, the integration of external lexicons and the evaluation of MWE-aware parsers, concluding with our suggestions for future research  ...  This chapter aims at presenting different strategies that have been designed to incorporate multiword expression (MWE) identification in the process of syntactic parsing using statistical approaches.  ...  In short, sometimes it is better to adapt statistical models (in this case, a domain-adapted tagger) rather than using lexical resources (in this case, an MWE gazetteer of the domain).  ... 
doi:10.5281/zenodo.2579043 fatcat:uycyose5kjdqtexqrl5x7b4ode

Reshaping text data for efficient processing on Amazon EC2

Gabriela Turcu, Ian Foster, Svetlozar Nestorov
2010 Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing - HPDC '10  
We rely on the empirical performance of the application of interest on smaller subsets of data, to construct an execution plan.  ...  Using the subset-sum first fit heuristic we reshape the input data by merging files in order to match as closely as possible the desired file size.  ...  A direction for our future research is also to devise good execution plans for more complex workflows arising in text processing.  ... 
doi:10.1145/1851476.1851540 dblp:conf/hpdc/TurcuFN10 fatcat:nmmgwvgflrc3dnqr2ptuuhj2xa

Cross-lingual Transfer Learning and Multitask Learning for Capturing Multiword Expressions

Shiva Taslimipoor, Omid Rohanian, Le An Ha
2019 Proceedings of the Joint Workshop on Multiword Expressions and WordNet (MWE-WN 2019)  
In this study, we explore for the first time, the application of transfer learning (TRL) and multitask learning (MTL) to the identification of Multiword Expressions (MWEs).  ...  For MTL, we exploit the shared syntactic information between MWE and dependency parsing models to jointly train a single model on both tasks.  ...  A sample of all three-fold labels that the model should predict for a sentence is depicted in figure 2.  ... 
doi:10.18653/v1/w19-5119 dblp:conf/mwe/TaslimipoorRH19 fatcat:w36xm7rmajhkzjkldujs4lw7dy

Corpus and Models for Lemmatisation and POS-tagging of Old French [article]

Jean-Baptiste Camps, Thibault Clérice, Frédéric Duval, Lucence Ing, Naomi Kanaoka, Ariane Pinche
2021 arXiv   pre-print
In this paper, we present the current results of a long going project (2015-...) and describe how we broached the difficult question of providing lemmatisation and POS models for Old French with the help  ...  of neural taggers and the progressive constitution of dedicated corpora.  ...  We thank the DIM Science du texte et connaissances nouvelles for funding the acquisition of a GPU server, as well as the École nationale des chartes for providing infrastructure and support for the server  ... 
arXiv:2109.11442v1 fatcat:dlriuf2dbbgtle2tumh3xjgb7a

Multiword Expression Processing: A Survey

Mathieu Constant, Gülşen Eryiğit, Johanna Monti, Lonneke van der Plas, Carlos Ramisch, Michael Rosner, Amalia Todirascu
2017 Computational Linguistics  
For each of the two MWE processing subtasks and for each of the two use cases, we conclude on open issues and research perspectives.  ...  Many of the approaches in the literature can be differentiated according to how MWE processing is timed with respect to underlying use cases.  ...  For instance, Ramisch, Besacier, and Kobzar (2013) identify discontiguous verb-particle constructions in English made of a verb + at most five words + a particle, adapting the discovery method proposed  ... 
doi:10.1162/coli_a_00302 fatcat:rjrfyfbfpfblfbtwpbdqnpgsiu

Extracting Elements of Component-Based Systems from Natural Language Requirements

Kung-Kiu Lau, Azlin Nordin, Keng-Yap Ng
2011 2011 37th EUROMICRO Conference on Software Engineering and Advanced Applications  
Extracting keywords from requirements has been done for various modelling purposes, e.g. for defining object-oriented analysis and design models, but it has not been done for mapping requirements directly  ...  In this paper we argue that the latter is possible if the underlying component model provides suitable encapsulation and hence separation of key elements of component-based systems.  ...  We use the POS tagger to extract all the nouns.  ... 
doi:10.1109/seaa.2011.16 dblp:conf/euromicro/LauNN11 fatcat:74zb7s3girc6za2bmeclcyerim

TNT-KID: Transformer-based neural tagger for keyword identification

Matej Martinc, Blaž Škrlj, Senja Pollak
2021 Natural Language Engineering  
Tagger for Keyword IDentification (TNT-KID).  ...  In this research, we present a novel algorithm for keyword identification, that is, an extraction of one or multiword phrases representing key aspects of a given document, called Transformer-Based Neural  ...  For generating the additional POS tag sequence input described in Section 3.1, which was not used in the best-performing model, Averaged Perceptron Tagger from the NLTK library (Bird and Loper 2004)  ... 
doi:10.1017/s1351324921000127 fatcat:afzhu3ejpfdllgjwgcm7ormqji