
TNT-KID: Transformer-based neural tagger for keyword identification

Matej Martinc, Blaž Škrlj, Senja Pollak
2021 Natural Language Engineering  
By adapting the transformer architecture for a specific task at hand and leveraging language model pretraining on a domain-specific corpus, the model is capable of overcoming deficiencies of both supervised  ...  and unsupervised state-of-the-art approaches to keyword extraction by offering competitive and robust performance on a variety of different datasets while requiring only a fraction of manually labeled  ...  As is pointed out in the study by Gallina, Boudin, and Daille (2020), evaluation and comparison of keyphrase extraction algorithms is not a trivial task, since keyphrase extraction models in different  ... 
doi:10.1017/s1351324921000127 fatcat:afzhu3ejpfdllgjwgcm7ormqji

TNT-KID: Transformer-based Neural Tagger for Keyword Identification [article]

Matej Martinc, Blaž Škrlj, Senja Pollak
2020 arXiv   pre-print
By adapting the transformer architecture for a specific task at hand and leveraging language model pretraining on a domain specific corpus, the model is capable of overcoming deficiencies of both supervised  ...  and unsupervised state-of-the-art approaches to keyword extraction by offering competitive and robust performance on a variety of different datasets while requiring only a fraction of manually labeled  ...  This paper is supported by European Union's Horizon 2020 research and innovation programme under grant agreement No. 825153, project EMBEDDIA (Cross-Lingual Embeddings for Less-Represented Languages in  ... 
arXiv:2003.09166v2 fatcat:dqacyazwnvcqhf5d7y6gjr2gl4

Inferring multilingual domain-specific word embeddings from large document corpora

Luca Cagliero, Moreno La Quatra
2021 IEEE Access  
It proposes a new methodology to automatically infer aligned domain-specific word embeddings for a target language on the basis of the general-purpose and domain-specific models available for a source  ...  However, in several cross-lingual NLP domains both large enough domain-specific document corpora and pre-trained domain-specific word vectors are hard to find for languages other than English.  ...  ) Extract the set of words occurring in the keyphrases (except for the stopwords). 6) Definition ← $w^D_1, w^D_2, \ldots, w^D_m$ To our purposes, we reformulate the word retrieval task as follows: given  ... 
doi:10.1109/access.2021.3118093 fatcat:pyxp6lre5naktagtgua4ucyyi4

Neural Machine Translation for Bilingually Scarce Scenarios: A Deep Multi-task Learning Approach [article]

Poorya Zaremoodi, Gholamreza Haffari
2018 arXiv   pre-print
This is particularly inconvenient for language pairs for which enough parallel text is not available.  ...  We empirically evaluate and show the effectiveness of our multi-task learning approach on three translation tasks: English-to-French, English-to-Farsi, and English-to-Vietnamese.  ...  We are very grateful to the workshop members for the insightful discussions and data pre-processing.  ... 
arXiv:1805.04237v1 fatcat:af2ixee74ra4tidacxxj55zonu

Neural Machine Translation for Bilingually Scarce Scenarios: a Deep Multi-Task Learning Approach

Poorya Zaremoodi, Gholamreza Haffari
2018 Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)  
This is particularly inconvenient for language pairs for which enough parallel text is not available.  ...  We empirically evaluate and show the effectiveness of our multi-task learning approach on three translation tasks: English-to-French, English-to-Farsi, and English-to-Vietnamese.  ...  Gaurav Kumar for the insightful discussions and data pre-processing.  ... 
doi:10.18653/v1/n18-1123 dblp:conf/naacl/ZareMoodiH18 fatcat:qdxygbcanjdrrdl6whwpo3fify

A Multi-Lingually Applicable Journalist Toolset For The Big-Data Era

G. Kiomourtzis, G. Giannakopoulos, V. Karkaletsis, A. Kosmopoulos
2016 Zenodo  
Octavian Popescu for his constant guidance, endless suggestions and encouragement and full support to finish this work.  ...  Acknowledgments We thank the anonymous reviewers and the participants in the Fall 2015 edition of the course "Natural Language Processing and Social Interaction" for helpful comments and discussion.  ...  Keyphrase extraction methods extract a set of descriptive phrases from a given corpus, and have proven their potential and have been used for various Natural Language Processing (NLP) purposes (see [Hasan  ... 
doi:10.5281/zenodo.1242850 fatcat:nfkqg7jhjffdvgezdjzc6xxppa

Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases [article]

Gerhard Weikum, Luna Dong, Simon Razniewski, Fabian Suchanek
2021 arXiv   pre-print
On top of this, the article discusses the automatic extraction of entity-centric properties.  ...  To support the long-term life-cycle and the quality assurance of machine knowledge, the article presents methods for constructing open schemas and for knowledge curation.  ...  We are most grateful for the thoughtful and extremely helpful comments by Soumen Chakrabarti, AnHai Doan and two other (anonymous) reviewers.  ... 
arXiv:2009.11564v2 fatcat:vh2lqfmhhbcwpf6dcsej3hhvgy

Program Committee

2006 2006 Sixth IEEE International Workshop on Source Code Analysis and Manipulation  
The conference and workshops were partially supported by the NTU Centre for Liberal Arts and Social Sciences (CLASS) and the Singapore MOE TRF Grant Syntactic Well-Formedness Diagnosis and Error-Based  ...  Support for students came from the Global Wordnet Association. We would like to thank the programme committee for their thoughtful and timely reviews. The conference homepage is  ...  Fund as well as the South African Centre for Digital Language Resources for providing funding in the various phases of the AWN project; as well as reviewers and conference participants for valuable inputs  ... 
doi:10.1109/scam.2006.23 dblp:conf/scam/X06c fatcat:2dhsf7loj5hlffu2jxpmlo2qcq

User-Generated Content in Social Media (Dagstuhl Seminar 17301)

Tat-Seng Chua, Norbert Fuhr, Gregory Grefenstette, Kalervo Järvelin, Jaakko Paltonen, Marc Herbstritt
2018 Dagstuhl Reports  
WG2 developed a framework for summarizing heterogeneous, multilingual and multimodal data, discussed key challenges and applications of this framework.  ...  WG1 invented an "Information Nutrition Label" that characterizes a document by different features such as e.g. emotion, opinion, controversy, and topicality; For computing these feature values, available  ...  Writing quality refers to the grammatical correctness of the text (morphology, syntax) such as taught in elementary schools [79].  ... 
doi:10.4230/dagrep.7.7.110 dblp:journals/dagstuhl-reports/ChuaFGJP17 fatcat:bman5u6q5zdg7a6csnzwpba7sm

Semantic and sentiment analysis of selected Bhagavad Gita translations using BERT-based language framework

Rohitash Chandra, Venkatesh Kulkarni
2022 IEEE Access  
Recent progress of language models powered by deep learning has enabled not only translations but better understanding of language and texts with semantic and sentiment analysis.  ...  Finally, we use the aforementioned models for sentiment and semantic analyses and provide visualisation of results.  ...  morphology, etc).  ... 
doi:10.1109/access.2022.3152266 fatcat:3x3amj7jnrayjhqfq3ycdf5vi4

International Research Conference on Smart Computing and Systems Engineering SCSE 2020 Proceedings [Full Conference Proceedings]

2020 2020 International Research Conference on Smart Computing and Systems Engineering (SCSE)  
ACKNOWLEDGMENT The authors would like to thank the Department of Census and Department of Irrigation, Sri Lanka for providing the paddy yield and climate data for this study.  ...  in the University of Kelaniya for their immense support and encouragement they gave throughout the development phase of the data sets.  ...  The statistical models show the potential in achieving better natural language models for Sinhala.  ... 
doi:10.1109/scse49731.2020.9313027 fatcat:gjk5az2mprgvrpallwh6uhvlfi

Automatic Summarization

Martha Larson
2012 Foundations and Trends in Information Retrieval  
This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval.  ...  Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings.  ...  for search use syntax factors for search use syntax factors for surge using text fact or surge using tags fact or surge encode time information.  ... 
doi:10.1561/1500000020 fatcat:o424mjxnp5abbexhjsobtom2ry

Rachel Morton, and Hartmut Wick

Renlong Ai, Maria Bissiri, Hartmut Pfitzinger, Hans
2013 References Eric Atwell   unpublished
This survey examines the feedback in current Computer Assisted Pronunciation Training (CAPT) systems and focus on perceptual feedback.  ...  Shvetcov, for his guidance and support, and the anonymous reviewers for their insightful remarks.  ...  Ivan Sekaj for their support and to Aliancia Fair-Play for permission to execute some code on their servers.  ... 
fatcat:jiw36rob5nhlhd3zeknq2ohate

Towards Explainable Fact Checking [article]

Isabelle Augenstein
2021 arXiv   pre-print
While this has been known for some time, the issues this raises have been exacerbated by models increasing in size, and by EU legislation requiring models to be used for decision making to provide explanations  ...  Finally, the thesis presents some first solutions for explainable fact checking.  ...  “SemEval 2017 Task 10: ScienceIE - Extracting Keyphrases and Relations from Scientific Publications”.  ... 
arXiv:2108.10274v2 fatcat:5s4an6irezcjfmvvhmiaeqarh4

Discourse analysis of asynchronous conversations

Shafiq Rayhan Joty
2013
We propose novel computational models for topic segmentation and labeling, rhetorical parsing and dialog act recognition in asynchronous conversation.  ...  Effective processing of these conversations can be of great strategic value for both organizations and individuals.  ...  Traditionally keyphrase extraction is evaluated using precision, recall and F-measure based on exact matches between the extracted keyphrases and the human-assigned keyphrases [135, 140].  ... 
doi:10.14288/1.0165726 fatcat:jixdchdwqzecployo5xgznb534