75 Hits in 8.7 sec

Word Sense Disambiguation with LSTM: Do We Really Need 100 Billion Words? [article]

Minh Le, Marten Postma, Jacopo Urbani
2017 arXiv   pre-print
Recently, Yuan et al. (2016) have shown the effectiveness of using Long Short-Term Memory (LSTM) for performing Word Sense Disambiguation (WSD).  ...  Their proposed technique outperformed the previous state-of-the-art on several benchmarks, but neither the training data nor the source code was released.  ...  Experiments were also carried out on the Dutch national e-infrastructure with the support of SURF Cooperative.  ... 
arXiv:1712.03376v2 fatcat:ydhne4b2b5gjjcp2xvnfd4ch4m

KDSL: a Knowledge-Driven Supervised Learning Framework for Word Sense Disambiguation [article]

Shi Yin, Yi Zhou, Chenguang Li, Shangfei Wang, Jianmin Ji, Xiaoping Chen, Ruili Wang
2018 arXiv   pre-print
We propose KDSL, a new word sense disambiguation (WSD) framework that utilizes knowledge to automatically generate sense-labeled data for supervised learning.  ...  First, from WordNet, we automatically construct a semantic knowledge base called DisDict, which provides refined feature words that highlight the differences among word senses, i.e., synsets.  ...  For fairness, in Table 3 we only compare with methods that do not need manually labeled data.  ... 
arXiv:1808.09888v4 fatcat:yzr2cjhzvbgy7ejimfbxxwlk34
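The KDSL entry above builds its DisDict knowledge base from WordNet. As a rough illustration of the kind of WordNet lookups such a resource starts from (not the authors' actual pipeline), the sketch below collects candidate feature words for each synset of a target word from its lemmas, gloss, and immediate hypernyms; the function name and the choice of sources are assumptions.

```python
# Sketch: pull per-synset "feature words" from WordNet lemmas, glosses, and
# immediate hypernyms. This is not the authors' DisDict construction; it only
# illustrates the kind of knowledge-driven lookups such a resource can start from.
# Requires: nltk, plus nltk.download('wordnet') on first use.
from nltk.corpus import wordnet as wn

def synset_feature_words(word, pos=wn.NOUN):
    """Map each synset of `word` to a rough set of distinguishing words."""
    features = {}
    for syn in wn.synsets(word, pos=pos):
        words = set(syn.lemma_names())                      # synonyms
        words.update(syn.definition().split())              # gloss tokens
        for hyper in syn.hypernyms():                       # immediate hypernyms
            words.update(hyper.lemma_names())
        features[syn.name()] = words
    return features

if __name__ == "__main__":
    for name, words in synset_feature_words("bank").items():
        print(name, sorted(words)[:8])
```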

Improving the Coverage and the Generalization Ability of Neural Word Sense Disambiguation through Hypernymy and Hyponymy Relationships [article]

Loïc Vial, Benjamin Lecouteux, Didier Schwab
2018 arXiv   pre-print
In Word Sense Disambiguation (WSD), the predominant approach generally involves a supervised system trained on sense annotated corpora.  ...  to reduce the number of different sense tags that are necessary to disambiguate all words of the lexical database.  ...  supervised system is almost 100% on most WSD tasks, and so this provides a solid alternative to the automatic or semi-automatic creation of sense annotated corpora, and this nearly eliminates the need  ... 
arXiv:1811.00960v1 fatcat:iveebfygo5b7zktmyz5niutudi
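The entry above reduces the number of distinct sense tags a supervised WSD system must predict by exploiting hypernymy and hyponymy relationships. A minimal sketch of that general idea, assuming NLTK's WordNet interface and an arbitrary depth cutoff; the paper's actual tag-compression strategy is not reproduced here.

```python
# Sketch: collapse fine-grained synset tags onto ancestor synsets via WordNet
# hypernym paths, shrinking the sense-tag vocabulary. The depth cutoff below is
# an illustrative choice, not the paper's.
# Requires: nltk with the WordNet corpus downloaded.
from nltk.corpus import wordnet as wn

def compress_sense_tag(synset, max_depth=4):
    """Replace a synset tag with an ancestor at most `max_depth` steps from the root."""
    path = synset.hypernym_paths()[0]            # one root-to-synset path
    return path[min(max_depth, len(path) - 1)].name()

if __name__ == "__main__":
    for syn in wn.synsets("mouse", pos=wn.NOUN):
        print(syn.name(), "->", compress_sense_tag(syn))
```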

Embedding Words as Distributions with a Bayesian Skip-gram Model [article]

Arthur Bražinskas, Serhii Havrylov, Ivan Titov
2018 arXiv   pre-print
Interestingly, unlike the Gaussian embeddings, we can also obtain context-specific densities: they encode uncertainty about the sense of a word given its context and correspond to posterior distributions  ...  We introduce a method for embedding words as probability densities in a low-dimensional space.  ...  (WG(S)), W2G with the diagonal covariance matrix (WG(D)) and SG, respectively.  ... 
arXiv:1711.11027v2 fatcat:4cyop532ebbe7k44rb3jz7tjzq
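The entry above embeds words as probability densities rather than points. Density embeddings of this kind are typically compared with an asymmetric divergence; the sketch below computes the KL divergence between two diagonal-Gaussian word representations using made-up vectors, and is not the paper's variationally trained model.

```python
# Sketch: KL divergence between two diagonal-Gaussian word representations,
# the kind of asymmetric score density embeddings are usually compared with.
# The means and variances below are random stand-ins for illustration.
import numpy as np

def kl_diag_gaussians(mu0, var0, mu1, var1):
    """KL( N(mu0, diag(var0)) || N(mu1, diag(var1)) )."""
    return 0.5 * np.sum(var0 / var1 + (mu1 - mu0) ** 2 / var1
                        - 1.0 + np.log(var1 / var0))

rng = np.random.default_rng(0)
mu_a, var_a = rng.normal(size=50), np.full(50, 0.5)   # a broad, generic word
mu_b, var_b = rng.normal(size=50), np.full(50, 0.1)   # a narrow, specific word
print(kl_diag_gaussians(mu_b, var_b, mu_a, var_a))    # specific -> generic
print(kl_diag_gaussians(mu_a, var_a, mu_b, var_b))    # generic -> specific
```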

Word-Class Embeddings for Multiclass Text Classification [article]

Alejandro Moreo, Andrea Esuli, Fabrizio Sebastiani
2019 arXiv   pre-print
Pre-trained word embeddings encode general word semantics and lexical regularities of natural language, and have proven useful across many NLP tasks, including word sense disambiguation, machine translation  ...  We propose (supervised) word-class embeddings (WCEs), and show that, when concatenated to (unsupervised) pre-trained word embeddings, they substantially facilitate the training of deep-learning models  ...  The authors' opinions do not necessarily reflect those of the European Commission. Thanks to NVidia for donating the two Titan GPUs on which many of the experiments discussed in this paper were run.  ... 
arXiv:1911.11506v1 fatcat:74dnzbkpqjcf7jk4whjfpxy4ii
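The word-class embeddings entry above concatenates a supervised, class-aware channel onto unsupervised pre-trained vectors. A toy sketch of that concatenation, assuming a simple normalized word-by-class co-occurrence matrix as the supervised channel; the paper's exact WCE definition may differ.

```python
# Sketch: build a word-by-class matrix from labeled documents and concatenate
# it to pre-trained word vectors. The vectors and documents below are toy data;
# only the "concatenate a supervised channel onto unsupervised embeddings"
# idea is illustrated.
import numpy as np

def word_class_matrix(docs, labels, vocab, n_classes):
    """Rows = words, columns = classes; entries = row-normalized co-occurrence counts."""
    M = np.zeros((len(vocab), n_classes))
    index = {w: i for i, w in enumerate(vocab)}
    for doc, y in zip(docs, labels):
        for w in doc.split():
            if w in index:
                M[index[w], y] += 1.0
    row_sums = M.sum(axis=1, keepdims=True)
    return M / np.maximum(row_sums, 1.0)

vocab = ["goal", "election", "match", "senate"]
docs = ["goal match goal", "election senate", "match goal", "senate election senate"]
labels = [0, 1, 0, 1]                                    # 0 = sports, 1 = politics
pretrained = np.random.default_rng(0).normal(size=(len(vocab), 8))  # stand-in vectors
wce = word_class_matrix(docs, labels, vocab, n_classes=2)
augmented = np.concatenate([pretrained, wce], axis=1)    # shape (4, 10)
print(augmented.shape)
```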

Multi-word Entity Classification in a Highly Multilingual Environment

Sophie Chesney, Guillaume Jacquet, Ralf Steinberger, Jakub Piskorski
2017 Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)  
We also want to thank the IC1207 COST Action PARSEME and SIGLEX for their endorsement and support, as well as the EACL 2017 organizers.  ...  We would like to thank the members of the program committee for the timely reviews, authors for their valuable contributions, shared task organizers, annotators, and system developers for their hard work  ...  We thank the annotators for their work and the anonymous reviewers for their insightful comments. We thank Nikola Ljubešić for his help with the hrMWELex lexicon.  ... 
doi:10.18653/v1/w17-1702 dblp:conf/mwe/ChesneyJSP17 fatcat:bv7aavgth5eurmzuphuowtuuhq

Deep Learning for Political Science [article]

Kakia Chatsiou, Slava Jankin Mikhaylov
2020 arXiv   pre-print
Focusing on the areas where the strengths of the methods coincide with challenges in these fields, the chapter first presents an introduction to AI and its core technology - machine learning, with its  ...  The discussion of deep neural networks is illustrated with the NLP tasks that are relevant to political science.  ...  100 billion tokens from Google News).  ... 
arXiv:2005.06540v1 fatcat:kz2cbxjrmrfhdlfss5gqziefoq

Automated doubt identification from informal reflections through hybrid sentic patterns and machine learning approach

Siaw Ling Lo, Kar Way Tan, Eng Lieh Ouh
2021 Research and Practice in Technology Enhanced Learning  
With the focus on learner-centered pedagogy, is it feasible to provide timely and relevant guidance to individual learners according to their levels of understanding?  ...  In this paper, we derived a hybrid approach that leverages a novel Doubt Sentic Pattern Detection (SPD) algorithm and a machine learning model to automate the identification of doubts from students' informal  ...  On the other hand, the word2vec pre-trained model includes word vectors for 3 million words and phrases, trained on roughly 100 billion words from a Google News dataset and stored using a 300-dimension  ... 
doi:10.1186/s41039-021-00149-9 fatcat:vxnuz3zdpbdyxmszyiqxgmuqqi
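The snippet above refers to the pre-trained Google News word2vec model (3 million words and phrases, 300-dimensional vectors). A minimal sketch of loading and querying it with gensim; the local file name is an assumption and the binary must be downloaded separately.

```python
# Sketch: load the pre-trained Google News word2vec vectors with gensim and
# query them. The file path below is an assumption about where the separately
# downloaded binary lives.
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

print(vectors["doubt"].shape)               # (300,)
print(vectors.most_similar("doubt", topn=5))
```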

On the Linguistic Representational Power of Neural Machine Translation Models

Yonatan Belinkov, Nadir Durrani, Fahim Dalvi, Hassan Sajjad, James Glass
2020 Computational Linguistics  
In particular, we seek answers to the following questions: (i) How accurately is word-structure captured within the learned representations, which is an important aspect in translating morphologically-rich  ...  We conduct a thorough investigation along several parameters: (i) Which layers in the architecture capture each of these linguistic phenomena; (ii) How does the choice of translation unit (word, character  ...  Acknowledgments This work was funded by the QCRI, HBKU, as part of the collaboration with the MIT, CSAIL.  ... 
doi:10.1162/coli_a_00367 fatcat:5xux2ogrbfhozncsgkneuaz2k4
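Both versions of this paper ask which encoder layers capture which linguistic properties. A common way to operationalize that question is a probing classifier trained on frozen per-token representations; the sketch below uses random stand-in arrays and scikit-learn, and is not the authors' exact experimental setup.

```python
# Sketch of a probing classifier: take per-token representations extracted from
# one encoder layer, keep them frozen, and train a light classifier to predict
# a linguistic property (e.g. POS or morphological tags). Accuracy is then
# compared across layers. The arrays below are random stand-ins.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
layer_reprs = rng.normal(size=(5000, 512))       # token vectors from one layer
pos_tags = rng.integers(0, 12, size=5000)        # token-level tag ids

X_tr, X_te, y_tr, y_te = train_test_split(layer_reprs, pos_tags, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("probe accuracy:", probe.score(X_te, y_te))
```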

On the Linguistic Representational Power of Neural Machine Translation Models [article]

Yonatan Belinkov, Nadir Durrani, Fahim Dalvi, Hassan Sajjad, James Glass
2019 arXiv   pre-print
In particular, we seek answers to the following questions: (i) How accurately is word-structure captured within the learned representations, an important aspect in translating morphologically-rich languages  ...  We conduct a thorough investigation along several parameters: (i) Which layers in the architecture capture each of these linguistic phenomena; (ii) How does the choice of translation unit (word, character  ...  Acknowledgements This work was funded by the QCRI, HBKU, as part of the collaboration with the MIT, CSAIL.  ... 
arXiv:1911.00317v1 fatcat:uncw5nmpefhrpo7jtverg6c7qy

Neural Machine Translation [article]

Philipp Koehn
2017 arXiv   pre-print
For billion-word corpora, even with the use of GPUs, training takes several days with modern compute clusters.  ...  Training is faster, since we only need to compute the output node value for the given training and noise examples: there is no need to compute the other values, since we do not normalize with the softmax.  ...  We built English-Spanish systems on WMT data, about 385.7 million English words paired with Spanish. To obtain a learning curve, we used 1/1024, 1/512, ..., 1/2, and all of the data.  ... 
arXiv:1709.07809v1 fatcat:kj23sup7yfaxvllfha4v7xbugq
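The snippet above notes that training is faster when only the observed target and a handful of sampled noise words are scored, instead of normalizing over the whole output vocabulary with a softmax. A minimal numpy sketch of that idea; shapes and the noise-sample count are illustrative, not the book's implementation.

```python
# Sketch: score only the observed target word plus a few sampled noise words,
# rather than computing logits for the entire vocabulary. Only |candidates|
# dot products are needed instead of vocab_size of them.
import numpy as np

rng = np.random.default_rng(0)
vocab_size, hidden = 10_000, 128
W = rng.normal(scale=0.01, size=(vocab_size, hidden))   # output embedding matrix
hidden_state = rng.normal(size=hidden)                  # decoder state for one step

target = 1234                                           # observed next word
noise = rng.integers(0, vocab_size, size=64)            # sampled noise words
candidates = np.concatenate(([target], noise))

logits = W[candidates] @ hidden_state                   # 65 dot products, not 10,000
print(logits.shape)                                     # (65,)
```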

Fully-Unsupervised Embeddings-Based Hypernym Discovery

Maurizio Atzori, Simone Balloccu
2020 Information  
words if provided, allowing one to find common hypernyms for a set of co-hyponyms, a task ignored in other systems but very useful when coupled with set expansion (which finds co-hyponyms automatically).  ...  We also evaluate the algorithm on a new dataset to measure the improvements when finding hypernyms for sets of words instead of singletons.  ...  The disambiguation is achieved by using "sense-aware word embeddings" that capture multiple meanings for a single term.  ... 
doi:10.3390/info11050268 fatcat:rck4yyygubemvg2wysfj2lr2ti
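The entry above finds common hypernyms for a set of co-hyponyms. One simple embeddings-based way to approach this is to average the co-hyponyms' vectors and rank other vocabulary items by cosine similarity; the sketch below uses random toy vectors and omits the sense-aware embeddings and other heuristics the authors describe.

```python
# Sketch: rank hypernym candidates for a set of co-hyponyms by averaging their
# embeddings and scoring other vocabulary items with cosine similarity.
# The vectors below are random stand-ins, so the ranking is meaningless;
# only the procedure is illustrated.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["dog", "cat", "horse", "animal", "furniture", "vehicle"]
E = {w: rng.normal(size=100) for w in vocab}      # stand-in embeddings

def hypernym_candidates(co_hyponyms, embeddings, topn=3):
    query = np.mean([embeddings[w] for w in co_hyponyms], axis=0)
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    scored = [(w, cos(query, v)) for w, v in embeddings.items()
              if w not in co_hyponyms]
    return sorted(scored, key=lambda x: -x[1])[:topn]

print(hypernym_candidates(["dog", "cat", "horse"], E))
```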

Harbsafe-162. A Domain-Specific Data Set for the Intrinsic Evaluation of Semantic Representations for Terminological Data [article]

Susanne Arndt, Dieter Schnäpp
2020 arXiv   pre-print
This application is needed to solve a more complex problem: the harmonization of terminologies of standards and standards bodies (i.e. resolution of doublettes and inconsistencies).  ...  Considering recent criticism on intrinsic evaluation methods, the article concludes with an evaluation of Harbsafe-162 and joins a more general discussion about the nature of similarity rating tasks.  ...  They argue that they do not have to be sense-specific, since separate senses are just an expression of lexicographic convention.  ... 
arXiv:2005.14576v1 fatcat:vk34niystvfhvaalvr6rzdiaoi

Neural Machine Reading Comprehension: Methods and Trends

Shanshan Liu, Xin Zhang, Sheng Zhang, Hui Wang, Weiming Zhang
2019 Applied Sciences  
Machine reading comprehension (MRC), which requires a machine to answer questions based on a given context, has attracted increasing attention with the incorporation of various deep-learning techniques  ...  Specifically, we give a thorough review of this research field, covering different aspects including (1) typical MRC tasks: their definitions, differences, and representative datasets; (2) the general  ...  In the process of human reading comprehension, we may use common sense when we cannot answer a question simply by knowing about the context.  ... 
doi:10.3390/app9183698 fatcat:bpwwfikrpvh4dhphyl3ezpnn5e

GPT-GNN: Generative Pre-Training of Graph Neural Networks [article]

Ziniu Hu and Yuxiao Dong and Kuansan Wang and Kai-Wei Chang and Yizhou Sun
2020 arXiv   pre-print
One effective way to reduce the labeling effort is to pre-train an expressive GNN model on unlabeled data with self-supervision and then transfer the learned model to downstream tasks with only a few labels  ...  In this paper, we present the GPT-GNN framework to initialize GNNs by generative pre-training.  ...  Thus, we only consider them for Attribute Generation, with a 2-layer LSTM as decoder.  ... 
arXiv:2006.15437v1 fatcat:h5jithn2uvginbechslaufc7cy
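The GPT-GNN entry above pre-trains a GNN with a self-supervised generative objective before transferring it to downstream tasks with few labels. A very rough plain-numpy sketch of one ingredient of that idea, masked attribute reconstruction over a toy graph; it omits the paper's edge-generation task, heterogeneous graphs, and LSTM attribute decoder.

```python
# Sketch: mask some node attributes, propagate information over the graph with
# a single GCN-style step, and measure how well the masked attributes are
# reconstructed. A real pipeline would train the weights on this objective;
# here they are random, so only the forward pass is illustrated.
import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 8
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.maximum(A, A.T)
np.fill_diagonal(A, 1.0)                             # symmetric adjacency + self-loops
D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
A_hat = D_inv_sqrt @ A @ D_inv_sqrt                  # normalized adjacency

X = rng.normal(size=(n, d))                          # node attributes
mask = np.zeros(n, dtype=bool)
mask[:2] = True                                      # nodes whose attributes are hidden
X_in = X.copy()
X_in[mask] = 0.0

W = rng.normal(scale=0.1, size=(d, d))               # GCN weight
W_dec = rng.normal(scale=0.1, size=(d, d))           # attribute decoder weight
H = np.maximum(A_hat @ X_in @ W, 0.0)                # one GCN layer with ReLU
X_rec = H @ W_dec                                    # reconstructed attributes

loss = np.mean((X_rec[mask] - X[mask]) ** 2)         # self-supervised objective
print("attribute reconstruction loss:", loss)
```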
Showing results 1–15 of 75.