
NTHU at NTCIR-10 CrossLink-2: An Approach toward Semantic Features

Yu-Lan Liu, Joanne Boisson, Jason S. Chang
2013 NTCIR Conference on Evaluation of Information Access Technologies  
of anchor texts in Chinese Wikipedia.  ...  In this task, we aim to discover valuable anchors in Chinese, Japanese or Korean (CJK) articles and to link these anchors to related English Wikipedia pages.  ...  However, HITS' method relied on the original cross-lingual resources in Wikipedia to address the anchor mining and cross-lingual linking subproblems.  ... 
dblp:conf/ntcir/LiuBC13 fatcat:ro62au3jmrgdpf76hhbe2mople

The Effectiveness of Cross-lingual Link Discovery

Ling-Xiang Tang, Kelly Y. Itakura, Shlomo Geva, Andrew Trotman, Yue Xu
2011 NTCIR Conference on Evaluation of Information Access Technologies  
Cross-lingual link discovery is a way of automatically finding prospective links between documents in different languages, which is particularly helpful for knowledge discovery across different language domains  ...  This paper describes the evaluation framework for benchmarking the effectiveness of cross-lingual link discovery (CLLD).  ...  CONCLUSION AND FUTURE WORK In this paper we describe the motivation for cross-lingual link discovery in a knowledge base such as Wikipedia.  ... 
dblp:conf/ntcir/TangIGTX11 fatcat:acoaju4ppbenfnbdnum5saamrm

An evaluation framework for cross-lingual link discovery

Ling-Xiang Tang, Shlomo Geva, Andrew Trotman, Yue Xu, Kelly Y. Itakura
2014 Information Processing & Management  
This is particularly helpful in knowledge discovery if a multi-lingual knowledge base is sparse in one language or another, or the topical coverage in each language is different; such is the case with  ...  The assessments are further divided into two separate sets: manual assessments performed by human assessors; and automatic assessments based on links extracted from Wikipedia itself.  ...  Together, CrossLink and CrossLink-2 provide a real opportunity to enhance Wikipedia cross-language linking, and we look forward to reporting on that work in the future.  ... 
doi:10.1016/j.ipm.2013.07.003 fatcat:27tuyqp4brfjbpfdavjxuzr54q

Simple Yet Effective Methods for Cross-Lingual Link Discovery (CLLD) - KMI @ NTCIR-10 CrossLink-2

Petr Knoth, Drahomira Herrmannova
2013 NTCIR Conference on Evaluation of Information Access Technologies  
Our methods (team KMI) achieved the best overall results in the English to Chinese, Japanese and Korean (E2CJK) task of the NTCIR-10 CrossLink-2 evaluation and were the top performers in the Chinese, Japanese  ...  Cross-Lingual Link Discovery (CLLD) aims to automatically find links between documents written in different languages.  ...  In Section 2, we present the CLLD methods designed by team KMI that can be used to suggest a set of cross-lingual links from an English Wikipedia article to articles in Chinese, Japanese and Korean (English  ... 
dblp:conf/ntcir/KnothH13 fatcat:p4ysm4gccfeqlog5yy2tghk7ba

The RELX Dataset and Matching the Multilingual Blanks for Cross-Lingual Relation Classification [article]

Abdullatif Köksal, Arzucan Özgür
2020 arXiv   pre-print
To overcome this issue, we propose two cross-lingual relation classification models: a baseline model based on Multilingual BERT and a new multilingual pretraining setup, which significantly improves the  ...  Relation classification is one of the key topics in information extraction, which can be used to construct knowledge bases or to provide useful information for question answering.  ...  on the translation and for their quality check service.  ... 
arXiv:2010.09381v1 fatcat:3tbw5kmvhvbpbaaweq6sdcyjv4

XOR QA: Cross-lingual Open-Retrieval Question Answering [article]

Akari Asai, Jungo Kasai, Jonathan H. Clark, Kenton Lee, Eunsol Choi, Hannaneh Hajishirzi
2021 arXiv   pre-print
Based on this dataset, we introduce three new tasks that involve cross-lingual document retrieval using multi-lingual and English resources.  ...  We establish baselines with state-of-the-art machine translation systems and cross-lingual pretrained models.  ...  We thank Sewon Min, Kristina Toutanova, David Wadden, the members of the UW NLP group, and the anonymous reviewers for their insightful feedback on this paper, Nancy Li, Xun Cao, Hitesh Boinpally, Samek  ... 
arXiv:2010.11856v3 fatcat:7mujquejk5fxxmhggc22plic7u

ZusammenQA: Data Augmentation with Specialized Models for Cross-lingual Open-retrieval Question Answering System [article]

Chia-Chien Hung, Tommaso Green, Robert Litschko, Tornike Tsereteli, Sotaro Takeshita, Marco Bombieri, Goran Glavaš, Simone Paolo Ponzetto
2022 arXiv   pre-print
This paper introduces our proposed system for the MIA Shared Task on Cross-lingual Open-retrieval Question Answering (COQA).  ...  For passage retrieval, we evaluated the monolingual BM25 ranker against an ensemble of re-rankers based on multilingual pretrained language models (PLMs), as well as variants of the shared task baseline,  ...  It is made of two mBERT-based encoders, one for the question and one for the passages.  ... 
arXiv:2205.14981v1 fatcat:72s2goyk2zhzvii7knfvicwkku
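The snippet above names a monolingual BM25 ranker as the retrieval baseline. As a rough illustration only (not the authors' implementation; the function name, the pre-tokenized input format, and the defaults k1=1.5, b=0.75 are common conventions assumed here, not values taken from the paper), an Okapi BM25 scorer might look like:

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against `query_terms` with Okapi BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # document frequency of each query term across the collection
    df = {t: sum(1 for d in docs if t in d) for t in query_terms}
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            denom = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            s += idf * tf[t] * (k1 + 1) / denom
        scores.append(s)
    return scores

docs = [["cross", "lingual", "qa"], ["open", "retrieval", "qa"], ["cats"]]
scores = bm25_scores(["cross", "qa"], docs)
# the first document matches both query terms, so it ranks highest
```

A term absent from a document contributes zero (tf[t] == 0), so documents sharing no query terms score 0.0.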


Masumi Shirakawa, Takahiro Hara, Shojiro Nishio
2014 Proceedings of the VLDB Endowment  
Given a new tweet as an input, MLJ generates a vector using CL-ESA and classifies it into one of the clusters using one-pass DP-means.  ...  To overcome the language barrier and the sparsity of words, MLJ harnesses CL-ESA, a Wikipedia-based language-independent method to generate a vector of Wikipedia pages (entities) from an input text.  ...  Future work includes supporting more languages such as Chinese and Russian, detecting popular news stories in each country or across borders, and evaluating the precision and recall of search results  ... 
doi:10.14778/2733004.2733041 fatcat:u2y2ylo7xngfrioapjedjzzxby
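The snippet sketches a two-step pipeline: CL-ESA produces a vector, and one-pass DP-means assigns it to a cluster. As a minimal sketch of the clustering step alone (not the authors' code; the function name and the distance threshold `lam` are hypothetical placeholders), a single-pass DP-means assignment could be written as:

```python
import math

def one_pass_dp_means(vectors, lam):
    """Assign each vector to a cluster in a single pass, DP-means style:
    a vector farther than `lam` from every existing centroid starts a new cluster."""
    centroids, counts, labels = [], [], []
    for v in vectors:
        if centroids:
            dists = [math.dist(v, c) for c in centroids]
            k = min(range(len(dists)), key=dists.__getitem__)
        if not centroids or dists[k] > lam:
            # no centroid is close enough: open a new cluster at this point
            centroids.append(list(v))
            counts.append(1)
            labels.append(len(centroids) - 1)
        else:
            # assign to nearest cluster and update its centroid incrementally
            counts[k] += 1
            centroids[k] = [c + (x - c) / counts[k] for c, x in zip(centroids[k], v)]
            labels.append(k)
    return labels, centroids

labels, cents = one_pass_dp_means([(0, 0), (0.1, 0), (5, 5), (5, 5.1)], lam=1.0)
# → labels [0, 0, 1, 1]: the two nearby pairs form two clusters
```

Unlike k-means, the number of clusters is not fixed in advance; it grows whenever a point is farther than the threshold from all existing centroids, which suits streaming inputs such as tweets.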

xGQA: Cross-Lingual Visual Question Answering [article]

Jonas Pfeiffer and Gregor Geigle and Aishwarya Kamath and Jan-Martin O. Steitz and Stefan Roth and Ivan Vulić and Iryna Gurevych
2022 arXiv   pre-print
We extend the established English GQA dataset to 7 typologically diverse languages, enabling us to detect and explore crucial challenges in cross-lingual visual question answering.  ...  We further propose new adapter-based approaches to adapt multimodal transformer-based models to become multilingual, and -- vice versa -- multilingual models to become multimodal.  ...  Ribeiro, Ji-Ung Lee, and Chen Liu for insightful feedback and suggestions on a draft of this paper.  ... 
arXiv:2109.06082v2 fatcat:3wofbl56ffbujfutrxlkpjubfq

Linguistically Driven Multi-Task Pre-Training for Low-Resource Neural Machine Translation

Zhuoyuan Mao, Chenhui Chu, Sadao Kurohashi
2022 ACM Transactions on Asian and Low-Resource Language Information Processing  
Experiments on the ASPEC Japanese–English and Japanese–Chinese, Wikipedia Japanese–Chinese, and News English–Korean corpora demonstrate that JASS and ENSS outperform MASS and other existing language-agnostic pre-training  ...  JASS focuses on masking and reordering Japanese linguistic units known as bunsetsu, whereas ENSS is based on phrase structure masking and reordering tasks.  ...  scenarios for Wikipedia Japanese-Chinese [4, 5] and News English-Korean [32] translations.  ... 
doi:10.1145/3491065 fatcat:plugwg6sxvfczlteqbgnbx5lpu

Mining Documents and Sentiments in Cross-lingual Context

Motaz Saad
2016 Figshare  
Then, we propose a cross-lingual sentiment annotation method to label source and target documents with sentiments.  ...  Second, we present a cross-lingual document similarity measure to automatically retrieve and align comparable documents.  ...  These methods are based on bilingual dictionaries, on Cross-Lingual Information Retrieval (CL-IR), or on Cross-Lingual Latent Semantic Indexing (CL-LSI).  ... 
doi:10.6084/m9.figshare.3204040.v1 fatcat:5kb4k2kylnc7nhdumanxjw5wpe

Current Approaches and Applications in Natural Language Processing

Arturo Montejo-Ráez, Salud María Jiménez-Zafra
2022 Applied Sciences  
Artificial Intelligence has gained a lot of popularity in recent years, thanks mainly to the advent of Deep Learning techniques [...]  ...  to Korean translation.  ...  Different combinations of these approaches were evaluated on two tasks, bilingual dictionary induction and cross-lingual sentiment analysis.  ... 
doi:10.3390/app12104859 fatcat:yhoyyoqcazflrbx7veksnkrrdq

A Systematic Literature Review on Extraction of Parallel Corpora from Comparable Corpora

Dilshad Kaur, Satwinder Singh
2021 Journal of Computer Science  
Machine Translation, which depends on corpora availability, is a medium for meeting this high demand for translation. Parallel corpora are used to gain most translation knowledge.  ...  In today's globalized scenario, the demand for translation is high and increasing rapidly across a number of fields, but it is difficult to translate everything manually.  ...  Satwinder Singh, HOD and Associate Professor, Computer Science and Technology, Central University of Punjab, for guiding me throughout and giving valuable inputs for improving this review article.  ... 
doi:10.3844/jcssp.2021.924.952 fatcat:irlfbohfhzgpjo7qgoxzptlh6e

POLYGLOT-NER: Massive Multilingual Named Entity Recognition [article]

Rami Al-Rfou, Vivek Kulkarni, Bryan Perozzi, Steven Skiena
2014 arXiv   pre-print
Second, for languages where no gold-standard benchmarks are available, we propose a new method, distant evaluation, based on statistical machine translation.  ...  The increasing diversity of languages used on the web introduces a new level of complexity to Information Retrieval (IR) systems.  ...  Wikipedia Cross-lingual links will be used in combination with Freebase to extend our approach to all languages as future work.  ... 
arXiv:1410.3791v1 fatcat:kqkxgidkgzf2lp4iuh4twuicnm

Cross-lingual text classification with model translation and document translation

Teng-Sheng Moh, Zhang Zhang
2012 Proceedings of the 50th Annual Southeast Regional Conference on - ACM-SE '12  
In this project, the author proposes a new method which adopts both model translation and document translation.  ...  This method can take advantage of the best functionality of both the document translation and model translation methods.  ...  The difference between poly-lingual and cross-lingual training is that for poly-lingual training we have labeled data for all languages, whereas for cross-lingual training we only have a labeled corpus for one base  ... 
doi:10.1145/2184512.2184530 dblp:conf/ACMse/MohZ12 fatcat:fx7qigosrzg4rdadtzre4p4azq