Comparative study of monolingual and multilingual search models for use with Asian languages

Jacques Savoy
2005 ACM Transactions on Asian Language Information Processing  
Based on the NTCIR-4 test-collection, our first objective is to present an overview of the retrieval effectiveness of nine vector-space and two probabilistic models that perform monolingual searches in  ...  Finally, we address basic problems related to multilingual searches, in which queries written in English are used to search documents written in the English, Chinese, Japanese, and Korean languages.  ... 
doi:10.1145/1105696.1105701 fatcat:zpiscx4znzgyvfyzqqsxxckniq

Extremely low-resource neural machine translation for Asian languages

Raphael Rubino, Benjamin Marie, Raj Dabre, Atsushi Fujita, Masao Utiyama, Eiichiro Sumita
2021 Machine Translation  
search, data augmentation with forward and backward translation in combination with tags and noise, as well as joint multilingual training.  ...  This paper presents a set of effective approaches to handle extremely low-resource language pairs for self-attention based neural machine translation (NMT) focusing on English and four Asian languages  ...  We compare the results obtained with our NMT models to SMT, with and without large language models trained on monolingual corpora.  ... 
doi:10.1007/s10590-020-09258-6 fatcat:6xvzc5x2anfdteobb7y22rvfsy

Supporting Multilingual Information Retrieval in Web Applications: An English-Chinese Web Portal Experiment [chapter]

Jialun Qin, Yilu Zhou, Michael Chau, Hsinchun Chen
2003 Lecture Notes in Computer Science  
Cross-language information retrieval (CLIR) and multilingual information retrieval (MLIR) techniques have been widely studied, but they are not often applied to and evaluated for Web applications.  ...  The approach was evaluated by domain experts and the results showed that co-occurrence-based phrasal translation achieved a 74.6% improvement in precision when compared with simple word-by-word translation  ...  Such expansion will allow us to study whether the reported techniques will perform differently for a multilingual Web portal with more than two languages.  ... 
doi:10.1007/978-3-540-24594-0_13 fatcat:h35oxmhgfnesrdbavcbzlmxoru

Cross-Language Information Retrieval: the way ahead

Fredric C. Gey, Noriko Kando, Carol Peters
2005 Information Processing & Management  
In particular, we find that insufficient attention has been given to the Web as a resource for multilingual research, and to languages which are spoken by hundreds of millions of people in the world but  ...  We present our view of some major directions for CLIR research in the future.  ...  Only seven groups submitted multilingual CLIR runs and many participants focused on using English topics to search Asian language documents for bilingual CLIR.  ... 
doi:10.1016/j.ipm.2004.06.006 fatcat:xy2zdmzd2ffy5op7wxlrd4avkq

Softmax Tempering for Training Neural Machine Translation Models [article]

Raj Dabre, Atsushi Fujita
2020 arXiv   pre-print
We also study the impact of softmax tempering on multilingual NMT and recurrently stacked NMT, both of which aim to reduce the NMT model size by parameter sharing thereby verifying the utility of temperature  ...  Neural machine translation (NMT) models are typically trained using a softmax cross-entropy loss where the softmax distribution is compared against smoothed gold labels.  ...  Acknowledgments A part of this work was conducted under the commissioned research program "Research and Development of Advanced Multilingual Translation Technology" in the "R&D Project for Information  ... 
arXiv:2009.09372v1 fatcat:nr4vhnrufzeazkiiebsuvdoq2u

Multilingual word translation using auxiliary languages

Hagai Taitelbaum, Gal Chechik, Jacob Goldberger
2019 Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)  
For each source word, we first search for the most relevant languages. We then use the auxiliary translations to these languages to form an improved representation of the source word.  ...  In this study we propose a multilingual translation procedure that uses all the learned mappings to translate a word from one language to another.  ...  We discussed several variants for deciding which and how many languages should be used as suitable auxiliary languages.  ... 
doi:10.18653/v1/d19-1134 dblp:conf/emnlp/TaitelbaumCG19 fatcat:5yopaknevrbjdb2bvlkqie4r4e

ByT5 model for massively multilingual grapheme-to-phoneme conversion [article]

Jian Zhu, Cong Zhang, David Jurgens
2022 arXiv   pre-print
Pairwise comparison with monolingual models in these languages suggests that multilingual ByT5 models generally lower the phone error rate by jointly learning from a variety of languages.  ...  We have curated a G2P dataset from various sources that covers around 100 languages and trained large-scale multilingual G2P models based on ByT5.  ...  Evaluations During evaluation, we used beam search to generate the predicted pronunciations with a beam size of 5.  ... 
arXiv:2204.03067v1 fatcat:b56upscqjfhrzdkkdj33jbaxz4

Improving non-English web searching (iNEWS07)

Fotis Lazarinis, Jesus Vilares Ferro, John Tait
2007 SIGIR Forum  
Conclusions were that search engines would be more effective if they took more account of the properties of individual languages, and that there is a need for more studies of real user behaviour in practical  ...  They do not take full account of inflectional semantics nor, for example, diacritics or the use of capitals.  ...  In his talk Professor de Rijke mentioned that "Over the past few years there has been a lot of progress in technology used for addressing monolingual or multilingual web queries in languages other than  ... 
doi:10.1145/1328964.1328977 fatcat:bpjjvbbbl5gwxphcxtnkd2mqvy

Contrastive Learning for Many-to-many Multilingual Neural Machine Translation [article]

Xiao Pan, Mingxuan Wang, Liwei Wu, Lei Li
2021 arXiv   pre-print
For non-English directions, mRASP2 achieves an improvement of average 10+ BLEU compared with the multilingual Transformer baseline.  ...  representations of different languages, and b) data augmentation on both multiple parallel and monolingual data to further align token representations.  ...  For non-English directions, mRASP2 achieves an improvement of average 10+ BLEU compared with the multilingual Transformer baseline.  ... 
arXiv:2105.09501v3 fatcat:2yui6p4t3bhgzpbmxshb6xctfi

Multilingual Web retrieval: An experiment in English–Chinese business intelligence

Jialun Qin, Yilu Zhou, Michael Chau, Hsinchun Chen
2006 Journal of the American Society for Information Science and Technology  
In this article, the authors present their research in developing and evaluating a multilingual English-Chinese Web portal that incorporates various CLIR techniques for use in the business domain.  ...  Cross-language information retrieval (CLIR), the study of retrieving information in one language by queries expressed in another language, is a promising approach to the problem.  ...  Finally, we also want to thank the domain experts who took part in the evaluation study.  ... 
doi:10.1002/asi.20329 fatcat:kai3o7cidfdlvjcwkdkx2hwmzy

CVIT's submissions to WAT-2019

Jerin Philip, Shashank Siripragada, Upendra Kumar, Vinay Namboodiri, C V Jawahar
2019 Proceedings of the 6th Workshop on Asian Translation  
We employ Transformer architecture experimenting with multilingual models and methods for low-resource languages.  ...  This paper describes the Neural Machine Translation systems used by IIIT Hyderabad (CVIT-MT) for the translation tasks part of WAT-2019.  ...  Acknowledgements We thank the multilingual milieu at our lab which enable us to worry less about the challenges in interpreting the results which comes with the many languages involved - special thanks  ... 
doi:10.18653/v1/d19-5215 dblp:conf/aclwat/PhilipSKNJ19 fatcat:qe7yveko2jbe3a6yk7szp7bw4u


Pezhman Sheinidashtego, Aibek Musaev
2019 Zenodo  
Recent advances in generating monolingual word embeddings based on word co-occurrence for universal languages inspired new efforts to extend the model to support diversified languages.  ...  State-of-the-art methods for learning cross-lingual word embeddings rely on the alignment of monolingual word embedding spaces.  ...  ACKNOWLEDGEMENTS I would like to thank the Computer Science Department at The University of Alabama, because through their funding and support, the faculty and staff had made it possible for me to work  ... 
doi:10.5281/zenodo.3889326 fatcat:cfmiwnmcazh5flfdbpsr6jwh5q

Cross Language Information Retrieval (CLIR): A Survey of Approaches for Exploring Web Across Languages

Starting with the most fundamental approaches of translation, it is attempted to study and present a review of more advanced approaches for enhancing the retrieval results in CLIR proposed by various researchers  ...  The need of information retrieval systems to become multilingual has given rise to the research in Cross Language Information Retrieval (CLIR) which can cross the language barriers and retrieve more relevant  ...  of the query and for estimating the domain of search results using hierarchic structures of Web directories" [22].  ... 
doi:10.35940/ijitee.k7833.1110120 fatcat:qnfk26zmnndt7loi2kujjfiwr4

Cross-Lingual Approaches to Reference Resolution in Dialogue Systems [article]

Amr Sharaf, Arpit Gupta, Hancheng Ge, Chetan Naik, Lambert Mathias
2018 arXiv   pre-print
Furthermore, when combined with machine translation we can get performance very close to actual live data in the target language, with only 25% of the data projected into the target language.  ...  In the cross-lingual setup, we assume there is access to annotated resources as well as a well trained model in the source language and little to no annotated data in the target language.  ...  We use standard definitions of precision, recall and F1 by comparing the reference slots with the model hypothesis slots.  ... 
arXiv:1811.11161v1 fatcat:fxt6dfdtojef7imk25nj7imcoq

Historical knowledge and reinventing English writing teacher identity in Asia

Xiaoye You
2016 Writing & Pedagogy  
Recent scholarship in applied linguistics and literacy studies has suggested ways to embrace multilingualism in teaching and research.  ...  Next, drawing on historical examples related to the teaching of English writing in China, I demonstrate that Chinese students and teachers have struggled with a monolingual ideology endorsed by the state  ...  China About the author Xiaoye You is an associate professor of English and Asian Studies at Pennsylvania State University, USA and a Yunshan Chair Professor at Guangdong University of Foreign Studies  ... 
doi:10.1558/wap.31016 fatcat:7jc667hvnnhdhp3qb7h7lkrkny