Building Machine Translation Systems for the Next Thousand Languages [article]

Ankur Bapna, Isaac Caswell, Julia Kreutzer, Orhan Firat, Daan van Esch, Aditya Siddhant, Mengmeng Niu, Pallavi Baljekar, Xavier Garcia, Wolfgang Macherey, Theresa Breiner, Vera Axelrod (+12 others)
2022 arXiv   pre-print
models, highlighting several frequent error modes of these types of models.  ...  filtering techniques; (ii) Developing practical MT models for under-served languages by leveraging massively multilingual models trained with supervised parallel data for over 100 high-resource languages  ...  A good example of what it means for words to have different meanings but to be distributionally similar given the usage of the language is the translation of the string "English Language".  ... 
arXiv:2205.03983v2 fatcat:smfytyrwcjdjhcqp6f2wrr4riu

Word Division in the Transcription of Chinese Script in the Title Fields of Bibliographic Records

Clément Arsenault
2001 Cataloging & Classification Quarterly  
Thus, transliteration and transcription are two similar but distinctive aspects of script conversion.  ...  In word-based approaches the idea is to transform the linear undivided string of characters into a word-segmented text.  ...  5 Word count based on number of characters, with multi-character personal and place names counting as one.  ... 
doi:10.1300/j104v32n03_08 fatcat:hqugqtngxbabje243cvrt6u6ce

The Unicode Cookbook for Linguists: Managing writing systems using orthography profiles

Steven Moran, Michael Cysouw
C3. the symbol-string to be inserted for unmatched strings in the tokenized and transliterated output.  ...  C10. the tokenized strings, with additionally any transliterated strings, if transliteration is requested.  ...  as separator tokenize( example , profile = "~/Desktop/profile_skeleton.txt" , regex = TRUE , transliterate = "IPA" , sep = "" )$strings ## originals tokenized transliterated ## 1 cane cane kane ## 2 cena  ... 
doi:10.5167/uzh-135400 fatcat:lqa22wbbirdfhcrdkt6zbdppw4

Joint Discourse-aware Concept Disambiguation and Clustering

Angela Petra Fahrni
Concept disambiguation is the task of linking common nouns and proper names in a text (henceforth called mentions) to their corresponding concepts in a predefined inventory.  ...  In this thesis, we investigate concept disambiguation and clustering from a discourse perspective and propose a discourse-aware approach for joint concept disambiguation and clustering in the framework  ...  This relatedness measure has been successfully applied for disambiguation (Milne & Witten, 2008b; Kulkarni et al., 2009; and leads to higher results than for instance measures based on the cosine similarity  ... 
doi:10.11588/heidok.00020737 fatcat:vhljgiqbrbcwtpjxce6k4vcp2a

Lexikos 18

Lexikos
2012 Lexikos  
A. Wilkes. Thanks too, to the Publisher who was willing to embark on this innovative project, as well as to Ghent University for its continued support of my field trips to South Africa.  ...  It is based on written and oral data, namely liturgical texts, prayers, hymns, sermons, and the comments of those moderating the masses.  ... 
doi:10.5788/18-0-503 fatcat:3jxecvzxdzfdzjq2ytxosp2xqm

Chapter Three. STRUCTURE [chapter]

2019 Keys to The Gift  
The Interlinguistic Pun. Nabokov is famous for his multilingual games, and with The Gift one cannot rule out the possibility of a pun based on the German meaning of the word Gift (in English, "poison").  ...  pseudo-genetic map for creating infinite meanings out of a single string (rhyme scheme)" (Bethea 139).  ...  In the opinion of Alexander Dolinin, the presence of a specific name for a person that Fyodor Godunov-Cherdyntsev does not and could not know indicates that Chernyshevski's stream of consciousness is  ... 
doi:10.1515/9781618117045-008 fatcat:x4sreckulzcbvbj6gc2chdadvu

Konrad von Megenberg: German terminologies and expressions as created on Latin models

Kathrin Chlench-Priber, Jens Braarvig, Markham J. Geller
the results of scientific meetings on current issues and supports, at the same time, further cooperation on these issues by offering an electronic platform with further resources and the possibility for  ...  A. Fishman (1997). The Multilingual Apple: Languages in New York City. Berlin: De Gruyter. Gianto, A. (1999) (1996) (1997) (1996) , vol. II (1997) . Berlin: De Gruyter.  ...  . -(2013 Acknowledgements This paper was prepared with the support of the Deutsche Forschungsgemeinschaft during a fellowship at the Lichtenberg-Kolleg of the Georg-August-Universität, Göttingen, for  ... 
doi:10.7892/boris.129432 fatcat:5w6ktrcg4rctnc4hkn3fnlxnee

Multilingual Ethiopia: Linguistic Challenges and Capacity Building Efforts

Binyam Sisay Mendisu, Janne Bondi Johannessen
2016 Oslo Studies in Language   unpublished
The name haankut-o ('mess maker') is given to a child born during a time of family problems. The name ʔunn-is-o ('one who caused horror') has a similar but more serious connotation.  ...  Hadiyya names for the most part reflect a patrilineal society. This is evidenced by the preponderance of father-based names and lack of mother-based names.  ...  Bi-morphemic personal names also exist, such as laap'p'-o ('comfort'), hobb-e ('lion'), yabur-o ('lippy'), etc. with base and inflection internal structure.  ... 

An Anglo-Norman Miscellany [chapter]

Jane Bliss
2017 An Anglo-Norman Reader  
Or it could simply have been named after its builder, perhaps a local landowner by the name of Bodu or similar.  ...  The note to v. 5156 explains that (French) 'verge' has a similar semantic range: a tangible stick, and a term of measurement. 33 See Introduction, above, for the rote; the vïele may be a viol, or a vielle  ...  Each text is introduced and elucidated with notes and full references, and the material is divided into three main sections, based on Dean's Catalogue: Story (a variety of narrative forms), Miscellany  ... 
doi:10.11647/obp.0110.03 fatcat:slaivop77nbopbq6dkmxc7djwy

Ivelina Nikolova and Natalia Konstantinova Organisers of the Student Workshop

Irina Temnikova, Natalia Konstantinova, Alexandra Balahur, Chris Biemann, Kevin Cohen, Darja Fišer, Najeh Hajlaoui, Laura Hasler, Sobha Lalitha, Devi, Wolfgang Maier, Preslav Nakov (+19 others)
We both experiment with rule-based methods and machine learning approaches.  ...  Our results indicate that existing solutions for detecting light verb constructions can be successfully applied to other domains as well and we conclude that even a little amount of annotated target data  ...  References Acknowledgements We would like to thank the reviewers for their valuable comments, which helped us a lot in improving the paper. We are also grateful to Prof. Maite Taboada and Prof.  ... 

Manga vision: cultural and communicative perspectives

Bounthavy Suvilay
2017 Journal of Graphic Novels and Comics  
Acknowledgements Thanks to Dr Lewis Mayo for feedback on this chapter.  ...  Appendix: Manga Panel Layout. Factors Influencing Non-native Readers' Sequencing of Japanese Manga Panels. Acknowledgements Data entry for this project was greatly assisted by Katherine Pickhaver, Rebecca  ...  The mortal Light, alias Kira (a Japanese transliteration of 'killer'), is able to exact his vision of divine justice by merely writing the name of his victim (his 'judgements') in the shinigami's death  ... 
doi:10.1080/21504857.2017.1403340 fatcat:cdwrh63zyndndng2gbjppnm24a

La divulgazione orientalista francese di fine Ottocento e lo sviluppo sociale dei popoli d'Oriente: gli Armeni di Ernest Chantre [chapter]

Massimiliano Vaghi
2020 Eurasiatica  
An accurate analysis of the works of Ernest Chantre (1843-1924), often published under the auspices of the French Ministry of Public Education, shows a particular interest for civilisations of the Middle  ...  East, in particular for the Armenian people and culture.  ...  Numerous brands and businesses have transliterated their names word for word to make them appear more modern and western, even without a standardized version of the alphabet (Yergaliyeva 2018).  ... 
doi:10.30687/978-88-6969-453-0/004 fatcat:w4zuob3vtnhx5gkgxhkfobqhmu


Seth Sanders, John Kelly, Gonzalo Rubio, Jacco Dieleman, Jerrold Cooper, Christopher Woods, Annick Payne, William Schniedewind, Michael Silverstein, Piotr Michalowski, Paul-Alain Beaulieu (+4 others)
" -in short, a visual mess available to the would-be reader-for-content as a coherent and unified denotational text only with some difficulty.  ...  A string of twenty-nine signs is to be written on a reed leaf in a dream-sending ritual.  ...  I often think of them as languages that travel much (the big ones) and languages that travel little (the small ones), though geographical dispersal, itself a relative measure, is only a necessary and not  ... 

SICOL: Proceedings of the Second International Conference on Oceanic Linguistics Vol.1, Language contact

Tent, Jan (Ed.), Mugler, France (Ed.), CRCL, Pacific Linguistics And/Or The Author(S)
We use 'communalect' here as "a variety spoken by people who claim they use the same speech" (Geraghty 1983: 18), and 'Fijian' as a cover term for both Standard Fijian and these communalects.  ...  We are also grateful to many other students and colleagues for their interest and encouragement. This research project was funded by a University of the South Pacific research grant # 0702-9203.  ...  I consider that a reliable measure of lectal level may be derived from comparing competing variants for the same feature: for example, in Belize, copular variants include de (clearly basilectal); zero-morpheme  ... 
doi:10.15144/pl-c141 fatcat:jrysy5onsngbddopdjkluf5m4i

AIUCD 2021 - Book of Extended Abstracts

Federico Boschetti, Angelo Mario Del Grosso, Enrica Salvatori
This volume collects the extended, peer-reviewed abstracts for the AIUCD2021 conference, held virtually in Pisa.  ...  ACKNOWLEDGEMENTS The authors want to thank Yoann Moranville (DARIAH), Paula Forbes (Abertay University), and Mélanie Bunuel (Huma-num -CNRS) for their contributions to this paper.  ...  management based on the TriMED multilingual medical termbase.  ... 
doi:10.6092/unibo/amsacta/6712 fatcat:672tcvwzsvhixnic2cnjkfw72e