Building Machine Translation Systems for the Next Thousand Languages
[article]
2022
arXiv
pre-print
models, highlighting several frequent error modes of these types of models. ...
filtering techniques; (ii) Developing practical MT models for under-served languages by leveraging massively multilingual models trained with supervised parallel data for over 100 high-resource languages ...
A good example of what it means for words to have different meanings but to be distributionally similar given the usage of the language is the translation of the string "English Language". ...
arXiv:2205.03983v2
fatcat:smfytyrwcjdjhcqp6f2wrr4riu
Word Division in the Transcription of Chinese Script in the Title Fields of Bibliographic Records
2001
Cataloging & Classification Quarterly
Thus, transliteration and transcription are two similar but distinctive aspects of script conversion. ...
In word-based approaches the idea is to transform the linear undivided string of characters into a word-segmented text. ...
5 Word count based on number of characters, with multi-character personal and place names counting as one. ...
doi:10.1300/j104v32n03_08
fatcat:hqugqtngxbabje243cvrt6u6ce
The Unicode Cookbook for Linguists: Managing writing systems using orthography profiles
2017
C3. the symbol-string to be inserted for unmatched strings in the tokenized and transliterated output. ...
C10. the tokenized strings, with additionally any transliterated strings, if transliteration is requested. ...
as separator tokenize( example , profile = "~/Desktop/profile_skeleton.txt" , regex = TRUE , transliterate = "IPA" , sep = "" )$strings ## originals tokenized transliterated ## 1 cane cane kane ## 2 cena ...
doi:10.5167/uzh-135400
fatcat:lqa22wbbirdfhcrdkt6zbdppw4
Joint Discourse-aware Concept Disambiguation and Clustering
2016
Concept disambiguation is the task of linking common nouns and proper names in a text (henceforth called mentions) to their corresponding concepts in a predefined inventory. ...
In this thesis, we investigate concept disambiguation and clustering from a discourse perspective and propose a discourse-aware approach for joint concept disambiguation and clustering in the framework ...
This relatedness measure has been successfully applied for disambiguation (Milne & Witten, 2008b; Kulkarni et al., 2009; and leads to higher results than for instance measures based on the cosine similarity ...
doi:10.11588/heidok.00020737
fatcat:vhljgiqbrbcwtpjxce6k4vcp2a
Lexikos 18
2012
Lexikos
A. Wilkes. Thanks too, to the Publisher who was willing to embark on this innovative project, as well as to Ghent University for its continued support of my field trips to South Africa. ...
It is based on written and oral data, namely liturgical texts, prayers, hymns, sermons, and the comments of the moderators of the masses. ...
doi:10.5788/18-0-503
fatcat:3jxecvzxdzfdzjq2ytxosp2xqm
Chapter Three. STRUCTURE
[chapter]
2019
Keys to The Gift
The Interlinguistic Pun Nabokov is famous for his multilingual games, and with The Gift one cannot rule out the possibility of a pun based on the German meaning of the word Gift (in English, "poison"). ...
pseudo-genetic map for creating infinite meanings out of a single string (rhyme scheme)" (Bethea 139). ...
In the opinion of Alexander Dolinin, the presence of a specific name for a person that Fyodor Godunov-Cherdyntsev does not and could not know indicates that Chernyshevski's stream of consciousness is ...
doi:10.1515/9781618117045-008
fatcat:x4sreckulzcbvbj6gc2chdadvu
Konrad von Megenberg: German terminologies and expressions as created on Latin models
2018
the results of scientific meetings on current issues and supports, at the same time, further cooperation on these issues by offering an electronic platform with further resources and the possibility for ...
A. Fishman (1997). The Multilingual Apple: Languages in New York City. Berlin: De Gruyter. Gianto, A. (1999) (1996) (1997) (1996) , vol. II (1997) . Berlin: De Gruyter. ...
Acknowledgements This paper was prepared with the support of the Deutsche Forschungsgemeinschaft during a fellowship at the Lichtenberg-Kolleg of the Georg-August-Universität, Göttingen, for ...
doi:10.7892/boris.129432
fatcat:5w6ktrcg4rctnc4hkn3fnlxnee
Multilingual Ethiopia: Linguistic Challenges and Capacity Building Efforts
2016
Oslo Studies in Language
unpublished
The name haankut-o ('mess maker') is given to a child born during a time of family problems. The name ʔunn-is-o ('one who caused horror') has a similar but more serious connotation. ...
Hadiyya names for the most part reflect a patrilineal society. This is evidenced by the preponderance of father-based names and lack of mother-based names. ...
Bi-morphemic personal names also exist, such as laap'p'-o ('comfort'), hobb-e ('lion'), yabur-o ('lippy'), etc., with base and inflection internal structure. ...
fatcat:2rhi45squjbsxfzhbzvaksld3a
An Anglo-Norman Miscellany
[chapter]
2017
An Anglo-Norman Reader
Or it could simply have been named after its builder, perhaps a local landowner by the name of Bodu or similar. ...
The note to v. 5156 explains that (French) 'verge' has a similar semantic range: a tangible stick, and a term of measurement. 33 See Introduction, above, for the rote; the vïele may be a viol, or a vielle ...
Each text is introduced and elucidated with notes and full references, and the material is divided into three main sections, based on Dean's Catalogue: Story (a variety of narrative forms), Miscellany ...
doi:10.11647/obp.0110.03
fatcat:slaivop77nbopbq6dkmxc7djwy
Ivelina Nikolova and Natalia Konstantinova Organisers of the Student Workshop
unpublished
We both experiment with rule-based methods and machine learning approaches. ...
Our results indicate that existing solutions for detecting light verb constructions can be successfully applied to other domains as well and we conclude that even a little amount of annotated target data ...
References
Acknowledgements We would like to thank the reviewers for their valuable comments, which helped us a lot in improving the paper. We are also grateful to Prof. Maite Taboada and Prof. ...
fatcat:uxotu5b5mbh6jmf5wo4een7pda
Manga vision: cultural and communicative perspectives
2017
Journal of Graphic Novels and Comics
Acknowledgements Thanks to Dr Lewis Mayo for feedback on this chapter. ...
Appendix: Manga Panel Layout. Factors Influencing Non-Native Readers' Sequencing of Japanese Manga Panels. Acknowledgements: Data entry for this project was greatly assisted by Katherine Pickhaver, Rebecca ...
The mortal Light, alias Kira (a Japanese transliteration of 'killer'), is able to exact his vision of divine justice by merely writing the name of his victim (his 'judgements') in the shinigami's death ...
doi:10.1080/21504857.2017.1403340
fatcat:cdwrh63zyndndng2gbjppnm24a
La divulgazione orientalista francese di fine Ottocento e lo sviluppo sociale dei popoli d'Oriente: gli Armeni di Ernest Chantre
[chapter]
2020
Eurasiatica
An accurate analysis of the works of Ernest Chantre (1843-1924), often published under the auspices of the French Ministry of Public Education, shows a particular interest for civilisations of the Middle ...
East, in particular for the Armenian people and culture. ...
Numerous brands and businesses have transliterated their names word for word to make them appear more modern and Western, even without a standardized version of the alphabet (Yergaliyeva 2018). ...
doi:10.30687/978-88-6969-453-0/004
fatcat:w4zuob3vtnhx5gkgxhkfobqhmu
MARGINS OF WRITING, ORIGINS OF CULTURES edited by
unpublished
" -in short, a visual mess available to the would-be reader-for-content as a coherent and unified denotational text only with some difficulty. ...
A string of twenty-nine signs is to be written on a reed leaf in a dream-sending ritual. ...
I often think of them as languages that travel much (the big ones) and languages that travel little (the small ones), though geographical dispersal, itself a relative measure, is only a necessary and not ...
fatcat:r3w62rtodvctxdcvzksdw3b5xm
SICOL: Proceedings of the Second International Conference on Oceanic Linguistics Vol.1, Language contact
1998
We use 'communalect' here as "a variety spoken by people who claim they use the same speech" (Geraghty 1983: 18), and 'Fijian' as a cover term for both Standard Fijian and these communalects. ...
We are also grateful to many other students and colleagues for their interest and encouragement. This research project was funded by a University of the South Pacific research grant # 0702-9203. ...
I consider that a reliable measure of lectal level may be derived from comparing competing variants for the same feature: for example, in Belize, copular variants include de (clearly basilectal); zero-morpheme ...
doi:10.15144/pl-c141
fatcat:jrysy5onsngbddopdjkluf5m4i
AIUCD 2021 - Book of Extended Abstracts
2021
This volume collects the extended, peer-reviewed abstracts for the AIUCD2021 conference, held virtually in Pisa. ...
ACKNOWLEDGEMENTS The authors want to thank Yoann Moranville (DARIAH), Paula Forbes (Abertay University), and Mélanie Bunuel (Huma-num - CNRS) for their contributions to this paper. ...
management based on the TriMED multilingual medical termbase. ...
doi:10.6092/unibo/amsacta/6712
fatcat:672tcvwzsvhixnic2cnjkfw72e