A Rule-based Kurdish Text Transliteration System [article]

Sina Ahmadi
2018 arXiv   pre-print
In this article, we present a rule-based approach for transliterating two mostly used orthographies in Sorani Kurdish.  ...  Our transliteration system, named Wergor, achieves 82.79% overall precision and more than 99% in detecting the double-usage characters. We also present a manually transliterated corpus for Kurdish.  ...  Figures A.1 and A.2 in Appendix A show two transliteration texts using Wergor. Conclusions and Future Work: In this paper, we propose a rule-based technique for Kurdish text transliteration.  ... 
arXiv:1811.10278v1 fatcat:hpth3yajivf6vivxah7cl34kfq

Kurdish Interdialect Machine Translation

Hossein Hassani
2017 Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial)  
This research suggests a method for machine translation among two Kurdish dialects. We chose the two widely spoken dialects, Kurmanji and Sorani, which are considered to be mutually unintelligible.  ...  The research is the first attempt for inter-dialect machine translation in Kurdish and particularly could help in making online texts in one dialect comprehensible to those who only speak the target dialect  ...  For instance, to develop a system based on a shallow-transfer and rule-based approach using Apertium platform (Peradin et al., 2014) and to compare this method with the previous one in terms of the quality  ... 
doi:10.18653/v1/w17-1208 dblp:conf/vardial/Hassani17 fatcat:swg3su7e5jahfk4b34r7nlsooy

Towards Kurdish Information Retrieval

Kyumars Sheykh Esmaili, Shahin Salavati, Anwitaman Datta
2014 ACM Transactions on Asian Language Information Processing  
This paper reports on the outcomes of a project aimed at providing essential resources for processing Kurdish texts.  ...  A principal output of this project is Pewan, the first standard Test Collection to evaluate Kurdish Information Retrieval systems.  ...  We designed a simple rule-based light-weight stemmer for Kurdish language which uses Pewan's prefix/suffix lists.  ... 
doi:10.1145/2556948 fatcat:ih6s3ordc5cjhinjl74357zzfy

A Method for Proper Noun Extraction in Kurdish

Hossein Hassani, Marc Herbstritt
2017 Symposium on Languages, Applications and Technologies  
We developed an application based on an architecture which includes a number of name lists, a set of rules, and a set of processes that recognizes Kurdish person names.  ...  This paper suggests a method for proper noun identification in Kurdish texts.  ...  Figure 1 Kurdish PNR Architecture -The architecture is based on three dictionaries and a set of rules which is used by the internal part of the system to find out the Proper Names.  ... 
doi:10.4230/oasics.slate.2017.19 dblp:conf/slate/Hassani17 fatcat:dthgc5vy4ve3racqilaavdlc34

Linguistic Errors in Shop Signs in Erbil City

Huda Yaseen Abdulwahid
2017 Cihan University-Erbil Scientific Journal  
This study supposes the reasons behind these sorts of errors include translator's language incompetence, translator's carelessness, and the socio-cultural differences between English, Arabic, and Kurdish  ...  Missing this diacritic in the transliteration process brings about the emergence of initial consonant clusters which are not permissible in the Arabic syllable system.  ...  This reflects the point that there is a grammatical rule in the Kurdish language which states that we must use (ی) even if we don't have it in the original copy because in the English language this (  ... 
doi:10.24086/cuesj.v1n2a14 fatcat:f6c4fb7lnzcyzi44wgmksspo4y

Hunspell for Sorani Kurdish Spell Checking and Morphological Analysis [article]

Sina Ahmadi
2021 arXiv   pre-print
In this paper, we present our efforts in annotating a lexicon with morphosyntactic tags and also, extracting morphological rules of Sorani Kurdish to build a morphological analyzer, a stemmer and a spell-checking  ...  system using Hunspell.  ...  Therefore, we use the rule-based transliteration system provided by (Ahmadi, 2019) to transliterate it into the Arabic-based script which is used in our implementation.  ... 
arXiv:2109.06374v1 fatcat:ltr3z525lrhpxk3pl2gscc5xri

Translation Constraints and Procedures to Overcome them in Rendering Journalistic Texts

Sabir Rasul
2016 Journal of University of Human Development  
It is more so in translating between English and Kurdish, which are marked by different linguistic systems and socio-cultural incongruities.  ...  This study aims to identify the patterns of translation constraints encountered when translating journalistic texts from English into Kurdish, as well as identify the patterns of translation procedures  ...  Introduction Taking a qualitative approach, this study surveys the translation of 45 English journalistic texts along with their Kurdish translations.  ... 
doi:10.21928/juhd.20160203.16 fatcat:dwk6auew4rbrlku642eytnheca

Building a Lemmatizer and a Spell-checker for Sorani Kurdish [article]

Shahin Salavati, Sina Ahmadi
2018 arXiv   pre-print
We propose a hybrid approach based on the morphological rules and an n-gram language model.  ...  The present paper aims at presenting a lemmatization and a word-level error correction system for Sorani Kurdish.  ...  As a rule-based method, we have provided a list of the past and present roots of Sorani Kurdish.  ... 
arXiv:1809.10763v1 fatcat:whnk5ap5zjbszmnk6ifcn3ffji

Leveraging Multilingual News Websites for Building a Kurdish Parallel Corpus [article]

Sina Ahmadi, Hossein Hassani, Daban Q. Jaff
2020 arXiv   pre-print
them across dialects and languages based on lexical similarity and transliteration of scripts.  ...  We present a corpus containing 12,327 translation pairs in the two major dialects of Kurdish, Sorani and Kurmanji.  ...  In the case of Sorani, as it is written in the Arabic-based alphabet, we first transliterate the Sorani text, using Wergor (Ahmadi, 2019), into the Latin-based script which is used for Kurmanji and  ... 
arXiv:2010.01554v1 fatcat:uhjom4vyg5hqfermpwvgq2isbu

MES volume 46 issue 4 Cover and Back matter

2014 International Journal of Middle East Studies  
Articles must be based on original research and the careful analysis of primary source materials.  ...  Submit article manuscripts as MS Word documents through our ScholarOne online submissions system: http://mc.  ...  Transliteration follows a modified Encyclopedia of Islam system, which is detailed on this page. The editor may return manuscripts that do not conform to the guidelines. Text.  ... 
doi:10.1017/s0020743814001342 fatcat:66osm6b4fnhs5aiyomqbmjh3gy

Towards Finite-State Morphology of Kurdish [article]

Sina Ahmadi, Hossein Hassani
2020 arXiv   pre-print
It plays a crucial role in various tasks in Natural Language Processing (NLP) and Computational Linguistics (CL) such as machine translation and text and speech generation.  ...  Kurdish is a less-resourced multi-dialect Indo-European language with highly inflectional morphology.  ...  We believe that the current study will pave the way for evaluating machine transliteration systems in future as well.  ... 
arXiv:2005.10652v1 fatcat:g52h4mnsfbamvliwfrb2fsnuwa

Towards Machine Translation for the Kurdish Language [article]

Sina Ahmadi, Mariam Masoud
2020 arXiv   pre-print
Therefore, in this paper, we are addressing the main issues in creating a machine translation system for the Kurdish language, with a focus on the Sorani dialect.  ...  We describe the available scarce parallel data suitable for training a neural machine translation model for Sorani Kurdish-English translation.  ...  One of the outstanding projects in creating a rule-based machine translation system for Kurmanji and Sorani is the Apertium project (Forcada et al., 2011).  ... 
arXiv:2010.06041v1 fatcat:qmyl4c4mdzblvbwt64mjso6ony

Central Kurdish machine translation: First large scale parallel corpus and experiments [article]

Zhila Amini, Mohammad Mohammadamini, Hawre Hosseini, Mehran Mansouri, Daban Jaff
2021 arXiv   pre-print
While the computational processing of Kurdish has experienced a relative increase, the machine translation of this language seems to be lacking a considerable body of scientific work.  ...  Our best performing systems achieve 22.72 and 16.81 in BLEU score for Ku→En and En→Ku, respectively.  ...  In [8], Kaka-Khan provides such tools and resources as a morphological dictionary and bilingual dictionaries to develop a rule-based Kurdish-English MT system.  ... 
arXiv:2106.09325v1 fatcat:urqc5un32fafxgjmzelipqmp7a

Stemming for Kurdish Information Retrieval [chapter]

Shahin Salavati, Kyumars Sheykh Esmaili, Fardin Akhlaghian
2013 Lecture Notes in Computer Science  
Sorani and Kurmanji) and investigate their effectiveness on Kurdish Information Retrieval. More specifically, we build Jedar, the first rule-based stemmer for both Sorani and Kurmanji.  ...  Furthermore, they indicate that the gains from the rule-based and the statistical approaches are comparable.  ...  Conclusions and Future Work In this paper we presented Jedar, the first rule-based stemmer for Sorani Kurdish and Kurmanji Kurdish.  ... 
doi:10.1007/978-3-642-45068-6_24 fatcat:r6ddy6y42naotamwlsnlobhps4

Automatic Meter Classification of Kurdish Poems [article]

Aso Mahmudi, Hadi Veisi
2021 arXiv   pre-print
This paper presents a rule-based method for automatic classification of the poem meter for the Central Kurdish language.  ...  Most of the classic texts in Kurdish literature are poems. Knowing the meter of the poems is helpful for correct reading, a better understanding of the meaning, and avoidance of ambiguity.  ...  Mojiri (2008) looks up the words that cannot be syllabified by the rules from a transliteration dictionary.  ... 
arXiv:2102.12109v1 fatcat:trn3nwuikvfuth4hk3ud6e3znq