Design and Development of Unsupervised Stemmer for Sindhi Language
2020
Procedia Computer Science
This paper presents a stemmer designed and developed for the Sindhi language using an unsupervised approach. Suffixes are extracted using "Linguistica 5" [22], a tool for unsupervised learning of morphology. ...
Majgaonker [27] designed a rule-based stemmer and an unsupervised stemmer for the Marathi language and compared their performance on a manually stemmed 1500-word test dataset. Gupta et al. ...
doi:10.1016/j.procs.2020.03.212
fatcat:bs2mggcwh5bz3oeha25lehmu7u
SALMA: Standard Arabic Language Morphological Analysis
2013
2013 1st International Conference on Communications, Signal Processing, and their Applications (ICCSPA)
The morphological analyzer should add the appropriate linguistic information to each part or morpheme of the word (proclitic, prefix, stem, suffix and enclitic); in effect, instead of a tag for a word, ...
The SALMA-Tools is a collection of open-source standards, tools and resources that widen the scope of Arabic word structure analysis, particularly morphological analysis, to process Arabic text corpora of ...
(iii) It has been reported as a standard for evaluating morphological analyzers for Arabic text and for building a gold standard for evaluating morphological analyzers and part-of-speech taggers for Arabic ...
doi:10.1109/iccspa.2013.6487311
fatcat:zyszkduja5gjlkgsnxknpwq7re
Quality Estimation Of Machine Translation Outputs Through Stemming
[article]
2014
arXiv
pre-print
Every day we can see some machine translators being developed, but getting a high quality automatic translation is still a very distant dream. ...
In this paper, we focus on the English-Hindi language pair, so in order to preserve the correct MT output we present a ranking system, which employs some machine learning techniques and morphological ...
[13] proposed A Lightweight Stemmer for Gujarati; they showed an implementation of a rule-based stemmer for Gujarati and created rules for stemming and the richness in morphology. ...
arXiv:1407.2694v1
fatcat:ukog2eg3w5a4pkt22kd3gyipoe
Quality Estimation of Machine Translation Outputs Through Stemming
2014
International Journal on Computational Science & Applications
Every day we can see some machine translators being developed, but getting a high quality automatic translation is still a very distant dream. ...
In this paper, we focus on the English-Hindi language pair, so in order to preserve the correct MT output we present a ranking system, which employs some machine learning techniques and morphological ...
[13] proposed A Lightweight Stemmer for Gujarati; they showed an implementation of a rule-based stemmer for Gujarati and created rules for stemming and the richness in morphology. ...
doi:10.5121/ijcsa.2014.4302
fatcat:2okcf2wv6vdujft3fjhaesbdqa
Parallel hardware for faster morphological analysis
2018
Journal of King Saud University: Computer and Information Sciences
The investigation includes a thorough evaluation of the methodology, and performance and accuracy analyses of the developed software and hardware implementations. ...
The developed stemmer for verb root extraction with infix processing attained accuracies of 87% and 90.7% for analyzing the texts of the Holy Quran and its Chapter 29 - Surat Al-Ankabut. ...
A thorough analysis and evaluation is presented in Section 6 including validation and testing, performance analysis, accuracy analysis, and a general evaluation. ...
doi:10.1016/j.jksuci.2017.07.003
fatcat:rti3inukvfgzbmo76i3w57z5kq
Influence of GUJarati STEmmeR in Supervised Learning of Web Page Categorization
2021
International Journal of Intelligent Systems and Applications
This research work focuses on the analysis of Web Page Categorization (WPC) for the Gujarati language and concentrates on one research problem: verifying the influence of a stemming algorithm in ...
the corpus as a word by word for the given query. ...
To evaluate this method, a framework for the Gujarati WPC is developed that implements the general method and provides support for several algorithms that have been considered for study. ...
doi:10.5815/ijisa.2021.03.03
fatcat:hylx7xnbufathfn7gqdzxbzlfy
A Frequent Term and Semantic Similarity based Single Document Text Summarization Algorithm
2011
International Journal of Computer Applications
In this paper, a frequent-term-based text summarization algorithm is designed and implemented in Java. The designed algorithm works in three steps. ...
The designed algorithm is implemented using open-source technologies such as Java, DISCO, and Porter's stemmer, and is verified over the standard text mining corpus. ...
Java is a general-purpose, concurrent, class-based, object-oriented language that is specifically designed to have as few implementation dependencies as possible. ...
doi:10.5120/2190-2778
fatcat:6hpb3cpnqjh7fcdjnxubzeybka
UniNE at CLEF 2008: TEL, and Persian IR
[chapter]
2009
Lecture Notes in Computer Science
As a second objective, we wanted to design and evaluate a stopword list and a light stemming strategy for Persian (Farsi), a member of the Indo-European family of languages whose morphology is more ...
records) and also to evaluate the retrieval effectiveness of several IR models. ...
Introduction: During the last few years, the IR group at University of Neuchatel has focused on designing, implementing and evaluating IR systems for various natural languages, including European [1] ...
doi:10.1007/978-3-642-04447-2_22
fatcat:kzkwdal6pfekpfml3to6du4hcq
Towards an error-free Arabic stemming
2008
Proceeding of the 2nd ACM workshop on Improving non english web searching - iNEWS '08
The ETS stemmer is evaluated by comparison with output from human generated stemming and the stemming weight technique. ...
The novelty of the work arises from the use of neglected Arabic stop-words. These stop-words can be highly important and can provide a significant improvement to processing Arabic documents. ...
EVALUATION AND EXPERIMENTS: Different criteria are used to evaluate the performance of a stemmer. A good stemmer (by definition) is a stemmer that stems all the words to their correct roots. ...
doi:10.1145/1460027.1460030
dblp:conf/cikm/Al-ShammariL08
fatcat:wol556egtzdhtc2zjpn3fkioda
AutoClass: Automatic Text to OOP Concept Identification Model
2016
International Journal of Computer Applications
This paper presents a CASE tool called AutoClass which extracts class diagrams and generates C# source code from the requirement documents. ...
Natural Language Processing (NLP) techniques and rule-based model are used to implement automatic concept identification model in the study. ...
In next section, a survey of the related works which implement automatic concept identification is presented. ...
doi:10.5120/ijca2016911647
fatcat:5qfrzklysjg7xpt5te2ugxgb4u
A Survey of Common Stemming Techniques and Existing Stemmers for Indian Languages
2013
Journal of Emerging Technologies in Web Intelligence
The design of stemmers is language specific, and requires some to significant linguistic expertise in the language, as well as the understanding of the needs for a spelling checker for that language. ...
In this paper, a survey of common stemming techniques and existing stemmers for Indian languages is presented. ...
doi:10.4304/jetwi.5.2.157-161
fatcat:5f4y4de4qnasbjxp2xtqbdnqmu
Searching strategies for the Hungarian language
2008
Information Processing & Management
It describes evaluations carried out on two general stemming strategies for this language, and also demonstrates that a light stemming approach could be quite effective. ...
Finally, we demonstrate that applying an automatic decompounding procedure for both queries and documents significantly improves IR performance (+10%), compared to word-based indexing strategies. ...
While stemming schemes are normally designed to work with general texts, some may also be especially designed for a specific domain (e.g., in medicine) or a given document collection, such as that developed ...
doi:10.1016/j.ipm.2007.01.022
fatcat:2ffg3z4tpjglxhzapjzfl74qui
Improving a Lightweight Stemmer for Gujarati Language
2016
International Journal of Information Sciences and Techniques
Establishing an effective stemmer for the Gujarati language has always been a hot research area, since Gujarati has a very different and more difficult structure than other languages due to its rich morphology
It is usually used in several types of applications such as Natural Language Processing (NLP), Information Retrieval (IR) and Text Mining (TM) including Text Categorization (TC), Text Summarization (TS ...
We also evaluate the new algorithm in an IR system, with improved precision and recall. The implementation of this algorithm is also tested on different regional languages for further processing. ...
doi:10.5121/ijist.2016.6214
fatcat:zfdsa4nw2ndilbxhqerx4sxaiy
Automated Arabic text classification with P-Stemmer, machine learning, and a tailored news article taxonomy
2015
Journal of the Association for Information Science and Technology
We designed a simple taxonomy for Arabic news stories that is suitable for the needs in Qatar and other nations, is compatible with the subject codes of the International Press Telecommunications Council ...
We developed tailored stemming (i.e., a new Arabic light stemmer) and automatic classification methods (the best being binary SVM classifiers) to work with the taxonomy. ...
Acknowledgments We acknowledge QNRF for their support. This research was made possible by NPRP grant # 4-029-1-007 from the Qatar National Research Fund (a member of Qatar Foundation). ...
doi:10.1002/asi.23609
fatcat:lzsmz2t3p5bcpnig4gdgpatkia
An evaluation of conflation accuracy using finite‐state transducers
2006
Journal of Documentation
Design/methodology/approach: Incorrectly lemmatized and stemmed forms may lead to the retrieval of inappropriate documents. ...
Conflation performance was evaluated in terms of an adaptation of recall and precision measures, based on accuracy and coverage, not actual retrieval. ...
At the same time, stemmers are typically easy to implement, and run fast, yet they do not give a high percentage of accuracy, making them inappropriate for some applications. ...
doi:10.1108/00220410610666493
fatcat:rcf2r7vxqbbvlcuyvuscy2wopq