Classifying Amharic webnews

Lars Asker, Atelach Alemu Argaw, Björn Gambäck, Samuel Eyassu Asfeha, Lemma Nigussie Habte
2009 Information retrieval (Boston)  
We present work aimed at compiling an Amharic corpus from the Web and automatically categorizing the texts.  ...  We discuss the issues of compiling and annotating a corpus of Amharic news articles from the Web. This corpus was then used in three sets of text classification experiments.  ...  The results of our first two experiments are compatible with those of GebreMeskel (2003) who used the standard vector space model and latent semantic indexing for text categorization.  ... 
doi:10.1007/s10791-008-9080-x fatcat:jexrlbut4jbsbaujngj4yq4zoe

An Empirical Study on Crosslingual Transfer in Probabilistic Topic Models [article]

Shudong Hao, Michael J. Paul
2019 arXiv   pre-print
Probabilistic topic modeling is a popular choice as the first step of crosslingual tasks to enable knowledge transfer and extract multilingual features.  ...  In this paper, we systematically study the knowledge transfer mechanisms behind different multilingual topic models, and through a broad set of experiments with four models on ten languages, we provide  ...  Two main branches for topic models are non-probabilistic approaches such as Latent Semantic Analysis (LSA, Deerwester et al. (1990)) and Non-Negative Matrix Factorization (NMF, Xu, Liu, and Gong (2003  ... 
arXiv:1810.05867v2 fatcat:3ex7jmxgqfdx3pnkiyey2zzdxm

Psychosocial Features for Hate Speech Detection in Code-switched Texts

Edward Ombui (School of Science and Technology, Africa Nazarene University, Nairobi, Kenya), Lawrence Muchemi, Peter Wagacha
2021 International Journal of Information Technology and Computer Science  
The study espouses a novel approach to handle this challenge by introducing a hierarchical approach that employs Latent Dirichlet Analysis to generate topic models that help build a high-level Psychosocial  ...  This study examines the problem of hate speech identification in codeswitched text from social media using a natural language processing approach.  ...  Therefore, the Latent Dirichlet Allocation (LDA) algorithm was used to generate twenty-three semantically meaningful topics or clusters from the large corpus of short-text messages from social media.  ... 
doi:10.5815/ijitcs.2021.06.03 fatcat:zfpl6tc3ardzxambqazyqo2z3i

Building and Annotating a Codeswitched Hate Speech Corpora

Edward Ombui (School of Science and Technology, Africa Nazarene University, Nairobi, Kenya), Lawrence Muchemi, Peter Wagacha
2021 International Journal of Information Technology and Computer Science  
This paper describes the methodology that was used to develop a multidimensional hate speech framework based on the duplex theory of hate [1] components that include distance, passion, commitment to hate  ...  Subsequently, an annotation scheme based on the framework was used to annotate a random sample of ~51k tweets from ~400k tweets that were collected during the August and October 2017 presidential campaign  ...  Topic modeling was a useful method to identify the latent semantic representations underlying the data from social media.  ... 
doi:10.5815/ijitcs.2021.03.03 fatcat:hbfwfe7nnrgqblzvnwd75caquy

An Empirical Study on Crosslingual Transfer in Probabilistic Topic Models

Shudong Hao, Michael J. Paul
2020 Computational Linguistics  
Probabilistic topic modeling is a common first step in crosslingual tasks to enable knowledge transfer and extract multilingual features.  ...  In this article, the knowledge transfer mechanisms behind different multilingual topic models are systematically studied, and through a broad set of experiments with four models on ten languages, we provide  ...  Two main branches for topic models are non-probabilistic approaches such as Latent Semantic Analysis (LSA; Deerwester et al. 1990) and Non-Negative Matrix Factorization (Xu, Liu, and Gong 2003), and  ... 
doi:10.1162/coli_a_00369 fatcat:zlhocz45jfc6xkplevlvmfnhzi

Systematic Inequalities in Language Technology Performance across the World's Languages [article]

Damián Blasi, Antonios Anastasopoulos, Graham Neubig
2021 arXiv   pre-print
Our analyses involve the field at large, but also more in-depth studies on both user-facing technologies (machine translation, language understanding, question answering, text-to-speech synthesis) as well  ...  the process, we (1) quantify disparities in the current state of NLP research, (2) explore some of its associated societal and academic factors, and (3) produce tailored recommendations for evidence-based  ...  analysis of syntactic or semantic relationships between words).  ... 
arXiv:2110.06733v1 fatcat:3euhbh7u7rcitc4d4jpra5yvny

Towards a Theory of Communicative Efficiency in Human Languages

Natalia Levshina
2018 Zenodo  
This study develops a probabilistic theory of efficiency in natural language.  ...  Part IV contains corpus-based synchronic and diachronic studies of several English alternations: help + (to) Infinitive, stative verb + (at) home and go (and) Verb, and want to/wanna + Infinitive.  ...  Probabilistic expectations based on the previous experience with language.  ... 
doi:10.5281/zenodo.1542857 fatcat:wo36mdb3tjhwpkwgih2vhbldwa

What is computational phonology?

Robert Daland
2014 Loquens  
An HMM is a close relative of a probabilistic FSM, with two key differences.  ...  Work on this topic is generally concerned with 'learnability', which is typically formulated at an abstract, algebraic level.  ... 
doi:10.3989/loquens.2014.004 fatcat:e3wfgveufjfehmlsdfjop3axnm

Document analysis by means of data mining techniques

Saima Jabeen, Prof. Elena Baralis
2014
propose to adopt Latent Semantic Analysis in document summarization.  ...  A method of text summarization based on latent semantic indexing (LSI) is also proposed in [8].  ... 
doi:10.6092/polito/porto/2537297 fatcat:j2ly6kzr25djpnw574vn3spira

Joint Meeting of the FESN (Federation of the European Societies of Neuropsychology)/GNP (Gesellschaft für Neuropsychologie) September 12-14, 2013, Berlin, Germany

2013 Behavioural Neurology  
., "Social Talent Show Task" corresponding to a probabilistic decision-making procedure). Regional brain volumes were assessed with voxel-based MRT morphometry.  ...  In a third study, a French-English-Amharic trilingual individual with aphasia, was trained first in French and then in English.  ...  This is a case study of a right handed 65-year-old woman with an anaplastic astrocytoma grade III in the superior frontal cortex with intra-axial extensions to the corpus callosum.  ... 
doi:10.3233/ben-139900 pmid:28149002 pmcid:PMC5215690 fatcat:f4olin3ogfg7zlznht4ryjbtym

Joint Meeting of the FESN (Federation of the European Societies of Neuropsychology)/GNP (Gesellschaft für Neuropsychologie) September 12–14, 2013, Berlin, Germany

2013 Behavioural Neurology  
., "Social Talent Show Task" corresponding to a probabilistic decision-making procedure). Regional brain volumes were assessed with voxel-based MRT morphometry.  ...  In a third study, a French-English-Amharic trilingual individual with aphasia, was trained first in French and then in English.  ...  This is a case study of a right handed 65-year-old woman with an anaplastic astrocytoma grade III in the superior frontal cortex with intra-axial extensions to the corpus callosum.  ... 
doi:10.1155/2013/976782 fatcat:xln76cklwzhvne2ry4fwxm4jzm

Scanning the Science-Society Horizon [article]

Brenda Moon, The Australian National University
2016
By conducting a series of studies based on simpler questions, I gradually build up a view of who is contributing on Twitter, how often, and what topics are being discussed that include the keyword 'science  ...  Consideration of word frequency and bigrams in the text of the tweets found that while wo [...]  ...  A sensor detects a target event and makes a report probabilistically" (p. 853). Semantic analysis was used to classify tweets as positive or negative observations.  ... 
doi:10.25911/5d6664e8354b8 fatcat:jmgtblj2n5e6xosu7ue3sokami

Complete Issue

2015
Kramer (2012), analyzing data from Amharic, a Semitic language, develops an analysis of gender based on two central elements: (i) the division between natural gender and grammatical gender and (ii) the  ...  ., hay for strawberry) was selected using Latent Semantic Analysis (LSA) scores (Landauer 2002; Landauer and Dumais 1997) and the unrelated prime (e.g., pine for strawberry) was selected from the SUBTLEXus  ...  Coindexation works with semantic compatibility between the affix and the base. Semantic features of the affixes are observable in a non-direct way, in the derivative.  ... 
doi:10.26220/mmm.2276 fatcat:qnr63zjqn5cmteoxt7hzjry4eq

The Postcolonial Turn: Re-imagining anthropology and Africa

D. Turkon
2012 African Affairs  
Acknowledgments The field research on which this article is based benefited from the financial support of the European Commission (DG XII STD3, projects TS2 M 0202 B and STD4, TS3* CT94 0326), the Belgian  ...  If Keita (2004) briefly acknowledges philosophical texts in Amharic and Arabic, respectively in ancient Axum and Timbuktu, in his reply to my academic lecture (2008) he does not, however, refer to other  ...  Desire may self-destructively compose with death drive, which is ever-susceptible of deconstructing the Ethical base of society.  ... 
doi:10.1093/afraf/ads050 fatcat:6r4bpe2egbfbjcvnpzkoxipgry

The Full Has Never Been Told: Theology and the Encounter with Globalization

Christopher Duncanson-Hales, Université d'Ottawa / University of Ottawa
2011
My research begins with an understanding of globalization not as a clash of civilizations, but as a clash of symbols and metaphors.  ...  My thesis, "The Full has Never been Told: Theology and the Encounter with Globalization," is an investigation of the encounter between religion and globalization.  ...  How shared are these latent cultural objects?  ... 
doi:10.20381/ruor-725 fatcat:eupawln575eb5ainsgh7xqtycm