Assessment on Stylometry for Multilingual Manuscript

Sushil Kumar
2012 IOSR Journal of Engineering  
applicable in cyber crime, detective agencies, etc. have called upon the expertise of linguists in cases of disputed authorship.  ...  This paper shows the text author verification problem using character n-gram information (final n-gram and initial n-gram) for both English and Arabic text.  ...  Authors' profiles are built using Arabic training texts. For both bi- and tri-gram profiles, we used profile size L = 200, 500, and 700. The main focus in Arabic novels is the selection of profile size (L).  ... 
doi:10.9790/3021-02910106 fatcat:sqv4xeinmze2rakogq4bjbgjwe

Detecting and Classifying Crimes from Arabic Twitter Posts using Text Mining Techniques

Hissah AL-Saif, Hmood Al-Dossari
2018 International Journal of Advanced Computer Science and Applications  
The proposed system aims to detect and classify crimes in Twitter posts that are written in the Arabic language, one of the most widespread languages today.  ...  In this paper, classification techniques are used to detect crimes and identify their nature by different classification algorithms.  ...  What is the best method for feature extraction from the selected techniques for the Arabic language in general and the crime domain in particular?  ... 
doi:10.14569/ijacsa.2018.091046 fatcat:h5sygi4rfjgd5j54llrm74q2ya

Code-switching on Social Media Amongst Algerians in the UK

2018 International Journal of Technology and Engineering Studies  
Gender was demonstrated to be an aspect of identity to investigate the potential link between language use and identity.  ...  A small sample was interviewed for in-depth exploration of the functions of the linguistic behaviors that have been detected.  ...  The two researchers considered both language use and language choice as primary tools in marking the cultural identity of individuals on the text-based CMC (Computer-Mediated Communication) that surpasses  ... 
doi:10.20469/ijtes.4.10005-5 fatcat:v6umkmj2mregtkfygfxvghzkx4

State of the Art in Authorship Attribution With Impact Analysis of Stylometric Features on Style Breach Prediction

Rajesh Shardanand Prasad, Midhun Chakkaravarthy
2022 Journal of Cases on Information Technology  
The outcomes of this study can be deployed for dialectology analysis and corpus linguistics, stylistics, natural language processing, classification, and literary and historical analysis, forensic analysis  ...  The reference material contributes robust classifiers with a reasonable array of feature extraction techniques, such as Dirichlet–multinomial change point regression to extract the progress of inscription  ...  Even so, numerous researchers experience unreciprocated difficulties for some languages, specifically Arabic.  ... 
doi:10.4018/jcit.296716 fatcat:5i6sb6od5bafvdrv4ly5vpz46u

Authorship attribution of Morsi Gameel Aziz's lyrics: A clustering-based stylometry approach

Abdulfattah Omar
2021 Journal of Language and Linguistic Studies  
The results of the study show that machine learning systems and stylometric authorship techniques can be used in resolving many authorship questions that remain controversial and unanswered in Arabic literature  ...  With the advent of machine learning systems and data mining techniques, it is now possible to process thousands of texts using replicable methods.  ...  the fulfillment of the current research project.  ... 
doi:10.52462/jlls.36 fatcat:mfq7efl6lffhzjptqpm5d3bsgq

Arabia Felix 2.0: a cross-linguistic Twitter analysis of happiness patterns in the United Arab Emirates

Aamna Al Shehhi, Justin Thomas, Roy Welsch, Ian Grey, Zeyar Aung
2019 Journal of Big Data  
Acknowledgements: We would like to thank Pegasus FZ LLC for assistance with the extraction of Twitter data. Competing interests: The authors declare that they have no competing interests.  ...  Availability of data and materials: Data cannot be made publicly available, in order to comply with the Twitter terms of service.  ...  Figure 3 shows the mean happiness computed using Eq. 1 for both the Arabic and the English tweets over the days throughout the five-year period.  ... 
doi:10.1186/s40537-019-0195-2 fatcat:jr5os35sbreqhf5t6flpg4ln3a

A Second Pandemic? Analysis of Fake News About COVID-19 Vaccines in Qatar [article]

Preslav Nakov, Firoj Alam, Shaden Shaar, Giovanni Da San Martino, Yifan Zhang
2021 arXiv   pre-print
In terms of propaganda techniques, about half of the Arabic tweets express doubt, and 1/5 use loaded language, while English tweets are abundant in loaded language, exaggeration, fear, name-calling, doubt  ...  While COVID-19 vaccines are finally becoming widely available, a second pandemic that revolves around the circulation of anti-vaxxer fake news may hinder efforts to recover from the first one.  ...  Figure 7 shows the top propaganda techniques for Arabic and English. We can see that, for Arabic, 50% of the tweets express doubt, and 20% use loaded language.  ... 
arXiv:2109.11372v1 fatcat:ljljdjbgqvcwvdhckmfzaweywe

Cross-Language Applicability of Linguistic Features Associated with Veracity and Deception

David Matsumoto, Hyisung C. Hwang, Vincent A. Sandoval
2014 Journal of Police and Criminal Psychology  
...  Sandoval and Ling-Ling Beerman for their assistance as coders; and Jenna Propst, Laurel Richards, Sophia Nguyen, Tram Nguyen, and Isaac Perry, for their assistance in collecting data used in this study  ... 
doi:10.1007/s11896-014-9155-0 fatcat:2igwwfvpbff3zdn6s45g4ymayu

A Profile-Based Authorship Attribution Approach to Forensic Identification in Chinese Online Messages [chapter]

Jianbin Ma, Bing Xue, Mengjie Zhang
2016 Lecture Notes in Computer Science  
The profile-based method is used to represent the suspects as category profiles.  ...  Performance on short samples decreases greatly when traditional machine learning algorithms are used.  ...  The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.  ... 
doi:10.1007/978-3-319-31863-9_3 fatcat:hsfoauveujauzhu63nv7b7wjoi

Data Mining System and Applications: A Review

S.P Deshpande, V.M Thakare
2010 International Journal of Distributed and Parallel systems  
Due to the vast use of computers and electronic devices and the tremendous growth in computing power and storage capacity, there is explosive growth in data collection.  ...  In the Information Technology era, information plays a vital role in every sphere of human life.  ...  The linguistic profiling of text is effectively used to control the quality of language and for automatic language verification.  ... 
doi:10.5121/ijdps.2010.1103 fatcat:bf7smhjk5ja7pciuk3ooesdu5m

Online-Dating Romance Scam in Malaysia: An Analysis of Online Conversations between Scammers and Victims

Azianura Hani Shaari, Mohammad Rahim Kamaluddin, Wan Fariza Paizi@Fauzi, Masnizah Mohd
2019 GEMA Online® Journal of Language Studies  
Apart from that, it also aims to identify the pattern of deceptive language used in online romance scams in Malaysia through a comprehensive linguistic analysis of actual online conversations between scammers  ...  The empirical investigation of this research focuses on the language strategies used by scammers as a modus operandi in deceiving their targets.  ...  In the final stage, monetary requests will be made using several techniques after targets have completely fallen for scammers.  ... 
doi:10.17576/gema-2019-1901-06 fatcat:wddiz5n4szhnlnrpk3cgunjyxm

Author Profiling for Vietnamese Blogs

Dang Duc Pham, Giang Binh Tran, Son Bao Pham
2009 2009 International Conference on Asian Language Processing  
This paper presents the first work in the task of author profiling for Vietnamese blogs. This task is important in threat identification and marketing intelligence.  ...  We have developed a Vietnamese Blog Profiling framework to automatically predict age, gender, geographic origin and occupation of weblogs' authors purely based on language use.  ...  Acknowledgement This work is partly supported by the research fund from College of Technology, Vietnam National University, Hanoi.  ... 
doi:10.1109/ialp.2009.47 dblp:conf/ialp/PhamTP09 fatcat:5iuv53icvfb4dgh2n4pf3fqimm

Authorship Analysis Studies: A Survey

2014 International Journal of Computer Applications  
Focus is on outlining the stylometric features that allow distinguishing between authors and on listing the diverse techniques used to classify an author's texts.  ...  The objective in this paper is to provide a review of the different studies done on authorship analysis.  ...  Two machine learning classifiers are used: the C4.5 decision tree algorithm and SVM. Experimental results show a high accuracy of 94.83% for Arabic data and 97% for English data.  ... 
doi:10.5120/15038-3384 fatcat:nhhdvoqf7zbpjcmnvcw4hwfxte

CBR-Based Decision Support Methodology for Cybercrime Investigation: Focused on the Data-Driven Website Defacement Analysis

Mee Lan Han, Byung Il Kwak, Huy Kang Kim
2019 Security and Communication Networks  
Criminal profiling is a useful technique to identify the most plausible suspects based on the evidence discovered at the crime scene.  ...  Similar to offline criminal profiling, in-depth profiling for cybercrime investigation is useful in analysing cyberattacks and for speculating on the identities of the criminals.  ...  Acknowledgments: This work was supported under the framework of the international cooperation program managed by the National Research Foundation of Korea (No. 2017K1A3A1A17092614).  ... 
doi:10.1155/2019/1901548 fatcat:kxskbwaxtnaihnwimah5zsmigy

Multilingual Cross-domain Perspectives on Online Hate Speech [article]

Tom De Smedt, Sylvia Jaki, Eduan Kotzé, Leïla Saoud, Maja Gwóźdź, Guy De Pauw, Walter Daelemans
2018 arXiv   pre-print
In this report, we present a study of eight corpora of online hate speech, by demonstrating the NLP techniques that we used to collect and analyze the jihadist, extremist, racist, and sexist content.  ...  To expose the main features, we have focused on text classification, text profiling, keyword and collocation extraction, along with manual annotation and qualitative study.  ...  The tweets posted by these profiles were then automatically collected using the Pattern toolkit for the Python programming language (De Smedt & Daelemans, 2012a).  ...  These were then used to extract a set  ... 
arXiv:1809.03944v1 fatcat:bkk65x2xivejdfl3ihf34yfwtu