1,020 Hits in 4.2 sec

Determining Window Size from Plagiarism Corpus for Stylometric Features [chapter]

Šimon Suchomel, Michal Brandejs
2015 Lecture Notes in Computer Science  
The paper shows the pros and cons of the stop words removal for the sliding window document profiling and discusses the utilization of the selected feature for intrinsic plagiarism detection.  ...  It was conducted for a vocabulary richness method called 'average word frequency class' using the PAN 2015 source retrieval training corpus for plagiarism detection.  ...  On the other hand, if a class change between two neighbouring plagiarized passages is detected the intrinsic plagiarism detection is successful, and so there is no need for the classification method to  ... 
doi:10.1007/978-3-319-24027-5_31 fatcat:ylqqjsuw5vdv7bf6e5ab2srbvq

RDI System for Intrinsic Plagiarism Detection (RDI_RID), Working Notes for PANAraPlagDet at FIRE 2015

Ashraf Y. Mahgoub, Ahmed Magooda, Mohsen A. Rashwan, Magda B. Fayek, Hazem M. Raafat
2015 Forum for Information Retrieval Evaluation  
Many researchers have been investigating the task of plagiarism detection lately. In this paper we present RDI system for intrinsic plagiarism detection (RDI_RID).  ...  RDI_RID system achieved a PlagDet (Plagiarism Detection score) of 19% compared to 38% achieved by the base line system.  ...  INTRODUCTION Due to major advances in plagiarism techniques, plagiarized documents have become too difficult and sophisticated to be detected by traditional plagiarism detection methodologies.  ... 
dblp:conf/fire/MahgoubMRFR15 fatcat:vgihussb4ja2lc334toiibmh64

Detection of Plagiarism in Arabic Documents

Mohamed El Bachir Menai
2012 International Journal of Information Technology and Computer Science  
Many language-sensitive tools for detecting plagiarism in natural language documents have been developed, particularly for English.  ...  In this paper, we present a plagiarism detection tool for comparison of Arabic documents to identify potential similarities.  ...  There are two main classes of methods used to reduce plagiarism [2] : plagiarism prevention methods and plagiarism detection methods.  ... 
doi:10.5815/ijitcs.2012.10.10 fatcat:uc3sxp3ctrgvfemkcw3tx7dagy

Using Clustering to Identify Outlier Chunks of Text - Notebook for PAN at CLEF 2011

Navot Akiva
2011 Conference and Labs of the Evaluation Forum  
Intrinsic plagiarism detection is a sub-task of authorship identification in which outlier chunks must be detected solely on the basis of stylistic differences from the main body of the text.  ...  In the first phase of our method we cluster chunks of text represented by usage of infrequent words. In the second phase, we use a training corpus to identify cluster properties of outlier chunks.  ...  Outlier Chunks Identification Our approach consists of two phases: chunks clustering and cluster properties detection.  ... 
dblp:conf/clef/Akiva11 fatcat:2yj6azvssrhnbfafipvppcjbgi

Experiments on Document Chunking and Query Formation for Plagiarism Source Retrieval

Amit Prakash, Sujan Kumar Saha
2014 Conference and Labs of the Evaluation Forum  
Our work is focused on intelligent chunking of suspicious documents and a hybrid approach of query formation.  ...  A method based on term frequency and word co-occurrence is proposed to extract query terms from a non-overlapping chunk of topically related sentences.  ...  The plagiarism detection method we proposed does minimal computations and performs the task at a speed suitable enough for practical applications.  ... 
dblp:conf/clef/PrakashS14 fatcat:r67boygidfgzzn245ecfyyjs6e

Citation pattern matching algorithms for citation-based plagiarism detection

Bela Gipp, Norman Meuschke
2011 Proceedings of the 11th ACM symposium on Document engineering - DocEng '11  
The algorithms are coined Greedy Citation Tiling, Citation Chunking and Longest Common Citation Sequence.  ...  This paper introduces three algorithms and discusses their suitability for the purpose of Citation-based Plagiarism Detection.  ...  detection method [18, p. 155 ].  ... 
doi:10.1145/2034691.2034741 dblp:conf/doceng/GippM11 fatcat:lsrliasw55ditdlprqygi46lyu

Citation-based Plagiarism Detection [chapter]

Bela Gipp
2014 Citation-based Plagiarism Detection  
The advantages and limitations of Citation-based Plagiarism Detection are very different from those of the currently used text-based methods.  ...  Currently used Plagiarism Detection Systems solely rely on text-based comparisons.  ...  In this context, we proposed this definition: Citation-based Plagiarism Detection (CbPD) subsumes methods that use citations and references for determining similarities between documents in order to  ... 
doi:10.1007/978-3-658-06394-8_4 fatcat:mnoijds7o5gs7cxw3c3a2rgaqy

THE CONSTRUCTION OF INDONESIAN-ENGLISH CROSS LANGUAGE PLAGIARISM DETECTION SYSTEM USING FINGERPRINTING TECHNIQUE

Zakiy Firdaus Alfikri, Ayu Purwarianti
2012 Jurnal Ilmu Komputer dan Informasi  
document is written in Indonesian and the source document is written in English.  ...  In this paper, we also propose additional methods to be implemented in heuristic retrieval component to increase the performance of the system: phrase chunking, stop word removal, stemming, and synonym  ...  In this component, there are some additional methods implemented: phrase chunking, stop word removal, and stemming. Phrase chunking method uses library from Stanford Parser [10] .  ... 
doi:10.21609/jiki.v5i1.182 doaj:a825cdb3112d4e4f997e73df813cbc40 fatcat:n7aysyp2fnezdf7eicburvdmny

Improving the Reliability of the Plagiarism Detection System - Lab Report for PAN at CLEF 2010

Jan Kasprzak, Michal Brandejs
2010 Conference and Labs of the Evaluation Forum  
We describe our experiments with intrinsic plagiarism detection and evaluate them.  ...  In this paper we describe our approach at the PAN 2010 plagiarism detection competition. We refer to the system we have used in PAN'09.  ...  Acknowledgements We would like to thank Pavel Rychlý for discussing the properties of n-gram profiling and intrinsic plagiarism detections with us.  ... 
dblp:conf/clef/KasprzakB10 fatcat:f7v766zwxjd57cbnkzoxhanvmm

State of the Art in Detecting Academic Plagiarism

Norman Meuschke, Bela Gipp
2013 Zenodo  
Consequently, methods and systems aiding in the detection of plagiarism have attracted much research within the last two decades.  ...  In the future, plagiarism detection systems may benefit from combining traditional character-based detection methods with these emerging detection approaches.  ...  This article reviews the extensive literature on academic plagiarism detection, describes detection methods, and presents evaluations of their detection performance.  ... 
doi:10.5281/zenodo.3482941 fatcat:e4bl72bt3nboxnjig5nvkpciv4

Overview of the AraPlagDet PAN@FIRE2015 Shared Task on Arabic Plagiarism Detection

Imene Bensalem, Imene Boukhalfa, Paolo Rosso, Lahsen Abouenour, Kareem Darwish, Salim Chikhi
2015 Forum for Information Retrieval Evaluation  
It has two subtasks, namely external plagiarism detection and intrinsic plagiarism detection. A total of 8 runs have been submitted and tested on the standardized corpora developed for the track.  ...  AraPlagDet is the first shared task that addresses the evaluation of plagiarism detection methods for Arabic texts.  ...  Evaluation Baseline We employed a simple baseline, which entails detecting common chunks of word 5-grams between the suspicious documents and the source documents and then merging the adjacent detected  ... 
dblp:conf/fire/BensalemBRADC15 fatcat:bohxplfkrzdptjkh36eqrgessm
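
The baseline mentioned in the snippet above (common word 5-grams between a suspicious and a source document, with adjacent detections merged) can be illustrated roughly as follows. This is only a sketch under assumptions, not the shared task's actual baseline: the function name `shared_spans`, the whitespace tokenization, the lowercasing, and the rule that only touching or overlapping matches are merged are all hypothetical choices made for illustration.

```python
def shared_spans(suspicious: str, source: str, n: int = 5) -> list[tuple[int, int]]:
    """Return (start, end) word-index spans of the suspicious text that are
    covered by word n-grams also present in the source text.

    Assumptions (not taken from the paper): whitespace tokenization,
    case-insensitive matching, end index is exclusive, and two matches are
    merged only when they overlap or touch.
    """
    sus = suspicious.lower().split()
    src = source.lower().split()

    # All word n-grams of the source document, for constant-time lookup.
    src_grams = {tuple(src[i:i + n]) for i in range(len(src) - n + 1)}

    spans: list[tuple[int, int]] = []
    for i in range(len(sus) - n + 1):
        if tuple(sus[i:i + n]) in src_grams:
            if spans and i <= spans[-1][1]:
                # Overlaps or touches the previous span: extend it.
                spans[-1] = (spans[-1][0], i + n)
            else:
                spans.append((i, i + n))
    return spans


if __name__ == "__main__":
    print(shared_spans("x a b c d e f y", "a b c d e f"))
```

A real detector would likely also tolerate small gaps between matches and map the word spans back to character offsets; both are left out here to keep the sketch short.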

A Set-Based Approach to Plagiarism Detection

Robin Küppers, Stefan Conrad
2012 Conference and Labs of the Evaluation Forum  
Furthermore we employ basic strategies from Information Retrieval and Natural Language Processing for stop word removal and language detection.  ...  Our experiments deal with monolingual plagiarism cases, only. We use a simple set-based algorithm, that employs Dice's coefficient as a similarity measure.  ...  At this point we chunked the suspicious document into n and the source document into m chunks.  ... 
dblp:conf/clef/KuppersC12 fatcat:2rhubkwk3bcy7jgdkzei7yechi
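
As a rough illustration of the set-based idea in the snippet above, the following sketch splits both documents into chunks and compares the chunks with Dice's coefficient, 2|A ∩ B| / (|A| + |B|). The coefficient itself is standard; everything else here is an assumption for illustration and not the authors' implementation: the names `dice_coefficient`, `chunk_words`, and `similar_chunk_pairs`, the fixed-size word chunks, and the threshold value.

```python
def dice_coefficient(a: set, b: set) -> float:
    """Dice's coefficient of two sets: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return 2 * len(a & b) / (len(a) + len(b))


def chunk_words(text: str, size: int) -> list[set]:
    """Split text into consecutive chunks of `size` words, each as a word set.

    Fixed-size, non-overlapping chunks are an assumption of this sketch.
    """
    words = text.lower().split()
    return [set(words[i:i + size]) for i in range(0, len(words), size)]


def similar_chunk_pairs(suspicious: str, source: str,
                        size: int = 5, threshold: float = 0.5):
    """Return (i, j, score) for every suspicious chunk i and source chunk j
    whose Dice score reaches the (hypothetical) threshold."""
    sus_chunks = chunk_words(suspicious, size)   # n chunks
    src_chunks = chunk_words(source, size)       # m chunks
    pairs = []
    for i, a in enumerate(sus_chunks):
        for j, b in enumerate(src_chunks):
            score = dice_coefficient(a, b)
            if score >= threshold:
                pairs.append((i, j, score))
    return pairs


if __name__ == "__main__":
    print(dice_coefficient({"a", "b"}, {"b", "c"}))
```

The nested loop compares all n × m chunk pairs, which is fine for a sketch but would need indexing or pruning on larger corpora.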

Comparison of Overlap Detection Techniques [chapter]

Krisztián Monostori, Raphael Finkel, Arkady Zaslavsky, Gábor Hodász, Máté Pataki
2002 Lecture Notes in Computer Science  
These previous methods share two common stages: chunking of documents and selection of representative chunks.  ...  The applications of these methods are not limited to plagiarism detection but may target other copy-detection problems.  ...  We plan to run our tests on more document sets and we are also developing an online system where these methods are available for copy-detection applications (more specifically plagiarism detection).  ... 
doi:10.1007/3-540-46043-8_4 fatcat:ij6ekyl7ovevhnncwoahw3eopu

Authorship and Plagiarism Detection Using Binary BOW Features

Navot Akiva
2012 Conference and Labs of the Evaluation Forum  
In this work we examine a simplified approach for unsupervised authorship and plagiarism detection which is based on binary bag of words representation.  ...  We evaluate our approach using PAN-2012 Authorship Attribution challenge data, which includes both open/closed class authorship identification and intrinsic plagiarism tasks.  ...  Plagiarism Detection For clustering/plagiarism problems (tasks E and F), we treat each paragraph as a separate document and apply the n-cut clustering algorithm described in [7] .  ... 
dblp:conf/clef/Akiva12 fatcat:xiwi2qcxlbd3npboflnfnua6qu

2L-APD: A Two-Level Plagiarism Detection System for Arabic Documents

El Moatez Billah Nagoudi, Ahmed Khorsi, Hadda Cherroun, Didier Schwab
2018 Cybernetics and Information Technologies  
(STS), Sentiment Analysis (SA) and Plagiarism Detection (PD).  ...  In this paper, we report a plagiarism detection system based on two layers of assessment: 1) Fingerprinting which simply compares the documents fingerprints to detect the verbatim reproduction; 2) Word  ...  Segmentation and Pre-processing In a first step, each document dsus and dsrc is chunked into sentences.  ... 
doi:10.2478/cait-2018-0011 fatcat:vxwm6pegkzbf7cclzpmf56db7u