Baseline Approaches for the Authorship Identification Task - Notebook for PAN at CLEF 2011

Darnes Vilariño Ayala, Esteban Castillo, David Pinto, Saúl León, Mireya Tovar
2011 Conference and Labs of the Evaluation Forum  
In this paper we present the evaluation of three different classifiers (Rocchio, Naïve Bayes and Greedy) with the aim of obtaining a baseline in the task of authorship identification.  ...  However, we recommend using both Rocchio and Naïve Bayes in future evaluations of the PAN competition as baselines against which other teams may compare their own approaches.  ...  Authorship features and classification: The aim of our first participation in PAN'11 was to obtain baselines for the task of authorship identification.  ...
dblp:conf/clef/AyalaCPLT11 fatcat:slojw3lpc5abvgiftlnuasomxq

Authorship Identification with Modality Specific Meta Features - Notebook for PAN at CLEF 2011

Thamar Solorio, Sangita Pillay, Manuel Montes-y-Gómez
2011 Conference and Labs of the Evaluation Forum  
This paper presents the approach used in the PAN '11 authorship identification competition.  ...  However, considering that our system was not fine-tuned for the PAN evaluation data, we found our results very encouraging.  ...  This work was partially supported by a UAB faculty development grant and by the UPV, award 1932, under the program Research Visits for Renowned Scientists (PAID-02-11) to the first author.  ...
dblp:conf/clef/SolorioPM11 fatcat:krzp3bs3pbeipeiklm5f3ymgka

Author Profiling for English and Spanish Text - Notebook for PAN at CLEF 2013

Upendra Sapkota, Thamar Solorio, Manuel Montes-y-Gómez, Gabriela Ramírez-de-la-Rosa
2013 Conference and Labs of the Evaluation Forum  
This paper describes an approach for the author profiling task of the PAN 2013 challenge.  ...  For both English and Spanish documents, our system performed well for the age identification task.  ...  It was also supported in part by the CONACYT grant 134186.  ... 
dblp:conf/clef/SapkotaSMR13 fatcat:3y4iwirxxrcjfpk5pr3ptiggiq

Overview of PAN'16 [chapter]

Paolo Rosso, Francisco Rangel, Martin Potthast, Efstathios Stamatatos, Michael Tschuggnall, Benno Stein
2016 Lecture Notes in Computer Science  
This paper presents an overview of the PAN/CLEF evaluation lab. During the last decade, PAN has been established as the main forum of digital text forensic research.  ...  PAN 2016 comprises three shared tasks: (i) author identification, addressing author clustering and diarization (or intrinsic plagiarism detection); (ii) author profiling, addressing age and gender prediction  ...  Acknowledgements: We thank the organizing committees of PAN's shared tasks Ben Verhoeven, Walter Daelemans, Patrick Juola. Our special thanks go to all of PAN's participants, to Adobe 12  ...
doi:10.1007/978-3-319-44564-9_28 fatcat:qyopkvia5jalvg5lazy2x4ugqu

Overview of the Cross-Domain Authorship Verification Task at PAN 2021

Mike Kestemont, Enrique Manjavacas, Ilia Markov, Janek Bevendorff, Matti Wiegmann, Efstathios Stamatatos, Benno Stein, Martin Potthast
2021 Conference and Labs of the Evaluation Forum  
In this year's edition of PAN, the authorship identification track focused on open-set authorship verification, so that systems are applied to unknown documents by previously unseen authors in a new domain  ...  Idiosyncrasies in human writing styles make it difficult to develop systems for authorship identification that scale well across individuals.  ...  Our thanks also go to the CLEF organizers for the continuation of their hard annual work.  ... 
dblp:conf/clef/KestemontMMBWS021 fatcat:u3iv2jvqwralpprs2pcyetub3e

Overview of the Author Identification Task at PAN 2013

Patrick Juola, Efstathios Stamatatos
2013 Conference and Labs of the Evaluation Forum  
The author identification task at PAN-2013 focuses on author verification where given a set of documents by a single author and a questioned document, the problem is to determine if the questioned document  ...  In this paper we present the evaluation setup, the performance measures, the new corpus we built for this task covering three languages and the evaluation results of the 18 participant teams that submitted  ...  Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.  ... 
dblp:conf/clef/JuolaS13 fatcat:mylcetxezravtifwl5oebbtgru

Multilingual Gender Classification with Multi-view Deep Learning: Notebook for PAN at CLEF 2018

Matej Martinc, Blaz Skrlj, Senja Pollak
2018 Conference and Labs of the Evaluation Forum  
We present the results of a gender identification performed on the data set of tweets and images prepared for the PAN 2018 Author profiling shared task.  ...  The proposed approach was 8th in the global ranking of PAN 2018 Author profiling shared task.  ...  The first PAN event took place in 2011, while the first AP shared task was organized in 2013 [15].  ...
dblp:conf/clef/MartincSP18 fatcat:luxhvxxnfjh5vali26rxubqiou

Overview of the Cross-Domain Authorship Verification Task at PAN 2020

Mike Kestemont, Enrique Manjavacas, Ilia Markov, Janek Bevendorff, Matti Wiegmann, Efstathios Stamatatos, Martin Potthast, Benno Stein
2020 Conference and Labs of the Evaluation Forum  
For this edition of PAN, we focused on authorship verification, where the task is to assess whether a pair of documents has been authored by the same individual.  ...  Introduction: From the very beginning, authorship analysis tasks have played a key role within the PAN series.  ...  We thank the CLEF organizers for their work in organizing the conference, especially in these trying times.  ...
dblp:conf/clef/KestemontMMBWSP20 fatcat:a4ihisqt7zbypm6guu4ttvui4y

Idiosyncratic but not Arbitrary: Learning Idiolects in Online Registers Reveals Distinctive yet Consistent Individual Styles [article]

Jian Zhu, David Jurgens
2021 arXiv   pre-print
The neural model achieves strong performance at authorship identification on short texts and through an analogy-based probing task, showing that the learned representations exhibit surprising regularities  ...  We introduce a new approach to studying idiolects through a massive cross-author comparison to identify and encode stylistic features.  ...  Acknowledgements: We thank Professor Patrice Beddor, Jiaxin Pei at the University of Michigan and Zuoyu Tian at the University of Indiana Bloomington for their helpful discussions.  ...
arXiv:2109.03158v2 fatcat:wdpyhxqeuray3jn5ymtsrbh65a

Author Profiling in Social Media with Multimodal Information

Miguel Á. Álvarez Carmona, Esaú Villatoro Tello, Manuel Montes y Gómez, Luis Villaseñor Pineda
2020 Journal of Computacion y Sistemas  
In this thesis work, we propose a solution for the task of profiling authors in social networks.  ...  Determining aspects of a person such as gender, age, residency, and occupation, among others, through his/her texts is a task that is part of natural language processing and is known as author profiling.  ...  INAOE's participation at PAN'15: Author profiling task. Working Notes Papers of the CLEF. (37 cites) - Álvarez-Carmona, M. Á., López-Monroy, A.  ...
doi:10.13053/cys-24-3-3488 fatcat:dj6rhmdbf5grhngea6hwocxo7q

Does It Capture STEL? A Modular, Similarity-based Linguistic Style Evaluation Framework [article]

Anna Wegmann, Dong Nguyen
2021 arXiv   pre-print
However, evaluation methods for style measures are rare, often task-specific and usually do not control for content.  ...  We invite the addition of further tasks and task instances to STEL and hope to facilitate the improvement of style-sensitive measures.  ...  Acknowledgements: We thank the anonymous EMNLP reviewers for their helpful feedback. We thank Yupei Du and Qixiang Fang for the productive discussions and their equally helpful feedback.  ...
arXiv:2109.04817v1 fatcat:m4j6pltrojdrpfd7vdws4oqwhu

Data Leakage Prevention for Secure Cross-Domain Information Exchange

Kyrre Wahl Kongsgard, Nils Agne Nordbotten, Federico Mancini, Raymond Haakseth, Paal E. Engelstad
2017 IEEE Communications Magazine  
External plagiarism detection using information retrieval and sequence alignment - Notebook for PAN at CLEF 2011.  ...  AV differs from authorship attribution (AA) in that it is not a closed-world problem, i.e., for AA the task is to determine who in the given candidate set is the true author.  ...  In order to provide a better context for performing classification, we monitor the incoming information flow and use the audit trail to construct controlled environments.  ...
doi:10.1109/mcom.2017.1700235 fatcat:zwcixu2adrgnpgtkaxg4p5kxh4

Proceedings of the GermEval 2021 Workshop on the Identification of Toxic, Engaging, and Fact-Claiming Comments [article]

Julian Risch, Anke Stoll, Lena Wilms, Michael Wiegand
The availability of language representations learned by large pretrained neural network models (such as BERT and ELECTRA) has led to improvements in many downstream Natural Language Processing tasks in  ...  On out-of-sample data, our best ensemble achieved a macro-F1 score of 0.73 (for all subtasks), and F1 scores of 0.72, 0.70, and 0.76 for subtasks 1, 2, and 3, respectively.  ...  Acknowledgments: The authors would like to thank the GermEval-2021 organizers for organizing this interesting shared task and for making the dataset available.  ...
doi:10.48415/2021/fhw5-x128 fatcat:u3fcq4x23jba7ic2a5ldcsdbna

***INVITED TALK***: Handling and Mining Linguistic Variation in UGC Distributed Representations of Words and Documents for Discriminating Similar Languages

Preslav Nakov, Marcos Zampieri, Petya Osenova, Liling Tan, Cristina Vertan, Nikola Ljubešić, Jörg Tiedemann, Željko Agić, Laura Alonso Alemany, Jorge Baptista (+52 others)
2015 unpublished
Yet, DSL remains a challenge for state-of-the-art language identification.  ...  VarDial workshop at COLING2014.  ...  We further thank the LT4VarDial Program Committee for thoroughly reviewing the system papers and the shared task report.  ... 

Citation-based Plagiarism Detection - applying citation pattern analysis to identify currently non-machine-detectible disguised plagiarism in scientific publications [article]

Béla Gipp, Universitäts- Und Landesbibliothek Sachsen-Anhalt, Martin-Luther Universität, Andreas Nürnberger
Acknowledgements 1 Refer to Section 2.1.1, page 10, for a definition of plagiarism.  ...  Devi SL, Rao PRK, Ram VS, Akilandeswari A (2010) External Plagiarism Detection - Lab Report for PAN at CLEF 2010. In: Notebook Papers of CLEF 2010 LABs and Workshops 89.  ...  PAN is an acronym for "Plagiarism Analysis, Authorship Identification, and Near-Duplicate Detection". Competitors in the PAN-PC primarily present research prototypes.  ...
doi:10.25673/4083 fatcat:zkeonqwdpvgxpnab2wzxnbym7m