
Overview of the Cross-Domain Authorship Verification Task at PAN 2021

Mike Kestemont, Enrique Manjavacas, Ilia Markov, Janek Bevendorff, Matti Wiegmann, Efstathios Stamatatos, Benno Stein, Martin Potthast
2021 Conference and Labs of the Evaluation Forum  
In this year's edition of PAN, the authorship identification track focused on open-set authorship verification, so that systems are applied to unknown documents by previously unseen authors in a new domain  ...  The general setup of the task did not change, i.e., systems still had to estimate the probability of a pair of documents being authored by the same person.  ...  Our thanks also go to the CLEF organizers for the continuation of their hard annual work.  ... 
dblp:conf/clef/KestemontMMBWS021 fatcat:u3iv2jvqwralpprs2pcyetub3e

Improving Authorship Verification using Linguistic Divergence [article]

Yifan Zhang, Dainis Boumber, Marjan Hosseinia, Fan Yang, Arjun Mukherjee
2021 arXiv   pre-print
Our design addresses the problem of non-comparability in authorship verification, frequently encountered in small or cross-domain corpora.  ...  We propose an unsupervised solution to the Authorship Verification task that utilizes pre-trained deep language models to compute a new metric called DV-Distance.  ...  The views and conclusions contained in this document are those of the authors and not of the sponsors. The U.S.  ... 
arXiv:2103.07052v1 fatcat:vrowv74chnd63h3cxozmlv5mem

O2D2: Out-Of-Distribution Detector to Capture Undecidable Trials in Authorship Verification [article]

Benedikt Boenninghoff, Robert M. Nickel, Dorothea Kolossa
2021 arXiv   pre-print
The PAN 2021 authorship verification (AV) challenge is part of a three-year strategy, moving from a cross-topic/closed-set AV task to a cross-topic/open-set AV task over a collection of fanfiction texts  ...  In this work, we present a novel hybrid neural-probabilistic framework that is designed to tackle the challenges of the 2021 task.  ...  Acknowledgments This work was in significant parts performed on an HPC cluster at Bucknell University through the support of the National Science Foundation, Grant Number 1659397.  ... 
arXiv:2106.15825v3 fatcat:doplzuzw3jg3hmv2nhx6ps4l6m

Overview of the Cross-Domain Authorship Verification Task at PAN 2020

Mike Kestemont, Enrique Manjavacas, Ilia Markov, Janek Bevendorff, Matti Wiegmann, Efstathios Stamatatos, Martin Potthast, Benno Stein
2020 Conference and Labs of the Evaluation Forum  
For this edition of PAN, we focused on authorship verification, where the task is to assess whether a pair of documents has been authored by the same individual.  ...  Introduction From the very beginning, authorship analysis tasks have played a key role within the PAN series.  ...  Acknowledgements We would like to thank all participants for their much-appreciated contribution to the shared task and we wish to encourage them to stay involved in the community in the next years.  ... 
dblp:conf/clef/KestemontMMBWSP20 fatcat:a4ihisqt7zbypm6guu4ttvui4y

Siamese Networks for Large-Scale Author Identification [article]

Chakaveh Saedi, Mark Dras
2021 arXiv   pre-print
We examine their application to the stylistic task of authorship attribution on datasets with large numbers of authors, looking at multiple energy functions and neural network architectures, and show that  ...  Authorship attribution is the process of identifying the author of a text.  ...  The PAN author attribution task in 2019 changed focus from the previous setups and the setup of the present paper, to focus on cross-domain texts; the task overview (Kestemont et al., 2019) noted that  ... 
arXiv:1912.10616v3 fatcat:has2zgpd2fawrp6b2yk5hhewqa

Idiosyncratic but not Arbitrary: Learning Idiolects in Online Registers Reveals Distinctive yet Consistent Individual Styles [article]

Jian Zhu, David Jurgens
2021 arXiv   pre-print
The neural model achieves strong performance at authorship identification on short texts and through an analogy-based probing task, showing that the learned representations exhibit surprising regularities  ...  Through text perturbation, we quantify the relative contributions of different linguistic elements to idiolectal variation.  ...  Acknowledgements We thank Professor Patrice Beddor, Jiaxin Pei at the University of Michigan and Zuoyu Tian at the University of Indiana Bloomington for their helpful discussions.  ... 
arXiv:2109.03158v2 fatcat:wdpyhxqeuray3jn5ymtsrbh65a

Syllabic Quantity Patterns as Rhythmic Features for Latin Authorship Attribution [article]

Silvia Corbara, Alejandro Moreo, Fabrizio Sebastiani
2021 arXiv   pre-print
We test the impact of these features on the authorship attribution task when combined with other topic-agnostic features.  ...  In this research we investigate the possibility to employ syllabic quantity as a base for deriving rhythmic features for the task of computational authorship attribution of Latin prose texts.  ...  Acknowledgments The first exploratory steps for this research have been conducted during the preparation of the BSc thesis of Giulio Canapa (2021), co-supervised (aside from the 1st author and 3rd author  ... 
arXiv:2110.14203v1 fatcat:oil6s3jow5behorh3xbxzwrlpq

Gender identification on Twitter

Catherine Ikae, Jacques Savoy
2021 Journal of the Association for Information Science and Technology  
Thus, based on 7 CLEF-PAN collections, this study analyzes the effectiveness of 10 different classifiers.  ...  To determine the author of a text's gender, various feature types have been suggested (e.g., function words, n-gram of letters, etc.) leading to a huge number of stylistic markers.  ...  In the cross-domain authorship attribution task in 2019 (Kestemont et al., 2019), one system took more than 37 h to solve the 25 documents.  ... 
doi:10.1002/asi.24541 fatcat:zqr3fv37uve4bahlwhwvkwb4la

Does It Capture STEL? A Modular, Similarity-based Linguistic Style Evaluation Framework [article]

Anna Wegmann, Dong Nguyen
2021 arXiv   pre-print
We invite the addition of further tasks and task instances to STEL and hope to facilitate the improvement of style-sensitive measures.  ...  We propose the modular, fine-grained and content-controlled similarity-based STyle EvaLuation framework (STEL) to test the performance of any model that can compare two sentences on style.  ...  This research was supported by the "Digital Society - The Informed Citizen" research programme, which is (partly) financed by the Dutch Research Council (NWO), project 410.19.007.  ... 
arXiv:2109.04817v1 fatcat:m4j6pltrojdrpfd7vdws4oqwhu

Analyzing Non-Textual Content Elements to Detect Academic Plagiarism

Norman Meuschke, Bela Gipp, Harald Reiterer, Michael L. Nelson
2021 Zenodo  
To improve the identification of plagiarism in disciplines like mathematics, physics, and engineering, the thesis presents the first plagiarism detection approach that analyzes the similarity of mathematical  ...  The study presents the weaknesses of current detection approaches for identifying strongly disguised plagiarism.  ...  The task in author verification is deciding if a single author, for whom writing samples are available, also authored Authorship Analysis Authorship Attribution a.k.a.  ... 
doi:10.5281/zenodo.4913344 fatcat:xmpaahvwuva53l5l5i2gaidvi4

Algorithmic Fairness Datasets: the Story so Far [article]

Alessandro Fabris, Stefano Messina, Gianmaria Silvello, Gian Antonio Susto
2022 arXiv   pre-print
Secondly, we document and summarize hundreds of available alternatives, annotating their domain and supported fairness tasks, along with additional properties of interest for fairness researchers.  ...  As a result, a growing community of researchers has been investigating the equity of existing algorithms and proposing novel ones, advancing the understanding of risks and opportunities of automated decision-making  ...  Acknowledgements The authors would like to thank the following researchers and dataset creators for the useful feedback on the data briefs: Alain Barrat, Luc Behaghel, Asia Biega, Marko Bohanec, Chris  ... 
arXiv:2202.01711v3 fatcat:kd546yklwjhvtkrbhtzgbzb2xm

D7.4 How to be FAIR with your data. A teaching and training handbook for higher education institutions [article]

Claudia Engelhardt, Katarzyna Biernacka, Aoife Coffey, Ronald Cornet, Alina Danciu, Yuri Demchenko, Christopher Erdmann, Federica Garbuglia, Kerstin Germer, Kerstin Helbig, Margareta Hellström, Kristina Hettne (+27 others)
2021 Zenodo  
It was written and edited by a group of about 40 collaborators in a series of six book sprint events that took place between 1 and 10 June 2021.  ...  It incorporates community feedback received during the public consultation which ran from 27 July to 12 September 2021.  ...  For an overview of all Competence Groups, see Appendix D (taken from Demchenko et al. 2021, pp. 70 et sqq.).  ... 
doi:10.5281/zenodo.5787046 fatcat:ibvm7yo4njfpfihebcgmgoz74y

Knowledge Graph informed Fake News Classification via Heterogeneous Representation Ensembles [article]

Boshko Koloski, Timen Stepišnik-Perdih, Marko Robnik-Šikonja, Senja Pollak, Blaž Škrlj
2021 arXiv   pre-print
To our knowledge this is the first larger-scale evaluation of how knowledge graph-based representations can be systematically incorporated into the process of fake news classification.  ...  One of the key contributions is a set of novel document representation learning methods based solely on knowledge graphs, i.e. extensive collections of (grounded) subject-predicate-object triplets.  ...  Other fake news related tasks include the identification of a potential author as a spreader of fake news and the verification of facts.  ... 
arXiv:2110.10457v1 fatcat:pevnzvabgvcxpkoxpqxrqvsmsq

Biodiversity Community Integrated Knowledge Library (BiCIKL)

Lyubomir Penev, Dimitrios Koureas, Quentin Groom, Jerry Lanfear, Donat Agosti, Ana Casino, Joe Miller, Christos Arvanitidis, Guy Cochrane, Donald Hobern, Olaf Banki, Wouter Addink (+10 others)
2022 Research Ideas and Outcomes  
through provision of access to data, associated tools and services at each separate stage of and along the entire research cycle.  ...  BiCIKL is a European Union Horizon 2020 project that will initiate and build a new European starting community of key research infrastructures, establishing open science practices in the domain of biodiversity  ...  Main tasks of the entity in the project Species 2000 will deliver high-quality virtual access to the taxonomic framework for use throughout the BiCIKL project and strengthen linkages with the taxonomic  ... 
doi:10.3897/rio.8.e81136 fatcat:4owwssl7zzfzna6likizigcv7u

Containing the COVID-19 Pandemic with Drones - Feasibility of a Drone Enabled Back-up Transport System

Maximilian Kunovjanek, Christian Wankmüller
2021 Transport Policy  
of essential goods in the case of emergency.  ...  The novel approach that we propose is to use existing drone infrastructure to perform this task, where drones owned and operated by different public and private entities are retrofitted for the distribution  ...  Acknowledgments The authors want to thank the Carinthian Red Cross Organization and the staff of Air6 Systems for their cooperation during the study development.  ... 
doi:10.1016/j.tranpol.2021.03.015 pmid:33846672 pmcid:PMC8019130 fatcat:c66zs2ywevhpxp46wzvzfvxtmm