
The Case for Being Average: A Mediocrity Approach to Style Masking and Author Obfuscation [article]

Georgi Karadjov, Tsvetomila Mihaylova, Yasen Kiprov, Georgi Georgiev, Ivan Koychev, Preslav Nakov
2017 arXiv   pre-print
The approach consists of three main steps: first, we calculate the values for some popular stylometric metrics that can indicate authorship; then we apply various transformations to the text, so that these  ...  However, each person has his/her own style of writing, which can be analyzed using stylometry, and as a result, the true identity of the author of a piece of text can be revealed even if s/he has tried  ...  Acknowledgments We thank the anonymous reviewers for their constructive comments, which have helped us improve the quality of the present paper.  ... 
arXiv:1707.03736v2 fatcat:tn2n3dvixfcmjedjf6flwb4j4y

Synonymizer of the Ukrainian Language: Stage of Creation, Features of Database Update and Software Implementation

Hanna Sytar, Oleh Vietrov, Vladyslava Diachenko
2021 International Conference on Computational Linguistics and Intelligent Systems  
Prospects for further research, in particular, the involvement of synonymous transformations at the sentence level (sentence transformation, sentence conversion, etc.).  ...  In the study of the highlighted stage of creating a synonymizer of the Ukrainian language - a computer program that replaces words as synonyms in texts in the Ukrainian language.  ...  This aspect is extremely important when creating a synonymizer, as not all words can be replaced automatically without reservations (taking into account the context or style restrictions, etc.), so it  ... 
dblp:conf/colins/SytarVD21 fatcat:ntlp34sqjfg3hbxhks324yxhy4

Automatic syntax analysis in machine indexing and abstracting

W. D. Climenson, N. H. Hardwick, S. N. Jacobson
1961 American Documentation  
Dividing such a synonym list into smaller and smaller synonym lists is done by stating the contexts of replacement more precisely, using the grammatical combination found by describing the restricted English  ...  If so, additional valid transformation rules will have been discovered. Applications of Automatic Syntax Analysis. A formal linguistic approach to the problems of natural language processing promises  ... 
doi:10.1002/asi.5090120303 fatcat:swomzk6m5zgnrnxgjuiagw755a

Hiding Information in Reversible English Transforms for a Blind Receiver

Salma Banawan, Ibrahim Kamel
2015 Applied Computational Intelligence and Soft Computing  
Moreover, we show that the proposed transformations do not affect the inconspicuousness of the transformed statements, and thus unlikely to draw suspicion.  ...  The paper provides a number of such transformations that can be applied concurrently, while keeping the overall meaning and grammar intact.  ...  Word/Phrase Replacement Schemes.  ... 
doi:10.1155/2015/387985 fatcat:ugie2ofpu5h2tpikqh4m2kwyli

A Survey of Automated Text Simplification

Matthew Shardlow
2014 International Journal of Advanced Computer Science and Applications  
Text simplification modifies syntax and lexicon to improve the understandability of language for an end user.  ...  Simplification can be used for many applications, including: Second language learners, preprocessing in pipelines and assistive technology.  ...  The first effort towards automated simplification is a grammar and style checker developed for writers of simplified English [16].  ... 
doi:10.14569/specialissue.2014.040109 fatcat:fbskuhircjgo3nykcfbnir7gwi

An Analysis of Crowdsourced Text Simplifications

Marcelo Amancio, Lucia Specia
2014 Proceedings of the 3rd Workshop on Predicting and Improving Text Readability for Target Reader Populations (PITR)  
Our results show that the most common transformation operations performed by humans are paraphrasing (39.80%) and drop of information (26.76%), which are some of the most difficult operations to generalise  ...  We then built machine learning models to attempt to automatically classify segments based on such transformations.  ...  by a synonym; • Discourse marker (0.84%): a discourse marker is altered; • Word definition (0.84%): a word is substituted by its dictionary description; • Writing style (7.56%): the writing style of the  ... 
doi:10.3115/v1/w14-1214 dblp:conf/acl-pitr/AmancioS14 fatcat:sxe5l6o3jvavrn2hldyfjd7dpa

Corpus-centered computation

Eiichiro Sumita
2002 Proceedings of the ACL-02 workshop on Speech-to-speech translation: algorithms and systems -  
High-quality translation has been demonstrated in the domain of travel conversation, and the prospects of this approach are promising due to the benefits of synergistic effects.  ...  To achieve translation technology that is adequate for speech-to-speech translation (S2S), this paper introduces a new attempt named Corpus-Centered Computation, (abbreviated to C³ and pronounced c-cube  ...  Acknowledgements The author's heartfelt thanks go to Kadokawa-Shoten for providing the Ruigo-Shin-Jiten.  ... 
doi:10.3115/1118656.1118657 dblp:conf/acl/Sumita02 fatcat:4dxgihap5ferzo2jopkjmhd6c4

A Transformer Based Pitch Sequence Autoencoder with MIDI Augmentation [article]

Mingshuo Ding, Yinghao Ma
2021 arXiv   pre-print
To tackle this problem, we apply a masked language model based on ALBERT for composers classification.  ...  The experiment results show our model ranks 3rd in all the 7 teams in the data challenge in CSMT (2020).  ...  Synonym replacement is not suitable in a sequence of music analysis, because there is no specific semantic like natural language for music notes or sequences.  ... 
arXiv:2010.07758v3 fatcat:ul5hbgupdvfw7og4mkqfvwawxq

Feature instability as a criterion for selecting potential style markers

Moshe Koppel, Navot Akiva, Ido Dagan
2006 Journal of the American Society for Information Science and Technology  
We show that frequent but unstable features are especially useful as discriminators of an author's writing style.  ...  This measure may be perceived as quantifying the degree of available "synonymy" for a language item.  ...  The resulting stability measure is useful for identifying promising candidates for style-based text categorization.  ... 
doi:10.1002/asi.20428 fatcat:mnkhlhomibgcdjt5dcq7h76bga

The COVID-19 fake news detection in Thai social texts

Pakpoom Mookdarsanit, Lawankorn Mookdarsanit
2021 Bulletin of Electrical Engineering and Informatics  
To lead the knowledge in Thai text understanding forward, feature shifting is a promising accuracy improvement in fine-tuning stage.  ...  Machine translation can be used for constructing Thai source dataset to cope with the lack of local dataset for future Thai-NLP applications.  ...  Do not hesitate to contact the corresponding author for the local data and code (and please referred to this paper).  ... 
doi:10.11591/eei.v10i2.2745 fatcat:by3nav3c2jg7hgyrtvse4w6nxa

A Survey on Data Augmentation for Text Classification [article]

Markus Bayer, Marc-André Kaufhold, Christian Reuter
2022 arXiv   pre-print
Data augmentation, the artificial creation of training data for machine learning by transformations, is a widely studied research field across machine learning disciplines.  ...  aims to provide a concise and comprehensive overview for researchers and practitioners.  ...  Synonym Replacement This very popular form of data augmentation describes the paraphrasing transformation of text instances by replacing certain words with synonyms.  ... 
arXiv:2107.03158v4 fatcat:cjw5zo7p3rdfxiy5w5pvk7av2m


Paul Thompson, Sophia Ananiadou
2018 Terminology  
Normalisation methods automatically map divergent phrases to unique concepts in domain-specific terminologies, to allow location and linking of all mentions of a concept of interest.  ...  HYPHEN achieves robust performance for both biomedical academic text and narrative clinical records, and has the ability to significantly outperform related methods.  ...  Acknowledgements The work described in this article has been supported by the EPSRC and MRC (MMPathIC project, Grant No. MR/N00583X/1), and the BBSRC (EMPATHY project, Grant No. BB/M006891/1).  ... 
doi:10.1075/term.00015.tho fatcat:emmxgt4gx5cm5oj4yls2yv26qy

Generating similes effortlessly like a Pro: A Style Transfer Approach for Simile Generation [article]

Tuhin Chakrabarty, Smaranda Muresan, Nanyun Peng
2020 arXiv   pre-print
We also show how replacing literal sentences with similes from our best model in machine generated stories improves evocativeness and leads to better acceptance by human judges.  ...  Human evaluation on an independent set of literal statements shows that our model generates similes better than two literary experts 37% of the times [we average 32.6% and 41.3% for 2 humans], and three  ...  The authors also thank members of PLUSLab at the University Of California Los Angeles and University Of Southern California and the anonymous reviewers for helpful comments.  ... 
arXiv:2009.08942v2 fatcat:oim6zvwj5baglf5245kx56qspu

Similarity of Semantic Relations [article]

Peter D. Turney
2006 arXiv   pre-print
LRA extends the VSM approach in three ways: (1) the patterns are derived automatically from the corpus, (2) the Singular Value Decomposition (SVD) is used to smooth the frequency data, and (3) automatically  ...  generated synonyms are used to explore variations of the word pairs.  ...  Thanks to Vivi Nastase and Stan Szpakowicz for sharing their 600 classified noun-modifier phrases.  ... 
arXiv:cs/0608100v1 fatcat:dachkmey4jdzxbzntuku5xaqfa

A Girl Has No Name: Automated Authorship Obfuscation using Mutant-X

Asad Mahmood, Faizan Ahmad, Zubair Shafiq, Padmini Srinivasan, Fareed Zaffar
2019 Proceedings on Privacy Enhancing Technologies  
Researchers have proposed several authorship obfuscation approaches that try to make appropriate changes (e.g. word/phrase replacements) to evade attribution while preserving semantics.  ...  The development of powerful machine learning based stylometric authorship attribution methods presents a serious privacy threat for individuals such as journalists and activists who wish to publish anonymously  ...  This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.  ... 
doi:10.2478/popets-2019-0058 dblp:journals/popets/MahmoodASSZ19 fatcat:q2m3o3pg3jao7ff6jnfyv3hfme