The Effectiveness of Low-Level Structure-based Approach Toward Source Code Plagiarism Level Taxonomy [article]

Oscar Karnalim, Setia Budi
2018 arXiv   pre-print
., an approach which relies on source code token subsequence matching) in controlled environment.  ...  Our evaluation shows that state of the art in low-level approach is effective to handle most plagiarism attacks.  ...  Evaluation Toward Level-3 Plagiarism Attacks and whitespace tokens.  ... 
arXiv:1805.11035v1 fatcat:iyhrqfqm5bezbm4npm4cxtzpp4

TF-IDF Inspired Detection for Cross-Language Source Code Plagiarism and Collusion

Oscar Karnalim
2020 Computer Science  
Our evaluation shows that the technique outperforms common techniques in academia for handling language conversion disguises.  ...  The datasets and metrics are explained first, followed by the evaluation methodology and results. Evaluation Datasets Three datasets were used in this evaluation.  ...  Evaluation This section describes how the proposed technique was evaluated, which datasets and metrics were used on this evaluation, and what findings could be concluded.  ... 
doi:10.7494/csci.2020.21.1.3389 fatcat:frlo4shfdffglbvhpwmc4n7bkm

A Vocabulary-Free Multilingual Neural Tokenizer for End-to-End Task Learning [article]

Md Mofijul Islam, Gustavo Aguilar, Pragaash Ponnusamy, Clint Solomon Mathialagan, Chengyuan Ma, Chenlei Guo
2022 arXiv   pre-print
subword tokenizers.  ...  In this work, we propose a vocabulary-free neural tokenizer by distilling segmentation information from heuristic-based subword tokenization.  ...  Using monolingual subword tokenizers helps our neural tokenizer avoid bias towards any languages, especially the high-resource languages.  ... 
arXiv:2204.10815v1 fatcat:g36mta5kgre7thcsdtlwqawouy

Emotion Carrier Recognition from Personal Narratives [article]

Aniruddha Tammewar, Alessandra Cervone, Giuseppe Riccardi
2021 arXiv   pre-print
We propose evaluation strategies for ECR including metrics that can be appropriate for different tasks.  ...  Token Level The token level evaluation measures the performance of predicting I or O class for each token in a sequence. We use this metric to evaluate our models with different data segmentations.  ...  We presented different baseline models to address the task, and evaluated them using both token-level and agreement metrics.  ... 
arXiv:2008.07481v2 fatcat:vuv3d6fq2ff6jkn5zj2j6ggzy4

Targeted Sentiment to Understand Student Comments

Charles Welch, Rada Mihalcea
2016 International Conference on Computational Linguistics  
Through several comparative evaluations, we show that our system outperforms previous work on a similar task.  ...  We address the task of targeted sentiment as a means of understanding the sentiment that students hold toward courses and instructors, as expressed by students in their comments.  ...  We also run an entity-based evaluation, where we use the IOB tokens to construct full class and instructor names.  ... 
dblp:conf/coling/WelchM16 fatcat:bpvg2v4zhffatol6zsuz6yvq3q

HausaMT v1.0: Towards English-Hausa Neural Machine Translation [article]

Adewale Akinfaderin
2020 arXiv   pre-print
We trained baseline models and evaluated the performance of our models using the Recurrent and Transformer encoder-decoder architecture with two tokenization approaches: standard word-level tokenization  ...  and Byte Pair Encoding (BPE) subword tokenization.  ...  Dataset Conclusion and Future Work Evaluating the model on the test set, we observed that the word-level tokenization outperform the BPE by a BLEU score factor of ~1.27-1.42 times (Table 2).  ... 
arXiv:2006.05014v2 fatcat:jbk4zc7p3nb7pgkz5iitjaugoq

An HSS‐based robust and lightweight multiple group authentication for ITS towards 5G

Chingfang Hsu, Lein Harn, Zhe Xia
2021 IET Intelligent Transport Systems  
modulus, thus this design is more suitable for lightweight multiple group authentication in ITS towards 5G.  ...  More importantly, since polynomial evaluations with a smaller modulus are needed in our proposed scheme, it is more efficient than the original scheme which needs modular exponentiations with a larger  ...  From Horner's rule, evaluating a polynomial of degree t − 1 needs t − 1 multiplications and t additions.  ... 
doi:10.1049/itr2.12113 fatcat:absavir7wfge7doisb7xzkv4i4

An English-Hindi Code-Mixed Corpus: Stance Annotation and Baseline System [article]

Sahil Swami, Ankush Khandelwal, Vinay Singh, Syed Sarfaraz Akhtar, Manish Shrivastava
2018 arXiv   pre-print
We can often detect from these views whether the person is in favor, against or neutral towards a given topic. These opinions from social media are very useful for various companies.  ...  We present a new dataset that consists of 3545 English-Hindi code-mixed tweets with opinion towards Demonetisation that was implemented in India in 2016 which was followed by a large countrywide debate  ...  Each tweet is tokenized and each token is annotated with a language tag. The dataset has 964 tweets in favor, 647 tweets against and 1934 tweets that have no stance towards the target.  ... 
arXiv:1805.11868v1 fatcat:ltastgi2fbhhdpytuc545xnnaq

Lexical Analysis in Content Management System Details

Takudzwa Fadziso
2019 Global Disclosure of Economics and Business  
Lexical analysis is best described as tokenization that converts a sequence of characters (program) into tokens with identifiable meanings.  ...  The task of the lexical analyzer is to read the various input characters grouping them into lexemes and producing an output of a sequence of tokens.  ...  Lexical errors: generally, lexical mistakes happen when invalid characters show up, fundamentally toward the start of the token.  ... 
doi:10.18034/gdeb.v8i2.559 fatcat:hwyg2uvytbdwdo52c44qmhnx4u

Topic-Aware Evaluation and Transformer Methods for Topic-Controllable Summarization [article]

Tatiana Passali, Grigorios Tsoumakas
2022 arXiv   pre-print
First, there is currently no established evaluation metric for this task.  ...  In this work, we propose a new topic-oriented evaluation measure to automatically evaluate the generated summaries based on the topic affinity between the generated summary and the desired topic.  ...  towards this topic.  ... 
arXiv:2206.04317v2 fatcat:2n3y4ef5jnh5nd2334xcpyxytm

The attitude of Japanese newspapers in narrating disaster events: Appraisal in critical discourse study

Dian Puspita, Budi Eko Pranoto
2021 Studies in English language and education  
It reveals newspapers' tendency to emphasize the attitude and to construe the evaluation toward the events or phenomena rather than revealing the feelings or emotions experienced by the emoter(s).  ...  Therefore, it can be concluded that when reporting disaster events, Japanese newspapers tend to emphasize the attitude toward disaster and to construe the evaluation toward the events or phenomena (disaster  ...  Authorial evaluation toward certain topics is found when the writer evaluates the roadmap for lifting evacuation presented in a meeting with a local official done by the Japanese reconstruction minister  ... 
doi:10.24815/siele.v8i2.18368 fatcat:hqgnkhhs2nbwpfdt25emqm4x7u

DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts [article]

Alisa Liu, Maarten Sap, Ximing Lu, Swabha Swayamdipta, Chandra Bhagavatula, Noah A. Smith, Yejin Choi
2021 arXiv   pre-print
Intuitively, under the ensemble, tokens only get high probability if they are considered likely by the experts, and unlikely by the anti-experts.  ...  We apply DExperts to language detoxification and sentiment-controlled generation, where we outperform existing controllable generation methods on both automatic and human evaluations.  ...  for steering toward positivity on negative prompts (left) and steering toward negativity on positive prompts (right).  ... 
arXiv:2105.03023v2 fatcat:jacmrcsmlneexgsxpqkr64o7j4

Neural Quality Estimation with Multiple Hypotheses for Grammatical Error Correction [article]

Zhenghao Liu, Xiaoyuan Yi, Maosong Sun, Liner Yang, Tat-Seng Chua
2021 arXiv   pre-print
Well-trained GEC models can generate several high-quality hypotheses through decoding, such as beam search, which provide valuable GEC evidence and can be used to evaluate GEC quality.  ...  The node interaction attention towards the edited token "suffers" in the second node is also plotted.  ...  representation V k of k-th node, which learns the supporting evidence towards estimating token quality from multi-hypotheses.  ... 
arXiv:2105.04443v1 fatcat:pkthay3t5ja73iij6blynwy5c4

What the Future Brings: Investigating the Impact of Lookahead for Incremental Neural TTS [article]

Brooke Stephenson, Laurent Besacier, Laurent Girin, Thomas Hueber
2020 arXiv   pre-print
The results show that the most salient factors are related to token length. We finally evaluate the effects of lookahead k at the decoder level, using a MUSHRA listening test.  ...  We then investigate which text features are the most influential on the evolution towards the final representation using a random forest analysis.  ...  The goal of the present paper is to pave the way toward an adaptive decoding policy for a neural iTTS.  ... 
arXiv:2009.02035v1 fatcat:celizkfpg5fsbkz27jhgaxrpdu

What the Future Brings: Investigating the Impact of Lookahead for Incremental Neural TTS

Brooke Stephenson, Laurent Besacier, Laurent Girin, Thomas Hueber
2020 Interspeech 2020  
The results show that the most salient factors are related to token length. We finally evaluate the effects of lookahead k at the decoder level, using a MUSHRA listening test.  ...  We then investigate which text features are the most influential on the evolution towards the final representation using a random forest analysis.  ...  The goal of the present paper is to pave the way toward an adaptive decoding policy for a neural iTTS.  ... 
doi:10.21437/interspeech.2020-2103 dblp:conf/interspeech/StephensonBGH20 fatcat:gqmhry5ktvagfpyvknzq23a6he