Adaptation of AAC to the Context Communication: A Real Improvement for the User Illustration through the VITIPI Word Completion [chapter]

Philippe Boissière, Nadine Vigouroux, Mustapha Mojahid, Frédéric Vella
2012 Lecture Notes in Computer Science  
This paper describes the performance of the VITIPI word completion system through a text input simulation.  ...  The aim of this simulation is to estimate the impact of the linguistic knowledge base size through two metrics: the Key-Stroke Ratio (KSR) and the KeyStroke Per Character (KPC).  ...  The same effect is observed on the (Fig. 3) (after the consideration of 16 sub-corpora the KSR is around 44,5).  ... 
doi:10.1007/978-3-642-31534-3_67 fatcat:psusovqq2necpkhyaasfafmgqm

PJAIT Systems for the WMT 2016

Krzysztof Wolk, Krzysztof Marasek
2016 Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers  
Our results indicate that our approach produced a positive impact on SMT quality.  ...  To evaluate the effects of different preparations on translation results, we conducted experiments and used the BLEU, NIST and TER metrics.  ...  SMT systems are more accurate on corpora from a domain that is not too wide.  ... 
doi:10.18653/v1/w16-2328 dblp:conf/wmt/WolkM16 fatcat:rwsizz5u4rbl5h565ryi7uovda

Extracting parallel sub-sentential fragments from non-parallel corpora

Dragos Stefan Munteanu, Daniel Marcu
2006 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the ACL - ACL '06  
We present a novel method for extracting parallel sub-sentential fragments from comparable, non-parallel bilingual corpora.  ...  We evaluate the quality of the extracted data by showing that it improves the performance of a state-of-the-art statistical machine translation system.  ...  SMT Performance Results: We evaluate our extracted corpora by measuring their impact on the performance of an SMT system.  ... 
doi:10.3115/1220175.1220186 dblp:conf/acl/MunteanuM06 fatcat:par2gtjcqzc2rmggbx6zpqj72e

Discourse on climate and energy justice: a comparative study of Do It Yourself and Bootstrapped corpora

Camille Biros, Caroline Rossi, Inesa Sahakyan
2018 Corpus  
system.  ...  statistical tests (Log Likelihood Feature on Antconc, UCREL Semantic Analysis System on WMatrix) rather than raw counts.  ... 
doi:10.4000/corpus.3376 fatcat:ms7xjfo62zawhaofloukr4v2de

PJIIT's systems for WMT 2017 Conference

Krzysztof Wolk, Krzysztof Marasek
2017 Proceedings of the Second Conference on Machine Translation  
Our results indicate that our approach produced a positive impact on SMT quality.  ...  To evaluate the effects of different preparations on translation results, we conducted experiments and used the BLEU, NIST and TER metrics.  ...  SMT systems are more accurate on corpora from a domain that is not too wide.  ... 
doi:10.18653/v1/w17-4743 dblp:conf/wmt/WolkM17 fatcat:ebotrgzcy5bvzdesdsiti3mnly

An Efficient Framework to Extract Parallel Units from Comparable Data [chapter]

Lu Xiang, Yu Zhou, Chengqing Zong
2013 Communications in Computer and Information Science  
Experimental results on SMT show that the baseline SMT system can achieve significant improvement by adding this extra mined knowledge.  ...  At sentential level, we consider the parallel sentence identification as a classification problem and extract more representative and effective features.  ...  fragments, and (3) adding both the extracted parallel sentences and sub-sentential fragments to the original corpora and then evaluating the impact on an end-to-end SMT system.  ... 
doi:10.1007/978-3-642-41644-6_15 fatcat:careputcljc7hesquxqwtkou2e

Hierarchical Ontology Graph for Solving Semantic Issues in Decision Support Systems

Hua Guo, Kecheng Liu
2019 Proceedings of the 21st International Conference on Enterprise Information Systems  
The study of selecting the appropriate corpora is intended to improve the data asset management of enterprises.  ...  However, for answering questions with complex logic, AI systems are still at an early stage.  ...  The workload of obtaining semantically rich annotated corpora is manageable, which is a crucial impact factor of the computing result.  ... 
doi:10.5220/0007769904830487 dblp:conf/iceis/GuoL19 fatcat:ynexyh7qcbffdkoptmm7kbufpu

Selecting Parallel In-domain Sentences for Neural Machine Translation Using Monolingual Texts [article]

Javad Pourmostafa Roshan Sharami, Dimitar Shterionov, Pieter Spronck
2022 arXiv   pre-print
Our experimental results show that models trained on this in-domain data outperform models trained on generic or a mixture of generic and domain data.  ...  Our work addresses this gap with a method for selecting in-domain data from generic-domain (parallel text) corpora, for the task of machine translation.  ...  According to TER, system top1 achieved the highest score among all systems trained on mixed sub-corpora. This would imply that this system would require the most post-editing effort.  ... 
arXiv:2112.06096v3 fatcat:34ggtx62e5cybiprr2ioaxnxb4

PJAIT Systems for the IWSLT 2015 Evaluation Campaign Enhanced by Comparable Corpora [article]

Krzysztof Wołk, Krzysztof Marasek
2015 arXiv   pre-print
Our results indicate that our approach produced a positive impact on SMT quality.  ...  To evaluate the effects of different preparations on translation results, we conducted experiments and used the BLEU, NIST and TER metrics.  ...  The results show a positive impact of our approach on SMT quality across the language pairs.  ... 
arXiv:1512.01639v1 fatcat:ldzt4lq3sjf2hpr2wgstiuf2xu

Visualising COVID-19 Research [article]

Pierre Le Bras, Azimeh Gharavi, David A. Robb, Ana F. Vidal, Stefano Padilla, Mike J. Chantler
2020 arXiv   pre-print
We apply this method on two recently released publications datasets (Dimensions' COVID-19 dataset and the Allen Institute for AI's CORD-19).  ...  The results also demonstrate the need to quickly and automatically enable search and browsing of large corpora.  ...  ACKNOWLEDGEMENTS This work was funded by the ORCA Hub (EPSRC grant: EP/R026173/1, website: and the Exploiting Impact Using a Modular Decision-Making Toolset project (EPSRC Impact Acceleration  ... 
arXiv:2005.06380v2 fatcat:fc72qsogt5ft3fl5zzqv7ialr4

Comparison of Different Orthographies for Machine Translation of Under-Resourced Dravidian Languages

Bharathi Raja Chakravarthi, Mihael Arcan, John P. McCrae, Michael Wagner
2019 International Conference on Language, Data, and Knowledge  
We performed experiments on the language pairs English-Tamil, English-Telugu, and English-Kannada translation task.  ...  Moreover, by analyzing the METEOR and chrF scores we note that systems, based on the Latin script using sub-word segmented corpora effectively reduce the translation errors.  ...  We study the effect of different orthography on NMT and show that coarse-grained transcription to Latin script outperforms the more fine-grained IPA and native script on multilingual NMT system.  ... 
doi:10.4230/oasics.ldk.2019.6 dblp:conf/ldk/ChakravarthiAM19 fatcat:krlq7ashabcxxkyrkwr2labwwa

Advances in Automatic Speech Recognition for Child Speech Using Factored Time Delay Neural Network

Fei Wu, Leibny Paola García-Perera, Daniel Povey, Sanjeev Khudanpur
2019 Interspeech 2019  
Compared with conventional GMM-HMM and TDNN systems, TDNN-F does better on two widely accessible corpora: CMU Kids and CSLU Kids, and on the combination of these two.  ...  Our system achieves a 26% relative improvement in WER.  ...  Though effective with traditional models like GMM-HMM, VTLN has no significant impact on TDNN-F.  ... 
doi:10.21437/interspeech.2019-2980 dblp:conf/interspeech/WuGPK19 fatcat:25dc3axnq5gvhhy7s6evocawzm

Automated Extraction of Vulnerability Information for Home Computer Security [chapter]

Sachini Weerawardhana, Subhojeet Mukherjee, Indrajit Ray, Adele Howe
2015 Lecture Notes in Computer Science  
We discuss design considerations that should be taken into account in implementing information retrieval systems for security domain.  ...  These two systems are evaluated to compare accuracy in recognizing security concepts in previously unseen vulnerability description texts.  ...  Performance was calculated based only on the labels in the intersection: software, operating system, file name, NER-modifier, and consequence/impact.  ... 
doi:10.1007/978-3-319-17040-4_24 fatcat:fugaasniarcivd62rp2yjhx6fq

Corpus Size and Composition: Evidence from the Inflectional Morphology of Nouns in Old English and Old Frisian

2014 Amsterdamer Beiträge zur älteren Germanistik  
In order to estimate the impact of genre differences between Old English and Old Frisian corpora on the interpretation of the data, a sub-corpus of Old English legal texts was selected.  ...  Therefore, clear effects can be expected on the number of attested lemmas in the analysed corpora.  ...  (c) Italics with an asterisk refer to a reconstructed form based on the modern dialects of Frisian. CLASS  ... 
doi:10.1163/9789401211918_021 fatcat:igpj446pl5efjk4mfu4mpkiqwi

Using of heterogeneous corpora for training of an ASR system [article]

Jan Trmal, Gaurav Kumar, Vimal Manohar, Sanjeev Khudanpur, Matt Post, Paul McNamee
2017 arXiv   pre-print
The paper summarizes the development of the LVCSR system built as a part of the Pashto speech-translation system at the SCALE (Summer Camp for Applied Language Exploration) 2015 workshop on "Speech-to-text-translation  ...  This paper concentrates only on the LVCSR part and presents a range of different techniques that were found to be useful in order to benefit from multiple different corpora  ...  This represents one iteration of the joint-multi-corpora training.  ... 
arXiv:1706.00321v1 fatcat:ig56sf4bvjbebej3r7rl4qibwe