Adaptation of AAC to the Context Communication: A Real Improvement for the User Illustration through the VITIPI Word Completion
[chapter]
2012
Lecture Notes in Computer Science
This paper describes the performance of the VITIPI word completion system through a text input simulation. ...
The aim of this simulation is to estimate the impact of the linguistic knowledge base size through two metrics: the Key-Stroke Ratio (KSR) and the KeyStroke Per Character (KPC). ...
The same effect is observed in Fig. 3 (after 16 sub-corpora are considered, the KSR is around 44.5). ...
doi:10.1007/978-3-642-31534-3_67
fatcat:psusovqq2necpkhyaasfafmgqm
PJAIT Systems for the WMT 2016
2016
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers
Our results indicate that our approach produced a positive impact on SMT quality. ...
To evaluate the effects of different preparations on translation results, we conducted experiments and used the BLEU, NIST and TER metrics. ...
SMT systems are more accurate on corpora from a domain that is not too wide. ...
doi:10.18653/v1/w16-2328
dblp:conf/wmt/WolkM16
fatcat:rwsizz5u4rbl5h565ryi7uovda
Extracting parallel sub-sentential fragments from non-parallel corpora
2006
Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the ACL - ACL '06
We present a novel method for extracting parallel sub-sentential fragments from comparable, non-parallel bilingual corpora. ...
We evaluate the quality of the extracted data by showing that it improves the performance of a state-of-the-art statistical machine translation system. ... and the English one be the target ...
SMT Performance Results: We evaluate our extracted corpora by measuring their impact on the performance of an SMT system. ...
doi:10.3115/1220175.1220186
dblp:conf/acl/MunteanuM06
fatcat:par2gtjcqzc2rmggbx6zpqj72e
Discourse on climate and energy justice: a comparative study of Do It Yourself and Bootstrapped corpora
2018
Corpus
system. ...
statistical tests (Log Likelihood feature in AntConc, UCREL Semantic Analysis System in Wmatrix) rather than raw counts. ...
doi:10.4000/corpus.3376
fatcat:ms7xjfo62zawhaofloukr4v2de
PJIIT's systems for WMT 2017 Conference
2017
Proceedings of the Second Conference on Machine Translation
Our results indicate that our approach produced a positive impact on SMT quality. ...
To evaluate the effects of different preparations on translation results, we conducted experiments and used the BLEU, NIST and TER metrics. ...
SMT systems are more accurate on corpora from a domain that is not too wide. ...
doi:10.18653/v1/w17-4743
dblp:conf/wmt/WolkM17
fatcat:ebotrgzcy5bvzdesdsiti3mnly
An Efficient Framework to Extract Parallel Units from Comparable Data
[chapter]
2013
Communications in Computer and Information Science
Experimental results on SMT show that the baseline SMT system can achieve significant improvement by adding this extra mined knowledge. ...
At the sentential level, we treat parallel sentence identification as a classification problem and extract more representative and effective features. ...
fragments, and (3) adding both the extracted parallel sentences and sub-sentential fragments to the original corpora and then evaluating the impact on an end-to-end SMT system. ...
doi:10.1007/978-3-642-41644-6_15
fatcat:careputcljc7hesquxqwtkou2e
Hierarchical Ontology Graph for Solving Semantic Issues in Decision Support Systems
2019
Proceedings of the 21st International Conference on Enterprise Information Systems
The study of selecting the appropriate corpora is intended to improve the data asset management of enterprises. ...
However, for answering questions with complex logic, AI systems are still at an early stage. ...
The workload of obtaining a semantically rich annotated corpus is manageable, which is a crucial factor in the computing result. ...
doi:10.5220/0007769904830487
dblp:conf/iceis/GuoL19
fatcat:ynexyh7qcbffdkoptmm7kbufpu
Selecting Parallel In-domain Sentences for Neural Machine Translation Using Monolingual Texts
[article]
2022
arXiv
pre-print
Our experimental results show that models trained on this in-domain data outperform models trained on generic or a mixture of generic and domain data. ...
Our work addresses this gap with a method for selecting in-domain data from generic-domain (parallel text) corpora, for the task of machine translation. ...
According to TER, system top1 achieved the highest score among all systems trained on mixed sub-corpora. This would imply that this system would require the most post-editing effort. ...
arXiv:2112.06096v3
fatcat:34ggtx62e5cybiprr2ioaxnxb4
PJAIT Systems for the IWSLT 2015 Evaluation Campaign Enhanced by Comparable Corpora
[article]
2015
arXiv
pre-print
Our results indicate that our approach produced a positive impact on SMT quality. ...
To evaluate the effects of different preparations on translation results, we conducted experiments and used the BLEU, NIST and TER metrics. ...
The results show a positive impact of our approach on SMT quality across the language pairs. ...
arXiv:1512.01639v1
fatcat:ldzt4lq3sjf2hpr2wgstiuf2xu
Visualising COVID-19 Research
[article]
2020
arXiv
pre-print
We apply this method on two recently released publications datasets (Dimensions' COVID-19 dataset and the Allen Institute for AI's CORD-19). ...
The results also demonstrate the need to quickly and automatically enable search and browsing of large corpora. ...
ACKNOWLEDGEMENTS This work was funded by the ORCA Hub (EPSRC grant: EP/R026173/1, website: orcahub.org) and the Exploiting Impact Using a Modular Decision-Making Toolset project (EPSRC Impact Acceleration ...
arXiv:2005.06380v2
fatcat:fc72qsogt5ft3fl5zzqv7ialr4
Comparison of Different Orthographies for Machine Translation of Under-Resourced Dravidian Languages
2019
International Conference on Language, Data, and Knowledge
We performed experiments on the English-Tamil, English-Telugu, and English-Kannada translation tasks. ...
Moreover, by analyzing the METEOR and chrF scores, we note that systems based on the Latin script using sub-word segmented corpora effectively reduce translation errors. ...
We study the effect of different orthographies on NMT and show that coarse-grained transcription to Latin script outperforms the more fine-grained IPA and native script in a multilingual NMT system. ...
doi:10.4230/oasics.ldk.2019.6
dblp:conf/ldk/ChakravarthiAM19
fatcat:krlq7ashabcxxkyrkwr2labwwa
Advances in Automatic Speech Recognition for Child Speech Using Factored Time Delay Neural Network
2019
Interspeech 2019
Compared with conventional GMM-HMM and TDNN systems, TDNN-F does better on two widely accessible corpora: CMU Kids and CSLU Kids, and on the combination of these two. ...
Our system achieves a 26% relative improvement in WER. ...
Though effective with traditional models like GMM-HMM, VTLN has no significant impact on TDNN-F. ...
doi:10.21437/interspeech.2019-2980
dblp:conf/interspeech/WuGPK19
fatcat:25dc3axnq5gvhhy7s6evocawzm
Automated Extraction of Vulnerability Information for Home Computer Security
[chapter]
2015
Lecture Notes in Computer Science
We discuss design considerations that should be taken into account when implementing information retrieval systems for the security domain. ...
These two systems are evaluated to compare accuracy in recognizing security concepts in previously unseen vulnerability description texts. ...
Performance was calculated based only on the labels in the intersection: software, operating system, file name, NER-modifier, and consequence/impact. ...
doi:10.1007/978-3-319-17040-4_24
fatcat:fugaasniarcivd62rp2yjhx6fq
Corpus Size and Composition: Evidence from the Inflectional Morphology of Nouns in Old English and Old Frisian
2014
Amsterdamer Beiträge zur älteren Germanistik
In order to estimate the impact of genre differences between Old English and Old Frisian corpora on the interpretation of the data, a sub-corpus of Old English legal texts was selected. ...
Therefore, clear effects can be expected on the number of attested lemmas in the analysed corpora. ...
(c) Italics with an asterisk refer to a reconstructed form based on the modern dialects of Frisian. ...
doi:10.1163/9789401211918_021
fatcat:igpj446pl5efjk4mfu4mpkiqwi
Using of heterogeneous corpora for training of an ASR system
[article]
2017
arXiv
pre-print
The paper summarizes the development of the LVCSR system built as a part of the Pashto speech-translation system at the SCALE (Summer Camp for Applied Language Exploration) 2015 workshop on "Speech-to-text-translation ...
This paper concentrates only on the LVCSR part and presents a range of different techniques that were found to be useful in order to benefit from multiple different corpora ...
This represents one iteration of the joint-multi-corpora training. ...
arXiv:1706.00321v1
fatcat:ig56sf4bvjbebej3r7rl4qibwe