Bilingual Chunk Alignment Based on Interactional Matching and Probabilistic Latent Semantic Indexing
[chapter]
2005
Lecture Notes in Computer Science
An integrated method for bilingual chunk partition and alignment, called "Interactional Matching", is proposed in this paper. ...
Furthermore, with the technology of Probabilistic Latent Semantic Indexing (PLSI), this method can deal with not only compositional chunks, but also non-compositional ones. ...
The method involves two key technologies, namely Interactional Matching and Probabilistic Latent Semantic Indexing (PLSI). ...
doi:10.1007/978-3-540-30211-7_44
fatcat:7koilevhdjbthmf6qbeux3q3ca
Getting Past the Language Gap: Innovations in Machine Translation
[chapter]
2012
Mobile Speech and Advanced Natural Language Solutions
There are many possible ways of segmenting and translating phrases: this is done on a probabilistic basis, and the probability distribution of the collected phrase pairs is usually based on their relative ...
Then section "Knowledge-Based MT Systems" will introduce knowledge, semantically-based systems. ...
Costa-jussà (2011) proposed and evaluated an approach that uses a semantic feature for statistical machine translation, based on Latent Semantic Indexing. ...
doi:10.1007/978-1-4614-6018-3_6
fatcat:2njkc6meabhaxosl4wircumfjm
Multilingual Part-of-Speech Tagging: Two Unsupervised Approaches
2009
The Journal of Artificial Intelligence Research
model which instead incorporates multilingual context using latent variables. ...
We consider two ways of applying this intuition to the problem of unsupervised part-of-speech tagging: a model that directly merges tag structures for a pair of languages into a single sequence and a second ...
Any opinions, findings, and conclusions or recommendations expressed above are those of the authors and do not necessarily reflect the views of the NSF. ...
doi:10.1613/jair.2843
fatcat:vwi2yze4endgngwkvtybiczta4
Message from the general chair
2015
2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)
Our system gives a better performance than all the learning-based systems from the CoNLL-2011 shared task on the same dataset. ...
To maximize the utility of the injected knowledge, we deploy a learning-based multi-sieve approach and develop novel entity-based features. ...
rule extraction by deleting spurious word alignment links and adding new valuable links based on bilingual translation span correspondences. ...
doi:10.1109/ispass.2015.7095776
dblp:conf/ispass/Lee15
fatcat:ehbed6nl6barfgs6pzwcvwxria
Analyzing Non-Textual Content Elements to Detect Academic Plagiarism
2021
Zenodo
Detection approaches proposed so far analyze lexical, syntactical, and semantic text similarity. These approaches find copied, moderately reworded, and literally translated text. ...
To demonstrate the benefit of combining non-textual and text-based detection methods, the thesis describes the first plagiarism detection system that integrates th [...] ...
chunked, considering shared citations depending on the prior chunk, no merging step. Text-based detection methods: Encoplot: exact character 16-gram string matching; Sherlock: probabilistic ...
doi:10.5281/zenodo.4913344
fatcat:xmpaahvwuva53l5l5i2gaidvi4
Statistical machine translation enhancements through linguistic levels
2014
ACM Computing Surveys
One of the most popular approaches is the Statistical Machine Translation (SMT) approach, which tries to cover translation in a holistic manner by learning from parallel corpus aligned at the sentence ...
However, with this basic approach, there are some issues at each written linguistic level (i.e., orthographic, morphological, lexical, syntactic and semantic) that remain unsolved. ...
The authors exploit both the standard vector-space model [Salton and McGill 1983] and latent semantic indexing [Landauer et al. 1998]. ...
doi:10.1145/2518130
fatcat:cy6cud32tjgvjjsgiiv5aj65zi
Association for Computational Linguistics
[chapter]
2006
Encyclopedia of Language & Linguistics
The lecturers have been invited to write papers on all aspects of computational approaches to Natural Language Processing. The papers received have been revised and prepared to compose this issue. ...
We thank all lecturers and participants who have contributed and made this publication possible. ...
(Calvo 2013) presents different categorical approaches based on Vector Space Model (VSM) with three dimensionality reduction techniques: Latent Semantic Analysis (LSA), Probabilistic Latent Semantic Analysis ...
doi:10.1016/b0-08-044854-2/05234-2
fatcat:bbncnskzhvhxtbfdk5ftli7gva
Embedding Web-based Statistical Translation Models in Cross-Language Information Retrieval
[article]
2003
arXiv
pre-print
Our experiments on standard test collections for CLIR show that the Web-based translation models can surpass commercial MT systems in CLIR tasks. ...
on a bag-of-words. ...
Finally, we want to thank Elliott Macklovitch and the two anonymous reviewers for their constructive comments and careful review. ...
arXiv:cs/0312008v1
fatcat:hztoxce3frcgpbsmegftpg4rdu
Embedding Web-Based Statistical Translation Models in Cross-Language Information Retrieval
2003
Computational Linguistics
Our experiments on standard test collections for CLIR show that the Web-based translation models can surpass commercial MT systems in CLIR tasks. ...
on a bag of words. ...
Finally, we want to thank Elliott Macklovitch and the two anonymous reviewers for their constructive comments and careful review. ...
doi:10.1162/089120103322711587
fatcat:dkxidh7b3vdszodokvwhjd4nre
Pre-training Methods in Information Retrieval
[article]
2022
arXiv
pre-print
Moreover, we discuss some open challenges and highlight several promising directions, with the hope of inspiring and facilitating more works on these topics for future research. ...
In addition, we also introduce PTMs specifically designed for IR, and summarize available datasets as well as benchmark leaderboards. ...
Hybrid Retrieval Models Sparse retrieval models take a (latent) word as the unit of representations, which can calculate the matching score based on exact matching signals. ...
arXiv:2111.13853v3
fatcat:pilemnpphrgv5ksaktvctqdi4y
Video Description: A Survey of Methods, Datasets and Evaluation Metrics
[article]
2019
arXiv
pre-print
It has applications in human-robot interaction, helping the visually impaired and video subtitling. ...
Classical video description approaches combined subject, object and verb detection with template based language models to generate sentences. ...
The research was supported by ARC Discovery Grant DP160101458 and DP150102405. ...
arXiv:1806.00186v3
fatcat:elxztcpzizhr7clugnbjvvrpte
A Survey on Event Extraction for Natural Language Understanding: Riding the Biomedical Literature Wave
2021
IEEE Access
INDEX TERMS Biomedical text mining, event extraction, natural language understanding, semantic parsing. ...
Events can model complex interactions involving multiple participants having a specific semantic role, also handling nested and overlapping definitions. ...
ACKNOWLEDGMENT The authors thank Giulio Carlassare for his contributions during productive discussions and practical experiments on biomedical corpora. ...
doi:10.1109/access.2021.3130956
fatcat:wlr7zeikdva77ojuppqx3vmocy
Multi-word Entity Classification in a Highly Multilingual Environment
2017
Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017)
The program also included a panel discussion on the future directions of the MWE community and the SIGLEX Section. ...
We also want to thank the IC1207 COST Action PARSEME and SIGLEX for their endorsement and support, as well as the EACL 2017 organizers. ...
In addition, we would like to thank Lauren Rudat for her suggestions on improving the stimuli, and to the anonymous reviewers for their suggestions on improving the paper. ...
doi:10.18653/v1/w17-1702
dblp:conf/mwe/ChesneyJSP17
fatcat:bv7aavgth5eurmzuphuowtuuhq
Automatic Extraction of Property Norm-Like Data From Large Text Corpora
2013
Cognitive Science
similarity evaluation and a WordNet semantic similarity comparison. ...
Traditional methods for deriving property-based representations of concepts from text have focused on extracting unspecified relationships (e.g., car–petrol) or only a subset of possible relation types ...
Unsupervised techniques have found applications in many parts of NLP (e.g., grammar induction, word-alignment for bilingual translation) and do not suffer from the same limits on data resources; however ...
doi:10.1111/cogs.12091
pmid:25019134
fatcat:s4fboxw6szhcdl5znujtxffiru
Neural machine translation: A review of methods, resources, and tools
2020
AI Open
In this article, we first provide a broad review of the methods for NMT and focus on methods relating to architectures, decoding, and data augmentation. ...
In recent years, end-to-end neural machine translation (NMT) has achieved great success and has become the new mainstream method in practical MT systems. ...
Program of China (No. 2017YFB0202204), National Natural Science Foundation of China (No. 61925601, No. 61761166008, No. 61772302), Beijing Academy of Artificial Intelligence, Huawei Noah's Ark Lab, and ...
doi:10.1016/j.aiopen.2020.11.001
fatcat:wkplwv43knb3lebicckmwbxlwu