Multilingual Document Classification via Transductive Learning
2015
Italian Information Retrieval Workshop
We present a transductive learning based framework for multilingual document classification, originally proposed in [7]. ...
setting for multilingual document classification. ...
Motivated by the above considerations, we present a framework for multilingual document classification under a transductive learning setting, originally proposed in [7]. ...
dblp:conf/iir/RomeoIT15
fatcat:zt6re6bz2jejphavzvnxumftty
Knowledge-Based Representation for Transductive Multilingual Document Classification
[chapter]
2015
Lecture Notes in Computer Science
To overcome such issues we propose a new framework for multilingual document classification under a transductive learning setting. ...
, and the robustness of the transductive setting for multilingual document classification. ...
To the best of our knowledge, we bring for the first time a transductive learning approach to multilingual document classification. ...
doi:10.1007/978-3-319-16354-3_11
fatcat:2ysczrdjdjg4fpxnfxivsj5yjq
Cross-lingual Text Classification via Model Translation with Limited Dictionaries
2016
Proceedings of the 25th ACM International on Conference on Information and Knowledge Management - CIKM '16
An open challenge in CLTC is to classify documents for the languages where labeled training data are not available. ...
Cross-lingual text classification (CLTC) refers to the task of classifying documents in different languages into the same taxonomy of categories. ...
They exploited the readily available unlabeled data in the target language via semi-supervised learning. ...
doi:10.1145/2983323.2983732
dblp:conf/cikm/XuYLH16
fatcat:shnj4lh6y5hetbzj5xrrgczm44
TextServer: Cloud-Based Multilingual Natural Language Processing
2015
2015 IEEE International Conference on Data Mining Workshop (ICDMW)
Concretely, it is based on graph partitioning via constraint relaxation labeling, where constraints were automatically learned. ...
Named entity recognition and classification (NERC) This service recognizes names of persons, locations and organizations mentioned in a document. ...
doi:10.1109/icdmw.2015.102
dblp:conf/icdm/PadroT15
fatcat:k4a2xf4uyvhgboi3l2zmebkm54
Cross-lingual Inductive Transfer to Detect Offensive Language
[article]
2020
arXiv
pre-print
Further experimentation proves that our model works competitively in a zero-shot learning environment, and is extensible to other languages. ...
In OffensEval 2020, the organizers have released the multilingual Offensive Language Identification Dataset (mOLID), which contains tweets in five different languages, to detect offensive language. ...
It can be classified into multiple types, like transductive transfer learning and inductive transfer learning. ...
arXiv:2007.03771v1
fatcat:33ysykzigvfnhbbdvvhgkhbaty
Joint Bilingual Sentiment Classification with Unlabeled Parallel Corpora
2011
Annual Meeting of the Association for Computational Linguistics
Most previous work on multilingual sentiment analysis has focused on methods to adapt sentiment resources from resource-rich languages to resource-poor languages. ...
We present a novel approach for joint bilingual sentiment classification at the sentence level that augments available labeled data in each language with unlabeled parallel data. ...
We compare to co-training and transductive SVMs in Section 5. Multilingual NLP for Other Tasks. ...
dblp:conf/acl/LuTCT11
fatcat:wjzdoox5fbb6npnvzboiwomowi
Semi-Supervised Representation Learning for Cross-Lingual Text Classification
2013
Conference on Empirical Methods in Natural Language Processing
In this paper, we propose a new cross-lingual adaptation approach for document classification based on learning cross-lingual discriminative distributed representations of words. ...
documents. ...
Guo and Xiao (2012a) developed a transductive subspace representation learning method for crosslingual text classification based on non-negative matrix factorization. ...
dblp:conf/emnlp/XiaoG13
fatcat:lyoryleoyrcgbjdgsme4u3yswq
Cross-domain Feature Selection for Language Identification
2011
International Joint Conference on Natural Language Processing
Our results demonstrate that our method provides improvements in transductive transfer learning for language identification. ...
We show that transductive (cross-domain) learning is an important consideration in building a general-purpose language identification system, and develop a feature selection method that generalizes across ...
(2) base the classification only on the first 500 bytes of the document. ...
dblp:conf/ijcnlp/LuiB11
fatcat:gnftr2csbvf35enj5rlaypkpcu
Generalized Funnelling: Ensemble Learning and Heterogeneous Document Embeddings for Cross-Lingual Text Classification
[article]
2022
arXiv
pre-print
Funnelling (Fun) is a recently proposed method for cross-lingual text classification (CLTC) based on a two-tier learning ensemble for heterogeneous transfer learning (HTL). ...
text classification. ...
[60] demonstrates the effectiveness of meta-learning approaches to crosslingual text classification. ...
arXiv:2110.14764v2
fatcat:4mm6n6crtvgf3ktwmizllw77ci
Low-resource Languages: A Review of Past Work and Future Challenges
[article]
2020
arXiv
pre-print
Document-level alignment Aligning corpora at the document-level is useful for text classification, translation or multilingual representations. ...
Multilingual Learning Multilingual Learning extends the transfer learning techniques in multilingual environments. ...
Yoda system for wmt16 shared task: Bilingual document alignment. In Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers, pages 679-684. ...
arXiv:2006.07264v1
fatcat:mx2vyj6j3vhxplezy2fclffud4
Transductive Auxiliary Task Self-Training for Neural Multi-Task Models
2019
Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)
Multi-task learning and self-training are two common ways to improve a machine learning model's performance in settings with limited training data. ...
Drawing heavily on ideas from those two approaches, we suggest transductive auxiliary task self-training: training a multi-task model on (i) a combination of main and auxiliary task training data, and ...
Another option is curriculum learning, where selection is based on learning difficulty, increasing the difficulty during learning (Bengio et al., 2009). ...
doi:10.18653/v1/d19-6128
dblp:conf/acl-deeplo/BjervaKA19
fatcat:b2wukkawjbecdh5bxqshs5ah3y
Transductive Auxiliary Task Self-Training for Neural Multi-Task Models
[article]
2019
arXiv
pre-print
Multi-task learning and self-training are two common ways to improve a machine learning model's performance in settings with limited training data. ...
Drawing heavily on ideas from those two approaches, we suggest transductive auxiliary task self-training: training a multi-task model on (i) a combination of main and auxiliary task training data, and ...
Another option is curriculum learning, where selection is based on learning difficulty, increasing the difficulty during learning (Bengio et al., 2009). ...
arXiv:1908.06136v2
fatcat:cljvlieoyvetzbkgmfimoliika
Multilingual spoken language processing
2008
IEEE Signal Processing Magazine
MULTILINGUAL SPEECH SUMMARIZATION Spoken document summarization is the recognition, distillation, and presentation of spoken documents in a structural (and mostly) textual form. ...
For example, multilingual emotional speech identification and classification is only at a nascent stage, where large databases of emotional speech are being collected in different languages. ...
doi:10.1109/msp.2008.918417
fatcat:ezye4rngebdpphtis3szqdhvce
Differentiable Allophone Graphs for Language-Universal Speech Recognition
[article]
2021
arXiv
pre-print
These phone-based systems with learned allophone graphs can be used by linguists to document new languages, build phone-based lexicons that capture rich pronunciation variations, and re-evaluate the allophone ...
By training multilingually, we build a universal phone-based speech recognition model with interpretable probabilistic phone-to-phoneme mappings for each language. ...
Multilingual sharing between diverse languages is required to properly learn phonetic distinctions. ...
arXiv:2107.11628v1
fatcat:y64yvn46ijbkvhnh2nlizbjbiu
TwiSE at SemEval-2016 Task 4: Twitter Sentiment Classification
2016
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)
Extrinsic evaluation: text classification. Document classification is a supervised learning task where a document is associated with one or more categories from a pool of M categories Y = {y_1, ..., ...
as inputs to a document classification task. ...
Note that unlike semi-supervised and transductive learning that make use of the unlabeled data in the training process to improve the performance, we use the unlabeled data for hyper-parameter selection ...
doi:10.18653/v1/s16-1010
dblp:conf/semeval/BalikasA16
fatcat:w7o56n5ny5hkjgtnqghp2sdeua