
Multilingual Document Classification via Transductive Learning

Salvatore Romeo, Dino Ienco, Andrea Tagarelli
2015 Italian Information Retrieval Workshop  
We present a transductive learning based framework for multilingual document classification, originally proposed in [7].  ...  setting for multilingual document classification.  ...  Motivated by the above considerations, we present a framework for multilingual document classification under a transductive learning setting, originally proposed in [7].  ... 
dblp:conf/iir/RomeoIT15 fatcat:zt6re6bz2jejphavzvnxumftty

Knowledge-Based Representation for Transductive Multilingual Document Classification [chapter]

Salvatore Romeo, Dino Ienco, Andrea Tagarelli
2015 Lecture Notes in Computer Science  
To overcome such issues we propose a new framework for multilingual document classification under a transductive learning setting.  ...  , and the robustness of the transductive setting for multilingual document classification.  ...  To the best of our knowledge, we bring for the first time a transductive learning approach to a multilingual document classification.  ... 
doi:10.1007/978-3-319-16354-3_11 fatcat:2ysczrdjdjg4fpxnfxivsj5yjq

Cross-lingual Text Classification via Model Translation with Limited Dictionaries

Ruochen Xu, Yiming Yang, Hanxiao Liu, Andrew Hsi
2016 Proceedings of the 25th ACM International on Conference on Information and Knowledge Management - CIKM '16  
An open challenge in CLTC is to classify documents for the languages where labeled training data are not available.  ...  Cross-lingual text classification (CLTC) refers to the task of classifying documents in different languages into the same taxonomy of categories.  ...  They exploited the readily available unlabeled data in the target language via semi-supervised learning.  ... 
doi:10.1145/2983323.2983732 dblp:conf/cikm/XuYLH16 fatcat:shnj4lh6y5hetbzj5xrrgczm44

TextServer: Cloud-Based Multilingual Natural Language Processing

Lluis Padro, Jordi Turmo
2015 2015 IEEE International Conference on Data Mining Workshop (ICDMW)  
Concretely, it is based on graph partitioning via constraint relaxation labeling, where constraints were automatically learned.  ...  Named entity recognition and classification (NERC) This service recognizes names of persons, locations and organizations mentioned in a document.  ... 
doi:10.1109/icdmw.2015.102 dblp:conf/icdm/PadroT15 fatcat:k4a2xf4uyvhgboi3l2zmebkm54

Cross-lingual Inductive Transfer to Detect Offensive Language [article]

Kartikey Pant, Tanvi Dadu
2020 arXiv   pre-print
Further experimentation proves that our model works competitively in a zero-shot learning environment, and is extensible to other languages.  ...  In OffensEval 2020, the organizers have released the multilingual Offensive Language Identification Dataset (mOLID), which contains tweets in five different languages, to detect offensive language.  ...  It can be classified into multiple types, like transductive transfer learning and inductive transfer learning.  ... 
arXiv:2007.03771v1 fatcat:33ysykzigvfnhbbdvvhgkhbaty

Joint Bilingual Sentiment Classification with Unlabeled Parallel Corpora

Bin Lu, Chenhao Tan, Claire Cardie, Benjamin K. Tsou
2011 Annual Meeting of the Association for Computational Linguistics  
Most previous work on multilingual sentiment analysis has focused on methods to adapt sentiment resources from resource-rich languages to resource-poor languages.  ...  We present a novel approach for joint bilingual sentiment classification at the sentence level that augments available labeled data in each language with unlabeled parallel data.  ...  We compare to co-training and transductive SVMs in Section 5. Multilingual NLP for Other Tasks.  ... 
dblp:conf/acl/LuTCT11 fatcat:wjzdoox5fbb6npnvzboiwomowi

Semi-Supervised Representation Learning for Cross-Lingual Text Classification

Min Xiao, Yuhong Guo
2013 Conference on Empirical Methods in Natural Language Processing  
In this paper, we propose a new cross-lingual adaptation approach for document classification based on learning cross-lingual discriminative distributed representations of words.  ...  documents.  ...  Guo and Xiao (2012a) developed a transductive subspace representation learning method for cross-lingual text classification based on non-negative matrix factorization.  ... 
dblp:conf/emnlp/XiaoG13 fatcat:lyoryleoyrcgbjdgsme4u3yswq

Cross-domain Feature Selection for Language Identification

Marco Lui, Timothy Baldwin
2011 International Joint Conference on Natural Language Processing  
Our results demonstrate that our method provides improvements in transductive transfer learning for language identification.  ...  We show that transductive (cross-domain) learning is an important consideration in building a general-purpose language identification system, and develop a feature selection method that generalizes across  ...  (2) base the classification only on the first 500 bytes of the document.  ... 
dblp:conf/ijcnlp/LuiB11 fatcat:gnftr2csbvf35enj5rlaypkpcu

Generalized Funnelling: Ensemble Learning and Heterogeneous Document Embeddings for Cross-Lingual Text Classification [article]

Alejandro Moreo, Andrea Pedrotti, Fabrizio Sebastiani
2022 arXiv   pre-print
Funnelling (Fun) is a recently proposed method for cross-lingual text classification (CLTC) based on a two-tier learning ensemble for heterogeneous transfer learning (HTL).  ...  text classification.  ...  [60] demonstrates the effectiveness of meta-learning approaches to cross-lingual text classification.  ... 
arXiv:2110.14764v2 fatcat:4mm6n6crtvgf3ktwmizllw77ci

Low-resource Languages: A Review of Past Work and Future Challenges [article]

Alexandre Magueresse, Vincent Carles, Evan Heetderks
2020 arXiv   pre-print
Document-level alignment Aligning corpora at the document-level is useful for text classification, translation or multilingual representations.  ...  Multilingual Learning Multilingual Learning extends the transfer learning techniques in multilingual environments.  ...  Yoda system for wmt16 shared task: Bilingual document alignment. In Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers, pages 679-684.  ... 
arXiv:2006.07264v1 fatcat:mx2vyj6j3vhxplezy2fclffud4

Transductive Auxiliary Task Self-Training for Neural Multi-Task Models

Johannes Bjerva, Katharina Kann, Isabelle Augenstein
2019 Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)  
Multi-task learning and self-training are two common ways to improve a machine learning model's performance in settings with limited training data.  ...  Drawing heavily on ideas from those two approaches, we suggest transductive auxiliary task self-training: training a multi-task model on (i) a combination of main and auxiliary task training data, and  ...  Another option is curriculum learning, where selection is based on learning difficulty, increasing the difficulty during learning (Bengio et al., 2009).  ... 
doi:10.18653/v1/d19-6128 dblp:conf/acl-deeplo/BjervaKA19 fatcat:b2wukkawjbecdh5bxqshs5ah3y

Transductive Auxiliary Task Self-Training for Neural Multi-Task Models [article]

Johannes Bjerva, Katharina Kann, Isabelle Augenstein
2019 arXiv   pre-print
Multi-task learning and self-training are two common ways to improve a machine learning model's performance in settings with limited training data.  ...  Drawing heavily on ideas from those two approaches, we suggest transductive auxiliary task self-training: training a multi-task model on (i) a combination of main and auxiliary task training data, and  ...  Another option is curriculum learning, where selection is based on learning difficulty, increasing the difficulty during learning (Bengio et al., 2009).  ... 
arXiv:1908.06136v2 fatcat:cljvlieoyvetzbkgmfimoliika

Multilingual spoken language processing

P. Fung, T. Schultz
2008 IEEE Signal Processing Magazine  
MULTILINGUAL SPEECH SUMMARIZATION Spoken document summarization is the recognition, distillation, and presentation of spoken documents in a structural (and mostly) textual form.  ...  For example, multilingual emotional speech identification and classification is only at a nascent stage, where large databases of emotional speech are being collected in different languages.  ... 
doi:10.1109/msp.2008.918417 fatcat:ezye4rngebdpphtis3szqdhvce

Differentiable Allophone Graphs for Language-Universal Speech Recognition [article]

Brian Yan, Siddharth Dalmia, David R. Mortensen, Florian Metze, Shinji Watanabe
2021 arXiv   pre-print
These phone-based systems with learned allophone graphs can be used by linguists to document new languages, build phone-based lexicons that capture rich pronunciation variations, and re-evaluate the allophone  ...  By training multilingually, we build a universal phone-based speech recognition model with interpretable probabilistic phone-to-phoneme mappings for each language.  ...  Multilingual sharing between diverse languages is required to properly learn phonetic distinctions.  ... 
arXiv:2107.11628v1 fatcat:y64yvn46ijbkvhnh2nlizbjbiu

TwiSE at SemEval-2016 Task 4: Twitter Sentiment Classification

Georgios Balikas, Massih-Reza Amini
2016 Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)  
Extrinsic evaluation: text classification Document classification is a supervised learning task where a document is associated with one or more categories from a pool of M categories Y = {y_1, ...,  ...  as inputs to a document classification task.  ...  Note that unlike semi-supervised and transductive learning that make use of the unlabeled data in the training process to improve the performance, we use the unlabeled data for hyper-parameter selection  ... 
doi:10.18653/v1/s16-1010 dblp:conf/semeval/BalikasA16 fatcat:w7o56n5ny5hkjgtnqghp2sdeua