70 Hits in 8.4 sec

When does MAML Work the Best? An Empirical Study on Model-Agnostic Meta-Learning in NLP Applications [article]

Zequn Liu, Ruiyi Zhang, Yiping Song, Ming Zhang
2020 arXiv   pre-print
In this paper, we conduct an empirical study to investigate these impacting factors and conclude when MAML works best based on the experimental results.  ...  Model-Agnostic Meta-Learning (MAML) is successfully employed in NLP applications including few-shot text classification and multi-domain low-resource language generation  ...  Conclusion: In this paper, we conduct an empirical study to investigate the factors that affect the performance of MAML in NLP applications. (A reference statement of the MAML objective follows this entry.)  ... 
arXiv:2005.11700v1 fatcat:gammq2ryjref7gr3cioore3pm4
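
Several of the entries below build on the MAML objective of Finn et al. (2017). As a reading aid, here is a standard restatement of it (ours, shown with a single inner gradient step of size α over a task distribution p(T)):

\min_\theta \; \mathbb{E}_{\mathcal{T} \sim p(\mathcal{T})} \Big[ \mathcal{L}_{\mathcal{T}}\big(\theta - \alpha \nabla_\theta \mathcal{L}_{\mathcal{T}}(\theta)\big) \Big]

The outer minimization is over the initialization θ; the inner gradient step simulates task-specific fine-tuning, so θ is pushed toward a point from which one step of adaptation already performs well on each task.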

Sign-MAML: Efficient Model-Agnostic Meta-Learning by SignSGD [article]

Chen Fan, Parikshit Ram, Sijia Liu
2021 arXiv   pre-print
We propose a new computationally efficient first-order algorithm for Model-Agnostic Meta-Learning (MAML).  ...  We show that MAML, viewed through the lens of signSGD-oriented bilevel optimization (BLO), naturally yields an alternating optimization scheme that requires only first-order gradients of a learned meta-model (a sketch follows this entry).  ...  When does MAML work the best?  ... 
arXiv:2109.07497v2 fatcat:drhcpwjhdbgh5abf5cq3tqjvqy
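
To make the "signSGD-oriented BLO" reading concrete, here is a minimal sketch of a meta-update in that spirit. It is our simplification, not the paper's exact algorithm: the inner loop adapts a per-task copy of the parameters with signSGD (only the sign of the support-set gradient), and the outer loop applies plain first-order gradients of the query loss taken at the adapted weights. The functional forward/loss_fn model and all names are ours.

import torch

def forward(params, x):                          # tiny functional model: one linear layer
    w, b = params
    return x @ w + b

def loss_fn(params, x, y):
    return torch.nn.functional.mse_loss(forward(params, x), y)

def sign_maml_step(params, tasks, inner_lr=1e-2, outer_lr=1e-3):
    meta_grads = [torch.zeros_like(p) for p in params]
    for xs, ys, xq, yq in tasks:                 # per-task support/query split
        # Inner step: signSGD keeps only the sign of the support gradient.
        g = torch.autograd.grad(loss_fn(params, xs, ys), params)
        adapted = [(p - inner_lr * gi.sign()).detach().requires_grad_(True)
                   for p, gi in zip(params, g)]
        # Outer step: first-order gradient of the query loss at the adapted weights.
        gq = torch.autograd.grad(loss_fn(adapted, xq, yq), adapted)
        meta_grads = [m + gi for m, gi in zip(meta_grads, gq)]
    return [(p - outer_lr * m / len(tasks)).detach().requires_grad_(True)
            for p, m in zip(params, meta_grads)]

# Toy usage: two regression tasks, each a (support_x, support_y, query_x, query_y) tuple.
params = [torch.randn(3, 1, requires_grad=True), torch.zeros(1, requires_grad=True)]
tasks = [(torch.randn(8, 3), torch.randn(8, 1), torch.randn(8, 3), torch.randn(8, 1))
         for _ in range(2)]
params = sign_maml_step(params, tasks)

Because the inner loop only needs the sign of a gradient and the outer loop differentiates at the (detached) adapted weights, no second-order terms are ever computed, which is the source of the efficiency claim.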

Meta-Learning for Domain Generalization in Semantic Parsing [article]

Bailin Wang, Mirella Lapata, Ivan Titov
2021 arXiv   pre-print
In this work, we use a meta-learning framework which targets zero-shot domain generalization for semantic parsing (a sketch of this style of objective follows this entry).  ...  Experimental results on the (English) Spider and Chinese Spider datasets show that the meta-learning objective significantly boosts the performance of a baseline parser.  ...  We gratefully acknowledge the support of the European Research Council (Titov: ERC StG BroadSem 678254; Lapata: ERC CoG TransModal 681760) and the Dutch National Science Foundation (NWO VIDI 639.022.518).  ... 
arXiv:2010.11988v2 fatcat:qsiai7imb5ahxj7ae7jsty5kra
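
The domain-generalization meta-objective can be pictured with a short sketch. This is our simplified reading (the functional loss_fn convention, the single inner step, and all names are ours, not the paper's exact algorithm): take one gradient step on batches from "virtual source" domains, then require the stepped parameters to also do well on a held-out "virtual target" domain, backpropagating through the step.

import torch

def forward(params, x):                           # tiny functional model: one linear layer
    w, b = params
    return x @ w + b

def loss_fn(params, batch):
    x, y = batch
    return torch.nn.functional.mse_loss(forward(params, x), y)

def dg_meta_step(params, src_batch, tgt_batch, inner_lr=1e-2, outer_lr=1e-3):
    # Virtual-source step (graph kept so the target loss can differentiate through it).
    g = torch.autograd.grad(loss_fn(params, src_batch), params, create_graph=True)
    stepped = [p - inner_lr * gi for p, gi in zip(params, g)]
    # Meta-objective: do well on the source AND on the held-out domain after the step.
    meta_loss = loss_fn(params, src_batch) + loss_fn(stepped, tgt_batch)
    grads = torch.autograd.grad(meta_loss, params)
    return [(p - outer_lr * gi).detach().requires_grad_(True)
            for p, gi in zip(params, grads)]

# Toy usage with random batches standing in for source / virtual-target domains.
params = [torch.randn(3, 1, requires_grad=True), torch.zeros(1, requires_grad=True)]
src = (torch.randn(8, 3), torch.randn(8, 1))
tgt = (torch.randn(8, 3), torch.randn(8, 1))
params = dg_meta_step(params, src, tgt)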

Low-Resource Adaptation of Neural NLP Models [article]

Farhad Nooralahzadeh
2020 arXiv   pre-print
Real-world applications of natural language processing (NLP) are challenging. NLP models rely heavily on supervised machine learning and require large amounts of annotated data.  ...  However, in real-world applications of NLP, the textual resources vary across several dimensions, such as language, dialect, topic, and genre.  ... 
arXiv:2011.04372v1 fatcat:626mbe5ba5bkdflv755o35u5pq

Multilingual and cross-lingual document classification: A meta-learning approach [article]

Niels van der Heijden, Helen Yannakoudakis, Pushkar Mishra, Ekaterina Shutova
2021 arXiv   pre-print
The great majority of languages in the world are considered under-resourced for the successful application of deep learning methods.  ...  In this work, we propose a meta-learning approach to document classification in a limited-resource setting and demonstrate its effectiveness in two different settings: few-shot, cross-lingual adaptation  ...  Acknowledgements: This work was supported by Deloitte Risk Advisory B.V., the Netherlands.  ... 
arXiv:2101.11302v2 fatcat:36utmoigc5dx5dcjmif47av5he

CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP [article]

Qinyuan Ye, Bill Yuchen Lin, Xiang Ren
2021 arXiv   pre-print
Our analysis reveals that the few-shot learning ability on unseen tasks can be improved via an upstream learning stage using a set of seen tasks.  ...  To instantiate different seen/unseen task partitions in CrossFit and facilitate in-depth analysis, we present the NLP Few-shot Gym, a repository of 160 diverse few-shot NLP tasks created from open-access  ...  Acknowledgments: We thank the authors and crowd-workers of all datasets used in our study. We thank the Hugging Face datasets team for making datasets more accessible.  ... 
arXiv:2104.08835v2 fatcat:xnhrmmsmyzb4fjo7ealrw2vnka

Challenge Closed-book Science Exam: A Meta-learning Based Question Answering System [article]

Xinyue Zheng, Peng Wang, Qigang Wang, Zhongchao Shi
2020 arXiv   pre-print
Specifically, our method, based on meta-learning and the large language model BERT, can efficiently solve science problems by learning from related example questions without relying on external  ...  Inspired by the dual-process theory in cognitive science, we propose a MetaQA framework, where system 1 is an intuitive meta-classifier and system 2 is a reasoning module.  ...  The MetaQA system does not rely on a large corpus, which makes it applicable to practical situations where building a targeted knowledge base would require significant human effort and time.  ... 
arXiv:2004.12303v1 fatcat:5xzmebvh2vbtpi2z7ogfkvmhau

Evaluating few shot and Contrastive learning Methods for Code Clone Detection [article]

Mohamad Khajezade, Fatemeh Hendijani Fard, Mohamed S. Shehata
2022 arXiv   pre-print
Then, we employ Model-Agnostic Meta-Learning (MAML), where the model learns a meta-learner capable of extracting transferable knowledge from the training set, so that the model can be fine-tuned using a few  ...  Recently, deep learning-based models have achieved an F1 score (a metric used to assess classifiers) of ∼95% on the CodeXGLUE benchmark.  ...  In this research question, we study the performance of a popular few-shot learning algorithm, Model-Agnostic Meta-Learning [9], for code clone detection.  ... 
arXiv:2204.07501v2 fatcat:z6bhugzq6rhgpl6jyj4wydwdhi

Learning to Classify Intents and Slot Labels Given a Handful of Examples [article]

Jason Krone, Yi Zhang, Mona Diab
2020 arXiv   pre-print
We show that two popular few-shot learning algorithms, model-agnostic meta-learning (MAML) and prototypical networks, outperform a fine-tuning baseline on this benchmark (a prototypical-network sketch follows this entry).  ...  We propose a new few-shot learning task, few-shot IC/SF, to study and improve the performance of IC and SF models on classes not seen at training time in ultra-low-resource scenarios.  ...  $p(t^*_j = a \mid x^*, S)$. Model-Agnostic Meta-Learning (MAML): MAML optimizes the parameters $\phi$ of the encoder $f_\phi$ such that when $\phi$ is fine-tuned on the support set $S$ for $d$ steps, $\phi \leftarrow \mathrm{Finetune}(\phi, d, S)$,  ... 
arXiv:2004.10793v1 fatcat:qffsn6e5vrdixenhu5vyhfnm54
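
The formula in the snippet, $p(t^*_j = a \mid x^*, S)$, is the probability of label $a$ for a query point $x^*$ given the support set $S$. For the prototypical-network half of the comparison, that probability has a standard closed form (Snell et al., 2017), sketched below in generic PyTorch with embeddings assumed precomputed.

import torch

def prototypical_probs(emb_support, y_support, emb_query, num_classes):
    # Prototype = mean support embedding per class (assumes every class appears in S).
    protos = torch.stack([emb_support[y_support == c].mean(dim=0)
                          for c in range(num_classes)])            # [C, d]
    # Query label distribution: softmax over negative squared distances to prototypes.
    d2 = torch.cdist(emb_query, protos) ** 2                        # [Q, C]
    return torch.softmax(-d2, dim=1)                                # rows are p(y = c | x, S)

# Toy usage: a 2-way, 4-shot episode with 8-dimensional embeddings.
emb_s = torch.randn(8, 8)
y_s = torch.tensor([0, 0, 0, 0, 1, 1, 1, 1])
emb_q = torch.randn(3, 8)
print(prototypical_probs(emb_s, y_s, emb_q, num_classes=2))

MAML, by contrast, has no closed-form predictive distribution: it runs the Finetune inner loop on $S$ and then predicts with the adapted encoder, as the reconstructed snippet equation states.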

Few-shot Text Classification with Distributional Signatures [article]

Yujia Bao, Menghua Wu, Shiyu Chang, Regina Barzilay
2020 arXiv   pre-print
Our model is trained within a meta-learning framework to map these signatures into attention scores, which are then used to weight the lexical representations of words (a sketch follows this entry).  ...  In this paper, we explore meta-learning for few-shot text classification.  ...  MAML: MAML meta-learns an initialization such that the model can quickly adapt to new tasks after a few gradient steps.  ... 
arXiv:1908.06039v3 fatcat:bbddbkpop5gynaloacfxnuib3q
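
A minimal sketch of the "signatures → attention scores → weighted lexical representations" pipeline described above, under our own simplifications: a single inverse-frequency statistic stands in for the paper's richer distributional signatures, and a small MLP (names ours, hypothetical) maps it to per-word attention.

import torch

def signature_weighted_rep(word_embs, word_freqs, mlp):
    sig = 1.0 / (word_freqs + 1e-5)                  # crude stand-in for a distributional signature
    scores = mlp(sig.unsqueeze(-1)).squeeze(-1)      # learned map: signature -> attention logit
    attn = torch.softmax(scores, dim=0)              # attention over the words of one example
    return (attn.unsqueeze(-1) * word_embs).sum(0)   # signature-weighted lexical representation

# Toy usage: a 6-word example with 16-dimensional word embeddings.
mlp = torch.nn.Sequential(torch.nn.Linear(1, 8), torch.nn.ReLU(), torch.nn.Linear(8, 1))
word_embs = torch.randn(6, 16)
word_freqs = torch.rand(6)                           # corpus frequencies in [0, 1]
rep = signature_weighted_rep(word_embs, word_freqs, mlp)

The point of the design is that word statistics transfer across tasks even when word identities do not, so meta-training the MLP (rather than the lexical features) is what generalizes to new classes.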

X-METRA-ADA: Cross-lingual Meta-Transfer Learning Adaptation to Natural Language Understanding and Question Answering [article]

Meryem M'hamdi, Doo Soon Kim, Franck Dernoncourt, Trung Bui, Xiang Ren, Jonathan May
2021 arXiv   pre-print
Our approach adapts MAML, an optimization-based meta-learning approach, to learn to adapt to new languages.  ...  In this work, we propose X-METRA-ADA, a cross-lingual MEta-TRAnsfer learning ADAptation approach for NLU.  ...  Meta-learning for NLP: Previous work on meta-learning for NLP has focused on the application of first-order MAML (Finn et al., 2017).  ... 
arXiv:2104.09696v2 fatcat:yt6um3pmbrf5zh7bdclsu5duoi

Meta-learning Transferable Representations with a Single Target Domain [article]

Hong Liu, Jeff Z. HaoChen, Colin Wei, Tengyu Ma
2020 arXiv   pre-print
MeRLin empirically outperforms previous state-of-the-art transfer learning algorithms on various real-world vision and NLP transfer learning benchmarks.  ...  MeRLin meta-learns representations by ensuring that a head fit on top of the representations with target training data also performs well on target validation data (a sketch follows this entry).  ...  [13] empirically studied the success of MAML. Balcan et al. [4] and Tripuraneni et al. [46] theoretically studied meta-learning in a few-shot learning setting.  ... 
arXiv:2011.01418v1 fatcat:ffrlnd7ce5ai3axzt6hdpk35wi
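
The inner/outer structure in that fragment can be sketched in a few lines. This is our reading, not the paper's exact algorithm, and all names are ours: fit a linear head on target training data on top of the current representations, then update the encoder so that the fitted head also does well on target validation data.

import torch

def merlin_style_step(encoder, head, train_batch, val_batch,
                      head_lr=1e-1, enc_lr=1e-3, head_steps=5):
    xs, ys = train_batch
    xv, yv = val_batch
    feats = encoder(xs)                               # graph reaches back into the encoder
    for _ in range(head_steps):                       # inner loop: fit the head on target train
        loss = torch.nn.functional.cross_entropy(feats @ head, ys)
        g, = torch.autograd.grad(loss, head, create_graph=True)
        head = head - head_lr * g
    # Outer loop: the fitted head must also work on target validation data.
    val_loss = torch.nn.functional.cross_entropy(encoder(xv) @ head, yv)
    val_loss.backward()
    with torch.no_grad():
        for p in encoder.parameters():
            p -= enc_lr * p.grad                      # manual SGD step on the encoder
    encoder.zero_grad()
    return val_loss.item()

# Toy usage: 5-d inputs, 8-d representations, 3 classes.
encoder = torch.nn.Sequential(torch.nn.Linear(5, 8), torch.nn.Tanh())
head = torch.zeros(8, 3, requires_grad=True)
train = (torch.randn(16, 5), torch.randint(0, 3, (16,)))
val = (torch.randn(16, 5), torch.randint(0, 3, (16,)))
merlin_style_step(encoder, head, train, val)

Because the validation loss is differentiated through the head-fitting loop (create_graph=True), the encoder is explicitly trained to produce representations that survive a fresh head fit, which is the stated intuition.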

Emerging Trends in Federated Learning: From Model Fusion to Federated X Learning [article]

Shaoxiong Ji and Teemu Saravirta and Shirui Pan and Guodong Long and Anwar Walid
2021 arXiv   pre-print
Following the emerging trends, we also discuss federated learning at its intersection with other learning paradigms, termed federated X learning, where X includes multitask learning, meta-learning,  ...  Federated learning is a new learning paradigm that decouples data collection and model training via multi-party computation and model aggregation.  ...  The seminal model-agnostic meta-learning (MAML) framework [61] has been intensively applied to this learning scenario.  ... 
arXiv:2102.12920v2 fatcat:5fcwfhxibbedbcbuzrfyqdedky

Should We Be Pre-training? An Argument for End-task Aware Training as an Alternative [article]

Lucio M. Dery, Paul Michel, Ameet Talwalkar, Graham Neubig
2022 arXiv   pre-print
We next introduce an online meta-learning algorithm that learns a set of multi-task weights to better balance our multiple auxiliary objectives, achieving further improvements in end-task performance (a sketch follows this entry)  ...  We study replacing end-task-agnostic continued training of pre-trained language models with end-task-aware training of said models.  ...  We propose a meta-learning algorithm in the mold of Model-Agnostic Meta-Learning (MAML) (Finn et al., 2017) to learn task weights.  ... 
arXiv:2109.07437v2 fatcat:ehuvvggh2ba7lfs6jeg6w3ss3u
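
A compressed sketch of how such multi-task weights can be meta-learned. This is our own simplified scheme in the spirit of the snippet (the functional loss_fn convention and all names are ours, not the paper's algorithm): take a virtual step on the weighted auxiliary losses, then move the weight logits to reduce the end-task dev loss after that step.

import torch

def forward(params, x):                           # tiny functional model: one linear layer
    w, b = params
    return x @ w + b

def loss_fn(params, batch):
    x, y = batch
    return torch.nn.functional.mse_loss(forward(params, x), y)

def update_task_weights(params, weight_logits, aux_batches, dev_batch,
                        inner_lr=1e-2, w_lr=1e-1):
    w = torch.softmax(weight_logits, dim=0)       # current mixing weights over auxiliaries
    aux = sum(wi * loss_fn(params, b) for wi, b in zip(w, aux_batches))
    g = torch.autograd.grad(aux, params, create_graph=True)
    stepped = [p - inner_lr * gi for p, gi in zip(params, g)]
    dev_loss = loss_fn(stepped, dev_batch)        # end-task loss after the virtual step
    gw, = torch.autograd.grad(dev_loss, weight_logits)
    with torch.no_grad():
        weight_logits -= w_lr * gw                # favour auxiliaries that helped the end task
    return dev_loss.item()

# Toy usage: two auxiliary objectives plus an end-task dev batch.
params = [torch.randn(3, 1, requires_grad=True), torch.zeros(1, requires_grad=True)]
weight_logits = torch.zeros(2, requires_grad=True)
aux = [(torch.randn(8, 3), torch.randn(8, 1)) for _ in range(2)]
dev = (torch.randn(8, 3), torch.randn(8, 1))
update_task_weights(params, weight_logits, aux, dev)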

Meta-Learning with Fewer Tasks through Task Interpolation [article]

Huaxiu Yao, Linjun Zhang, Chelsea Finn
2022 arXiv   pre-print
Empirically, in our experiments on eight datasets from diverse domains including image recognition, pose prediction, molecule property prediction, and medical image classification, we find that the proposed  ...  However, the bottleneck of current meta-learning algorithms is the requirement of a large number of meta-training tasks, which may not be accessible in real-world scenarios (a task-interpolation sketch follows this entry).  ...  In gradient-based meta-learning, we use model-agnostic meta-learning (MAML) as an example and denote the corresponding base model as $f_{\mathrm{MAML}}$.  ... 
arXiv:2106.02695v2 fatcat:untnphyg5fcexedqfzfm62uy7i
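
The task-interpolation idea can be pictured as a mixup across tasks. This is a deliberate simplification of the paper's method (which interpolates hidden representations and handles labels more carefully): blending two sampled tasks' inputs and soft labels yields a new "in-between" task, densifying a small meta-training task distribution.

import torch

def interpolate_tasks(x_a, y_a, x_b, y_b, alpha=0.5):
    lam = torch.distributions.Beta(alpha, alpha).sample()   # mixup coefficient
    x_new = lam * x_a + (1 - lam) * x_b                      # blended task inputs
    y_new = lam * y_a + (1 - lam) * y_b                      # blended soft labels
    return x_new, y_new

# Toy usage: mix two 4-example, 3-class tasks (labels one-hot encoded first).
x_a, x_b = torch.randn(4, 10), torch.randn(4, 10)
y_a = torch.nn.functional.one_hot(torch.randint(0, 3, (4,)), 3).float()
y_b = torch.nn.functional.one_hot(torch.randint(0, 3, (4,)), 3).float()
x_new, y_new = interpolate_tasks(x_a, y_a, x_b, y_b)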
Showing results 1 — 15 out of 70 results