CLIP Models are Few-shot Learners: Empirical Studies on VQA and Visual Entailment [article]

Haoyu Song, Li Dong, Wei-Nan Zhang, Ting Liu, Furu Wei
2022 arXiv   pre-print
Then we propose a parameter-efficient fine-tuning strategy to boost the few-shot performance on the VQA task.  ...  In this work, we empirically show that CLIP can be a strong vision-language few-shot learner by leveraging the power of language.  ... 
arXiv:2203.07190v1 fatcat:whf2ljh2mjfa5l4wsbr5dpvktq

FewCLUE: A Chinese Few-shot Learning Evaluation Benchmark [article]

Liang Xu, Xiaojing Lu, Chenyang Yuan, Xuanwei Zhang, Huilin Xu, Hu Yuan, Guoao Wei, Xiang Pan, Xin Tian, Libo Qin, Hu Hai
2021 arXiv   pre-print
Pretrained Language Models (PLMs) have achieved tremendous success in natural language understanding tasks.  ...  Experimental results reveal that: 1) The effect of different few-shot learning methods is sensitive to the pre-trained model to which the methods are applied; 2) PET and P-tuning achieve the best overall  ...  The above methods all restrict the generated templates to be natural language.  ... 
arXiv:2107.07498v2 fatcat:ljx2nma3b5aa3ix2pzyadnkgnu

Towards Zero-Label Language Learning [article]

Zirui Wang, Adams Wei Yu, Orhan Firat, Yuan Cao
2021 arXiv   pre-print
Specifically, inspired by the recent success of few-shot inference on GPT-3, we present a training data creation procedure named Unsupervised Data Generation (UDG), which leverages few-shot prompts to  ...  This paper explores zero-label learning in Natural Language Processing (NLP), whereby no human-annotated data is used anywhere during training and models are trained purely on synthetic data.  ...  To this end, we propose to utilize language models to perform few-shot generation.  ... 
arXiv:2109.09193v1 fatcat:35ijbawdhbalxpqk32qcscbdky

ZmBART: An Unsupervised Cross-lingual Transfer Framework for Language Generation [article]

Kaushal Kumar Maurya, Maunendra Sankar Desarkar, Yoshinobu Kano, Kumari Deepshikha
2021 arXiv   pre-print
This simple modeling approach gave us promising results. We experimented with few-shot training (with 1000 supervised data points), which boosted the model performance further.  ...  In this work, we transfer supervision from a high-resource language (HRL) to multiple low-resource languages (LRLs) for natural language generation (NLG).  ...  Moreover, although freezing the decoder layer and word embeddings helps in the zero-shot setting, it is natural and useful to unfreeze them during few-shot training.  ... 
arXiv:2106.01597v1 fatcat:ajkygelpyfb4jjse7q4tg7w3fe

Label Semantics for Few Shot Named Entity Recognition [article]

Jie Ma, Miguel Ballesteros, Srikanth Doss, Rishita Anubhai, Sunil Mallya, Yaser Al-Onaizan, Dan Roth
2022 arXiv   pre-print
We propose a neural architecture that consists of two BERT encoders, one to encode the document and its tokens and another one to encode each of the labels in natural language format.  ...  We study the problem of few-shot learning for named entity recognition.  ...  For the unique set of labels L D associated with dataset D, we apply three steps to get the representations: First, we manually convert the label names to their natural language forms, e.g.  ... 
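The snippet above describes matching tokens against labels encoded in their natural-language form. A minimal sketch of that matching step, assuming toy vectors stand in for the outputs of the two BERT encoders (the label phrases and all numbers here are illustrative, not from the paper):

```python
import numpy as np

def match_tokens_to_labels(token_embs, label_embs, label_names):
    """Assign each token the label whose natural-language encoding it is
    closest to under dot-product similarity (dual-encoder matching)."""
    scores = token_embs @ label_embs.T          # (num_tokens, num_labels)
    return [label_names[i] for i in scores.argmax(axis=1)]

# Toy vectors standing in for encoder outputs (illustrative only).
label_names = ["person", "location", "other"]
label_embs = np.eye(3)                           # one unit vector per label phrase
token_embs = np.array([[0.9, 0.1, 0.0],          # closest to "person"
                       [0.0, 0.8, 0.2]])         # closest to "location"
print(match_tokens_to_labels(token_embs, label_embs, label_names))
```

In the actual architecture the label vectors come from encoding descriptions like "person name" with a second BERT, so unseen label sets can be scored without retraining the token encoder.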
arXiv:2203.08985v1 fatcat:qrim46gny5csjlv5uhj2lregsq

Vector Projection Network for Few-shot Slot Tagging in Natural Language Understanding [article]

Su Zhu, Ruisheng Cao, Lu Chen, Kai Yu
2020 arXiv   pre-print
Essentially, this approach is equivalent to a normalized linear model with an adaptive bias.  ...  Specifically, in the five-shot setting on benchmarks SNIPS and NER, our method outperforms the strongest few-shot learning baseline by 6.30 and 13.79 points on F_1 score, respectively.  ...  The weights are normalized as ||w k || = 1 to improve the generalization capability of the few-shot model.  ... 
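The snippet frames the method as a normalized linear model with an adaptive bias, with label weights normalized to unit norm. A minimal sketch of that scoring rule (the weight values and biases below are toy numbers, not the paper's):

```python
import numpy as np

def projection_scores(x, W, b):
    """Score an input embedding x against label weights W by projecting x
    onto each unit-normalized weight vector and adding a per-label bias,
    i.e. a normalized linear model with an adaptive bias."""
    W_norm = W / np.linalg.norm(W, axis=1, keepdims=True)   # enforce ||w_k|| = 1
    return W_norm @ x + b

W = np.array([[2.0, 0.0], [0.0, 0.5]])   # unnormalized label weight vectors
b = np.array([0.0, -0.1])                # adaptive per-label biases
x = np.array([1.0, 1.0])                 # input token embedding
print(projection_scores(x, W, b))
```

Normalizing the weights keeps labels with large weight magnitudes from dominating the scores, which is the generalization benefit the snippet points to.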
arXiv:2009.09568v2 fatcat:mvglcehisfhwbegzuiwgxt6ts4

Detecting Hate Speech with GPT-3 [article]

Ke-Li Chiu, Annie Collins, Rohan Alexander
2022 arXiv   pre-print
We use GPT-3 to identify sexist and racist text passages with zero-, one-, and few-shot learning.  ...  Sophisticated language models such as OpenAI's GPT-3 can generate hateful text that targets marginalized groups.  ...  We ask GPT-3 to classify these based on zero-, one-, and few-shot learning, with and without instruction. We find that the model performs best with few-shot learning when an instruction is included.  ... 
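The snippet contrasts zero-, one-, and few-shot prompting with and without an instruction. A sketch of how such prompts are typically assembled for an autoregressive model like GPT-3 (the instruction wording, the labels, and the demonstration texts below are hypothetical, for illustration only):

```python
def build_prompt(examples, query, instruction=None):
    """Assemble an in-context classification prompt: an optional instruction,
    zero or more labeled demonstrations (k examples => k-shot), then the
    unlabeled query for the model to complete."""
    parts = [instruction] if instruction else []
    for text, label in examples:
        parts.append(f"Text: {text}\nLabel: {label}")
    parts.append(f"Text: {query}\nLabel:")
    return "\n\n".join(parts)

# Hypothetical instruction and demonstration, for illustration only.
prompt = build_prompt(
    examples=[("You people are all the same.", "hateful")],
    query="Have a nice day.",
    instruction="Classify each text as hateful or not_hateful.",
)
print(prompt)
```

Zero-shot passes an empty `examples` list; the finding quoted above corresponds to a non-empty list plus a non-`None` instruction.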
arXiv:2103.12407v4 fatcat:uzxo7rlbr5fd3esoxkgdwodequ

AdaDurIAN: Few-shot Adaptation for Neural Text-to-Speech with DurIAN [article]

Zewang Zhang, Qiao Tian, Heng Lu, Ling-Hui Chen, Shan Liu
2020 arXiv   pre-print
To cope with this issue, we introduce AdaDurIAN by training an improved DurIAN-based average model and leveraging it for few-shot learning with the shared speaker-independent content encoder across different  ...  Several few-shot learning tasks in our experiments show AdaDurIAN can outperform the baseline end-to-end system by a large margin.  ...  Finally, we perform the few-shot emotion transfer tasks on two unseen speakers with limited neutral speech data. We highly recommend readers listen to the generated audio samples 1 .  ... 
arXiv:2005.05642v1 fatcat:v4q5yhqlkbh5dmu64576xdgc4i

Analyzing Commonsense Emergence in Few-shot Knowledge Models [article]

Jeff Da, Ronan Le Bras, Ximing Lu, Yejin Choi, Antoine Bosselut
2021 arXiv   pre-print
To investigate this question, we train commonsense knowledge models in few-shot settings to study the emergence of their commonsense representation abilities.  ...  of large language models.  ...  Acknowledgements The authors would like to thank the anonymous reviewers for their feedback, and the Amazon Mechanical Turk community for help with annotation.  ... 
arXiv:2101.00297v3 fatcat:tn7gycypufej7jxsuvthgrgu5a

When does MAML Work the Best? An Empirical Study on Model-Agnostic Meta-Learning in NLP Applications [article]

Zequn Liu, Ruiyi Zhang, Yiping Song, Ming Zhang
2020 arXiv   pre-print
Model-Agnostic Meta-Learning (MAML) is successfully employed in NLP applications including few-shot text classification and multi-domain low-resource language generation  ...  Many impacting factors, including data quantity, similarity among tasks, and the balance between the general language model and task-specific adaptation, can affect the performance of MAML in NLP, but few  ...  to evaluate the general language model.  ... 
arXiv:2005.11700v1 fatcat:gammq2ryjref7gr3cioore3pm4

PromptMaker: Prompt-based Prototyping with Large Language Models

Ellen Jiang, Kristen Olson, Edwin Toh, Alejandra Molina, Aaron Donsbach, Michael Terry, Carrie J Cai
2022 CHI Conference on Human Factors in Computing Systems Extended Abstracts  
Prototyping is notoriously difficult to do with machine learning (ML), but recent advances in large language models may lower the barriers to people prototyping with ML, through the use of natural language  ...  Through interviews with eleven practitioners during a three-week sprint and a workshop, we find that prompt-based prototyping reduced barriers of access by substantially broadening who can prototype with  ...  Mike Terry proposed the initial study design, conducted user studies, gave high-level scientific advice, and contributed to paper writing.  ... 
doi:10.1145/3491101.3503564 fatcat:mjxonbjkvnhi5b3lxq26uc2wsi

Reinforcement Learning for Few-Shot Text Generation Adaptation [article]

Cheng Pengsen, Dai Jinqiao, Liu Jiayong
2021 arXiv   pre-print
Controlling a generative model to adapt to a new domain with limited samples is a difficult challenge, and it is receiving increasing attention.  ...  To address this shortcoming, we frame the adaptation of text generation systems as a reinforcement learning problem and provide a new approach to make text generation models easily adaptable to target  ...  Related Work Few-shot-learning-based approaches are increasingly able to train powerful neural networks on small datasets in many natural language processing (NLP) problems [1].  ... 
arXiv:2111.11030v1 fatcat:a66p54623bgyhaogegvfhu64zm

Open Aspect Target Sentiment Classification with Natural Language Prompts [article]

Ronald Seoh, Ian Birle, Mrinal Tak, Haw-Shiuan Chang, Brian Pinette, Alfred Hough
2021 arXiv   pre-print
To address this, we propose simple approaches that better solve ATSC with natural language prompts, enabling the task under zero-shot cases and enhancing supervised settings, especially for few-shot cases  ...  For many business applications, we often seek to analyze sentiments associated with any arbitrary aspects of commercial products, despite having a very limited amount of labels or even without any labels  ...  Natural Language Prompts There have been a number of recent papers on using prompts, additional sentences appended to the original input text, to direct language models to perform different tasks, exploiting  ... 
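The snippet describes appending a natural-language prompt to the input so a language model can fill in the sentiment for a given aspect. A minimal cloze-style sketch (the template wording and the verbalizer mapping below are hypothetical, not the paper's):

```python
def atsc_prompt(review, aspect, mask_token="[MASK]"):
    """Append a cloze-style prompt so a masked language model can predict
    a sentiment word for the given aspect of the review."""
    return f"{review} The {aspect} is {mask_token}."

# Hypothetical verbalizer: maps candidate filler words back to sentiment labels.
VERBALIZER = {"great": "positive", "bad": "negative", "okay": "neutral"}

prompt = atsc_prompt("The pasta was cold but the staff were friendly.", "staff")
print(prompt)
# A masked LM would score candidate fillers at [MASK]; the verbalizer maps
# the top-scoring word back to a sentiment label.
```

Because the label is predicted as ordinary text, the same template works zero-shot, and a handful of labeled examples can fine-tune the scoring for the few-shot case.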
arXiv:2109.03685v1 fatcat:j3e27vm4ufd63az3oifkbkvfu4

FEWS: Large-Scale, Low-Shot Word Sense Disambiguation with the Dictionary [article]

Terra Blevins, Mandar Joshi, Luke Zettlemoyer
2021 arXiv   pre-print
We establish baselines on FEWS with knowledge-based and neural WSD approaches and present transfer learning experiments demonstrating that models additionally trained with FEWS better capture rare senses  ...  FEWS has high sense coverage across different natural language domains and provides: (1) a large training set that covers many more senses than previous datasets and (2) a comprehensive evaluation set  ...  Finally, we see that even without exposure to the natural sense distribution in natural language texts, the zero-shot model still performs significantly better on the MFS of words than the LFS, with a  ... 
arXiv:2102.07983v1 fatcat:yigk5rru7nettojaqpixshmfpe

Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning [article]

Beliz Gunel, Jingfei Du, Alexis Conneau, Ves Stoyanov
2021 arXiv   pre-print
State-of-the-art natural language understanding classification models follow two stages: pre-training a large language model on an auxiliary task, and then fine-tuning the model on a task-specific labeled  ...  Our proposed fine-tuning objective leads to models that are more robust to different levels of noise in the fine-tuning training data, and can generalize better to related tasks with limited labeled data  ...  GENERALIZATION ABILITY OF TASK MODELS In this experiment, we first fine-tune RoBERTa-Large on SST-2 using its full training set and get a task model with and without the SCL term.  ... 
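The snippet mentions adding a supervised contrastive learning (SCL) term to the fine-tuning objective. A minimal numpy sketch of such a term (the temperature value and toy embeddings are illustrative; a real setup would compute this on encoder outputs within each training batch and add it to the cross-entropy loss):

```python
import numpy as np

def supervised_contrastive_loss(embeddings, labels, tau=0.3):
    """SCL term: for each anchor, pull same-label examples together and push
    different-label examples apart, at temperature tau."""
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / tau                          # temperature-scaled cosine similarities
    n, total, anchors = len(labels), 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue                             # anchors without a positive pair are skipped
        denom = sum(np.exp(sim[i, a]) for a in range(n) if a != i)
        total -= sum(np.log(np.exp(sim[i, p]) / denom) for p in positives) / len(positives)
        anchors += 1
    return total / anchors

# Tight same-label clusters yield a lower loss than scattered ones.
labels = [0, 0, 1, 1]
clustered = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
print(supervised_contrastive_loss(clustered, labels))
```

Pulling same-label fine-tuning examples into tight clusters is what gives the robustness to label noise and the few-shot transfer that the snippet reports.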
arXiv:2011.01403v3 fatcat:iv26tbgzxjf67o26tq2l5rlfpi