Language Models are Few-Shot Learners
[article]
2020
arXiv
pre-print
Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. ...
Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. ...
discussions about ways to approach and evaluate bias, Harrison Edwards and Yura Burda for discussions and experimentation with in-context learning, Geoffrey Irving and Paul Christiano for early discussions of language ...
arXiv:2005.14165v4
fatcat:kilb2lujxfax3kgfiuotql2iyy
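The few-shot setting described in this entry is in-context learning: the model's weights stay frozen and the task is specified entirely through demonstrations placed in the prompt. Below is a minimal sketch of that mechanism, using GPT-2 from Hugging Face transformers as an openly available stand-in for GPT-3; the sentiment task, prompt template, and examples are illustrative and not taken from the paper.

```python
# Minimal sketch of few-shot in-context learning: no gradient updates,
# the task is "learned" from demonstrations placed inside the prompt.
# GPT-2 stands in here for GPT-3, which is not openly available.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# A handful of labelled demonstrations, followed by the query to complete.
demonstrations = [
    ("The movie was a delight from start to finish.", "positive"),
    ("I walked out halfway through.", "negative"),
    ("An instant classic.", "positive"),
]
query = "The plot dragged and the ending made no sense."

prompt = "".join(f"Review: {text}\nSentiment: {label}\n\n" for text, label in demonstrations)
prompt += f"Review: {query}\nSentiment:"

# The model continues the pattern; the first generated token is read as the label.
out = generator(prompt, max_new_tokens=1, do_sample=False)[0]["generated_text"]
print(out[len(prompt):].strip())
```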
Language Models are Few-shot Multilingual Learners
[article]
2021
arXiv
pre-print
Finally, we find the in-context few-shot cross-lingual prediction results of language models are significantly better than random prediction, and they are competitive compared to the existing state-of-the-art ...
We show that, given a few English examples as context, pre-trained language models can predict not only English test samples but also non-English ones. ...
on few-shot learning, where the source language and target language are German and English, respectively, and the performance increases as the models are given more samples. ...
arXiv:2109.07684v1
fatcat:h5uejjfgebcc7ci3zfsosbmzke
It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
[article]
2021
arXiv
pre-print
When scaled to hundreds of billions of parameters, pretrained language models such as GPT-3 (Brown et al., 2020) achieve remarkable few-shot performance. ...
We show that performance similar to GPT-3 can be obtained with language models that are much "greener" in that their parameter count is several orders of magnitude smaller. ...
Very recently, Brown et al. (2020) introduced GPT-3, a pretrained LM with an enormous 175 billion parameters, and showed that it has amazing few-shot abilities: By reformulating tasks as language modeling ...
arXiv:2009.07118v2
fatcat:o7bw7trdyfaipd7qrrtx6ku37u
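The reformulation mentioned in this entry turns classification into a cloze question that a small masked language model can already answer, with a "verbalizer" mapping label words back to classes. A rough sketch of that idea follows; the pattern, label words, and use of bert-base-uncased are illustrative choices, not the paper's exact setup.

```python
# Sketch of cloze-style (PET-like) classification with a small masked LM.
# The pattern and the label-word verbalizer below are illustrative choices.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

review = "The plot dragged and the ending made no sense."
prompt = f"{review} All in all, it was [MASK]."     # classification as a cloze question

verbalizer = {"great": "positive", "terrible": "negative"}  # label word -> class

scores = {label: 0.0 for label in verbalizer.values()}
for cand in fill_mask(prompt, targets=list(verbalizer)):
    word = cand["token_str"].strip()
    if word in verbalizer:
        scores[verbalizer[word]] += cand["score"]

print(max(scores, key=scores.get))
```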
Multilingual Byte2Speech Models for Scalable Low-resource Speech Synthesis
[article]
2021
arXiv
pre-print
Besides strong results on 40+ languages, the framework demonstrates capabilities to adapt to new languages under extreme low-resource and even few-shot scenarios of merely 40 seconds of transcribed recordings, without ...
Exhaustive comparative and ablation studies are performed to reveal the potential of the framework for low-resource languages. ...
a few-shot spoken language learner. ...
arXiv:2103.03541v2
fatcat:z7xtjh723rey3gyrzhrrepj3ea
Language Models are Few-shot Multilingual Learners
2021
Proceedings of the 1st Workshop on Multilingual Representation Learning
unpublished
Finally, we find the in-context few-shot cross-lingual prediction results of language models are significantly better than random prediction, and they are competitive compared to the existing state-of-the-art ...
We show that, given a few English examples as context, pre-trained language models can predict not only English test samples but also non-English ones. ...
on few-shot learning, where the source language and target language are German and English, respectively, and the performance increases as the models are given more samples. ...
doi:10.18653/v1/2021.mrl-1.1
fatcat:czv4znd6g5gexgduqrq6j4rgcu
It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
2021
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
unpublished
When scaled to hundreds of billions of parameters, pretrained language models such as GPT-3 (Brown et al., 2020) achieve remarkable few-shot performance. ...
We show that performance similar to GPT-3 can be obtained with language models that are much "greener" in that their parameter count is several orders of magnitude smaller. ...
Exploiting cloze questions for few shot text classification and natural language inference. ...
doi:10.18653/v1/2021.naacl-main.185
fatcat:su6c4nod5zc3pg24ojxhvzhv24
Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners
[article]
2022
arXiv
pre-print
Large-scale pre-trained language models have contributed significantly to natural language processing by demonstrating remarkable abilities as few-shot learners. ...
This study proposes a novel pluggable, extensible, and efficient approach named DifferentiAble pRompT (DART), which can convert small language models into better few-shot learners without any prompt engineering ...
Hyperparameters are provided in Appendix A.1. ...
arXiv:2108.13161v7
fatcat:h5a52v35h5cvxevnzlr45qnuoi
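The "differentiable prompt" referenced in this entry replaces hand-written prompt tokens with continuous vectors trained by backpropagation while the pretrained model stays fixed. The sketch below shows only that general mechanism (trainable vectors prepended at the embedding layer); it is not DART's exact formulation, and the model choice and hyperparameters are placeholders.

```python
# Rough sketch of a differentiable prompt: a few prompt vectors are trained by
# backpropagation while the pretrained language model stays frozen.
# (Illustrative mechanism only; not DART's exact formulation.)
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
for p in model.parameters():
    p.requires_grad = False                      # keep the language model frozen

n_prompt, hidden = 4, model.config.hidden_size
prompt_embeds = torch.nn.Parameter(torch.randn(1, n_prompt, hidden) * 0.02)
optimizer = torch.optim.Adam([prompt_embeds], lr=1e-3)   # only prompt vectors are trained

text = "the movie was [MASK] ."
inputs = tokenizer(text, return_tensors="pt")
token_embeds = model.get_input_embeddings()(inputs["input_ids"])

# Prepend the trainable prompt vectors to the token embeddings.
inputs_embeds = torch.cat([prompt_embeds, token_embeds], dim=1)
attention_mask = torch.cat(
    [torch.ones(1, n_prompt, dtype=torch.long), inputs["attention_mask"]], dim=1
)

outputs = model(inputs_embeds=inputs_embeds, attention_mask=attention_mask)
# A real training loop would compute a loss on the label word at the [MASK]
# position and call loss.backward(); only prompt_embeds would receive gradients.
print(outputs.logits.shape)
```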
Evaluating few shot and Contrastive learning Methods for Code Clone Detection
[article]
2022
arXiv
pre-print
Method: We assess the generalizability of the state-of-the-art models for CCD in few-shot settings (i.e., only a few samples are available for fine-tuning) by defining three scenarios: i) unseen problems ...
Objective: The main objective of this research is to assess the ability of the CCD models as well as few-shot learning algorithms on unseen programming problems and new languages (i.e., the model is not ...
Few-shot learning models are widely studied in computer vision [31, 33, 36], and few works have used few-shot learning for natural language processing [4, 45]. ...
arXiv:2204.07501v2
fatcat:z6bhugzq6rhgpl6jyj4wydwdhi
LMTurk: Few-Shot Learners as Crowdsourcing Workers in a Language-Model-as-a-Service Framework
[article]
2022
arXiv
pre-print
Vast efforts have been devoted to creating high-performance few-shot learners, i.e., large-scale pretrained language models (PLMs) that perform well with little downstream task training data. ...
The rationale is that crowdsourcing workers are in fact few-shot learners: They are shown a few illustrative examples to learn about a task and then start annotating. ...
Related Work: Few-shot learners in NLP. ...
arXiv:2112.07522v2
fatcat:yn2txs3mznh53a5ipmi3mhhvy4
Few-shot Learning with Meta Metric Learners
[article]
2019
arXiv
pre-print
Existing meta-learning or metric-learning based few-shot learning approaches are limited in handling diverse domains with varying numbers of labels. ...
Few-shot Learning aims to learn classifiers for new classes with only a few training examples per class. ...
Finally, we would like to move forward to apply the current framework in other applications, such as language modeling [19], machine translation [20], and vision applications [21]. ...
arXiv:1901.09890v1
fatcat:ssekfocxqzdkle6ie7ltw3r34i
CLIP Models are Few-shot Learners: Empirical Studies on VQA and Visual Entailment
[article]
2022
arXiv
pre-print
In this work, we empirically show that CLIP can be a strong vision-language few-shot learner by leveraging the power of language. ...
However, after being pre-trained by language supervision from a large amount of image-caption pairs, CLIP itself should also have acquired some few-shot abilities for vision-language tasks. ...
Acknowledgements Haoyu Song, Wei-Nan Zhang, and Ting Liu are supported by the Science and Technology Innovation 2030 Major Project of China (No.2020AAA0108605), National Natural Science Foundation of China ...
arXiv:2203.07190v1
fatcat:whf2ljh2mjfa5l4wsbr5dpvktq
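The "power of language" being leveraged in this entry is CLIP's ability to score an image against arbitrary natural-language prompts. The sketch below shows that basic scoring mechanism as zero-shot classification with the openly released openai/clip-vit-base-patch32 checkpoint; the paper's VQA and visual-entailment setups build on this with task-specific prompt and answer engineering.

```python
# Sketch of CLIP's prompt-based scoring: candidate answers are phrased as text,
# and the image is matched to the closest text embedding. (Zero-shot mechanism
# only; the paper's few-shot VQA / visual entailment setups build on top of it.)
import requests
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Any test image works; this COCO image is the stock example from the docs.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

candidates = ["a photo of a cat", "a photo of a dog", "a photo of a car"]
inputs = processor(text=candidates, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)

probs = outputs.logits_per_image.softmax(dim=-1)   # one score per candidate text
print(candidates[probs.argmax().item()])
```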
GPT-3 Models are Poor Few-Shot Learners in the Biomedical Domain
[article]
2021
arXiv
pre-print
However, the ability of these large language models in few-shot transfer learning has not yet been explored in the biomedical domain. ...
However, in-domain pretraining seems not to be sufficient; novel pretraining and few-shot learning strategies are required in the biomedical NLP domain. ...
BioBERT, in few-shot settings to figure out whether large language models are proficient few-shot learners in the biomedical domain. ...
arXiv:2109.02555v1
fatcat:3mmirkmyzfdmlgrij6ddro2mhu
Rapid Adaptation with Conditionally Shifted Neurons
[article]
2018
arXiv
pre-print
On metalearning benchmarks from the vision and language domains, models augmented with conditionally shifted neurons achieve state-of-the-art results. ...
Few-shot Language Modeling To evaluate the effectiveness of recurrent models with conditionally shifted neurons, we ran experiments on the few-shot Penn Treebank (PTB) language modeling task introduced ...
These results indicate that our model's few-shot language modelling capabilities far exceed those of Matching Networks (Vinyals et al., 2016). ...
arXiv:1712.09926v3
fatcat:6kcgr64nlzguzcv74u3icw5ghi
A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models
[article]
2022
arXiv
pre-print
To solve this limitation, we study prompt-based low-resource learning of VL tasks with our proposed method, FewVLM, which is relatively smaller than recent few-shot learners. ...
For FewVLM, we pre-train a sequence-to-sequence transformer model with prefix language modeling (PrefixLM) and masked language modeling (MaskedLM). ...
Conclusion In this work, we present FEWVLM, a few-shot prompt-based learner on vision-language tasks. ...
arXiv:2110.08484v2
fatcat:dsfqfdvlhbenpfkgjgqlyulaba
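The two pretraining objectives named in this entry can be illustrated with a text-only toy example; the formatting below is a sketch under the assumption of T5-style sentinel tokens, and FewVLM additionally conditions on visual features, which are omitted here.

```python
# Toy illustration of the two objectives named in the abstract, on text only.
# (Illustrative formatting; FewVLM also feeds image features to the encoder.)
caption = "a group of people riding horses on the beach"
words = caption.split()

# Prefix language modeling: the model reads a prefix and generates the rest.
prefix_lm_input = " ".join(words[:4])        # "a group of people"
prefix_lm_target = " ".join(words[4:])       # "riding horses on the beach"

# Masked (span) language modeling, T5-style: masked spans become sentinel tokens
# in the input, and the target reconstructs the spans after each sentinel.
masked_lm_input = "a group of <extra_id_0> riding horses on <extra_id_1>"
masked_lm_target = "<extra_id_0> people <extra_id_1> the beach"

print((prefix_lm_input, prefix_lm_target))
print((masked_lm_input, masked_lm_target))
```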
A Hybrid Approach with Optimization and Metric-based Meta-Learner for Few-Shot Learning
[article]
2019
arXiv
pre-print
Our meta-metric-learning approach consists of two components, a task-specific metric-based learner as a base model, and a meta-learner that learns and specifies the base model. ...
The task-specific classifiers are required to be homogeneous-structured to ease the parameter prediction, so the meta-learning approaches could only handle few-shot learning problems where the tasks share ...
The aforementioned deep few-shot learning models are usually applied to the so-called "k-shot, N-way" scenario, in which each few-shot learning task has the same N number of class labels and each label ...
arXiv:1904.03014v2
fatcat:lkdqydb5e5dyrmx5u5j7f4btoa
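The "k-shot, N-way" episodes described in this entry (and in the Meta Metric Learners entry above) are typically handled by a metric-based base learner. A minimal, prototypical-network-style sketch of one episode follows; it illustrates the metric-based component in general, not this paper's specific meta-learner, and random tensors stand in for a trained encoder's embeddings.

```python
# One "k-shot, N-way" episode with a metric-based classifier: each class prototype
# is the mean embedding of its k support examples, and each query is assigned to
# the nearest prototype. Random tensors stand in for a trained encoder's output.
import torch

N, k, dim = 5, 3, 64                           # 5-way, 3-shot, 64-d embeddings

support = torch.randn(N, k, dim)               # k labelled support examples per class
queries = torch.randn(10, dim)                 # unlabelled query examples

prototypes = support.mean(dim=1)               # (N, dim): one prototype per class
distances = torch.cdist(queries, prototypes)   # (10, N) Euclidean distances
predictions = distances.argmin(dim=1)          # nearest-prototype label per query

print(predictions)
```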
Showing results 1 — 15 out of 11,498 results