With the advent of the Transformer, introduced for machine translation in 2017, attention-based architectures began to attract attention. After the emergence of BERT, which strengthened the Transformer's encoder for natural language understanding (NLU), and the GPT architecture, which strengthened its decoder for natural language generation (NLG), various methodologies, datasets, and models for training Pretrained Language Models began to appear. Furthermore, in the past three years, various Pretrained Language Models …