Sentence Bottleneck Autoencoders from Transformer Language Models
[article]
2021
arXiv
pre-print
We therefore explore the construction of a sentence-level autoencoder from a pretrained, frozen transformer language model. ...
We adapt the masked language modeling objective as a generative, denoising one, while only training a sentence bottleneck and a single-layer modified transformer decoder. ...
To fill in this gap, we introduce AUTOBOT, a new autoencoder model for learning sentence "bottleneck" (i.e., fixed-size) representations from pretrained transformers that is useful for similarity, generation ...
arXiv:2109.00055v2
fatcat:mzozllfuunhsxjo46dvj3ugynm
Sentence Bottleneck Autoencoders from Transformer Language Models
2021
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
unpublished
We therefore explore the construction of a sentence-level autoencoder from a pretrained, frozen transformer language model. ...
We adapt the masked language modeling objective as a generative, denoising one, while only training a sentence bottleneck and a single-layer modified transformer decoder. ...
To fill in this gap, we introduce AUTOBOT, a new autoencoder model for learning sentence "bottleneck" (i.e., fixed-size) representations from pretrained transformers that is useful for similarity, generation ...
doi:10.18653/v1/2021.emnlp-main.137
fatcat:fsoavrdj5fdj3m5qffczxylg4a
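To make the construction described in the two AUTOBOT entries above concrete, the following is a minimal sketch of the general recipe they outline: freeze a pretrained masked language model, pool its token states into a single fixed-size sentence vector, and train only that bottleneck plus a single-layer transformer decoder with a denoising reconstruction loss. It assumes a HuggingFace RoBERTa encoder; the class name, attention pooling, and dimension choices are illustrative and are not the authors' released code.

# Hypothetical sketch of a sentence-bottleneck autoencoder on a frozen pretrained LM.
import torch
import torch.nn as nn
from transformers import AutoModel

class SentenceBottleneckAE(nn.Module):
    def __init__(self, encoder_name: str = "roberta-base", bottleneck_dim: int = 768):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        for p in self.encoder.parameters():      # keep the pretrained LM frozen
            p.requires_grad = False
        hidden = self.encoder.config.hidden_size
        # Pool token states into one fixed-size sentence vector (the bottleneck).
        self.attn_pool = nn.Linear(hidden, 1)
        self.bottleneck = nn.Linear(hidden, bottleneck_dim)
        self.from_bottleneck = nn.Linear(bottleneck_dim, hidden)
        # A single lightweight decoder layer reconstructs tokens conditioned on the bottleneck.
        layer = nn.TransformerDecoderLayer(d_model=hidden, nhead=12, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=1)
        self.lm_head = nn.Linear(hidden, self.encoder.config.vocab_size)

    def encode(self, input_ids, attention_mask):
        states = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        scores = self.attn_pool(states).masked_fill(attention_mask.unsqueeze(-1) == 0, -1e9)
        weights = scores.softmax(dim=1)
        pooled = (weights * states).sum(dim=1)          # (batch, hidden)
        return self.bottleneck(pooled)                  # fixed-size sentence vector

    def forward(self, input_ids, attention_mask):
        z = self.encode(input_ids, attention_mask)
        memory = self.from_bottleneck(z).unsqueeze(1)   # decoder attends only to z
        target = self.encoder.embeddings(input_ids)     # frozen token embeddings as decoder input
        decoded = self.decoder(tgt=target, memory=memory)
        return self.lm_head(decoded)                    # logits for a (denoising) reconstruction loss

Only the pooling, bottleneck, decoder, and LM head receive gradient updates; the reconstruction loss would be a cross-entropy over the logits against the original (un-noised) tokens.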
Squeezing bottlenecks: Exploring the limits of autoencoder semantic representation capabilities
2016
Neurocomputing
We present a comprehensive study on the use of autoencoders for modelling text data, in which (differently from previous studies) we focus our attention on the following issues: i) we explore the suitability of two different models, bDA and rsDA, for constructing deep autoencoders for text data at the sentence level; ii) we propose and evaluate two novel metrics for better assessing the text-reconstruction ...
We trained the autoencoder, varying the size of the bottleneck layer down from 100 to 10 in steps of 10. ...
doi:10.1016/j.neucom.2015.06.091
fatcat:ezj3jaf6yvbcddjugxri4vh2mm
Discrete Autoencoders for Sequence Models
[article]
2018
arXiv
pre-print
For instance, even though language has a clear hierarchical structure going from characters through words to sentences, it is not apparent in current language models. ...
Nevertheless, it remains challenging to extract good representations from these models. ...
For example, a language model would just condition on s_{<i}, while a neural machine translation model would condition on the input sentence (in the other language) and s_{<i}. ...
arXiv:1801.09797v1
fatcat:r4e6zowc3zejtpkctcmj77hwfy
BottleSum: Unsupervised and Self-supervised Sentence Summarization using the Information Bottleneck Principle
2019
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
In this paper, we propose a novel approach to unsupervised sentence summarization by mapping the Information Bottleneck principle to a conditional language modelling objective: given a sentence, our approach ...
Using only pretrained language models with no direct supervision, our approach can efficiently perform extractive sentence summarization over a large corpus. ...
Simply, a large corpus of unsupervised summaries is generated with BottleSumEx using a strong language model; then the same language model is tuned to produce summaries from source sentences on that dataset ...
doi:10.18653/v1/d19-1389
dblp:conf/emnlp/WestHBC19
fatcat:a7rllnns5nhpbk7d2wtpgdp7ua
BottleSum: Unsupervised and Self-supervised Sentence Summarization using the Information Bottleneck Principle
[article]
2019
arXiv
pre-print
In this paper, we propose a novel approach to unsupervised sentence summarization by mapping the Information Bottleneck principle to a conditional language modelling objective: given a sentence, our approach ...
Building on our unsupervised extractive summarization (BottleSumEx), we then present a new approach to self-supervised abstractive summarization (BottleSumSelf), where a transformer-based language model ...
Simply, a large corpus of unsupervised summaries is generated with BottleSumEx using a strong language model; then the same language model is tuned to produce summaries from source sentences on that dataset ...
arXiv:1909.07405v2
fatcat:ph3oriu4k5emlo5v6j2zoj6tse
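As a rough illustration of the extractive scoring idea in the two BottleSum entries above: candidate compressions of a source sentence are scored by how well a pretrained language model predicts the following sentence given the candidate (relevance), while shorter candidates are preferred (compression). The sketch below only shows the relevance term; the function name and the use of GPT-2 are assumptions for illustration, not the paper's exact setup.

# Score a candidate summary by log p(next_sentence | candidate) under a pretrained LM.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

@torch.no_grad()
def next_sentence_logprob(candidate: str, next_sentence: str) -> float:
    """Sum of log-probabilities of next_sentence tokens, conditioned on the candidate."""
    prefix = tokenizer(candidate, return_tensors="pt").input_ids
    cont = tokenizer(" " + next_sentence, return_tensors="pt").input_ids
    ids = torch.cat([prefix, cont], dim=1)
    logits = model(ids).logits[:, :-1, :]                 # predictions for tokens 1..L-1
    logprobs = logits.log_softmax(-1)
    targets = ids[:, 1:]
    token_lp = logprobs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    return token_lp[:, prefix.size(1) - 1:].sum().item()  # keep only the continuation part

In the self-supervised stage the entries describe, summaries produced this way over a large corpus would then be used to fine-tune the same language model to generate summaries directly from source sentences.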
Squeezing bottlenecks: exploring the limits of autoencoder semantic representation capabilities
[article]
2014
arXiv
pre-print
We present a comprehensive study on the use of autoencoders for modelling text data, in which (differently from previous studies) we focus our attention on the following issues: i) we explore the suitability ...
capabilities of autoencoders; and iii) we propose an automatic method to find the critical bottleneck dimensionality for text language representations (below which structural information is lost). ...
We trained the autoencoder, varying the size of the bottleneck layer down from 100 to 10 in steps of 10. ...
arXiv:1402.3070v1
fatcat:ffbphiul4vftjm5j55svaxv2re
Plug and Play Autoencoders for Conditional Text Generation
[article]
2020
arXiv
pre-print
Text autoencoders are commonly used for conditional generation tasks such as style transfer. ...
We propose methods which are plug and play, where any pretrained autoencoder can be used, and only require learning a mapping within the autoencoder's embedding space, training embedding-to-embedding ( ...
Left: we pretrain an autoencoder on (unannotated) text, which transforms an input sentence x into an embedding z_x and uses it to predict a reconstruction x̂ of the input sentence. ...
arXiv:2010.02983v2
fatcat:t27lepghkbb6ddycs5ozshqtf4
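The "plug and play" idea described in the entry above amounts to keeping a pretrained autoencoder fixed and training only a small mapping inside its embedding space. A minimal sketch follows; the residual parameterization, loss, and names (LatentMapping, train_step) are assumptions for illustration, not the paper's exact formulation.

# Train only an embedding-to-embedding mapping; the autoencoder stays frozen.
import torch
import torch.nn as nn

class LatentMapping(nn.Module):
    """Maps a source embedding z_x to a target embedding z_y of the same dimensionality."""
    def __init__(self, dim: int, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))

    def forward(self, z_x: torch.Tensor) -> torch.Tensor:
        return z_x + self.net(z_x)   # residual offset keeps the output near the AE's embedding manifold

def train_step(frozen_encoder, mapping, optimizer, x_ids, y_ids):
    """One supervised embedding-to-embedding step on a (source, target) sentence pair."""
    with torch.no_grad():            # the pretrained autoencoder is never updated
        z_x = frozen_encoder(x_ids)
        z_y = frozen_encoder(y_ids)
    loss = nn.functional.mse_loss(mapping(z_x), z_y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
    # At inference, mapping(z_x) would be passed to the frozen decoder to generate text.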
Bilingual is At Least Monolingual (BALM): A Novel Translation Algorithm that Encodes Monolingual Priors
[article]
2019
arXiv
pre-print
State-of-the-art machine translation (MT) models do not use knowledge of any single language's structure; this is the equivalent of asking someone to translate from English to German while knowing neither language. ...
generalize to other German-to-English translation tasks outside of image ...
Links to Research Materials: GitHub repository: https://github.com/jeffreyscheng/senior-thesis-translation; pre-trained BALM models ...
arXiv:1909.01146v1
fatcat:q6h2ixriyjcktnm42nonxw4iui
Binary Autoencoder for Text Modeling
[chapter]
2019
Communications in Computer and Information Science
Experiments reported in this paper show the binary autoencoder to have the main features of a VAE: semantic consistency and good latent-space coverage, while not suffering from mode collapse and being a ...
In this paper, an autoencoder with a binary latent space trained using the straight-through estimator is shown to have advantages over a VAE on the text modeling task. ...
Latent variables obtained from different autoencoders were used as input features to a feedforward classifier.
Table 1: Language modeling results. ...
doi:10.1007/978-3-030-34518-1_10
fatcat:xsci4zzlqncljd2ldhwhaxmvbu
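The key trick in the entry above, training through a hard binary bottleneck, is the straight-through estimator: binarize in the forward pass, pass gradients through as if the binarization were the identity. A minimal sketch is below; the surrounding encoder/decoder architecture is assumed rather than taken from the chapter.

# Straight-through binarization for a binary latent bottleneck.
import torch

class BinaryST(torch.autograd.Function):
    """Hard 0/1 codes in the forward pass; identity gradient in the backward pass."""
    @staticmethod
    def forward(ctx, logits):
        return (logits > 0).float()

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output            # straight-through: gradients flow unchanged

def binary_bottleneck(logits: torch.Tensor) -> torch.Tensor:
    return BinaryST.apply(logits)

# Usage: z = binary_bottleneck(encoder_output); the decoder reconstructs text from z,
# and gradients reach the encoder as if no binarization had happened.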
Bag-of-Vectors Autoencoders for Unsupervised Conditional Text Generation
[article]
2021
arXiv
pre-print
Our experimental evaluations on unsupervised sentiment transfer and sentence summarization show that our method performs substantially better than a standard autoencoder. ...
In this paper, we extend Emb2Emb from single-vector bottleneck AEs to Bag-of-Vector Autoencoders (BoV-AEs), which encode text into a variable-size representation where the number of vectors grows with ...
arXiv:2110.07002v1
fatcat:b5bfzyxzjve4nhwnzegbew72bq
A Survey on Self-supervised Pre-training for Sequential Transfer Learning in Neural Networks
[article]
2020
arXiv
pre-print
It involves first pre-training a model on a large amount of unlabeled data, then adapting the model to target tasks of interest. ...
Deep neural networks are typically trained under a supervised learning framework where a model learns a single task using labeled data. ...
., 2018) implementation the authors proposed to jointly perform masked language modeling and next sentence prediction. ...
arXiv:2007.00800v1
fatcat:jgjl2l7wqfaq5do4vre5fryuoe
Semantic Sentence Matching with Densely-Connected Recurrent and Co-Attentive Information
2019
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence and the Twenty-Eighth Innovative Applications of Artificial Intelligence Conference
Sentence matching is widely used in various natural language tasks such as natural language inference, paraphrase identification, and question answering. ...
It enables preserving the original and the co-attentive feature information from the bottommost word embedding layer to the uppermost recurrent layer. ...
And by combining DRCN with ELMo, one of the contextualized embeddings from language models, our model outperforms the LM-Transformer, which has 85M parameters, while using only 61M parameters. ...
doi:10.1609/aaai.v33i01.33016586
fatcat:mniquud2fbdfpmisfnyxqgzdi4
Semantic Sentence Matching with Densely-connected Recurrent and Co-attentive Information
[article]
2018
arXiv
pre-print
Sentence matching is widely used in various natural language tasks such as natural language inference, paraphrase identification, and question answering. ...
It enables preserving the original and the co-attentive feature information from the bottommost word embedding layer to the uppermost recurrent layer. ...
And by combining DRCN with ELMo, one of the contextualized embeddings from language models, our model outperforms the LM-Transformer, which has 85M parameters, while using only 61M parameters. ...
arXiv:1805.11360v2
fatcat:q7gksggltzhoxovdea6nmocfdq
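The densely-connected recurrent design described in the two DRCN entries above can be pictured as each recurrent layer receiving the concatenation of the original word embeddings and every earlier layer's output, so low-level features are preserved up to the top layer. The sketch below shows only that skeleton; the co-attention component, layer sizes, and class name are assumptions, not the authors' implementation.

# Densely-connected stacked BiLSTMs: each layer sees all previous features.
import torch
import torch.nn as nn

class DenselyConnectedRNN(nn.Module):
    def __init__(self, emb_dim: int = 300, hidden: int = 100, num_layers: int = 3):
        super().__init__()
        self.layers = nn.ModuleList()
        in_dim = emb_dim
        for _ in range(num_layers):
            self.layers.append(nn.LSTM(in_dim, hidden, batch_first=True, bidirectional=True))
            in_dim += 2 * hidden          # the next layer also receives this layer's output

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        features = embeddings
        for lstm in self.layers:
            out, _ = lstm(features)
            features = torch.cat([features, out], dim=-1)   # dense (concatenative) connection
        return features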
Language Model-Based Paired Variational Autoencoders for Robotic Language Learning
[article]
2022
arXiv
pre-print
Next, we introduce PVAE-BERT, which equips the model with a pretrained large-scale language model, i.e., Bidirectional Encoder Representations from Transformers (BERT), enabling the model to go beyond ...
Our experiments suggest that using a pretrained language model as the language encoder allows our approach to scale up for real-world scenarios with instructions from human users. ...
ACKNOWLEDGMENT The authors gratefully acknowledge support from the German Research Foundation DFG, project CML (TRR 169). ...
arXiv:2201.06317v1
fatcat:yklxc5c5mjbm5byx5ktmyl7eoy
Showing results 1 — 15 out of 821 results