Direction is what you need: Improving Word Embedding Compression in Large Language Models
2021
Proceedings of the 6th Workshop on Representation Learning for NLP (RepL4NLP-2021)
The adoption of Transformer-based models in natural language processing (NLP) has led to great success, achieved with a massive number of parameters. However, due to deployment constraints on edge devices, there has been rising interest in compressing these models to improve their inference time and memory footprint. This paper presents a novel loss objective for compressing token embeddings in Transformer-based models by leveraging an AutoEncoder architecture. More specifically, we emphasize the importance of the direction of the compressed embeddings with respect to the original uncompressed embeddings.
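
The excerpt only names the overall idea (an AutoEncoder over the token-embedding table, with a loss that stresses direction), not the paper's exact objective. The following is a minimal sketch of that idea under stated assumptions: the network shapes, the combination of an MSE term with a cosine penalty, and the weighting factor alpha are all illustrative choices, not the authors' published method.

# Sketch only: AutoEncoder compression of a token-embedding matrix with a
# direction-aware reconstruction loss. The cosine term and hyperparameters
# are assumptions for illustration, not the paper's exact objective.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingAutoEncoder(nn.Module):
    def __init__(self, emb_dim: int, code_dim: int):
        super().__init__()
        self.encoder = nn.Linear(emb_dim, code_dim)   # compress each embedding
        self.decoder = nn.Linear(code_dim, emb_dim)   # reconstruct it

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

def direction_aware_loss(recon: torch.Tensor, target: torch.Tensor,
                         alpha: float = 1.0) -> torch.Tensor:
    # MSE for magnitude plus a cosine penalty that keeps each reconstructed
    # vector pointing in the same direction as the original (alpha is a
    # hypothetical weighting, not taken from the paper).
    mse = F.mse_loss(recon, target)
    cos = 1.0 - F.cosine_similarity(recon, target, dim=-1).mean()
    return mse + alpha * cos

if __name__ == "__main__":
    vocab_size, emb_dim, code_dim = 1000, 768, 128     # toy sizes
    embeddings = torch.randn(vocab_size, emb_dim)      # stand-in for a trained table
    model = EmbeddingAutoEncoder(emb_dim, code_dim)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for step in range(100):
        opt.zero_grad()
        loss = direction_aware_loss(model(embeddings), embeddings)
        loss.backward()
        opt.step()

In a setup like this, the compressed codes (code_dim per token) would replace the full embedding table, with the decoder reconstructing full-size embeddings at lookup time; the details of how the paper integrates this into the language model are not given in the excerpt above.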
doi:10.18653/v1/2021.repl4nlp-1.32
fatcat:nsc542foi5dvjma2cibe2tstpy