
Kformer: Knowledge Injection in Transformer Feed-Forward Layers [article]

Yunzhi Yao, Shaohan Huang, Li Dong, Furu Wei, Huajun Chen, Ningyu Zhang
2022 arXiv pre-print
In this work, we propose a simple model, Kformer, which takes advantage of both the knowledge stored in pretrained models (PTMs) and external knowledge via knowledge injection in Transformer FFN layers.  ...  A recent study observed knowledge neurons in the Feed-Forward Network (FFN), which are responsible for expressing factual knowledge.  ...  In our paper, inspired by previous work [2, 4] on feed-forward layers, we propose a novel way to filter and incorporate external knowledge through the feed-forward layers of the Transformer.  ... 
arXiv:2201.05742v2
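The injection mechanism the snippet describes rests on viewing the FFN as a key-value memory: the first linear layer's columns act as keys, the second layer's rows as values. Kformer then appends external knowledge embeddings as extra key/value slots. The sketch below illustrates that view only; all names, shapes, and the ReLU activation are illustrative assumptions, not the paper's code.

```python
import numpy as np

def ffn_with_knowledge(x, W1, W2, k_keys, k_vals):
    """FFN as a key-value memory with external knowledge appended as
    extra slots (illustrative sketch of the Kformer idea)."""
    # Base FFN would be: relu(x @ W1) @ W2.
    # Concatenate external knowledge embeddings as additional keys/values.
    keys = np.concatenate([W1, k_keys], axis=1)   # (d, m + n_k)
    vals = np.concatenate([W2, k_vals], axis=0)   # (m + n_k, d)
    h = np.maximum(x @ keys, 0.0)                 # activations over all slots
    return h @ vals                               # weighted sum of values

rng = np.random.default_rng(0)
d, m, nk = 8, 16, 4                               # toy dimensions (assumed)
x = rng.normal(size=(1, d))
out = ffn_with_knowledge(x,
                         rng.normal(size=(d, m)),   # pretrained keys
                         rng.normal(size=(m, d)),   # pretrained values
                         rng.normal(size=(d, nk)),  # external knowledge keys
                         rng.normal(size=(nk, d)))  # external knowledge values
print(out.shape)  # (1, 8): output keeps the model dimension
```

Each external knowledge vector thus competes with the pretrained memory slots through the same activation, which is how the model can "filter" which knowledge fires for a given input.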

Neural Knowledge Bank for Pretrained Transformers [article]

Damai Dai, Wenbin Jiang, Qingxiu Dong, Yajuan Lyu, Qiaoqiao She, Zhifang Sui
2022 arXiv pre-print
Dai et al. (2022) find that the Feed-Forward Networks (FFNs) in pretrained Transformers store factual knowledge in a memory-like manner.  ...  During knowledge injection, we fix the original model and inject factual knowledge into the extended memory slots, so the pretrained model suffers no catastrophic forgetting.  ...  Each Transformer layer has two main modules: a self-attention (Self-Att) module and a feed-forward network (FFN).  ... 
arXiv:2208.00399v1
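The "extended memory slots" in this snippet differ from in-place injection: the pretrained FFN weights stay frozen and only the added slots are trained, so the base model's behavior is preserved by construction. A minimal sketch of that decomposition, with all names and dimensions assumed for illustration:

```python
import numpy as np

def nkb_ffn(x, W1, W2, mem_keys, mem_vals):
    """Pretrained FFN plus extended memory slots (illustrative sketch of
    the Neural Knowledge Bank idea; not the paper's implementation)."""
    # Frozen pretrained FFN, in the key-value memory view.
    base = np.maximum(x @ W1, 0.0) @ W2
    # Extended memory slots: during injection, only these are trained.
    extra = np.maximum(x @ mem_keys, 0.0) @ mem_vals
    return base + extra

rng = np.random.default_rng(1)
d, m, nk = 8, 16, 4                    # toy dimensions (assumed)
x = rng.normal(size=(1, d))
W1 = rng.normal(size=(d, m))           # frozen pretrained keys
W2 = rng.normal(size=(m, d))           # frozen pretrained values
mem_keys = rng.normal(size=(d, nk))    # new, trainable memory keys

# With zero-initialized value slots, the extended model reproduces the
# frozen FFN exactly, so no pretrained behavior is overwritten.
out_before = nkb_ffn(x, W1, W2, mem_keys, np.zeros((nk, d)))
base_only = np.maximum(x @ W1, 0.0) @ W2
```

Because gradients during injection would flow only into `mem_keys`/`mem_vals`, the frozen term is untouched, which is the mechanism behind the "no catastrophic forgetting" claim.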