SynCoBERT: Syntax-Guided Multi-Modal Contrastive Pre-Training for Code Representation
[article]
Xin Wang, Yasheng Wang, Fei Mi, Pingyi Zhou, Yao Wan, Xiao Liu, Li Li, Hao Wu, Jin Liu, Xin Jiang
2021
arXiv
pre-print
Code representation learning, which aims to encode the semantics of source code into distributed vectors, plays an important role in recent deep-learning-based models for code intelligence. ...
To better exploit the structural properties of programming languages, this paper proposes SynCoBERT, a syntax-guided multi-modal contrastive pre-training approach for better code representations. ...
We propose a multi-modal contrastive learning (MCL) objective that obtains more comprehensive representations by contrasting three modalities: code, comment, and abstract syntax tree (AST). ...
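As a rough sketch of what a contrastive objective over (code, comment, AST) embeddings can look like, the snippet below sums a symmetric InfoNCE loss over the three pairwise modality combinations. The function names, the temperature value, and the specific pairing scheme are illustrative assumptions, not the paper's exact MCL formulation.

```python
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, temperature=0.07):
    """Symmetric InfoNCE loss between two batches of embeddings.

    anchor, positive: (batch, dim) tensors; row i of each is a positive pair,
    and all other rows in the batch serve as in-batch negatives.
    """
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / temperature                  # scaled cosine similarities
    targets = torch.arange(a.size(0), device=a.device)  # matching rows are positives
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

def multimodal_contrastive_loss(code_emb, comment_emb, ast_emb):
    """Illustrative MCL-style objective: InfoNCE over each modality pair."""
    return (info_nce(code_emb, comment_emb) +
            info_nce(code_emb, ast_emb) +
            info_nce(comment_emb, ast_emb))
```

In this sketch the in-batch negatives come for free from the other examples in the mini-batch, which is the standard way contrastive pre-training scales without explicit negative mining.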
arXiv:2108.04556v3