A General Method for Transferring Explicit Knowledge into Language Model Pretraining
Recently, pretrained language models such as BERT and XLNet have rapidly advanced the state of the art on many NLP tasks. These models capture implicit semantic information between words in text, but they operate solely at the token level and do not consider background knowledge. Intuitively, background knowledge influences how effectively text is understood. Inspired by this, we focus on improving model pretraining by leveraging external knowledge. Unlike recent work that improves pretraining through knowledge masking strategies, we propose a simple but general method for transferring explicit knowledge into pretraining. Specifically, we first match knowledge facts from a knowledge base (KB) and then add a knowledge injection layer to a transformer directly, without changing its architecture. This study seeks to measure the direct impact of explicit knowledge on model pretraining. We conduct experiments on 7 datasets across different downstream tasks using 5 knowledge bases. Our investigation shows promising results on all tasks. The experiments also verify that domain-specific knowledge outperforms open-domain knowledge on domain-specific tasks, and that different knowledge bases yield different performance on different tasks.
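To make the core idea concrete, below is a minimal PyTorch sketch of a knowledge injection layer that fuses matched KB fact embeddings into token embeddings while keeping the transformer architecture unchanged. This is not the paper's actual layer: the class name `KnowledgeInjectionLayer`, the gated-sum fusion, and the embedding sizes are illustrative assumptions; the only constraint taken from the text is that the output has the same shape as the token embeddings, so the rest of the transformer is untouched.

```python
import torch
import torch.nn as nn


class KnowledgeInjectionLayer(nn.Module):
    """Hypothetical sketch: fuse matched KB fact embeddings into token embeddings.

    Each token may be aligned to one KB entity/fact embedding (zero rows for
    unmatched tokens); a gated sum keeps the output shape identical to the
    input, so the surrounding transformer needs no architectural change.
    """

    def __init__(self, hidden_size: int, kb_emb_size: int):
        super().__init__()
        self.proj = nn.Linear(kb_emb_size, hidden_size)   # map KB space -> model space
        self.gate = nn.Linear(2 * hidden_size, hidden_size)
        self.norm = nn.LayerNorm(hidden_size)

    def forward(self, token_emb: torch.Tensor, kb_emb: torch.Tensor) -> torch.Tensor:
        # token_emb: (batch, seq_len, hidden_size)
        # kb_emb:    (batch, seq_len, kb_emb_size), zeros where no fact was matched
        k = self.proj(kb_emb)
        g = torch.sigmoid(self.gate(torch.cat([token_emb, k], dim=-1)))
        return self.norm(token_emb + g * k)   # same shape as token_emb


if __name__ == "__main__":
    # Example with assumed sizes: BERT-base hidden size and a 100-d KB embedding.
    layer = KnowledgeInjectionLayer(hidden_size=768, kb_emb_size=100)
    tokens = torch.randn(2, 16, 768)   # token embeddings from the pretrained encoder
    facts = torch.randn(2, 16, 100)    # embeddings of matched KB facts per token
    out = layer(tokens, facts)
    print(out.shape)                   # torch.Size([2, 16, 768])
```

Because the injected output has the same dimensionality as the original token embeddings, such a layer can be slotted in front of (or between) existing transformer blocks without modifying them, which is the property the abstract emphasizes.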