- Abstract: Multi-label text classification (MLTC) brings us new challenge in Natural Language Processing (NLP) which aims at assigning multiple labels for a given document. Many real-world tasks can be view as MLTC, such as tag recommendation, information retrieval, etc. However, several flinty problems are placed in the presence of researchers about how to establish connections between labels or distinguish similar sub-labels, which haven't been solved thoroughly by current endeavor. Therefore, we proposed a novel framework named BdLAN, BERTdoc Label Attention Networks in this paper, consist of the BERTdoc layer, the label embeddings layer, the doc encoder layer, the doc-label attention layer and the prediction layer. We apply a powerful technique BERT to pretrain documents to capture their deep semantic features and encode them via Bi-LSTM to obtain a two-directional contextual representation of uniform length. Then we create label embeddings and feed them together with encoded-pretrained-documents to doc-label attention mechanism to obtain interactive information between documents and their corresponding labels, finally using MLP to make prediction.We carry out experiments on three real-world datasets, the empirical results indicating our proposed model outperforms all state-of-the-art MLTC benchmarks. Moreover, we conduct a case study, visualizing real application of our BdLAN model vividly.