Abstract: Deep learning has proven useful on many NLP tasks, including reading
comprehension. However, it requires large amounts of training data, which are not
available in some application domains. We therefore examine the possibility
of using data-rich domains to pre-train models and then applying them in
domains where training data are harder to obtain. Specifically, we train a
neural-network-based model on two context-question-answer datasets, the
BookTest and CNN/Daily Mail, and we monitor transfer to subsets of bAbI,
a set of artificial tasks designed to test specific reasoning abilities, and of
SQuAD, a question-answering dataset much closer to real-world
applications. Our experiments show very limited transfer if the model is not
shown any training examples from the target domain; however, the results
are promising if the model is shown at least a few target-domain examples.
Furthermore, we show that the effect of pre-training is not limited to word
embeddings.
TL;DR: We examine the effect of transfer learning with the AS Reader model from two source domains (CNN/DM and BookTest) to two target domains (bAbI and SQuAD).
Conflicts: ibm.com
Keywords: Natural Language Processing, Semi-Supervised Learning, Deep Learning, Transfer Learning