Adversarial Training Methods for Cross-Domain Question Answering
Even though many deep learning models surpass human-level performance on tasks like question answering when evaluated on in-domain test sets, they may perform relatively poorly on out-of-domain datasets. To address this problem, domain adaptation techniques adapt a model trained on in-domain datasets to a target domain by efficiently using samples from the latter. In contrast, domain generalization techniques encourage the model to learn domain-invariant features directly from in-domain data so that it generalizes to any out-of-domain dataset, pushing it to learn task-relevant features and preventing overfitting on in-domain data. We like to compare this approach to the way humans learn a task, since they can generally perform the same task across different domains after seeing only a few examples. However, domain generalization is often performed by augmenting in-domain data with semantics-preserving transformations that challenge the model during training, which typically relies on hand-crafted rules or domain knowledge. In contrast, in this project our goal is to explore domain generalization techniques for question answering based on adversarial training that do not rely on any set of rules or domain knowledge; instead, adversarial terms make the regular loss more robust, with or without task-agnostic critic networks. This highly general methodology does not suffer from the limitations of synonym-replacement approaches and can be applied to other NLP tasks. Our best variant combines two different and complementary adversarial training approaches on top of a DistilBERT baseline, achieving a >3% F1-score improvement over regular fine-tuning and outperforming several other adversarial and energy-based approaches.
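To illustrate the loss-level adversarial idea in general terms (this is a minimal sketch of FGM-style embedding perturbation, not necessarily the exact variants evaluated in this work), the snippet below adds an adversarial term to the regular question-answering loss of a DistilBERT model by perturbing the word embeddings along the gradient of the clean loss. All names and hyperparameters (EPSILON, EMB_NAME, fgm_attack, fgm_restore, training_step) are illustrative assumptions, and the batch is assumed to already contain tokenized inputs with start_positions and end_positions so the built-in QA loss is computed.

```python
import torch
from transformers import AutoModelForQuestionAnswering

# Illustrative hyperparameters (assumptions, not values from the paper).
EPSILON = 1.0                              # magnitude of the adversarial perturbation
EMB_NAME = "embeddings.word_embeddings"    # DistilBERT word-embedding parameter prefix

model = AutoModelForQuestionAnswering.from_pretrained("distilbert-base-uncased")
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)


def fgm_attack(model, epsilon=EPSILON, emb_name=EMB_NAME):
    """Perturb the word-embedding weights along their gradient direction."""
    backup = {}
    for name, param in model.named_parameters():
        if param.requires_grad and emb_name in name and param.grad is not None:
            backup[name] = param.data.clone()
            norm = torch.norm(param.grad)
            if norm != 0 and not torch.isnan(norm):
                param.data.add_(epsilon * param.grad / norm)
    return backup


def fgm_restore(model, backup):
    """Restore the original embedding weights after the adversarial pass."""
    for name, param in model.named_parameters():
        if name in backup:
            param.data = backup[name]


def training_step(batch):
    # Regular forward/backward pass on the clean QA loss.
    outputs = model(**batch)
    outputs.loss.backward()

    # Adversarial pass: perturb embeddings, accumulate the adversarial gradient.
    backup = fgm_attack(model)
    adv_outputs = model(**batch)
    adv_outputs.loss.backward()
    fgm_restore(model, backup)

    # One optimizer step uses gradients from both the clean and adversarial losses.
    optimizer.step()
    optimizer.zero_grad()
    return outputs.loss.item()
```

In this sketch the adversarial term requires no rules or domain knowledge: the perturbation is derived purely from the gradient of the task loss, which is what makes the approach task-agnostic and transferable to other NLP tasks.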