We chose the default project: building a question answering system for the SQuAD 2.0 dataset. Our initial approach focused on implementing the default baseline, a variant of Bidirectional Attention Flow (BiDAF). We then measured the effect of adding character-level embeddings to the baseline and explored several attention mechanisms. We also studied the impact of tuning the hyperparameters used to train the model, as well as the effect of using different RNN variants as building blocks in the neural architecture. These changes improved model performance on both the dev and test sets by at least 4 points: the baseline without character embeddings scored 60.65 F1 and 57.13 EM, while our best configuration (BiDAF with character embeddings and LSTM-based self-attention) reached 65.80 F1 and 62.99 EM. Pre-trained models would likely have yielded higher scores, but their use was prohibited in our track. Although we improved performance to some extent, question answering remains a challenging problem with substantial room for improvement, and we still need to verify that the current model generalizes beyond the SQuAD dataset.
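For illustration, the sketch below shows one way an LSTM-backed self-attention layer of this kind could be added on top of the BiDAF attention outputs. The module name, dimensions, and scaled dot-product scoring here are illustrative assumptions, not our exact implementation.

```python
import torch
import torch.nn as nn

class SelfAttentionLSTM(nn.Module):
    """Illustrative sketch: scaled dot-product self-attention over the
    model's intermediate outputs, followed by a bidirectional LSTM.
    Assumes hidden_size is even so the BiLSTM output matches the input size."""
    def __init__(self, hidden_size, drop_prob=0.2):
        super().__init__()
        self.query = nn.Linear(hidden_size, hidden_size)
        self.key = nn.Linear(hidden_size, hidden_size)
        self.lstm = nn.LSTM(2 * hidden_size, hidden_size // 2,
                            batch_first=True, bidirectional=True)
        self.dropout = nn.Dropout(drop_prob)

    def forward(self, x, mask):
        # x: (batch, seq_len, hidden_size); mask: (batch, seq_len), 1 = real token
        scores = torch.bmm(self.query(x), self.key(x).transpose(1, 2))
        scores = scores / x.size(-1) ** 0.5
        # Block attention to padding positions before normalizing
        scores = scores.masked_fill(mask.unsqueeze(1) == 0, float('-inf'))
        attn = torch.softmax(scores, dim=-1)
        context = torch.bmm(attn, x)                       # (batch, seq_len, hidden)
        # Fuse the original sequence with its self-attended context
        out, _ = self.lstm(torch.cat([x, context], dim=-1))
        return self.dropout(out)

# Example usage with illustrative shapes:
layer = SelfAttentionLSTM(hidden_size=200)
x = torch.randn(4, 50, 200)          # batch of 4 contexts, 50 tokens each
mask = torch.ones(4, 50)             # no padding in this toy batch
out = layer(x, mask)                 # -> (4, 50, 200)
```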
This course was our first foray into the field of NLP. We have developed a deeper understanding of the advances and challenges in natural language understanding and processing, and we hope to keep deepening that understanding over time.