Leveraging Linguistic and Non-Linguistic Features in Automatic Multiple Choice Reading Comprehension
Daria Dzendzik, Carl Vogel, Qun Liu and Jennifer Foster


Question answering is an important subfield of Natural Language Processing. Reading comprehension is a particular type of question answering where the focus is on developing systems that can answer questions about a text. Our main goal is to examine the role of different types of features (linguistic and non-linguistic) in reading comprehension tasks. One reading comprehension task is multiple-choice question answering: given a text and a question about the text, select the correct answer from a set of candidate answers. We introduce an approach to multiple-choice question answering which is based on a combination of string similarities. Our method consists of two parts: we first select the sentences from the text which are relevant to the question, and we then select the answer by means of logistic regression over the concatenation of various string similarity measures. These measures are computed between the relevant sentences and (i) the answer and (ii) the question-answer pair. To compute string similarity, we use TF-IDF, bag-of-words (BOW), character n-grams, and word embeddings. Our system achieves the best results on Multi-choice Question Answering in Examinations (IJCNLP 2017 Shared Task 5) and is in the top five systems for the MovieQA Plot Synopses Challenge (the best result for October 2017). Good performance in two very different domains demonstrates the efficacy of our approach.
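
The two-part method described above can be illustrated with a minimal sketch. It is not the authors' implementation: all function names, the toy text, the top-k sentence selection by BOW cosine, the smoothed IDF, and the hand-written gradient-descent trainer are assumptions made for illustration. Word-embedding similarity is omitted to keep the example dependency-free.

```python
import math
from collections import Counter

PUNCT = ".,!?;:\"'()"

def tokens(text):
    """Lowercased whitespace tokens with surrounding punctuation stripped."""
    words = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in words if w]

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def bow(text):
    return Counter(tokens(text))

def char_ngrams(text, n=3):
    s = " ".join(tokens(text))
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def make_idf(sentences):
    """Smoothed IDF, treating each sentence of the text as a document (assumption)."""
    df = Counter(w for s in sentences for w in set(tokens(s)))
    return {w: math.log((1 + len(sentences)) / (1 + c)) + 1 for w, c in df.items()}

def tfidf(text, idf):
    return {w: c * idf.get(w, 0.0) for w, c in bow(text).items()}

def select_relevant(sentences, question, k=1):
    """Part 1: keep the k sentences most similar to the question (BOW cosine, an assumption)."""
    return sorted(sentences, key=lambda s: cosine(bow(s), bow(question)), reverse=True)[:k]

def features(answer, question, relevant, idf):
    """Part 2a: concatenate similarities for the answer and for the question-answer pair."""
    context = " ".join(relevant)
    feats = []
    for candidate in (answer, question + " " + answer):
        feats += [
            cosine(bow(candidate), bow(context)),
            cosine(tfidf(candidate, idf), tfidf(context, idf)),
            cosine(char_ngrams(candidate), char_ngrams(context)),
        ]
    return feats

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def train_logreg(X, y, epochs=2000, lr=0.5):
    """Part 2b: logistic regression fitted by plain batch gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, label in zip(X, y):
            err = score(w, b, x) - label
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / len(X) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

# Toy data, invented for illustration only.
sentences = ["Anna moved to Paris in 2010.", "She works as a baker.", "Her brother lives in Rome."]
question = "Where did Anna move?"
candidates = ["Paris", "Rome", "Berlin"]

idf = make_idf(sentences)
relevant = select_relevant(sentences, question)
X = [features(c, question, relevant, idf) for c in candidates]
w, b = train_logreg(X, [1, 0, 0])  # label 1 marks the correct candidate
best = candidates[max(range(len(X)), key=lambda i: score(w, b, X[i]))]
print(best)
```

In this sketch the classifier is fitted and applied on the same single toy question, which only demonstrates the data flow. A realistic setup would train on many labelled questions and select, for each unseen question, the candidate with the highest predicted probability.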