SQuAD – The Stanford Question Answering Dataset

The Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles. Each question is either answered by a segment of text, or span, from the corresponding reading passage, or is unanswerable.

SQuAD2.0 combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions, which crowdworkers wrote adversarially to look similar to the answerable ones. The SQuAD2.0 task thus involves not only answering questions when possible, but also detecting when no answer is supported by the paragraph and abstaining from answering.
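
To make the mix of answerable and unanswerable questions concrete, here is a minimal Python sketch. It assumes the nested layout used by the publicly distributed SQuAD2.0 JSON files (articles, then paragraphs, then questions, with an `is_impossible` flag and `answer_start` character offsets). The `SAMPLE` record and the helper names `iter_questions` and `check_spans` are hypothetical, made up for illustration only.

```python
# Hypothetical miniature record laid out like the SQuAD2.0 JSON files.
# The passage and questions are invented for illustration, not taken from the dataset.
SAMPLE = {
    "version": "v2.0",
    "data": [
        {
            "title": "Example_Article",
            "paragraphs": [
                {
                    "context": "The Stanford NLP Group released SQuAD.",
                    "qas": [
                        {
                            "id": "q1",
                            "question": "Who released SQuAD?",
                            "is_impossible": False,
                            "answers": [
                                {"text": "The Stanford NLP Group", "answer_start": 0}
                            ],
                        },
                        {
                            # Looks similar to q1, but the passage does not answer it.
                            "id": "q2",
                            "question": "Who funded SQuAD?",
                            "is_impossible": True,
                            "answers": [],
                        },
                    ],
                }
            ],
        }
    ],
}


def iter_questions(dataset):
    """Yield (id, question, context, answers, is_impossible) for every question."""
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                yield (
                    qa["id"],
                    qa["question"],
                    context,
                    qa["answers"],
                    qa.get("is_impossible", False),
                )


def check_spans(dataset):
    """Return ids of answerable questions whose answer text is not at answer_start."""
    bad = []
    for qid, _question, context, answers, impossible in iter_questions(dataset):
        if impossible:
            continue  # unanswerable questions carry no span to verify
        for answer in answers:
            start = answer["answer_start"]
            end = start + len(answer["text"])
            if context[start:end] != answer["text"]:
                bad.append(qid)
    return bad


if __name__ == "__main__":
    unanswerable = [q[0] for q in iter_questions(SAMPLE) if q[4]]
    print("unanswerable:", unanswerable)
    print("misaligned spans:", check_spans(SAMPLE))
```

The same two helpers should work on a full dataset file loaded with `json.load`, provided the file follows the layout assumed above.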

SQuAD is an ongoing effort, as datasets are expected to evolve.

SQuAD2.0 tests the ability of an NLP system not only to answer reading comprehension questions, but also to abstain when presented with a question that cannot be answered from the provided paragraph. The SQuAD leaderboard is updated to compare state-of-the-art methods, systems and technologies.
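
As a rough illustration of how abstention can be scored, the sketch below computes a simplified exact-match percentage in Python. It assumes that an empty prediction string means "no answer" and that an empty list of gold answers marks an unanswerable question. The function names `exact_match` and `score` are hypothetical, and the normalization here (lowercasing and collapsing whitespace) is deliberately cruder than what the official evaluation script does, so the numbers are not comparable to leaderboard results.

```python
def normalize(text):
    """Lowercase and collapse whitespace (a simplified normalization, by assumption)."""
    return " ".join(text.lower().split())


def exact_match(prediction, gold_answers):
    """Return 1 if the prediction is correct, else 0.

    An empty gold list marks an unanswerable question, which is only
    answered correctly by abstaining (an empty prediction).
    """
    pred = normalize(prediction)
    if not gold_answers:
        return int(pred == "")
    return int(any(pred == normalize(gold) for gold in gold_answers))


def score(predictions, references):
    """Percentage of questions answered (or abstained on) correctly.

    predictions: dict mapping question id -> predicted string ("" to abstain)
    references:  dict mapping question id -> list of gold answer strings
    """
    if not references:
        return 0.0
    correct = sum(
        exact_match(predictions.get(qid, ""), golds)
        for qid, golds in references.items()
    )
    return 100.0 * correct / len(references)


if __name__ == "__main__":
    references = {"q1": ["The Stanford NLP Group"], "q2": []}
    predictions = {"q1": "the stanford nlp group", "q2": ""}
    print(score(predictions, references))
```

A system that always answers would be penalized on `q2` here, which is the behavior the abstention requirement is meant to expose.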

SQuAD is an effort made by the Stanford NLP Group.
