The Institute of Data Science and Technologies (iDST), Alibaba Group’s research arm focused on artificial intelligence, has developed a deep-learning model that scored higher than humans on a Stanford University reading-comprehension test. This is the first time that a machine has outperformed humans on such a test.
The Stanford Question Answering Dataset (SQuAD) is a large-scale reading-comprehension dataset consisting of over 100,000 question-answer pairs based on more than 500 Wikipedia articles. Participating teams are required to build machine-learning models that can answer the questions in the dataset. It is widely regarded as the world’s top machine reading-comprehension test and attracts companies, universities and institutes including Google, Facebook, IBM, Microsoft, Carnegie Mellon University, Stanford University and the Allen Institute for Artificial Intelligence.
On January 11, the deep neural network model developed by Alibaba scored 82.44 on Exact Match, which measures how often a model’s answer exactly matches a reference answer, beating the human score of 82.304.
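For readers unfamiliar with the metric, the following is a minimal sketch of how an Exact Match score can be computed. It loosely follows the normalization used by the public SQuAD evaluation script (lowercasing, then removing punctuation and the articles "a", "an" and "the"). The function names and sample answers are illustrative assumptions; this is not Alibaba's code or the official scorer.

```python
import re
import string
from typing import Sequence

_PUNCTUATION = set(string.punctuation)


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in _PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, references: Sequence[str]) -> bool:
    """True if the prediction equals any reference answer after normalization."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(ref) for ref in references)


def exact_match_score(predictions: Sequence[str],
                      references: Sequence[Sequence[str]]) -> float:
    """Percentage of questions whose prediction exactly matches a reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return 100.0 * hits / len(predictions)


if __name__ == "__main__":
    # Hypothetical answers, for illustration only.
    preds = ["The gravity.", "condensation"]
    refs = [["gravity"], ["water vapor", "precipitation"]]
    print(exact_match_score(preds, refs))
```

A score such as 82.44 therefore means that roughly 82 out of every 100 answers matched a reference answer exactly after this kind of normalization.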
The model, which leverages a Hierarchical Attention Network that reads from paragraphs to sentences to words in order to locate the precise phrases containing potential answers, is believed to have significant commercial value. The underlying technology has already been applied during Alibaba’s 11.11 Global Shopping Festival, with machines answering a large number of inbound inquiries during the mega-sale.
Luo Si, Chief Scientist of Natural Language Processing (NLP) at Alibaba iDST, commented:
“It is our great honor to witness the milestone where machines surpass humans in reading comprehension. That means objective questions such as ‘what causes rain’ can now be answered with high accuracy by machines. We are especially excited because we believe the technology underneath can be gradually applied to numerous applications such as customer service, museum tutorials and online responses to medical inquiries from patients, decreasing the need for human input in an unprecedented way.”
“We are thrilled to see NLP research has achieved significant progress over the year. We look forward to sharing our model-building methodology with the wider community and exporting the technology to our clients in the near future,” Si added.
Alibaba’s iDST NLP team has received the best scores in previous global evaluations, including the ACM CIKM Cup on personalized e-commerce search, the Chinese Grammar Error Diagnosis task and the English named-entity classification task at the Text Analysis Conference.
Alibaba is committed to cutting-edge technology development. In October last year, Alibaba launched an innovative global research program, the Alibaba DAMO Academy, which will focus on topics such as machine learning, network security, visual computing and NLP, among others.
For more information about the reading comprehension test, please refer to: https://rajpurkar.github.io/SQuAD-explorer/