eCite Digital Repository

Open-domain question answering framework using Wikipedia

Citation

Ameen, S and Chung, H and Han, SC and Kang, BH, Open-domain question answering framework using Wikipedia, Lecture Notes in Computer Science 9992: Proceedings of the 29th Australasian Joint Conference on Artificial Intelligence (AI 2016): Advances in Artificial Intelligence, 5-8 December 2016, Hobart, Tasmania, pp. 623-635. ISBN 978-3-319-50127-7 (2016) [Refereed Conference Paper]


Copyright Statement

Copyright 2016 Springer International Publishing AG. This is an author-created version of a paper originally published in Kang B., Bai Q. (eds) AI 2016: Advances in Artificial Intelligence. AI 2016. Lecture Notes in Computer Science, vol 9992. Springer, Cham. The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-50127-7_55

DOI: 10.1007/978-3-319-50127-7_55

Abstract

This paper explores the feasibility of implementing a model for an open-domain, automated question answering framework that leverages Wikipedia’s knowledgebase. While Wikipedia implicitly contains answers to common questions, the ambiguity of natural language and the difficulty of developing an information retrieval process that produces specific answers present significant challenges. However, observational analysis suggests that it is possible to discount the syntactical and lexical structure of a sentence in contexts where a question contains a specific target entity (words that identify a person, location or organisation) and queries a property related to that entity. To investigate this, we implemented an algorithmic process that extracted the target entity from the question using CRF-based named entity recognition (NER) and treated all remaining words as potential properties. Using DBpedia, an ontological database of Wikipedia’s knowledge, we searched for the closest matching property that would produce an answer by applying standard string matching algorithms, including the Levenshtein distance, similar text and Dice’s coefficient. Our experimental results illustrate that using Wikipedia as a knowledgebase produces high precision for questions that contain a single unambiguous entity as the subject, but lower accuracy for questions where the entity exists as part of the object.
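
The property-matching step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the target entity has already been removed by NER, it covers only the Levenshtein distance and Dice's coefficient (the "similar text" measure is omitted), and the way the two scores are averaged, the function names and the example property names are all hypothetical choices made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the row to save memory
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def dice_coefficient(a: str, b: str) -> float:
    """Dice's coefficient over the sets of character bigrams of a and b."""
    bigrams_a = {a[i:i + 2] for i in range(len(a) - 1)}
    bigrams_b = {b[i:i + 2] for i in range(len(b) - 1)}
    if not bigrams_a and not bigrams_b:
        # Both strings are shorter than two characters.
        return 1.0 if a == b else 0.0
    return 2 * len(bigrams_a & bigrams_b) / (len(bigrams_a) + len(bigrams_b))


def similarity(a: str, b: str) -> float:
    """Assumed scoring: mean of normalised Levenshtein similarity and Dice."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    lev_similarity = 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest
    return (lev_similarity + dice_coefficient(a, b)) / 2


def best_property(candidate_words, properties):
    """Return the (word, property, score) triple with the highest score."""
    return max(
        ((word, prop, similarity(word, prop))
         for word in candidate_words
         for prop in properties),
        key=lambda triple: triple[2],
    )


# Words left over after the entity is removed from a question, matched
# against illustrative (hypothetical) property names for that entity.
words = ["what", "is", "the", "population"]
properties = ["populationTotal", "areaTotal", "leaderName"]
word, prop, score = best_property(words, properties)
print(word, prop, round(score, 2))
```

In this sketch every remaining word is compared with every candidate property, so no parsing of the question is needed, which mirrors the abstract's point about discounting syntactic structure once the entity is known.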

Item Details

Item Type: Refereed Conference Paper
Keywords: open-domain, question answering, Wikipedia
Research Division: Information and Computing Sciences
Research Group: Artificial Intelligence and Image Processing
Research Field: Artificial Intelligence and Image Processing not elsewhere classified
Objective Division: Information and Communication Services
Objective Group: Computer Software and Services
Objective Field: Computer Software and Services not elsewhere classified
UTAS Author: Ameen, S (Mr Saleem Ameen)
UTAS Author: Chung, H (Mr David Chung)
UTAS Author: Han, SC (Ms Caren Han)
UTAS Author: Kang, BH (Professor Byeong Kang)
ID Code: 118093
Year Published: 2016
Deposited By: Information and Communication Technology
Deposited On: 2017-07-04
Last Modified: 2018-02-01
Downloads: 133
