eCite Digital Repository

RDR-based Open IE for the web document

Citation

Kim, MH and Compton, P and Kim, YS, RDR-based Open IE for the web document, Proceedings of the 6th International Conference on Knowledge Capture 2011, 26-29 June 2011, Alberta, Canada, pp. 105-112. ISBN 978-1-4503-0396-5 (2011) [Refereed Conference Paper]

Copyright Statement

Copyright 2011 ACM http://dx.doi.org/

DOI: 10.1145/1999676.1999696

Abstract

The Web contains a massive amount of information embedded in text, and obtaining information from Web text is a major research challenge. One research focus is Open Information Extraction (OIE), which aims at relation-independent information extraction. OIE systems seek to extract all potential relations from the text rather than a few predefined relations. Existing OIE systems such as TEXTRUNNER usually take a machine-learning-based approach, which requires large volumes of training data.

This paper presents a Ripple-Down Rules Open Information Extraction system based on processing example cases and manually adding rules when needed. The key advantages of this approach are that it can handle the freer writing style that occurs in Web documents and can correct errors introduced by natural language pre-processing tools, whereas systems like TEXTRUNNER depend on the quality of the entity-tagging preprocessing in the training data. We evaluated the Ripple-Down Rules approach against the OIE systems TEXTRUNNER and StatSnowball. In these studies, the Ripple-Down Rules approach, with minimal, low-cost rule addition, achieves much higher precision and somewhat improved recall than these other Open Information Extraction systems.
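
To illustrate the general idea of processing example cases and adding rules when needed, the following is a minimal sketch of a single-classification Ripple-Down Rules tree in Python. It is not the system described in the paper: the `Rule` class, the `classify` and `add_rule` functions, the case features (`pattern`, `verb`), and the conclusion labels are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Case = dict  # hypothetical case representation: feature name -> value


@dataclass
class Rule:
    condition: Callable[[Case], bool]
    conclusion: str
    if_true: Optional["Rule"] = None   # exception branch, tried when this rule fires
    if_false: Optional["Rule"] = None  # alternative branch, tried when it does not


def classify(root: Rule, case: Case):
    """Return (conclusion, last rule evaluated, whether that rule fired)."""
    conclusion, node, last, fired = None, root, None, False
    while node is not None:
        last = node
        fired = node.condition(case)
        if fired:
            conclusion = node.conclusion  # the most recently fired rule wins
            node = node.if_true
        else:
            node = node.if_false
    return conclusion, last, fired


def add_rule(root: Rule, case: Case, condition, conclusion: str) -> None:
    """Attach a correction rule at the point where evaluation of `case` ended."""
    _, last, fired = classify(root, case)
    new_rule = Rule(condition, conclusion)
    if fired:
        last.if_true = new_rule
    else:
        last.if_false = new_rule


# Default rule: assume no relation until a case shows otherwise.
root = Rule(lambda c: True, "no_relation")

# Case A is misclassified by the default, so a rule is added for it.
case_a = {"pattern": "NP VERB NP", "verb": "acquired"}
add_rule(root, case_a, lambda c: c.get("pattern") == "NP VERB NP", "relation")

# Case B is then over-matched, so an exception is added under that rule.
case_b = {"pattern": "NP VERB NP", "verb": "said"}
add_rule(root, case_b, lambda c: c.get("verb") == "said", "no_relation")

result_a = classify(root, case_a)[0]  # "relation"
result_b = classify(root, case_b)[0]  # "no_relation"
```

Each correction is attached only where the misclassified case ended up, so earlier cases keep their conclusions. In this sketch, that locality is what keeps rule addition incremental.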

Item Details

Item Type: Refereed Conference Paper
Keywords: knowledge acquisition, expert systems, open information extraction, ripple-down rules
Research Division: Information and Computing Sciences
Research Group: Artificial Intelligence and Image Processing
Research Field: Artificial Intelligence and Image Processing not elsewhere classified
Objective Division: Information and Communication Services
Objective Group: Computer Software and Services
Objective Field: Application Software Packages (excl. Computer Games)
Author: Kim, YS (Dr Yang Kim)
ID Code: 94658
Year Published: 2011
Deposited By: Computing and Information Systems
Deposited On: 2014-09-15
Last Modified: 2014-12-09
Downloads: 0
