University of Tasmania
File(s) under permanent embargo

Efficient provenance storage for relational queries

conference contribution
posted on 2023-05-23, 08:58, authored by Bao, Z; Kohler, H; Wang, L; Zhou, X; Sadiq, S
Provenance information is vital in many application areas as it helps explain data lineage and derivation. However, storing fine-grained provenance information can be expensive. In this paper, we present a framework for storing provenance information relating to data derived via database queries. In particular, we first propose a provenance tree data structure which matches the query structure and thereby presents a possibility to avoid redundant storage of information regarding the derivation process. Then we investigate two approaches for reducing storage costs. The first approach utilizes two ingenious rules to achieve reduction on provenance trees. The second one is a dynamic programming solution, which provides a way of optimizing the selection of query tree nodes where provenance information should be stored. The optimization algorithm runs in polynomial time in the query size and is linear in the size of the provenance information, thus enabling provenance tracking and optimization without incurring large overheads. Experiments show that our approaches guarantee significantly lower storage costs than existing approaches.
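As a rough illustration of the dynamic programming idea mentioned in the abstract (choosing the query tree nodes at which provenance is stored), the sketch below uses a deliberately simplified cost model. It is not the algorithm from the paper: the `QueryNode` class, the `store_cost` field, the `cheapest_cut` function, and the rule that "every root-to-leaf path must contain exactly one storing node" are all hypothetical assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class QueryNode:
    """One operator in a query tree (hypothetical structure, not from the paper)."""
    name: str
    store_cost: int  # assumed cost of storing provenance at this node
    children: List["QueryNode"] = field(default_factory=list)


def cheapest_cut(node: QueryNode) -> Tuple[int, List[str]]:
    """Return (minimum cost, names of chosen nodes) for the subtree at `node`.

    Toy recurrence: either store provenance at this node, or push the
    decision down and cover each child subtree independently.
    """
    if not node.children:
        # A leaf has no alternative: provenance is stored here.
        return node.store_cost, [node.name]

    pushed_cost = 0
    pushed_nodes: List[str] = []
    for child in node.children:
        cost, names = cheapest_cut(child)
        pushed_cost += cost
        pushed_nodes.extend(names)

    # Keep whichever option is cheaper; ties favour the higher node.
    if node.store_cost <= pushed_cost:
        return node.store_cost, [node.name]
    return pushed_cost, pushed_nodes


# Small made-up query tree: project(join(select(scan_R), scan_S))
scan_r = QueryNode("scan_R", 4)
scan_s = QueryNode("scan_S", 3)
select = QueryNode("select", 2, [scan_r])
join = QueryNode("join", 6, [select, scan_s])
project = QueryNode("project", 9, [join])

cost, chosen = cheapest_cut(project)
print(cost, chosen)
```

With these made-up costs, storing at `select` and `scan_S` (total 5) is cheaper than storing at `join` (6) or `project` (9). The recurrence visits each node once, which gives a feel for why a tree-shaped dynamic program can stay cheap relative to the query size; the paper's actual cost model and optimization are richer than this.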

History

Publication title

Proceedings of the 21st ACM International Conference on Information and Knowledge Management

Pagination

1352-1361

ISBN

978-1-4503-1156-4

Department/School

School of Information and Communication Technology

Publisher

Association for Computing Machinery

Place of publication

United States of America

Event title

21st ACM International Conference on Information and Knowledge Management

Event Venue

Maui, Hawaii

Date of Event (Start Date)

2012-10-29

Date of Event (End Date)

2012-11-02

Rights statement

Copyright 2012 ACM

Repository Status

Restricted

Socio-economic Objectives

Information systems, technologies and services not elsewhere classified
