eCite Digital Repository

Change-aware scheduling for effectively updating linked open data caches

Citation

Akhtar, U and Razzaq, MA and Ur Rehman, U and Amin, MB and Khan, WA and Huh, E-N and Lee, S, Change-aware scheduling for effectively updating linked open data caches, IEEE Access, 6 pp. 65862-65873. ISSN 2169-3536 (2018) [Refereed Article]

Copyright Statement

Copyright 2018 IEEE

DOI: 10.1109/ACCESS.2018.2871511

Abstract

The linked open data (LOD) cloud is a global information space with a wealth of structured facts, which are useful for a wide range of usage scenarios. The LOD cloud handles a large number of requests from applications consuming the data. However, the performance of retrieving data from LOD repositories is one of the major challenges. To overcome this challenge, we argue that it is advantageous to maintain a local cache for efficient querying and processing. Due to the continuous evolution of the LOD cloud, local copies become outdated. To make the best use of limited resources, improved scheduling is required to maintain the freshness of the local data cache. In this paper, we propose an approach to efficiently capture the changes and update the cache. Our proposed approach, called application-aware change prioritization (AACP), consists of a change metric that quantifies the changes in LOD, and a weight function that assigns importance to recent changes. We also propose a mechanism to update policies, called preference-aware source update (PASU), which incorporates the previous estimation of changes and establishes when the local data cache needs to be updated. In the experimental evaluation, several state-of-the-art strategies are compared against the proposed approach. The performance of each policy is measured by computing the precision and recall between the local data cache updated using the policy under consideration and the data source, which serves as the ground truth. Both single-update and iterative-update cases are evaluated in this study. The proposed approach is reported to outperform all the other policies, achieving an F1-score of 88% and an effectivity of 93.5%.
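
The evaluation described in the abstract compares a locally cached copy against the live source using precision and recall. The following is a minimal sketch of that comparison, assuming both the cache and the source can be represented as sets of RDF triples; the function name `cache_quality`, the `Triple` alias, and the sample data are hypothetical and are not taken from the paper, whose exact evaluation procedure may differ.

```python
from typing import Set, Tuple

# Assumption: a triple is modeled as a plain (subject, predicate, object) tuple.
Triple = Tuple[str, str, str]


def cache_quality(cache: Set[Triple], source: Set[Triple]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) of a local cache against the source (ground truth)."""
    correct = len(cache & source)  # triples in the cache that still exist at the source
    precision = correct / len(cache) if cache else 0.0
    recall = correct / len(source) if source else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical data: the source has changed one triple and added another
# since the cache was last updated.
source = {
    ("ex:a", "ex:p", "1"),
    ("ex:b", "ex:p", "2"),
    ("ex:c", "ex:p", "3"),
    ("ex:d", "ex:p", "4"),
}
cache = {
    ("ex:a", "ex:p", "1"),
    ("ex:b", "ex:p", "2"),
    ("ex:c", "ex:p", "0"),  # stale value
}

precision, recall, f1 = cache_quality(cache, source)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this sketch, stale triples lower precision and triples missing from the cache lower recall, so a policy that updates the right sources at the right time pushes both toward 1.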

Item Details

Item Type: Refereed Article
Keywords: resource description framework, linked data, measurement, estimation, scheduling, crawlers, linked open data, change propagation, evolving web data, RDF crawling, cache storage
Research Division: Information and Computing Sciences
Research Group: Data management and data science
Research Field: Data engineering and data science
Objective Division: Information and Communication Services
Objective Group: Information systems, technologies and services
Objective Field: Information systems, technologies and services not elsewhere classified
UTAS Author: Amin, MB (Dr Muhammad Bilal Amin)
ID Code: 141581
Year Published: 2018
Web of Science® Times Cited: 2
Deposited By: Information and Communication Technology
Deposited On: 2020-10-30
Last Modified: 2020-12-17
Downloads: 0
