Article | ACM SIGMOD Record | Sep 1, 2005 | Closed access

A survey of data provenance in e-science

Indiana University Bloomington

Indexed in Crossref

Abstract

Data management is growing in complexity as large-scale applications take advantage of the loosely coupled resources brought together by grid middleware and by abundant storage capacity. Metadata describing the data products used in and generated by these applications is essential to disambiguate the data and enable reuse. Data provenance, one kind of metadata, pertains to the derivation history of a data product starting from its original sources. In this paper we create a taxonomy of data provenance characteristics and apply it to current research efforts in e-science, focusing primarily on scientific workflow approaches. The main aspect of our taxonomy categorizes provenance systems based on why they record…

Citation impact

Total citations: 1,154
FWCI: 108.66
Percentile: 100%
References: 25

Authors

3

Topics & keywords

Keywords
  • Metadata
  • Computer science
  • Workflow
  • e-Science
  • Provenance
  • Taxonomy (biology)
  • Data science
  • Reuse
UN Sustainable Development Goals
  • Industry, innovation and infrastructure