Article · Oct 1, 2016 · Closed access
The Synthetic Data Vault
Indexed in Crossref
Abstract
The goal of this paper is to build a system that automatically creates synthetic data to enable data science endeavors. To achieve this, we present the Synthetic Data Vault (SDV), a system that builds generative models of relational databases. We are able to sample from the model and create synthetic data, hence the name SDV. When implementing the SDV, we also developed an algorithm that computes statistics at the intersection of related database tables. We then used a state-of-the-art multivariate modeling approach to model this data. The SDV iterates through all possible relations, ultimately creating a model for the entire database. Once this model is computed, the same relational information allows the SDV…
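The pipeline the abstract outlines (compute statistics where related tables intersect, fit a multivariate model, then sample) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the SDV's actual code or API: the toy tables, the column names, the helper functions, the choice of count and mean as the per-parent statistics, and the use of a plain multivariate Gaussian as a stand-in for the paper's modeling approach.

```python
import math
import random
from collections import defaultdict


def extend_parent(parents, children, key, child_col):
    """Attach per-parent statistics of the child rows (count and mean)."""
    groups = defaultdict(list)
    for row in children:
        groups[row[key]].append(row[child_col])
    extended = []
    for parent in parents:
        values = groups.get(parent[key], [])
        new_row = dict(parent)
        new_row["child_count"] = float(len(values))
        new_row["child_mean"] = sum(values) / len(values) if values else 0.0
        extended.append(new_row)
    return extended


def fit_gaussian(rows, cols):
    """Estimate the mean vector and sample covariance matrix of the columns."""
    n = len(rows)
    mean = [sum(r[c] for r in rows) / n for c in cols]
    cov = [
        [
            sum((r[a] - mean[i]) * (r[b] - mean[j]) for r in rows) / (n - 1)
            for j, b in enumerate(cols)
        ]
        for i, a in enumerate(cols)
    ]
    return mean, cov


def cholesky(matrix, jitter=1e-9):
    """Lower-triangular factor; the jitter keeps near-singular inputs usable."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(max(matrix[i][i] - s, 0.0) + jitter)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def sample_row(mean, cov, rng):
    """Draw one synthetic row from the fitted Gaussian."""
    lower = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [
        m + sum(lower[i][k] * z[k] for k in range(i + 1))
        for i, m in enumerate(mean)
    ]


# Hypothetical toy database: one parent table and one child table.
customers = [
    {"customer_id": 1, "age": 30.0},
    {"customer_id": 2, "age": 45.0},
    {"customer_id": 3, "age": 52.0},
]
orders = [
    {"customer_id": 1, "amount": 10.0},
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 2, "amount": 5.0},
    {"customer_id": 3, "amount": 7.0},
    {"customer_id": 3, "amount": 9.0},
    {"customer_id": 3, "amount": 11.0},
]

extended = extend_parent(customers, orders, "customer_id", "amount")
columns = ["age", "child_count", "child_mean"]
mean, cov = fit_gaussian(extended, columns)
synthetic = sample_row(mean, cov, random.Random(0))
```

Each sampled row holds a synthetic parent attribute together with summary statistics of its child rows, so the relationship between the two tables is carried by the model itself. A real system would also need to handle categorical columns, missing values, and deeper chains of relations, none of which this sketch attempts.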
Citation impact
- Total citations: 557
- FWCI: 11.99
- Percentile: 100%
- References: 13
Authors: 3
Topics & keywords
Topics
Keywords
- Computer science
- Synthetic data
- Data modeling
- Data mining
- Relational database
- Data model (GIS)
- Intersection (aeronautics)
- Generative model
UN Sustainable Development Goals
- Industry, innovation and infrastructure