Article · Jun 25, 2003 · Closed access

Reclaiming space from duplicate files in a serverless distributed file system

Microsoft (United States)

Indexed in Crossref

Abstract

The Farsite distributed file system provides availability by replicating each file onto multiple desktop computers. Since this replication consumes significant storage space, it is important to reclaim used space where possible. Measurement of over 500 desktop file systems shows that nearly half of all consumed space is occupied by duplicate files. We present a mechanism to reclaim space from this incidental duplication to make it available for controlled file replication. Our mechanism includes: (1) convergent encryption, which enables duplicate files to be coalesced into the space of a single file, even if the files are encrypted with different users' keys; and (2) SALAD, a Self-Arranging Lossy Associative…
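The convergent-encryption idea named in the abstract can be illustrated with a small sketch. It is not the paper's implementation. It assumes the usual formulation, in which the file key is derived from a hash of the file's contents and only that key is protected per user. The cipher is a toy SHA-256 keystream XOR chosen so the example needs nothing beyond the standard library, and it is not secure. The names `convergent_encrypt`, `convergent_decrypt`, and `user_secret` are hypothetical.

```python
import hashlib


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream built from SHA-256(key || counter). Illustrative only, NOT secure."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _xor(data: bytes, stream: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(a ^ b for a, b in zip(data, stream))


def convergent_encrypt(plaintext: bytes, user_secret: bytes):
    """Return (ciphertext, wrapped_key) for one user's copy of a file."""
    # Assumption: the file key is a hash of the content, so identical
    # files always yield the same key and therefore the same ciphertext.
    file_key = hashlib.sha256(plaintext).digest()
    ciphertext = _xor(plaintext, _keystream(file_key, len(plaintext)))
    # Only the small file key is protected under the user's own secret
    # (a toy wrap here; a real system would use proper key encryption).
    wrapped_key = _xor(file_key, _keystream(user_secret, len(file_key)))
    return ciphertext, wrapped_key


def convergent_decrypt(ciphertext: bytes, wrapped_key: bytes, user_secret: bytes) -> bytes:
    """Unwrap the file key with the user's secret, then decrypt the file."""
    file_key = _xor(wrapped_key, _keystream(user_secret, len(wrapped_key)))
    return _xor(ciphertext, _keystream(file_key, len(ciphertext)))


# Two users store the same content under different secrets.
c_alice, k_alice = convergent_encrypt(b"same bytes", b"alice-secret")
c_bob, k_bob = convergent_encrypt(b"same bytes", b"bob-secret")
print(c_alice == c_bob)  # ciphertexts match, so storage can keep one copy
print(k_alice == k_bob)  # wrapped keys differ per user
```

Because the two ciphertexts are byte-for-byte identical, a storage host could coalesce them without being able to read the contents, while each user keeps a separately wrapped key.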

Citation impact

  • Total citations: 643
  • FWCI: 4.97
  • Percentile: 100%
  • References: 60

Authors

5

Topics & keywords

Keywords
  • Computer science
  • Scalability
  • File system
  • Replication (statistics)
  • Computer file
  • Self-certifying File System
  • File Control Block
  • Torrent file