pStore: A Secure Peer-to-Peer Backup System

Christopher Batten, Kenneth Barr, Arvind Saraf, Stanley Trepetin
December 8, 2001
MIT 6.824

Abstract

This is the abstract from the paper linked below.

In an effort to combine research in peer-to-peer systems with techniques for incremental backup systems, we propose pStore: a secure distributed backup system based on an adaptive peer-to-peer network. pStore exploits unused personal hard drive space attached to the Internet to provide the distributed redundancy needed for reliable and effective data backup. Experiments on a 30 node network show that 95% of the files in a 13 MB dataset can be retrieved even when 7 of the nodes have failed. On top of this reliability, pStore includes support for file encryption, replication, versioning, and sharing. Its custom versioning system permits arbitrary version retrieval similar to CVS. pStore provides this functionality at less than 10% of the network bandwidth and requires 85% less storage capacity than simpler local tape backup schemes for a representative workload.


LCS Technical Memo 632
(PDF) (bibtex)
You can see how the project evolved by looking at:
Paper (174k PDF)
Proposal
Progress Report (basically a sketch of the final paper; 150k PDF)

FAQ

Q.
 
> I saw your page on pStore here:
> http://catfish.csail.mit.edu/~kbarr/pstore/
>
> I'm curious if anything has come of that, and if you would like some
> help?
A.

The pStore project was purely a research project -- quite tangential 
to what we normally work on, so we have no plans to continue it. 

We used the Chord infrastructure as our underlying peer-to-peer
network and that is available here: http://www.pdos.lcs.mit.edu/chord/

Since working on the project, we have spoken with a few companies that
were interested in our work, so there may very well be a commercial
application reminiscent of pStore in the near future. Until then, you
might want to check out the following two projects:
 
 * Hivecache (http://www.hivecache.com) aka allmydata.com
 * DIBS (http://www.csua.berkeley.edu/~emin/source_code/dibs)

If peer-to-peer isn't essential, you might be able to take advantage
of a distributed file system like Coda, or of RAID+NBD (now in the
Linux kernel). We didn't look at either of these ourselves.

Let us know if you have any more questions.  While we have moved on to
other things, it's always nice to see someone take an interest. 

Updates 2005-12-05