DO NOT REPLY [Bug 4693] New: Amazon S3 storage interface for rsync

samba-bugs at samba.org
Wed Jun 13 15:19:15 GMT 2007


https://bugzilla.samba.org/show_bug.cgi?id=4693

           Summary: Amazon S3 storage interface for rsync
           Product: rsync
           Version: 3.0.0
          Platform: Other
        OS/Version: Linux
            Status: NEW
          Severity: enhancement
          Priority: P3
         Component: core
        AssignedTo: wayned at samba.org
        ReportedBy: rsync.20.bdixon at xoxy.net
         QAContact: rsync-qa at samba.org


Amazon last year launched its "Simple Storage Service" (S3):

---

Amazon S3 is intentionally built with a minimal feature set.

    * Write, read, and delete objects containing from 1 byte to 5 gigabytes of
data each. The number of objects you can store is unlimited.
    * Each object is stored and retrieved via a unique, developer-assigned key.
    * Authentication mechanisms are provided to ensure that data is kept secure
from unauthorized access. Objects can be made private or public, and rights can
be granted to specific users.
    * Uses standards-based REST and SOAP interfaces designed to work with any
Internet-development toolkit.
    * Built to be flexible so that protocol or functional layers can easily be
added.  Default download protocol is HTTP.  A BitTorrent(TM) protocol interface
is provided to lower costs for high-scale distribution.  Additional interfaces
will be added in the future. 

---
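
For reference, the REST interface described above boils down to signed HTTP
requests. The following is a minimal, illustrative Python sketch (not part of
any rsync code) of a PUT request using the HMAC-SHA1 signing scheme S3
documents for its REST API; the access key, secret key, bucket and object
names are placeholders.

    import base64, hashlib, hmac, http.client
    from email.utils import formatdate

    ACCESS_KEY = "AKIAEXAMPLE"           # placeholder credentials
    SECRET_KEY = b"placeholder-secret"
    BUCKET, KEY = "testrsync", "testfile"

    def put_object(data):
        date = formatdate(usegmt=True)
        # Sign verb, Content-MD5 (empty here), Content-Type, Date and the
        # canonicalized resource with HMAC-SHA1 of the secret key.
        string_to_sign = "PUT\n\napplication/octet-stream\n%s\n/%s/%s" % (
            date, BUCKET, KEY)
        signature = base64.b64encode(
            hmac.new(SECRET_KEY, string_to_sign.encode("utf-8"),
                     hashlib.sha1).digest()).decode("ascii")
        conn = http.client.HTTPConnection("s3.amazonaws.com")
        conn.request("PUT", "/%s/%s" % (BUCKET, KEY), body=data, headers={
            "Date": date,
            "Content-Type": "application/octet-stream",
            "Authorization": "AWS %s:%s" % (ACCESS_KEY, signature),
        })
        return conn.getresponse().status  # 200 on success

A real integration would of course be written against rsync's C code; the
sketch is only meant to show how small the wire protocol is.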

I would like to see rsync support S3 as a storage target. There are utilities
that perform rsync-like functionality against S3, but they are, IMHO, inferior
to rsync.

I'm willing to provide funded S3 access credentials and a small incentive
payment upon completion (i.e. integration into the standard rsync release) to
recognized, qualified rsync developers. Obviously this all has to be
negotiated.

Since S3 is not a filesystem, conventions will need to be created for how
filesystem metadata (permissions, etc.) is stored on S3. S3 supports whole-file
checksums using the MD5 algorithm.
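
To make that concrete, one possible convention (purely hypothetical, not an
existing standard) would be to keep mode/uid/gid/mtime in x-amz-meta-* headers
on each object and to use the object's ETag, which for an ordinary upload is
the hex MD5 of its contents, as the whole-file checksum. Sketched in Python:

    import hashlib, os

    def metadata_headers(path):
        # Hypothetical mapping of stat() fields onto user-defined S3 headers.
        st = os.stat(path)
        return {
            "x-amz-meta-mode":  "%o" % (st.st_mode & 0o7777),
            "x-amz-meta-uid":   str(st.st_uid),
            "x-amz-meta-gid":   str(st.st_gid),
            "x-amz-meta-mtime": str(int(st.st_mtime)),
        }

    def file_md5_hex(path):
        # Whole-file MD5, comparable to the object's ETag (quotes stripped).
        h = hashlib.md5()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()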

This is a project that I'd like to see completed for personal use... not a
corporate-funded effort, so don't get starry-eyed. I think it would be a wildly
popular feature based on the uptake S3 is getting. Something like ~5 billion
objects are already stored on S3, so there is plenty of use going on out there.

Usage:

rsync s3.amazonaws.com::
 List buckets on S3

rsync s3.amazonaws.com::testrsync
 List contents of the testrsync bucket

rsync --create-bucket s3.amazonaws.com::testrsync
 Create the bucket if it does not exist

rsync s3.amazonaws.com::testrsync/testfile ./testfile
 Transfer testfile from the testrsync bucket

rsync --create-bucket ./testfile s3.amazonaws.com::testrsync/testfile
 Transfer testfile to the testrsync bucket

rsync -avz $HOME s3.amazonaws.com::testrsync
 Transfer the contents of $HOME recursively to the testrsync bucket, preserving
everything.

rsync -avz s3.amazonaws.com::testrsync $HOME
 Bring it all back.
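
The daemon-style syntax above implies a straightforward mapping from
host::module/path onto buckets and keys. A rough sketch of that parsing (the
function name and return convention are made up for illustration):

    def split_s3_target(target):
        # "s3.amazonaws.com::"                    -> list all buckets
        # "s3.amazonaws.com::testrsync"           -> list one bucket
        # "s3.amazonaws.com::testrsync/testfile"  -> a single object
        host, sep, rest = target.partition("::")
        if not sep or host != "s3.amazonaws.com":
            return None                       # not an S3 target
        bucket, _, key = rest.partition("/")
        return (bucket or None, key or None)  # (None, None) = list buckets

    assert split_s3_target("s3.amazonaws.com::") == (None, None)
    assert split_s3_target("s3.amazonaws.com::testrsync") == ("testrsync", None)
    assert split_s3_target("s3.amazonaws.com::testrsync/testfile") == \
        ("testrsync", "testfile")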

There will also need to be some S3 permission modifiers that apply to bucket
and object creation:

--s3-public-read
--s3-public-read-write
--s3-private

That doesn't cover all of the ACL options S3 offers, but those are the ones I
use.
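
These flags would presumably map onto S3's canned ACLs, which are applied by
sending an x-amz-acl header when a bucket or object is created. A tiny
illustrative sketch of that mapping (the wiring is my assumption; only the
canned ACL values come from S3 itself):

    # Proposed rsync flag -> S3 canned ACL sent in the x-amz-acl header.
    S3_ACL_FLAGS = {
        "--s3-private":           "private",
        "--s3-public-read":       "public-read",
        "--s3-public-read-write": "public-read-write",
    }

    def acl_header(flag):
        # Extra header a bucket/object creation request would carry.
        return {"x-amz-acl": S3_ACL_FLAGS[flag]}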

Contact me if you are interested in working on this.

