Skip to content

Amazon Simple Storage Service (S3) resource plugin for iRODS

License

Notifications You must be signed in to change notification settings

earlephilhower/irods_resource_plugin_s3

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

69 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

iRODS S3 Resource Plugin
------------------------

To build the S3 Resource Plugin, you will need to have:

 - the iRODS Development Tools (irods-dev and irods-runtime) installed for your platform
     http://irods.org/download

 - libxml2-dev / libxml2-devel

 - libcurl-dev / curl-devel

 - libs3 installed for your platform
     https://github.com/irods/libs3


Then to build this resource plugin:

  ./packaging/build.sh


To use this resource plugin:

  irods@hostname $ iadmin mkresc compResc compound
  irods@hostname $ iadmin mkresc cacheResc unixfilesystem <hostname>:</full/path/to/Vault>
  irods@hostname $ iadmin mkresc archiveResc s3 <hostname>:/<s3BucketName>/irods/Vault "S3_DEFAULT_HOSTNAME=s3.amazonaws.com;S3_AUTH_FILE=</full/path/to/AWS.keypair>;S3_RETRY_COUNT=<num reconn tries>;S3_WAIT_TIME_SEC=<wait between retries>;S3_PROTO=<HTTP|HTTPS>"
  irods@hostname $ iadmin addchildtoresc compResc cacheResc cache
  irods@hostname $ iadmin addchildtoresc compResc archiveResc archive
  irods@hostname $ iput -R compResc foo.txt

The AWS/S3 keypair file should have two values (Access Key ID and Secret Access Key):

  AKDJFH4KJHFCIOBJ5SLK
  rlgjolivb7293r928vu98n498ur92jfgsdkjfh8e

You may specify more than one host:IP as the S3_DEFAULT_HOSTNAME by listing them with a comma (,) between them:
ex:  S3_DEFAULT_HOSTNAME=192.168.122.128:443,192.168.122.129:443,192.168.122.130:443

To control multipart uploads, add the resource variables "S3_MPU_CHUNK" and "S3_MPU_THREADS" to the creation line.
* S3_MPU_CHUNK is the size of each part to be uploaded in parallel (in MB, default is 5MB).  Objects smaller than this will be uploaded with standard PUTs.
* S3_MPU_THREADS is the number of parts to upload in parallel (only under Linux, default is 10).  On non-Linux OSes, this parameter is ignored and multipart uploads are performed sequentially.

To ensure end-to-end data integrity, MD5 checksums can be calculated and used for S3 uploads.  Note that this requires 2x the disk IO (because the file must first be read to calculate the MD5 before the S3 upload can start) and a corresponding increase in CPU usage
S3_ENABLE_MD5=[0|1]  (default is 0, off)

S3 server side encryption can be enabled using the parameter S3_SERVER_ENCRYPT=[0|1] (default=0=off).  This is not the same as HTTPS, and implies that the data will be stored on disk encrypted.  To encrypt during the network transport to S3, please use S3_PROTO=HTTPS (the default)

About

Amazon Simple Storage Service (S3) resource plugin for iRODS

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • C++ 76.4%
  • Python 15.1%
  • Shell 7.3%
  • Makefile 1.2%