This project was created for fun and to practice with the Google Cloud Storage API.
Before using the gscp.py utility, you must provide your Google credentials through the GOOGLE_APPLICATION_CREDENTIALS environment variable.
For example:
export GOOGLE_APPLICATION_CREDENTIALS=~/google-credentials.json
For more information about Google credentials, see: https://cloud.google.com/docs/authentication/getting-started
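As a minimal sketch, a script could check this variable before it tries to reach Google Cloud Storage. The helper name `require_credentials_path` is hypothetical and is not taken from gscp.py; it only verifies that the variable is set and points at an existing file.

```python
import os
from pathlib import Path


def require_credentials_path(env=os.environ):
    """Return the credentials file path, or raise a descriptive error.

    Hypothetical helper for illustration only; not part of gscp.py.
    """
    raw = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not raw:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    # Expand a leading "~" in case the value was stored unexpanded.
    path = Path(raw).expanduser()
    if not path.is_file():
        raise RuntimeError(f"Credentials file not found: {path}")
    return path
```

Failing early this way gives a clearer message than an authentication error raised later, in the middle of a download.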
- python = "^3.9"
- poetry = "^0.1.0"
git clone https://github.com/com30n/gsutil-cp-reimplementation.git
cd gsutil-cp-reimplementation
poetry install
poetry shell
usage: gscp.py [-h] [-r] [-m PARALLEL] [--debug] src_url dst_url

positional arguments:
  src_url               The whole bucket path with the schema gs://, what do we have to copy
  dst_url               The local path where we should save the copied file

optional arguments:
  -h, --help            show this help message and exit
  -r, --recursive       Download files recursively
  -m PARALLEL, --parallel PARALLEL
                        Start copying in parallel. Specify the number of threads
  --debug               Show debug info
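The usage text above has the shape that Python's argparse produces. The following is a sketch of a parser that would accept the same arguments; it is not the actual gscp.py source, and the `int` type and the default of `1` for `--parallel` are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the usage text (illustrative sketch)."""
    parser = argparse.ArgumentParser(prog="gscp.py")
    parser.add_argument("src_url", help="Bucket path to copy, including the gs:// scheme")
    parser.add_argument("dst_url", help="Local path where the copied files are saved")
    parser.add_argument("-r", "--recursive", action="store_true",
                        help="Download files recursively")
    # Assumption: the thread count is an integer that defaults to 1.
    parser.add_argument("-m", "--parallel", type=int, default=1,
                        help="Number of threads to copy with")
    parser.add_argument("--debug", action="store_true", help="Show debug info")
    return parser


# Parse an explicit argument list instead of sys.argv, for demonstration.
args = build_parser().parse_args(["-m", "10", "-r", "gs://my-own-bucket/", "/tmp/test"])
```

After this call, `args.parallel` is `10`, `args.recursive` is `True`, and `args.debug` is `False`.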
Download all files from the bucket in parallel:
./gscp.py -m 10 -r gs://my-own-bucket/ /tmp/test
Download a single file from the bucket:
./gscp.py gs://my-own-bucket/mydir/a/1.txt /tmp/test
Download the files under a bucket directory recursively:
./gscp.py -r gs://my-own-bucket/mydir/ /tmp/test
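The commands above combine three steps: splitting the gs:// URL into a bucket and a prefix, choosing a local path for each object, and running the downloads on a thread pool. The sketch below shows one way these steps could fit together. All function names are hypothetical, the rule that the layout below the prefix is preserved locally is an assumption, and the real gscp.py may work differently. The `download` argument is a caller-supplied callable; in a real run it might wrap `Blob.download_to_filename` from the google-cloud-storage library.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path, PurePosixPath


def split_gs_url(url):
    """Split 'gs://bucket/some/prefix' into ('bucket', 'some/prefix')."""
    if not url.startswith("gs://"):
        raise ValueError(f"Expected a gs:// URL, got: {url}")
    bucket, _, prefix = url[len("gs://"):].partition("/")
    if not bucket:
        raise ValueError(f"Missing bucket name in: {url}")
    return bucket, prefix


def local_path_for(object_name, prefix, dst_dir):
    """Map an object name to a local path under dst_dir.

    Assumption: the directory layout below the prefix is kept locally.
    """
    relative = object_name
    if object_name.startswith(prefix):
        relative = object_name[len(prefix):].lstrip("/")
    if not relative:
        # The URL named a single object, so keep only its file name.
        relative = PurePosixPath(object_name).name
    return Path(dst_dir).joinpath(*PurePosixPath(relative).parts)


def copy_all(object_names, prefix, dst_dir, download, workers=1):
    """Call download(object_name, local_path) for each object on a thread pool."""
    def task(name):
        target = local_path_for(name, prefix, dst_dir)
        download(name, target)
        return target

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all tasks and re-raises the first worker exception.
        return list(pool.map(task, object_names))
```

Passing the download step in as a callable keeps the sketch runnable without network access and makes the path logic easy to test with a stub.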
Before pushing a commit, please run the make fmt command, which reformats the code and runs the syntax checks and linters.
Still to do: add some unit tests.