Treasure Data API library for Python

Requirements

td-client supports the following versions of Python.

Python 2.7+
Python 3.3+
PyPy

Install

You can install the releases from PyPI.

$ pip install td-client

Installing certifi is recommended, since it enables SSL certificate verification.

$ pip install certifi

Examples

See also the examples in the Treasure Data documentation.

Listing jobs

If no API key is passed as an argument to tdclient.Client, the Treasure Data API key is read from the TD_API_KEY environment variable.

import tdclient

with tdclient.Client() as td:
    for job in td.jobs():
        print(job.job_id)
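
The lookup order described above (an explicit argument first, then the TD_API_KEY environment variable) can be sketched with the standard library alone. The helper name resolve_api_key and its environ parameter are hypothetical and are not part of tdclient; the sketch only illustrates the precedence, not how the library implements it.

```python
import os


def resolve_api_key(apikey=None, environ=None):
    """Return the explicit key if given, else TD_API_KEY from the environment.

    Hypothetical helper for illustration; not part of tdclient.
    """
    env = os.environ if environ is None else environ
    if apikey is not None:
        # An explicitly passed key always wins over the environment.
        return apikey
    key = env.get("TD_API_KEY")
    if key is None:
        raise ValueError("no API key: pass one explicitly or set TD_API_KEY")
    return key
```

Assuming the client accepts the key through an apikey keyword argument, the result could be passed on as tdclient.Client(apikey=resolve_api_key()).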

Running jobs

Run a query on Treasure Data, wait for the job to finish, and print its result rows.

import tdclient

with tdclient.Client() as td:
    job = td.query("sample_datasets", "SELECT COUNT(1) FROM www_access", type="hive")
    job.wait()
    for row in job.result():
        print(repr(row))

Running jobs via DBAPI2

td-client-python implements PEP 0249, the Python Database API v2.0. You can use td-client-python with external libraries that support the Database API, such as pandas.

import pandas
import tdclient

def on_waiting(cursor):
    print(cursor.job_status())

with tdclient.connect(db="sample_datasets", type="presto", wait_callback=on_waiting) as td:
    data = pandas.read_sql("SELECT symbol, COUNT(1) AS c FROM nasdaq GROUP BY symbol", td)
    print(repr(data))

We offer another package for pandas, named pandas-td, with some advanced features. You may prefer it for more complicated tasks, such as exporting result data to Treasure Data or printing a job's progress during long executions.
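
pandas is not required to use a PEP 0249 interface: any conforming connection exposes cursor(), execute() and fetchall(). The following is a minimal sketch of that generic pattern. The helper fetch_rows is hypothetical, and the standard library's sqlite3 module stands in for a Treasure Data connection so the example runs without credentials; the assumption is that a connection returned by tdclient.connect can be passed in the same way.

```python
import sqlite3


def fetch_rows(connection, sql):
    """Run a query through any PEP 249 connection and return all rows."""
    cursor = connection.cursor()
    try:
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        # Close the cursor whether or not the query succeeded.
        cursor.close()


# sqlite3 is only a stand-in here; the table and rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nasdaq (symbol TEXT)")
conn.executemany("INSERT INTO nasdaq VALUES (?)", [("AAPL",), ("AAPL",), ("MSFT",)])
rows = fetch_rows(
    conn,
    "SELECT symbol, COUNT(1) AS c FROM nasdaq GROUP BY symbol ORDER BY symbol",
)
print(rows)
conn.close()
```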

Importing data

Import data into Treasure Data in a streaming manner, similar to what fluentd does.

import sys
import tdclient

with tdclient.Client() as td:
    # sys.argv[0] is the script name, so the input files start at index 1
    for file_name in sys.argv[1:]:
        td.import_file("mydb", "mytbl", "csv", file_name)

Bulk import

Import data into Treasure Data in batches.

from __future__ import print_function
import sys
import tdclient
import time
import warnings

if len(sys.argv) <= 1:
    sys.exit(0)

with tdclient.Client() as td:
    session_name = "session-%d" % (int(time.time()),)
    bulk_import = td.create_bulk_import(session_name, "mydb", "mytbl")
    try:
        for file_name in sys.argv[1:]:
            part_name = "part-%s" % (file_name,)
            bulk_import.upload_file(part_name, "json", file_name)
        bulk_import.freeze()
    except:
        bulk_import.delete()
        raise
    bulk_import.perform(wait=True)
    if 0 < bulk_import.error_records:
        warnings.warn("detected %d error records." % (bulk_import.error_records,))
    if 0 < bulk_import.valid_records:
        print("imported %d records." % (bulk_import.valid_records,))
    else:
        raise RuntimeError("no records have been imported: %s" % (repr(bulk_import.name),))
    bulk_import.commit(wait=True)
    bulk_import.delete()
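
The example above builds each part name directly from the file name given on the command line, so a path such as a/x.json yields a part name containing a directory separator, and two files with the same base name need distinct names. Which characters the service accepts in part names is not stated here. As a cautious sketch, the hypothetical helper below (not part of tdclient) derives part names from base names only and disambiguates duplicates with a numeric suffix.

```python
import os


def make_part_names(file_names, prefix="part-"):
    """Pair each file path with a unique part name built from its base name.

    Hypothetical helper for illustration; not part of tdclient.
    """
    used = set()
    pairs = []
    for file_name in file_names:
        base = prefix + os.path.basename(file_name)
        candidate = base
        suffix = 0
        # Append -1, -2, ... until the name has not been handed out yet.
        while candidate in used:
            suffix += 1
            candidate = "%s-%d" % (base, suffix)
        used.add(candidate)
        pairs.append((file_name, candidate))
    return pairs
```

The resulting pairs could replace the part_name computation in the upload loop, for example by iterating over make_part_names(sys.argv[1:]).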

Development

Running tests

Run tests.

$ python setup.py test

Running tests (tox)

You can run tests against all supported Python versions. Installing pyenv is recommended for managing multiple Python versions.

$ pyenv shell system
$ for version in $(cat .python-version); do [ -d "$(pyenv root)/versions/${version}" ] || pyenv install "${version}"; done
$ pyenv shell --unset

Install tox.

$ pip install tox

Then, run tox.

$ tox

Release

Release to PyPI.

$ python setup.py sdist upload

Version History

See CHANGELOG.md.

License

Apache Software License, Version 2.0
