TFSnippet


TFSnippet is a set of utilities for writing and testing TensorFlow models.

The design philosophy of TFSnippet is non-interfering. It aims to provide a set of useful utilities that can be used alongside any other TensorFlow library or framework.

Dependencies

TensorFlow >= 1.5

Installation

pip install git+https://github.com/haowen-xu/tfsnippet.git

Documentation

Tutorials and API docs

Examples

Classification:
- MNIST.
- Convolutional MNIST.
Auto Encoders:
- VAE.
- Convolutional VAE: VAE with convolutional layers.
- Bernoulli Latent VAE: VAE with q(z|x) and p(z) both being Bernoulli distributions.
- Mixture Prior VAE: VAE with p(z) being a mixture of Gaussians.
- Gaussian Mixture VAE: GM-VAE, with auxiliary categorical variable y.
- Planar Normalizing Flow: VAE with q(z|x) derived by a planar normalizing flow.
- Dense Real NVP: VAE with q(z|x) derived by a Real NVP (with dense layers).

Quick Tutorial

To begin, import TFSnippet as:

import tfsnippet as spt

Distributions

If you use TFSnippet distribution classes to obtain random samples, you get enhanced tensor objects, from which you can compute the log-likelihood by simply calling log_prob().

normal = spt.Normal(0., 1.)
# The type of `samples` is :class:`tfsnippet.stochastic.StochasticTensor`.
samples = normal.sample(n_samples=100)
# You may obtain the log-likelihood of `samples` under `normal` by:
log_prob = samples.log_prob()
# You may also obtain the distribution instance back from the samples,
# such that you may fire-and-forget the distribution instance!
distribution = samples.distribution
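
For intuition about what log_prob() returns for a normal distribution, the following sketch evaluates the textbook Gaussian log-density by hand, using only the Python standard library. It does not call TFSnippet, and the helper name `normal_log_prob` is hypothetical, introduced only for this illustration.

```python
import math
import random


def normal_log_prob(x, mean=0.0, std=1.0):
    """Log-density of a univariate normal N(mean, std**2) at x.

    Hypothetical helper for illustration; not part of TFSnippet.
    """
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


# Draw 100 samples from N(0, 1) and evaluate their log-densities.
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(100)]
log_probs = [normal_log_prob(s) for s in samples]
```

The log-density is largest at the mean, so every entry of `log_probs` is bounded above by `normal_log_prob(0.0)`.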

Distributions from ZhuSuan can be cast into a TFSnippet distribution class, in case we haven't provided a wrapper for a certain ZhuSuan distribution:

import zhusuan as zs

uniform = spt.as_distribution(zs.distributions.Uniform())
# The type of `samples` is :class:`tfsnippet.stochastic.StochasticTensor`.
samples = uniform.sample(n_samples=100)

Data Flows

It is common practice to iterate through a dataset in mini-batches. tfsnippet.DataFlow provides a unified interface for assembling mini-batch iterators.

# Obtain a shuffled, two-array data flow, with batch size 64.
# Any batch with fewer than 64 samples is discarded.
flow = spt.DataFlow.arrays(
    [x, y], batch_size=64, shuffle=True, skip_incomplete=True)
for batch_x, batch_y in flow:
    ...  # Do something with batch_x and batch_y

# You may use a threaded data flow to prefetch the mini-batches
# in a background thread.  The threaded flow is a context object,
# where exiting the context would destroy the background thread.
with flow.threaded(prefetch=5) as threaded_flow:
    for batch_x, batch_y in threaded_flow:
        ...  # Do something with batch_x and batch_y

# If you use `MLSnippet <https://github.com/haowen-xu/mlsnippet>`_,
# you can even load data from a MongoDB via data flow.  Suppose you
# have stored all images from ImageNet into a GridFS (of MongoDB),
# along with the labels stored as ``metadata.y``.
# You may iterate through the ImageNet in batches by:
from mlsnippet.datafs import MongoFS

fs = MongoFS('mongodb://localhost', 'imagenet', 'train')
with fs.as_flow(batch_size=64, with_names=False, meta_keys=['y'],
                shuffle=True, skip_incomplete=True) as flow:
    for batch_x, batch_y in flow:
        ...  # Do something with batch_x and batch_y.  batch_x is the
             # raw content of images you stored into the GridFS.
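
To make the meaning of `shuffle` and `skip_incomplete` concrete, here is a minimal pure-Python sketch of a mini-batch iterator over aligned sequences. It is an illustrative stand-in, not the TFSnippet implementation; the function `iter_batches` and its `seed` parameter are hypothetical.

```python
import random


def iter_batches(arrays, batch_size, shuffle=False, skip_incomplete=False,
                 seed=None):
    """Yield tuples of aligned mini-batches from equally sized sequences.

    Hypothetical stand-in for illustration; not the TFSnippet implementation.
    """
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError('all arrays must have the same length')
    indices = list(range(n))
    if shuffle:
        # The same permutation is applied to every array, so that
        # corresponding elements stay aligned.
        random.Random(seed).shuffle(indices)
    for start in range(0, n, batch_size):
        batch_idx = indices[start:start + batch_size]
        if skip_incomplete and len(batch_idx) < batch_size:
            continue  # drop the final, shorter batch
        yield tuple([a[i] for i in batch_idx] for a in arrays)


# Ten samples with batch size 4: two full batches, and the remaining
# two samples are discarded because skip_incomplete=True.
x = list(range(10))
y = [v * v for v in x]
batches = list(iter_batches([x, y], batch_size=4, shuffle=True,
                            skip_incomplete=True, seed=0))
```

Without `skip_incomplete`, the same call would yield a third batch holding the two leftover samples.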

Training

After you've built the model and obtained the training operation, you can quickly run a training loop using utilities from TFSnippet:

import tensorflow as tf

input_x = ...  # the input x placeholder
input_y = ...  # the input y placeholder
loss = ...  # the training loss
params = tf.trainable_variables()  # the trainable parameters

# We adopt learning-rate annealing: the initial learning rate is
# 0.001, and we anneal it by a factor of 0.99995 after every step.
learning_rate = spt.AnnealingVariable('learning_rate', 0.001, 0.99995)

# Build the training operation by AdamOptimizer
optimizer = tf.train.AdamOptimizer(learning_rate)
train_op = optimizer.minimize(loss, var_list=params)

# Build the training data-flow
train_flow = spt.DataFlow.arrays(
    [train_x, train_y], batch_size=64, shuffle=True, skip_incomplete=True)
# Build the validation data-flow
valid_flow = spt.DataFlow.arrays([valid_x, valid_y], batch_size=256)

with spt.TrainLoop(params, max_epoch=max_epoch, early_stopping=True) as loop:
    trainer = spt.Trainer(loop, train_op, [input_x, input_y], train_flow,
                          metrics={'loss': loss})
    # Anneal the learning-rate after every step by 0.99995.
    trainer.anneal_after_steps(learning_rate, freq=1)
    # Do validation and apply early-stopping after every epoch.
    trainer.evaluate_after_epochs(
        spt.Evaluator(loop, loss, [input_x, input_y], valid_flow),
        freq=1
    )
    # You may log the learning-rate after every epoch by registering an
    # event handler.  You may of course add any other handlers as well.
    trainer.events.on(
        spt.EventKeys.AFTER_EPOCH,
        lambda epoch: trainer.loop.collect_metrics(lr=learning_rate),
    )
    # Print training metrics after every epoch.
    trainer.log_after_epochs(freq=1)
    # Run all the training epochs and steps.
    trainer.run()
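
To get a feel for how quickly the learning rate decays under the schedule above, the sketch below works out the arithmetic in plain Python. It assumes that "annealing by a factor of 0.99995" means multiplying the current value by 0.99995 once per step; the helper names `annealed_value` and `steps_to_halve` are hypothetical and do not come from TFSnippet.

```python
import math


def annealed_value(initial, ratio, n_steps):
    """Value after multiplying `initial` by `ratio` once per step.

    Assumes a purely multiplicative schedule (hypothetical helper).
    """
    return initial * ratio ** n_steps


def steps_to_halve(ratio):
    """Smallest step count after which the value is at most half of
    the initial value, for 0 < ratio < 1."""
    return math.ceil(math.log(0.5) / math.log(ratio))


# Learning rate after 10,000 training steps, starting from 0.001.
lr_after_10k = annealed_value(0.001, 0.99995, 10000)
# Number of steps needed for the learning rate to halve.
halving_steps = steps_to_halve(0.99995)
```

Under this assumption, the learning rate falls to about 61% of its initial value after 10,000 steps, which is a useful check when choosing the annealing factor for a given number of steps per epoch.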
