TensorFlow Datasets (tfds) implementations of various shape datasets.
git clone https://github.com/jackd/shape-tfds
cd shape-tfds
pip install -r requirements.txt
pip install .
Since the release of TensorFlow Datasets there has been an understandable surge in pull requests to the main repository. Due to this, and the high bar for quality set by the maintainers, there is a considerable backlog of PRs awaiting review (including several of my own). Having multiple PRs sitting in the queue for months on end causes two main issues:
- Poor visibility: Hacking datasets together is not particularly fun and nets you no research kudos. If I can save someone else the bother of finding download links, loading and transforming data, all the better. The prospect of people finding an obscure branch or fork and digging into the detail is quite low.
- Difficult/confusing packaging: for separate packages, I'd rather not depend on my own branch of a package as large as tfds. For people who want to quickly download my research projects and run them without much bother (and who have a lax attitude towards virtual environments etc.), having pip install a custom fork can go unnoticed at first and lead to much confusion later on.
Eventually, we would like to see this work merged into tfds. Due to its size, the merge will likely have to be split across multiple separate pull requests.
To make this process as simple as possible, we keep the directory structure identical. Creating pull requests should involve:
- changing import shape_tfds.REST to import tensorflow_datasets.REST
- removing import trimesh and replacing trimesh.blah calls with lazy_imports.trimesh.blah
- adding relevant dataset_files to tfds setup.py
- copying url_checksums across
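
The two import-related steps are mechanical enough to sketch as a text rewrite. The helper below is only an illustration under stated assumptions: port_source and the module name shape_tfds.example_module are hypothetical and not part of this repository, and the rewritten file would still need lazy_imports brought into scope the way tfds modules normally do.

```python
import re


def port_source(source: str) -> str:
    """Apply the import rewrites described above to one module's source text.

    Hypothetical helper for illustration only; it is not part of shape-tfds.
    """
    # shape_tfds.REST -> tensorflow_datasets.REST (the directory layout matches).
    source = re.sub(r"\bshape_tfds\b", "tensorflow_datasets", source)
    # Drop plain top-level `import trimesh` lines.
    source = re.sub(r"^import trimesh[ \t]*\n", "", source, flags=re.MULTILINE)
    # Route remaining trimesh.<attr> uses through lazy_imports. The lookbehind
    # skips names that are already qualified, so the rewrite is idempotent.
    source = re.sub(r"(?<![\w.])trimesh\.", "lazy_imports.trimesh.", source)
    return source


# Made-up input; `shape_tfds.example_module` is a placeholder module name.
EXAMPLE = (
    "import shape_tfds.example_module\n"
    "import trimesh\n"
    "mesh = trimesh.load(path)\n"
)

if __name__ == "__main__":
    print(port_source(EXAMPLE))
```

A regex pass like this only covers the simple cases named in the list (for example, it does not handle import trimesh as tm), so the result should still be reviewed by hand before opening a pull request.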
Under heavy development. Expect untested functionality, bugs and breaking changes.
TODO:
- shapenet core: work out a better way to produce compound configs (zipped/concat)
- add PaddedTensor tests
- shapenet part
- modelnet
- abc