
Handles a variety of datasets. The basic idea is to fetch each dataset in a convenient way and, most importantly, without any data transformation.

In general, the data is saved with shape N * C * X * Y.
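As an illustration of this layout, the sketch below builds a small dummy batch with numpy and indexes it. It assumes the usual reading of the axes (N samples, C channels, X and Y as the two spatial axes), which this README does not spell out; the array sizes are made up.

```python
import numpy as np

# Hypothetical batch: 4 grayscale images of 28 x 28 pixels, laid out as N * C * X * Y.
batch = np.zeros((4, 1, 28, 28), dtype=np.uint8)

n_samples, n_channels, size_x, size_y = batch.shape

# Indexing the first axis yields one sample of shape C * X * Y.
first_image = batch[0]

# Libraries that expect channels last (N * X * Y * C) need a transpose.
channels_last = batch.transpose(0, 2, 3, 1)
```

The transpose returns a view, so no pixel data is copied until the array is written to or made contiguous.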

mnist:

Fetched from chainer's dataset module. The resulting images are uint8 numpy arrays with values in 0-255, saved in a pyarrow zero-copy format.

usage:

from dataset.minist.feed import feed
path = "./save/"
feed(feed_path=path)

The training data is then stored in ./save/X.pa and the training labels in ./save/Y.pa.

to load the data into numpy, try:

from dataset import pa2np
X, Y = pa2np("./save/X.pa"), pa2np("./save/Y.pa")
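Because the data is stored without any transformation, scaling is left to the caller. The sketch below is one possible way to sanity-check the loaded arrays and scale the pixels to [0, 1]. `check_and_scale` is a hypothetical helper, not part of this repository, and the dummy arrays only stand in for the output of pa2np.

```python
import numpy as np

def check_and_scale(X, Y):
    """Validate raw uint8 images and return a float32 copy scaled to [0, 1]."""
    if X.dtype != np.uint8:
        raise TypeError(f"expected uint8 images, got {X.dtype}")
    if X.ndim != 4:
        raise ValueError(f"expected N * C * X * Y, got shape {X.shape}")
    if len(X) != len(Y):
        raise ValueError("X and Y must contain the same number of samples")
    return X.astype(np.float32) / 255.0

# Dummy data in place of pa2np("./save/X.pa") and pa2np("./save/Y.pa").
X = np.random.randint(0, 256, size=(8, 1, 28, 28), dtype=np.uint8)
Y = np.arange(8)
X_scaled = check_and_scale(X, Y)
```

Keeping the stored files as raw uint8 and scaling only at load time means the saved data stays identical to the source dataset.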

sample of resulting dataset:

cifar

Fetched from chainer's dataset module. The resulting images are uint8 numpy arrays with values in 0-255, saved in a pyarrow zero-copy format.

usage:

for cifar-10

from dataset.cifar.feed import feed
path = "./save/"
feed(feed_path=path, dataset_type=10)

The training data is then stored in ./save/X_10.pa and the training labels in ./save/Y_10.pa.

sample of resulting dataset:

for cifar-100

from dataset.cifar.feed import feed
path = "./save/"
feed(feed_path=path, dataset_type=100)

The training data is then stored in ./save/X_100.pa and the training labels in ./save/Y_100.pa.

sample of resulting dataset:

To load the data, see the mnist section above.
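The cifar file names above follow a simple pattern: the dataset_type value is appended to X and Y. A small sketch of that convention follows. `saved_paths` is a hypothetical helper written for illustration, not a function of this package, and it only mirrors the file names listed in this README.

```python
import os

def saved_paths(feed_path, dataset_type=None):
    """Return the (data, label) file paths described in this README."""
    # No suffix when feed() is called without a dataset_type (e.g. mnist).
    suffix = "" if dataset_type is None else f"_{dataset_type}"
    return (
        os.path.join(feed_path, f"X{suffix}.pa"),
        os.path.join(feed_path, f"Y{suffix}.pa"),
    )

x_path, y_path = saved_paths("./save/", dataset_type=100)
```

The resulting paths can then be passed to pa2np in the same way as in the mnist example.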

coil-20

The resulting images are uint8 numpy arrays with values in 0-255, saved in a pyarrow zero-copy format.

usage:

for unprocessed

from dataset.coil20.feed import feed
path = "./save/"
feed(feed_path=path, dataset_type='unprocessed')

The training data is then stored in ./save/X_unprocessed.pa and the training labels in ./save/Y_unprocessed.pa.

sample of resulting dataset:

for processed

from dataset.coil20.feed import feed
path = "./save/"
feed(feed_path=path, dataset_type='processed')

The training data is then stored in ./save/X_processed.pa and the training labels in ./save/Y_processed.pa.

To load the data, see the mnist section above.

sample of resulting dataset:

kaggle:

Below are some kaggle datasets.

fer2013

The resulting images are uint8 numpy arrays with values in 0-255, saved in a pyarrow zero-copy format.

usage:

from dataset.kaggle.fer2013.feed import feed
path = "./save/"
feed(feed_path=path)

The training data is then stored in ./save/X.pa and the training labels in ./save/Y.pa.

sample of resulting dataset:

whales2018

The resulting images and corresponding labels are saved into a dict object.

usage:

from dataset.kaggle.whales2018.feed import feed
path = "./save/"
feed(feed_path=path)

isbi 2012

usage:

from dataset.isbi.c2012.feed import feed, load
path = "./save/"
feed(feed_path=path)
# to get data:
data, mask = load(load_path=path)

overshiki/datasets
