The tools used to fetch flickr photos

Install

npm install

Since flickrnode module can not be installed by npm, you shoule git clone it into node_modules directory.

cd node_modules
git clone git://github.com/ciaranj/flickrnode.git

Usage

Search Flickr photos

coffee query.coffee -t <query term> -i <image download path> -f <feature storing path> -m <image processing path> -l <geolocation> -c <database collection>  -n <from datetime> -a <to datetime> -p <starting page> -o <total pages> -s <sorting option> -u <flickr user NSID>

For example:

coffee query.coffee -t paris -i ./test -f ./feature -m ./tmp -l 'Paris, France' -c paris -n "2012/4/1" -a "2012/4/5" -p 1 -o 2 -s interestingness-desc

Sample photos from fetched Flickr photos

sample.coffee -c <database collection> -f <feature storing path> -d <output file> -s <sample number>

For example:

sample.coffee -c paris -f ./features -d test.clu -s 1000

Clustering of sampled photos

python cluster.py -f <sampled photo features> -o <output file> -k <number of clusters>

In addition to "output file", another file "output file" + ".meta" will be created that contains mean and std feature vector of sampled features.

For example:

python cluster.py -f test.clu -o sample.kmean -k 64

Encoding raw image features as VLAD (vector of locally aggregated descriptors) feature

python vlad_encoder.py -c <codebook filename> -n <codebook metadata filename> -f <raw feature filename> -o <output filename>

For example:

python vlad_encoder.py -c flickr_sample_500.k_16 -n flickr_sample_500.k_16.meta -f features.hes -o features.vlad

Encoding all images in database collection as VLAD features

coffee vlad.coffee -c <database collection> -f <raw feature path> -v <vlad path> -b <codebook filename> -m <codebook metadata filename>

For example:

coffee vlad.coffee -c paris -f ./features -v ./vlad -b flickr_sample_500.k_16 -m flickr_sample_500.k_16.meta

Encoding all images in a specified list as VLAD features

coffee vlad_from_filelist.coffee -l <image list file> -f <raw feature path> -v <vlad storing path> -b <codebook> -m <codebook metadata filename>

For example:

coffee vlad_from_filelist.coffee -l image_list.txt -f ./features -v ./vlad -b flickr_sample_500.k_16 -m flickr_sample_500.k_16.meta

Validating VLAD feature performance

python vlad_validate_distance.py -d <vlad feature path> -q <query list file> -g <groundtruth filename>

For example:

python vlad_validate_distance.py -d ./vlad -q query.txt -g ground_truth.txt

Generating pairwise distance matrix for images

python vlad_pairwise_distance.py -d <vlad feature path> -o <output filename>

For example:

python vlad_pairwise_distance.py -d ./vlad -o paris_7910.distance

Generating vlad feature matrix for images

python vlad_data_matrix.py -d <vlad feature path> -o <outout filename> -s <optional sample number> -f <optional output format> -l <optional data label>

The output format could be 'libsvm'. If not given, the output format is simply the arraies of vlad features. The sample number could be set to generate specified numbers of parts of the data. The data label could be set to generate '1' or '-1' label in output file of libsvm format.

For example:

python vlad_data_matrix.py -d ./vlad -o paris_7910.data -f libsvm -l positive

Clustering flickr images using Affinity Propagation algorithm

# Running R interaction envrionment
> source("apc_cluster.R")
> data_cluster("paris_7910.data") # given the vlad feature matrix

# Or under command line
R --slave --args paris_7910 < apc_cluster.R

4 files will be generated under current path. Use "paris_7910.data" as example:

paris_7910.apc: Binary stored APResult object of APCluster package.
paris_7910.apc.clusters: Text-format cluster list.
paris_7910.apc.exemplars: Text-format exemplar list.
paris_7910.apc.similarity: Negative squared distances (Euclidean) matrix.

Browsing clustering results

coffee app.coffee -i <image base path> -a <apc cluster list filename> -c <image set sub-path under image base path>

For example:

coffee app.coffee -i /images -a ./paris_7910.apc.clusters -c paris

A node.js application will be running at port 3000. Open browser to see the clustering results at a URL such as http://localhost:3000/

Browsing MapReduce clustering results

coffee apc_mapreduce.coffee -i <image base path> -a <apc cluster result pathname> -c <image set sub-path under image base path> -s <the threshold of cluster size>

For example:

coffee apc_mapreduce.coffee -i /images -a ./output -c paris -s 20

Generating vlad features for each APC cluster in libsvm format.

python vlad_data_matrix_for_clusters.py -d <vlad feature path> -c <apc cluster list filename> -o <output path and filename prefix> -t <threshold for cluster size>

For example:

python vlad_data_matrix_for_clusters.py -d ./vlad -c paris_7910.apc.clusters -o ./clusters_vlad/cluster_data -t 5

The files ./clusters_vlad/cluster_data.cluster.[0 ~ (clusters_number -1)] will be generated. Each file contains vlad features for photos in the corresponding APC cluster.

The bash script used to train models and test data

./train_model.sh <training data path> <model output path>
./test_model.sh <model path> <test data> <classification result output path>

The bash script used to create the directionary structure for dataset

./create_dataset.sh <base path> <dataset name>

coffee svm_state.coffee -r <svm testing result path> -g <calculating statistics for positive or negative>

For example:

coffee svm_state.coffee -r ./classification -g positive

Broswing fetched photos

coffee image.coffee -i <image base path under public> -c <image relative path under image base path>

For example:

coffee image.coffee -i images -c test

Generate crowdsourcing dataset file

python crowdsourcing.py -d <image path> -o <output file> -n <number of sampled images> -u <image prefix> -m <file output mode {w|a}>

For example:

python crowdsourcing.py -d ./images -o crowd.txt -n 10 -u "http://testurl/images/" -m w

Crawling Pinterest images

coffee pinterest.coffee -i <image download path> -f <feature storing path> -m <image processing path> -c <database collection> -p <starting page> -o <total pages>

For example:

coffee pinterest.coffee -i ./images -f ./features -m ./tmp -c pinterest -p 1 -o 2

Name		Name	Last commit message	Last commit date
Latest commit History 56 Commits
bin		bin
lib		lib
node_modules		node_modules
public		public
views		views
README.md		README.md
apc_cluster.R		apc_cluster.R
apc_mapreduce.coffee		apc_mapreduce.coffee
app.coffee		app.coffee
cluster.py		cluster.py
config_example.coffee		config_example.coffee
create_dataset.sh		create_dataset.sh
crowdsourcing.py		crowdsourcing.py
image.coffee		image.coffee
package.json		package.json
pinterest.coffee		pinterest.coffee
query.coffee		query.coffee
sample.coffee		sample.coffee
svm_state.coffee		svm_state.coffee
test_model.sh		test_model.sh
train_model.sh		train_model.sh
utils.py		utils.py
vlad.coffee		vlad.coffee
vlad.py		vlad.py
vlad_data_matrix.py		vlad_data_matrix.py
vlad_data_matrix_for_clusters.py		vlad_data_matrix_for_clusters.py
vlad_encoder.py		vlad_encoder.py
vlad_from_filelist.coffee		vlad_from_filelist.coffee
vlad_pairwise_distance.py		vlad_pairwise_distance.py
vlad_validate_distance.py		vlad_validate_distance.py

Jency/flickr_fetcher

Folders and files

Latest commit

History

Repository files navigation

The tools used to fetch flickr photos

Install

Usage

Search Flickr photos

Sample photos from fetched Flickr photos

Clustering of sampled photos

Encoding raw image features as VLAD (vector of locally aggregated descriptors) feature

Encoding all images in database collection as VLAD features

Encoding all images in a specified list as VLAD features

Validating VLAD feature performance

Generating pairwise distance matrix for images

Generating vlad feature matrix for images

Clustering flickr images using Affinity Propagation algorithm

Browsing clustering results

Browsing MapReduce clustering results

Generating vlad features for each APC cluster in libsvm format.

The bash script used to train models and test data

The bash script used to create the directionary structure for dataset

Broswing fetched photos

Generate crowdsourcing dataset file

Crawling Pinterest images

About

Resources

Stars

Watchers

Forks