phd-research

Matlab and Python files for PhD topic "Ensemble Learning of High Dimensional Datasets"

Matlab

This folder contains the files needed to reproduce the results and figures in "Ensemble Learning of High Dimensional Datasets". External code is not included in the repository but can be downloaded from its original source. The external code comprises:

Linear Discriminant Analysis https://www.mathworks.com/matlabcentral/fileexchange/29673-lda-linear-discriminant-analysis
L1-Magic https://statweb.stanford.edu/~candes/l1magic/
MatConvNet http://www.vlfeat.org/matconvnet/

Also not included are the weights for the deep neural networks and the image, audio, and UCI datasets. These can be downloaded from:

Deep Neural Network Weights : http://www.vlfeat.org/matconvnet/pretrained/
Imagenet ILSVRC 2012 : http://www.image-net.org/challenges/LSVRC/2012/
Images: http://sipi.usc.edu/database/
Audio:
- Danse Arabe :- https://freesound.org/people/FreqMan/sounds/42956/
- Nature sounds :- https://freesound.org/people/IchBinChrist/sounds/424288/
- Human Speech :- https://freesound.org/people/tim.kahn/sounds/71744/
UCI datasets:
- ARCENE :- https://archive.ics.uci.edu/ml/datasets/Arcene
- DEXTER :- https://archive.ics.uci.edu/ml/datasets/Dexter
- DOROTHEA :- https://archive.ics.uci.edu/ml/datasets/Dorothea
- GISETTE :- https://archive.ics.uci.edu/ml/datasets/Gisette
- MADELON :- https://archive.ics.uci.edu/ml/datasets/Madelon

The folder Utility contains helper code that should be added to the path via the addpath command. Details of the code in this folder are described in the next section. The other folders organize the code by the chapter it is used in, including code used for analysis that is not discussed anywhere in the thesis.

Utility

CreateAxes : creates an MxN grid of axes according to the specified dimensions and settings. The grid has shared legends. This code was written later in the research, when it became obvious that manually arranging ~1000 figures was distracting from more productive work
HouseHolder_nv : defines the Householder normal vector with characteristics specified by vector v. Implements Algorithm C.1
binRandGen : generates non-i.i.d. binary vectors with specific "densities"
cummBinnProb : CDF calculator for binomial distributions
cummPolyaProb : calculates majority vote ensemble accuracy as per the P.E. distribution
rand* : generates various random projections
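
As an illustration of what a binomial CDF helper and a majority-vote accuracy calculation look like, here is a minimal Python sketch. It is not the repository's MATLAB code: the function names are hypothetical, and it assumes independent voters with equal accuracy, which is the plain binomial case rather than the P.E. distribution that cummPolyaProb refers to.

```python
from math import comb


def binomial_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    if k < 0:
        return 0.0
    k = min(k, n)
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def majority_vote_accuracy(n, p):
    """Accuracy of a majority vote over n independent voters.

    Assumption (not from the thesis): every voter is correct with the same
    probability p, votes are independent, and n is odd so ties cannot occur.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd to avoid ties")
    # The ensemble is wrong when at most n // 2 voters are correct.
    return 1.0 - binomial_cdf(n // 2, n, p)


if __name__ == "__main__":
    print(majority_vote_accuracy(3, 0.6))  # approximately 0.648
```

Under this independence assumption the ensemble beats a single voter whenever p > 0.5; correlated voters, which the P.E. model is meant to capture, generally gain less.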

JLL - Chapter 4

Code in this folder requires the image, audio, and UCI-DOROTHEA datasets

ImageWrapper : This code reproduces the figures in the section on the empirical corroboration for image datasets (Figures: )
ImageWrapperStratified : This code reproduces the figures in the section on the empirical corroboration for image datasets with stratified sampling (Figures: )
SparseWrapper : This code reproduces the figures in the section on the empirical corroboration on real-world sparse binary vectors (Figures: )
AudioWrapper : This code reproduces the figures in the section on the empirical corroboration for the audio dataset
SynthBinWrapper : This code reproduces the figures in the section on the empirical corroboration for synthetic test cases. Note that the features are not i.i.d. (Figures: )
SynthRandWrapper : This code empirically corroborates our theory when the features are not generated by a Bernoulli process.
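
The kind of check these wrappers perform can be sketched in Python as follows. This is a hedged illustration only: it assumes that JLL refers to the Johnson-Lindenstrauss lemma, that RP denotes a Gaussian random projection, and that RS denotes a random subspace (coordinate subsampling). All function names are hypothetical and do not correspond to the MATLAB files.

```python
import numpy as np


def gaussian_projection(d, k, rng):
    """k x d Gaussian random projection, scaled so E||Rx||^2 = ||x||^2."""
    return rng.standard_normal((k, d)) / np.sqrt(k)


def random_subspace(d, k, rng):
    """k x d matrix that keeps k of the d coordinates, rescaled by sqrt(d/k)."""
    idx = rng.choice(d, size=k, replace=False)
    R = np.zeros((k, d))
    R[np.arange(k), idx] = np.sqrt(d / k)
    return R


def norm_distortion(R, X):
    """Ratio ||Rx||^2 / ||x||^2 for every row x of X."""
    proj = X @ R.T
    return np.sum(proj**2, axis=1) / np.sum(X**2, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, k = 1000, 200
    dense = rng.standard_normal((50, d))
    print("RP, dense vectors:", norm_distortion(gaussian_projection(d, k, rng), dense).mean())
    print("RS, dense vectors:", norm_distortion(random_subspace(d, k, rng), dense).mean())
```

A ratio close to 1 means the squared norm is approximately preserved. For a vector with a single non-zero entry, the random subspace ratio is either 0 or d/k, which is one way to see why sparse data is the harder case for RS.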

CS - Chapter 4

Code in this folder requires the image and audio datasets, as well as L1-Magic

l1-subspace : Small proof of concept showing the unsuitability of RS as a sensing matrix. Figure . Note: both RP and RS have a ~45% chance of failing to reconstruct the sparse signal
cs-image : Code used to reconstruct an image from a small number of samples
cs-audio : Code used to reconstruct audio from a small number of samples. Warning: do not listen to the audio reconstructions of signals with few samples on headphones. Volume should be kept at ~80% at all times to prevent speaker damage
cs-audio-sup : L1-eq reconstruction of the audio file, not used in the thesis. RP may fail to converge, so the code sometimes fails
cs-image-sup : L1-eq reconstruction of the image file, not used in the thesis. RP may fail to converge, so the code sometimes fails
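
For readers without MATLAB, the equality-constrained L1 reconstruction that these scripts delegate to L1-Magic can be sketched in Python as a linear program. This is an illustration under stated assumptions, not a port of the repository code: it uses scipy.optimize.linprog instead of L1-Magic, a Gaussian sensing matrix, and a hypothetical function name.

```python
import numpy as np
from scipy.optimize import linprog


def basis_pursuit(A, b):
    """Solve min ||x||_1 subject to A x = b as a linear program.

    x is split into non-negative parts, x = u - v, so that
    ||x||_1 = sum(u) + sum(v) at the optimum.
    """
    m, n = A.shape
    c = np.ones(2 * n)          # objective: sum(u) + sum(v)
    A_eq = np.hstack([A, -A])   # A u - A v = b
    res = linprog(c, A_eq=A_eq, b_eq=b, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n] - res.x[n:]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m, k = 40, 20, 3                      # signal length, measurements, sparsity
    x_true = np.zeros(n)
    x_true[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
    A = rng.standard_normal((m, n))          # assumed Gaussian sensing matrix
    x_hat = basis_pursuit(A, A @ x_true)
    print("max abs reconstruction error:", np.max(np.abs(x_hat - x_true)))
```

The solver always returns a vector that satisfies the measurements with the smallest L1 norm; whether that vector equals the original sparse signal depends on the sensing matrix and on the number of measurements relative to the sparsity.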

FP - Chapter 5

badFlipWrapper : Wrapper around badFlip_ for organizing the flipping probability and plotting the figures
fpEnsExeriment : Simulates how the flipping probability relates to ensembles, and simulates an ensemble of RS projections on the Bayes classifier
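
A flipping probability can be estimated by simulation, as in the Python sketch below. It assumes that "flipping" means the sign of the dot product between a classifier normal h and a point x changes after random projection, and it uses a Gaussian projection. The function name and parameters are hypothetical and are not taken from badFlipWrapper or fpEnsExeriment.

```python
import numpy as np


def flip_probability(h, x, k, trials=2000, seed=0):
    """Monte Carlo estimate of P(sign((Rh).(Rx)) != sign(h.x)).

    R is a fresh k x d Gaussian random projection in every trial
    (an assumption made for this sketch).
    """
    rng = np.random.default_rng(seed)
    d = h.shape[0]
    sign_before = np.sign(h @ x)
    flips = 0
    for _ in range(trials):
        R = rng.standard_normal((k, d))
        if np.sign((R @ h) @ (R @ x)) != sign_before:
            flips += 1
    return flips / trials


if __name__ == "__main__":
    d, k = 50, 5
    h = np.zeros(d)
    h[0] = 1.0
    for angle in (0.2, 0.8, 1.4):            # angle between h and x, in radians
        x = np.zeros(d)
        x[0], x[1] = np.cos(angle), np.sin(angle)
        print(angle, flip_probability(h, x, k))
```

In this setting the estimate grows as the angle between h and x approaches 90 degrees, since points near the decision boundary are the easiest to flip.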

Ens - Chapter 6

ldaEnsembleRotationDataIndPlotWeightedOrderedLabelNoise_auto : Generates the synthetic test cases used in Chapter 5
ldaDataset_* : runs experiments on the UCI datasets

DNN - Chapter 7

cnn_imagenet_* : Runs experiments on the corresponding DNN
process_results_* : ensembles the results of the PseudoSaccade views using majority vote
- process_results_borda : experiment with a Borda count ensemble
- process_results_confMat : generates a confusion matrix for additional analysis
corrBase_tbl_7p5 : Calculates the diversity measure between the base view and the pseudosaccade views
corrSaccade_tbl_7p6 : Calculates the diversity measure between the pseudosaccade views
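
The two combination rules mentioned above, majority vote and Borda count, can be sketched in Python as follows. This is a generic illustration with hypothetical function names and array layouts; the process_results_* scripts may handle ties and inputs differently.

```python
import numpy as np


def majority_vote(labels, n_classes):
    """Combine predicted labels of shape (n_views, n_samples) by plurality.

    Ties are resolved in favour of the lowest class index.
    """
    n_samples = labels.shape[1]
    counts = np.zeros((n_classes, n_samples), dtype=int)
    for view in labels:
        counts[view, np.arange(n_samples)] += 1   # one vote per sample per view
    return counts.argmax(axis=0)


def borda_count(scores):
    """Combine class scores of shape (n_views, n_samples, n_classes).

    Each view awards a class as many points as the number of classes it
    outranks; the class with the most points over all views wins.
    """
    ranks = scores.argsort(axis=2).argsort(axis=2)  # 0 = lowest score
    return ranks.sum(axis=0).argmax(axis=1)


if __name__ == "__main__":
    labels = np.array([[0, 1, 2],
                       [0, 2, 2],
                       [1, 1, 2]])
    print(majority_vote(labels, 3).tolist())  # -> [0, 1, 2]

    scores = np.array([[[0.5, 0.4, 0.1]],
                       [[0.1, 0.4, 0.5]],
                       [[0.1, 0.5, 0.4]]])
    print(borda_count(scores).tolist())       # -> [1]
```

In the second example every view has a different top class, so a plurality vote is tied, while the Borda count selects class 1 because it is ranked first or second by all three views.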

Python

These are helper scripts for processing and training neural networks. Required modules include numpy, scipy, matplotlib, keras, tensorflow, and Foolbox.

adverseAttack - generates adversarial examples with Foolbox
ensExp* - various neural network experiments on the ensembles
