babakahmadi/Thesis

#### Difference between supervised and unsupervised:

  • supervised: numerical
    • evaluation: accuracy, precision, recall
  • unsupervised:
    • evaluation: how? - clusters are in the eye of the beholder :-)

Cluster Analysis: Finding groups of objects such that the objects in a group will be similar to one another and different from the objects in other groups.

Measures of cluster validity:

  • external index:
    • entropy
  • internal index:
    • Sum of Squared Error(SSE)
  • relative index: used to compare two clusterings
    • SSE or entropy
    • cohesion and separation
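As a rough illustration of the internal measures above, the sketch below computes cohesion as the within-cluster SSE and separation as the between-cluster sum of squares. This is one common definition, assumed here; the function name `cohesion_separation` and the toy data are hypothetical and not taken from the thesis code.

```python
import numpy as np

def cohesion_separation(X, labels):
    """Return (within-cluster SSE, between-cluster sum of squares)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    overall_mean = X.mean(axis=0)
    wss = 0.0  # cohesion: squared distances of points to their own centroid
    bss = 0.0  # separation: size-weighted squared distances of centroids to the overall mean
    for c in np.unique(labels):
        members = X[labels == c]
        centroid = members.mean(axis=0)
        wss += ((members - centroid) ** 2).sum()
        bss += len(members) * ((centroid - overall_mean) ** 2).sum()
    return wss, bss

# Toy 1-D data with two obvious groups (assumed example)
X = np.array([[1.0], [2.0], [4.0], [5.0]])
wss, bss = cohesion_separation(X, [0, 0, 1, 1])
tss = ((X - X.mean(axis=0)) ** 2).sum()  # total sum of squares
```

With these definitions `wss + bss` equals the total sum of squares, so for a fixed data set lowering the SSE is the same as raising the separation.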

The validation of clustering structures is the most difficult and frustrating part of cluster analysis.

Evaluation metrics for clustering (reference):

  • Conductance: reference
  • Coverage: reference
  • Modularity: reference
  • Performance: reference
  • Silhouette index: reference
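For the graph-based metrics in this list, a minimal sketch on a small undirected graph may help. It assumes the usual textbook definitions (coverage as the fraction of edge weight inside clusters, Newman modularity, and conductance as cut weight over the smaller volume); the linked references may define them slightly differently. All function names and the example graph are hypothetical.

```python
import numpy as np

def coverage(A, labels):
    """Fraction of total edge weight that lies inside clusters."""
    same = labels[:, None] == labels[None, :]
    return A[same].sum() / A.sum()

def modularity(A, labels):
    """Newman modularity: (1/2m) * sum_ij (A_ij - k_i k_j / 2m) over same-cluster pairs."""
    k = A.sum(axis=1)          # node degrees
    two_m = A.sum()            # twice the total edge weight
    same = labels[:, None] == labels[None, :]
    B = A - np.outer(k, k) / two_m
    return B[same].sum() / two_m

def conductance(A, mask):
    """Cut weight between S and its complement, divided by the smaller volume."""
    cut = A[mask][:, ~mask].sum()
    return cut / min(A[mask].sum(), A[~mask].sum())

# Two triangles joined by a single edge (assumed example)
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
labels = np.array([0, 0, 0, 1, 1, 1])

cov = coverage(A, labels)
mod = modularity(A, labels)
cond = conductance(A, labels == 0)
```

The adjacency matrix is assumed symmetric with a zero diagonal, so every edge is counted twice in both the numerator and the denominator of each ratio.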

Validation:

  • internal measures: cohesion and separation
  • external measures: entropy and purity
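The external measures can be sketched as follows, assuming the common definitions: purity is the fraction of points that belong to the majority class of their cluster, and entropy is the size-weighted average of the class entropy inside each cluster. The function name `purity_and_entropy` and the labels are made up for illustration.

```python
from collections import Counter
from math import log2

def purity_and_entropy(cluster_labels, class_labels):
    """Return (purity, entropy) of a clustering against known class labels."""
    n = len(cluster_labels)
    clusters = {}
    for c, k in zip(cluster_labels, class_labels):
        clusters.setdefault(c, []).append(k)
    purity = 0.0
    entropy = 0.0
    for members in clusters.values():
        counts = Counter(members)
        size = len(members)
        # majority class share of this cluster, weighted by cluster size
        purity += max(counts.values()) / n
        # class entropy inside this cluster, in bits
        h = -sum((m / size) * log2(m / size) for m in counts.values())
        entropy += (size / n) * h
    return purity, entropy

# Assumed toy labels: cluster 0 mixes two classes, cluster 1 is pure
purity, entropy = purity_and_entropy(
    [0, 0, 0, 1, 1, 1],
    ["a", "a", "b", "b", "b", "b"],
)
```

Higher purity and lower entropy both indicate better agreement with the class labels. Both can be pushed to their best values trivially by using one cluster per point, so they are only comparable at a fixed number of clusters.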

Algorithms:

some references:

  • cluster validation: reference --> idea: using binary search
  • cluster validation(statistical approach) : reference
  • for proofs: reference --> chapter 3
  • solve two weaknesses of spectral clustering: reference --> method instead of k-means
  • diffusion maps, spectral clustering and eigenfunctions of Fokker-Planck operators: reference --> Euclidean distance in the new representation has a meaningful description
  • consistency of spectral clustering: Annals of Statistics 2008
  • limits of spectral clustering: reference
  • random walk survey: reference
  • KDD bipartite spectral: reference
  • ........
  • co-training spectral: ICML2011
  • mining clustering dimensions: ICML2010
  • Large-Scale Multi-View Spectral Clustering via Bipartite Graph : AAAI2015
  • Incremental Spectral Clustering with the Normalised Laplacian: NIPS2011

good Tutorials:

  • good review: this
  • validation methods: this
  • kernel and spectral: this
  • Spectral Clustering with Perturbed Data: NIPS 2009
  • Spectral graph theory: Chung 97
  • ...

some implementation:

some packages:

Dataset:

  • stanford dataset: this
  • list of repositories: reference
  • wine: link
  • Dermatology: link
  • Letter Recognition: link
  • Handwritten digits: link
  • UCI:
    • Fisher iris
    • wine
    • breast cancer Wisconsin
    • heart
    • handwritten digit

bipartite graph:

  • Solving Cluster Ensemble Problems by Bipartite Graph Partitioning: link

Constructing Similarity Graph:

  • local clustering on multiple manifold: link
    • Train M d-dimensional local linear manifolds using MPPCA to approximate the underlying manifolds
    • Determine the local tangent space of each point (using EM)
    • Compute the pairwise affinity between two local tangent spaces:
    • p_ij = (max dot(u_1, v_1)) x ... x (max dot(u_k, v_k)) --> u_l in the tangent space of point i, v_l in the tangent space of point j
    • q_ij = 1 if j is in the KNN of i, 0 otherwise
    • w_ij = p_ij q_ij
  • based on random walk: link
  • based on neighbor propagation: link
    • construct similarity graph and neighborhood graph, propagate it
  • based on Newton equations: link
    • sparsify the affinity matrix using the second Newton equation
  • Local density adaptive: link
  • density sensitive: link
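The tangent-space affinity w_ij = p_ij q_ij from the multiple-manifold item above can be sketched roughly as below. This is not the linked paper's implementation: local PCA over each point's nearest neighbors is swapped in for MPPCA with EM, the KNN indicator is symmetrized, and p_ij is taken as the product of the cosines of the principal angles between the two tangent spaces (the singular values of U_i^T U_j). The function name `tangent_affinity` and its parameters `d` and `k` are hypothetical.

```python
import numpy as np

def tangent_affinity(X, d=1, k=2):
    """Affinity w_ij = p_ij * q_ij from local tangent spaces (local-PCA approximation)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # k nearest neighbours of each point; column 0 is the point itself
    # (assumes no duplicate points)
    knn = np.argsort(dist, axis=1)[:, 1:k + 1]

    # Assumption: tangent space of point i = top-d principal directions
    # of the patch formed by i and its neighbours.
    bases = []
    for i in range(n):
        patch = X[np.append(knn[i], i)]
        patch = patch - patch.mean(axis=0)
        _, _, vt = np.linalg.svd(patch, full_matrices=False)
        bases.append(vt[:d].T)  # shape (D, d), orthonormal columns

    # q_ij = 1 if i and j are neighbours in either direction (assumed symmetrization)
    q = np.zeros((n, n))
    for i in range(n):
        q[i, knn[i]] = 1.0
    q = np.maximum(q, q.T)

    w = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if q[i, j]:
                # singular values = cosines of the principal angles
                s = np.linalg.svd(bases[i].T @ bases[j], compute_uv=False)
                w[i, j] = np.prod(np.clip(s, 0.0, 1.0))
    return w

# Points on a single straight line: all tangent spaces coincide (assumed example)
X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]])
w = tangent_affinity(X, d=1, k=2)
```

On a straight line every neighboring pair gets an affinity close to 1. Where two manifolds cross, nearby points with different tangent directions get a small p_ij, which is what lets the graph separate them.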

for feature selection:

  • Laplacian Score: NIPS 2005
  • Spectral Feature Selection: 2007
  • Efficient Spectral Feature Selection with minimum dependency: 2010
  • Semi-supervised Feature Selection via Spectral Analysis: 2007
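As a pointer to what the first entry computes, here is a minimal sketch of the Laplacian Score for a given similarity matrix S. For each feature f it removes the degree-weighted mean and then takes the ratio f~^T L f~ / f~^T D f~, with D the degree matrix and L = D - S; smaller scores mean the feature varies less across strongly connected points. How S is built (KNN graph, heat kernel) is left out, and the function name `laplacian_score` and the toy data are assumptions.

```python
import numpy as np

def laplacian_score(X, S):
    """Laplacian Score of each column of X for a symmetric similarity matrix S."""
    X = np.asarray(X, dtype=float)
    S = np.asarray(S, dtype=float)
    deg = S.sum(axis=1)          # diagonal of the degree matrix D
    L = np.diag(deg) - S         # unnormalized graph Laplacian
    scores = []
    for f in X.T:
        # remove the degree-weighted mean of the feature
        f_tilde = f - (f @ deg) / deg.sum()
        num = f_tilde @ L @ f_tilde
        den = f_tilde @ (deg * f_tilde)   # f~^T D f~; zero for a constant feature
        scores.append(num / den)
    return np.array(scores)

# Assumed toy graph: two disconnected pairs {0, 1} and {2, 3}
S = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
])
# Column 0 follows the graph structure, column 1 cuts across it
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
scores = laplacian_score(X, S)
```

In this example the first feature is constant inside each connected pair and gets the lower (better) score.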

for documentation:

base papers:


to read: