#### Difference between supervised and unsupervised:
- supervised: numerical
- evaluation: accuracy, precision, recall
- unsupervised:
- evaluation: how? - clusters are in the eye of the beholder :-)
Cluster Analysis: Finding groups of objects such that the objects in a group will be similar to one another and different from the objects in other groups.
Measures of cluster validity:
- external index:
- entropy
- internal index:
- Sum of Squared Error (SSE)
- relative index: used to compare two clusterings
- SSE or entropy
- cohesion and separation
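
A minimal sketch of the internal measures above, assuming hard cluster labels and squared Euclidean distance. Cohesion is taken as the within-cluster SSE and separation as the between-cluster sum of squares (BSS); the function names are illustrative, not from any of the referenced papers.

```python
from collections import defaultdict


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cohesion_and_separation(points, labels):
    """Return (SSE, BSS) for a hard clustering.

    SSE: squared distance of every point to its own cluster centroid.
    BSS: cluster size times squared distance from the cluster centroid
         to the overall centroid, summed over clusters.
    """
    clusters = defaultdict(list)
    for p, lab in zip(points, labels):
        clusters[lab].append(p)

    overall = centroid(points)
    sse = 0.0
    bss = 0.0
    for members in clusters.values():
        c = centroid(members)
        sse += sum(sq_dist(p, c) for p in members)
        bss += len(members) * sq_dist(c, overall)
    return sse, bss


# Toy data (made up for illustration): two tight groups on a grid.
points = [(1.0, 1.0), (1.0, 3.0), (5.0, 1.0), (5.0, 3.0)]
labels = [0, 0, 1, 1]
sse, bss = cohesion_and_separation(points, labels)
print(sse, bss)
```

With centroids as cluster representatives, SSE + BSS equals the total sum of squares, so lower SSE (tighter clusters) goes together with higher BSS (better separated clusters) on a fixed dataset.
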
The validation of clustering structures is the most difficult and frustrating part of cluster analysis.
Evaluation Metrics for clustering: (reference)
- Conductance: [reference][1]
- Coverage: [reference][2]
- Modularity: [reference][3]
- Performance: [reference][4]
- Silhouette index: reference

[1]: #
[2]: #
[3]: #
[4]: #

- internal measures: cohesion and separation
- external measures: entropy and purity
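
A small sketch of the two external measures, assuming ground-truth class labels are available. Purity is the fraction of points that belong to the majority class of their cluster; entropy is the size-weighted class entropy of each cluster (base 2 is an arbitrary choice here). The function name is illustrative.

```python
from collections import Counter, defaultdict
from math import log2


def purity_and_entropy(cluster_labels, class_labels):
    """Return (purity, entropy) of a clustering against true classes."""
    n = len(cluster_labels)
    by_cluster = defaultdict(Counter)
    for c, k in zip(cluster_labels, class_labels):
        by_cluster[c][k] += 1

    purity = 0.0
    entropy = 0.0
    for counts in by_cluster.values():
        size = sum(counts.values())
        # Majority class count contributes to purity.
        purity += max(counts.values()) / n
        # Class entropy inside this cluster, weighted by cluster size.
        h = -sum((m / size) * log2(m / size) for m in counts.values())
        entropy += (size / n) * h
    return purity, entropy


# Toy labels (made up): cluster 0 has one misplaced point, cluster 1 is pure.
clusters = [0, 0, 0, 0, 1, 1, 1, 1]
classes = ["a", "a", "a", "b", "b", "b", "b", "b"]
purity, entropy = purity_and_entropy(clusters, classes)
print(purity, entropy)
```

Higher purity and lower entropy both indicate better agreement with the classes; both can be driven to their best value trivially by putting every point in its own cluster, so they are mainly useful for comparing clusterings with the same number of clusters.
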
- comparing different algorithms
- Good roadmap, Completed Paper
- SOM (Self-organizing Map)
- cluster validation: reference --> idea: using binary search
- cluster validation (statistical approach): reference
- for proofs: reference --> chapter 3
- solve two weaknesses of spectral clustering: reference --> method instead of k-means
- diffusion maps, spectral clustering and eigenfunctions of Fokker-Planck operators: reference --> Euclidean distance in the new representation has a meaningful description
- consistency of spectral clustering: Annals of Statistics 2008
- limits of spectral clustering: reference
- random walk survey: reference
- KDD bipartite spectral: reference
- ........
- co-training spectral: ICML2011
- mining clustering dimensions: ICML2010
- Large-Scale Multi-View Spectral Clustering via Bipartite Graph : AAAI2015
- Incremental Spectral Clustering with the Normalised Laplacian: NIPS2011
- good review: this
- validation methods: this
- kernel and spectral: this
- Spectral Clustering with Perturbed Data: NIPS 2009
- Spectral graph theory: Chung 97
- ...
- DBSCAN visualization: reference
- DBSCAN code: code
- MDS (Multidimensional Scaling): code
- numpy: tutorial
- R package: this
- stanford dataset: this
- list of repositories: reference
- wine: link
- Dermatology: link
- Letter Recognition: link
- Handwritten digits: link
- UCI:
- fisher iris
- wine
- breast cancer Wisconsin
- heart
- handwritten digit
- Solving Cluster Ensemble Problems by Bipartite Graph Partitioning: link
- local clustering on multiple manifold: link
- Train M d-dimensional local linear manifolds by using MPPCA to approximate the underlying manifolds
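- Determine the local tangent space of each point (using EM)
- Compute the pairwise affinity between two local tangent spaces using
- p_ij = (max dot(u_1, v_1)) x ... x (max dot(u_k, v_k)) --> each u is a unit vector in the tangent space at point i, each v a unit vector in the tangent space at point j
- q_ij = 1 if the two points are K-nearest neighbours, 0 otherwise
- w_ij = p_ij q_ij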
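
A rough sketch of the affinity step above, not the paper's implementation. It assumes the tangent spaces are already given as orthonormal bases (the MPPCA/EM fitting is skipped), reads the chain of maximised dot products as the cosines of the principal angles between the two subspaces (obtained as the singular values of `U.T @ V`), and symmetrises the KNN indicator. All function names are made up for illustration.

```python
import numpy as np


def tangent_affinity(U, V):
    """p_ij: product of cosines of the principal angles between two subspaces.

    U, V: (D, d) arrays whose columns are orthonormal bases.
    """
    cosines = np.linalg.svd(U.T @ V, compute_uv=False)
    return float(np.prod(np.clip(cosines, 0.0, 1.0)))


def knn_mask(X, k):
    """q_ij: 1 if i and j are k-nearest neighbours of each other
    in either direction (symmetrisation is an assumption), else 0."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour
    idx = np.argsort(d2, axis=1)[:, :k]
    q = np.zeros((len(X), len(X)))
    rows = np.repeat(np.arange(len(X)), k)
    q[rows, idx.ravel()] = 1.0
    return np.maximum(q, q.T)


def affinity_matrix(X, bases, k):
    """w_ij = p_ij * q_ij for every pair of points."""
    n = len(X)
    q = knn_mask(X, k)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if q[i, j]:
                W[i, j] = tangent_affinity(bases[i], bases[j])
    return W


# Toy example (made up): three points with a horizontal tangent line,
# one nearby point with a vertical tangent line.
e1 = np.array([[1.0], [0.0]])
e2 = np.array([[0.0], [1.0]])
X = np.array([[0.0, 0.0], [1.0, 0.0], [2.5, 0.0], [2.5, 0.5]])
bases = [e1, e1, e1, e2]
W = affinity_matrix(X, bases, k=1)
print(W)
```

The point of the product form is that two points that are close in space but lie on differently oriented manifolds (the last two points in the toy example) still get a near-zero weight.
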
- based on random walk: link
- based on neighbor propagation: link
- construct similarity graph and neighborhood graph, propagate it
- based on newton equations: link
- sparsify the affinity matrix using second newton equation
- Local density adaptive: link
- density sensitive: link
- Laplacian Score: NIPS 2005
- Spectral Feature Selection: 2007
- Efficient Spectral Feature Selection with Minimum Dependency: 2010
- Semi-supervised Feature Selection via Spectral Analysis: 2007
- SCORE
- On Spectral Clustering
- Learning Spectral Clustering
- Multiple non-redundant spectral clustering views
- for kernels: Bernhard Schölkopf
- good papers: University of Washington
- regularized spectral clustering: JMLR
- model-based clustering (select number of clusters): University of Michigan
- statistical view of marginal spectral clustering: Jordan
- spectral dimension reduction: Jordan 2011