Project Algo

Authors: Michael Bleuez and a friend who may want to remain anonymous

Goal: The project aims to find the size of "clusters" within a set of points. (A cluster is a connex composant, 2 points being 'in contact' iff they are within a given distance of each other)

Performance:

A perfomance table (versions of the program, input format and execution time) is available in math/perfs.ods
Complexity is roughly of O(n.log(n).a^k) with n the number of points, where a~2.2 and k is the dimension of input space, however actual execution time vary a lot depending on properties of input;
1. how much there are points interlinked (big clusters are detrimental in general) (or how big is distance relative to number of points)
2. randomness of the distribution: uniformly distributed allow faster resolution, to a big extent
Real-world speed: at this point of the project, our algorithm can process any reasonable (random-like, 2D) input of size 20k in ~0.5s (i5 4210U 1.7Ghz, SATA SSD) It is really hard to create a non-random distribution that is really the worst possible, but we have been able to slow the algorithm up to 60 sec (still 20k points.) For reference a 100% naïve algorithm take up to 8 minutes to solve (any) 20k-sized input.

Etymology:

cluster: are "connex composant", is a class of objects. Cluster object include reference to the points they contain, which themselves know which cluster they are a part of
quadrillage: divide the space in "cases"
points: are given a reference to an unique to a cluster object (containing only said point at first) at their creation. merge is done via merge method of cluster object
density: relative to the given distance, how much the space is 'crowded'. A good exemple is that same-density sets have clusters of same ratio (size of cluster)/(total number of points)
a-types: are input where the points are quite sparse (relative to the given distance); an a-type input will contain only few tuples and a pletoria of singletons
b-types: are inputs contains too much points relative to the given distance, thus is usually one extra large cluster and a few others

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
__pycache__		__pycache__
compte_rendu		compte_rendu
geo		geo
maths		maths
test		test
README.md		README.md
afficheur.py		afficheur.py
connectes.py		connectes.py
generation_test.py		generation_test.py
script_test.sh		script_test.sh
sujet.pdf		sujet.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pycache

pycache

compte_rendu

compte_rendu

geo

geo

maths

maths

test

test

README.md

README.md

afficheur.py

afficheur.py

connectes.py

connectes.py

generation_test.py

generation_test.py

script_test.sh

script_test.sh

sujet.pdf

sujet.pdf

Repository files navigation

Project Algo

Authors: Michael Bleuez and a friend who may want to remain anonymous

About

Releases

Packages

Languages

michaelb/point-clustering

Folders and files

Latest commit

History

Repository files navigation

Project Algo

Authors: Michael Bleuez and a friend who may want to remain anonymous

About

Topics

Resources

Stars

Watchers

Forks

Languages