This repository contains the following directories and files:
- Puzzle_Masters_Project_Code.ipynb
- Puzzle_Masters_Project_Report.pdf
- requirements.txt
- Funcs4Testing.py
- ImgCluster.py
- ImgPrep.py
- AdjacencyMatrices
- pickle_files
- plots
- puzzle_scans
The two main deliverables inside of this repository are Puzzle_Masters_Project_Code.ipynb and Puzzle_Masters_Project_Report.pdf. These two files contain the summary of our analysis and the code supporting it.
To select a puzzle and run the code, set the puzzle_folder variable at the top of the Puzzle_Masters_Project_Code.ipynb file. Valid values are "puzzle_1", "puzzle_2", and "puzzle_3".
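As a minimal sketch of how this setting might be checked before the notebook runs, the snippet below validates the value and builds a path to the matching image folder. The helper resolve_puzzle_dir and the assumption that each subdirectory of puzzle_scans is named after its puzzle_folder value are illustrative and are not taken from the notebook.

```python
from pathlib import Path

# Values listed in this README as valid for puzzle_folder.
VALID_PUZZLES = ("puzzle_1", "puzzle_2", "puzzle_3")


def resolve_puzzle_dir(puzzle_folder, scans_root="puzzle_scans"):
    """Hypothetical helper: check puzzle_folder and return its image directory.

    Assumes puzzle_scans/<puzzle_folder> holds the scans for that puzzle.
    """
    if puzzle_folder not in VALID_PUZZLES:
        raise ValueError(
            f"puzzle_folder must be one of {VALID_PUZZLES}, got {puzzle_folder!r}"
        )
    return Path(scans_root) / puzzle_folder


puzzle_folder = "puzzle_1"  # set this at the top of the notebook
puzzle_dir = resolve_puzzle_dir(puzzle_folder)
```

Failing early on an unknown value gives a clearer message than a missing-file error later in the run.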
The load_from_pickle variable can be used to speed up script execution. If set to True, the script skips data preprocessing and loads the preprocessed feature set from a pickle file.
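The pattern behind a flag like load_from_pickle can be sketched as follows. This is an illustration under assumptions, not the notebook's code: get_features, pickle_path, and build_features are hypothetical names, and saving the features when the flag is False is an assumed behavior that this README does not confirm.

```python
import pickle
from pathlib import Path


def get_features(load_from_pickle, pickle_path, build_features):
    """Hypothetical helper: load cached features or compute them.

    build_features is any zero-argument callable that runs preprocessing.
    """
    pickle_path = Path(pickle_path)
    if load_from_pickle:
        # Skip preprocessing and read the stored feature set.
        with pickle_path.open("rb") as fh:
            return pickle.load(fh)

    # Run preprocessing, then store the result for later runs
    # (assumed behavior, not stated in this README).
    features = build_features()
    pickle_path.parent.mkdir(parents=True, exist_ok=True)
    with pickle_path.open("wb") as fh:
        pickle.dump(features, fh)
    return features
```

Only load pickle files you trust, such as those shipped in pickle_files, because unpickling can execute arbitrary code.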
The requirements.txt file lists the package dependencies that must be installed in a virtual environment in order to run Puzzle_Masters_Project_Code.ipynb.
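One way to set this up, sketched here with only the Python standard library, is to create a virtual environment and install from requirements.txt into it. The environment name .venv and the helper names are assumptions chosen for illustration; any equivalent workflow works equally well.

```python
import subprocess
import sys
import venv
from pathlib import Path


def venv_python(env_dir):
    """Return the interpreter path inside a virtual environment."""
    env_dir = Path(env_dir)
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def pip_install_command(env_dir, requirements="requirements.txt"):
    """Build the pip command that installs the listed dependencies."""
    return [str(venv_python(env_dir)), "-m", "pip", "install", "-r", str(requirements)]


def setup_environment(env_dir=".venv", requirements="requirements.txt"):
    """Create the environment (with pip) and install the dependencies."""
    venv.create(env_dir, with_pip=True)
    subprocess.run(pip_install_command(env_dir, requirements), check=True)
```

Calling setup_environment() from the repository root creates .venv and installs the packages; the notebook should then be run with a kernel that uses that environment.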
Descriptions of the remaining files and directories are included in the table below:
File/Directory | Description |
---|---|
Funcs4Testing.py | Python library containing functions to test the cluster models' performance and display images. |
ImgCluster.py | Python library containing functions to build and run cluster models. |
ImgPrep.py | Python library containing functions to preprocess data for clustering. |
AdjacencyMatrices | Adjacency matrix for each puzzle, stored in CSV format. These adjacency matrices are used to measure the cluster models' performance. |
pickle_files | Pickle files of feature sets. These files can be loaded and fed directly into the clustering model to skip the preprocessing step and speed up script execution. |
plots | Images and graphs generated by the Puzzle_Masters_Project_Code.ipynb notebook. |
puzzle_scans | Data source containing a subdirectory of images for each puzzle. |
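
To illustrate how an adjacency matrix can serve as ground truth for clustering, the sketch below computes the fraction of physically adjacent piece pairs that share a cluster label. This is only one plausible measure and is not necessarily the metric implemented in Funcs4Testing.py. The function names and the assumed file layout (a headerless, square matrix of 0/1 values) are hypothetical.

```python
import csv
import io


def read_adjacency_csv(handle):
    """Parse a headerless square 0/1 matrix from an open CSV handle (assumed layout)."""
    return [[int(cell) for cell in row] for row in csv.reader(handle) if row]


def adjacent_pair_agreement(adjacency, labels):
    """Fraction of adjacent piece pairs assigned to the same cluster."""
    same = total = 0
    n = len(adjacency)
    for i in range(n):
        for j in range(i + 1, n):  # visit each unordered pair once
            if adjacency[i][j]:
                total += 1
                same += labels[i] == labels[j]
    return same / total if total else 0.0


# Toy example: four pieces in a row, 0-1-2-3, split into two clusters.
toy_csv = io.StringIO("0,1,0,0\n1,0,1,0\n0,1,0,1\n0,0,1,0\n")
adjacency = read_adjacency_csv(toy_csv)
score = adjacent_pair_agreement(adjacency, labels=[0, 0, 1, 1])
```

In the toy example, two of the three adjacent pairs fall in the same cluster, so the score is 2/3.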
The project can be found here on GitHub.