henrysdev/QueryEngine
Henry Warren 2018

[OVERVIEW]
This program has the following execution flow:
1. Reads in and parses CSV data from the provided data file (term-frequency.csv is included in the project files).
2. Creates leader/follower clusters and prints the pairs and their respective distances to the console.
3. Starts the Query Engine user input loop, which prompts you to enter a search query.

[DEPENDENCIES]
- Python 3
- numpy

[SETUP]
This project is written in Python 3. The only external library it uses is numpy. If you do not already have it installed, you can install it with a package manager such as pip:
    $ pip3 install numpy

[RUN_INSTRUCTIONS]
- Navigate into the project directory.
    $ cd QueryEngine/src/
- Start the program. The argument after the script name is the data CSV that was generated by the web crawler; it is included in the src/ folder.
    $ python3 query_engine.py term-frequency.csv

[PROJECT_STRUCTURE]
QueryEngine/
|--> README
|-- src/
    |--> database.py          # holds document and term data from the csv
    |--> document.py          # represents a document object
    |--> query_engine.py      # main driving class for program
    |--> similarity.py        # math and matrix calculations
    |--> term-frequency.csv   # web crawler output file
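As a rough illustration of step 2 of the overview (leader/follower clustering), here is a minimal sketch. It is not the code in similarity.py or query_engine.py. It assumes each document is a row of term counts, that leaders are picked at random, and that distance is measured as 1 minus cosine similarity; the README states none of this. The names cosine_distance and leader_follower_pairs are hypothetical, as is the toy matrix.

```python
import numpy as np


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    norm = np.linalg.norm(a) * np.linalg.norm(b)
    if norm == 0:
        return 1.0
    return 1.0 - float(np.dot(a, b) / norm)


def leader_follower_pairs(vectors, num_leaders, seed=0):
    """Pick random leaders, then attach every other row to its nearest leader.

    Returns a list of (leader_index, follower_index, distance) tuples.
    Random leader selection is an assumption made for this sketch.
    """
    rng = np.random.default_rng(seed)
    leaders = rng.choice(len(vectors), size=num_leaders, replace=False)
    leader_set = {int(i) for i in leaders}
    pairs = []
    for i, vec in enumerate(vectors):
        if i in leader_set:
            continue  # leaders do not follow anyone
        dists = [cosine_distance(vec, vectors[j]) for j in leaders]
        best = int(np.argmin(dists))
        pairs.append((int(leaders[best]), i, dists[best]))
    return pairs


# Toy term-frequency matrix (made up): one row per document, one column per term.
tf = np.array([
    [3, 0, 1],
    [2, 0, 1],
    [0, 4, 0],
    [0, 3, 1],
], dtype=float)

pairs = leader_follower_pairs(tf, num_leaders=2)
for leader, follower, dist in pairs:
    print(f"leader {leader} <- follower {follower} (distance {dist:.3f})")
```

With a layout like this, a query could be compared against the leaders first and then only against the followers of the closest leader, which avoids scoring every document.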
About
Local search engine that utilizes k-means clustering to determine relevance.