ChoosyDonors

#### Project Description

ChoosyDonors helps users find projects they care about on DonorsChoose, a crowdfunding website where teachers raise money for projects they want to do with their students. Working with data from DonorsChoose, the app uses natural language processing and machine learning (k-means clustering) to group similar projects into clusters. Users pick from these clusters to build donation portfolios, which make lump donations easy. ChoosyDonors also calculates an impact score from multiple criteria to help assess need at each project's school.

ChoosyDonors was built by Linda Zhou in November 2014, as her capstone project for Hackbright Academy.

### Contents

Features

The homepage:

  • The user chooses how they want to select projects.

homepage

Creating a donation portfolio:

  • User first selects a theme for the portfolio.
  • After the user indicates the size of the portfolio, the app randomly selects that many projects from the chosen cluster.
  • The user can then select the projects they like and save them to a portfolio that is accessible from their profile.
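
The random selection step described above can be sketched in a few lines. This is only an illustration, not the code in `app.py`: the function name `pick_portfolio_candidates`, the project IDs, and the `seed` parameter are hypothetical, and capping the count at the cluster size is an assumption about how an oversized request might be handled.

```python
import random


def pick_portfolio_candidates(cluster_project_ids, portfolio_size, seed=None):
    """Return up to `portfolio_size` distinct project IDs chosen at random.

    `cluster_project_ids` is a list of IDs belonging to one cluster.
    The count is capped at the cluster size (an assumption) so that an
    oversized request does not raise an error.
    """
    rng = random.Random(seed)  # a seed makes the choice reproducible in tests
    count = min(portfolio_size, len(cluster_project_ids))
    return rng.sample(cluster_project_ids, count)


# Hypothetical cluster of five projects; the user asks for three.
cluster = ["p1", "p2", "p3", "p4", "p5"]
candidates = pick_portfolio_candidates(cluster, 3, seed=42)
```

`random.sample` draws without replacement, so the same project never appears twice among the candidates shown to the user.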

clusters

choose

portfolio

Single Project Search:

  • Users can also search through the database for specific projects.
  • Selecting a filter will immediately refine the search results the user sees.

search

Search by Impact:

  • Users can maximise their impact based on the level of financial need of a school by choosing a project using this map view. Darker regions reflect greater need.
  • Impact scores were assigned to all schools based on the percent of students receiving free or reduced lunch, graduation rates, the percent of students taking the SAT, teacher-student ratios, and the poverty and crime rates for each school's region.

impact

Technologies

Python, SQLite, SQLAlchemy, NLTK, scikit-learn, Flask, Jinja, JavaScript, jQuery, AJAX, d3.js, Google Maps API, Beautiful Soup, HTML, CSS, Twitter Bootstrap

##### Clustering

The need data provided by teachers was processed with NLTK to strip all punctuation and stem every word. Stop words were removed, and tf-idf analysis was performed to find the key words for each project in the need data. Projects were then grouped into themes using k-means clustering (scikit-learn) on the tf-idf weights.
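
The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the contents of `cluster.py`: the four sample need statements, the tiny `STOP_WORDS` set (standing in for a full stop-word list), the `preprocess` helper, and the choice of two clusters are all assumptions made for the example.

```python
import string

from nltk.stem import PorterStemmer
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Tiny hand-written stop-word set (assumption); a real run would use a full list.
STOP_WORDS = {"a", "and", "at", "can", "for", "my", "our", "so", "the", "to", "we"}
stemmer = PorterStemmer()


def preprocess(text):
    """Lowercase, strip punctuation, drop stop words, and stem each word."""
    no_punct = text.lower().translate(str.maketrans("", "", string.punctuation))
    words = [w for w in no_punct.split() if w not in STOP_WORDS]
    return " ".join(stemmer.stem(w) for w in words)


# Hypothetical need statements, two about reading and two about science.
need_statements = [
    "We need new books for our classroom library so students can read.",
    "My students need books to read at home and build literacy.",
    "We need microscopes for science lab experiments.",
    "Our science lab needs microscopes and slides for experiments!",
]

cleaned = [preprocess(text) for text in need_statements]

# Weight each stemmed word by tf-idf, then cluster the weight vectors.
tfidf = TfidfVectorizer().fit_transform(cleaned)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(tfidf)
```

Each entry in `labels` is the cluster index of the matching need statement. Projects that share a label would form one theme for a donation portfolio.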

##### Impact Score & Data Visualization

The impact score is a composite score based on graduation rates, teacher-student ratios, the percent of students taking the SAT, the percent of students with free or reduced lunch, and the poverty and crime levels of the region a school is located in. Some of the data had to be acquired by scraping location data websites. The data was then normalized, and the impact score was calculated from the normalized data with each factor weighted equally.
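
An equally weighted composite of normalized factors might look like the sketch below. The README does not say which normalization `calc_impact.py` uses, so min-max scaling is an assumption, as are the function names, the factor names, the sample values, and the flag marking whether a higher raw value means greater need.

```python
def min_max(values):
    """Scale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # no spread, so no school stands out
    return [(v - lo) / (hi - lo) for v in values]


def impact_scores(schools, factors):
    """Average the normalized factors for each school with equal weights.

    schools: {school name: {factor name: raw value}}
    factors: {factor name: True if a higher raw value means greater need}
    """
    names = list(schools)
    totals = {name: 0.0 for name in names}
    for factor, higher_is_needier in factors.items():
        scaled = min_max([schools[name][factor] for name in names])
        for name, value in zip(names, scaled):
            # Flip factors such as graduation rate, where low means more need.
            totals[name] += value if higher_is_needier else 1.0 - value
    return {name: total / len(factors) for name, total in totals.items()}


# Hypothetical data for three schools and three of the factors.
schools = {
    "school_a": {"free_lunch_pct": 90, "graduation_rate": 60, "poverty_rate": 30},
    "school_b": {"free_lunch_pct": 50, "graduation_rate": 80, "poverty_rate": 20},
    "school_c": {"free_lunch_pct": 10, "graduation_rate": 100, "poverty_rate": 10},
}
factors = {"free_lunch_pct": True, "graduation_rate": False, "poverty_rate": True}

scores = impact_scores(schools, factors)
```

With these sample values, `school_a` scores highest and `school_c` lowest, so `school_a` would appear as the darkest region on the map.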

The set of impact scores was then translated into a choropleth map using jefffriesen's d3.js visualization of US zip codes. The SVG was overlaid on Google Maps, and a random sample of 1,000 projects is displayed on the map as markers. Google Maps' MarkerClusterer library was used to prevent visual overload and improve the overall user experience.

##### Database

The SQLite database uses SQLAlchemy as its ORM and contains 10 tables with 3 GB of data on 600,000+ projects.

Structure of Files

The main files of ChoosyDonors are:

  • ```model.py```: This file creates the database and defines the classes that map to the database tables.
  • ```app.py```: This is the heart of the app. It contains all the routes, queries & updates the database, and feeds information to the front end.
  • ```cluster.py```: This file performs natural language processing & tf-idf on the need data for all projects and uses k-means clustering to form the clusters for donation portfolios.
  • ```seed.py```: This file seeds the database.
  • ```calc_impact.py```: This is a script to calculate the impact scores based on various regional factors.
  • ```crime_scrape.py``` & ```tablescraper.py```: These scripts scrape regional data from location data websites.
  • ```projects.db```: This is the sqlite3 database for the app. It is not included in this repository, but the files to set up a similar database are included here (```model.py```).
  • **CSS files:** These files (saved in the ```static``` folder) control the styling of the HTML pages.
  • **HTML templates:** These files (saved in the ```templates``` folder) are the pages of the app.
  • **JS files:** These files (saved in the ```static``` folder) allow projects to be loaded in the app without the entire page having to reload.

Data Sources

  • The bulk of the data used is available on the DonorsChoose website in CSV form.
  • The education-related data for the impact scores is available from the National Center for Education Statistics.
  • Poverty data for neighborhoods in the US can be found on Zipatlas.
  • Crime data for neighborhoods in the US can be found on City-Data.

Learn more about the developer by visiting her LinkedIn.
