timothyLeeXQ/GR-5067-Natural-Language-Processing

GR 5067 Natural Language Processing

This repo contains assignments for GR 5067: Natural Language Processing, a class offered by the Columbia University Quantitative Methods in Social Science department. The class and its assignments aim to "provide a detailed tour on how to access, clean, 'munge' and organize data, both big and small." (The quote is from the course syllabus, which the instructor would prefer not be forked.)

Course assignments focused on:

  • HW1 - Familiarising students with Python syntax
  • HW2 - Using a Google search crawler (instructor-provided) to generate a corpus of text files
  • HW3 - Simple word search and model-based sentiment analysis
  • HW4 - Building a streaming Twitter classifier
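
As a rough illustration of the "simple word search" in HW3, a minimal sketch might look like the following. This is not the course's code: the function name `count_terms` and the two-document corpus are hypothetical, and only the Python standard library is used.

```python
import re
from collections import Counter


def count_terms(text, terms):
    """Count case-insensitive whole-word occurrences of each term in text."""
    # Lowercase the text and split it into word tokens (letters and apostrophes).
    tokens = Counter(re.findall(r"[a-z']+", text.lower()))
    # Look each search term up in the token counts; missing terms count as 0.
    return {term: tokens[term.lower()] for term in terms}


# Hypothetical corpus standing in for the text files generated in HW2.
corpus = {
    "doc1.txt": "The movie was good. Really good, not bad at all.",
    "doc2.txt": "A bad plot and bad acting.",
}

for name, text in corpus.items():
    print(name, count_terms(text, ["good", "bad"]))
```

Matching on whole tokens rather than substrings means a search for "good" does not also count "goodness", which is usually what a word-level search is meant to do.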

The course final project was a free-choice natural language processing project with a class presentation. I chose to run an LDA (latent Dirichlet allocation) topic model on the Book of Psalms. A full report is available in this repo.

About

Repo for GR 5067 Assignments and Project. Natural Language Processing fundamentals with NLTK and Sklearn
