jonathandunn/political_classification

This repository contains feature extraction code for the paper "Profile-based Authorship Analysis."

The dataset is provided here: https://s3.amazonaws.com/jonathandunn/Legislative_Texts.zip

The vectorizers in the 'data' folder were trained on speeches from the US House and Senate, the Canadian House, and the European Parliament, along with miscellaneous political speeches; all of these are included in the dataset.

The code produces X, y feature vectors with or without part-of-speech tags. The "ITFIDF" file produces TF-IDF transforms, while the "RAW" file produces frequency counts.
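To illustrate the difference between the two outputs, here is a minimal standard-library sketch of frequency counts versus TF-IDF weighting. It is not the repository's code: the function names, the toy speeches and labels, the whitespace tokenizer, and the smoothed IDF formula `ln((1 + n) / (1 + df)) + 1` are all assumptions made for illustration. The repository's own pre-trained vectorizers may tokenize, weight, and normalize differently, and part-of-speech tagging is not shown here.

```python
import math
from collections import Counter


def raw_counts(docs, vocab):
    """Return one frequency-count vector per document, ordered by vocab."""
    rows = []
    for doc in docs:
        counts = Counter(doc.lower().split())  # naive whitespace tokenizer (assumption)
        rows.append([counts[term] for term in vocab])
    return rows


def tfidf(count_rows):
    """Weight raw counts by a smoothed inverse document frequency (assumed formula)."""
    n_docs = len(count_rows)
    n_terms = len(count_rows[0])
    # Document frequency: number of documents in which each term occurs.
    df = [sum(1 for row in count_rows if row[j] > 0) for j in range(n_terms)]
    idf = [math.log((1 + n_docs) / (1 + d)) + 1.0 for d in df]
    # No length normalization is applied in this sketch.
    return [[row[j] * idf[j] for j in range(n_terms)] for row in count_rows]


# Hypothetical toy data standing in for speeches and their class labels.
speeches = [
    "the bill cuts taxes",
    "the bill funds schools",
    "taxes fund the schools",
]
y = ["A", "B", "B"]

vocab = sorted({word for s in speeches for word in s.lower().split()})
X_raw = raw_counts(speeches, vocab)   # frequency counts
X_tfidf = tfidf(X_raw)                # TF-IDF weighted counts
```

In this sketch a word that appears in every speech, such as "the", keeps a weight equal to its count, while a word confined to one speech, such as "cuts", is weighted up. That is the practical difference between the count-based and TF-IDF feature vectors.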
