wisautprof

Web information systems - Authorship profiling

Folder structure

DELIVERABLES/ contains
- automatically labeled data, split into two sets: one third for testing and two thirds for training
- manually labeled data
- the report
classifiers/ contains the classifiers used to fill in the information about our profiles
- most are rule-based and implemented by hand
- some use external APIs such as ageanalyzer.com or lymbix for age and sentiment detection
data/ contains all our data collections (in either .json or .shelve format)
models/ interacts with the .shelve files in data/ and structures the information
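
As a rough illustration of the two ideas above (records kept in a .shelve file, and a one-third/two-thirds test/training split), here is a minimal sketch using Python 3 and the standard-library shelve module. It is not the project's code: the helper names load_profiles and split_records, the fixed random seed, the shuffling step, and the record fields are all assumptions made for the example, and it writes a throwaway shelve file instead of touching data/.

```python
import os
import random
import shelve
import tempfile


def load_profiles(shelve_path):
    """Read every record from a shelve file into a plain dict.

    Hypothetical helper; the real models/ package may work differently.
    """
    with shelve.open(shelve_path) as db:
        return dict(db)


def split_records(keys, test_fraction=1 / 3, seed=0):
    """Shuffle keys and return (train, test) lists.

    The shuffle and the seed are assumptions; the README only states the
    1/3 test, 2/3 training proportions.
    """
    shuffled = sorted(keys)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Build a throwaway shelve file so the sketch runs without the real data.
# The keys and fields below are made up for illustration.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo_profiles.shelve")
with shelve.open(demo_path) as db:
    for i in range(9):
        db["profile_%d" % i] = {"age": None, "sentiment": None}

profiles = load_profiles(demo_path)
train, test = split_records(profiles.keys())
print(len(train), "training profiles,", len(test), "test profiles")
```

Splitting on the record keys rather than on the records themselves keeps the split cheap and lets the same key lists be reused for the automatically and manually labeled data.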

To rebuild the dataset, run python buildDataSetsForServer_new.py. It generates all the data and puts it in data/test_data/. You can then copy this new data into DELIVERABLES/auto_labeled_data/.
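
The rebuild-then-copy procedure could be automated along these lines. This is only a sketch under stated assumptions: the function rebuild_and_copy and its run_builder flag are hypothetical, it assumes the script is run from the repository root with the current Python interpreter (the project may target an older Python version), and it needs Python 3.8+ for dirs_exist_ok. The demo at the bottom runs only the copy step on a temporary directory tree with a placeholder file.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def rebuild_and_copy(repo_root, run_builder=True):
    """Regenerate the data set, then copy it into DELIVERABLES/.

    Hypothetical helper, not part of the repository.
    """
    repo_root = Path(repo_root)
    if run_builder:
        # Runs the script named in this README; using sys.executable and
        # the repository root as working directory are assumptions.
        subprocess.run(
            [sys.executable, "buildDataSetsForServer_new.py"],
            cwd=repo_root,
            check=True,
        )
    source = repo_root / "data" / "test_data"
    target = repo_root / "DELIVERABLES" / "auto_labeled_data"
    # dirs_exist_ok (Python 3.8+) lets the copy overwrite an earlier run.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target


# Dry run on a temporary directory tree with a placeholder file, so the
# copy step can be tried without the real crawled data.
demo_root = Path(tempfile.mkdtemp())
(demo_root / "data" / "test_data").mkdir(parents=True)
(demo_root / "data" / "test_data" / "placeholder.txt").write_text("demo")
copied_to = rebuild_and_copy(demo_root, run_builder=False)
print(sorted(p.name for p in copied_to.iterdir()))
```

In a real checkout, calling rebuild_and_copy(".") from the repository root would run the builder first and then perform the copy.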

Dependencies

python-twitter: https://github.com/bear/python-twitter
lymbix: https://github.com/lymbix/Python-wrapper
PyQuery: http://pythonhosted.org/pyquery/api.html
nltk: http://nltk.org/
ageanalyzer.com


SkinMissle76/wisautprof
