Skip to content

guidogallopyn/TextPrediction

Repository files navigation

TextPrediction

this project contains all the sources for the Capstone TextPrediction project for the Data Sciences Specialization on Coursera.

the direcory structure is as follows

  • NGramLM.R : final S3 object for NGrams in R
  • task 0..3: exploration dirctorsie with various R scripts
  • task4
    • traintest.R: script to split corpora in ttaining and test sets
    • count.R: script to count Ngrams, counts stored in counts/
    • train.R: training scritpt that builds series of models with specifed parameters
    • evaluate.R measures coverage on evaluation corpus
    • evaluate2.R; measures perplexity on evaluation corpus
    • evaluate3.R: measures prediction accuracy on evaluation corpus, measurements saore in acc/
  • quiz: scripts to provide answers to quizes
  • python: python LM building (not used in final submission)
  • MilestoneReport: Capstone 1.Rmd is the source for the report
  • shinyapp: self contained directory with all resources for the Text predictor show case app
  • FinalPresentation: FinalReport.RPres is the source for the presentation, others are data files
  • FinalReport; NGramLM.RData contains source for final report (un-finished, not submitted)

not included

  • data: all data downloaded for this project, and splt and subsampled versions
  • models: directory with all models build

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages