shinryudbz/DeepLearning

DeepLearning

DeepLearning research project.

Here are some things that will need to be done if we move forward with this code:

  • generalize the code to work with the other datasets
    • find a general way to specify which fields of a given document vector hold numerical values
  • figure out how to avoid loading all the numerical values into memory to sort and reverse them (for the percentile calculation)
  • experiment with the gensim settings to determine the optimal dimensionality for each feature vector; currently I'm using the defaults.
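
One possible direction for the percentile item is a fixed-width histogram that is filled while streaming over the values, so only the bin counts stay in memory. The sketch below is not what the current code does; it swaps the exact sort-based calculation for an approximation, and it assumes the value range (`lo`, `hi`) is known in advance or found in a cheap first pass. `StreamingHistogram` and `stream_values` are hypothetical names used only for illustration.

```python
class StreamingHistogram:
    """Fixed-width histogram that estimates percentile ranks in O(bins) memory."""

    def __init__(self, lo, hi, bins=1000):
        if hi <= lo:
            raise ValueError("hi must be greater than lo")
        self.lo, self.hi, self.bins = lo, hi, bins
        self.width = (hi - lo) / bins
        self.counts = [0] * bins
        self.total = 0

    def _index(self, value):
        # Clamp so that value == hi (or any out-of-range value) lands in an edge bin.
        i = int((value - self.lo) / self.width)
        return min(max(i, 0), self.bins - 1)

    def add(self, value):
        self.counts[self._index(value)] += 1
        self.total += 1

    def percentile_rank(self, value):
        """Approximate percentage of seen values that are <= value."""
        if self.total == 0:
            raise ValueError("no values added")
        seen = sum(self.counts[: self._index(value) + 1])
        return 100.0 * seen / self.total


def stream_values():
    # Hypothetical stand-in for lazily reading one numeric field from disk.
    for v in range(100):
        yield float(v)


hist = StreamingHistogram(lo=0.0, hi=100.0, bins=100)
for v in stream_values():
    hist.add(v)

print(hist.percentile_rank(49.0))
```

The accuracy depends on the bin count and on how evenly the values are spread across the range; heavily skewed fields would probably need log-spaced bins or a different estimator.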