
A first attempt at programming something useful in Python. The program recognizes the language of a text input using n-gram probabilities.


tulinkry/Language-recognizer

 
 


Language-recognizer

A first attempt at programming something useful in Python. The program should recognize the language of a text input using n-gram probabilities.

Both the code and the algorithm still need improvement (the program does not always give the best answer), and some weights and values still need to be tuned.

## langRecognizer.py

  • already works
  • try: `python3 langRecognizer.py "tell me: in which language is this input" some_vector_file`; the vector file is optional and defaults to language_vector.p
  • `python3 langRecognizer.py` (without any arguments) works as well
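
To illustrate how recognition from n-gram probabilities can work, here is a minimal sketch. It is not the code from langRecognizer.py: the function names (`ngram_probabilities`, `score`, `recognize`), the use of character trigrams, the averaged log-probability score, and the `floor` value for unseen n-grams are all assumptions made for this example.

```python
import math
from collections import Counter


def ngram_probabilities(text, n=3):
    """Map each character n-gram of text to its relative frequency."""
    text = text.lower()
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()} if total else {}


def score(text, profile, n=3, floor=1e-6):
    """Average log-probability of the text's n-grams under one language profile.

    n-grams missing from the profile get the small probability `floor`
    (an assumed smoothing choice) so that the logarithm stays defined.
    """
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return float("-inf")
    return sum(math.log(profile.get(g, floor)) for g in grams) / len(grams)


def recognize(text, vectors, n=3):
    """Return the language whose profile gives the text the highest score."""
    text = text.lower()
    return max(vectors, key=lambda lang: score(text, vectors[lang], n))


# Tiny hand-made profiles, for illustration only.
vectors = {
    "english": ngram_probabilities(
        "the quick brown fox jumps over the lazy dog and the cat"),
    "german": ngram_probabilities(
        "der schnelle braune fuchs springt über den faulen hund und die katze"),
}

print(recognize("the dog and the fox", vectors))
```

Profiles built from such short samples are only good enough for a demonstration; real vectors need far more text per language.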

## ngrams.py

  • can create n-grams, count how often each n-gram occurs in a text, and compute their probabilities
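
The three steps above (creating n-grams, counting them, computing probabilities) can be sketched as follows. This is an illustration, not the actual contents of ngrams.py: the function names are hypothetical, and the choice of character n-grams (rather than word n-grams) is an assumption.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return every contiguous run of n characters in text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_counts(text, n):
    """Count how many times each n-gram occurs in text."""
    return Counter(char_ngrams(text, n))


def ngram_probabilities(text, n=3):
    """Map each n-gram to its relative frequency (count / total n-grams)."""
    counts = ngram_counts(text.lower(), n)
    total = sum(counts.values())
    if total == 0:  # text shorter than n: no n-grams at all
        return {}
    return {gram: count / total for gram, count in counts.items()}


print(char_ngrams("banana", 2))   # ['ba', 'an', 'na', 'an', 'na']
print(ngram_counts("banana", 2))  # 'an' and 'na' occur twice, 'ba' once
probs = ngram_probabilities("banana", n=2)
print(probs["an"])                # 2 of 5 bigrams
```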

## langVector.py

  • some useful functions for creating language vectors

## addVector.py

  • already works
  • command-line script for adding a new language vector to an existing (or new) vector file
  • `python3 addVector.py "language" file_with_plain_text file_with_vectors`
  • the last argument is optional and defaults to language_vector.p
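
A rough sketch of what adding a vector to an existing or new file could look like is shown below. It assumes that the `.p` file is a Python pickle holding a dictionary that maps language names to n-gram probability dictionaries. The README does not confirm this layout, and `add_language_vector` and `ngram_probabilities` are hypothetical names, not the functions in addVector.py.

```python
import os
import pickle
import tempfile
from collections import Counter


def ngram_probabilities(text, n=3):
    """Map each character n-gram of text to its relative frequency."""
    text = text.lower()
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()} if total else {}


def add_language_vector(language, text, vector_path):
    """Add (or overwrite) one language's vector, creating the file if needed.

    Assumed file layout: a pickled dict of {language: {ngram: probability}}.
    """
    vectors = {}
    if os.path.exists(vector_path):
        with open(vector_path, "rb") as f:
            vectors = pickle.load(f)
    vectors[language] = ngram_probabilities(text)
    with open(vector_path, "wb") as f:
        pickle.dump(vectors, f)
    return vectors


# Demonstration in a temporary directory so no real vector file is touched.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_vectors.p")
    add_language_vector("english", "the quick brown fox", path)
    vectors = add_language_vector("czech", "příliš žluťoučký kůň", path)

print(sorted(vectors))
```

Because unpickling can execute arbitrary code, a pickle-based vector file should only be loaded when it comes from a trusted source.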

## language_vector.p

  • file with ready-made vectors, intended only for testing (created from a small amount of data), but you can also use it to try the program out
  • data source: Gutenberg.org
