thomas-corcoran/IrishDialects

Irish Dialect Classifier

This is a set of tools for training a model that classifies Irish text into the three main dialect groups of the language: Ulster, Connacht, and Munster. The classifier uses a model trained on a large corpus of Irish to detect the dialect at either the document level or the sentence level.

Dependencies

  • Python 2.7.x
  • sklearn >= 0.17

If you want to train your own model, you'll need Nua-Chorpas na hÉireann (The New Corpus for Ireland), which is available upon request. It is a large corpus of more than 30 million Irish words drawn from a variety of texts. Note: I am not affiliated with the creators of the corpus, so I cannot grant access to it.
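
Until the usage notes are written, here is a rough sketch of what training and applying a dialect classifier with sklearn could look like. It is not the code in this repository: the character n-gram TF-IDF features, the naive Bayes model, the three toy training sentences (common ways of asking "How are you?" in each dialect group), and the function names `train_classifier`, `classify_document`, and `classify_sentences` are all assumptions made for illustration. A real model would be trained on labelled text from the corpus instead.

```python
# -*- coding: utf-8 -*-
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy stand-in for a labelled corpus (assumption, for illustration only):
# one common way of asking "How are you?" per dialect group.
TRAIN_TEXTS = [
    "Cad é mar atá tú?",
    "Cén chaoi a bhfuil tú?",
    "Conas atá tú?",
]
TRAIN_LABELS = ["Ulster", "Connacht", "Munster"]


def train_classifier(texts, labels):
    """Fit a character n-gram TF-IDF + naive Bayes pipeline (hypothetical setup)."""
    model = make_pipeline(
        TfidfVectorizer(analyzer="char", ngram_range=(1, 3)),
        MultinomialNB(),
    )
    model.fit(texts, labels)
    return model


def classify_document(model, document):
    """Return one dialect label for the whole document."""
    return str(model.predict([document])[0])


def classify_sentences(model, document):
    """Split on sentence-final punctuation and label each sentence separately."""
    parts = re.split(r"(?<=[.!?])\s+", document)
    sentences = [s.strip() for s in parts if s.strip()]
    if not sentences:
        return []
    labels = model.predict(sentences)
    return [(s, str(label)) for s, label in zip(sentences, labels)]


if __name__ == "__main__":
    model = train_classifier(TRAIN_TEXTS, TRAIN_LABELS)
    print(classify_document(model, "Conas atá tú inniu?"))
    for sentence, label in classify_sentences(model, "Cad é mar atá tú? Conas atá tú?"):
        print(label, sentence)
```

With only three training sentences the predictions mean very little; the sketch is only meant to show the difference between labelling a whole document at once and labelling it sentence by sentence.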

Usage

Coming soon

License

Licensed under GPL v3. If you use, modify, or distribute this code, please make sure the source code is freely available.
