Recognizing Japanese Text

An open-source classification model for digitally handwritten Japanese characters. It makes predictions from observations generated by drawing interactions in an Android app.

This project started as an assignment with a deadline of only a few weeks. It became a race against time to squeeze in an MVP, a CNN trained on an existing dataset, while building my first Android app to create a new dataset from scratch for my actual idea. If you are here to see the initial version and notebooks, those parts have been archived here.

The more interesting results, however, can be found in the notebooks directory. The revision has improved accuracy, more intriguing models, and much cleaner code. All the basic ideas have been rewritten for PyTorch instead of Keras, with more sophisticated practices.

The Original (probably but not necessarily abandoned) Roadmap:

Stage 1 - Build an OCR model using existing data from the Kuzushiji-49 dataset. These observations are a step removed from the goal of this project, but they offer an advantage for comparison and future generalization: classifying them is a more difficult task, since the historical kuzushiji script is less standardized.
Completed - 6/16/2020

Stage 2 - Use transfer learning to bring models trained on the smaller dataset up to speed with the Kuzushiji-49 models.
Completed - 6/16/2020

Stage 3 - Once there are enough observations, build a standalone model without the kuzushiji data and determine the best architecture for the OCR model.
Completed - 6/16/2020

Stage 4 - Explore using the raw drawing data to supply information that the bitmap images inherently do not capture, such as stroke direction or order.
Completed - 6/16/2020

Stage 5 - Rewrite the app as a versatile study app that generates observations more efficiently.
Not Started - TBA

Stage 6 - Expand to the katakana and kanji datasets.
Not Started - TBA
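
Stage 4 rests on the observation that a bitmap discards stroke direction and order. The sketch below illustrates that point in plain Python. It assumes strokes are recorded as ordered lists of (x, y) grid points, which may not match the app's actual raw format; the `rasterize` helper, the 5x5 grid, and the sample strokes are all hypothetical and not taken from this repository.

```python
from typing import List, Tuple

Point = Tuple[int, int]
Stroke = List[Point]


def rasterize(strokes: List[Stroke], size: int = 5) -> List[List[int]]:
    """Hypothetical helper: mark every grid cell touched by any stroke point."""
    grid = [[0] * size for _ in range(size)]
    for stroke in strokes:
        for x, y in stroke:
            grid[y][x] = 1
    return grid


# Two strokes forming a cross: one horizontal (left to right),
# one vertical (top to bottom).
horizontal = [(0, 2), (1, 2), (2, 2), (3, 2), (4, 2)]
vertical = [(2, 0), (2, 1), (2, 2), (2, 3), (2, 4)]

drawing_a = [horizontal, vertical]
# Same shape, but with the stroke order swapped and each stroke drawn in reverse.
drawing_b = [vertical[::-1], horizontal[::-1]]

# The bitmaps are identical even though the raw stroke sequences differ.
same_bitmap = rasterize(drawing_a) == rasterize(drawing_b)
same_raw = drawing_a == drawing_b
print(same_bitmap, same_raw)  # True False
```

Both drawings collapse to the same image, so a model that sees only bitmaps cannot tell them apart. Only the raw point sequences preserve the difference.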

coreyryanhanson/japanese_text_classifiers
