Aim

Parse documents with a regular layout using OCR. This is hacky code with poor style (for example, I mix lists and NumPy arrays freely), so use it with care.
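To illustrate what parsing a regular layout can involve, here is a minimal sketch that groups recognised words into rows by their vertical position. It is not taken from this repository: the function name `group_into_rows`, the `(text, x, y)` input format and the `tolerance` parameter are all assumptions made for the example, and the word positions would in practice come from whatever OCR output is used.

```python
def group_into_rows(words, tolerance=10):
    """Group (text, x, y) tuples into rows of text, ordered left to right.

    Hypothetical helper: a word joins the current row when its y coordinate
    is within `tolerance` pixels of the first word placed in that row.
    """
    rows = []
    # Visit words top to bottom, then left to right.
    for text, x, y in sorted(words, key=lambda w: (w[2], w[1])):
        if rows and abs(y - rows[-1]["y"]) <= tolerance:
            rows[-1]["words"].append((x, text))
        else:
            rows.append({"y": y, "words": [(x, text)]})
    # Within each row, order the words by their x coordinate.
    return [[text for _, text in sorted(row["words"])] for row in rows]


# Made-up word positions, as an OCR step might report them.
example_words = [
    ("Total", 10, 52),
    ("Item", 10, 20),
    ("12.50", 200, 50),
    ("Price", 200, 22),
]
print(group_into_rows(example_words))
```

A fixed pixel tolerance only works when the page is not skewed and the line spacing is consistent, which is the "regular layout" case this sketch assumes.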

Installation

For this code to run, it is necessary to install the OCR engine Tesseract. This is possible on both Linux and Windows. On Ubuntu, sudo apt install tesseract-ocr usually does the trick.

On Windows, it might also be necessary to install poppler.
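A quick way to check whether these external tools are reachable is to look for their executables on the PATH. The sketch below uses only the Python standard library. The helper name `find_missing_tools` is made up for this example, and checking for `pdftoppm` (one of the command-line utilities shipped with poppler) is an assumption about which poppler tool is actually needed.

```python
import shutil


def find_missing_tools(tools=("tesseract", "pdftoppm")):
    """Return the names in `tools` that cannot be found on PATH.

    Hypothetical helper; `pdftoppm` stands in for poppler here (assumption).
    """
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("Tesseract and poppler tools found.")
```

If a tool is installed but still reported as missing, its installation directory probably has not been added to the PATH, which is a common situation on Windows.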

EdgarMCR/ocr_try_out

Folders and files

Latest commit

History

Repository files navigation

Aim

Installation

About

Resources

Stars

Watchers

Forks

Languages