TesseractXplore

This tool provides a graphical interface to Tesseract. Images can be loaded via a file chooser window or drag-and-drop. The resulting full-text files can also be edited.

Application features
    ┣━ Do OCR (single or batchwise)
    ┣━ Save/Load tesseract settings
    ┣━ Open PDFs with external applications (web browser, ...)
    ┣━ Convert PDF to images
    ┣━ Edit images (rotation, color adjustments, ...)
    ┣━ Evaluate output (character occurrences)
    ┣━ Compare results with different settings
    ┣━ Edit full-text outputs (text, ALTO, hOCR, TSV)
    ┗━ Easy-to-use online search engine for new models
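
The output evaluation listed above can be pictured as a frequency count over the recognized text. The following is a minimal sketch of that idea, not the application's actual code; the function name `character_occurrences` is hypothetical.

```python
from collections import Counter
import unicodedata


def character_occurrences(text):
    """Count how often each non-whitespace character occurs in OCR output."""
    # Normalize to NFC so precomposed and combining forms of the same
    # character are counted together where a precomposed form exists.
    normalized = unicodedata.normalize("NFC", text)
    return Counter(ch for ch in normalized if not ch.isspace())


counts = character_occurrences("Tesseract test")
# Show the three most frequent characters with their counts
print(counts.most_common(3))
```

Unexpected characters with a low count are often a hint that the chosen model or settings do not fit the document.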

Tesseract

Tesseract is a command-line OCR engine. It supports Unicode (UTF-8) and can recognize more than 100 languages "out of the box".

There are several third-party projects that provide a GUI for Tesseract, but each of them lacks some functionality.
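
Because Tesseract is driven from the command line, any GUI ultimately wraps calls to the `tesseract` executable. The sketch below shows one way to do this from Python, assuming `tesseract` is on the PATH. The helper names are hypothetical and this is not necessarily how TesseractXplore invokes the engine.

```python
import subprocess


def build_tesseract_command(image_path, lang="eng", psm=3):
    """Return the argument list for a Tesseract call that prints text to stdout."""
    # Using "stdout" as the output base makes Tesseract write the result
    # to standard output instead of a file.
    return ["tesseract", str(image_path), "stdout", "-l", lang, "--psm", str(psm)]


def recognize(image_path, lang="eng", psm=3):
    """Run Tesseract on one image and return the recognized text."""
    result = subprocess.run(
        build_tesseract_command(image_path, lang, psm),
        capture_output=True,  # collect stdout and stderr
        check=True,           # raise CalledProcessError on a non-zero exit code
        encoding="utf-8",     # Tesseract emits UTF-8 text
    )
    return result.stdout
```

A call such as `recognize("page.png", lang="deu")` would then return the recognized text of a hypothetical `page.png` using the German model.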

Images

Any image readable by Leptonica is supported in Tesseract, including BMP, PNM, PNG, JFIF, JPEG, and TIFF. GIF and PDF are not supported, but PDFs can be converted to readable image formats.
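
A front end can use the list above to decide what to do with a dropped file. The sketch below sorts inputs by file extension. The mapping from extensions to formats is an assumption made for illustration, and the function name is hypothetical.

```python
from pathlib import Path

# Extensions assumed to correspond to the formats named above
SUPPORTED_IMAGES = {
    ".bmp", ".pnm", ".pbm", ".pgm", ".ppm",
    ".png", ".jfif", ".jpg", ".jpeg", ".tif", ".tiff",
}
# PDFs cannot be read directly and must be converted to images first
NEEDS_CONVERSION = {".pdf"}


def classify_input(path):
    """Return 'image', 'convert' or 'unsupported' for a given file path."""
    suffix = Path(path).suffix.lower()
    if suffix in SUPPORTED_IMAGES:
        return "image"
    if suffix in NEEDS_CONVERSION:
        return "convert"
    return "unsupported"


for name in ("scan.TIF", "book.pdf", "anim.gif"):
    print(name, classify_input(name))
```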

Models

The official models are divided into three types: tessdata_best, tessdata_fast and tessdata.

The models can be further categorized into language models (e.g. German) and script models (e.g. Latin). Script models are trained for a whole writing system and their integrated dictionary is very broad, whereas language models are trained on a subset of a writing system and contain a language-specific dictionary.

Model names are abbreviations, such as deu for the standard German (Deutsch) language model, and further information about the models is hard to find. Models need to be downloaded and installed into the tessdata path.

That is why the application provides an online model search engine with more metadata for each model. Finding and installing new models is now easy and straightforward.
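
Installed models are stored as `<name>.traineddata` files in the tessdata directory, so the models that are already available can be listed by scanning that directory. A minimal sketch follows. The function name is hypothetical, and the demonstration uses a temporary directory as a stand-in for the real tessdata path.

```python
import tempfile
from pathlib import Path


def installed_models(tessdata_dir):
    """Return the sorted names of all models found in a tessdata directory."""
    # A model file "deu.traineddata" is addressed by Tesseract as "deu"
    return sorted(p.stem for p in Path(tessdata_dir).glob("*.traineddata"))


# Demonstration with a throwaway directory standing in for the tessdata path
with tempfile.TemporaryDirectory() as tmp:
    for name in ("eng", "deu"):
        (Path(tmp) / f"{name}.traineddata").touch()
    models = installed_models(tmp)

print(models)
```

The names returned this way are the values that can be passed to Tesseract's `-l` option.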

Fulltext-fileformat

Tesseract supports various output formats: plain text, hOCR (HTML), PDF, invisible-text-only PDF, TSV. The master branch also has experimental support for ALTO (XML) output.
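
On the command line, an output format is selected by appending the name of a config file (`txt`, `hocr`, `pdf`, `tsv`, `alto`) to the call, and several can be combined in one run. The sketch below builds such a command. The function and the keys of the mapping are hypothetical names chosen for illustration.

```python
# Maps a descriptive format name to the Tesseract config file that enables it
OUTPUT_CONFIGS = {
    "text": "txt",
    "hocr": "hocr",
    "pdf": "pdf",
    "tsv": "tsv",
    "alto": "alto",
}


def build_output_command(image_path, output_base, formats, lang="eng"):
    """Return a Tesseract command that writes one output file per format."""
    # An unknown format name raises KeyError instead of being passed through
    configs = [OUTPUT_CONFIGS[fmt] for fmt in formats]
    return ["tesseract", str(image_path), str(output_base), "-l", lang, *configs]


print(build_output_command("page.png", "out/page", ["text", "hocr", "tsv"]))
```

Tesseract appends the matching file extension to the output base for each requested format.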

The application lets you edit plain text, hOCR, ALTO and TSV output and store the result for further use.

Development Status

See Issues for planned features and current progress.

This project is currently in an early development stage and not very polished. All the features described below are functional, however.

Python Package

See the wiki for details on the Python package.

Installation

It is recommended to use a virtual environment:

python3 -m venv venv
Linux/macOS:
source venv/bin/activate
Windows (cmd.exe):
venv\Scripts\activate.bat

Linux/OS:

sudo apt-get install python3-sdl2
pip install --upgrade pip
pip install .[app]

If you use zsh, you need to escape the brackets with backslashes, e.g. pip install .\[app\]

Windows (don't install kivy-gstreamer!):

pip install --upgrade pip
pip install .[all-win]

The standard text renderer can't display combined glyphs correctly. To display them, an alternative text renderer such as pango needs to be used:

Install pangoft2 (apt install libfreetype6-dev libpango1.0-dev libpangoft2-1.0-0) or ensure it is available in pkg-config.
Recompile Kivy and check that pangoft2 is found (use_pangoft2 = 1).
Test it by forcing the core text renderer to pango via the environment variable: export KIVY_TEXT=pango

GUI

GUI Usage

Start the GUI:

tesseractXplore

Image Selection and OCR

The basic UI components are shown below:

Drag & drop images or folders into the window.
Or select files via the file browser on the right.
Enter the Tesseract settings.
Click the 'Run' button in the lower-left to recognize the selected images.

Other things to do:

Middle-click an image to remove it
Right-click an image for a menu of more actions

Save, load and reset tesseract settings

Settings profiles can be handy, for example a profile for historical documents with just one column. The application lets you save the current settings as a profile under a name of your choice, reset the settings to their defaults, and search for and load stored profiles.
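
Conceptually, a settings profile is a named set of Tesseract parameters that can be stored and restored. The sketch below stores each profile as a JSON file. The file layout, function names and default values shown are assumptions for illustration and do not describe the application's actual storage format.

```python
import json
import tempfile
from pathlib import Path

# Assumed defaults: English model, automatic page segmentation, default engine
DEFAULT_SETTINGS = {"lang": "eng", "psm": 3, "oem": 3}


def save_profile(profile_dir, name, settings):
    """Write the given settings to <profile_dir>/<name>.json."""
    path = Path(profile_dir) / f"{name}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2), encoding="utf-8")
    return path


def load_profile(profile_dir, name):
    """Load a profile; keys it does not define fall back to the defaults."""
    path = Path(profile_dir) / f"{name}.json"
    stored = json.loads(path.read_text(encoding="utf-8"))
    return {**DEFAULT_SETTINGS, **stored}


def list_profiles(profile_dir):
    """Return the names of all stored profiles, sorted alphabetically."""
    return sorted(p.stem for p in Path(profile_dir).glob("*.json"))


# Demonstration in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    # --psm 4 tells Tesseract to assume a single column of text
    save_profile(tmp, "historical-one-column", {"lang": "deu", "psm": 4})
    names = list_profiles(tmp)
    restored = load_profile(tmp, "historical-one-column")

print(names, restored)
```

Resetting to the defaults then amounts to using a copy of `DEFAULT_SETTINGS` instead of a loaded profile.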

Model Search

If you don't have a suitable model, click the 'Find a new model' button to go to the model search screen. You can start with searching by name, with autocompletion support:

You can also run a full search using the additional filters, for example group attributes (tessdata, ...), tags (medieval), model type (Fast/Best) or category (Language/Script):

On the right side of the window is an information page with a download option.

Settings

There are also some settings to customize the application and global parameters. And yes, there is a dark mode, because why not.

Keyboard Shortcuts

Some keyboard shortcuts are included for convenience:

| Key(s) | Action | Screen |
|---|---|---|
| F11 | Toggle fullscreen | All |
| Ctrl+O | Open file chooser | Image selection |
| Shift+Ctrl+O | Open file chooser (dirs) | Image selection |
| Ctrl+Enter | Run image tagger | Image selection |
| Ctrl+Enter | Run model search | Model search |
| Shift+Ctrl+X | Clear selected images | Image selection |
| Shift+Ctrl+X | Clear search filters | Model search |
| Ctrl+S | Open settings screen | All |
| Ctrl+Backspace | Return to main screen | All |
| Ctrl+Q | Quit | All |
