
guillotel-nothmann/imageAnnotation


TMG_ImageAnnotation

TMG_ImageAnnotation is a Python module for viewing and editing OCR outputs and segmentation ground truth. The software is developed in the context of the project Thesaurus Musicarum Germanicarum, dedicated to printed music theory of the modern era (1470-1750). The viewer shows the page layout as a transparent overlay on the document image. Text regions (paragraphs, headings, captions, drop capitals, graphics, music examples, etc.) are displayed as tooltips and can be edited. TMG_ImageAnnotation takes METS data as input and stores image annotations in the PAGE XML format. The project is designed to be embedded in the OCR-D initiative. In particular, it aims to improve the page segmentation step, which remains critical in OCR workflows for historical sources.

Installation

TMG_ImageAnnotation is written in Python. It is based on the matplotlib library and requires lxml and numpy; the commands below also install scikit-image, tensorflow, keras and h5py. Install Python 3 and the packages below if they are not available on your machine:

pip3 install numpy
pip3 install lxml
pip3 install matplotlib
pip3 install scikit-image
pip3 install tensorflow==1.6.0
pip3 install keras==2.1.3
pip3 install h5py==2.9

Download or clone this project.

git clone https://github.com/guillotel-nothmann/imageAnnotation.git

Usage

Open a terminal, navigate to the src folder, and run main.py:

cd ImageAnnotation/src 
python3 main.py

Once launched, open a METS file that points to page regions via PAGE files with image URLs.
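To check what a METS file references before opening it in the viewer, a small script along the following lines may help. It is only a sketch and not part of TMG_ImageAnnotation: the function name `list_file_links`, the sample document, and the file group names (`OCR-D-IMG`, `OCR-D-GT-SEG`) are illustrative assumptions, and it uses the standard library parser rather than lxml so that it runs without extra packages.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the METS and XLink standards.
METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"


def list_file_links(mets_xml):
    """Return {fileGrp USE value: [xlink:href, ...]} for a METS document string.

    Hypothetical helper for illustration; not a function of this project.
    """
    root = ET.fromstring(mets_xml)
    links = {}
    for group in root.iter(f"{{{METS_NS}}}fileGrp"):
        use = group.get("USE", "")
        for flocat in group.iter(f"{{{METS_NS}}}FLocat"):
            href = flocat.get(f"{{{XLINK_NS}}}href")
            if href:
                links.setdefault(use, []).append(href)
    return links


# Made-up sample; real METS files carry many more sections.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
                       xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="OCR-D-IMG">
      <mets:file ID="IMG_0001" MIMETYPE="image/png">
        <mets:FLocat LOCTYPE="URL" xlink:href="images/0001.png"/>
      </mets:file>
    </mets:fileGrp>
    <mets:fileGrp USE="OCR-D-GT-SEG">
      <mets:file ID="PAGE_0001" MIMETYPE="application/vnd.prima.page+xml">
        <mets:FLocat LOCTYPE="URL" xlink:href="page/0001.xml"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>"""

if __name__ == "__main__":
    for use, hrefs in list_file_links(SAMPLE).items():
        print(use, hrefs)
```

If the PAGE file group comes back empty, the viewer will likely have no regions to display for that document.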

Use the following commands to navigate and to edit:

  • Quit: "ctrl+q"
  • Open: "command+o" (Mac), "alt+o" (Windows)
  • Save: "command+s" (Mac), "alt+s" (Windows)
  • Display polygon information: right click on the polygon
  • Edit polygon information: select a polygon and press "+"
  • Add coordinates: click on the polygon lines and press "i"
  • Delete coordinates: click on a polygon point and press "d"
  • Delete whole region: select polygon and press "backspace"
  • Zoom: "shift" + mouse selection
  • Unzoom: "ctrl" + mouse click
  • Next page: "right"
  • Previous page: "left"
  • Next region: "down"
  • Previous region: "up"
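The "add coordinates" and "delete coordinates" commands amount to inserting and removing vertices in a region polygon. PAGE XML stores such a polygon as a `points` string of the form `x1,y1 x2,y2 ...`. The sketch below shows one way to perform these two operations on that representation. It is an illustration under stated assumptions, not the code used in main.py: all function names are hypothetical, and inserting the new point on the nearest edge is an assumed behaviour.

```python
def parse_points(points):
    """Turn a PAGE-style string '10,10 50,10' into [(10, 10), (50, 10)]."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def format_points(polygon):
    """Inverse of parse_points: a list of (x, y) tuples back to a string."""
    return " ".join(f"{x},{y}" for x, y in polygon)


def _distance_to_segment(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        t = 0.0  # degenerate segment: a and b coincide
    else:
        # Project p onto the line through a and b, clamped to the segment.
        t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    cx, cy = ax + t * dx, ay + t * dy
    return ((px - cx) ** 2 + (py - cy) ** 2) ** 0.5


def insert_point(polygon, click):
    """Return a new polygon with `click` inserted on the nearest edge."""
    n = len(polygon)
    nearest = min(
        range(n),
        key=lambda i: _distance_to_segment(click, polygon[i], polygon[(i + 1) % n]),
    )
    return polygon[: nearest + 1] + [click] + polygon[nearest + 1 :]


def delete_point(polygon, index):
    """Return a new polygon without the vertex at `index`."""
    if len(polygon) <= 3:
        raise ValueError("a polygon needs at least three points")
    return polygon[:index] + polygon[index + 1 :]


if __name__ == "__main__":
    square = parse_points("10,10 50,10 50,40 10,40")
    edited = insert_point(square, (30, 11))
    print(format_points(edited))
    print(format_points(delete_point(edited, 1)))
```

Both functions return new lists instead of modifying the input, which keeps an undo step simple if one is needed.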

Polygon regions can be added using the buttons or the following key combinations:

  • "shift+c": caption
  • "ctrl+d": diagram
  • "shift+d": drop capital
  • "shift+f": footer
  • "ctrl+f": footnote
  • "shift+G": graphic
  • "ctrl+h": header
  • "shift+H": heading
  • "shift+I": image
  • "ctrl+l": linedrawing
  • "shift+Z": list
  • "shift+M": marginalia
  • "shift+O": other
  • "ctrl+O": ornament
  • "shift+P": paragraph
  • "ctrl+p": page number
  • "ctrl+s": separator
  • "shift+S": staff notation
  • "shift+T": tablature notation
  • "ctrl+t": table
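For scripting or documentation purposes, the key combinations above can be written down as a lookup table. The snippet below is a sketch only: how main.py actually dispatches key events is not described here, the names `REGION_SHORTCUTS` and `region_for` are hypothetical, and treating the combinations as case-insensitive is an assumption based on the mixed capitalisation in the list.

```python
# Key combination -> region type, transcribed from the list above.
# Keys are stored in lower case (assumption: matching is case-insensitive).
REGION_SHORTCUTS = {
    "shift+c": "caption",
    "ctrl+d": "diagram",
    "shift+d": "drop capital",
    "shift+f": "footer",
    "ctrl+f": "footnote",
    "shift+g": "graphic",
    "ctrl+h": "header",
    "shift+h": "heading",
    "shift+i": "image",
    "ctrl+l": "linedrawing",
    "shift+z": "list",
    "shift+m": "marginalia",
    "shift+o": "other",
    "ctrl+o": "ornament",
    "shift+p": "paragraph",
    "ctrl+p": "page number",
    "ctrl+s": "separator",
    "shift+s": "staff notation",
    "shift+t": "tablature notation",
    "ctrl+t": "table",
}


def region_for(combo):
    """Return the region type for a key combination, or None if unassigned."""
    return REGION_SHORTCUTS.get(combo.strip().lower())


if __name__ == "__main__":
    for combo in ("shift+S", "ctrl+s", "ctrl+x"):
        print(combo, "->", region_for(combo))
```

Note that the same letter maps to different region types depending on the modifier, for example "shift+s" (staff notation) and "ctrl+s" (separator).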

Example

Run TMG_ImageAnnotation:

cd ImageAnnotation/src 
python3 main.py

Open the mets.xml file located in the following folder: "/ImageAnnotation/annotationExample". You should see the following example and be able to edit its region annotations.

ImageAnnotationExample

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

License

CC BY-NC-SA
