ytarfa/shadow_removal

Shadow Removal (Work in Progress)

Training a Convolutional Neural Network (CNN) to detect and remove unwanted shadows from smartphone document captures.

Built with

  • Python
  • Docker
  • OpenCV (image manipulation)
  • Pathos (multiprocessing)
  • Noise (Perlin noise)

Data Synthesis

Silhouettes

A set of manually drawn silhouettes is used to create realistic shadows on a set of document images.

Operations on silhouettes:

  1. Noise is added to the silhouette with a Perlin noise mask (see the noise module).
  2. The silhouette is blurred with a Gaussian convolution (see OpenCV Gaussian blurring).
  3. The silhouette is randomly scaled (200-500%).
  4. The silhouette's transparency is randomly chosen (0.4-0.7).
  5. The silhouette is padded with empty pixels (or trimmed) so that it has the same dimensions as the document image.
  6. The final image is a linear blend of the original document image and the silhouette image.
Figures: original silhouette, silhouette after operations, silhouette applied on document.
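Steps 4 to 6 above can be sketched with NumPy alone. This is a minimal illustration, not the repository's code: the function names `fit_to_shape` and `apply_shadow` are hypothetical, the silhouette is assumed to be a 2-D float mask in [0, 1] (1 = full shadow, 0 = empty), padding is assumed to be anchored at the top-left corner, and the shadow is assumed to be black. Noise, blurring, and scaling (steps 1 to 3) are left out because they depend on the noise module and OpenCV.

```python
import numpy as np


def fit_to_shape(mask, height, width):
    """Trim or zero-pad a 2-D mask so it has shape (height, width).

    Assumption: the mask stays anchored at the top-left corner.
    """
    fitted = np.zeros((height, width), dtype=mask.dtype)
    h = min(height, mask.shape[0])
    w = min(width, mask.shape[1])
    fitted[:h, :w] = mask[:h, :w]
    return fitted


def apply_shadow(document, silhouette, rng=None):
    """Blend a silhouette mask onto a document image.

    document:   uint8 array, shape (H, W) or (H, W, C)
    silhouette: float array in [0, 1], any 2-D shape (assumed convention)
    Returns the shadowed image and the transparency that was drawn.
    """
    rng = np.random.default_rng() if rng is None else rng
    alpha = rng.uniform(0.4, 0.7)  # step 4: random transparency

    height, width = document.shape[:2]
    weight = alpha * fit_to_shape(silhouette, height, width)  # step 5
    if document.ndim == 3:
        weight = weight[..., np.newaxis]  # broadcast over colour channels

    # Step 6: linear blend with a black shadow,
    # result = (1 - weight) * document + weight * 0
    blended = document.astype(np.float32) * (1.0 - weight)
    return np.clip(blended.round(), 0, 255).astype(np.uint8), alpha
```

Passing a seeded generator such as `np.random.default_rng(0)` as `rng` makes the drawn transparency reproducible, which can help when comparing synthesized samples.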

Document Images

Document images are aggregated from two datasets: SmartDocQA [1] and The IUPR Dataset of Camera-Captured Document Images [2]. The IUPR images are already scanned and trimmed. The SmartDocQA images are not, so they are trimmed in this project using OpenCV contour detection.

Operations on SmartDocQA document images:

  1. A threshold is applied to blacken the part of the image that lies outside the document (see OpenCV thresholding).
  2. Document edges are detected using OpenCV contour detection.
  3. Using the document's edge box, the document image is trimmed and warped.
Figures: original SmartDoc image, SmartDoc image after thresholding, trimmed SmartDoc image.

Training Data

To create the training data, silhouettes are generated using the aforementioned methods and applied to the document images. The training data creation procedure is as follows:

  1. Masks and documents are identified using the uuid module.
  2. Each original document is saved as "doc_<<document uuid>>.jpg".
  3. Each masked document is saved as "doc_<<document uuid>>_mask_<<mask uuid>>.jpg".
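The naming scheme above can be illustrated with a few lines of standard-library Python. The helper names are hypothetical, and the choice of `uuid.uuid4()` in its default hyphenated string form is an assumption, since the text only says that the uuid module is used.

```python
import uuid


def new_id():
    """Return a fresh random identifier (uuid4 is an assumed choice)."""
    return str(uuid.uuid4())


def document_filename(doc_id):
    """File name for an original document."""
    return f"doc_{doc_id}.jpg"


def masked_filename(doc_id, mask_id):
    """File name for a document with a given mask applied."""
    return f"doc_{doc_id}_mask_{mask_id}.jpg"


def parse_masked_filename(filename):
    """Recover (doc_id, mask_id) from a masked-document file name."""
    stem = filename[len("doc_"):-len(".jpg")]
    doc_id, mask_id = stem.split("_mask_")
    return doc_id, mask_id
```

Because the document uuid is embedded in both names, each masked image can be paired with its clean original by parsing the file name alone.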

To reduce run time, Python's multiprocessing module and the pathos module are used to run multiple operations in parallel.
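As a rough illustration of this kind of parallelism, the sketch below uses only the standard-library `multiprocessing.Pool` rather than pathos. Everything in it is assumed for the example: the function names, the placeholder worker body, and the idea that every document is paired with every silhouette.

```python
import itertools
from multiprocessing import Pool


def build_tasks(document_paths, silhouette_paths):
    """Pair every document with every silhouette (an assumed pairing strategy)."""
    return list(itertools.product(document_paths, silhouette_paths))


def synthesize(task):
    """Hypothetical worker: stands in for loading, shadowing, and saving one image.

    Here it only returns a label so the sketch stays self-contained.
    """
    document_path, silhouette_path = task
    return f"{document_path}|{silhouette_path}"


def run_parallel(tasks, processes=None):
    """Run the worker over all tasks in a process pool.

    processes=None lets the pool use one worker per available CPU.
    """
    with Pool(processes=processes) as pool:
        return pool.map(synthesize, tasks)
```

When `run_parallel` is called from a script, the call should sit under an `if __name__ == "__main__":` guard so that worker processes can be started safely on platforms that spawn rather than fork.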

Docker image

To run the application on a Google Compute Engine instance, a Docker image is built, pushed to Docker Hub (Docker Repository Link), and then pulled on the server.

References

[1] Nibal Nayef, Muhammad Muzzamil Luqman, Sophea Prum, Sebastien Eskenazi, Joseph Chazalon, Jean-Marc Ogier: “SmartDoc-QA: A Dataset for Quality Assessment of Smartphone Captured Document Images - Single and Multiple Distortions”, Proceedings of the sixth international workshop on Camera Based Document Analysis and Recognition (CBDAR), 2015.

[2] Bukhari, T. (2012). The IUPR Dataset of Camera-Captured Document Images. In Camera-Based Document Analysis and Recognition (pp. 164–171). Springer Berlin Heidelberg.
