DerekYu177/ExpenseManager

The program is roughly split into three parts: Input, Process, and Output.

Operation

Input

Getting the photos: we wait until the photos have been imported onto the processing computer.

  1. We ask the user which folder is relevant (UI still under construction).
  2. We go through all of the photos to determine which ones are receipts (can we use ML for this?).
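The folder scan in the steps above could look something like the following. This is a minimal sketch, assuming Python 3; the function name `list_candidate_photos` and the extension set are hypothetical, and it only narrows the folder down to image files. Deciding which of those images are actually receipts is still the open question noted above.

```python
import os

# Assumed set of photo extensions; adjust to whatever the photo import produces.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def list_candidate_photos(folder):
    """Return the sorted paths of image files directly inside `folder`."""
    paths = []
    for name in os.listdir(folder):
        path = os.path.join(folder, name)
        # Compare extensions case-insensitively so IMG_0001.JPG is accepted.
        ext = os.path.splitext(name)[1].lower()
        if os.path.isfile(path) and ext in IMAGE_EXTENSIONS:
            paths.append(path)
    return sorted(paths)
```

Each returned path can then be handed to the OCR step.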

The naming scheme used by Photos still needs to be investigated.

Process

We apply OCR to each photo to extract relevant tags and attributes.

  1. The data currently extracted is the date-time and the total amount (the address is still being worked on).
  2. The extracted data is then compared against the database, and duplicates are ignored (we have the time attribute, so this should be okay). Unique entries are inserted into the relevant line.
  3. For questionable attributes such as the address, the computer prompts the user with an image of the text in question. (Can ML be used to improve accuracy?)
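To make the extraction step concrete, here is a rough sketch of pulling the date-time and total out of OCR text. It assumes Python 3 and a receipt layout with a `YYYY-MM-DD HH:MM` timestamp and a line such as `TOTAL $11.30`; real receipts vary, so the two regular expressions are illustrative assumptions, not the project's actual rules. The helper names `ocr_text` and `parse_receipt` are hypothetical. `pytesseract.image_to_string` is the real OCR call and is imported lazily, so the parsing can be tried without Tesseract installed.

```python
import re
from datetime import datetime
from decimal import Decimal

# Assumed formats: "2017-03-04 18:25" and a line like "TOTAL $11.30".
DATETIME_RE = re.compile(r"(\d{4}-\d{2}-\d{2})\s+(\d{2}:\d{2})")
# \b keeps "Subtotal" from matching; [^\d\n]* skips "$", ":" and spaces.
TOTAL_RE = re.compile(r"\btotal\b[^\d\n]*(\d+\.\d{2})", re.IGNORECASE)


def ocr_text(image_path):
    """Run Tesseract on one photo and return the recognised text."""
    from PIL import Image
    import pytesseract
    return pytesseract.image_to_string(Image.open(image_path))


def parse_receipt(text):
    """Extract the date-time and total from OCR text; missing fields are None."""
    result = {"datetime": None, "total": None}
    match = DATETIME_RE.search(text)
    if match:
        stamp = match.group(1) + " " + match.group(2)
        try:
            result["datetime"] = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        except ValueError:
            pass  # OCR produced an impossible date such as month 13
    totals = TOTAL_RE.findall(text)
    if totals:
        # Use the last "total" line, which usually follows tax lines.
        result["total"] = Decimal(totals[-1])
    return result
```

A field left as `None` is a natural trigger for the user prompt described in step 3.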

Output

We log all data into a .csv file.
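The duplicate check and the CSV logging could be combined roughly as follows. This is a sketch under assumptions: Python 3, the .csv file itself acting as the database, a two-column layout (`datetime`, `total`), and the date-time string as the uniqueness key. The function name `append_unique` is hypothetical.

```python
import csv
import os

FIELDNAMES = ["datetime", "total"]  # assumed column layout


def append_unique(csv_path, entries):
    """Append entries whose datetime is not yet in the file.

    `entries` is an iterable of dicts with "datetime" and "total" keys.
    Returns the number of rows written.
    """
    has_data = os.path.isfile(csv_path) and os.path.getsize(csv_path) > 0
    seen = set()
    if has_data:
        with open(csv_path, newline="") as f:
            for row in csv.DictReader(f):
                seen.add(row["datetime"])
    written = 0
    with open(csv_path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        if not has_data:
            writer.writeheader()  # new or empty file: write the header first
        for entry in entries:
            key = str(entry["datetime"])
            if key in seen:
                continue  # duplicate receipt, ignore it
            writer.writerow({"datetime": key, "total": entry["total"]})
            seen.add(key)
            written += 1
    return written
```

Keying on the date-time alone follows the note in the Process section. Two different receipts printed in the same minute would collide, so adding the total to the key may be worth considering.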

Installation Process

Required packages:

brew install tesseract
pip install Pillow
pip install pytesseract

To get Homebrew with Python 2.7 and PyQt4:

xcode-select --install
ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
brew install cartr/qt4/pyqt
brew install python

Adding to your PYTHONPATH

/usr/local/Cellar/python/2.7.13/Frameworks/Python.Framework/Version/2.7/bin/python2.7
/usr/local/Cellar/pyqt/

Credits

  • robonobodojo for the excellent guide

To view Markdown in Atom, use Ctrl-Shift-M.

About

Allowing you to build your expenses straight from your receipt photos
