pdfminer
regex
requests
BeautifulSoup
urllib
openpyxl
Python 2.7
Microsoft Excel
- After cloning this repo on your system, change into the scripts directory:
cd <name_of_repo>/scripts
- Generate the output.html file from the PDF so that the Epic Numbers can be parsed from it:
pdf2txt.py -o output.html S24A276P001.pdf
- Run the Python script run.py:
python run.py
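
For readers who want to see what the parsing step might look like, here is a minimal sketch. It is not the contents of run.py. It assumes that an Epic Number is three capital letters followed by seven digits, which may not hold for every entry in the PDF, and the function name extract_epic_numbers is hypothetical. It uses only the standard re module so it can be run without the other requirements.

```python
import re

# ASSUMPTION: an Epic Number is three capital letters followed by seven digits.
# Adjust this pattern if the IDs in your PDF look different.
EPIC_PATTERN = re.compile(r"\b[A-Z]{3}[0-9]{7}\b")

# Matches any HTML tag so it can be replaced by whitespace.
TAG_PATTERN = re.compile(r"<[^>]+>")


def extract_epic_numbers(html_text):
    """Return the unique ID-like strings in html_text, in order of appearance.

    Hypothetical helper for illustration; not taken from run.py.
    """
    # Replace tags with spaces so IDs in adjacent elements do not run together.
    text = TAG_PATTERN.sub(" ", html_text)
    seen = set()
    numbers = []
    for candidate in EPIC_PATTERN.findall(text):
        if candidate not in seen:
            seen.add(candidate)
            numbers.append(candidate)
    return numbers
```

To try it on the generated file, read output.html into a string and pass it in, for example extract_epic_numbers(open("output.html").read()). Duplicates are dropped, so each number appears only once even if the HTML repeats it.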
The completed challenge is documented here:
- https://docs.google.com/document/d/1ZbY7KF4XQfJ7K3VbkcSaLW__sThIlnxcvLVN5nFKSUk/edit
- https://www.binpress.com/tutorial/manipulating-pdfs-with-python/167
- http://docs.python-requests.org/en/master/user/advanced/
- https://www.crummy.com/software/BeautifulSoup/bs4/doc/
- http://stackoverflow.com/questions/24748445/beautiful-soup-using-regex-to-find-tags
- http://rubular.com/