LegalTextExtraction

This is a Python program that extracts information from legal documents in PDF form: the names of petitioners and respondents, the members of the coram, counsel, and the dates on which an order or judgement was reserved or pronounced. It uses Python's re package for pattern matching and pdfminer for PDF parsing.
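To illustrate the pattern-matching step, here is a minimal sketch that uses only the re package on a made-up block of text. The sample text, the regular expressions, and the extract_fields helper are all assumptions for illustration; the patterns actually used in pdf_parser.py may look quite different, and real judgements vary far more in layout.

```python
import re
from datetime import datetime

# Hypothetical sample, standing in for text that pdfminer would extract from a PDF.
SAMPLE = """\
IN THE HIGH COURT OF EXAMPLE
Ram Kumar                     ... Petitioner
Versus
State of Example              ... Respondent
CORAM: HON'BLE MR. JUSTICE A. B. SHARMA
Judgment reserved on: 12.03.2014
Judgment pronounced on: 28.03.2014
"""

# Illustrative patterns only (assumptions, not taken from pdf_parser.py).
PARTY_RE = re.compile(
    r"^(?P<name>.+?)[ \t]*\.{3}[ \t]*(?P<role>Petitioner|Respondent)[ \t]*$",
    re.MULTILINE,
)
CORAM_RE = re.compile(r"^CORAM:[ \t]*(?P<coram>.+)$", re.MULTILINE)
DATE_RE = re.compile(
    r"(?P<event>reserved|pronounced) on:[ \t]*(?P<date>\d{2}\.\d{2}\.\d{4})",
    re.IGNORECASE,
)


def extract_fields(text):
    """Return a dict of the fields found in text (hypothetical helper)."""
    fields = {}
    # Party lines look like "<name> ... Petitioner" in the sample above.
    for match in PARTY_RE.finditer(text):
        fields[match.group("role").lower()] = match.group("name").strip()
    match = CORAM_RE.search(text)
    if match:
        fields["coram"] = match.group("coram").strip()
    # Dates are assumed to be written as DD.MM.YYYY in this sketch.
    for match in DATE_RE.finditer(text):
        parsed = datetime.strptime(match.group("date"), "%d.%m.%Y").date()
        fields[match.group("event").lower()] = parsed
    return fields


fields = extract_fields(SAMPLE)
```

Named groups keep each pattern readable and let the helper build its result dictionary without relying on group positions.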

Installation

The program has been tested on Python 2.7.6 and depends on the following external libraries, which need to be installed:

$ pip install pdfminer
$ pip install python-dateutil
$ pip install inflect

Run

The program is a single file named pdf_parser.py which can be run as follows:

$ python pdf_parser.py filename.pdf filename1.pdf ...

For each input file, it generates one corresponding output file named XMLOutput - filename.xml.
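As a rough sketch of the output step, the snippet below derives an output file name from an input path and serializes a dictionary of extracted fields to XML with the standard library. The helper names output_name and to_xml, the root element case, and the rule that filename is the input's base name without its extension are assumptions for illustration; the XML layout that pdf_parser.py actually writes is not documented here.

```python
import os
import xml.etree.ElementTree as ET


def output_name(pdf_path):
    """Build the output file name for one input PDF (assumed naming rule)."""
    base = os.path.splitext(os.path.basename(pdf_path))[0]
    return "XMLOutput - %s.xml" % base


def to_xml(fields):
    """Serialize a dict of extracted fields to XML bytes.

    The root element name "case" and the one-element-per-field layout
    are hypothetical, chosen only to keep the sketch short.
    """
    root = ET.Element("case")
    for key in sorted(fields):
        child = ET.SubElement(root, key)
        child.text = str(fields[key])
    return ET.tostring(root, encoding="utf-8")


# Example with made-up field values.
xml_bytes = to_xml({"petitioner": "Ram Kumar", "respondent": "State of Example"})
name = output_name("docs/filename.pdf")
```

Writing xml_bytes to the path returned by output_name, once per command-line argument, would give one output file per input file as described above.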

Repository: srhrshr/CaseLawExtraction

Folders and files

Latest commit

History

README.md

README.md

pdf_parser.py

pdf_parser.py

Repository files navigation

LegalTextExtraction

Installation

Run

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages