- Extract the important information from the resume
- Match the resume to the provided expertise list (this list was provided by Vinod to Harsha)
- Match the resume to the job description (for proof-of-concept purposes, I downloaded a data engineer job description from the Travelers Insurance job board)
- Execute the below command from your terminal for installing all the required libraries
pip install -r requirement.txt
- Download the spacy word embedding model using the below command
python -m spacy download en_core_web_md
python resumeparser [-resume resumeFile]
-resume: input resume file
Example:
python resumeparser -resume "data/resumes/resume_trail4.pdf"
When you execute the code, it writes the output to the respective columns in the database and prints the contents in JSON format
- Download DB Browser for SQLite from here to view the database
- Mobile, Email Address, GitHub, LinkedIn
  - Used Regex (see the minimal sketch after this list)
- Address
  - Used a combination of Regex and the pyad library (for standardization)
- Name
  - Used the chunking and chinking concepts in combination with a lookup table. Unlike an NER model, this process prescribes the name rather than predicting it. This approach has proven to give better results than pretrained NER models, since it is hard for an NER model to handle the variations and variety across resumes.
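As a rough illustration of the regex step for the contact fields referenced above, a minimal sketch is below. The exact patterns used in the repository may differ, so treat these as illustrative assumptions.

```python
import re

# Illustrative patterns (not the exact ones in this repo) for pulling
# contact fields out of raw resume text.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"(\+?\d{1,3}[\s-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")
GITHUB_RE = re.compile(r"github\.com/[A-Za-z0-9_-]+", re.IGNORECASE)
LINKEDIN_RE = re.compile(r"linkedin\.com/in/[A-Za-z0-9_-]+", re.IGNORECASE)

def extract_contact_fields(text: str) -> dict:
    """Return the first match for each contact field, or None if absent."""
    def first(pattern):
        match = pattern.search(text)
        return match.group(0) if match else None

    return {
        "email": first(EMAIL_RE),
        "mobile": first(PHONE_RE),
        "github": first(GITHUB_RE),
        "linkedin": first(LINKEDIN_RE),
    }
```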
- For every word in the expertise list, get the GloVe word embedding provided by the spaCy library
- Then get the similarity score of each word embedding against all the nouns of the resume (do part-of-speech tagging and drop every word that is not a noun)
- The 5 expertise words with the highest similarity scores are reported as the top 5 expertise areas of the resume (a sketch follows the note below)
We have used the pretrained GloVe embedding model; in an ideal scenario we would train our
own word embedding model on a domain-specific corpus. But the results from the above approach
seem compelling too.
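The expertise-matching step can be sketched roughly as below, assuming the en_core_web_md vectors downloaded earlier. The exact scoring in the repository (for example, taking the maximum versus the average similarity over the resume nouns, or the tie-breaking) may differ.

```python
import spacy

nlp = spacy.load("en_core_web_md")  # medium English model that ships with word vectors

def top_expertise(resume_text: str, expertise_list: list, k: int = 5) -> list:
    """Score every expertise term against the nouns of the resume; keep the top k."""
    resume_doc = nlp(resume_text)
    # Keep only nouns (and proper nouns) that actually have a vector.
    nouns = [tok for tok in resume_doc
             if tok.pos_ in {"NOUN", "PROPN"} and tok.has_vector]

    scores = {}
    for term in expertise_list:
        term_doc = nlp(term)
        if not term_doc.has_vector or not nouns:
            continue
        # Best similarity between this expertise term and any noun in the resume.
        scores[term] = max(term_doc.similarity(tok) for tok in nouns)

    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [term for term, _ in ranked[:k]]
```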
Examples:
--------------------------------------------------------------------------------------------
- 'resume_trail4' (data/resumes) is the resume of an Orthopaedic Spine Surgeon; below are the top 5 expertise areas
the code matched from the expertise list:
'radiology, internal medicine, general dentistry, general surgeon, physical therapy'
--------------------------------------------------------------------------------------------
- 'resume_trail2' is the resume of a Building Inspector (related to construction); below are
the top 5 expertise areas the code matched from the expertise list:
'highway design & construction, public health, life care planner, fire cause & analysis,
hospital administration'
--------------------------------------------------------------------------------------------
- Get the GloVe word embedding for every noun and verb in the resume and in the job description
- Then calculate the similarity score between the two documents using only those nouns and verbs (see the sketch below)
Traditional ATS systems use keywords to filter resumes, but there are many scenarios where the
keyword approach falls short. This approach is presented as a POC for how we can build a better
ATS with word embeddings, since they account for both the keywords and the dependencies
between words.
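A minimal sketch of this resume-to-job-description comparison, again assuming the en_core_web_md vectors; the repository's exact filtering and aggregation may differ.

```python
import spacy

nlp = spacy.load("en_core_web_md")

def noun_verb_similarity(resume_text: str, jd_text: str) -> float:
    """Compare two documents using only their nouns and verbs."""
    def filtered_doc(text: str):
        doc = nlp(text)
        kept = [tok.text for tok in doc
                if tok.pos_ in {"NOUN", "PROPN", "VERB"} and tok.has_vector]
        # Re-run the pipeline on just the kept words; Doc.similarity then
        # compares the averaged word vectors of the filtered texts.
        return nlp(" ".join(kept))

    return filtered_doc(resume_text).similarity(filtered_doc(jd_text))
```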
Examples:
Currently the job description to be matched is a data engineer position from Travelers Insurance
Similarity score between Resume_Harsha_AIScientist (my resume) and the job description: 0.90
Similarity score between resume_trail2 (related to construction) and the job description: 0.71
We can clearly see how this process can be effective at picking the best resumes from a pool of
resumes.