- The NER task is to tag a sequence of tokens with a sequence of labels.
  - Labels are
    - ORG - Organization
    - Per - Person
    - O - Other
    - Loc - Location
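As a small illustration of the task, the sketch below pairs each token of a made-up sentence with one of the labels listed above. The sentence is hypothetical and not taken from the dataset, and the exact tag strings stored in the data files may differ from the spellings used here.

```python
# Hypothetical example sentence; not taken from the repository's dataset.
tokens = ["john", "works", "at", "google", "in", "london"]
# One label per token, using the label names listed in this README.
labels = ["Per", "O", "O", "ORG", "O", "Loc"]

# NER is a one-to-one sequence tagging task: every token gets exactly one label.
assert len(tokens) == len(labels)

for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```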
- Dataset files can be found in `Data/*.tsv`; they are tab-separated files.
- To install the repository requirements, run `pip install -r requirements.txt`.
- To use the repo
  - To configure the models' hyperparameters, edit the values in `models/hyperparameters.py`.
  - `main.py` is used to train the baseline model.
  - `main_crf.py` is used to train the CRF model.
- Quick Tour:
  - `data_loader/TSVDatasetParser.py` is the file containing the main data parser, which:
    - Starts by reading the TSV file to create two lists, `data_x` and `data_y` (all `data_x` tokens are lowercased).
    - Creates a vocabulary from the unique tokens to build the `word2idx` dict; the same applies to labels, giving the `label2idx` dict.
    - Converts the data from tokens to indices so that it is neural-network friendly, via `encode_dataset(...)`.
  - `models/`
    - `hyperparameters.py` holds the model configuration values.
    - `models.py` contains the baseline and CRF model architectures.
      - Each model has its own functions: `forward(...)`, `save(...)`, `load(...)`, `predict_sentences(...)`.
  - `training/`
    - `train.py` contains the trainer objects for both the baseline and CRF models.
  - The names of the other folders indicate their function.
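The parsing steps described for `data_loader/TSVDatasetParser.py` can be sketched roughly as follows. This is not the repository's code: the function names `read_tsv`, `build_vocab` and `encode`, the two-column `token<TAB>label` layout with blank lines between sentences, and the `<pad>`/`<unk>` entries are all assumptions made for illustration.

```python
import io


def read_tsv(handle):
    """Read token<TAB>label lines; a blank line ends a sentence (assumed layout)."""
    data_x, data_y, tokens, labels = [], [], [], []
    for line in handle:
        line = line.rstrip("\n")
        if not line.strip():
            if tokens:
                data_x.append(tokens)
                data_y.append(labels)
                tokens, labels = [], []
            continue
        token, label = line.split("\t")
        tokens.append(token.lower())  # tokens are lowercased, as the README states
        labels.append(label)
    if tokens:  # flush the last sentence if there is no trailing blank line
        data_x.append(tokens)
        data_y.append(labels)
    return data_x, data_y


def build_vocab(sequences, specials=("<pad>", "<unk>")):
    """Map each unique item to an index; the special entries are an assumption."""
    vocab = {item: idx for idx, item in enumerate(specials)}
    for sequence in sequences:
        for item in sequence:
            vocab.setdefault(item, len(vocab))
    return vocab


def encode(sequences, vocab, unk="<unk>"):
    """Replace every item with its index, falling back to `unk` when one is given."""
    if unk is None:
        return [[vocab[item] for item in seq] for seq in sequences]
    return [[vocab.get(item, vocab[unk]) for item in seq] for seq in sequences]


# Made-up sample standing in for a file under Data/.
sample = io.StringIO(
    "John\tPer\nvisited\tO\nRome\tLoc\n\nGoogle\tORG\nhired\tO\nJohn\tPer\n"
)
data_x, data_y = read_tsv(sample)
word2idx = build_vocab(data_x)
label2idx = build_vocab(data_y, specials=())
encoded_x = encode(data_x, word2idx)
encoded_y = encode(data_y, label2idx, unk=None)
print(encoded_x, encoded_y)
```

Keeping an unknown-word index in `word2idx` (an assumption here) lets tokens that never appeared in the training data still be encoded at prediction time, while labels are encoded strictly because an unseen label would point to a data problem.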
elsheikh21/named_entity_recognition