The code for my senior thesis on named entity recognition and entity linking in the biology domain.
The final version of my thesis paper.
Contains the master class for training and evaluating the three models.
Shows the process of training and evaluating each of the three models as well as the full RoBERTa pipeline.
Contains the helper functions for tasks such as reformatting data and other operations that do not require direct access to the master class data.
Contains files such as the notebooks used for hyperparameter tuning and older code used during the development of the models found in the master class.
Contains the data files for training, development, and testing.
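
As an illustration of the kind of data reformatting the helper functions handle, here is a minimal sketch that converts token-level BIO tags into entity spans. It is not taken from this repository: the function name `bio_to_spans`, the BIO tagging scheme, and the `Gene`/`Disease` labels are all assumptions chosen for the example.

```python
from typing import List, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Convert BIO tags to (start, end_exclusive, label) spans.

    Hypothetical helper for illustration only; the thesis code may
    represent its data differently.
    """
    spans: List[Tuple[int, int, str]] = []
    start = None  # index where the open span began, if any
    label = None  # label of the open span, if any
    for i, tag in enumerate(tags):
        # An open span continues only on an I- tag with the same label.
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, label))
            start, label = None, None
        # B- always opens a span; a stray I- opens one leniently.
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, label = i, tag[2:]
    # Close a span that runs to the end of the sequence.
    if start is not None:
        spans.append((start, len(tags), label))
    return spans


example = ["O", "B-Gene", "I-Gene", "O", "B-Disease"]
print(bio_to_spans(example))
```

Because the function only needs a list of tag strings, it fits the description of a helper that does not require access to the master class data.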