GitHub - chongyangshi/R214: Codebase and report produced for the biomedical literature text mining coursework option of the Biomedical Information Processing module (R214). Part of a collection of my taught component work towards the MPhil degree at the Computer Laboratory of the University of Cambridge.

Warning: this repository was previously prepared for an academic assessment!

This repository was made public after the end of my academic assessments to provide a personal showcase of my past work, with the understanding that the topic and nature of the assessment would change every year. In the unlikely event that its contents become relevant to any current academic assessment, they should not be used under any circumstances for such an academic purpose.

This warning was added following an event that was recent at the time of writing, although the content of this repository was not involved in any way.

In this coursework project, data taken from the BioCreative V Chemical-Disease Relation dataset were used to train a Conditional Random Field (CRF) entity recognition model, which was then paired with an approximate string matching-based grounding system to extract relations between mentions of chemicals and diseases in biomedical literature.
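As a rough illustration of what approximate string matching-based grounding can look like, the sketch below maps a recognised mention to the closest name in a lexicon. It is not the code from this repository: the use of the standard library's `difflib`, the `ground` function, the `LEXICON` entries, the placeholder identifiers and the 0.8 cutoff are all assumptions made for the example.

```python
import difflib

# Hypothetical toy lexicon: names mapped to placeholder identifiers.
# These are NOT real database IDs; a real system would load a full
# chemical/disease vocabulary instead.
LEXICON = {
    "aspirin": "CHEM:0001",
    "ibuprofen": "CHEM:0002",
    "hypertension": "DIS:0001",
}


def ground(mention, lexicon=LEXICON, cutoff=0.8):
    """Return (matched name, identifier) for the closest lexicon name.

    Returns None when no lexicon name is at least `cutoff` similar
    to the mention (similarity as computed by difflib).
    """
    candidates = difflib.get_close_matches(
        mention.lower(), list(lexicon), n=1, cutoff=cutoff
    )
    if not candidates:
        return None
    best = candidates[0]
    return best, lexicon[best]
```

With this toy lexicon, `ground("Asprin")` returns `("aspirin", "CHEM:0001")`, while a mention with no sufficiently similar name, such as `ground("xyz")`, returns `None`. The cutoff trades recall against the risk of grounding a mention to the wrong entry, so it would need tuning on real data.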

The codebase runs on Python 2.7 and is based on the framework supplied by the assessment setters. The hard-forked repository can be found here.

In the unlikely case that you wish to make use of this repository, with the obvious warning of academic prohibitions against using the codebase for assessments, I disclaim copyright to my portion of the codebase. The original conll2crfsuite and crfutils tools belong to their original setters, who may have stricter constraints on how derivations of their work can be used.

Folders and files:

.vscode
BC5-CDR
ablation
grounding
pubmed
report
tools
trigrams
.gitignore
DEPENDENCIES
README.md
gen_features.sh
