The chemical-protein interaction extraction project

Beira-BF/ChemProtBioCreativeVI

 
 

ChemProtBioCreativeVI

This repository contains the source code for the three-stage approach to the chemical-protein interaction extraction task of the BioCreative VI challenge. The approach is described in: Lung P-Y, He Z, Zhao T & Zhang J (2018), Natural language processing based feature engineering for extracting chemical-protein interactions from literature.

Prerequisites

Data

A partial dataset used by the model is located in the data folder for demonstration purposes. It contains abstracts of PubMed articles, tagged chemical/protein entities, and labeled relations released by the task organizers. The complete dataset, as well as the gold standard for the test set, can be found at BiocreativeVI, or by contacting the organizers: Martin Krallinger & Jesús Santamaría.

Usage

In the last line of RunParser.py, specify the path to the Stanford Neural Network Dependency Parser. Next, run the command

$ sh demo.sh

This runs the pipeline and generates ChemProtTest_sumbit.tsv, in which each row contains: PubMedID, relation type, chemical entity, protein entity.
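To inspect the generated file, the rows can be loaded with a few lines of Python. The following is a minimal sketch, not part of the repository: it assumes the file is tab-separated, has no header row, and lists the four fields in the order given above. The names `Relation`, `read_relations`, and `count_by_type` are hypothetical helpers introduced here for illustration.

```python
import csv
from collections import Counter
from typing import Iterable, List, NamedTuple


class Relation(NamedTuple):
    """One predicted relation (field order assumed from the description above)."""
    pmid: str
    relation_type: str
    chemical: str
    protein: str


def read_relations(lines: Iterable[str]) -> List[Relation]:
    """Parse tab-separated rows into Relation records.

    Assumes no header row and exactly four fields per row; blank lines are skipped.
    """
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    relations = []
    for line_number, row in enumerate(reader, start=1):
        if not row:
            continue  # skip empty lines
        if len(row) != 4:
            raise ValueError(
                f"line {line_number}: expected 4 fields, got {len(row)}"
            )
        relations.append(Relation(*row))
    return relations


def count_by_type(relations: Iterable[Relation]) -> Counter:
    """Count how many predictions fall under each relation type."""
    return Counter(r.relation_type for r in relations)
```

A possible use, once demo.sh has finished, is to open the output with `open("ChemProtTest_sumbit.tsv", newline="")`, pass the file object to `read_relations`, and print `count_by_type(...)` to see how the predictions are distributed across relation types.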
