InterPro 7 API

InterPro provides functional analysis of proteins by classifying them into families and predicting domains and important sites.

This is the repository for the source code of the InterPro REST API, which is currently available at https://www.ebi.ac.uk/interpro/api/.

This API provides the data that the new InterPro website uses. You can explore the website at www.ebi.ac.uk/interpro.

The repository for the InterPro website can be found at https://github.com/ProteinsWebTeam/interpro7-client.

API URL Design

The InterPro API can be accessed through any of its six endpoints:

  • entry
  • protein
  • structure
  • set
  • taxonomy
  • proteome

If the URL only contains the name of the endpoint (e.g. /structure), the API returns an overview object with counters for the chosen entity type, grouped by database.

For each endpoint the user can specify a database (e.g. /entry/pfam), and the API will return a list of the entities in that database.

Similarly, the user can include the accession of an entity in that endpoint (e.g. /protein/uniprot/P99999), which will return an object with detailed metadata for that entity.

The user can freely combine the endpoint blocks (e.g. /entry/interpro/ipr000001/protein/reviewed). The only limitation is that a block describing an endpoint can only appear once in the URL.
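
As a rough illustration, the URL forms described above can be queried like any other JSON-over-HTTP service. The sketch below uses Python's requests library against the public API URL; the example paths and accessions are the ones used above, and the response payloads are assumed to be JSON objects whose exact structure is not described here.

```python
# Minimal sketch: fetching the four URL forms described above with `requests`.
import requests

BASE = "https://www.ebi.ac.uk/interpro/api"

urls = [
    f"{BASE}/structure",                                  # overview: counters grouped by database
    f"{BASE}/entry/pfam",                                 # list of entries in a single database
    f"{BASE}/protein/uniprot/P99999",                     # detailed metadata for one accession
    f"{BASE}/entry/interpro/ipr000001/protein/reviewed",  # two endpoint blocks combined
]

for url in urls:
    response = requests.get(url)
    response.raise_for_status()
    # Every endpoint returns JSON; printing the top-level keys gives a feel
    # for the shape of each payload.
    print(url, "->", sorted(response.json().keys()))
```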

More information about the URL design of this API can be found in this Google Doc: Document

The interpro7-api/docs/modifiers.md document contains an exhaustive list of the modifiers that can be used, with example links.

Dependencies

The InterPro7 API runs on Python 3 and uses Django as its web framework, together with the Django REST framework to implement the REST API logic.

Another set of dependencies is related to data access. Our data storage has three sources: a MySQL database for the metadata of all our entities, an Elasticsearch instance for the links between them, and, optionally, Redis to cache responses to frequently used requests. The Python clients used to communicate with these sources are mysqlclient, redis and django-redis. For Elasticsearch we use plain HTTP requests, so no dedicated client is required.
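
For illustration only, the sketch below shows how django-redis caching is commonly wired into a Django settings module; the Redis location is an assumed local default, not this project's actual configuration.

```python
# Illustrative Django settings snippet for django-redis response caching.
# The LOCATION below is an assumed local Redis instance, not the one used
# by the InterPro API in production.
CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379/1",
        "OPTIONS": {
            "CLIENT_CLASS": "django_redis.client.DefaultClient",
        },
    }
}

# With this in place, views can cache expensive responses, e.g.:
#   from django.core.cache import cache
#   cache.set("some-key", payload, timeout=3600)
#   cache.get("some-key")
```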

The specific versions of these dependencies can be found in requirements.txt, together with other minor dependencies.

An optional set of dependencies, not required to run the API but useful for development, can be found in dev_requirements.txt.

Local Installation

The procedure to install this project can be seen HERE.

Developers Documentation

Some details about decisions, compromises and techniques used throughout the project can be found HERE.


This project followed some of the recommendations and guidelines presented in the book Test-Driven Development with Python.