Wednesday 26 June 2019

Target prediction in ChEMBL

I recently wrote a blogpost on using the ChEMBL REST API to retrieve compounds with bioactivities for particular targets, as well as compound information (structure, logP etc.) for those compounds, from the ChEMBL database.

What about retrieving information on which (ChEMBL) targets are predicted for a particular ChEMBL compound? If you look at the ChEMBL page for a compound (e.g. imatinib),  at the bottom of the page you can see a 'Target predictions' section, containing predicted protein targets in ChEMBL. These are based on the prediction approach described here and here. These predictions give a good first approximation of whether there is a predicted binding between a compound and a ChEMBL target.

Simple REST API queries through the ChEMBL website
It turns out this is easy to retrieve this target prediction info. using the ChEMBL REST API (just for compounds and targets already in ChEMBL). For example, to get target predictions for imatinib, we can type into a web browser:
and we should see:

The fields shown here are:
in_training: was the predicted compound in the training set? yes: 1 / no: 0.
molecule_chembl_id: id of the predicted compound (presumably a parent chembl id)
pred_id: id of the prediction for the target - compound pair
probability: probability that the compound is active on the target. It ranges from 0 (predicted not active) to 1 (predicted active)
target_accession: uniprot_id of the target
target_chembl_id: the ChEMBL id for the target
target_organism: the organism of the target
target_tax_id: NCBI taxonomy id of target
value: 1 or 10, because 2 models were built. One with an activity threshold of 1 uM and the other at 10 uM

Simple REST API queries using Python
I wrote a Python script to retrieve all the predicted targets from ChEMBL, for an input list of ChEMBL compounds. You can see it here.

Thank you to Fiona Hunter and Nicolas Bosc from ChEMBL for advice about the ChEMBL target prediction REST API!

No comments: