Wednesday 29 May 2019

Retrieving data from ChEMBL using their web interface

ChEMBL is a wonderfully useful database of bioactivity data for compounds and drugs, and for their targets (e.g. protein targets, protein complex targets and whole-organism targets).

Simple queries of ChEMBL via the web
The ChEMBL team have created a RESTful web interface, which means that you can query their data through the web.

For example, to retrieve all the compounds that have bioactivities against the ChEMBL targets CHEMBL1848 and CHEMBL3394, you can type in your web browser :,CHEMBL3394&assay_type=B&pchembl_value__gte=5&limit=100
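This should return the first 100 bioactivity records (because of 'limit=100') for compounds/drugs tested against these two ChEMBL targets, restricted to binding assays ('assay_type=B') with a pChEMBL value of at least 5 ('pchembl_value__gte=5'). Note that the same compound can appear in more than one record.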

The example above uses 'activity.json' to query activity data. The ChEMBL web services API documentation lists all the query types you can perform.

Another example is to use 'molecule.json' to query data on properties of molecules, to get all the properties for molecules CHEMBL1627445 and CHEMBL43600:,CHEMBL43600&limit=100
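
To make the structure of these two queries explicit, here is a minimal Python sketch that assembles them from their parts. The base address 'https://www.ebi.ac.uk/chembl/api/data' and the filter names 'target_chembl_id__in' and 'molecule_chembl_id__in' are my assumptions, since the start of each URL is missing above; 'build_chembl_url' is a hypothetical helper, not part of any ChEMBL library. Nothing is sent over the network here, it only builds the strings.

```python
from urllib.parse import urlencode

# Assumed base address of the ChEMBL web services (not shown in the text above).
BASE_URL = "https://www.ebi.ac.uk/chembl/api/data"


def build_chembl_url(resource, **filters):
    """Build a query URL for a resource such as 'activity' or 'molecule'.

    Hypothetical helper: each keyword argument becomes one filter in the
    query string, in the order given.
    """
    # safe="," keeps the commas in ID lists readable instead of encoding them as %2C.
    query = urlencode(filters, safe=",")
    return f"{BASE_URL}/{resource}.json?{query}"


# Bioactivities against two targets: binding assays only, pChEMBL >= 5.
activity_url = build_chembl_url(
    "activity",
    target_chembl_id__in="CHEMBL1848,CHEMBL3394",  # assumed filter name
    assay_type="B",
    pchembl_value__gte=5,
    limit=100,
)

# Properties of two molecules.
molecule_url = build_chembl_url(
    "molecule",
    molecule_chembl_id__in="CHEMBL1627445,CHEMBL43600",  # assumed filter name
    limit=100,
)

print(activity_url)
print(molecule_url)
```

Either printed URL can then be pasted into a web browser. The double underscore suffixes ('__in', '__gte') are how the filters express "one of these values" and "greater than or equal to".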

Simple queries of ChEMBL using Python, in a Jupyter notebook
If you are familiar with Python, an even easier way to query ChEMBL via the web is to write the queries within Python.

Fiona Hunter from ChEMBL kindly provided me with an example Jupyter notebook that does this. It queries ChEMBL to find the compounds with bioactivities against certain ChEMBL targets, and then retrieves information on the properties of those compounds. You can see the Jupyter notebook here.
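
To give a feel for that two-step workflow, here is a rough sketch of my own using only the Python standard library. It is not the code from Fiona's notebook. I am assuming the base address 'https://www.ebi.ac.uk/chembl/api/data', the filter names 'target_chembl_id__in' and 'molecule_chembl_id__in', and that the JSON responses contain lists called 'activities' and 'molecules' whose entries carry a 'molecule_chembl_id' (and, for molecules, a 'molecule_properties' field). The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base address of the ChEMBL web services.
BASE_URL = "https://www.ebi.ac.uk/chembl/api/data"


def fetch_json(url):
    """Download a URL and parse the response body as JSON."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def compounds_for_targets(target_ids, fetch=fetch_json, limit=100):
    """Step 1: sorted, de-duplicated compound IDs with binding activity
    (pChEMBL >= 5) against the targets. Only the first `limit` records are read."""
    query = urlencode(
        {
            "target_chembl_id__in": ",".join(target_ids),  # assumed filter name
            "assay_type": "B",
            "pchembl_value__gte": 5,
            "limit": limit,
        },
        safe=",",
    )
    data = fetch(f"{BASE_URL}/activity.json?{query}")
    # Assumed response layout: {"activities": [{"molecule_chembl_id": ...}, ...]}
    return sorted({record["molecule_chembl_id"] for record in data["activities"]})


def properties_for_molecules(molecule_ids, fetch=fetch_json, limit=100):
    """Step 2: map each compound ID to its properties (None if absent)."""
    query = urlencode(
        {
            "molecule_chembl_id__in": ",".join(molecule_ids),  # assumed filter name
            "limit": limit,
        },
        safe=",",
    )
    data = fetch(f"{BASE_URL}/molecule.json?{query}")
    # Assumed response layout: {"molecules": [{"molecule_chembl_id": ...,
    #                                          "molecule_properties": {...}}, ...]}
    return {
        mol["molecule_chembl_id"]: mol.get("molecule_properties")
        for mol in data["molecules"]
    }


# Example use (needs an internet connection, so it is left commented out):
# ids = compounds_for_targets(["CHEMBL1848", "CHEMBL3394"])
# props = properties_for_molecules(ids)
```

The 'fetch' argument is there so that the download step can be swapped out, for example to try the functions on a saved response without touching the network. With very long lists of compound IDs the second URL may become too long, so in practice you might need to send the IDs in smaller batches.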

The Jupyter notebook allows you to type in commands and get an instant response, and looks like this: