Friday 31 May 2019

Retrieving data from WormBase ParaSite using their web interface

I've recently been learning how to query the ChEMBL database via the web using their REST API (see my blog post here), and to query the PDBe database via the web using their REST API (see my blog post here).

So challenge for today: learn how to query the WormBase ParaSite database via the web, using their REST API!

Simple queries of WormBase ParaSite via the web
There is a very nice description of the types of queries you can perform on WormBase ParaSite using the REST API here.

This has examples of code you can type directly into a browser, as well as the code you would need to perform the same queries in programming languages such as Perl or Python.

For example, here is their documentation of the query you can use to retrieve the gene tree that a particular gene of interest belongs to: genetree_member_id documentation.

Say I want to find out which gene tree the Schistosoma mansoni gene SULT-OR (the protein involved in resistance to the drug oxamniquine), which has the identifier Smp_089320, belongs to. I can type this into the web browser: https://parasite.wormbase.org/rest-13/genetree/member/id/Smp_089320?content-type=text/x-phyloxml%2Bxml
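The same request can also be made from a script rather than a browser. Here is a minimal Python sketch using only the standard library. The helper names `build_genetree_url` and `fetch_genetree` are made up for illustration, and the `rest-13` base URL is simply copied from the browser example, so it may need changing for other WormBase ParaSite releases.

```python
import urllib.parse
import urllib.request

# Base URL copied from the browser example; the release number
# ("rest-13") is an assumption and may differ for other releases.
BASE_URL = "https://parasite.wormbase.org/rest-13"


def build_genetree_url(gene_id):
    """Build the genetree/member/id URL for a gene (hypothetical helper)."""
    # safe="/" keeps the slash in the content type and encodes "+" as %2B,
    # which reproduces the URL typed into the browser.
    query = urllib.parse.urlencode(
        {"content-type": "text/x-phyloxml+xml"}, safe="/"
    )
    return f"{BASE_URL}/genetree/member/id/{urllib.parse.quote(gene_id)}?{query}"


def fetch_genetree(gene_id, timeout=30):
    """Download the gene tree as phyloXML text (needs network access)."""
    url = build_genetree_url(gene_id)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building the URL needs no network access, so it is safe to run anywhere.
print(build_genetree_url("Smp_089320"))
```

Calling `fetch_genetree("Smp_089320")` should then return the same XML text that the browser shows, assuming the server is reachable.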

We should get back a gene tree in XML format, looking something like this:



By default, this returns the original (unaligned) sequence for each gene in the gene tree, but you can tell it to return the aligned sequence for each gene instead by using the aligned=1 option: https://parasite.wormbase.org/rest-13/genetree/member/id/Smp_089320?aligned=1&content-type=text/x-phyloxml%2Bxml

Then the output should have the aligned sequence for each gene (i.e. with gaps for indels):
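Once the XML has been downloaded, the sequences can be pulled out with a few lines of Python. This is a rough sketch, assuming the response follows the phyloXML layout in which each `<sequence>` element has a `<name>` and a `<mol_seq>` child. The sample XML below is a tiny made-up fragment with placeholder names and sequences, not real WormBase ParaSite output, and the function name `extract_sequences` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up phyloXML fragment for illustration only: the gene names and
# sequences are placeholders, and real output contains many more fields.
SAMPLE_PHYLOXML = """<phyloxml xmlns="http://www.phyloxml.org">
  <phylogeny rooted="true">
    <clade>
      <clade>
        <sequence>
          <name>geneA</name>
          <mol_seq is_aligned="1">MKT--LLV</mol_seq>
        </sequence>
      </clade>
      <clade>
        <sequence>
          <name>geneB</name>
          <mol_seq is_aligned="1">MKTAALLV</mol_seq>
        </sequence>
      </clade>
    </clade>
  </phylogeny>
</phyloxml>
"""


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def extract_sequences(xml_text):
    """Return {name: sequence} for every <sequence> with both children."""
    root = ET.fromstring(xml_text)
    sequences = {}
    for elem in root.iter():
        if not isinstance(elem.tag, str) or local_name(elem.tag) != "sequence":
            continue
        name = None
        mol_seq = None
        for child in elem:
            child_tag = local_name(child.tag)
            if child_tag == "name":
                name = child.text
            elif child_tag == "mol_seq":
                mol_seq = child.text
        if name is not None and mol_seq is not None:
            sequences[name] = mol_seq.strip()
    return sequences


# Count gap characters to see which sequences carry indels in the alignment.
for name, seq in extract_sequences(SAMPLE_PHYLOXML).items():
    print(name, seq, "gaps:", seq.count("-"))
```

Matching on the local tag name rather than the full namespaced tag keeps the sketch working even if the namespace in the real response differs from the one assumed here.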