Friday, 19 February 2016

Extract phenotypes/GO terms for C. elegans genes from WormBase

I had a list of C. elegans genes (WBGene identifiers) and wanted to know which of them have known phenotypes. The WormBase page now says that WormMart is retired, and to use WormMine instead. So I did this:

To get phenotypes for a list of C. elegans genes:
1. Went to WormMine
2. Clicked on 'Lists' and uploaded my list of worm genes (separated by commas), and saved the list as 'wormsinglecopygenes2'
3. Then clicked on 'Phenotypes', and chose 'Genes->Phenotypes', and selected the box 'Constrain to be in the list wormsinglecopygenes2', and then pressed 'Go'.
4. Voila! It gave me a nice output spreadsheet.
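
Before uploading in step 2, it can help to tidy the gene list. Here is a minimal Python sketch of how I might build the comma-separated string, assuming WBGene identifiers are always 'WBGene' followed by 8 digits. The function name make_upload_string and the example identifiers are just for illustration and are not part of WormMine.

```python
import re

# Assumption: a well-formed identifier is 'WBGene' followed by exactly 8 digits.
WBGENE_PATTERN = re.compile(r"^WBGene\d{8}$")

def make_upload_string(raw_ids):
    """Return a comma-separated string of unique, well-formed WBGene IDs,
    keeping the order in which they first appear."""
    seen = set()
    cleaned = []
    for raw in raw_ids:
        gene_id = raw.strip()  # drop stray spaces and newlines
        if WBGENE_PATTERN.match(gene_id) and gene_id not in seen:
            seen.add(gene_id)
            cleaned.append(gene_id)
    return ",".join(cleaned)

# Illustrative input: one duplicate and one malformed entry.
example_ids = ["WBGene00000039", " WBGene00000040\n", "WBGene00000039", "not_a_gene"]
upload_string = make_upload_string(example_ids)
print(upload_string)
```

The printed string can then be pasted into the WormMine 'Lists' upload box.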

To get all C. elegans genes with a particular phenotype:
1. Went to WormMine
2. Clicked on 'Templates', and then clicked on 'Phenotype-->Genes'.
3. Set 'phenotype == lethal'.
4. This found 1343 rows. I clicked 'Download'.

Note 12-Aug-2016: I notice that WormMine gives me far fewer genes for a particular phenotype than a direct search of WormBase does. For example, for the phenotype 'molt defect' ('WBPhenotype:0000638'), when I search for WBPhenotype:0000638 on www.wormbase.org and download all the results, the results file has 242 distinct genes. In contrast, when I go to WormMine, choose the template Phenotype->Genes and enter 'molt defect', it gives me a list of just 19 genes. It seems to be missing many genes, e.g. WBGene00000039 is found using the website but not WormMine. The WormBase page for that gene lists an observed RNAi phenotype of 'molt defect', so it surely should be listed in WormMine too. I emailed WormBase and heard that RNAi information is not yet used in WormMine, but that I can get a full file of phenotypes for C. elegans genes from the ftp site: ftp://ftp.wormbase.org/pub/wormbase/releases/WS253/ONTOLOGY/phenotype_association.WS253.wb
(Thanks Michael Paulini for help!)
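
To pull the genes with a given phenotype out of that ftp file, something like the Python sketch below might work. It assumes the file is tab-separated in a GAF-like layout, with the gene identifier in column 2, a qualifier such as 'NOT' in column 4 and the phenotype identifier in column 5, and with header lines starting with '!'. I have not confirmed this layout here, so check it against the real file first. The sample rows are invented for illustration.

```python
import io

def genes_with_phenotype(handle, phenotype_id):
    """Collect gene IDs annotated with phenotype_id, skipping 'NOT' annotations.

    Assumed layout (not verified against the real file): tab-separated,
    column 2 = gene ID, column 4 = qualifier, column 5 = phenotype ID.
    """
    genes = set()
    for line in handle:
        if line.startswith("!"):  # header/comment line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:  # blank or malformed line
            continue
        gene_id, qualifier, term = fields[1], fields[3], fields[4]
        if term == phenotype_id and "NOT" not in qualifier.split("|"):
            genes.add(gene_id)
    return genes

# Invented rows for illustration only; real data would come from
# open("phenotype_association.WS253.wb").
sample = io.StringIO(
    "!gaf-version: 2.0\n"
    "WB\tWBGene00000039\tgene-a\t\tWBPhenotype:0000638\tWB_REF:x\tIMP\n"
    "WB\tWBGene00000001\tgene-b\tNOT\tWBPhenotype:0000638\tWB_REF:x\tIMP\n"
    "WB\tWBGene00000002\tgene-c\t\tWBPhenotype:0000050\tWB_REF:x\tIMP\n"
    "WB\tWBGene00000003\tgene-d\t\tWBPhenotype:0000638\tWB_REF:x\tIMP\n"
)
molt_genes = genes_with_phenotype(sample, "WBPhenotype:0000638")
print(sorted(molt_genes))
```

Using a set means a gene is counted once even if it has several annotation rows for the same phenotype, which is what you want when comparing counts of distinct genes.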

To get all GO terms for all C. elegans genes:
1. Went to WormMine.
2. Clicked on 'QueryBuilder', then selected 'gene' from the dropdown list.
3. In the browser at the left, for 'Gene'->'WormBase Gene ID', clicked on 'Show'.
4. In the browser at the left, clicked on 'GO Annotations' -> 'Ontology Term' -> 'Ontology Annotations' -> and for 'Identifier' clicked on 'Show'.
5. Clicked 'Show Results' at the bottom right. This gave 274,273 rows of results. Then clicked 'Download'.
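
The download has one row per gene and GO term pair, so a gene with many GO terms appears on many rows. A minimal Python sketch for grouping the terms by gene is below, assuming the download is tab-separated with the WormBase gene ID in the first column and the GO identifier in the second. The column order and the header line are assumptions, and the sample gene-to-term assignments are invented for illustration.

```python
import io
from collections import defaultdict

def go_terms_by_gene(handle):
    """Map each gene ID to the set of GO identifiers listed for it.

    Assumed layout: tab-separated, column 1 = WBGene ID, column 2 = GO ID.
    Lines whose first column is not a WBGene ID (e.g. a header) are skipped.
    """
    terms = defaultdict(set)
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2 or not fields[0].startswith("WBGene"):
            continue
        terms[fields[0]].add(fields[1])
    return terms

# Invented rows for illustration only; real data would come from the
# file downloaded from WormMine.
sample = io.StringIO(
    "Gene ID\tGO identifier\n"
    "WBGene00000039\tGO:0005634\n"
    "WBGene00000039\tGO:0006412\n"
    "WBGene00000039\tGO:0005634\n"
    "WBGene00000040\tGO:0005634\n"
)
go_terms = go_terms_by_gene(sample)
for gene_id in sorted(go_terms):
    print(gene_id, ",".join(sorted(go_terms[gene_id])))
```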

Note 7-Jul-2016:
- the website for WormMine now seems to be here.
