Monday, 3 March 2014

Retrieving annotations from Chado (GeneDB)

GeneDB stores its annotations in a database that uses the Chado schema.

Getting a GFF file for Schistosoma mansoni from Chado
To extract the annotations for a species (e.g. Schistosoma mansoni) from Chado, you can use a shell script like this one on the Sanger farm (for Sanger users only):

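#!/bin/bash

# Password for the Chado database
export WRITEDB_ENTRIES_PASSWORD=xxx
# Directory that will hold the dump
export output="/lustre/scratch108/parasites/alc/50HGI_FuncAnnotn/Smansoni_chado_dump"
# Start from an empty output directory
rm -rf "$output"
mkdir -p "$output"
# Submit the dump to the 'long' queue; the database string is quoted so the
# shell does not treat '?' as a wildcard
bsub -o "$output/bsub.o" -e "$output/bsub.e" -q long \
        -M2500 -R "select[mem>2500] rusage[mem=2500]" \
        writedb_entries.py -t -o Smansoni -i -d 'pgsrv1:5432/pathogens?genedb' -x "$output"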


I've replaced the real password with 'xxx' to keep it secret!
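
If you would rather not type the password into the script at all, one option is to keep it in a private file and load it when the script runs. Here is a minimal sketch of that idea. The function name load_chado_password and the idea of a password file are my own assumptions for illustration; they are not part of writedb_entries.py, which (as far as the script above shows) only needs the WRITEDB_ENTRIES_PASSWORD variable to be set.

```shell
# Hypothetical helper: read the Chado password from the first line of a
# private file and export it as WRITEDB_ENTRIES_PASSWORD.
load_chado_password() {
    local pwfile="$1"
    local pw=""
    if [ ! -r "$pwfile" ]; then
        echo "Cannot read password file: $pwfile" >&2
        return 1
    fi
    # Read the first line verbatim; '|| true' covers a file that has no
    # trailing newline, where 'read' fills the variable but reports failure.
    IFS= read -r pw < "$pwfile" || true
    if [ -z "$pw" ]; then
        echo "Password file is empty: $pwfile" >&2
        return 1
    fi
    export WRITEDB_ENTRIES_PASSWORD="$pw"
}
```

In the script you would then call something like load_chado_password "$HOME/.chado_password" (a made-up file name) in place of the export line, after making the file readable only by you with chmod 600.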

Getting all flatworm transcripts from Chado
This is from my colleague Eleanor Stanley (thanks Eleanor).
% chado_dump_transcripts_coding -o Smansoni > Sma.fa
% chado_dump_transcripts_coding -o Emultilocularis > Emu.fa
% cat Sma.fa Emu.fa > flatworm_transcripts.fa
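
After concatenating the files, it can be worth checking that no sequences were lost. A simple way is to count the FASTA header lines (lines starting with '>') in each file: the count for the combined file should equal the sum of the counts for the per-species files. This is just a sketch; count_fasta_records is a name I made up, and it assumes ordinary uncompressed FASTA files.

```shell
# Print "<file><TAB><number of FASTA records>" for each file given.
count_fasta_records() {
    local f
    for f in "$@"; do
        # grep -c prints the number of matching lines; it exits non-zero
        # when that number is 0, so '|| true' keeps the loop going.
        printf '%s\t%s\n' "$f" "$(grep -c '^>' "$f" || true)"
    done
}
```

For the files above you would run: count_fasta_records Sma.fa Emu.fa flatworm_transcripts.fa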
