Tuesday, 4 March 2014

Querying the chado database

The chado database lies behind GeneDB. To run queries, you can log into chado directly by typing (from within Sanger):
> ssh pcs5
> chado [then type your chado password]
Then, within chado, you can type queries and send the output to a file.
For example, to get a list of all the Schistosoma mansoni genes that have a note containing the word 'manual' (to find all manually curated genes), and save them in a file 'smansoni_curated', we can type:

\o smansoni_curated

select gene.uniquename as gene
     , prop.value as note
from feature gene
join featureprop prop on gene.feature_id = prop.feature_id
join cvterm prop_type on prop.type_id = prop_type.cvterm_id
join cv prop_type_cv on prop_type.cv_id = prop_type_cv.cv_id
join organism on gene.organism_id = organism.organism_id
where prop_type_cv.name = 'feature_property' and prop_type.name = 'comment'
  and organism.genus = 'Schistosoma' and organism.species = 'mansoni'
  and prop.value like '%manual%';
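
To see what the joins and filters are doing without access to the Sanger server, here is a minimal sketch that runs the same query against a toy in-memory SQLite database. Everything apart from the query itself is an assumption made for illustration: the cut-down table definitions (only the columns the query touches), the made-up gene names and notes, and the helper names build_toy_db and find_notes. The real chado database is queried through its own client, not SQLite, so treat this only as a way to check the logic.

```python
import sqlite3

# Toy stand-in for chado: only the tables and columns the query uses.
# The real schema has many more columns and constraints.
SCHEMA = """
create table organism (organism_id integer primary key, genus text, species text);
create table cv (cv_id integer primary key, name text);
create table cvterm (cvterm_id integer primary key, cv_id integer, name text);
create table feature (feature_id integer primary key, organism_id integer, uniquename text);
create table featureprop (featureprop_id integer primary key, feature_id integer,
                          type_id integer, value text);
"""

# Same joins and filters as the query above, with the organism and the
# search word turned into parameters. "order by 1" is added only so the
# toy output has a fixed order.
QUERY = """
select gene.uniquename as gene
     , prop.value as note
from feature gene
join featureprop prop on gene.feature_id = prop.feature_id
join cvterm prop_type on prop.type_id = prop_type.cvterm_id
join cv prop_type_cv on prop_type.cv_id = prop_type_cv.cv_id
join organism on gene.organism_id = organism.organism_id
where prop_type_cv.name = 'feature_property' and prop_type.name = 'comment'
  and organism.genus = ? and organism.species = ?
  and prop.value like ?
order by 1
"""


def build_toy_db():
    """Create an in-memory database holding a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("insert into organism values (?, ?, ?)", [
        (1, "Schistosoma", "mansoni"),
        (2, "Plasmodium", "falciparum"),
    ])
    conn.executemany("insert into cv values (?, ?)", [
        (1, "feature_property"),
        (2, "sequence"),
    ])
    conn.executemany("insert into cvterm values (?, ?, ?)", [
        (10, 1, "comment"),   # the property type the query looks for
        (11, 1, "synonym"),   # right cv, wrong term
        (12, 2, "comment"),   # right term name, wrong cv
    ])
    # Hypothetical gene names, not real GeneDB identifiers.
    conn.executemany("insert into feature values (?, ?, ?)", [
        (100, 1, "toy_sm_gene_A"),
        (101, 1, "toy_sm_gene_B"),
        (102, 2, "toy_pf_gene_C"),
        (103, 1, "toy_sm_gene_D"),
        (104, 1, "toy_sm_gene_E"),
    ])
    conn.executemany("insert into featureprop values (?, ?, ?, ?)", [
        (1, 100, 10, "manually curated gene model"),   # matches
        (2, 101, 10, "automatic prediction"),          # no 'manual' in note
        (3, 102, 10, "manual annotation"),             # wrong organism
        (4, 103, 11, "manual"),                        # wrong property type
        (5, 103, 12, "manual check"),                  # 'comment' from wrong cv
        (6, 104, 10, "checked by manual inspection"),  # matches
    ])
    return conn


def find_notes(conn, genus, species, word):
    """Return (gene, note) pairs whose comment contains the given word."""
    # Note: SQLite's LIKE ignores ASCII case by default, whereas
    # PostgreSQL's LIKE is case-sensitive.
    return conn.execute(QUERY, (genus, species, f"%{word}%")).fetchall()


if __name__ == "__main__":
    for gene, note in find_notes(build_toy_db(), "Schistosoma", "mansoni", "manual"):
        print(gene, note, sep="\t")
```

Only the two genes that have a 'comment' property from the 'feature_property' cv, belong to Schistosoma mansoni, and have 'manual' somewhere in the note come back; the other rows each fail exactly one of the conditions.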

I got this example from the Sample_Chado_queries webpage.
There are more sample chado queries on the Useful_chado_queries webpage.
A third useful webpage is Extracting_data_from_a_Chado_database.

Thanks to my colleagues Magdalena, Matt and Anna for help.

Notes:
- to exit chado, I seem to have to type CTRL+D
