Thursday 10 November 2016

A plot of gene counts across a species tree

My colleague James Cotton has written a nice Python script to plot the gene counts for a gene family across the species tree. Here are notes on how to run it (as otherwise I will forget):

1. Get the gene counts in the family:
e.g.
% grep 'family 132785 :' complete_families.txt | tr ')' '\n' | cut -d" " -f3 | sort | uniq -c

2. Format the counts as a comma-separated list of species(count) entries, with no spaces:
ancylostoma_caninum(23),ancylostoma_ceylanicum(22),ancylostoma_duodenale(22),angiostrongylus_cantonensis(2),angiostrongylus_costaricensis(3),caenorhabditis_elegans(1),cylicostephanus_goldi(10),haemonchus_contortus(4),haemonchus_placei(10),heligmosomoides_bakeri(46),necator_americanus(10),nippostrongylus_brasiliensis(33),oesophagostomum_dentatum(7),panagrellus_redivivus(3),pristionchus_pacificus(1),strongylus_vulgaris(1),teladorsagia_circumcincta(10)

3. Run the script:
% ssh -Y pcs5
% cd /lustre/scratch108/parasites/alc/000_50HG_InterestingFamilies/final_list_files/genecountplots
% python DrawTreeWithBars.py 'ancylostoma_caninum(23),ancylostoma_ceylanicum(22),ancylostoma_duodenale(22),angiostrongylus_cantonensis(2),angiostrongylus_costaricensis(3),caenorhabditis_elegans(1),cylicostephanus_goldi(10),haemonchus_contortus(4),haemonchus_placei(10),heligmosomoides_bakeri(46),necator_americanus(10),nippostrongylus_brasiliensis(33),oesophagostomum_dentatum(7),panagrellus_redivivus(3),pristionchus_pacificus(1),strongylus_vulgaris(1),teladorsagia_circumcincta(10)' family132785.png

This makes a nice plot like this:

Update (March 2017):
James wrote:
I've updated by family count data around a tree -
python ~/bin/DrawTreeWithBars_50HGI_families.py 'ixodes_scapularis(1),taenia_asiatica(3),echinococcus_multilocularis(4),angiostrongylus_cantonensis(1),hymenolepis_nana(1),protopolystoma_xenopodis(4),schistocephalus_solidus(6),oesophagostomum_dentatum(1),echinococcus_granulosus(4),nippostrongylus_brasiliensis(1),clonorchis_sinensis(2),haemonchus_placei(1),schistosoma_curassoni(1),echinostoma_caproni(4),taenia_solium(4),schistosoma_margrebowiei(1),mesocestoides_corti(4),hydatigera_taeniaeformis(3),schistosoma_mansoni(1),schistosoma_mattheei(1),hymenolepis_diminuta(1),trichoplax_adhaerens(1),diphyllobothrium_latum(4),crassostrea_gigas(1),drosophila_melanogaster(2),schistosoma_haematobium(1),teladorsagia_circumcincta(1),hymenolepis_microstoma(1),strongylus_vulgaris(1),pristionchus_pacificus(1),meloidogyne_hapla(1),schistosoma_rodhaini(1),fasciola_hepatica(2),spirometra_erinaceieuropaei(9),schmidtea_mediterranea(4),schistosoma_japonicum(1),trichobilharzia_regenti(3),angiostrongylus_costaricensis(1)' blah.png
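
The species(count) string passed to both scripts above does not have to be typed by hand; it can be built from the output of step 1. Below is a minimal sketch in Python 3. It assumes that each line of the `uniq -c` output is a count followed by a species name (I have not checked this against every family in complete_families.txt). The function name format_counts is my own and is not part of James's script:

```python
def format_counts(uniq_lines):
    """Convert `uniq -c` style lines ("  23 species_name") into
    the 'species(23),species2(4),...' string used in step 2."""
    entries = []
    for line in uniq_lines:
        parts = line.split()
        if len(parts) != 2:
            # Assumption: valid lines have exactly a count and a name;
            # blank or unexpected lines are skipped.
            continue
        count, species = parts
        entries.append((species, int(count)))
    entries.sort()  # alphabetical by species name
    return ",".join(f"{species}({count})" for species, count in entries)


# Two sample lines, using counts from the family 132785 example above
sample = [
    "     23 ancylostoma_caninum",
    "      1 caenorhabditis_elegans",
]
print(format_counts(sample))
# prints: ancylostoma_caninum(23),caenorhabditis_elegans(1)
```

To use it on real data, pass the lines of the step 1 output (for example, by iterating over sys.stdin) instead of the sample list, and quote the resulting string on the command line as in step 3.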