Creating a GO-Slim
I wanted to create my own GO-Slim, and found there is a nice tool for creating a GO-Slim at the EBI. You can start with an existing GO-Slim, e.g. the 'generic GO-Slim' (which has 149 terms), and add terms to it.
Mapping GO terms to your own GO-Slim
The next thing I wanted to do was to map GO terms to my GO-Slim. I was able to do this using Map2Slim, which is part of Owltools. To install it, I typed:
% wget http://build.berkeleybop.org/userContent/owltools/owltools
% wget http://build.berkeleybop.org/userContent/owltools/owltools-runner-all.jar
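Owltools is a Java program; it ran fine for me locally on my Mac laptop. If the shell refuses to run ./owltools after the download, you may need to make the script executable first (chmod +x owltools).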
To run Map2Slim you need:
(i) a list of the GO terms in your GO-Slim
(ii) the GO annotations for your gene set of interest, with respect to the full ontology
(iii) the gene ontology hierarchy file (.obo file) for the full ontology
First, you need a list of the GO terms in your GO-Slim. If you are using the generic GO-Slim, you can download it in obo format from the geneontology.org website. Then, to get a list of the terms, you can type:
% grep "id: GO:" goslim_generic.obo | grep -v alt | cut -d" " -f2 > goslim_terms.txt
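If the slim file happens to contain obsolete stanzas, the grep above would list them too. A possible alternative, sketched here with awk, prints only the primary id of each non-obsolete [Term] stanza. The file names sample_slim.obo and sample_terms.txt and the GO IDs in the excerpt are made up for illustration; swap in goslim_generic.obo and goslim_terms.txt for real use.

```shell
# Tiny made-up OBO excerpt, written to a scratch file for illustration.
cat > sample_slim.obo <<'EOF'
[Term]
id: GO:9999991
name: example term A
alt_id: GO:9999992

[Term]
id: GO:9999993
name: example term B
is_obsolete: true

[Term]
id: GO:9999994
name: example term C
EOF

# Print the id of each [Term] stanza unless the stanza is marked obsolete.
awk '
  function emit() { if (id != "" && !obsolete) print id; id = ""; obsolete = 0 }
  /^\[/                           { emit(); in_term = ($0 == "[Term]") }
  in_term && /^id: GO:/           { id = $2 }
  in_term && /^is_obsolete: true/ { obsolete = 1 }
  END                             { emit() }
' sample_slim.obo > sample_terms.txt

cat sample_terms.txt
```

Because the pattern is anchored at the start of the line, alt_id lines are skipped without needing a separate grep -v.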
The GO annotations for your gene set of interest need to be in GAF-2.0 format, which is tab-delimited with 17 columns per line (shown here separated by spaces). In fact, I found that some of the columns aren't necessary for Map2Slim, so the file could look like this, where the columns marked 'optional' and 'unknown' seem to be ignored by Map2Slim:
!gaf-version: 2.0
WB 482159 482159 optional GO:0005515 pubmed unknown optional unknown optional optional protein unknown 482159 WB optional optional
WB 482159 482159 optional GO:0008270 pubmed unknown optional unknown optional optional protein unknown 482159 WB optional optional
WB 644503 644503 optional GO:0008270 pubmed unknown optional unknown optional optional protein unknown 644503 WB optional optional
WB 644503 644503 optional GO:0005230 pubmed unknown optional unknown optional optional protein unknown 644503 WB optional optional
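If your annotations start out as a simple two-column table, a file like the one above could be generated with awk. This is only a sketch: the input file gene2go.tsv (gene ID, tab, GO term) is a hypothetical name, and the placeholder values simply copy the example above, relying on the observation that Map2Slim seems to ignore them.

```shell
# Hypothetical input: one "gene<TAB>GO term" pair per line.
printf '482159\tGO:0005515\n482159\tGO:0008270\n644503\tGO:0008270\n' > gene2go.tsv

{
  # GAF header line, as in the example above.
  echo '!gaf-version: 2.0'
  # Emit 17 tab-separated columns per annotation, with placeholders
  # copied from the example for the columns that seem to be ignored.
  awk -F'\t' -v OFS='\t' '{
    print "WB", $1, $1, "optional", $2, "pubmed", "unknown", "optional", \
          "unknown", "optional", "optional", "protein", "unknown", $1, "WB", \
          "optional", "optional"
  }' gene2go.tsv
} > my_gaf.txt

cat my_gaf.txt
```

Setting OFS to a tab keeps the output tab-delimited, which is what the GAF format expects.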
You can download the latest gene ontology hierarchy (.obo) file from the geneontology.org website.
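Once both files are in place, it may be worth checking that every term in your GO-Slim list appears as a primary id in the ontology file. I don't know how Map2Slim treats IDs it cannot find, so this is purely a precaution. The sketch below uses made-up scratch files (sample_go.obo, sample_slim_terms.txt) standing in for go-basic.obo and goslim_terms.txt.

```shell
# Scratch stand-ins for the ontology file and the slim term list (made-up IDs).
printf 'id: GO:9999991\nid: GO:9999992\nalt_id: GO:9999993\n' > sample_go.obo
printf 'GO:9999991\nGO:9999993\n' > sample_slim_terms.txt

# Primary IDs in the ontology, sorted so that comm can compare them.
grep '^id: GO:' sample_go.obo | cut -d' ' -f2 | sort > sample_all_ids.txt

# Slim terms that are not primary IDs in the ontology file.
sort sample_slim_terms.txt | comm -23 - sample_all_ids.txt > sample_missing.txt

cat sample_missing.txt
```

An empty sample_missing.txt would mean every slim term was found; here the one term listed is an alt_id rather than a primary id.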
To run Map2Slim, you type for example:
% ./owltools go-basic.obo --gaf my_gaf.txt --map2slim --idfile goslim_terms.txt --write-gaf my_slim.txt
The input GAF file was my_gaf.txt, the input obo file was go-basic.obo and the input list of GO-Slim terms was goslim_terms.txt.
The output file my_slim.txt is in GAF format, but has the GO-Slim terms for your genes. Hurray!
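One simple thing to do with the output is to count how many genes are annotated to each GO-Slim term. The sketch below assumes the output keeps the usual GAF layout, with the gene ID in column 2 and the GO term in column 5. The file sample_slim_gaf.txt and its rows are made up for illustration; point the pipeline at my_slim.txt for real data.

```shell
# Made-up stand-in for Map2Slim output; only columns 2 (gene) and 5 (GO term) are used.
{
  printf '!gaf-version: 2.0\n'
  printf 'WB\t482159\t482159\t\tGO:0003674\n'
  printf 'WB\t482159\t482159\t\tGO:0003674\n'
  printf 'WB\t644503\t644503\t\tGO:0003674\n'
  printf 'WB\t644503\t644503\t\tGO:0008150\n'
} > sample_slim_gaf.txt

# Drop header lines, keep unique gene/term pairs, then count genes per term.
grep -v '^!' sample_slim_gaf.txt | cut -f2,5 | sort -u | cut -f2 | sort | uniq -c > slim_counts.txt

cat slim_counts.txt
```

The sort -u step is there in case the same gene ends up with the same slim term on more than one line, so that each gene is counted once per term.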
1 comment:
Very good information. I have a question about your output file: what do you do next? How do you use the GO-Slim terms?
Thank you. Good day.