Monday 5 November 2012

Running GeneWise with HMMs

A nice feature of Ewan Birney's GeneWise software is that it can use HMMs of gene families to help predict genes in a DNA sequence.

This can be done using:
% genewise <hmmfile> <fasta> -hmmer ... [other options]
where <hmmfile> is your HMM file, and <fasta> is the FASTA file containing your DNA sequence.

In a previous post, I described how to train GeneWise so that it uses a splice site parameter file that has been trained for your species.

If you want to run GeneWise with HMMs, and also want to use a splice site parameter file for your species, you will need to type:
% genewise <hmmfile> <fasta> -hmmer -genestats <paramfile> -nosplice_gtag ... [other options]
where <paramfile> is your splice site parameter file.

The above command can only be used to compare one HMM to one DNA sequence.

The GeneWise software comes with a program called genewisedb, which can be used to compare multiple HMMs to a FASTA file of multiple sequences. Unfortunately, genewisedb does not have the -genestats option, so you cannot use your own splice site parameter file with it.

If you want to use the -genestats option with your own splice site parameter file, you can use my Perl script. It runs genewise on every pairing of an HMM from your input file of multiple HMMs with a DNA sequence from a FASTA file of multiple sequences.
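
To illustrate the all-against-all idea, here is a rough Python sketch. It is not the Perl script itself, just one way the same loop could look. The function names (split_fasta, split_hmms, genewise_command, run_all) and the temporary file names are made up for this example. It also assumes that each HMM in the multi-HMM file ends with a line containing only "//", as in HMMER's text format, and that genewise is on your PATH. Only the genewise options shown above are used.

```python
import itertools
import os
import subprocess


def split_fasta(text):
    """Split multi-FASTA text into a list of single-record strings."""
    records, current = [], []
    for line in text.splitlines():
        # A new header line closes the record collected so far.
        if line.startswith(">") and current:
            records.append("\n".join(current) + "\n")
            current = []
        if line.strip():
            current.append(line)
    if current:
        records.append("\n".join(current) + "\n")
    return records


def split_hmms(text):
    """Split concatenated HMMs, assuming each one ends with a '//' line.

    Any trailing lines that are not followed by '//' are ignored.
    """
    records, current = [], []
    for line in text.splitlines():
        current.append(line)
        if line.strip() == "//":
            records.append("\n".join(current) + "\n")
            current = []
    return records


def genewise_command(hmm_path, fasta_path, param_path):
    """Build the genewise command line for one HMM and one sequence."""
    return ["genewise", hmm_path, fasta_path, "-hmmer",
            "-genestats", param_path, "-nosplice_gtag"]


def run_all(hmm_file, fasta_file, param_file, workdir):
    """Run genewise for every HMM against every sequence; return the outputs."""
    with open(hmm_file) as handle:
        hmms = split_hmms(handle.read())
    with open(fasta_file) as handle:
        seqs = split_fasta(handle.read())

    # Write each HMM and each sequence to its own file (hypothetical names),
    # because genewise compares one HMM to one sequence at a time.
    hmm_paths, seq_paths = [], []
    for i, hmm in enumerate(hmms):
        path = os.path.join(workdir, f"hmm_{i}.hmm")
        with open(path, "w") as out:
            out.write(hmm)
        hmm_paths.append(path)
    for j, seq in enumerate(seqs):
        path = os.path.join(workdir, f"seq_{j}.fa")
        with open(path, "w") as out:
            out.write(seq)
        seq_paths.append(path)

    outputs = []
    for hmm_path, seq_path in itertools.product(hmm_paths, seq_paths):
        result = subprocess.run(
            genewise_command(hmm_path, seq_path, param_file),
            capture_output=True, text=True, check=True)
        outputs.append(result.stdout)
    return outputs
```

With N HMMs and M sequences this launches N x M genewise jobs one after another, so it will be slow for large inputs. Any other genewise options you need can be appended to the list returned by genewise_command.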
