I have been learning how to use snippy by Torsten Seemann to identify SNPs in bacterial genomes.
Running snippy
To run snippy on the Sanger computer farm, I first had to type:
% module load snippy/4.6.0
Then I wanted to run snippy on an assembly, "14.fasta", comparing it to a reference genome, "ref.fa". Because I had an assembly rather than sequencing reads, I used the --ctgs option, which tells snippy to infer SNPs by simulating fake 250-bp reads from the assembly "14.fasta" and comparing those to the reference genome:
% snippy --cpus 16 --outdir mysnps_test --ref ref.fa --ctgs 14.fasta
Here --outdir mysnps_test puts the output files into the directory mysnps_test, and --cpus 16 tells snippy to use 16 CPUs.
It took 8 minutes to run on that assembly.
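To run the same command on several assemblies, something like the following sketch could be used. It only prints the snippy commands rather than running them, so they can be checked first. The function name print_snippy_cmds, the second assembly "15.fasta" and the snps_<name> naming of the output directories are my own assumptions for illustration; the snippy options are just the ones used above.

```shell
# Print (rather than run) one snippy command per assembly.
# Usage: print_snippy_cmds REFERENCE ASSEMBLY [ASSEMBLY ...]
# The snps_<name> output-directory naming is a hypothetical choice.
print_snippy_cmds() {
    ref=$1
    shift
    for asm in "$@"; do
        name=$(basename "$asm" .fasta)   # e.g. 14.fasta -> 14
        echo "snippy --cpus 16 --outdir snps_${name} --ref ${ref} --ctgs ${asm}"
    done
}

# Example with a hypothetical second assembly, 15.fasta:
print_snippy_cmds ref.fa 14.fasta 15.fasta
```

Once the printed commands look right, piping the output into sh would run them one after another.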
Output files from snippy
The main output file from snippy is called snps.tab. It is a tab-delimited table with one row per variant, and looks something like this:
% head -10 mysnps_test/snps.tab
CHROM     POS     TYPE  REF            ALT  EVIDENCE  FTYPE  STRAND  NT_POS  AA_POS  EFFECT  LOCUS_TAG  GENE  PRODUCT
AE003852  5414    snp   G              A    A:20 G:0
AE003852  42082   snp   A              C    C:20 A:0
AE003852  137105  del   TAACAGAAACAGA  T    T:14 TAACAGAAACAGA:0
AE003852  144569  snp   G              A    A:20 G:0
AE003852  167663  snp   T              C    C:14 T:0
AE003852  167678  snp   G              A    A:14 G:0
AE003852  167684  snp   C              T    T:14 C:0
AE003852  167697  snp   A              G    G:14 A:0
AE003852  182735  snp   C              T    T:20 C:0
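As a quick way to summarise this file, here is a minimal sketch that counts how many variants there are of each TYPE (snp, del, and so on). It assumes snps.tab is tab-delimited with a single header line and TYPE in the third column, as in the output above; the function name count_variant_types is hypothetical. Splitting on tabs rather than on any whitespace matters, because the EVIDENCE field itself contains a space.

```shell
# Count variants by TYPE (column 3) in a snippy snps.tab file.
# Assumes a tab-delimited file whose first line is the header.
count_variant_types() {
    awk -F'\t' '
        NR > 1 { n[$3]++ }                           # skip the header, tally each TYPE
        END    { for (t in n) print t "\t" n[t] }    # one "type<TAB>count" line per TYPE
    ' "$1" | sort                                    # awk array order is unspecified, so sort
}

# Example: count_variant_types mysnps_test/snps.tab
```

For just the nine variants shown above, this would report 1 del and 8 snp.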
Acknowledgements
Thanks to my colleagues Lia Bote and Vignesh Shetty for help running snippy and understanding it.