Friday, 20 February 2026

Using MOB-suite to predict plasmids in bacterial genome assemblies

Today I wanted to predict plasmids in a bacterial genome assembly, and used the MOB-suite tool.

Here's how I ran it on the Sanger compute farm:

% mob_recon --infile genome.fasta --outdir genome_plasmid 

where genome.fasta is the FASTA file containing my genome assembly, and genome_plasmid is the name I wanted to give to the output directory. I needed to request 1000 Mbyte of RAM to run this on a 4.5 Mbyte bacterial genome.

The output file will be genome_plasmid/mobtyper_results.txt.

Some useful columns in the output file are:

column 15: mash_nearest_neighbor
column 16: mash_neighbor_distance
column 17: mash_neighbor_identification
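
A minimal sketch of how these three columns could be pulled out by name rather than by position. It assumes the file is tab-separated with a single header line, and that the column names are spelled as in the header of the file itself (mash_neighbor_..., American spelling). The function name read_mash_columns and the cut-down sample text are illustrative only and are not part of MOB-suite.

```python
import csv
import io

# Columns of interest, spelled as they appear in the file's header line.
MASH_COLUMNS = [
    "mash_nearest_neighbor",
    "mash_neighbor_distance",
    "mash_neighbor_identification",
]

def read_mash_columns(handle):
    """Return one dict per plasmid holding sample_id plus the three mash columns."""
    # Assumption: the report is tab-delimited with one header row.
    reader = csv.DictReader(handle, delimiter="\t")
    wanted = ["sample_id"] + MASH_COLUMNS
    return [{key: row[key] for key in wanted} for row in reader]

# Hypothetical input, cut down to the relevant columns for illustration.
SAMPLE = (
    "sample_id\tmash_nearest_neighbor\tmash_neighbor_distance\tmash_neighbor_identification\n"
    "CCBT0329:AC804\tAF052650\t0.00759618\tVibrio cholerae\n"
)

rows = read_mash_columns(io.StringIO(SAMPLE))
for row in rows:
    print(row)
```

For the real report, the same function can be given an open file handle instead of the in-memory sample, for example open("genome_plasmid/mobtyper_results.txt", newline="").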

The output 'mobtyper_results.txt' file looks something like this: 

sample_id       num_contigs     size    gc      md5     rep_type(s)     rep_type_accession(s)   relaxase_type(s)        relaxase_type_accession(s)      mpf_type        mpf_type_accession(s)   orit_type(s)    orit_accession(s) predicted_mobility      mash_nearest_neighbor   mash_neighbor_distance  mash_neighbor_identification    primary_cluster_id      secondary_cluster_id    predicted_host_range_overall_rank       predicted_host_range_overall_name observed_host_range_ncbi_rank   observed_host_range_ncbi_name   reported_host_range_lit_rank    reported_host_range_lit_name    associated_pmid(s)
CCBT0329:AA860  1       153481  0.5174686451848661      8c072d1914bfa50eb379d2673416d2b0        IncC    000092__CP025470        MOBH,MOBH       NC_012690_00071,NC_012885_00072 MPF_F   NC_023291_00077,NC_012885_00091,NC_016974_00085,NC_012885_00083,NC_014170_00023,NC_009140_00071,NC_012885_00167,NC_012885_00088   MOBH    JQ319772        conjugative     CP015394        0.000143503     Klebsiella pneumoniae   AA860   AJ278     phylum  Pseudomonadota  class   Gammaproteobacteria     phylum  Pseudomonadota  23800906; 20138094; 19482926; 24567731; 28842132; 20851899; 22290972; 19949054
CCBT0329:AC804  1       3981    0.46897764380808843     cab608a1a227ef9028aa1b8d80e819b9        rep_cluster_159 000964__AF052650        -       -       -       -       -       -       non-mobilizable AF052650 0.00759618       Vibrio cholerae AC804   AM145   genus   Vibrio  genus   Vibrio  -       -       -

In this example, two plasmids are predicted in the genome. The first is an IncC plasmid of about 153 kb, predicted to be conjugative, whose closest sequence match is NCBI accession CP015394, a Klebsiella pneumoniae plasmid. The second is a small, non-mobilizable plasmid of about 4 kb, whose closest sequence match is NCBI accession AF052650, a Vibrio cholerae plasmid. If you look up AF052650 on the NCBI website, you'll find it is V. cholerae plasmid pTLC.
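
The kind of reading-off done above can be sketched as a small helper that turns one row of the report into a one-line summary. This is only an illustration: describe_plasmid is a hypothetical name, the row is assumed to be a dict keyed by the header names shown in the sample output, and the values below are copied from the first data row of that sample.

```python
def describe_plasmid(row):
    """Format one mobtyper_results.txt row (as a dict) into a short summary."""
    # The size column is in base pairs; convert to kilobases for readability.
    size_kb = int(row["size"]) / 1000
    return (
        f"{row['sample_id']}: {row['rep_type(s)']}, {size_kb:.1f} kb, "
        f"{row['predicted_mobility']}, nearest match "
        f"{row['mash_nearest_neighbor']} ({row['mash_neighbor_identification']})"
    )

# Values taken from the first data row of the sample output.
first_plasmid = {
    "sample_id": "CCBT0329:AA860",
    "size": "153481",
    "rep_type(s)": "IncC",
    "predicted_mobility": "conjugative",
    "mash_nearest_neighbor": "CP015394",
    "mash_neighbor_identification": "Klebsiella pneumoniae",
}

summary = describe_plasmid(first_plasmid)
print(summary)
```

A row dict of this shape is what csv.DictReader yields when the report is read with a tab delimiter, so the helper could be applied to every row of the real file in a loop.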


 
