[Note: this is useful to Sanger users only.]
There is a nice program called 'assembly-stats' for calculating assembly statistics on the Sanger farm.
Find the latest version of it:
% module avail -t | grep -i stats
assembly-stats/1.0.1
Load the module:
% module load assembly-stats/1.0.1
Now run it on an assembly file '2038_EDC_717.fas' (the -t option prints tab-delimited output):
% assembly-stats -t 2038_EDC_717.fas
filename total_length number mean_length longest shortest N_count Gaps N50 N50n N70 N70n N90 N90n
2038_EDC_717.fas 3816803 82 46546.38 362749 1020 2 1 163759 8 89245 15 24898 30
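If you only want one of these numbers (say the N50), you can pick the column out by its header name. This is just a minimal sketch: the function name 'stats_column' is my own invention and is not part of assembly-stats, and it assumes the input is the tab-delimited table shown above (a header line followed by one or more data lines).

```shell
#!/bin/sh
# stats_column is a hypothetical helper, not part of assembly-stats.
# It reads tab-delimited assembly-stats output on standard input and
# prints the value(s) found in the column whose header matches $1.
stats_column() {
    awk -F '\t' -v col="$1" '
        # Header line: remember the position of the requested column.
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) c = i; next }
        # Data lines: print that column, if the header was found.
        c       { print $c }
    '
}
```

For example, 'assembly-stats -t 2038_EDC_717.fas | stats_column N50' should print just the N50 value for that assembly.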
If you have a whole directory of assembly files all ending in '.fas', you can write a Bourne shell script that runs assembly-stats on each of them with a for loop:
#!/bin/sh
# see https://alvinalexander.com/blog/post/linux-unix/bourne-shell-script-for-loop-edit-files/
for i in *.fas
do
echo "$i"
assembly-stats -t "$i" > "$i.stats"
done
This makes a '.stats' file for each assembly file (e.g. 2038_EDC_717.fas.stats for assembly file 2038_EDC_717.fas).
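Once you have one statistics file per assembly, it can be handy to merge them into a single table. Here is a rough sketch of one way to do that: the function name 'combine_stats' is hypothetical, and it assumes every input file starts with the same header line, as the -t output does.

```shell
#!/bin/sh
# combine_stats is a hypothetical helper, not part of assembly-stats.
# It concatenates the given tab-delimited stats files, keeping the header
# line from the first file only and the data lines from every file.
combine_stats() {
    # NR == 1 is the very first line read (the first file's header);
    # FNR > 1 is any non-header line of any file.
    awk 'NR == 1 || FNR > 1' "$@"
}
```

For example, 'combine_stats *.fas.stats > all_assemblies.tsv' would write one header line followed by one line per assembly (the output filename here is only an example).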