Comments on avrilomics: Filtering a repeat library for matches to proteins/RNAs
Blog author: Avril Coghlan (http://www.blogger.com/profile/14064447050845166903)
Feed updated: 2023-05-11T06:21:18.796-07:00. 4 comments, newest first.

Anonymous, 2015-11-30T08:08:57.291-08:00:
Thanks for sharing these notes: they're very helpful!

Avril Coghlan, 2015-11-02T06:38:48.074-08:00:
Note: the script in my earlier comment assumes that there is one line per sequence in the FASTA file (i.e. each sequence is all on one line, not spread over several lines).

Avril Coghlan, 2015-11-02T06:26:01.028-08:00:
The script is by my colleague Eleanor Stanley. It does something like this:

    use strict;
    use warnings;

    # Arguments: the FASTA file, then a file listing the sequence names to remove
    die "Usage: $0 <fasta> <names_to_remove>\n" unless @ARGV == 2;
    my $fasta = shift @ARGV;
    my $bad = shift @ARGV;
    my %reads = ();

    open (IN, '<', $bad) or die "I can't open $bad\n";
    open (IN2, '<', $fasta) or die "I can't open $fasta\n";
    open (OUT, '>', "$fasta.notwanted") or die "I can't open $fasta.notwanted\n";
    open (OUT2, '>', "$fasta.filtered") or die "I can't open $fasta.filtered\n";

    # Record the first whitespace-separated field of each line as a name to remove
    while (<IN>) {
        chomp;
        my @line = split /\s+/ , $_;
        next unless @line;    # skip blank lines
        $reads{$line[0]}++;
        #print "Line:$line[0]:\n";
    }
    close (IN);

    # Read the FASTA file as header/sequence pairs (one sequence line per header)
    while (<IN2>) {
        chomp;
        if (/^>(\S+)/) {
            my $seq_name = $1;
            $seq_name=~s/>//;
            my $seq = <IN2>;
            $seq = '' unless defined $seq;    # header on the last line of the file
            chomp($seq);
            #print "SEQname:$seq_name:\n";
            if ($reads{$seq_name} ) {
                # Listed name: write to <fasta>.notwanted
                print OUT ">$seq_name\n";
                print OUT "$seq\n";
            }
            else {
                # Unlisted name: write to <fasta>.filtered
                print OUT2 ">$seq_name\n";
                print OUT2 "$seq\n";
            }
        }
    }

    close (IN2);
    close (OUT);
    close (OUT2);

Anonymous, 2015-10-26T07:17:50.062-07:00:
Is it possible to share the 'fasta_retrieve_subsets.pl' script?
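
Editorial sketch: the script quoted in the comments is noted to assume one line per sequence. The variant below is a minimal sketch of how that restriction might be lifted, not the original fasta_retrieve_subsets.pl. It sends every line after a header to the same output as that header, so wrapped sequences stay together. The subroutine names `read_ids` and `split_fasta` are hypothetical, and unlike the quoted script it copies header lines unchanged (including any description after the name) instead of rewriting them as the bare name.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: collect the first whitespace-separated field of
# each line as a sequence name to remove. Blank lines are ignored.
sub read_ids {
    my ($fh) = @_;
    my %ids;
    while (my $line = <$fh>) {
        my ($id) = split ' ', $line;
        $ids{$id} = 1 if defined $id;
    }
    return \%ids;
}

# Hypothetical helper: copy each FASTA record to $unwanted_fh if its name
# is listed in $ids, otherwise to $kept_fh. Lines after a header go to the
# same output as that header, so sequences may span several lines.
sub split_fasta {
    my ($in_fh, $ids, $kept_fh, $unwanted_fh) = @_;
    my $out;    # output of the current record; undef before the first header
    while (my $line = <$in_fh>) {
        if ($line =~ /^>/) {
            my ($name) = $line =~ /^>(\S+)/;
            $out = (defined $name && $ids->{$name}) ? $unwanted_fh : $kept_fh;
        }
        print {$out} $line if defined $out;
    }
    return;
}

# Command-line use (same argument order and output names as the quoted
# script): perl this_script.pl library.fasta names_to_remove.txt
if (!caller && @ARGV == 2) {
    my ($fasta, $bad) = @ARGV;
    open(my $bad_fh, '<', $bad)   or die "I can't open $bad: $!\n";
    open(my $in_fh,  '<', $fasta) or die "I can't open $fasta: $!\n";
    open(my $kept_fh, '>', "$fasta.filtered")
        or die "I can't open $fasta.filtered: $!\n";
    open(my $unwanted_fh, '>', "$fasta.notwanted")
        or die "I can't open $fasta.notwanted: $!\n";
    split_fasta($in_fh, read_ids($bad_fh), $kept_fh, $unwanted_fh);
    close($_) for $bad_fh, $in_fh;
    close($kept_fh)     or die "Error writing $fasta.filtered: $!\n";
    close($unwanted_fh) or die "Error writing $fasta.notwanted: $!\n";
}
```

Because the input is streamed line by line, only the list of names is held in memory, which should keep the sketch usable on large repeat libraries.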