I wanted to find all the occurrences of a subsequence in a genome assembly. I first tried BLAT, but it didn't find them (I'm not sure why).
So I instead wrote a little Python function to print out all the positions of a subsequence in a sequence:
#====================================================================#
# find the positions of a subsequence in a sequence:
def find_positions_of_subsequence(seq, subsequence, seqname):
    # print the 1-based start of every match, including overlapping ones
    start = 0
    while True:
        # search to the end of seq; find()'s end argument is exclusive,
        # so passing len(seq) - 1 would skip a match ending on the last base
        position = seq.find(subsequence, start)
        if position == -1:
            break
        actual_position = position + 1  # convert 0-based index to 1-based
        print("Found at %d in %s" % (actual_position, seqname))
        start = position + 1  # move on one base to catch overlapping matches
#====================================================================#
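If you want to do something with the positions rather than just print them, here is a rough sketch of the same idea as a generator. The name iter_subsequence_positions and the toy sequence are made up for illustration, and I'm assuming the sequence is already in memory as a plain Python string. It only looks at the strand you give it, so it won't find matches on the reverse strand.

```python
def iter_subsequence_positions(seq, subsequence):
    """Yield the 1-based start of every match, overlapping ones included."""
    if not subsequence:
        return  # an empty query would match everywhere, so yield nothing
    start = 0
    while True:
        position = seq.find(subsequence, start)  # searches through to the last base
        if position == -1:
            return
        yield position + 1    # convert 0-based index to 1-based coordinate
        start = position + 1  # step one base so overlapping matches are found


# toy sequence, made up for illustration (not real genome data)
toy_seq = "ATATAGGATA"
print(list(iter_subsequence_positions(toy_seq, "ATA")))  # prints [1, 3, 8]
```

The matches at 1 and 3 overlap, and the one at 8 runs right up to the last base of the toy sequence, so this covers the two cases that are easiest to get wrong.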
Python saves the day!