Monday 27 July 2015

Python testing using doctests

Are you sick of bugs? Me too!
An easy way to track down bugs is to test your Python code using doctests.
For example, if you have a module like this called 'module.py':


#====================================================================#

# define a function to read GO terms from a line of a file:

def read_go_terms_from_line_of_file(line):
    """read GO terms from a line of a file
    >>> read_go_terms_from_line_of_file("791228 GO:0005515")
    ('791228', ['GO:0005515'])
    >>> read_go_terms_from_line_of_file("530242 GO:0000003,GO:0004675,GO:0005024,GO:0016020")
    ('530242', ['GO:0000003', 'GO:0004675', 'GO:0005024', 'GO:0016020'])
    >>> read_go_terms_from_line_of_file("794214 none")
    ('794214', [])
    """

    # e.g. the input lines look like this:
    # 791228 GO:0005515
    # 5470 GO:0002119,GO:0004652,GO:0006952,GO:0007423,GO:0009792,GO:0010171,GO:0019915,GO:0040002,GO:0040011,GO:0040018,GO:0042302,GO:0042338,GO:0043631,GO:0045138
    # 530242 GO:0000003,GO:0004675,GO:0005024,GO:0016020
    # 794214 none

    temp = line.split()
    family = temp[0]
    go_terms = [] if temp[1] == "none" else temp[1].split(',')


    return family, go_terms

#====================================================================#

if __name__ == "__main__":
    import doctest
    doctest.testmod()



This module has just one function, 'read_go_terms_from_line_of_file', that reads the name of a gene family, and the GO terms for the genes in that family, from one line of an input file. The function takes a line of the file as its input, and returns the name of the family (a string) and a list of the GO terms. At the top of the function, in its docstring, are some doctests: each line starting with '>>>' is an example call, and the line after it is what the function should return (according to the programmer) for that input. The end of the module has some code (the 'if __name__ == "__main__":' block) that says that if we run this module, by typing:
% python3 module.py
then the doctests are run. If you get no output when you run 'python3 module.py', it means that the doctests were run, all of them passed, and no bugs were found (hurray!). If you do get some output, it shows which example failed, what was expected, and what the function actually returned, so you can hunt those nasty bugs down and squash them (hurray!).
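
To give an idea of what that output looks like, here is a minimal sketch with a deliberately buggy function. The function 'count_go_terms' and the helper 'run_doctests' are made up for this illustration (they are not part of module.py). The helper uses the standard doctest classes DocTestParser and DocTestRunner to run one docstring and collect the failure report as text:

```python
import doctest


def count_go_terms(line):
    """Count the GO terms on a line (made-up example with a deliberate bug).

    >>> count_go_terms("791228 GO:0005515")
    1
    >>> count_go_terms("794214 none")
    0
    """
    temp = line.split()
    # Bug: "none" is not handled, so it gets counted as one GO term.
    return len(temp[1].split(','))


def run_doctests(func):
    """Run the doctests in func's docstring; return (failed, attempted, report)."""
    globs = {func.__name__: func}  # names the examples are allowed to use
    test = doctest.DocTestParser().get_doctest(
        func.__doc__, globs, func.__name__, "<sketch>", 0)
    report = []  # collect the failure text instead of printing it directly
    results = doctest.DocTestRunner(verbose=False).run(test, out=report.append)
    return results.failed, results.attempted, "".join(report)


failed, attempted, report = run_doctests(count_go_terms)
print("%d of %d examples failed" % (failed, attempted))
print(report)
```

Here one of the two examples fails (the "none" line), and the report shows the failing call together with its 'Expected:' and 'Got:' values, which is the same kind of output you would see from 'python3 module.py' when a doctest fails.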

To get more detailed output regarding which tests were run and which failed/passed, type:
% python3 module.py -v

If you want to include doctests in a program (not a module)
An alternative is to put the doctests in a program. For example, your program could have the same code as above, except that the 'import doctest' and 'doctest.testmod()' lines at the end are removed, and the program ends instead with a 'main' function that calls the other functions:

def main():
    read_go_terms_from_line_of_file("791228 GO:0005515")

if __name__ == "__main__":
    main()


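Then, to run the doctests, we just run Python with the "-m doctest" option and give it the program file. This imports program.py rather than running it as a script, so 'main' is not called and only the doctests are run:
% python3 -m doctest program.py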

If your function returns a dictionary
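A doctest compares the printed output as text, and in older versions of Python the order in which a dictionary's keys are printed is not guaranteed (since Python 3.7 dictionaries keep insertion order, but the printed order still depends on how the function builds the dictionary). So writing out the expected dictionary directly is fragile. Instead, you can compare the returned dictionary with the expected one using '==', so that the doctest just expects 'True':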
    >>> find_midpoint_positions({'scaff1': ['50=100','150=200','250=300'], 'scaff2': ['100=200', '202=500'], 'scaff3': ['300=500']}) == {'scaff1': [125, 225], 'scaff2': [201]}
    True
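
As a self-contained sketch of the same idea, here is a made-up function 'read_go_terms_from_lines' (an assumption for illustration, not part of the module above) that returns a dictionary. Its docstring shows two ways of testing it: comparing with '==', and sorting the items so that the printed output is always in the same order:

```python
def read_go_terms_from_lines(lines):
    """Map each family name to its list of GO terms (made-up example).

    Comparing with == checks the contents, whatever the key order:

    >>> read_go_terms_from_lines(["791228 GO:0005515", "794214 none"]) == {'791228': ['GO:0005515'], '794214': []}
    True

    Sorting the items gives a fixed order, and shows the actual values
    in the failure report if the test breaks:

    >>> sorted(read_go_terms_from_lines(["794214 none", "791228 GO:0005515"]).items())
    [('791228', ['GO:0005515']), ('794214', [])]
    """
    go_terms_by_family = {}
    for line in lines:
        temp = line.split()
        # Same parsing rule as above: "none" means an empty list of GO terms.
        go_terms_by_family[temp[0]] = [] if temp[1] == "none" else temp[1].split(',')
    return go_terms_by_family
```

The '==' form is the shortest, but when it fails all you see is 'False'. The 'sorted(...items())' form is a bit longer, but a failure shows you exactly which keys and values came back.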


