Tuesday 29 September 2020

Using powsimR and SCOPIT for power calculations for single-cell data analysis


I'm trying to do a power calculation to ask how many biological replicates are needed to detect the majority of differentially expressed genes between two treatments, in a particular cell population (cluster) in single cell data. I found the nice tool powsimR from the group of Dr Ines Hellmann. They say in their paper that for single cell data, the biological replicate is a cell, so their software tells you how many cells you need to sequence to detect the majority of differentially expressed genes (e.g. with >2-fold differential gene expression) between two treatments (e.g. male versus female worms).

There are detailed instructions on how to install powsimR on the powsimR GitHub page. I followed these on a Windows laptop, after updating to the latest version of R, and found that it all installed fine, but I had to leave out 'build_vignettes=TRUE' when running the final devtools::install_github("bvieth/powsimR", dependencies=FALSE) command. The powsimR GitHub page says that several users have this issue. It means that the vignettes are not built, but they can be viewed online at the powsimR vignette page instead.

PowsimR lets you take into account batch effects (e.g. biological replicates performed in different months), depth of sequencing based on prior data, "dropout" effects where a gene is only detected in a subset of cells, and different methods for testing differential expression (e.g. MAST, scde, scDD, monocle). You can also specify a particular power and false discovery rate (FDR).

PowsimR can be used to help design experiments in advance, and can also be used for posterior analysis of an experiment that has already been done. In their paper they say: "For example for the Kolodziejczyk data, 384 single cells for each condition would be sufficient to detect >80% of the DE genes with a well controlled FDR of 5%. Given the lower sample sizes actually used in Kolodziejczyk et al (2015), our power analysis suggests that only 60% of all DE genes could be detected".
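To make the terms "power" and "FDR" in that quote concrete, here is a minimal Python sketch of how the two numbers can be computed for one simulated comparison, where we know which genes are truly differentially expressed (DE). This is not powsimR code: the function name power_and_fdr and the gene names are made up for illustration, and I am assuming the usual definitions (power = fraction of true DE genes that are called DE; FDR = fraction of called genes that are not truly DE).

```python
def power_and_fdr(true_de, called_de):
    """Return (power, FDR) for one simulated differential-expression test.

    true_de   -- genes that were simulated as differentially expressed
    called_de -- genes that the DE method reported as significant
    """
    true_de, called_de = set(true_de), set(called_de)
    true_positives = len(true_de & called_de)
    false_positives = len(called_de - true_de)
    # Power: share of the truly DE genes that were detected.
    power = true_positives / len(true_de) if true_de else float("nan")
    # FDR: share of the reported genes that are false discoveries.
    fdr = false_positives / len(called_de) if called_de else 0.0
    return power, fdr


# Toy example with hypothetical gene names: 5 genes are truly DE,
# and the method calls 5 genes, 4 of them correctly.
truth = {"geneA", "geneB", "geneC", "geneD", "geneE"}
calls = {"geneA", "geneB", "geneC", "geneD", "geneX"}
power, fdr = power_and_fdr(truth, calls)
print(f"power = {power:.2f}, FDR = {fdr:.2f}")
```

In a simulation-based power analysis, numbers like these would be averaged over many simulated datasets for each candidate sample size.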

In practice, I found that PowsimR looks quite complex to use as there are a lot of parameters to specify, and first you need to understand them. I ran out of time to do this properly, but hopefully can return to it one day, as it looks very useful...


Another nice tool is SCOPIT, which lets you ask the question: how many cells do we need to sequence from an adult worm to capture a rare cell type?

SCOPIT is very simple to use through its website.
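To get a feel for this kind of question, here is a rough Python sketch for the simplest case: a single rare cell type, where the number of cells of that type among n sequenced cells is assumed to follow a binomial distribution. This is my own simplification and not SCOPIT's code or necessarily its exact method; the function names prob_at_least and min_cells, and the example numbers (a cell type at 1% frequency, at least 1 cell wanted, 95% probability) are assumptions for illustration.

```python
from math import comb


def prob_at_least(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p): chance of seeing at least k
    cells of a type with frequency p among n sequenced cells."""
    below_k = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - below_k


def min_cells(p, k, target):
    """Smallest n such that P(at least k cells of the type) >= target."""
    if not (0 < p <= 1 and 0 < target < 1 and k >= 1):
        raise ValueError("need 0 < p <= 1, 0 < target < 1 and k >= 1")
    n = k  # we need at least k cells in total to see k of one type
    while prob_at_least(n, p, k) < target:
        n += 1
    return n


# Hypothetical example: a cell type making up 1% of cells, and we want
# a 95% chance of capturing at least one such cell.
n_needed = min_cells(p=0.01, k=1, target=0.95)
print(f"cells to sequence: {n_needed}")
```

The simple linear search is fine for the small k and moderate frequencies used here; for very rare cell types a faster search would be needed.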


Thanks to my colleague Faye Rodgers for telling me about SCOPIT.


