Monday, 26 January 2015

Using ggplot2 to plot boxplots in R

I love ggplot2! Here is a nice boxplot I made today, showing labels for the outliers:


> library(ggplot2)
> var1 <- c(1.06,1.06,1.19,1.28,1.11,1.16,1.04,1.21,1.27,1.41,1.09,1.10,1.04,1.41,1.07,1.16,1.09,1.11)
> var2 <- c(1.14,1.14,1.11,1.13,1.12,1.17,1.16,1.13,1.08,1.21,1.57,1.09)
> var3 <- c(1.13,1.05,1.03,1.04,1.10,1.04,1.14,1.15,1.00,1.08,1.07,1.07,1.08,1.03,1.09,1.07,1.33,1.07,1.08,1.09,1.03,1.05)
> var4 <- c(1.04,1.08,1.12,1.07,1.07,1.09,1.04)
> var5 <- c(1.03,1.04,1.02,1.04,1.04,1.04,1.04,1.04,1.05,1.05,1.06,1.05,1.08,1.10,1.07,1.00,1.18,1.05,1.03,1.11,1.53,1.05,1.08,1.08,1.04,1.06,1.05,1.05,1.04,1.03,1.07,1.41,1.04)
> myvalues <- c(var1,var2,var3,var4,var5)
> mynames <- c( rep('var1',length(var1)), rep('var2',length(var2)), rep('var3', length(var3)), rep('var4', length(var4)), rep('var5', length(var5)) )  
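
If the groups are kept in a named list, the same two long vectors can be built without typing one rep() call per group. This is only a sketch on toy data: the names groups, values and names_long are made up for illustration and are not part of the code above.

```r
# Toy groups (hypothetical, not the values from this post)
groups <- list(g1 = c(1.0, 1.2, 1.1), g2 = c(0.9, 1.3))

# Concatenate all values into one vector, dropping the element names
values <- unlist(groups, use.names = FALSE)

# Repeat each group name once per value in that group
names_long <- rep(names(groups), times = lengths(groups))

# One row per value, with its group name alongside
long <- data.frame(values, names_long)
print(long)
```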

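We only want to label the outliers, so every other point gets a blank label ('\n'). mylabels needs one entry per value, in the same order as myvalues: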
> mylabels <- rep('\n', length(myvalues))   # blank label for every point
> mylabels[c(10, 29, 47, 76, 80, 91)] <- c('A','B','C','D','E','F')   # positions of the labelled outliers in myvalues
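
The label positions above were worked out by hand. As a rough sketch of how they could be found automatically, the code below flags values that lie more than 1.5 × IQR beyond the quartiles within each group, which is the rule geom_boxplot documents for its default coef = 1.5. The helper is_outlier and the toy data are assumptions for illustration, and I have not checked that this picks exactly the points labelled A to F above.

```r
# Hypothetical helper: TRUE for values more than coef * IQR beyond the quartiles
is_outlier <- function(x, coef = 1.5) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  x < q[1] - coef * iqr | x > q[2] + coef * iqr
}

# Toy data (not the values from this post)
values <- c(1.0, 1.1, 1.2, 1.1, 1.0, 2.0, 1.1, 1.2, 1.0, 1.1, 1.2, 3.0)
groups <- rep(c("g1", "g2"), each = 6)

# Apply the rule within each group; ave() keeps the original row order
# and returns 0/1 because the input vector is numeric
flag <- ave(values, groups, FUN = is_outlier) == 1

# Blank label for ordinary points, one letter per outlier
labels <- rep("", length(values))
labels[flag] <- LETTERS[seq_len(sum(flag))]
print(data.frame(groups, values, labels))
```

The resulting labels vector has one entry per value, so it can be passed to geom_text in the same way as mylabels. LETTERS only holds 26 entries, so a plot with more outliers than that would need a different labelling scheme.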

Make the plot:
> mydata <- data.frame(myvalues, mynames)
> myplot <- ggplot(data = mydata, aes(factor(mynames), myvalues))
> myplot + geom_boxplot(outlier.size = 2, fill="red") + ylab("My values") + xlab("My variable") + geom_text(label=mylabels,size=3,hjust=1.5,vjust=1.3)
# outlier.size = 2 draws larger dots for the outliers; hjust and vjust in geom_text adjust the label position
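
An alternative to passing the labels as a separate vector is to store them as a column of the data frame and map that column with aes(label = ...). A minimal sketch on toy data follows; the data frame toy and its columns group, value and label are hypothetical names, not the variables used above.

```r
library(ggplot2)

# Toy data (hypothetical, not the values from this post)
toy <- data.frame(
  group = rep(c("g1", "g2"), each = 6),
  value = c(1.0, 1.1, 1.2, 1.1, 1.0, 2.0, 1.1, 1.2, 1.0, 1.1, 1.2, 3.0),
  label = c(rep("", 5), "A", rep("", 5), "B")  # only the two extreme points get a label
)

# The label column is mapped inside aes(), so each label travels with its row
p <- ggplot(toy, aes(x = group, y = value)) +
  geom_boxplot(outlier.size = 2, fill = "red") +
  geom_text(aes(label = label), size = 3, hjust = 1.5, vjust = 1.3) +
  xlab("My variable") + ylab("My values")
print(p)
```

Keeping the labels in the data frame means they stay attached to the right points if the rows are later filtered or sorted.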

To change the order of boxes along the x-axis:
> myxorder <- factor(mydata$mynames, levels=c("var5","var3","var1","var2","var4"))
> myplot <- ggplot(data = mydata, aes(myxorder, myvalues))
> myplot + geom_boxplot(outlier.size = 2, fill="red") + ylab("My values") + xlab("My variable") + geom_text(label=mylabels,size=3,hjust=1.5,vjust=1.3)
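
The box order follows the order of the factor levels. Besides listing the levels by hand, base R's reorder() can sort them by a summary of the values, such as the median. A small sketch on toy data (the names group, value, explicit and by_median are made up for illustration):

```r
# Toy data (hypothetical, not the values from this post)
group <- c("a", "a", "a", "b", "b", "b", "c", "c", "c")
value <- c(3, 4, 5, 1, 2, 3, 7, 8, 9)

# Explicit order, as in the example above
explicit <- factor(group, levels = c("c", "a", "b"))

# Order the levels by increasing group median (a = 4, b = 2, c = 8)
by_median <- reorder(factor(group), value, FUN = median)

print(levels(explicit))
print(levels(by_median))
```

Either factor can then be used as the x aesthetic in ggplot() to control the order of the boxes.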


 

2 comments:

  1. Yes, it was very cool, but how do you know the labels of the outliers? That is the question!

  2. Hi Joyce,
    I'm afraid I had to figure this out manually.
