jeudi 5 octobre 2017

R: ggplot2 boxplots - wilcox.test pairwise per facet - How to avoid extra vertical space when there are no significant comparisons?

After many questions on how to make boxplots with facets and significance levels, particularly this and this, I still have one more little problem.

I managed to produce the plot shown below, which is exactly what I want.

The problem I am facing now is when I have very few, or no significant comparisons; in those cases, the whole space dedicated to the brackets showing the significance levels is still preserved, but I want to get rid of it.

Please check this MWE with the iris dataset:

library(reshape2)
library(ggplot2)
data(iris)
iris$treatment <- rep(c("A","B"), length(iris$Species)/2)
mydf <- melt(iris, measure.vars=names(iris)[1:4])
mydf$treatment <- as.factor(mydf$treatment)
mydf$variable <- factor(mydf$variable, levels=sort(levels(mydf$variable)))
mydf$both <- factor(paste(mydf$treatment, mydf$variable), levels=(unique(paste(mydf$treatment, mydf$variable))))
a <- combn(levels(mydf$both), 2, simplify = FALSE)#this 6 times, for each lipid class
b <- levels(mydf$Species)
CNb <- relist(
    paste(unlist(a), rep(b, each=sum(lengths(a)))), 
    rep.int(a, length(b))
)
CNb
CNb2 <- data.frame(matrix(unlist(CNb), ncol=2, byrow=T))
CNb2
#new p.values
pv.df <- data.frame()
for (gr in unique(mydf$Species)){
    for (i in 1:length(a)){
        tis <- a[[i]] #variable pair to test
        as <- subset(mydf, Species==gr & both %in% tis)
        pv <- wilcox.test(value ~ both, data=as)$p.value
        ddd <- data.table(as)
        asm <- as.data.frame(ddd[, list(value=mean(value)), by=list(both=both)])
        asm2 <- dcast(asm, .~both, value.var="value")[,-1]
        pf <- data.frame(group1=paste(tis[1], gr), group2=paste(tis[2], gr), mean.group1=asm2[,1], mean.group2=asm2[,2], log.FC.1over2=log2(asm2[,1]/asm2[,2]), p.value=pv)
        pv.df <- rbind(pv.df, pf)
    }
}
pv.df$p.adjust <- p.adjust(pv.df$p.value, method="BH")
colnames(CNb2) <- colnames(pv.df)[1:2]
# merge with the CN list
pv.final <- merge(CNb2, pv.df, by.x = c("group1", "group2"), by.y = c("group1", "group2"))
# fix ordering
pv.final <- pv.final[match(paste(CNb2$group1, CNb2$group2), paste(pv.final$group1, pv.final$group2)),]
# set signif level
pv.final$map.signif <- ifelse(pv.final$p.adjust > 0.05, "", ifelse(pv.final$p.adjust > 0.01,"*", "**"))
# subset
G <- pv.final$p.adjust <= 0.05
CNb[G]
P <- ggplot(mydf,aes(x=both, y=value)) +
    geom_boxplot(aes(fill=Species)) +
    facet_grid(~Species, scales="free", space="free_x") +
    theme(axis.text.x = element_text(angle=45, hjust=1)) +
    geom_signif(test="wilcox.test", comparisons = combn(levels(mydf$both),2, simplify = F),
              map_signif_level = F,            
              vjust=0.5,
              textsize=4,
              size=0.5,
              step_increase = 0.06)
P2 <- ggplot_build(P)

#pv.final$map.signif <- "" #UNCOMMENT THIS LINE TO MOCK A CASE WHERE THERE ARE NO SIGNIFICANT COMPARISONS
#pv.final$map.signif[c(1:42,44:80,82:84)] <- "" #UNCOMMENT THIS LINE TO MOCK A CASE WHERE THERE ARE JUST A COUPLE OF SIGNIFICANT COMPARISONS

P2$data[[2]]$annotation <- rep(pv.final$map.signif, each=3)
# remove non significants
P2$data[[2]] <- P2$data[[2]][P2$data[[2]]$annotation != "",]
# and the final plot
png(filename="test.png", height=800, width=800)
  plot(ggplot_gtable(P2))
dev.off()

Which produces this plot:

test1

The plot above is exactly what I want... But I am facing cases where there are no significant comparisons, or very few. In these cases, a lot of vertical space is left empty.

To exemplify those scenarios, we can uncomment the line:

pv.final$map.signif <- "" #UNCOMMENT THIS LINE TO MOCK A CASE WHERE THERE ARE NO SIGNIFICANT COMPARISONS

So when there are no significant comparisons I get this plot:

test2

If we uncomment this other line instead:

pv.final$map.signif[c(1:42,44:80,82:84)] <- "" #UNCOMMENT THIS LINE TO MOCK A CASE WHERE THERE ARE JUST A COUPLE OF SIGNIFICANT COMPARISONS

We are in a case where there are only a couple of significant comparisons, and obtain this plot:

test3

So my question here is:

How to adjust the vertical space to the number of significant comparisons, so no vertical space is left there?

There might be something I could change in step_increase or in y_position inside geom_signif(), so I only leave space for the significant comparisons in CNb[G]...

Thanks!

Aucun commentaire:

Enregistrer un commentaire