Learning to plot with BMC Plant Biology: a circular tree in R with ggtree, with group background colors
Paper
Comparative analysis of de novo genomes reveals dynamic intra‑species divergence of NLRs in pepper
Data and code
https://github.com/sdaf11111/NLR-map-in-pepper
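The authors put the example data and code for Figure 2 of the paper on GitHub, so we can study their code.
The example data are a tree file in nwk (Newick) format and a grouping file in CSV format.
New things I learned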
R package svglite
You may need this package when saving a figure in SVG format.
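A minimal sketch of where svglite comes in: ggsave() chooses its graphics device from the file extension, and for a .svg file it relies on svglite being installed. The toy plot, the object names and the temporary output path are illustrative assumptions, not part of the paper's code.

```r
library(ggplot2)
library(svglite)

# Toy plot standing in for the tree figure (hypothetical example data)
toy_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Write to a temporary file; the .svg extension selects the svglite device
svg_path <- file.path(tempdir(), "toy_plot.svg")
ggsave(filename = svg_path, plot = toy_plot, width = 4, height = 3)
```

The same call with a .pdf extension would write a PDF instead, which is what the plotting code further down does.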
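Function split()
It splits a vector or data frame into a list according to the values of a grouping column. That description may be hard to follow, so look at what the function returns: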
df<-data.frame(x=c("A","A","B","B","B"),
y=c(1,4,5,2,7))
df
split(df$y,df$x)
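To make the result concrete, the sketch below stores the list and inspects it. The object name groups is only for illustration.

```r
df <- data.frame(x = c("A", "A", "B", "B", "B"),
                 y = c(1, 4, 5, 2, 7))

# One list element per distinct value of x, holding the matching y values
groups <- split(df$y, df$x)

names(groups)  # the group labels become the element names
groups$A       # y values from the rows where x is "A"
groups$B       # y values from the rows where x is "B"
```

Here groups$A holds 1 and 4, and groups$B holds 5, 2 and 7. This named-list shape is what groupOTU() receives later on.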
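Function geom_aline()
A ggtree function that draws a line from each tip out to the outermost tip position, so that all tips end at the same level.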
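A minimal sketch of the effect on a small random tree, assuming the ape and ggtree packages are installed. The tree, seed and object names are made up for illustration and have nothing to do with the paper's data.

```r
library(ape)
library(ggtree)

set.seed(1)
toy_tree <- rtree(20)  # random tree whose tips sit at different distances from the root

# Plain circular tree: tips end at different radii
p_plain <- ggtree(toy_tree, layout = "circular")

# Same tree with a line added from each tip out to the outermost radius
p_aligned <- p_plain + geom_aline(linetype = "solid", alpha = 0.5)
```

Printing p_plain and p_aligned side by side shows the difference.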
Now for the actual code
First, load the R packages we need
library(dplyr)
library(ggtree)
library(ggplot2)
library(svglite)
library(scales)
Read the data
info <- read.csv("data/20230202/Annuum.Intact.NBARC.group.trimal92.csv")
head(info)
tree <- read.tree("data/20230202/Annuum.Intact.NBARC.tree.trimal92.nwk.txt")
Add the group information to the tree
groupInfo<-split(info$ID,info$Group)
tree<- groupOTU(tree, groupInfo)
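The same two steps can be tried on a toy tree. This is a sketch under assumptions: the four-tip tree and the table are made up, with column names ID and Group mirroring the ones used above. For a phylo object, groupOTU() stores the assignment in an attribute named "group", which is what aes(color=group) picks up when plotting.

```r
library(ape)
library(ggtree)

# Hypothetical four-tip tree and matching group table
toy_tree <- read.tree(text = "((A:1,B:1):1,(C:1,D:1):1);")
toy_info <- data.frame(ID    = c("A", "B", "C", "D"),
                       Group = c("G1", "G1", "G2", "G2"))

# Named list: one character vector of tip labels per group
toy_groups <- split(toy_info$ID, toy_info$Group)

# Attach the grouping to the tree
toy_tree <- groupOTU(toy_tree, toy_groups)

# The assignment is stored as a factor in the "group" attribute
group_attr <- attr(toy_tree, "group")
table(group_attr)
```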
Specify the colors
heatmap.colours <- c("#be9fe1","#8ac6d1","#e1ccec","#fddb3a",
"#C0C0C0","#c9b6e4","#d5c455","#ffb6b9","#fae3d9",
"#9aceff","#d7cde6","#bbded6","#ede59a","#4f98ca",
"#4a69bb","#f5cdaa","NA",
"#c3d14a","#63b637","#FF0000","#008000","#FF1493","#FF4500")
names(heatmap.colours) <- c("G1","G2","G3","G4",
"G5","G6","G7","G8","G9",
"G10","G11","G12","G13","GT",
"GR","G14","NG",
"CHIL","Know","CANN","CECW","CZUN","CASF")
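With this many groups it is easy to miss a label, so a quick check that every label has a color can help. The sketch below uses a shortened, hypothetical palette and label vector (the label G99 is invented). It also shows that R reads the string "NA" as a fully transparent color, which is presumably why the NG entry above is set to "NA".

```r
# Hypothetical labels and a shortened palette, for illustration only
toy_labels  <- c("G1", "G2", "NG", "G99")
toy_colours <- c(G1 = "#be9fe1", G2 = "#8ac6d1", NG = "NA")

# Labels that have no entry in the palette
missing_colours <- setdiff(unique(toy_labels), names(toy_colours))
missing_colours

# Alpha channel of the "NA" color: 0 means fully transparent
ng_alpha <- col2rgb(toy_colours[["NG"]], alpha = TRUE)["alpha", 1]
ng_alpha
```

On the real data the same check would be setdiff(unique(info$Group), names(heatmap.colours)).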
Plotting code
p <- ggtree(tree, layout='circular', size=0.2) %<+% info +
geom_aline(linetype="solid",
size=0.5, aes(color=group),
alpha=0.5) +
scale_colour_manual(values=heatmap.colours,
breaks=c("G1","G2","G3","G4","G5",
"G6","G7","G8","G9","G10",
"G11","G12","G13","GT","GR",
"G14","CHIL","Know","CANN",
"CECW","CZUN","CASF"),
name="Group") +
theme(legend.position="right")+
geom_tippoint(aes(color=Species), size=0.2)+
guides(color=guide_legend(ncol=6))
q <- flip(p, 3001, 4090) %>% rotate(3507)
ggsave(file="NLR_tree.92.pdf", plot=q,
width = 9.4,height = 4)
One issue here: the effect of geom_aline() is not visible in the RStudio plot pane, but it does show up once the figure is saved as a PDF. I do not yet know why.
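The code provided with the paper also has a part that computes the MRCA. I have not worked that part out yet, and I will cover it once I have.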
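For orientation only: the MRCA (most recent common ancestor) of a set of tips is the internal node where their lineages meet. The sketch below illustrates the concept on a hypothetical four-tip tree with ape::getMRCA(). It is not the paper's code, and whether the authors used the MRCA to obtain internal node numbers like those passed to flip() and rotate() is an assumption.

```r
library(ape)

# Hypothetical four-tip tree, unrelated to the pepper NLR tree
toy_tree <- read.tree(text = "((A:1,B:1):1,(C:1,D:1):1);")
n_tips <- Ntip(toy_tree)

# Tips are numbered 1..n_tips; internal nodes get larger numbers,
# and the root of a tree read by ape is node n_tips + 1
mrca_ab <- getMRCA(toy_tree, c("A", "B"))  # node joining A and B
mrca_ac <- getMRCA(toy_tree, c("A", "C"))  # A and C only meet at the root

mrca_ab
mrca_ac
```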
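The example data and code can be downloaded from the link given in the paper. Alternatively, like this post, tap "Wow" (在看), and leave a comment to get them.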
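You are welcome to follow my WeChat public account, 小明的数据分析笔记本.
It mainly shares: 1. simple examples of data analysis and data visualization in R and Python; 2. reading notes on transcriptomics, genomics and population genetics papers in horticultural plants; 3. introductory bioinformatics learning materials and my own study notes.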