2022, Henrik Kaessmann, Nature
The molecular evolution of spermatogenesis across mammals
Paper
Phylogenetic tree source: http://www.timetree.org/
Database
https://apps.kaessmannlab.org/SpermEvol/
GitHub
Information in the Abstract and Introduction
The "however"
The molecular evolution of individual spermatogenic cell types across mammals remains largely uncharacterized.
1. Spermatogenesis across 11 species
1.1. PCA
Figure. 1b
2. Rates of evolution along spermatogenesis
2.1 Gene expression tree / Gene expression phylogeny
Figure. 1c
- Calculate the pairwise expression distance matrices (1 - Spearman's ρ) with 4,498 1:1 amniote orthologues. The genes were randomly sampled with replacement 1,000 times (4,498 genes per replicate); this resampling also defines the bootstrap values:
sampled.data <- data[sample(x = 1:dim(data)[1], size = dim(data)[1], replace = TRUE), ]
- The neighbour-joining trees were constructed using the ape R package v.5.3
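The resampling and distance steps above can be sketched in a few lines. The paper's analysis was done in R; the following is only a minimal Python illustration, assuming expression values are given as one list per species with genes in the same order. The function names and the toy data are hypothetical, and the tree building itself (done with ape in the paper) is not reproduced here.

```python
import random


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equally long numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman_distance(x, y):
    """Expression distance defined as 1 - Spearman's rho."""
    return 1.0 - pearson(ranks(x), ranks(y))


def distance_matrix(expr):
    """expr maps species name -> expression values (same gene order)."""
    names = list(expr)
    return {a: {b: spearman_distance(expr[a], expr[b]) for b in names}
            for a in names}


def bootstrap_distance_matrices(expr, n_boot, seed=0):
    """Resample genes with replacement and recompute the matrix each time."""
    rng = random.Random(seed)
    n_genes = len(next(iter(expr.values())))
    matrices = []
    for _ in range(n_boot):
        idx = [rng.randrange(n_genes) for _ in range(n_genes)]
        resampled = {sp: [vals[i] for i in idx] for sp, vals in expr.items()}
        matrices.append(distance_matrix(resampled))
    return matrices


# Toy data (hypothetical values, 8 genes instead of 4,498).
toy_expr = {
    "human":   [5.0, 1.0, 8.0, 3.0, 9.0, 2.0, 7.0, 4.0],
    "mouse":   [4.5, 1.5, 7.0, 2.0, 9.5, 3.0, 6.0, 5.0],
    "chicken": [1.0, 6.0, 2.0, 8.0, 3.0, 9.0, 4.0, 7.0],
}
toy_matrices = bootstrap_distance_matrices(toy_expr, n_boot=10)
```

Each replicate matrix would then be passed to a neighbour-joining routine, and the share of replicate trees that recover a given branch gives its bootstrap value.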
3. Evolutionary forces
3.1 pLI
Figure. 2c
Data reference:
Analysis of protein-coding genetic variation in 60,706 humans
http://www.nature.com/doifinder/10.1038/nature19057
- Purpose:
to investigate patterns of functional constraint during spermatogenesis
- Conclusion:
showed a progressive increase of mutational tolerance starting during meiosis and culminating in early spermiogenesis
3.2 Lethal gene fraction
Figure. 2d
Conclusion:
the percentage of expressed genes associated with lethality decreases during spermatogenesis
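As a rough illustration of the quantity plotted in Fig. 2d, the percentage of a cell's expressed genes that belong to a gene set (for example, lethality-associated genes) could be computed as below. This is a sketch only: the expression threshold `min_count`, the function name and the gene identifiers are assumptions, not the paper's exact procedure.

```python
def percent_expressed_in_set(cell_counts, gene_set, min_count=1):
    """Percentage of the expressed genes in one cell that are in gene_set.

    cell_counts maps gene -> count for a single cell. A gene counts as
    expressed when its count is >= min_count (an assumed threshold).
    """
    expressed = [gene for gene, count in cell_counts.items()
                 if count >= min_count]
    if not expressed:
        return float("nan")
    hits = sum(1 for gene in expressed if gene in gene_set)
    return 100.0 * hits / len(expressed)


# Hypothetical cell: genes A, C, D and E are expressed; A and E are "lethal".
cell = {"A": 3, "B": 0, "C": 1, "D": 2, "E": 5}
lethal_genes = {"A", "B", "E"}
print(percent_expressed_in_set(cell, lethal_genes))  # 50.0
```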
3.3 dN/dS
Figure. 2e
dN/dS = nonsynonymous substitution rate (dN) / synonymous substitution rate (dS)
Conclusion:
the normalized rate of amino acid altering substitutions in coding sequences across primates is higher in late spermatogenesis
Calculation of dN/dS:
- For Fig. 2e (and Extended Data Fig. 3a), we used the average dN/dS values across 1:1 orthologues in primates. For each cell, the mean dN/dS value is plotted. Conserved 1:1 orthologues across six primates (human, chimpanzee, gorilla, gibbon, macaque and marmoset) as well as their coding and protein sequences were extracted from Ensembl, providing a set of 11,791 protein-coding genes.
- For each species and orthologue the longest transcript was extracted. Orthologous protein sequences were aligned using clustalo v.1.2.4; then pal2nal v14 was used (with protein sequence alignments and coding sequences as input) to produce codon-based alignments.
- The codeml software from the PAML package v.4.9 was used to estimate dN/dS ratios.
- The M0 site model was applied to the orthologue alignments to estimate one average dN/dS ratio per orthologous gene set across species (parameter NSsites = 0, model = 0).
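To make the codon-alignment step above more concrete, here is a minimal sketch of the idea behind converting a protein alignment plus the unaligned coding sequence into a codon-based alignment. It is not pal2nal itself, which also handles mismatches, frameshifts and multiple output formats; the function name `thread_codons` and the toy sequences are hypothetical.

```python
def thread_codons(aligned_protein, cds):
    """Map an unaligned coding sequence onto an aligned protein sequence.

    Every residue in the protein alignment consumes the next codon of the
    CDS; every gap character '-' becomes a three-nucleotide gap '---'.
    A trailing stop codon in the CDS, if present, is simply left unused.
    """
    n_residues = sum(1 for aa in aligned_protein if aa != "-")
    if len(cds) < 3 * n_residues:
        raise ValueError("coding sequence is too short for this protein")
    codons = []
    position = 0
    for aa in aligned_protein:
        if aa == "-":
            codons.append("---")
        else:
            codons.append(cds[position:position + 3])
            position += 3
    return "".join(codons)


# Two toy orthologues: the second lacks one residue relative to the first.
print(thread_codons("MAKV", "ATGGCTAAAGTG"))  # ATGGCTAAAGTG
print(thread_codons("M-KV", "ATGAAAGTT"))     # ATG---AAAGTT
```

Keeping gaps in multiples of three preserves the reading frame, which is what lets codeml count synonymous and nonsynonymous changes codon by codon.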
3.4 Positively selected gene expression rate
Figure. 2f
Data references:
https://doi.org/10.1093/nar/gkx704
https://doi.org/10.1371/journal.pgen.1000144
Conclusion:
a notable increase in percentages of positively selected genes used during spermatogenesis, with a peak in rSD
3.5 Transcriptome Phylogenetic Age
Figure. 2g
Transcriptome Phylogenetic Age:
an index that gives greater weight to young new genes
Conclusion:
new genes have increasingly more prominent roles in later stages, particularly in rSD
Previous work based on bulk cell-type analyses in mouse (ref. 11) uncovered a transcriptionally permissive chromatin environment during spermatogenesis, in particular in rSD, which was suggested to have facilitated the emergence of new genes during evolution.
Calculation of Transcriptome Phylogenetic Age:
- The age of the duplicates was determined based on syntenic alignments across vertebrates and parsimony as previously described (ref. 60). The pipeline was run for human, mouse, rat and chicken (based on Ensembl annotations).
- Genes predating the vertebrate split were given a score of 1, genes shared by amniotes were given a score of 2 and so on, until genes that are species-specific were given the maximum score.
- The range of the score differed between species depending on the number of outgroup lineages available (more lineages allowed for more details in the phylogeny) and therefore this index cannot be compared across species, only within species (that is, across organs).
- The score assigned to each gene was multiplied by the expression of the gene (but only if RPKM > 1).
- The results reported used the log2-transformed RPKM values but similar trends were obtained using the raw RPKM values.
- Higher values indicate larger contributions of recently duplicated genes (that is, younger transcriptomes).
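The scoring steps above can be sketched as an expression-weighted mean of gene age scores. Dividing by the summed expression follows the usual definition of a transcriptome age index and is an assumption here, since the notes only state that each score is multiplied by expression. The function name, gene names and values are hypothetical.

```python
import math


def transcriptome_age_index(age_score, rpkm, min_rpkm=1.0, log_transform=True):
    """Expression-weighted mean age score for one sample.

    age_score maps gene -> phylogenetic score (1 = oldest, larger = younger).
    rpkm maps gene -> expression. Only genes with RPKM > min_rpkm contribute,
    so log2-transformed weights are always positive.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for gene, value in rpkm.items():
        if gene not in age_score or value <= min_rpkm:
            continue
        weight = math.log2(value) if log_transform else value
        weighted_sum += age_score[gene] * weight
        total_weight += weight
    if total_weight == 0:
        return float("nan")
    return weighted_sum / total_weight


# Hypothetical scores: 1 = predates vertebrates, 4 = species-specific.
scores = {"old_gene": 1, "mid_gene": 2, "young_gene": 4}
early_cell_type = {"old_gene": 16.0, "mid_gene": 4.0, "young_gene": 0.5}
late_cell_type = {"old_gene": 2.0, "mid_gene": 4.0, "young_gene": 16.0}

print(transcriptome_age_index(scores, late_cell_type))  # 3.0
print(transcriptome_age_index(scores, early_cell_type)
      < transcriptome_age_index(scores, late_cell_type))  # True
```

In this toy case the late sample scores higher because the young gene dominates its expression, which is the pattern the index is meant to capture.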
Gene age data source: http://gentree.ioz.ac.cn/
Paper: https://genome.cshlp.org/content/29/4/682.full.pdf+html
3.6 Intergenic gene expression
Figure. 2h
Conclusion:
in all species considerably increased contributions of intergenic transcripts after meiosis and a concomitant decrease in the contributions of protein-coding genes
3.7 Translational efficiency
Figure. 2i
Conclusion:
a decline of translational efficiencies of transcripts during spermatogenesis (reaching a minimum in rSD) in all species
3.8 Expression pleiotropy
Figure. 2j
Expression pleiotropy: The breadth of expression across tissues and developmental processes
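One common way to put a number on expression breadth is the tau specificity index, where 0 means equally expressed everywhere and 1 means expressed in a single tissue or stage. Whether this matches the exact metric used in the paper is an assumption; the sketch below only illustrates the general idea, and the function name and values are hypothetical.

```python
def tau_index(expression):
    """Tau specificity index for one gene across tissues or stages.

    expression is a list of non-negative expression values, one per tissue.
    Returns 0.0 for uniform expression and 1.0 for expression confined to a
    single tissue (that is, low pleiotropy).
    """
    n = len(expression)
    if n < 2:
        raise ValueError("need at least two tissues or stages")
    peak = max(expression)
    if peak == 0:
        return float("nan")  # gene not expressed anywhere
    return sum(1.0 - value / peak for value in expression) / (n - 1)


print(tau_index([5.0, 5.0, 5.0, 5.0]))   # 0.0 (broad, high pleiotropy)
print(tau_index([8.0, 4.0, 0.0]))        # 0.75
print(tau_index([10.0, 0.0, 0.0, 0.0]))  # 1.0 (specific, low pleiotropy)
```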
Question: Expression pleiotropy was proposed to represent a key determinant of the types of mutation that are permissible under selection?
This statement is based on the article: https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.0030245
Explanation in the paper:
A decrease in expression pleiotropy can explain both a decrease in functional constraints and an increase in adaptation
Purpose:
explored the reasons underlying the dynamic changes of selective forces and patterns of innovation during spermatogenesis
Conclusion: genes used later in spermatogenesis, in particular those in rSD, have substantially more specific spatiotemporal profiles than genes used earlier in spermatogenesis and in somatic cells
Data reference:
https://doi.org/10.1038/s41586-019-1338-5
Key inference:
Given that a decrease in expression pleiotropy can explain both a decrease in functional constraints and an increase in adaptation, we suggest that it is probably a main contributor to the accelerated molecular evolution in late spermatogenesis.
The specific type of selection acting on haploid cells (haploid selection), in which expressed alleles are directly exposed to selection, may have contributed to the exceptionally rapid evolution of rSD.
3.9 Infertility
Figure. 2k
Conclusion:
the proportion of genes associated with infertility is relatively high in SC and spermatids (especially rSD): higher than in SG and somatic cells
3.10 Some conclusions
Figure. 2d + Figure. 2j + Figure. 2k
Whereas the tissue- and time-specific late spermatogenic genes, in general, are not essential for viability (Fig. 2d,j and Extended Data Fig. 3g,h, above), we proposed that the specific aforementioned evolutionary forces indicate that many of these genes evolved crucial roles in spermatogenesis
4. Gene expression conservation and innovation
We next sought to trace the individual genes underlying conserved (ancestral) and diverged aspects of germ cells by comparing expression trajectories along spermatogenesis of one-to-one (1:1) orthologous genes across species.
the temporal expression of 1,687 genes is conserved across the seven primates and probably reflects the core ancestral gene expression program of the simian testis
Evaluate the conservation of gene expression trajectory
Supplementary Fig. 2a,b
4.1 Expression trajectory conservation and fertility
Extended Data Fig. 5b
Conclusion:
genes involved in fertility are significantly more conserved in their expression trajectories than genes not associated with fertility
4.2 Expression trajectory conservation and GO
Extended Data Fig. 6a,b
Conclusion:
Involvement of conserved genes in fundamental spermatogenic processes
Top 20 enriched biological process GO terms of genes showing conserved expression trajectories across primates and peak expression in the different cell types, respectively.
4.3 Expression trajectory change and GO
Extended Data Fig. 6c,d
Conclusion:
genes with lineage-specific trajectory changes are enriched with broader, typically metabolic, processes
4.4 Expression trajectory change and tissue specificity
Extended Data Fig. 6e
Conclusion:
Genes with changed expression are also significantly more tissue- and time-specific than genes with conserved expression, which may have facilitated their expression change during evolution because of reduced pleiotropic constraints.
4.5 New gene and evolution of new spermatogenic functions
Extended Data Fig. 7a,b
Conclusion:
emergence of key spermatogenic genes at different time points during evolution
new genes contributing to functional roles predominantly during late spermatogenesis
4.6 Cell communications between Sertoli cells and Germline cells
Extended Data Fig. 8a,b
Conclusion:
cross-species comparisons revealed various conserved known and new ligand–receptor interactions
5. Sex chromosomes
systematically assess testicular expression patterns of sex chromosomal genes and their evolution across mammals
5.1 Sex chromosome system
Figure. 4a
Conclusion:
The therian XY chromosome system originated just before the split of eutherians and marsupials and hence has evolved largely independently in these two lineages. Around the same time, the monotreme sex-chromosome system arose from a different pair of autosomes and subsequently expanded to five XY pairs.
5.2 Expression specificity of X-linked genes
Conclusion:
A notable excess (12–60%) of X-linked genes with predominant expression in SG across all eutherians, in agreement with a previous mouse study
enrichments of genes with SG-specific expression also on the opossum and platypus X chromosomes, thus revealing this to be a shared pattern across mammals
Figure. 4b
Extended Data Fig. 9a
5.3 Expression specificity of the X chromosome is not ancestral
Extended Data Fig. 9a,c
Conclusion:
autosomes in outgroup species corresponding to the different mammalian X chromosomes (for example, platypus chromosome 6, which is homologous to the therian X) do not show any excess of SG- or Sertoli-expressed genes, which means that the ancestral autosomes that gave rise to present-day sex chromosomes were not enriched for such genes
5.4 Differential expression analysis between X and Y spermatids
A differential expression analysis between X and Y spermatids identified, as expected, most sex-chromosome genes, including gametologues (that is, genes with homologous counterparts on X and Y chromosomes), such as the translational regulatory genes
Figure. 4c
Extended Data Fig. 11
5.5 MSCI also exists in monotremes
Figure. 4d