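# Lowercase everything, capitalize the first letter of each word (GNU sed), then replace spaces with commas
cat tmp2 | tr A-Z a-z | sed 's/^\w\|\s\w/\U&/g' | tr " " "," > tmp3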
samtools view -hb wgs.sort.bam chr:start-end > target.region.bam
samtools view -hb -L target.bed wgs.sort.bam > target.region.bam
bedtools intersect -a wgs.sort.bam -b target.bed > target.region.bam
sambamba view -h -f bam wgs.sort.bam chr:start-end > target.region.bam
# To extract regions listed in a BED file, use `sambamba slice`
sambamba slice -L target.bed wgs.sort.bam > target.region.bam
# sambamba slice -L is the fastest option and uses the fewest resources
Convert gff/gtf to GenBank format, ref: https://www.biostars.org/p/72220/
The EMBOSS tool seqret would be a possible option.
seqret -sequence reference.fasta -feature -fformat gff -fopenfile 1.gff -osformat genbank -auto
awk '{if($2<10) print $1"\t"($2-10); else print $1"\t"($2+10)}' input > output
# Turn each "old<TAB>new" pair in rename.txt into a command: sed -i 's#old#new#g' input
awk '{print $1"\t"$2}' rename.txt | tr "\t" "#" | awk '{print "sed -i '\''s#"$1"#g'\'' input"}' > run.sh
sh run.sh
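The generated run.sh rewrites the input file once per rename pair. A possible alternative is to build one sed script and apply it in a single pass. This is only a sketch, and it assumes rename.txt has two whitespace-separated columns (old, new) containing no `#` and no regex metacharacters. The function name `make_rename_script` and the demo file contents are hypothetical.

```shell
# Build a sed script with one substitution per line of a two-column rename table.
# $1: path to the rename table (old name, new name).
make_rename_script() {
    awk '{print "s#" $1 "#" $2 "#g"}' "$1"
}

# Demo on throwaway files (hypothetical names and contents).
tmpdir=$(mktemp -d)
printf 'geneA\tGENE_1\ngeneB\tGENE_2\n' > "$tmpdir/rename.txt"
printf 'geneA geneB geneA\n' > "$tmpdir/input"

make_rename_script "$tmpdir/rename.txt" > "$tmpdir/rename.sed"
sed -f "$tmpdir/rename.sed" "$tmpdir/input"   # prints: GENE_1 GENE_2 GENE_1

rm -rf "$tmpdir"
```

As with run.sh, substitutions are applied in table order, so a new name that matches a later old name will be renamed again.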
Length(0-based) = End(0-based) - Start(0-based)
Length(1-based) = End(1-based) - Start(1-based) + 1
Start(1-based) = Start(0-based) + 1
End(1-based) = End(0-based)
Start(0-based) = Start(1-based) - 1
End(0-based) = End(1-based)
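The conversion rules above can be sketched as two small awk filters. The function names are hypothetical, and the input is assumed to be tab-separated chrom, start, end, with 0-based meaning half-open (BED style) and 1-based meaning closed (GFF/VCF style). Each filter appends the interval length as a fourth column so the two length formulas can be compared.

```shell
# 0-based half-open -> 1-based closed: start + 1, end unchanged.
# Length is computed from the 0-based input as end - start.
bed_to_1based() {
    awk 'BEGIN{FS=OFS="\t"} {print $1, $2 + 1, $3, $3 - $2}'
}

# 1-based closed -> 0-based half-open: start - 1, end unchanged.
# Length is computed from the 1-based input as end - start + 1.
onebased_to_bed() {
    awk 'BEGIN{FS=OFS="\t"} {print $1, $2 - 1, $3, $3 - $2 + 1}'
}

# The same 100 bp interval in both systems gives the same length.
printf 'chr1\t0\t100\n' | bed_to_1based     # chr1  1  100  100
printf 'chr1\t1\t100\n' | onebased_to_bed   # chr1  0  100  100
```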
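Sometimes the two columns on the left (geneChr/transcriptid) and the third column (distanceToTSS) do not agree.
This is because the two left columns report which gene the input BED region (e.g. a peak) falls on,
while the right-hand, third column reports the distance to the TSS of the gene nearest to that peak.
Meaning of the Von Neumann Entropy (VNE) index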
This likely reflects a more disordered (the high-entropy status) and relaxed chromatin architecture at early development (E38 and E80) (Fig. 2b).
In agreement with the phenomenon that 3D structure in early mammalian embryos is initially obscure but gradually established throughout development [45–47], the relatively loose chromatin folding highlights a highly plastic state for hepatocyte genomes at the early stages of development and may be essential for the rapid functional transitions in the liver before and after birth.
We observed a significantly higher VNE in the POF stage (0.86, P < 0.016, Wilcoxon rank-sum test) than in the SWF (0.80) and F1 stages (0.79) (Fig. 2a). This is likely due to a more disordered and relaxed chromatin architecture in the POF stage (Fig. 2b), while the architecture is more stable and ordered in mature GCs at the F1 stages, which aligns with the relaxed genome architecture observed during senescence
The following sentence is from the article https://doi.org/10.1080/19491034.2021.1910437, which cites two references for it.
Biologically, genomic regions with high entropy likely correlate with high proportions of euchromatin, as euchromatin is more structurally permissive than heterochromatin [1, 2]
1. MacArthur BD, Lemischka IR. Statistical mechanics of pluripotency. Cell. 2013;154(3):484–489.
2. Rajapakse I, Groudine M, Mesbahi M. What can systems theory of networks offer to biology? PLoS Comput Biol. 2012;8(6):e1002543.
The following two passages are from the article: https://doi.org/10.1016/j.neo.2020.12.010
In the context of genome structure, the higher the entropy, the more conformations available to the system [46]. If the distant ends of a genomic region, e.g., a gene, interact to form a loop, there are fewer conformations available to the gene and thus the entropy of that genomic region is reduced.
46. Phillips, Rob, et al. "Physical biology of the cell." American Journal of Physics 78.11 (2010): 1230-1230.
We apply one such approach - a derivative of VNE - to measure local chromatin organization of individual gene regions [59]. Higher VNE values indicate that the number of conformations available to the gene and its immediate neighborhood are higher, indicating that chromatin is more accessible.
The more disordered (and permissive) chromatin in the pgEpiSCs was also evident based on its high-entropy status.
The text then cites the figure below; the caption of panel d in that figure is: The extent of disorder in chromatin structure (quantified by the Von Neumann Entropy (VNE))
We found that Di-SG had higher entropy (Fig. 1C), suggestive of less compact chromatin structural organization in Di-SG.
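For reference, the excerpts above all interpret the same quantity. The general definition of Von Neumann entropy is given below. How each paper builds the matrix is not stated in these excerpts, so treat C here as an assumption: a symmetric positive semidefinite matrix (for example, a correlation matrix derived from the Hi-C contact map of a region), normalized to unit trace, with eigenvalues \lambda_i.

```latex
\rho = \frac{C}{\operatorname{tr}(C)}, \qquad
S(\rho) = -\operatorname{tr}(\rho \ln \rho) = -\sum_{i=1}^{n} \lambda_i \ln \lambda_i, \qquad
\lambda_i \ge 0,\quad \sum_{i=1}^{n} \lambda_i = 1
```

With 0 \ln 0 taken as 0, S is 0 when a single eigenvalue carries all the weight (one dominant, ordered pattern) and reaches its maximum \ln n when all n eigenvalues are equal, which matches the reading of high VNE as more disordered, relaxed chromatin.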
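https://github.com/HuiyangYu/PanDepth: a very fast, efficient, low-memory tool that computes depth and coverage over a genome (and a gene set) from SAM/BAM/CRAM. Even very large BAM files (tens of GB) take only a minute or two. In addition, its default memory use is about one tenth of bamdeal's.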
Rather than reporting so much detail in the abstract, it might be better to make a more general statement like: "Deletions affecting introns and/or coding regions of numerous genes may have contributed to phenotypic differences between A. baiyi and other Ablax species"
Comparative Recombination Rates in the Rat, Mouse, and Human Genomes
awk '{print $1"\t"$4"\t"$4*0.000554779412}' Chr27.map | sort -Vk 1 | awk '{print "Chr"$1"\t"$2"\t"$3}' > Chr27.genetic.map
The third column is the SNP's pos * 0.000554779412
