Genome assembly with Verkko: reaching T2T-level assemblies (a rival to hifiasm)

Verkko is a new tool for telomere-to-telomere (T2T) genome assembly.

  • Rautiainen, M., Nurk, S., Walenz, B.P. et al. Telomere-to-telomere assembly of diploid chromosomes with Verkko. Nat Biotechnol (2023)

    [Figure: Verkko pipeline overview]

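    As shown in the figure above, the key components of the pipeline are Canu, MBG, GraphAligner, and Rukki. Integrating them lets Verkko process long-read sequencing input automatically and produce a highly contiguous, highly accurate haplotype-resolved genome. With high-quality long-read input, the assembly can reach T2T level.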

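# Installation; the environment requires python=3.7
conda install -c conda-forge -c bioconda -c defaults verkko
# Run; this example uses HiFi reads only
verkko -d /home/verkko_assemb --hifi hifiseq_data.fasta --no-nano --threads 30
## If ONT reads are available in addition to PacBio HiFi reads, omit --no-nano and pass the ONT reads with --nano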
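
For the combined HiFi + ONT case mentioned in the note above, a minimal sketch might look like the following. It uses only the `-d`, `--hifi`, and `--nano` options that appear in verkko's help output; the output directory and read file names are placeholders, and `RUN_VERKKO` is a hypothetical guard variable introduced here so the command is only printed unless you explicitly opt in to running it.

```shell
#!/bin/sh
# Sketch only: paths and file names below are placeholders, not real data.
set -eu

outdir="asm_hifi_ont"              # assembly output directory (placeholder)
hifi_reads="hifi_reads.fastq.gz"   # PacBio HiFi reads (placeholder)
ont_reads="ont_reads.fastq.gz"     # Oxford Nanopore reads (placeholder)

# Build the command as positional parameters so it can be previewed and run
# from a single definition. Note that --no-nano is absent because ONT reads
# are supplied with --nano.
set -- verkko -d "$outdir" --hifi "$hifi_reads" --nano "$ont_reads"

cmd_preview="$*"
echo "$cmd_preview"

# RUN_VERKKO is a hypothetical switch for this sketch; set it to 1 to execute.
if [ "${RUN_VERKKO:-0}" = "1" ]; then
    "$@"
fi
```

Running the script as-is only prints the command; invoking it with `RUN_VERKKO=1` starts the assembly, provided verkko is on the `PATH` and the read files exist.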

# verkko parameters
  MANDATORY PARAMETERS:
    -d <output-directory>    Directory to use for verkko intermediate and final results.
                             Will be created if needed.
    --hifi <files ...>       List of files containing PacBio HiFi reads.
    --nano <files ...>       List of files containing Oxford Nanopore reads.

                             Input reads can be any combination of FASTA/FASTQ,
                             uncompressed or gzip/bzip2/xz compressed.  Any
                             number of files can be supplied; *.gz works.

  ALGORITHM PARAMETERS:
    --no-correction          Do not perform Canu correction on the HiFi reads.
    --no-nano                Assemble without ONT data.

    --hap-kmers h1 h2 type  Use rukki to assign paths to haplotypes.  'h1' and 'h2'
                            must be Meryl databases of homopolymer-compressed parental
                            kmers.  'type' must be 'trio', 'hic' or 'strandseq'.

    --base-k
    --max-k
    --window
    --threads

    --split-bases
    --split-reads
    --min-ont-length

    --correct-k-mer-size
    --correct-mer-threshold
    --correct-min-read-length
    --correct-min-overlap-length
    --correct-hash-bits

    --seed-min-length
    --seed-max-length
    --align-bandwidth
    --score-fraction
    --min-identity
    --min-score
    --end-clipping
    --incompatible-cutoff
    --max-trace

  COMPUTATIONAL PARAMETERS:
    --python <interpreter>   Path or name of a python interpreter.  Default: 'python'.
    --mbg <path>             Path to MBG.             Default for both is the
    --graphaligner <path>    Path to GraphAligner.    one packaged with verkko.

    --cleanup                Remove intermediate results.
    --no-cleanup             Retain intermediate results (default).

    --local                  Run on the local machine (default).
    --local-memory           Specify the upper limit on memory to use, in GB, default 64
    --local-cpus             Specify the number of CPUs to use, default 'all'

    --sge                    Enable Sun Grid Engine support.
    --slurm                  Enable Slurm support.
    --lsf                    Enable IBM Spectrum LSF support.

    --snakeopts <string>     Append snakemake options in "string" to the
                             snakemake command.  Options MUST be quoted.

    --sto-run                Set resource limits for various stages.
    --mer-run                Format: number-of-cpus memory-in-gb time-in-hours
    --ovb-run                  --cns-run 8 32 2
    --ovs-run
    --red-run
    --mbg-run
    --utg-run
    --spl-run
    --ali-run
    --pop-run
    --utp-run
    --lay-run
    --sub-run
    --par-run
    --cns-run
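
To illustrate how several of the options listed above combine, here is a hedged sketch of a trio-phased run submitted through Slurm, with the consensus stage limited to 8 CPUs, 32 GB of memory, and 2 hours (the `--cns-run 8 32 2` example from the help text). All flags come from the help output above; the directory, read files, and Meryl database names are placeholders, and the databases are assumed to already exist as homopolymer-compressed parental k-mer databases, as `--hap-kmers` requires. `RUN_VERKKO` is a hypothetical guard variable for this sketch.

```shell
#!/bin/sh
# Sketch only: every path below is a placeholder.
set -eu

outdir="asm_trio"                  # assembly output directory (placeholder)
hifi_reads="hifi_reads.fastq.gz"   # PacBio HiFi reads (placeholder)
ont_reads="ont_reads.fastq.gz"     # Oxford Nanopore reads (placeholder)
mat_db="maternal.hapmer.meryl"     # maternal Meryl k-mer database (placeholder)
pat_db="paternal.hapmer.meryl"     # paternal Meryl k-mer database (placeholder)

# --hap-kmers takes two databases plus the phasing type ('trio' here).
# --slurm hands the jobs to Slurm; --cns-run sets cpus, memory (GB), hours.
set -- verkko -d "$outdir" \
    --hifi "$hifi_reads" --nano "$ont_reads" \
    --hap-kmers "$mat_db" "$pat_db" trio \
    --slurm --cns-run 8 32 2

cmd_preview="$*"
echo "$cmd_preview"

# RUN_VERKKO is a hypothetical switch for this sketch; set it to 1 to execute.
if [ "${RUN_VERKKO:-0}" = "1" ]; then
    "$@"
fi
```

On a single workstation, dropping `--slurm` falls back to the default local mode, where `--local-memory` and `--local-cpus` cap the resources verkko may use.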

Copyright notice:
Author: Zad
Link: https://www.techfm.club/p/55986.html
Source: TechFM
Copyright belongs to the author. Do not reproduce without permission.
