ABSTRACT
Motivation: Identifying de novo tandem repeat (TR) mutations on a genome-wide scale is essential for understanding genetic variability and its implications in rare diseases. While PacBio HiFi sequencing data enhances the accessibility of the genome's TR regions for genotyping, simple de novo calling strategies often generate an excess of likely false positives, which can obscure true positive findings, particularly as the number of surveyed genomic regions increases. Results: We developed TRGT-denovo, a computational method designed to accurately identify all types of de novo TR mutations, including expansions, contractions, and compositional changes, within family trios. TRGT-denovo directly interrogates read evidence, allowing for the detection of subtle variations often overlooked in variant call format (VCF) files. TRGT-denovo improves the precision and specificity of de novo mutation (DNM) identification, reducing the number of de novo candidates by an order of magnitude compared to genotype-based approaches. In our experiments involving eight previously studied rare disease trios, TRGT-denovo correctly reclassified all false positive DNM candidates as true negatives. Using an expanded repeat catalog, it identified new candidates, of which 95% (19/20) were experimentally validated, demonstrating its effectiveness in minimizing likely false positives while maintaining high sensitivity for true discoveries. Availability and implementation: Built in Rust, TRGT-denovo is available as source code and a pre-compiled Linux binary along with a user guide at: https://github.com/PacificBiosciences/trgt-denovo.
ABSTRACT
Chronic myelomonocytic leukemia (CMML) is a hematologic malignancy nearly confined to the elderly. Previous studies to determine the incidence and prognostic significance of somatic mutations in CMML have relied on candidate gene sequencing, and an unbiased mutational search has not been conducted. As many of the genes commonly mutated in CMML were recently associated with age-related clonal hematopoiesis (ARCH) and aged hematopoiesis is characterized by a myelomonocytic differentiation bias, we hypothesized that CMML and aged hematopoiesis may be closely related. We initially established the somatic mutation landscape of CMML by whole exome sequencing followed by gene-targeted validation. Genes mutated in ⩾10% of patients were SRSF2, TET2, ASXL1, RUNX1, SETBP1, KRAS, EZH2, CBL and NRAS, as well as the novel CMML genes FAT4, ARIH1, DNAH2 and CSMD1. Most CMML patients (71%) had mutations in ⩾2 ARCH genes and 52% had ⩾7 mutations overall. Higher mutation burden was associated with shorter survival. Age-adjusted population incidence and reported ARCH mutation rates are consistent with a model in which clinical CMML ensues when a sufficient number of stochastically acquired age-related mutations have accumulated, suggesting that CMML represents the leukemic conversion of the myelomonocytic-lineage-biased aged hematopoietic system.