Results 1 - 20 of 83
1.
Elife ; 12, 2024 Feb 21.
Article in English | MEDLINE | ID: mdl-38381482

ABSTRACT

Maintaining germline genome integrity is essential and enormously complex. Although many proteins are involved in DNA replication, proofreading, and repair, mutator alleles have largely eluded detection in mammals. DNA replication and repair proteins often recognize sequence motifs or excise lesions at specific nucleotides. Thus, we might expect that the spectrum of de novo mutations - the frequencies of C>T, A>G, etc. - will differ between genomes that harbor either a mutator or wild-type allele. Previously, we used quantitative trait locus mapping to discover candidate mutator alleles in the DNA repair gene Mutyh that increased the C>A germline mutation rate in a family of inbred mice known as the BXDs (Sasani et al., 2022, Ashbrook et al., 2021). In this study we developed a new method to detect alleles associated with mutation spectrum variation and applied it to mutation data from the BXDs. We discovered an additional C>A mutator locus on chromosome 6 that overlaps Ogg1, a DNA glycosylase involved in the same base-excision repair network as Mutyh (David et al., 2007). Its effect depends on the presence of a mutator allele near Mutyh, and BXDs with mutator alleles at both loci have greater numbers of C>A mutations than those with mutator alleles at either locus alone. Our new methods for analyzing mutation spectra reveal evidence of epistasis between germline mutator alleles and may be applicable to mutation data from humans and other model organisms.


Subjects
Epistasis, Genetic , Germ-Line Mutation , Humans , Animals , Mice , Alleles , Mutation , Chromosome Mapping , Mammals
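
The abstract above describes comparing mutation spectra between genomes that carry different alleles at a locus, but it does not name the statistic used. As a minimal sketch, assuming cosine distance as the comparison measure and hypothetical genotype labels "B" and "D" for the two parental alleles, pooled spectra could be compared like this (all function names are illustrative and not taken from the paper):

```python
import math
from collections import Counter

# The six mutation classes referred to in the abstract (C>T, A>G, etc.).
MUTATION_TYPES = ["C>A", "C>G", "C>T", "A>C", "A>G", "A>T"]


def spectrum(mutations):
    """Count each mutation class in a list of labels such as 'C>A'."""
    counts = Counter(mutations)
    return [counts[m] for m in MUTATION_TYPES]


def cosine_distance(u, v):
    """1 - cosine similarity of two count vectors (assumed statistic)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0:
        raise ValueError("each group needs at least one mutation")
    return 1.0 - dot / norm


def spectrum_distance_at_marker(samples):
    """samples: list of (genotype, mutations) with genotype 'B' or 'D'.

    Pools mutations by genotype at one marker and returns the distance
    between the two pooled spectra.
    """
    pooled = {"B": [], "D": []}
    for genotype, mutations in samples:
        pooled[genotype].extend(mutations)
    return cosine_distance(spectrum(pooled["B"]), spectrum(pooled["D"]))


if __name__ == "__main__":
    toy = [
        ("B", ["C>T", "C>T", "A>G"]),
        ("B", ["C>T"]),
        ("D", ["C>A", "C>A", "C>T"]),
    ]
    print(round(spectrum_distance_at_marker(toy), 3))
```

In a genome scan, a distance like this would be computed at every marker and compared against a null distribution obtained by shuffling genotype labels; that step is omitted here.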
2.
Genetics ; 226(4), 2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38298127

ABSTRACT

Short tandem repeats (STRs) are hotspots of genomic variability in the human germline because of their high mutation rates, which have long been attributed largely to polymerase slippage during DNA replication. This model suggests that STR mutation rates should scale linearly with a father's age, as progenitor cells continually divide after puberty. In contrast, it suggests that STR mutation rates should not scale with a mother's age at her child's conception, since oocytes spend a mother's reproductive years arrested in meiosis II and undergo a fixed number of cell divisions that are independent of the age at ovulation. Yet, mirroring recent findings, we find that STR mutation rates covary with paternal and maternal age, implying that some STR mutations are caused by DNA damage in quiescent cells rather than polymerase slippage in replicating progenitor cells. These results echo the recent finding that DNA damage in oocytes is a significant source of de novo single nucleotide variants and corroborate evidence of STR expansion in postmitotic cells. However, we find that the maternal age effect is not confined to known hotspots of oocyte mutagenesis, nor are postzygotic mutations likely to contribute significantly. STR nucleotide composition demonstrates divergent effects on de novo mutation (DNM) rates between sexes. Unlike the paternal lineage, maternally derived DNMs at A/T STRs display a significantly greater association with maternal age than DNMs at G/C-containing STRs. These observations may suggest the mechanism and developmental timing of certain STR mutations and contradict prior attribution of replication slippage as the primary mechanism of STR mutagenesis.


Subjects
Microsatellite Repeats , Mutation Rate , Humans , Female , Child , Mutation , Parents , Meiosis , Nucleotides
3.
Genome Res ; 34(2): 179-188, 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38355308

ABSTRACT

A mechanistic understanding of the biological and technical factors that impact transcript measurements is essential to designing and analyzing single-cell and single-nucleus RNA sequencing experiments. Nuclei contain the same pre-mRNA population as cells, but they contain a small subset of the mRNAs. Nonetheless, early studies argued that single-nucleus analysis yielded results comparable to cellular samples if pre-mRNA measurements were included. However, typical workflows do not distinguish between pre-mRNA and mRNA when estimating gene expression, and variation in their relative abundances across cell types has received limited attention. These gaps are especially important given that incorporating pre-mRNA has become commonplace for both assays, despite known gene length bias in pre-mRNA capture. Here, we reanalyze public data sets from mouse and human to describe the mechanisms and contrasting effects of mRNA and pre-mRNA sampling on gene expression and marker gene selection in single-cell and single-nucleus RNA-seq. We show that pre-mRNA levels vary considerably among cell types, which mediates the degree of gene length bias and limits the generalizability of a recently published normalization method intended to correct for this bias. As an alternative, we repurpose an existing post hoc gene length-based correction method from conventional RNA-seq gene set enrichment analysis. Finally, we show that inclusion of pre-mRNA in bioinformatic processing can impart a larger effect than assay choice itself, which is pivotal to the effective reuse of existing data. These analyses advance our understanding of the sources of variation in single-cell and single-nucleus RNA-seq experiments and provide useful guidance for future studies.


Subjects
Cell Nucleus , RNA Precursors , Humans , Animals , Mice , RNA-Seq , RNA, Messenger/genetics , Sequence Analysis, RNA/methods , Cell Nucleus/genetics , Gene Expression Profiling/methods , Single-Cell Analysis
4.
Nat Biotechnol ; 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38168995

ABSTRACT

Tandem repeat (TR) variation is associated with gene expression changes and numerous rare monogenic diseases. Although long-read sequencing provides accurate full-length sequences and methylation of TRs, there is still a need for computational methods to profile TRs across the genome. Here we introduce the Tandem Repeat Genotyping Tool (TRGT) and an accompanying TR database. TRGT determines the consensus sequences and methylation levels of specified TRs from PacBio HiFi sequencing data. It also reports reads that support each repeat allele. These reads can be subsequently visualized with a companion TR visualization tool. Assessing 937,122 TRs, TRGT showed a Mendelian concordance of 98.38%, allowing a single repeat unit difference. In six samples with known repeat expansions, TRGT detected all expansions while also identifying methylation signals and mosaicism and providing finer repeat length resolution than existing methods. Additionally, we released a database with allele sequences and methylation levels for 937,122 TRs across 100 genomes.
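
The Mendelian concordance figure above is reported "allowing a single repeat unit difference". A minimal sketch of how such a tolerance check might work on allele lengths in a trio follows. The function names, the tuple layout and the use of allele length in base pairs are assumptions for illustration; this is not TRGT's code or interface.

```python
from itertools import permutations


def is_mendelian_consistent(child, mother, father, motif_len, tolerance_units=1):
    """Check one trio at one repeat locus.

    child, mother, father: pairs of allele lengths in bp.
    The child is consistent if one allele matches a maternal allele and the
    other matches a paternal allele, each within `tolerance_units` motifs.
    """
    slack = tolerance_units * motif_len
    # Try both assignments of the child's alleles to the two parents.
    for from_mom, from_dad in permutations(child, 2):
        mom_ok = any(abs(from_mom - m) <= slack for m in mother)
        dad_ok = any(abs(from_dad - d) <= slack for d in father)
        if mom_ok and dad_ok:
            return True
    return False


def concordance(trios):
    """Fraction of (child, mother, father, motif_len) records that pass."""
    results = [is_mendelian_consistent(*t) for t in trios]
    return sum(results) / len(results)


if __name__ == "__main__":
    trios = [
        ((30, 36), (30, 45), (33, 60), 3),  # 36 is one 3-bp unit from 33
        ((30, 48), (30, 30), (30, 33), 3),  # 48 matches neither parent
    ]
    print(concordance(trios))
```

Setting `tolerance_units=0` gives strict Mendelian consistency, which makes it easy to see how much of the reported concordance depends on the one-unit allowance.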

5.
Andrology ; 2023 Dec 10.
Article in English | MEDLINE | ID: mdl-38073178

ABSTRACT

BACKGROUND: There are likely to be hundreds of monogenic forms of human male infertility. Whole genome sequencing (WGS) is the most efficient way to make progress in mapping the causative genetic variants, and ultimately improve clinical management of the disease in each patient. Recruitment of consanguineous families is an effective approach to ascertain the genetic forms of many diseases. OBJECTIVES: To apply WGS to large consanguineous families with likely hereditary male infertility and identify potential genetic cases. MATERIALS AND METHODS: We recruited seven large families with clinically diagnosed male infertility from rural Pakistan, including five with a history of consanguinity. We generated WGS data on 26 individuals (3-5 per family) and analyzed the resulting data with a computational pipeline to identify potentially causal single nucleotide variants, indels, and copy number variants. RESULTS: We identified plausible genetic causes in five of the seven families, including a homozygous 10 kb deletion of exon 2 in a well-established male infertility gene (M1AP), and biallelic missense substitutions (SPAG6, CCDC9, TUBA3C) and an in-frame hemizygous deletion (TKTL1) in genes with emerging relevance. DISCUSSION AND CONCLUSION: The rate of genetic findings using the current approach (71%) was much higher than what we recently achieved using whole-exome sequencing (WES) of unrelated singleton cases (20%). Furthermore, we identified a pathogenic single-exon deletion in M1AP that would be undetectable by WES. Screening more families with WGS, especially in underrepresented populations, will further reveal the types of variants underlying male infertility and accelerate the use of genetics in the patient management.

6.
bioRxiv ; 2023 Nov 14.
Article in English | MEDLINE | ID: mdl-37162999

ABSTRACT

Maintaining germline genome integrity is essential and enormously complex. Although many proteins are involved in DNA replication, proofreading, and repair [1], mutator alleles have largely eluded detection in mammals. DNA replication and repair proteins often recognize sequence motifs or excise lesions at specific nucleotides. Thus, we might expect that the spectrum of de novo mutations - the frequencies of C>T, A>G, etc. - will differ between genomes that harbor either a mutator or wild-type allele. Previously, we used quantitative trait locus mapping to discover candidate mutator alleles in the DNA repair gene Mutyh that increased the C>A germline mutation rate in a family of inbred mice known as the BXDs [2,3]. In this study we developed a new method to detect alleles associated with mutation spectrum variation and applied it to mutation data from the BXDs. We discovered an additional C>A mutator locus on chromosome 6 that overlaps Ogg1, a DNA glycosylase involved in the same base-excision repair network as Mutyh [4]. Its effect depended on the presence of a mutator allele near Mutyh, and BXDs with mutator alleles at both loci had greater numbers of C>A mutations than those with mutator alleles at either locus alone. Our new methods for analyzing mutation spectra reveal evidence of epistasis between germline mutator alleles and may be applicable to mutation data from humans and other model organisms.

7.
PLoS One ; 18(2): e0281934, 2023.
Article in English | MEDLINE | ID: mdl-36800380

ABSTRACT

One to two percent of couples suffer recurrent pregnancy loss and over 50% of the cases are unexplained. Whole genome sequencing (WGS) analysis has the potential to identify previously unrecognized causes of pregnancy loss, but few studies have been performed, and none have included DNA from families including parents, losses, and live births. We conducted a pilot WGS study in three families with unexplained recurrent pregnancy loss, including parents, healthy live births, and losses, which included an embryonic loss (<10 weeks' gestation), fetal deaths (10-20 weeks' gestation) and stillbirths (≥ 20 weeks' gestation). We used the Illumina platform for WGS and state-of-the-art protocols to identify single nucleotide variants (SNVs) following various modes of inheritance. We identified 87 SNVs involving 75 genes in embryonic loss (n = 1), 370 SNVs involving 228 genes in fetal death (n = 3), and 122 SNVs involving 122 genes in stillbirth (n = 2). Of these, 22 de novo, 6 inherited autosomal dominant and an X-linked recessive SNVs were pathogenic (probability of being loss-of-function intolerant >0.9), impacting known genes (e.g., DICER1, FBN2, FLT4, HERC1, and TAOK1) involved in embryonic/fetal development and congenital abnormalities. Further, we identified inherited missense compound heterozygous SNVs impacting genes (e.g., VWA5B2) in two fetal death samples. The variants were not identified as compound heterozygous SNVs in live births and population controls, providing evidence for haplosufficient genes relevant to pregnancy loss. In this pilot study, we provide evidence for de novo and inherited SNVs relevant to pregnancy loss. Our findings provide justification for conducting WGS using larger numbers of families and warrant validation by targeted sequencing to ascertain causal variants. Elucidating genes causing pregnancy loss may facilitate the development of risk stratification strategies and novel therapeutics.


Subjects
Abortion, Habitual , Pregnancy , Female , Humans , Pilot Projects , Abortion, Habitual/genetics , Stillbirth/genetics , Stillbirth/epidemiology , Live Birth , Protein Serine-Threonine Kinases , Ribonuclease III , DEAD-box RNA Helicases
8.
Cell Rep ; 42(1): 111945, 2023 01 31.
Article in English | MEDLINE | ID: mdl-36640362

ABSTRACT

Genes are typically assumed to express both parental alleles similarly, yet cell lines show random allelic expression (RAE) for many autosomal genes that could shape genetic effects. Thus, understanding RAE in human tissues could improve our understanding of phenotypic variation. Here, we develop a methodology to perform genome-wide profiling of RAE and biallelic expression in GTEx datasets for 832 people and 54 tissues. We report 2,762 autosomal genes with some RAE properties similar to randomly inactivated X-linked genes. We found that RAE is associated with rapidly evolving regions in the human genome, adaptive signaling processes, and genes linked to age-related diseases such as neurodegeneration and cancer. We define putative mechanistic subtypes of RAE distinguished by gene overlaps on sense and antisense DNA strands, aggregation in clusters near telomeres, and increased regulatory complexity and inputs compared with biallelic genes. We provide foundations to study RAE in human phenotypes, evolution, and disease.


Subjects
Chromosomes , Human Body , Humans , Adult , Alleles , Phenotype , Cell Line
9.
bioRxiv ; 2023 Dec 23.
Article in English | MEDLINE | ID: mdl-38187618

ABSTRACT

Short tandem repeats (STRs) are hotspots of genomic variability in the human germline because of their high mutation rates, which have long been attributed largely to polymerase slippage during DNA replication. This model suggests that STR mutation rates should scale linearly with a father's age, as progenitor cells continually divide after puberty. In contrast, it suggests that STR mutation rates should not scale with a mother's age at her child's conception, since oocytes spend a mother's reproductive years arrested in meiosis II and undergo a fixed number of cell divisions that are independent of the age at ovulation. Yet, mirroring recent findings, we find that STR mutation rates covary with paternal and maternal age, implying that some STR mutations are caused by DNA damage in quiescent cells rather than the classical mechanism of polymerase slippage in replicating progenitor cells. These results also echo the recent finding that DNA damage in quiescent oocytes is a significant source of de novo SNVs and corroborate evidence of STR expansion in postmitotic cells. However, we find that the maternal age effect is not confined to previously discovered hotspots of oocyte mutagenesis, nor are post-zygotic mutations likely to contribute significantly. STR nucleotide composition demonstrates divergent effects on DNM rates between sexes. Unlike the paternal lineage, maternally derived DNMs at A/T STRs display a significantly greater association with maternal age than DNMs at GC-containing STRs. These observations may suggest the mechanism and developmental timing of certain STR mutations and are especially surprising considering the prior belief in replication slippage as the dominant mechanism of STR mutagenesis.

10.
Genome Biol Evol ; 14(12), 2022 12 08.
Article in English | MEDLINE | ID: mdl-36477201

ABSTRACT

The ongoing SARS-CoV-2 pandemic is the third zoonotic coronavirus identified in the last 20 years. Enzootic and epizootic coronaviruses of diverse lineages also pose a significant threat to livestock, as most recently observed for virulent strains of porcine epidemic diarrhea virus (PEDV) and swine acute diarrhea-associated coronavirus (SADS-CoV). Unique to RNA viruses, coronaviruses encode a proofreading exonuclease (ExoN) that lowers point mutation rates to increase the viability of large RNA virus genomes, which comes with the cost of limiting virus adaptation via point mutation. This limitation can be overcome by high rates of recombination that facilitate rapid increases in genetic diversification. To compare the dynamics of recombination between related sequences, we developed an open-source computational workflow (IDPlot) that bundles nucleotide identity, recombination, and phylogenetic analysis into a single pipeline. We analyzed recombination dynamics among three groups of coronaviruses with noteworthy impacts on human health and agriculture: SARSr-CoV, Betacoronavirus-1, and SADSr-CoV. We found that all three groups undergo recombination with highly diverged viruses from undersampled or unsampled lineages, including in typically highly conserved regions of the genome. In several cases, no parental origin of recombinant regions could be found in genetic databases, demonstrating our shallow characterization of coronavirus diversity and expanding the genetic pool that may contribute to future zoonotic events. Our results also illustrate the limitations of current sampling approaches for anticipating zoonotic threats to human and animal health.


Subjects
COVID-19 , SARS-CoV-2 , Animals , Humans , Phylogeny , SARS-CoV-2/genetics , Swine
11.
Genome Biol ; 23(1): 257, 2022 12 14.
Article in English | MEDLINE | ID: mdl-36517892

ABSTRACT

Expansions of short tandem repeats (STRs) cause many rare diseases. Expansion detection is challenging with short-read DNA sequencing data since supporting reads are often mapped incorrectly. Detection is particularly difficult for "novel" STRs, which include new motifs at known loci or STRs absent from the reference genome. We developed STRling to efficiently count k-mers to recover informative reads and call expansions at known and novel STR loci. STRling is sensitive to known STR disease loci, has a low false discovery rate, and resolves novel STR expansions to base-pair position accuracy. It is fast, scalable, open-source, and available at: github.com/quinlan-lab/STRling.


Subjects
High-Throughput Nucleotide Sequencing , Microsatellite Repeats , Sequence Analysis, DNA
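
The abstract says that k-mers are counted to recover informative reads, without giving details. As a rough sketch of the general idea only, a read can be flagged as repeat-rich when most of its k-mers are rotations of a single unit. The function names and the 0.8 threshold are hypothetical and are not taken from STRling.

```python
from collections import Counter


def canonical_rotation(kmer):
    """Smallest rotation, so 'CAG', 'AGC' and 'GCA' count as one unit."""
    return min(kmer[i:] + kmer[:i] for i in range(len(kmer)))


def dominant_repeat_unit(read, k):
    """Return (unit, fraction): the most common rotation-collapsed k-mer
    and the share of the read's k-mers that belong to it."""
    kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
    if not kmers:
        return None, 0.0
    counts = Counter(canonical_rotation(km) for km in kmers)
    unit, n = counts.most_common(1)[0]
    return unit, n / len(kmers)


def is_str_rich(read, k, min_fraction=0.8):
    """Flag reads dominated by one repeat unit (threshold is illustrative)."""
    return dominant_repeat_unit(read, k)[1] >= min_fraction


if __name__ == "__main__":
    print(dominant_repeat_unit("CAG" * 10, 3))
    print(is_str_rich("ACGTTGCAAGCTTAGGCATC", 3))
```

Because this check looks only at read content, it does not depend on where or whether the read was aligned, which is the property that matters when supporting reads are mapped incorrectly.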
12.
BMC Bioinformatics ; 23(1): 490, 2022 Nov 16.
Article in English | MEDLINE | ID: mdl-36384437

ABSTRACT

BACKGROUND: Identification of deleterious genetic variants using DNA sequencing data relies on increasingly detailed filtering strategies to isolate the small subset of variants that are more likely to underlie a disease phenotype. Datasets reflecting population allele frequencies of different types of variants serve as powerful filtering tools, especially in the context of rare disease analysis. While such population-scale allele frequency datasets now exist for structural variants (SVs), it remains a challenge to match SV calls between multiple datasets, thereby complicating estimates of a putative SV's population allele frequency. RESULTS: We introduce SVAFotate, a software tool that enables the annotation of SVs with variant allele frequency and related information from existing SV datasets. As a result, VCF files annotated by SVAFotate offer a variety of metrics to aid in the stratification of SVs as common or rare in the broader human population. CONCLUSIONS: Here we demonstrate the use of SVAFotate in the classification of SVs with regards to their population frequency and illustrate how SVAFotate's annotations can be used to filter and prioritize SVs. Lastly, we detail how best to utilize these SV annotations in the analysis of genetic variation in studies of rare disease.


Subjects
Gene Frequency , High-Throughput Nucleotide Sequencing , Software , Humans , Rare Diseases
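
Matching SV calls across datasets, which the abstract identifies as the main difficulty, is often approached with a reciprocal-overlap rule. The abstract does not state which rule SVAFotate applies, so the following is only a sketch under that assumption; the tuple layouts, function names and the 0.8 threshold are hypothetical.

```python
def reciprocal_overlap(a_start, a_end, b_start, b_end):
    """Smaller of the two overlap fractions between intervals a and b."""
    overlap = min(a_end, b_end) - max(a_start, b_start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a_end - a_start), overlap / (b_end - b_start))


def max_population_af(query, catalog, min_overlap=0.8):
    """Highest allele frequency among catalog SVs that match the query.

    query:   (chrom, start, end, svtype)
    catalog: iterable of (chrom, start, end, svtype, allele_frequency)
    Returns 0.0 when nothing matches, i.e. the SV looks rare or unseen.
    """
    chrom, start, end, svtype = query
    best = 0.0
    for c_chrom, c_start, c_end, c_type, af in catalog:
        if c_chrom != chrom or c_type != svtype:
            continue
        if reciprocal_overlap(start, end, c_start, c_end) >= min_overlap:
            best = max(best, af)
    return best


if __name__ == "__main__":
    catalog = [
        ("chr1", 1050, 2050, "DEL", 0.12),  # close match
        ("chr1", 1000, 5000, "DEL", 0.40),  # much larger deletion
        ("chr1", 1000, 2000, "DUP", 0.90),  # different SV type
    ]
    print(max_population_af(("chr1", 1000, 2000, "DEL"), catalog))
```

Taking the maximum frequency over all matches is a conservative choice for rare-disease filtering: an SV is treated as rare only if no matching catalog entry is common.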
13.
BMC Bioinformatics ; 23(1): 482, 2022 Nov 14.
Article in English | MEDLINE | ID: mdl-36376793

ABSTRACT

BACKGROUND: Despite numerous molecular and computational advances, roughly half of patients with a rare disease remain undiagnosed after exome or genome sequencing. A particularly challenging barrier to diagnosis is identifying variants that cause deleterious alternative splicing at intronic or exonic loci outside of canonical donor or acceptor splice sites. RESULTS: Several existing tools predict the likelihood that a genetic variant causes alternative splicing. We sought to extend such methods by developing a new metric that aids in discerning whether a genetic variant leads to deleterious alternative splicing. Our metric combines genetic variation in the Genome Aggregate Database with alternative splicing predictions from SpliceAI to compare observed and expected levels of splice-altering genetic variation. We infer genic regions with significantly less splice-altering variation than expected to be constrained. The resulting model of regional splicing constraint captures differential splicing constraint across gene and exon categories, and the most constrained genic regions are enriched for pathogenic splice-altering variants. Building from this model, we developed ConSpliceML. This ensemble machine learning approach combines regional splicing constraint with multiple per-nucleotide alternative splicing scores to guide the prediction of deleterious splicing variants in protein-coding genes. ConSpliceML more accurately distinguishes deleterious and benign splicing variants than state-of-the-art splicing prediction methods, especially in "cryptic" splicing regions beyond canonical donor or acceptor splice sites. CONCLUSION: Integrating a model of genetic constraint with annotations from existing alternative splicing tools allows ConSpliceML to prioritize potentially deleterious splice-altering variants in studies of rare human diseases.


Subjects
Alternative Splicing , Rare Diseases , Humans , Rare Diseases/genetics , RNA Splicing , Introns , Exons , Mutation , RNA Splice Sites
14.
Elife ; 11, 2022 09 07.
Article in English | MEDLINE | ID: mdl-36069526

ABSTRACT

Horizontal gene transfer (HGT) provides a major source of genetic variation. Many viruses, including poxviruses, encode genes with crucial functions directly gained by gene transfer from hosts. The mechanism of transfer to poxvirus genomes is unknown. Using genome analysis and experimental screens of infected cells, we discovered a central role for Long Interspersed Nuclear Element-1 retrotransposition in HGT to virus genomes. The process recapitulates processed pseudogene generation, but with host messenger RNA directed into virus genomes. Intriguingly, hallmark features of retrotransposition appear to favor virus adaption through rapid duplication of captured host genes on arrival. Our study reveals a previously unrecognized conduit of genetic traffic with fundamental implications for the evolution of many virus classes and their hosts.


Subjects
Poxviridae , Viruses , Evolution, Molecular , Gene Transfer, Horizontal , Phylogeny , Poxviridae/genetics , RNA, Messenger , Viruses/genetics , Retroelements
16.
Nat Methods ; 19(4): 445-448, 2022 04.
Article in English | MEDLINE | ID: mdl-35396485

ABSTRACT

Structural variants are associated with cancers and developmental disorders, but challenges with estimating population frequency remain a barrier to prioritizing mutations over inherited variants. In particular, variability in variant calling heuristics and filtering limits the use of current structural variant catalogs. We present STIX, a method that, instead of relying on variant calls, indexes and searches the raw alignments from thousands of samples to enable more comprehensive allele frequency estimation.


Subjects
Genome , Genomic Structural Variation , Neoplasms , Algorithms , Genomic Structural Variation/genetics , High-Throughput Nucleotide Sequencing , Humans , Neoplasms/genetics , Software
17.
Mol Genet Genomic Med ; 10(4): e1888, 2022 04.
Article in English | MEDLINE | ID: mdl-35119225

ABSTRACT

BACKGROUND: Genetic disorders contribute to significant morbidity and mortality in critically ill newborns. Despite advances in genome sequencing technologies, a majority of neonatal cases remain unsolved. Complex structural variants (SVs) often elude conventional genome sequencing variant calling pipelines and will explain a portion of these unsolved cases. METHODS: As part of the Utah NeoSeq project, we used a research-based, rapid whole-genome sequencing (WGS) protocol to investigate the genomic etiology for a newborn with a left-sided congenital diaphragmatic hernia (CDH) and cardiac malformations, whose mother also had a history of CDH and atrial septal defect. RESULTS: Using both a novel, alignment-free and traditional alignment-based variant callers, we identified a maternally inherited complex SV on chromosome 8, consisting of an inversion flanked by deletions. This complex inversion, further confirmed using orthogonal molecular techniques, disrupts the ZFPM2 gene, which is associated with both CDH and various congenital heart defects. CONCLUSIONS: Our results demonstrate that complex structural events, which often are unidentifiable or not reported by clinically validated testing procedures, can be discovered and accurately characterized with conventional, short-read sequencing and underscore the utility of WGS as a first-line diagnostic tool.


Subjects
Hernias, Diaphragmatic, Congenital , DNA-Binding Proteins/genetics , Genomics , Hernias, Diaphragmatic, Congenital/genetics , Humans , Infant, Newborn , Transcription Factors/genetics , Whole Genome Sequencing/methods
18.
Bioinformatics ; 38(5): 1231-1234, 2022 02 07.
Article in English | MEDLINE | ID: mdl-34864893

ABSTRACT

SUMMARY: We present trfermikit, a software tool designed to detect deletions larger than 50 bp occurring in Variable Number Tandem Repeats using Illumina DNA sequencing reads. In such regions, it achieves a better tradeoff between sensitivity and false discovery than a state-of-the-art structural variation caller, Manta, and complements it by recovering a significant number of deletions that Manta missed. trfermikit is based upon the fermikit pipeline, which performs read assembly, maps the assembly to the reference genome and calls variants from the alignment. AVAILABILITY AND IMPLEMENTATION: https://github.com/petermchale/trfermikit. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genome , Software , Sequence Analysis, DNA , High-Throughput Nucleotide Sequencing
19.
NPJ Genom Med ; 6(1): 60, 2021 Jul 15.
Article in English | MEDLINE | ID: mdl-34267211

ABSTRACT

In studies of families with rare disease, it is common to screen for de novo mutations, as well as recessive or dominant variants that explain the phenotype. However, the filtering strategies and software used to prioritize high-confidence variants vary from study to study. In an effort to establish recommendations for rare disease research, we explore effective guidelines for variant (SNP and INDEL) filtering and report the expected number of candidates for de novo dominant, recessive, and autosomal dominant modes of inheritance. We derived these guidelines using two large family-based cohorts that underwent whole-genome sequencing, as well as two family cohorts with whole-exome sequencing. The filters are applied to common attributes, including genotype-quality, sequencing depth, allele balance, and population allele frequency. The resulting guidelines yield ~10 candidate SNP and INDEL variants per exome, and 18 per genome for recessive and de novo dominant modes of inheritance, with substantially more candidates for autosomal dominant inheritance. For family-based, whole-genome sequencing studies, this number includes an average of three de novo, ten compound heterozygous, one autosomal recessive, four X-linked variants, and roughly 100 candidate variants following autosomal dominant inheritance. The slivar software we developed to establish and rapidly apply these filters to VCF files is available at https://github.com/brentp/slivar under an MIT license, and includes documentation and recommendations for best practices for rare disease analysis.
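
The filters described above act on genotype quality, sequencing depth, allele balance and population allele frequency. As a minimal sketch of a de novo filter built from those four attributes (this is not slivar's expression syntax, and every threshold here is a placeholder rather than a value from the paper):

```python
from dataclasses import dataclass


@dataclass
class Genotype:
    alts: int       # number of alternate alleles called: 0, 1 or 2
    gq: int         # genotype quality
    depth: int      # total read depth at the site
    alt_depth: int  # reads supporting the alternate allele

    @property
    def allele_balance(self):
        """Fraction of reads supporting the alternate allele."""
        return self.alt_depth / self.depth if self.depth else 0.0


def passes_quality(gt, min_gq=20, min_depth=10):
    """Per-sample quality gate (thresholds are illustrative)."""
    return gt.gq >= min_gq and gt.depth >= min_depth


def is_candidate_de_novo(kid, mom, dad, pop_af, max_pop_af=0.001):
    """Heterozygous in the child, absent in both parents, rare in the population."""
    if not all(passes_quality(g) for g in (kid, mom, dad)):
        return False
    kid_het = kid.alts == 1 and 0.2 <= kid.allele_balance <= 0.8
    parents_ref = all(p.alts == 0 and p.alt_depth == 0 for p in (mom, dad))
    return kid_het and parents_ref and pop_af <= max_pop_af


if __name__ == "__main__":
    kid = Genotype(alts=1, gq=60, depth=40, alt_depth=19)
    mom = Genotype(alts=0, gq=50, depth=35, alt_depth=0)
    dad = Genotype(alts=0, gq=55, depth=30, alt_depth=0)
    print(is_candidate_de_novo(kid, mom, dad, pop_af=0.0))
```

Requiring zero alternate-supporting reads in both parents is a strict choice that lowers false de novo calls at the cost of some sensitivity; the other inheritance modes in the abstract would each need their own genotype pattern.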

20.
Bioinformatics ; 37(24): 4860-4861, 2021 12 11.
Article in English | MEDLINE | ID: mdl-34146087

ABSTRACT

SUMMARY: Unfazed is a command-line tool to determine the parental gamete of origin for de novo mutations from paired-end Illumina DNA sequencing reads. Unfazed uses variant information for a sequenced trio to identify the parental gamete of origin by linking phase-informative inherited variants to de novo mutations using read-based phasing. It achieves a high success rate by chaining reads into haplotype groups, thus increasing the search space for informative sites. Unfazed provides a simple command-line interface and scales well to large inputs, determining parent-of-origin for nearly 30 000 de novo variants in under 60 h. AVAILABILITY AND IMPLEMENTATION: Unfazed is available at https://github.com/jbelyeu/unfazed. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms , Software , Sequence Analysis, DNA , Haplotypes , High-Throughput Nucleotide Sequencing