Results 1 - 11 of 11
1.
Bioinformatics ; 40(2), 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38269623

ABSTRACT

MOTIVATION: In diploid organisms, phasing is the problem of assigning the alleles at heterozygous variants to one of two haplotypes. Reads from PacBio HiFi sequencing provide long, accurate observations that can be used as the basis for both calling and phasing variants. HiFi reads also excel at calling larger classes of variation, such as structural or tandem repeat variants. However, current phasing tools typically only phase small variants, leaving larger variants unphased. RESULTS: We developed HiPhase, a tool that jointly phases SNVs, indels, structural, and tandem repeat variants. The main benefits of HiPhase are (i) dual mode allele assignment for detecting large variants, (ii) a novel application of the A*-algorithm to phasing, and (iii) logic allowing phase blocks to span breaks caused by alignment issues around reference gaps and homozygous deletions. In our assessment, HiPhase produced an average phase block NG50 of 480 kb with 929 switchflip errors and fully phased 93.8% of genes, improving over the current state of the art. Additionally, HiPhase jointly phases SNVs, indels, structural, and tandem repeat variants and includes innate multi-threading, statistics gathering, and concurrent phased alignment output generation. AVAILABILITY AND IMPLEMENTATION: HiPhase is available as source code and a pre-compiled Linux binary with a user guide at https://github.com/PacificBiosciences/HiPhase.


Subjects
Human Genome, High-Throughput Nucleotide Sequencing, Humans, DNA Sequence Analysis, Algorithms, Haplotypes, Tandem Repeat Sequences
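
The phase block NG50 quoted in this abstract can be illustrated with a short calculation. The sketch below is not taken from HiPhase; it assumes the usual definition of NG50 (the block length at which the longest blocks, added in descending order, first cover half of the genome length), and the block lengths and genome size are made-up values.

```python
def phase_block_ng50(block_lengths, genome_length):
    """Return the phase block NG50, or 0 if the blocks cover less than half the genome.

    block_lengths: lengths (in bp) of the phase blocks (hypothetical input).
    genome_length: reference genome size in bp, the denominator that
                   distinguishes NG50 from N50.
    """
    covered = 0
    for length in sorted(block_lengths, reverse=True):
        covered += length
        # Stop at the first block that brings coverage to at least 50%.
        if covered * 2 >= genome_length:
            return length
    return 0


# Illustrative, made-up numbers (not from the paper).
blocks = [500_000, 300_000, 200_000, 100_000, 50_000]
print(phase_block_ng50(blocks, genome_length=2_000_000))
```

Unlike N50, the denominator here is the genome size rather than the summed block length, so a tool that phases only a small part of the genome cannot report a large value.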
2.
Genet Med ; 26(9): 101166, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38767059

ABSTRACT

PURPOSE: The function of FAM177A1 and its relationship to human disease are largely unknown. Recent studies have demonstrated FAM177A1 to be a critical immune-associated gene. One previous case study has linked FAM177A1 to a neurodevelopmental disorder in 4 siblings. METHODS: We identified 5 individuals from 3 unrelated families with biallelic variants in FAM177A1. The physiological function of FAM177A1 was studied in a zebrafish model organism and human cell lines with loss-of-function variants similar to the affected cohort. RESULTS: These individuals share a characteristic phenotype defined by macrocephaly, global developmental delay, intellectual disability, seizures, behavioral abnormalities, hypotonia, and gait disturbance. We show that FAM177A1 localizes to the Golgi complex in mammalian and zebrafish cells. Intersection of the RNA sequencing and metabolomic data sets from FAM177A1-deficient human fibroblasts and whole zebrafish larvae demonstrated dysregulation of pathways associated with apoptosis, inflammation, and negative regulation of cell proliferation. CONCLUSION: Our data shed light on the emerging function of FAM177A1 and define FAM177A1-related neurodevelopmental disorder as a new clinical entity.


Subjects
Golgi Apparatus, Loss of Function Mutation, Neurodevelopmental Disorders, Zebrafish, Humans, Zebrafish/genetics, Animals, Neurodevelopmental Disorders/genetics, Neurodevelopmental Disorders/pathology, Neurodevelopmental Disorders/metabolism, Golgi Apparatus/metabolism, Golgi Apparatus/genetics, Male, Female, Child, Phenotype, Preschool Child, Intellectual Disability/genetics, Intellectual Disability/pathology, Intellectual Disability/metabolism, Pedigree, Membrane Proteins/genetics, Membrane Proteins/metabolism
3.
Genet Med ; 25(12): 100947, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37534744

ABSTRACT

PURPOSE: Variants of uncertain significance (VUS) are a common result of diagnostic genetic testing and can be difficult to manage with potential misinterpretation and downstream costs, including time investment by clinicians. We investigated the rate of VUS reported on diagnostic testing via multi-gene panels (MGPs) and exome and genome sequencing (ES/GS) to measure the magnitude of uncertain results and explore ways to reduce their potentially detrimental impact. METHODS: Rates of inconclusive results due to VUS were collected from over 1.5 million sequencing test results from 19 clinical laboratories in North America from 2020 to 2021. RESULTS: We found a lower rate of inconclusive test results due to VUSs from ES/GS (22.5%) compared with MGPs (32.6%; P < .0001). For MGPs, the rate of inconclusive results correlated with panel size. The use of trios reduced inconclusive rates (18.9% vs 27.6%; P < .0001), whereas the use of GS compared with ES had no impact (22.2% vs 22.6%; P = ns). CONCLUSION: The high rate of VUS observed in diagnostic MGP testing warrants examining current variant reporting practices. We propose several approaches to reduce reported VUS rates, while directing clinician resources toward important VUS follow-up.


Subjects
Genetic Predisposition to Disease, Genetic Testing, Humans, Genetic Testing/methods, Genomics, Exome/genetics, North America
4.
Genet Med ; 23(7): 1255-1262, 2021 07.
Article in English | MEDLINE | ID: mdl-33767343

ABSTRACT

PURPOSE: Clinical genome sequencing (cGS) followed by orthogonal confirmatory testing is standard practice. While orthogonal testing significantly improves specificity, it also results in increased turnaround time and cost of testing. The purpose of this study is to evaluate machine learning models trained to identify false positive variants in cGS data to reduce the need for orthogonal testing. METHODS: We sequenced five reference human genome samples characterized by the Genome in a Bottle Consortium (GIAB) and compared the results with an established set of variants for each genome referred to as a truth set. We then trained machine learning models to identify variants that were labeled as false positives. RESULTS: After training, the models identified 99.5% of the false positive heterozygous single-nucleotide variants (SNVs) and heterozygous insertions/deletions variants (indels) while reducing confirmatory testing of nonactionable, nonprimary SNVs by 85% and indels by 75%. Employing the algorithm in clinical practice reduced overall orthogonal testing using dideoxynucleotide (Sanger) sequencing by 71%. CONCLUSION: Our results indicate that a low false positive call rate can be maintained while significantly reducing the need for confirmatory testing. The framework that generated our models and results is publicly available at https://github.com/HudsonAlpha/STEVE .


Subjects
Human Genome, High-Throughput Nucleotide Sequencing, Algorithms, Human Genome/genetics, Heterozygote, Humans, INDEL Mutation
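
The trade-off described in this abstract (catching nearly all false positive calls while sending fewer variants to confirmatory testing) can be sketched as a threshold choice on model scores. This is not the STEVE framework's code or API; the function names, the score convention (higher means more likely a false positive), and the example data are all hypothetical.

```python
import math


def pick_cutoff(scores, is_false_positive, target_capture=0.995):
    """Highest score cutoff that still flags at least target_capture of known false positives.

    scores: hypothetical model outputs, higher = more likely a false positive.
    is_false_positive: truth-set labels for the same calls.
    Calls scoring >= cutoff would be sent for orthogonal confirmation.
    """
    fp_scores = sorted(
        (s for s, fp in zip(scores, is_false_positive) if fp), reverse=True
    )
    if not fp_scores:
        raise ValueError("need at least one labeled false positive")
    needed = math.ceil(target_capture * len(fp_scores))
    return fp_scores[needed - 1]


def confirmation_reduction(scores, cutoff):
    """Fraction of calls that fall below the cutoff and so skip confirmation."""
    return sum(1 for s in scores if s < cutoff) / len(scores)


# Made-up scores and labels for eight variant calls.
scores = [0.9, 0.8, 0.7, 0.2, 0.1, 0.05, 0.6, 0.02]
labels = [True, True, True, False, False, False, False, False]
cutoff = pick_cutoff(scores, labels)
print(cutoff, confirmation_reduction(scores, cutoff))
```

In practice the cutoff would be chosen on held-out truth-set data rather than the training calls, so that the capture rate is not overstated.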
5.
BMC Bioinformatics ; 20(1): 496, 2019 Oct 15.
Article in English | MEDLINE | ID: mdl-31615419

ABSTRACT

BACKGROUND: When applying genomic medicine to a rare disease patient, the primary goal is to identify one or more genomic variants that may explain the patient's phenotypes. Typically, this is done through annotation, filtering, and then prioritization of variants for manual curation. However, prioritization of variants in rare disease patients remains a challenging task due to the high degree of variability in phenotype presentation and molecular source of disease. Thus, methods that can identify and/or prioritize variants to be clinically reported in the presence of such variability are of critical importance. METHODS: We tested the application of classification algorithms that ingest variant annotations along with phenotype information for predicting whether a variant will ultimately be clinically reported and returned to a patient. To test the classifiers, we performed a retrospective study on variants that were clinically reported to 237 patients in the Undiagnosed Diseases Network. RESULTS: We treated the classifiers as variant prioritization systems and compared them to four variant prioritization algorithms and two single-measure controls. We showed that the trained classifiers outperformed all other tested methods with the best classifiers ranking 72% of all reported variants and 94% of reported pathogenic variants in the top 20. CONCLUSIONS: We demonstrated how freely available binary classification algorithms can be used to prioritize variants even in the presence of real-world variability. Furthermore, these classifiers outperformed all other tested methods, suggesting that they may be well suited for working with real rare disease patient datasets.


Subjects
Algorithms, Inborn Genetic Diseases/diagnosis, Genomics/methods, Mutation, Rare Diseases/diagnosis, Inborn Genetic Diseases/genetics, Genetic Predisposition to Disease, Human Genome, Humans, Phenotype, Genetic Polymorphism, Precision Medicine/methods, Rare Diseases/genetics, Retrospective Studies, DNA Sequence Analysis/methods, Software
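
The "ranked in the top 20" figures in this abstract correspond to a top-k metric over per-patient rankings. The sketch below shows one plausible way to compute such a metric; the data structures, identifiers, and pooling across patients are assumptions for illustration and may differ from how the study aggregated its results.

```python
def top_k_fraction(ranked_lists, reported, k=20):
    """Fraction of reported variants that appear in the top k of their patient's ranking.

    ranked_lists: dict of patient id -> variant ids ordered best-first
                  (hypothetical output of a prioritization method).
    reported: dict of patient id -> set of clinically reported variant ids.
    """
    hits = 0
    total = 0
    for patient, variants in reported.items():
        top = set(ranked_lists.get(patient, [])[:k])
        total += len(variants)
        hits += sum(1 for v in variants if v in top)
    return hits / total if total else 0.0


# Made-up example with two patients and k=2.
ranked = {"p1": ["v3", "v1", "v2"], "p2": ["v9", "v8", "v7"]}
reported = {"p1": {"v1"}, "p2": {"v7"}}
print(top_k_fraction(ranked, reported, k=2))
```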
6.
Mol Biol Evol ; 33(6): 1381-95, 2016 06.
Article in English | MEDLINE | ID: mdl-26882987

ABSTRACT

A selective sweep is the result of strong positive selection driving newly occurring or standing genetic variants to fixation, and can dramatically alter the pattern and distribution of allelic diversity in a population. Population-level sequencing data have enabled discoveries of selective sweeps associated with genes involved in recent adaptations in many species. In contrast, much debate but little evidence addresses whether "selfish" genes are capable of fixation (thereby leaving signatures identical to classical selective sweeps) despite being neutral or deleterious to organismal fitness. We previously described R2d2, a large copy-number variant that causes nonrandom segregation of mouse Chromosome 2 in females due to meiotic drive. Here we show population-genetic data consistent with a selfish sweep driven by alleles of R2d2 with high copy number (R2d2(HC)) in natural populations. We replicate this finding in multiple closed breeding populations from six outbred backgrounds segregating for R2d2 alleles. We find that R2d2(HC) rapidly increases in frequency, and in most cases becomes fixed in significantly fewer generations than can be explained by genetic drift. R2d2(HC) is also associated with significantly reduced litter sizes in heterozygous mothers, making it a true selfish allele. Our data provide direct evidence of populations actively undergoing selfish sweeps, and demonstrate that meiotic drive can rapidly alter the genomic landscape in favor of mutations with neutral or even negative effects on overall Darwinian fitness. Further study will reveal the incidence of selfish sweeps, and will elucidate the relative contributions of selfish genes, adaptation and genetic drift to evolution.


Subjects
Nuclear Proteins/genetics, RNA-Binding Proteins/genetics, Nucleic Acid Repetitive Sequences, Physiological Adaptation/genetics, Alleles, Animals, Biological Evolution, DNA Copy Number Variations/genetics, Molecular Evolution, Female, Genetic Variation, Population Genetics, Male, Mice, Genetic Models, Mutation, Genetic Selection
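
How female meiotic drive alone can push an allele toward fixation can be seen in a small deterministic recursion. This is a textbook-style sketch, not the model used in the paper: it assumes random mating, an infinite population (so no genetic drift), no fitness cost such as reduced litter size, and a hypothetical transmission ratio.

```python
def drive_trajectory(p0, female_transmission, generations):
    """Frequency of a driving allele D over time under female meiotic drive.

    p0: starting frequency of D (assumed equal in eggs and sperm).
    female_transmission: fraction of eggs from D/d mothers that carry D;
                         0.5 is Mendelian. Fathers always transmit D at 0.5.
    Returns the allele frequency in each generation, starting with p0.
    """
    egg, sperm = p0, p0  # D frequency in the gametes forming each generation
    freqs = [p0]
    for _ in range(generations):
        hom = egg * sperm                            # D/D zygotes
        het = egg * (1 - sperm) + sperm * (1 - egg)  # D/d zygotes
        egg = hom + het * female_transmission        # drive acts in mothers only
        sperm = hom + het * 0.5
        freqs.append((egg + sperm) / 2)
    return freqs


# Hypothetical transmission ratio of 0.9 versus Mendelian 0.5.
print(drive_trajectory(0.1, 0.9, 10))
print(drive_trajectory(0.1, 0.5, 10))
```

With Mendelian transmission the frequency stays at its starting value. Any ratio above 0.5 raises it every generation by an amount proportional to the heterozygote frequency, which is why drive can mimic the footprint of positive selection.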
7.
Genome Biol ; 25(1): 253, 2024 Oct 02.
Article in English | MEDLINE | ID: mdl-39358801

ABSTRACT

In this work, we extend vcfdist to be the first variant call benchmarking tool to jointly evaluate phased single-nucleotide polymorphisms (SNPs), small insertions/deletions (INDELs), and structural variants (SVs) for the whole genome. First, we find that a joint evaluation of small and structural variants uniformly reduces measured errors for SNPs (-28.9%), INDELs (-19.3%), and SVs (-52.4%) across three datasets. vcfdist also corrects a common flaw in phasing evaluations, reducing measured flip errors by over 50%. Lastly, we show that vcfdist is more accurate than previously published works and on par with the newest approaches while providing improved result interpretability.


Subjects
Benchmarking, INDEL Mutation, Single Nucleotide Polymorphism, Software, Humans, Genomic Structural Variation, Human Genome
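
The flip errors mentioned in this abstract are usually distinguished from switch errors when a called phasing is compared with a truth phasing. The sketch below follows one common convention (two adjacent phase changes are counted as a single flipped site); it is an illustration under that assumption and does not reproduce vcfdist's own evaluation or the correction the abstract describes.

```python
def count_switch_flip(truth, called):
    """Count (switch, flip) errors between two phasings of the same het sites.

    truth, called: equal-length sequences of 0/1 giving the haplotype that
    carries the alternate allele at each heterozygous site in one phase block.
    Phase is relative, so a fully inverted block counts as error-free.
    """
    if len(truth) != len(called):
        raise ValueError("phasings must cover the same sites")
    # True where the called orientation agrees with the truth at that site.
    agree = [t == c for t, c in zip(truth, called)]
    # A transition at i means the relative phase changes between sites i and i+1.
    transitions = [agree[i] != agree[i + 1] for i in range(len(agree) - 1)]
    switches = flips = 0
    i = 0
    while i < len(transitions):
        if transitions[i]:
            if i + 1 < len(transitions) and transitions[i + 1]:
                flips += 1  # two adjacent transitions: a single site is flipped
                i += 2
                continue
            switches += 1   # isolated transition: phase stays swapped afterwards
        i += 1
    return switches, flips


truth = [0, 0, 0, 0, 0, 0]
print(count_switch_flip(truth, [0, 0, 1, 0, 0, 0]))  # one flipped site
print(count_switch_flip(truth, [0, 0, 0, 1, 1, 1]))  # one long switch
```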
8.
HGG Adv ; 2(2), 2021 Apr 08.
Article in English | MEDLINE | ID: mdl-33937879

ABSTRACT

Exome and genome sequencing have proven to be effective tools for the diagnosis of neurodevelopmental disorders (NDDs), but large fractions of NDDs cannot be attributed to currently detectable genetic variation. This is likely, at least in part, a result of the fact that many genetic variants are difficult or impossible to detect through typical short-read sequencing approaches. Here, we describe a genomic analysis using Pacific Biosciences circular consensus sequencing (CCS) reads, which are both long (>10 kb) and accurate (>99% bp accuracy). We used CCS on six proband-parent trios with NDDs that were unexplained despite extensive testing, including genome sequencing with short reads. We identified variants and created de novo assemblies in each trio, with global metrics indicating these datasets are more accurate and comprehensive than those provided by short-read data. In one proband, we identified a likely pathogenic (LP), de novo L1-mediated insertion in CDKL5 that results in duplication of exon 3, leading to a frameshift. In a second proband, we identified multiple large de novo structural variants, including insertion-translocations affecting DGKB and MLLT3, which we show disrupt MLLT3 transcript levels. We consider this extensive structural variation likely pathogenic. The breadth and quality of variant detection, coupled to finding variants of clinical and research interest in two of six probands with unexplained NDDs, support the hypothesis that long-read genome sequencing can substantially improve rare disease genetic discovery rates.

9.
Article in English | MEDLINE | ID: mdl-32014855

ABSTRACT

Variations in disease onset and/or severity have often been observed in siblings with cystic fibrosis (CF), despite the same CFTR genotype and environment. We postulated that genomic variation (modifier and/or pharmacogenomic variants) might explain these clinical discordances. From a cohort of patients included in the Wisconsin randomized clinical trial (RCT) of newborn screening (NBS) for CF, we identified two brothers who showed discordant lung disease courses as children, with one milder and the other more severe than average, and a third, eldest brother, who also has severe lung disease. Leukocytes were harvested as the source of DNA, and whole-genome sequencing (WGS) was performed. Variants were identified and analyzed using in-house-developed informatics tools. Lung disease onset and severity were quantitatively different between brothers during childhood. The youngest, less severely affected brother is homozygous for HFE p.H63D. He also has a very rare PLG p.D238N variant that may influence host-pathogen interaction during chronic lung infection. Other variants of interest were found differentially between the siblings. Pharmacogenomics findings were consistent with the middle, most severely affected brother having poor outcomes to common CF treatments. We conclude that genomic variation between siblings with CF is expected. Variable lung disease severity may be associated with differences acting as genetic modifiers and/or pharmacogenomic factors, but large cohort studies are needed to assess this hypothesis.


Subjects
Cystic Fibrosis/diagnosis, Cystic Fibrosis/genetics, Phenotype, Siblings, Whole Genome Sequencing, Adolescent, Biomarkers, Child, Preschool Child, Cystic Fibrosis/metabolism, Cystic Fibrosis Transmembrane Conductance Regulator/genetics, Genetic Variation, Genome-Wide Association Study, Genotype, Humans, Newborn Infant, Male, Mutation, Neonatal Screening, Pharmacogenomic Testing, Prognosis, Thoracic Radiography, Respiratory Function Tests
10.
Article in English | MEDLINE | ID: mdl-31836585

ABSTRACT

We assessed the results of genome sequencing for early-onset dementia. Participants were selected from a memory disorders clinic. Genome sequencing was performed along with C9orf72 repeat expansion testing. All returned sequencing results were Sanger-validated. Prior clinical diagnoses included Alzheimer's disease, frontotemporal dementia, and unspecified dementia. The mean age of onset was 54 (41-76). Fifty percent of patients had a strong family history, 37.5% had some, and 12.5% had no known family history. Nine of 32 patients (28%) had a variant defined as pathogenic or likely pathogenic (P/LP) by American College of Medical Genetics and Genomics standards, including variants in APP, C9orf72, CSF1R, and MAPT. Nine patients (including three with P/LP variants) harbored established risk alleles with moderate penetrance (odds ratios of ∼2-5) in ABCA7, AKAP9, GBA, PLD3, SORL1, and TREM2. All six patients harboring these moderate penetrance variants but not P/LP variants also had one or two APOE ε4 alleles. One patient had two APOE ε4 alleles with no other established contributors. In total, 16 patients (50%) harbored one or more genetic variants likely to explain symptoms. We identified variants of uncertain significance (VUSs) in ABI3, ADAM10, ARSA, GRID2IP, MME, NOTCH3, PLCD1, PSEN1, TM2D3, TNK1, TTC3, and VPS13C, also often along with other variants. In summary, genome sequencing for early-onset dementia frequently identified multiple established or possible contributory alleles. These observations add support for an oligogenic model for early-onset dementia.


Subjects
Alzheimer Disease/genetics, Dementia/genetics, Aged, Alleles, Apolipoprotein E4/genetics, Base Sequence, C9orf72 Protein/genetics, Chromosome Mapping, Female, Genetic Association Studies, Genetic Predisposition to Disease, Genetic Variation, Genome-Wide Association Study, Humans, Male, Middle Aged, Odds Ratio, Penetrance, Risk Factors, Whole Genome Sequencing/methods
11.
G3 (Bethesda) ; 6(12): 4211-4216, 2016 12 07.
Article in English | MEDLINE | ID: mdl-27765810

ABSTRACT

Wild-derived mouse inbred strains are becoming increasingly popular for complex traits analysis, evolutionary studies, and systems genetics. Here, we report the whole-genome sequencing of two wild-derived mouse inbred strains, LEWES/EiJ and ZALENDE/EiJ, of Mus musculus domesticus origin. These two inbred strains were selected based on their geographic origin, karyotype, and use in ongoing research. We generated 14× and 18× coverage sequence, respectively, and discovered over 1.1 million novel variants, most of which are private to one of these strains. This report expands the number of wild-derived inbred genomes in the Mus genus from six to eight. The sequence variation can be accessed via an online query tool; variant calls (VCF format) and alignments (BAM format) are available for download from a dedicated ftp site. Finally, the sequencing data have also been stored in a lossless, compressed, and indexed format using the multi-string Burrows-Wheeler transform. All data can be used without restriction.


Subjects
Wild Animals/genetics, Diploidy, Genome, Inbred Mice/genetics, Animals, Wild Animals/classification, Female, Genetic Variation, Genomics/methods, High-Throughput Nucleotide Sequencing, Male, Mice, Inbred Mice/classification, Phylogeny
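
The "lossless" property of the Burrows-Wheeler transform mentioned in this abstract can be shown with a naive single-string version. This is only an illustration of the basic transform and its inverse; it is not the multi-string BWT used for the sequencing data, and the sorted-rotation approach is far too slow and memory-hungry for real read sets.

```python
def bwt(text, sentinel="$"):
    """Naive Burrows-Wheeler transform: last column of the sorted rotations."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def inverse_bwt(transformed, sentinel="$"):
    """Rebuild the original text by repeatedly prepending and sorting columns."""
    table = [""] * len(transformed)
    for _ in range(len(transformed)):
        table = sorted(ch + row for ch, row in zip(transformed, table))
    # The row that ends with the sentinel is the original text plus sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


encoded = bwt("GATTACA")
print(encoded, inverse_bwt(encoded))
```

The transform groups identical characters that share a following context, which is what makes the result compressible and, with auxiliary indexes, searchable.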