Results 1 - 20 of 84
1.
Genet Epidemiol ; 2024 Sep 24.
Article in English | MEDLINE | ID: mdl-39318036

ABSTRACT

The introduction of Next-Generation Sequencing technologies in the clinics has improved rare disease diagnosis. Nonetheless, for very heterogeneous or very rare diseases, more than half of cases still lack molecular diagnosis. Novel strategies are needed to prioritize variants within a single individual. The Population Sampling Probability (PSAP) method was developed to meet this aim but only for coding variants in exome data. Here, we propose an extension of the PSAP method to the non-coding genome called PSAP-genomic-regions. In this extension, instead of considering genes as testing units (PSAP-genes strategy), we use genomic regions defined over the whole genome that pinpoint potential functional constraints. We conceived an evaluation protocol for our method using artificially generated disease exomes and genomes, by inserting coding and non-coding pathogenic ClinVar variants in large data sets of exomes and genomes from the general population. PSAP-genomic-regions significantly improves the ranking of these variants compared to using a pathogenicity score alone. Using PSAP-genomic-regions, more than 50% of non-coding ClinVar variants were among the top 10 variants of the genome. On real sequencing data from six patients with Cerebral Small Vessel Disease and nine patients with male infertility, all causal variants were ranked in the top 100 variants with PSAP-genomic-regions. By revisiting the testing units used in the PSAP method to include non-coding variants, we have developed PSAP-genomic-regions, an efficient whole-genome prioritization tool which offers promising results for the diagnosis of unresolved rare diseases.

2.
Orphanet J Rare Dis ; 19(1): 327, 2024 Sep 06.
Article in English | MEDLINE | ID: mdl-39243101

ABSTRACT

The diagnostic odysseys for rare disease patients are getting shorter as next-generation sequencing becomes more widespread. However, the complex genetic diversity and factors influencing expressivity continue to challenge accurate diagnosis, leaving more than 50% of genetic variants categorized as variants of uncertain significance. Genomic expression intricately hinges on localized interactions among its products. Conventional variant prioritization, biased towards known disease genes and the structure-function paradigm, overlooks the potential impact of variants shaping the composition, location, size, and properties of biomolecular condensates, genuine membraneless organelles swiftly sensing and responding to environmental changes, and modulating expressivity. To address this complexity, we propose to focus on the nexus of genetic variants within biomolecular condensates determinants. Scrutinizing variant effects in these membraneless organelles could refine prioritization, enhance diagnostics, and unveil the molecular underpinnings of rare diseases. Integrating comprehensive genome sequencing, transcriptomics, and computational models can unravel variant pathogenicity and disease mechanisms, enabling precision medicine. This paper presents the rationale driving our proposal and describes a protocol to implement this approach. By fusing state-of-the-art knowledge and methodologies into the clinical practice, we aim to redefine rare diseases diagnosis, leveraging the power of scientific advancement for more informed medical decisions.


Subjects
Rare Diseases, Humans, Genetic Variation/genetics, High-Throughput Nucleotide Sequencing, Rare Diseases/diagnosis, Rare Diseases/genetics
3.
Genet Epidemiol ; 48(7): 324-343, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38940260

ABSTRACT

Family-based sequencing studies are increasingly used to find rare genetic variants of high risk for disease traits with familial clustering. In some studies, families with multiple disease subtypes are collected and the exomes of affected relatives are sequenced for shared rare variants (RVs). Since different families can harbor different causal variants and each family harbors many RVs, tests to detect causal variants can have low power in this study design. Our goal is rather to prioritize shared variants for further investigation by, for example, pathway analyses or functional studies. The transmission-disequilibrium test prioritizes variants based on departures from Mendelian transmission in parent-child trios. Extending this idea to families, we propose methods to prioritize RVs shared in affected relatives with two disease subtypes, with one subtype more heritable than the other. Global approaches condition on a variant being observed in the study and assume a known probability of carrying a causal variant. In contrast, local approaches condition on a variant being observed in specific families to eliminate the carrier probability. Our simulation results indicate that global approaches are robust to misspecification of the carrier probability and prioritize more effectively than local approaches even when the carrier probability is misspecified.


Subjects
Genetic Variation, Humans, Genetic Models, Genetic Predisposition to Disease, Computer Simulation, Pedigree, Family, Exome/genetics, Statistical Models, Linkage Disequilibrium, DNA Sequence Analysis/methods
4.
Trends Cell Biol ; 34(6): 465-483, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38719704

ABSTRACT

Genome-wide association studies (GWASs) provide a key foundation for elucidating the genetic underpinnings of common polygenic diseases. However, these studies have limitations in their ability to assign causality to particular genetic variants, especially those residing in the noncoding genome. Over the past decade, technological and methodological advances in both analytical and empirical prioritization of noncoding variants have enabled the identification of causative variants by leveraging orthogonal functional evidence at increasing scale. In this review, we present an overview of these approaches and describe how this workflow provides the groundwork necessary to move beyond associations toward genetically informed studies on the molecular and cellular mechanisms of polygenic disease.


Subjects
Genome-Wide Association Study, Multifactorial Inheritance, Humans, Multifactorial Inheritance/genetics, Genetic Predisposition to Disease, Genetic Variation, Animals
5.
Hum Genomics ; 18(1): 44, 2024 Apr 29.
Article in English | MEDLINE | ID: mdl-38685113

ABSTRACT

BACKGROUND: A major obstacle faced by families with rare diseases is obtaining a genetic diagnosis. The average "diagnostic odyssey" lasts over five years and causal variants are identified in under 50%, even when capturing variants genome-wide. To aid in the interpretation and prioritization of the vast number of variants detected, computational methods are proliferating. Knowing which tools are most effective remains unclear. To evaluate the performance of computational methods, and to encourage innovation in method development, we designed a Critical Assessment of Genome Interpretation (CAGI) community challenge to place variant prioritization models head-to-head in a real-life clinical diagnostic setting. METHODS: We utilized genome sequencing (GS) data from families sequenced in the Rare Genomes Project (RGP), a direct-to-participant research study on the utility of GS for rare disease diagnosis and gene discovery. Challenge predictors were provided with a dataset of variant calls and phenotype terms from 175 RGP individuals (65 families), including 35 solved training set families with causal variants specified, and 30 unlabeled test set families (14 solved, 16 unsolved). We tasked teams to identify causal variants in as many families as possible. Predictors submitted variant predictions with estimated probability of causal relationship (EPCR) values. Model performance was determined by two metrics, a weighted score based on the rank position of causal variants, and the maximum F-measure, based on precision and recall of causal variants across all EPCR values. RESULTS: Sixteen teams submitted predictions from 52 models, some with manual review incorporated. Top performers recalled causal variants in up to 13 of 14 solved families within the top 5 ranked variants. Newly discovered diagnostic variants were returned to two previously unsolved families following confirmatory RNA sequencing, and two novel disease gene candidates were entered into Matchmaker Exchange. In one example, RNA sequencing demonstrated aberrant splicing due to a deep intronic indel in ASNS, identified in trans with a frameshift variant in an unsolved proband with phenotypes consistent with asparagine synthetase deficiency. CONCLUSIONS: Model methodology and performance was highly variable. Models weighing call quality, allele frequency, predicted deleteriousness, segregation, and phenotype were effective in identifying causal variants, and models open to phenotype expansion and non-coding variants were able to capture more difficult diagnoses and discover new diagnoses. Overall, computational models can significantly aid variant prioritization. For use in diagnostics, detailed review and conservative assessment of prioritized variants against established criteria is needed.


Subjects
Rare Diseases, Humans, Rare Diseases/genetics, Rare Diseases/diagnosis, Human Genome/genetics, Genetic Variation/genetics, Computational Biology/methods, Phenotype
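
The maximum F-measure named in the abstract above (record 5) is computed from precision and recall across EPCR values. A minimal Python sketch of one way to compute such a metric follows. It assumes that each distinct EPCR value is tried as a cut-off and that a variant is "called" when its EPCR is at or above the cut-off; the function name, data layout and variant identifiers are hypothetical and are not taken from the challenge's own scoring code.

```python
def max_f_measure(predictions, causal):
    """Return the best F1 score over all EPCR cut-offs.

    predictions: list of (variant_id, epcr) pairs submitted by a model.
    causal: set of variant_ids known to be causal (hypothetical truth set).
    """
    best = 0.0
    # Try every distinct EPCR value as a threshold.
    thresholds = sorted({epcr for _, epcr in predictions})
    for threshold in thresholds:
        called = {v for v, epcr in predictions if epcr >= threshold}
        true_positives = len(called & causal)
        if true_positives == 0:
            continue  # precision and recall are both zero here
        precision = true_positives / len(called)
        recall = true_positives / len(causal)
        f1 = 2 * precision * recall / (precision + recall)
        best = max(best, f1)
    return best


# Toy example with made-up variant identifiers and EPCR values.
toy_predictions = [("v1", 0.9), ("v2", 0.8), ("v3", 0.4), ("v4", 0.1)]
toy_causal = {"v1", "v3"}
toy_best = max_f_measure(toy_predictions, toy_causal)
```

In this toy case the best trade-off is reached at the cut-off 0.4, where three variants are called and both causal ones are among them.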
6.
Hum Genomics ; 18(1): 28, 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38509596

ABSTRACT

BACKGROUND: In the process of finding the causative variant of rare diseases, accurate assessment and prioritization of genetic variants is essential. Previous variant prioritization tools mainly depend on the in-silico prediction of the pathogenicity of variants, which results in low sensitivity and difficulty in interpreting the prioritization result. In this study, we propose an explainable algorithm for variant prioritization, named 3ASC, with higher sensitivity and ability to annotate evidence used for prioritization. 3ASC annotates each variant with the 28 criteria defined by the ACMG/AMP genome interpretation guidelines and features related to the clinical interpretation of the variants. The system can explain the result based on annotated evidence and feature contributions. RESULTS: We trained various machine learning algorithms using in-house patient data. The performance of variant ranking was assessed using the recall rate of identifying causative variants in the top-ranked variants. The best practice model was a random forest classifier that showed top 1 recall of 85.6% and top 3 recall of 94.4%. The 3ASC annotates the ACMG/AMP criteria for each genetic variant of a patient so that clinical geneticists can interpret the result as in the CAGI6 SickKids challenge. In the challenge, 3ASC identified causal genes for 10 out of 14 patient cases, with evidence of decreased gene expression for 6 cases. Among them, two genes (HDAC8 and CASK) had decreased gene expression profiles confirmed by transcriptome data. CONCLUSIONS: 3ASC can prioritize genetic variants with higher sensitivity compared to previous methods by integrating various features related to clinical interpretation, including features related to false positive risk such as quality control and disease inheritance pattern. The system allows interpretation of each variant based on the ACMG/AMP criteria and feature contribution assessed using explainable AI techniques.


Subjects
Algorithms, Rare Diseases, Humans, Rare Diseases/diagnosis, Rare Diseases/genetics, Genetic Testing, Machine Learning, Genetic Variation/genetics, Histone Deacetylases/genetics, Repressor Proteins/genetics
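
The "top 1" and "top 3" recall figures quoted in the abstract above (record 6) are rank-based metrics. A minimal Python sketch of how a top-k recall could be computed is given below, assuming exactly one known causative variant per patient. The function name, the dictionary layout and the identifiers are hypothetical and are not taken from 3ASC.

```python
def top_k_recall(ranked_variants, causative, k):
    """Fraction of patients whose causative variant is ranked in the top k.

    ranked_variants: dict mapping patient_id -> list of variant_ids, best first.
    causative: dict mapping patient_id -> the single known causative variant_id.
    """
    hits = 0
    for patient_id, ranking in ranked_variants.items():
        # A hit means the causative variant appears among the first k entries.
        if causative[patient_id] in ranking[:k]:
            hits += 1
    return hits / len(ranked_variants)


# Toy example with made-up patients and variants.
toy_rankings = {
    "p1": ["a", "b", "c"],
    "p2": ["d", "e", "f"],
    "p3": ["g", "h", "i"],
}
toy_causative = {"p1": "a", "p2": "f", "p3": "z"}
toy_top1 = top_k_recall(toy_rankings, toy_causative, 1)
toy_top3 = top_k_recall(toy_rankings, toy_causative, 3)
```

A patient whose causative variant was never ranked (here "p3") counts as a miss at every k, which is why such variants lower the recall ceiling.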
7.
Genes (Basel) ; 15(3), 2024 Mar 17.
Article in English | MEDLINE | ID: mdl-38540429

ABSTRACT

Genomic variant prioritization is crucial for identifying disease-associated genetic variations. Integrating facial and clinical feature analyses into this process enhances performance. This study demonstrates the integration of facial analysis (GestaltMatcher) and Human Phenotype Ontology analysis (CADA) within VarFish, an open-source variant analysis framework. Challenges related to non-open-source components were addressed by providing an open-source version of GestaltMatcher, facilitating on-premise facial analysis to address data privacy concerns. Performance evaluation on 163 patients recruited from a German multi-center study of rare diseases showed PEDIA's superior accuracy in variant prioritization compared to individual scores. This study highlights the importance of further benchmarking and future integration of advanced facial analysis approaches aligned with ACMG guidelines to enhance variant classification.


Subjects
Rare Diseases, Humans, Phenotype, Rare Diseases/genetics
8.
Genome Med ; 15(1): 68, 2023 Sep 07.
Article in English | MEDLINE | ID: mdl-37679823

ABSTRACT

BACKGROUND: Whole-exome sequencing (WES) and whole-genome sequencing (WGS) have become indispensable tools to solve rare Mendelian genetic conditions. Nevertheless, there is still an urgent need for sensitive, fast algorithms to maximise WES/WGS diagnostic yield in rare disease patients. Most tools devoted to this aim take advantage of patient phenotype information for prioritization of genomic data, although are often limited by incomplete gene-phenotype knowledge stored in biomedical databases and a lack of proper benchmarking on real-world patient cohorts. METHODS: We developed ClinPrior, a novel method for the analysis of WES/WGS data that ranks candidate causal variants based on the patient's standardized phenotypic features (in Human Phenotype Ontology (HPO) terms). The algorithm propagates the data through an interactome network-based prioritization approach. This algorithm was thoroughly benchmarked using a synthetic patient cohort and was subsequently tested on a heterogeneous prospective, real-world series of 135 families affected by hereditary spastic paraplegia (HSP) and/or cerebellar ataxia (CA). RESULTS: ClinPrior successfully identified causative variants achieving a final positive diagnostic yield of 70% in our real-world cohort. This includes 10 novel candidate genes not previously associated with disease, 7 of which were functionally validated within this project. We used the knowledge generated by ClinPrior to create a specific interactome for HSP/CA disorders thus enabling future diagnoses as well as the discovery of novel disease genes. CONCLUSIONS: ClinPrior is an algorithm that uses standardized phenotype information and interactome data to improve clinical genomic diagnosis. It helps in identifying atypical cases and efficiently predicts novel disease-causing genes. This leads to increasing diagnostic yield, shortening of the diagnostic Odysseys and advancing our understanding of human illnesses.


Subjects
Algorithms, Genomics, Humans, Prospective Studies, Factual Databases, Genetic Association Studies
9.
Front Pediatr ; 11: 1203289, 2023.
Article in English | MEDLINE | ID: mdl-37593442

ABSTRACT

Genetic mutations are critical factors leading to congenital surgical diseases and can be identified through genomic analysis. Early and accurate identification of genetic mutations underlying these conditions is vital for clinical diagnosis and effective treatment. In recent years, artificial intelligence (AI) has been widely applied for analyzing genomic data in various clinical settings, including congenital surgical diseases. This review paper summarizes current state-of-the-art AI-based approaches used in genomic analysis and highlights some successful applications that deepen our understanding of the etiology of several congenital surgical diseases. We focus on the AI methods designed for the detection of different variant types and the prioritization of deleterious variants located in different genomic regions, aiming to uncover susceptibility genomic mutations contributing to congenital surgical disorders.

10.
medRxiv ; 2023 Aug 04.
Article in English | MEDLINE | ID: mdl-37577678

ABSTRACT

Background: A major obstacle faced by rare disease families is obtaining a genetic diagnosis. The average "diagnostic odyssey" lasts over five years, and causal variants are identified in under 50%. The Rare Genomes Project (RGP) is a direct-to-participant research study on the utility of genome sequencing (GS) for diagnosis and gene discovery. Families are consented for sharing of sequence and phenotype data with researchers, allowing development of a Critical Assessment of Genome Interpretation (CAGI) community challenge, placing variant prioritization models head-to-head in a real-life clinical diagnostic setting. Methods: Predictors were provided a dataset of phenotype terms and variant calls from GS of 175 RGP individuals (65 families), including 35 solved training set families, with causal variants specified, and 30 test set families (14 solved, 16 unsolved). The challenge tasked teams with identifying the causal variants in as many test set families as possible. Ranked variant predictions were submitted with estimated probability of causal relationship (EPCR) values. Model performance was determined by two metrics, a weighted score based on rank position of true positive causal variants and maximum F-measure, based on precision and recall of causal variants across EPCR thresholds. Results: Sixteen teams submitted predictions from 52 models, some with manual review incorporated. Top performing teams recalled the causal variants in up to 13 of 14 solved families by prioritizing high quality variant calls that were rare, predicted deleterious, segregating correctly, and consistent with reported phenotype. In unsolved families, newly discovered diagnostic variants were returned to two families following confirmatory RNA sequencing, and two prioritized novel disease gene candidates were entered into Matchmaker Exchange. In one example, RNA sequencing demonstrated aberrant splicing due to a deep intronic indel in ASNS, identified in trans with a frameshift variant, in an unsolved proband with phenotype overlap with asparagine synthetase deficiency. Conclusions: By objective assessment of variant predictions, we provide insights into current state-of-the-art algorithms and platforms for genome sequencing analysis for rare disease diagnosis and explore areas for future optimization. Identification of diagnostic variants in unsolved families promotes synergy between researchers with clinical and computational expertise as a means of advancing the field of clinical genome interpretation.

11.
BMC Bioinformatics ; 24(1): 294, 2023 Jul 21.
Article in English | MEDLINE | ID: mdl-37479972

ABSTRACT

BACKGROUND: Identifying variants associated with diseases is a challenging task in medical genetics research. Current studies that prioritize variants within individual genomes generally rely on known variants, evidence from literature and genomes, and patient symptoms and clinical signs. The functionalities of the existing tools, which rank variants based on given patient symptoms and clinical signs, are restricted to the coverage of ontologies such as the Human Phenotype Ontology (HPO). However, most clinicians do not limit themselves to HPO while describing patient symptoms/signs and their associated variants/genes. There is thus a need for an automated tool that can prioritize variants based on freely expressed patient symptoms and clinical signs. RESULTS: STARVar is a Symptom-based Tool for Automatic Ranking of Variants using evidence from literature and genomes. STARVar uses patient symptoms and clinical signs, either linked to HPO or expressed in free text format. It returns a ranked list of variants based on a combined score from two classifiers utilizing evidence from genomics and literature. STARVar improves over related tools on a set of synthetic patients. In addition, we demonstrated its distinct contribution to the domain on another synthetic dataset covering publicly available clinical genotype-phenotype associations by using symptoms and clinical signs expressed in free text format. CONCLUSIONS: STARVar stands as a unique and efficient tool that has the advantage of ranking variants with flexibly expressed patient symptoms in free-form text. Therefore, STARVar can be easily integrated into bioinformatics workflows designed to analyze disease-associated genomes. AVAILABILITY: STARVar is freely available from https://github.com/bio-ontology-research-group/STARVar .


Subjects
Genomics, Software, Humans, Phenotype, Computational Biology, Genetic Association Studies
12.
Genet Med ; 25(7): 100862, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37092535

ABSTRACT

PURPOSE: Disease-specific pathogenic variant prediction tools that differentiate pathogenic variants from benign have been improved through disease specificity recently. However, they have not been evaluated on disease-specific pathogenic variants compared with other diseases, which would help to prioritize disease-specific variants from several genes or novel genes. Thus, we hypothesize that features of pathogenic variants alone would provide a better model. METHODS: We developed an eye disease-specific variant prioritization tool (eyeVarP), which applied the random forest algorithm to the data set of pathogenic variants of eye diseases and other diseases. We also developed the VarP tool and generalized pipeline to filter missense and insertion-deletion variants and predict their pathogenicity from exome or genome sequencing data, thus we provide a complete computational procedure. RESULTS: eyeVarP outperformed pan disease-specific tools in identifying eye disease-specific pathogenic variants under the top 10. VarP outperformed 12 pathogenicity prediction tools with an accuracy of 95% in correctly identifying the pathogenicity of missense and insertion-deletion variants. The complete pipeline would help to develop disease-specific tools for other genetic disorders. CONCLUSION: eyeVarP performs better in identifying eye disease-specific pathogenic variants using pathogenic variant features and gene features. Implementing such complete computational procedure would significantly improve the clinical variant interpretation for specific diseases.


Subjects
Eye Diseases, Humans, Eye Diseases/diagnosis, Eye Diseases/genetics, Computational Biology/methods
13.
Gene ; 870: 147384, 2023 Jun 20.
Article in English | MEDLINE | ID: mdl-37001572

ABSTRACT

BACKGROUND: High altitude pulmonary edema (HAPE) is a high-altitude idiopathic disease with serious consequences due to hypoxia at high altitude, and there is individual genetic susceptibility. Whole-exome sequencing (WES) is an effective tool for studying the genetic etiology of HAPE and can identify potentially novel mutations that may cause protein instability and may contribute to the development of HAPE. MATERIALS AND METHODS: A total of 50 unrelated HAPE patients were examined using WES, and the available bioinformatics tools were used to perform an analysis of exonic regions. Using the Phenolyzer program, disease candidate gene analysis was carried out. SIFT, PolyPhen-2, Mutation Taster, CADD, DANN, and I-Mutant software were used to assess the effects of genetic variations on protein function. RESULTS: The results showed that rs368502694 (p. R1022Q) located in NOS3, rs1595850639 (p. G61S) located in MYBPC3, and rs1367895529 (p. R333H) located in ITGAV were correlated with a high risk of HAPE, and thus could be regarded as potential genetic variations associated with HAPE. CONCLUSION: WES was used in this study for the first time to directly screen genetic variations related to HAPE. Notably, our study offers fresh information for the subsequent investigation into the etiology of HAPE.


Subjects
Altitude Sickness, Pulmonary Edema, Humans, Pulmonary Edema/genetics, Altitude, Exome Sequencing, Altitude Sickness/genetics
14.
Int J Mol Sci ; 24(2), 2023 Jan 14.
Article in English | MEDLINE | ID: mdl-36675175

ABSTRACT

Screening for pathogenic variants in the diagnosis of rare genetic diseases can now be performed on all genes thanks to the application of whole exome and genome sequencing (WES, WGS). Yet the repertoire of gene-disease associations is not complete. Several computer-based algorithms and databases integrate distinct gene-gene functional networks to accelerate the discovery of gene-disease associations. We hypothesize that the ability of every type of information to extract relevant insights is disease-dependent. We compiled 33 functional networks classified into 13 knowledge categories (KCs) and observed large variability in their ability to recover genes associated with 91 genetic diseases, as measured using efficiency and exclusivity. We developed GLOWgenes, a network-based algorithm that applies random walk with restart to evaluate KCs' ability to recover genes from a given list associated with a phenotype and modulates the prediction of new candidates accordingly. Comparison with other integration strategies and tools shows that our disease-aware approach can boost the discovery of new gene-disease associations, especially for the less obvious ones. KC contribution also varies if obtained using recently discovered genes. Applied to 15 unsolved WES, GLOWgenes proposed three new genes to be involved in the phenotypes of patients with syndromic inherited retinal dystrophies.


Subjects
Algorithms, Rare Diseases, Humans, Rare Diseases/genetics, Phenotype, Chromosome Mapping
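
The abstract above (record 14) names random walk with restart as the propagation step. The sketch below shows the generic technique in plain Python on a small unweighted, undirected gene network; it is not the GLOWgenes implementation. The function name, the restart probability of 0.3, the convergence settings and the gene labels are all assumptions chosen for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score every node by its steady-state visiting probability.

    adjacency: dict mapping node -> list of neighbour nodes (every node is a key).
    seeds: set of known disease genes; the walker restarts uniformly among them.
    """
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # With probability `restart` the walker jumps back to a seed.
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            neighbours = adjacency[n]
            if not neighbours:
                # Isolated node: send its probability mass back to the seeds.
                for m in nodes:
                    nxt[m] += (1.0 - restart) * p[n] * p0[m]
                continue
            # Otherwise the walker moves to a uniformly chosen neighbour.
            share = (1.0 - restart) * p[n] / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy path network A - B - C - D with a single seed gene "A".
toy_network = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
toy_scores = random_walk_with_restart(toy_network, {"A"})
```

Genes closer to the seed in the network end up with higher scores, which is the property that makes the technique usable for proposing new candidate genes from a list of known ones.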
15.
Clin Genet ; 103(2): 190-199, 2023 Feb.
Article in English | MEDLINE | ID: mdl-36309956

ABSTRACT

Variant prioritization is a crucial step in the analysis of exome and genome sequencing. Multiple phenotype-driven tools have been developed to automate the variant prioritization process, but the efficacy of these tools in clinical setting with fuzzy phenotypic information and whether ensemble of these tools could outperform single algorithm remains to be assessed. A large rare disease cohort with heterogeneous phenotypic information, including a primary cohort of 1614 patients and a replication cohort of 1904 patients referred to exome sequencing, were recruited to assess the efficacy of variant prioritization and their ensemble. Three freely available tools-Exomiser, Xrare, and DeepPVP-and their ensemble were evaluated. The performance of all three tools was influenced by the attributes of phenotypic input. When combining these three tools by weighted-sum entropy method (EWE3), the ensemble outperformed any single algorithm, achieving a rate of 78% diagnostic variants in top 3 (13% improvement over current best performer, compared to Exomiser: 63%, Xrare: 65%, and DeepPVP: 51%), 88% in top 10 and 96% in top 30. The results were replicated in another independent cohort. Our study supports using entropy-weighted ensemble of multiple tools to improve variant prioritization and accelerate molecular diagnosis in exome/genome sequencing.


Subjects
Algorithms, Exome, Humans, Exome/genetics, Entropy, Phenotype, Rare Diseases/genetics, Software
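
The abstract above (record 15) combines three tools with an entropy-weighted sum but does not spell out the procedure. The Python sketch below shows the standard entropy weight method as one plausible reading: a tool whose scores are spread evenly over the candidates carries little information and receives a low weight. The function names, the requirement of non-negative scores and the toy score matrix are assumptions, and the real EWE3 procedure may differ.

```python
import math


def entropy_weights(score_matrix):
    """Compute one weight per tool from the spread of its scores.

    score_matrix: list of rows, one per candidate variant (at least two rows);
    each row holds one non-negative score per tool.
    """
    n_variants = len(score_matrix)
    n_tools = len(score_matrix[0])
    k = 1.0 / math.log(n_variants)  # scales entropy into the range [0, 1]
    diversities = []
    for j in range(n_tools):
        column = [row[j] for row in score_matrix]
        total = sum(column)
        entropy = 0.0
        for x in column:
            if x > 0:
                share = x / total
                entropy -= k * share * math.log(share)
        # Low entropy means the tool discriminates strongly between variants.
        diversities.append(1.0 - entropy)
    norm = sum(diversities)
    return [d / norm for d in diversities]


def ensemble_scores(score_matrix):
    """Weighted sum of the per-tool scores for every candidate variant."""
    weights = entropy_weights(score_matrix)
    return [sum(w * x for w, x in zip(weights, row)) for row in score_matrix]


# Toy scores for three variants (rows) from three hypothetical tools (columns).
toy_matrix = [
    [0.9, 0.5, 0.2],
    [0.1, 0.5, 0.3],
    [0.0, 0.5, 0.5],
]
toy_weights = entropy_weights(toy_matrix)
toy_combined = ensemble_scores(toy_matrix)
```

In the toy matrix the second tool gives every variant the same score, so its weight is effectively zero and the ranking is driven by the other two tools.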
16.
Genomics Proteomics Bioinformatics ; 21(2): 385-395, 2023 Apr.
Article in English | MEDLINE | ID: mdl-34973416

ABSTRACT

Non-coding genomic variants constitute the majority of trait-associated genome variations; however, the identification of functional non-coding variants is still a challenge in human genetics, and a method for systematically assessing the impact of regulatory variants on gene expression and linking these regulatory variants to potential target genes is still lacking. Here, we introduce a deep neural network (DNN)-based computational framework, RegVar, which can accurately predict the tissue-specific impact of non-coding regulatory variants on target genes. We show that by robustly learning the genomic characteristics of massive variant-gene expression associations in a variety of human tissues, RegVar vastly surpasses all current non-coding variant prioritization methods in predicting regulatory variants under different circumstances. The unique features of RegVar make it an excellent framework for assessing the regulatory impact of any variant on its putative target genes in a variety of tissues. RegVar is available as a web server at https://regvar.omic.tech/.


Subjects
Genomics, Computer Neural Networks, Humans, Single Nucleotide Polymorphism, Genome-Wide Association Study
17.
Int J Mol Sci ; 23(21), 2022 Oct 26.
Article in English | MEDLINE | ID: mdl-36361767

ABSTRACT

The advent of Whole Genome Sequencing (WGS) broadened the genetic variation detection range, revealing the presence of variants even in non-coding regions of the genome, which would have been missed using targeted approaches. One of the most challenging issues in WGS analysis regards the interpretation of annotated variants. This review focuses on tools suitable for the functional annotation of variants falling into non-coding regions. It couples the description of non-coding genomic areas with the results and performance of existing tools for a functional interpretation of the effect of variants in these regions. Tools were tested in a controlled genomic scenario, representing the ground-truth and allowing us to determine software performance.


Subjects
Genomics, Software, Humans, Genomics/methods, Whole Genome Sequencing/methods, Genome, Human Genome
18.
Front Genet ; 13: 982930, 2022.
Article in English | MEDLINE | ID: mdl-36246618

ABSTRACT

Background: Existing BRCA2-specific variant pathogenicity prediction algorithms focus on the prediction of the functional impact of a subtype of variants alone. General variant effect predictors are applicable to all subtypes, but are trained on putative benign and pathogenic variants and do not account for gene-specific information, such as hotspots of pathogenic variants. Local, gene-specific information has been shown to aid variant pathogenicity prediction; therefore, our aim was to develop a BRCA2-specific machine learning model to predict pathogenicity of all types of BRCA2 variants. Methods: We developed an XGBoost-based machine learning model to predict pathogenicity of BRCA2 variants. The model utilizes general variant information such as position, frequency, and consequence for the canonical BRCA2 transcript, as well as deleteriousness prediction scores from several tools. We trained the model on 80% of the variants expert-reviewed by the Evidence-Based Network for the Interpretation of Germline Mutant Alleles (ENIGMA) consortium and tested its performance on the remaining 20%, as well as on an independent set of variants of uncertain significance with experimentally determined functional scores. Results: The novel gene-specific model predicted the pathogenicity of ENIGMA BRCA2 variants with an accuracy of 99.9%. The model also performed excellently on predicting the functional consequence of the independent set of variants (accuracy was up to 91.3%). Conclusion: This new, gene-specific model is an accurate method for interpreting the pathogenicity of variants in the BRCA2 gene. It is a valuable addition for variant classification and can prioritize unreviewed variants for functional analysis or expert review.

19.
Hum Mutat ; 43(12): 2010-2020, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36054330

ABSTRACT

Most causal variants of Mendelian diseases are exonic. Whole-exome sequencing (WES) has become the diagnostic gold standard, but causative variant prioritization constitutes a bottleneck. Here we assessed an in-house sample-to-sequence pipeline and benchmarked free prioritization tools for germline causal variants from WES data. WES of 61 unselected patients with a known genetic disease cause was obtained. Variant prioritizations were performed by diverse tools and recorded to obtain a diagnostic yield when the causal variant was present in the first, fifth, and 10th top rankings. A fraction of causal variants was not captured by WES (8.2%) or did not pass the quality control criteria (13.1%). Most of the applications inspected were unavailable or had technical limitations, leaving nine tools for complete evaluation. Exomiser performed best in the top first rankings, while LIRICAL led in the top fifth rankings. Based on the more conservative top 10th rankings, Xrare had the highest diagnostic yield, followed by a three-way tie among Exomiser, LIRICAL, and PhenIX, then followed by AMELIE, TAPES, Phen-Gen, AIVar, and VarNote-PAT. Xrare, Exomiser, LIRICAL, and PhenIX are the most efficient options for variant prioritization in real patient WES data.


Subjects
Exome, Germ-Line Mutation, Humans, Exome Sequencing, Exome/genetics
20.
Trends Genet ; 38(12): 1271-1283, 2022 Dec.
Article in English | MEDLINE | ID: mdl-35934592

ABSTRACT

A molecular diagnosis from the analysis of sequencing data in rare Mendelian diseases has a huge impact on the management of patients and their families. Numerous patient phenotype-aware variant prioritisation (VP) tools have been developed to help automate this process, and shorten the diagnostic odyssey, but performance statistics on real patient data are limited. Here we identify, assess, and compare the performance of all up-to-date, freely available, and programmatically accessible tools using a whole-exome, retinal disease dataset from 134 individuals with a molecular diagnosis. All tools were able to identify around two-thirds of the genetic diagnoses as the top-ranked candidate, with LIRICAL performing best overall. Finally, we discuss the challenges to overcome most cases remaining undiagnosed after current, state-of-the-art practices.


Subjects
Exome, Rare Diseases, Humans, Phenotype, Exome Sequencing, Rare Diseases/diagnosis, Rare Diseases/genetics