Results 1 - 20 of 35
1.
Nature ; 591(7849): 211-219, 2021 03.
Article in English | MEDLINE | ID: mdl-33692554

ABSTRACT

Polygenic risk scores (PRSs), which often aggregate results from genome-wide association studies, can bridge the gap between initial discovery efforts and clinical applications for the estimation of disease risk using genetics. However, there is notable heterogeneity in the application and reporting of these risk scores, which hinders the translation of PRSs into clinical care. Here, in a collaboration between the Clinical Genome Resource (ClinGen) Complex Disease Working Group and the Polygenic Score (PGS) Catalog, we present the Polygenic Risk Score Reporting Standards (PRS-RS), in which we update the Genetic Risk Prediction Studies (GRIPS) Statement to reflect the present state of the field. Drawing on the input of experts in epidemiology, statistics, disease-specific applications, implementation and policy, this comprehensive reporting framework defines the minimal information that is needed to interpret and evaluate PRSs, especially with respect to downstream clinical applications. Items span detailed descriptions of study populations, statistical methods for the development and validation of PRSs and considerations for the potential limitations of these scores. In addition, we emphasize the need for data availability and transparency, and we encourage researchers to deposit and share PRSs through the PGS Catalog to facilitate reproducibility and comparative benchmarking. By providing these criteria in a structured format that builds on existing standards and ontologies, the use of this framework in publishing PRSs will facilitate translation into clinical care and progress towards defining best practice.


Subject(s)
Genetic Predisposition to Disease; Genetics, Medical/standards; Multifactorial Inheritance/genetics; Humans; Reproducibility of Results; Risk Assessment/standards
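
As a rough illustration of the aggregation this abstract mentions, a polygenic score is commonly computed as a weighted sum of allele dosages, with per-variant weights taken from a genome-wide association study. The sketch below assumes that additive form; the variant IDs, weights, and dosages are invented for illustration and do not come from the paper or the PGS Catalog.

```python
def polygenic_score(weights, dosages):
    """Weighted sum of allele dosages.

    weights: variant ID -> effect size (hypothetical values here)
    dosages: variant ID -> number of effect alleles carried (0, 1, or 2)
    Variants without a dosage are skipped rather than imputed.
    """
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)


# Hypothetical three-variant score for one individual.
weights = {"rs_a": 0.12, "rs_b": -0.05, "rs_c": 0.30}
dosages = {"rs_a": 2, "rs_b": 1, "rs_c": 0}
score = polygenic_score(weights, dosages)
print(f"score = {score:.2f}")
```

Details such as how the weights were derived, which allele is the effect allele, and how missing variants are handled are among the things a reader would need reported in order to reproduce a score like this.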
2.
Am J Hum Genet ; 108(7): 1239-1250, 2021 07 01.
Article in English | MEDLINE | ID: mdl-34129815

ABSTRACT

Despite release of the GRCh38 human reference genome more than seven years ago, GRCh37 remains more widely used by most research and clinical laboratories. To date, no study has quantified the impact of utilizing different reference assemblies for the identification of variants associated with rare and common diseases from large-scale exome-sequencing data. By calling variants on both the GRCh37 and GRCh38 references, we identified single-nucleotide variants (SNVs) and insertion-deletions (indels) in 1,572 exomes from participants with Mendelian diseases and their family members. We found that a total of 1.5% of SNVs and 2.0% of indels were discordant when different references were used. Notably, 76.6% of the discordant variants were clustered within discrete discordant reference patches (DISCREPs) comprising only 0.9% of loci targeted by exome sequencing. These DISCREPs were enriched for genomic elements including segmental duplications, fix patch sequences, and loci known to contain alternate haplotypes. We identified 206 genes significantly enriched for discordant variants, most of which were in DISCREPs and caused by multi-mapped reads on the reference assembly that lacked the variant call. Among these 206 genes, eight are implicated in known Mendelian diseases and 53 are associated with common phenotypes from genome-wide association studies. In addition, variant interpretations could also be influenced by the reference after lifting-over variant loci to another assembly. Overall, we identified genes and genomic loci affected by reference assembly choice, including genes associated with Mendelian disorders and complex human diseases that require careful evaluation in both research and clinical applications.


Subject(s)
Exome; Genome, Human; Polymorphism, Single Nucleotide; Cohort Studies; Genetic Diseases, Inborn/genetics; Humans; Reference Values
3.
Am J Hum Genet ; 107(5): 932-941, 2020 11 05.
Article in English | MEDLINE | ID: mdl-33108757

ABSTRACT

Harmonization of variant pathogenicity classification across laboratories is important for advancing clinical genomics. The two CLIA-accredited Electronic Medical Record and Genomics Network sequencing centers and the six CLIA-accredited laboratories and one research laboratory performing genome or exome sequencing in the Clinical Sequencing Evidence-Generating Research Consortium collaborated to explore current sources of discordance in classification. Eight laboratories each submitted 20 classified variants in the ACMG secondary finding v.2.0 genes. After removing duplicates, each of the 158 variants was annotated and independently classified by two additional laboratories using the ACMG-AMP guidelines. Overall concordance across three laboratories was assessed and discordant variants were reviewed via teleconference and email. The submitted variant set included 28 P/LP variants, 96 VUS, and 34 LB/B variants, mostly in cancer (40%) and cardiac (27%) risk genes. Eighty-six (54%) variants reached complete five-category (i.e., P, LP, VUS, LB, B) concordance, and 17 (11%) had a discordance that could affect clinical recommendations (P/LP versus VUS/LB/B). 21% and 63% of variants submitted as P and LP, respectively, were discordant with VUS. Of the 54 originally discordant variants that underwent further review, 32 reached agreement, for a post-review concordance rate of 84% (118/140 variants). This project provides an updated estimate of variant concordance, identifies considerations for LP classified variants, and highlights ongoing sources of discordance. Continued and increased sharing of variant classifications and evidence across laboratories, and the ongoing work of ClinGen to provide general as well as gene- and disease-specific guidance, will lead to continued increases in concordance.


Subject(s)
Cardiovascular Diseases/genetics; Genetic Variation; Genomics/standards; Laboratories/standards; Neoplasms/genetics; Cardiovascular Diseases/diagnosis; Computational Biology/methods; Genetic Testing; Genetics, Medical/methods; Genome, Human; High-Throughput Nucleotide Sequencing; Humans; Laboratory Proficiency Testing/statistics & numerical data; Neoplasms/diagnosis; Sequence Analysis, DNA; Software; Terminology as Topic
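
The concordance figures in this abstract can be re-derived with simple arithmetic. The sketch below assumes, as one reading consistent with the reported numbers, that the post-review denominator of 140 is the 86 fully concordant variants plus the 54 variants that went to review; the variable names are illustrative.

```python
# Counts as reported in the abstract.
submitted = {"P/LP": 28, "VUS": 96, "LB/B": 34}
total_variants = sum(submitted.values())  # 158
fully_concordant = 86        # complete five-category agreement
clinically_discordant = 17   # P/LP versus VUS/LB/B
reviewed = 54                # discordant variants that underwent review
resolved = 32                # reviewed variants that reached agreement

initial_pct = 100 * fully_concordant / total_variants
clinical_pct = 100 * clinically_discordant / total_variants

# Assumed reading: post-review set = concordant variants + reviewed variants.
post_review_total = fully_concordant + reviewed
post_review_concordant = fully_concordant + resolved
post_review_pct = 100 * post_review_concordant / post_review_total

print(f"initial concordance: {initial_pct:.0f}%")
print(f"clinically relevant discordance: {clinical_pct:.0f}%")
print(f"post-review concordance: {post_review_pct:.0f}% "
      f"({post_review_concordant}/{post_review_total})")
```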
4.
Hum Mutat ; 43(8): 1114-1121, 2022 08.
Article in English | MEDLINE | ID: mdl-34923710

ABSTRACT

The All of Us Research Program (AoURP) is a historic effort to accelerate research and improve healthcare by generating and collating data from one million people in the United States. Participants will have the option to receive results from their genome analysis, including actionable findings in 59 gene-disorder pairs for which disorder-associated variants are recommended for return by the American College of Medical Genetics and Genomics. To ensure consistent reporting across the AoURP, in a prelaunch study the four participating clinical laboratories shared all variant classifications in the 59 genes of interest from their internal databases. Of the 11,813 unique variants classified by at least two of the four laboratories, classifications were concordant with regard to reportability for 99.1% (11,711), with only 0.9% (102) having reportability differences. Through variant reassessment, data sharing, and discussion of rationale, participating laboratories resolved all 102 reportable differences. These approaches will be maintained during routine AoU reporting to ensure continuous classification harmonization and consistent reporting within AoURP.


Subject(s)
Genome, Human; Population Health; Genetic Testing/methods; Genetic Variation; Genome, Human/genetics; Genomics/methods; Humans; United States
5.
Genet Med ; 24(5): 1062-1072, 2022 05.
Article in English | MEDLINE | ID: mdl-35331649

ABSTRACT

PURPOSE: The Mayo-Baylor RIGHT 10K Study enabled preemptive, sequence-based pharmacogenomics (PGx)-driven drug prescribing practices in routine clinical care within a large cohort. We also generated the tools and resources necessary for clinical PGx implementation and identified challenges that need to be overcome. Furthermore, we measured the frequency of both common genetic variation for which clinical guidelines already exist and rare variation that could be detected by DNA sequencing, rather than genotyping. METHODS: Targeted oligonucleotide-capture sequencing of 77 pharmacogenes was performed using DNA from 10,077 consented Mayo Clinic Biobank volunteers. The resulting predicted drug response-related phenotypes for 13 genes, including CYP2D6 and HLA, affecting 21 drug-gene pairs, were deposited preemptively in the Mayo electronic health record. RESULTS: For the 13 pharmacogenes of interest, the genomes of 79% of participants carried clinically actionable variants in 3 or more genes, and DNA sequencing identified an average of 3.3 additional conservatively predicted deleterious variants that would not have been evident using genotyping. CONCLUSION: Implementation of preemptive rather than reactive and sequence-based rather than genotype-based PGx prescribing revealed nearly universal patient applicability and required integrated institution-wide resources to fully realize individualized drug therapy and to show more efficient use of health care resources.


Subject(s)
Cytochrome P-450 CYP2D6; Pharmacogenetics; Academic Medical Centers; Base Sequence; Cytochrome P-450 CYP2D6/genetics; Genotype; Humans; Pharmacogenetics/methods
6.
Genet Med ; 23(12): 2404-2414, 2021 12.
Article in English | MEDLINE | ID: mdl-34363016

ABSTRACT

PURPOSE: Cardiovascular disease (CVD) is the leading cause of death in adults in the United States, yet the benefits of genetic testing are not universally accepted. METHODS: We developed the "HeartCare" panel of genes associated with CVD, evaluating high-penetrance Mendelian conditions, coronary artery disease (CAD) polygenic risk, LPA gene polymorphisms, and specific pharmacogenetic (PGx) variants. We enrolled 709 individuals from cardiology clinics at Baylor College of Medicine, and samples were analyzed in a CAP/CLIA-certified laboratory. Results were returned to the ordering physician and uploaded to the electronic medical record. RESULTS: Notably, 32% of patients had a genetic finding with clinical management implications, even after excluding PGx results, including 9% who were molecularly diagnosed with a Mendelian condition. Among surveyed physicians, 84% reported medical management changes based on these results, including specialist referrals, cardiac tests, and medication changes. LPA polymorphisms and high polygenic risk of CAD were found in 20% and 9% of patients, respectively, leading to diet, lifestyle, and other changes. Warfarin and simvastatin pharmacogenetic variants were present in roughly half of the cohort. CONCLUSION: Our results support the use of genetic information in routine cardiovascular health management and provide a roadmap for accompanying research.


Subject(s)
Cardiology; Cardiovascular Diseases; Adult; Cardiovascular Diseases/diagnosis; Cardiovascular Diseases/genetics; Cardiovascular Diseases/therapy; Genetic Testing; Humans; Pharmacogenetics/methods; Pharmacogenomic Testing; United States
7.
J Biomed Inform ; 118: 103795, 2021 06.
Article in English | MEDLINE | ID: mdl-33930535

ABSTRACT

Structured representation of clinical genetic results is necessary for advancing precision medicine. The Electronic Medical Records and Genomics (eMERGE) Network's Phase III program initially used a commercially developed XML message format for standardized and structured representation of genetic results for electronic health record (EHR) integration. In a desire to move towards a standard representation, the network created a new standardized format based upon Health Level Seven Fast Healthcare Interoperability Resources (HL7® FHIR®), to represent clinical genomics results. These new standards improve the utility of HL7® FHIR® as an international healthcare interoperability standard for management of genetic data from patients. This work advances the establishment of standards that are being designed for broad adoption in the current health information technology landscape.


Subject(s)
Electronic Health Records; Medical Informatics; Genomics; Health Level Seven; Humans; Precision Medicine
8.
Genet Med ; 21(9): 2135-2144, 2019 09.
Article in English | MEDLINE | ID: mdl-30890783

ABSTRACT

PURPOSE: To provide a validated method to confidently identify exon-containing copy-number variants (CNVs), with a low false discovery rate (FDR), in targeted sequencing data from a clinical laboratory with particular focus on single-exon CNVs. METHODS: DNA sequence coverage data are normalized within each sample and subsequently exonic CNVs are identified in a batch of samples, when the target log2 ratio of the sample to the batch median exceeds defined thresholds. The quality of exonic CNV calls is assessed by C-scores (Z-like scores) using thresholds derived from gold standard samples and simulation studies. We integrate an ExonQC threshold to lower FDR and compare performance with alternate software (VisCap). RESULTS: Thirteen CNVs were used as a truth set to validate Atlas-CNV and compared with VisCap. We demonstrated FDR reduction in validation, simulation, and 10,926 eMERGESeq samples without sensitivity loss. Sixty-four multiexon and 29 single-exon CNVs with high C-scores were assessed by Multiplex Ligation-dependent Probe Amplification (MLPA). CONCLUSION: Atlas-CNV is validated as a method to identify exonic CNVs in targeted sequencing data generated in the clinical laboratory. The ExonQC and C-score assignment can reduce FDR (identification of targets with high variance) and improve calling accuracy of single-exon CNVs respectively. We propose guidelines and criteria to identify high confidence single-exon CNVs.


Subject(s)
DNA Copy Number Variations/genetics; Exons/genetics; Genome, Human/genetics; Software; High-Throughput Nucleotide Sequencing; Humans; Sequence Analysis, DNA
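
The calling step described in this abstract (normalize coverage within each sample, then compare each target's log2 ratio against the batch median) can be sketched as follows. This is not the Atlas-CNV implementation: the normalization by per-sample mean, the thresholds of -0.6 and 0.4, and all function and sample names are assumptions chosen for illustration, and the C-score and ExonQC filters are omitted.

```python
import math
from statistics import median


def normalize(coverage):
    """Scale one sample's per-target coverage by that sample's mean coverage."""
    mean_cov = sum(coverage) / len(coverage)
    return [c / mean_cov for c in coverage]


def call_cnvs(batch, del_threshold=-0.6, dup_threshold=0.4):
    """Flag targets whose log2(sample / batch median) crosses a threshold.

    batch: sample name -> list of per-target coverage values (same target order).
    Thresholds are hypothetical, not the published ones.
    Returns a list of (sample, target_index, call) tuples.
    """
    normalized = {name: normalize(cov) for name, cov in batch.items()}
    n_targets = len(next(iter(normalized.values())))
    calls = []
    for t in range(n_targets):
        batch_median = median(norm[t] for norm in normalized.values())
        for name, norm in normalized.items():
            if norm[t] <= 0 or batch_median <= 0:
                continue  # log2 is undefined here; skip instead of guessing
            ratio = math.log2(norm[t] / batch_median)
            if ratio <= del_threshold:
                calls.append((name, t, "deletion"))
            elif ratio >= dup_threshold:
                calls.append((name, t, "duplication"))
    return calls


# Toy batch: sample s3 has roughly half the expected coverage on target 1.
batch = {
    "s1": [100, 100, 100, 100],
    "s2": [100, 100, 100, 100],
    "s3": [100, 50, 100, 100],
}
print(call_cnvs(batch))
```

In a real pipeline, a quality score per call and a per-target variance filter would sit after this step, which is the role the abstract assigns to the C-score and ExonQC thresholds.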
9.
Nucleic Acids Res ; 44(D1): D308-12, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26590254

ABSTRACT

The structure and function of proteins underlie most aspects of biology and their mutational perturbations often cause disease. To identify the molecular determinants of function as well as targets for drugs, it is central to characterize the important residues and how they cluster to form functional sites. The Evolutionary Trace (ET) achieves this by ranking the functional and structural importance of the protein sequence positions. ET uses evolutionary distances to estimate functional distances and correlates genotype variations with those in the fitness phenotype. Thus, ET ranks are worse for sequence positions that vary among evolutionarily closer homologs but better for positions that vary mostly among distant homologs. This approach identifies functional determinants, predicts function, guides the mutational redesign of functional and allosteric specificity, and interprets the action of coding sequence variations in proteins, people and populations. Now, the UET database offers pre-computed ET analyses for the protein structure databank, and on-the-fly analysis of any protein sequence. A web interface retrieves ET rankings of sequence positions and maps results to a structure to identify functionally important regions. This UET database integrates several ways of viewing the results on the protein sequence or structure and can be found at http://mammoth.bcm.tmc.edu/uet/.


Subject(s)
Databases, Protein; Evolution, Molecular; Protein Conformation; Sequence Analysis, Protein
10.
Nat Methods ; 10(3): 221-7, 2013 Mar.
Article in English | MEDLINE | ID: mdl-23353650

ABSTRACT

Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be high. Here we report the results from the first large-scale community-based critical assessment of protein function annotation (CAFA) experiment. Fifty-four methods representing the state of the art for protein function prediction were evaluated on a target set of 866 proteins from 11 organisms. Two findings stand out: (i) today's best protein function prediction algorithms substantially outperform widely used first-generation methods, with large gains on all types of targets; and (ii) although the top methods perform well enough to guide experiments, there is considerable need for improvement of currently available tools.


Subject(s)
Computational Biology/methods; Molecular Biology/methods; Molecular Sequence Annotation; Proteins/physiology; Algorithms; Animals; Databases, Protein; Exoribonucleases/classification; Exoribonucleases/genetics; Exoribonucleases/physiology; Forecasting; Humans; Proteins/chemistry; Proteins/classification; Proteins/genetics; Species Specificity
11.
Bioinformatics ; 29(21): 2714-21, 2013 Nov 01.
Article in English | MEDLINE | ID: mdl-24021383

ABSTRACT

MOTIVATION: The constraints under which sequence, structure and function coevolve are not fully understood. Bringing this mutual relationship to light can reveal the molecular basis of binding, catalysis and allostery, thereby identifying function and rationally guiding protein redesign. Underlying these relationships are the epistatic interactions that occur when the consequences of a mutation to a protein are determined by the genetic background in which it occurs. Based on prior data, we hypothesize that epistatic forces operate most strongly between residues nearby in the structure, resulting in smooth evolutionary importance across the structure. METHODS AND RESULTS: We find that when residue scores of evolutionary importance are distributed smoothly between nearby residues, functional site prediction accuracy improves. Accordingly, we designed a novel measure of evolutionary importance that focuses on the interaction between pairs of structurally neighboring residues. This measure that we term pair-interaction Evolutionary Trace yields greater functional site overlap and better structure-based proteome-wide functional predictions. CONCLUSIONS: Our data show that the structural smoothness of evolutionary importance is a fundamental feature of the coevolution of sequence, structure and function. Mutations operate on individual residues, but selective pressure depends in part on the extent to which a mutation perturbs interactions with neighboring residues. In practice, this principle led us to redefine the importance of a residue in terms of the importance of its epistatic interactions with neighbors, yielding better annotation of functional residues, motivating experimental validation of a novel functional site in LexA and refining protein function prediction. CONTACT: lichtarge@bcm.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Protein Conformation; Sequence Analysis, Protein/methods; Algorithms; Bacterial Proteins/chemistry; Epistasis, Genetic; Evolution, Molecular; Molecular Sequence Annotation; Mutation; Proteins/chemistry; Proteins/genetics; Proteome/chemistry; Serine Endopeptidases/chemistry
12.
J Am Med Inform Assoc ; 31(6): 1356-1366, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38447590

ABSTRACT

OBJECTIVE: This study evaluates an AI assistant developed using OpenAI's GPT-4 for interpreting pharmacogenomic (PGx) testing results, aiming to improve decision-making and knowledge sharing in clinical genetics and to enhance patient care with equitable access. MATERIALS AND METHODS: The AI assistant employs retrieval-augmented generation (RAG), which combines retrieval and generative techniques, by harnessing a knowledge base (KB) that comprises data from the Clinical Pharmacogenetics Implementation Consortium (CPIC). It uses context-aware GPT-4 to generate tailored responses to user queries from this KB, further refined through prompt engineering and guardrails. RESULTS: Evaluated against a specialized PGx question catalog, the AI assistant showed high efficacy in addressing user queries. Compared with OpenAI's ChatGPT 3.5, it demonstrated better performance, especially in provider-specific queries requiring specialized data and citations. Key areas for improvement include enhancing accuracy, relevancy, and representative language in responses. DISCUSSION: The integration of context-aware GPT-4 with RAG significantly enhanced the AI assistant's utility. RAG's ability to incorporate domain-specific CPIC data, including recent literature, proved beneficial. Challenges persist, such as the need for specialized genetic/PGx models to improve accuracy and relevancy and addressing ethical, regulatory, and safety concerns. CONCLUSION: This study underscores generative AI's potential for transforming healthcare provider support and patient accessibility to complex pharmacogenomic information. While careful implementation of large language models like GPT-4 is necessary, it is clear that they can substantially improve understanding of pharmacogenomic data. With further development, these tools could augment healthcare expertise, provider productivity, and the delivery of equitable, patient-centered healthcare services.


Subject(s)
Pharmacogenetics; Precision Medicine; Humans; Artificial Intelligence; Knowledge Bases; Information Storage and Retrieval/methods; Pharmacogenomic Testing
13.
medRxiv ; 2024 Jun 13.
Article in English | MEDLINE | ID: mdl-38946996

ABSTRACT

Pharmacogenomics promises improved outcomes through individualized prescribing. However, the lack of diversity in studies impedes clinical translation and equitable application of precision medicine. We evaluated the frequencies of PGx variants, predicted phenotypes, and medication exposures using whole genome sequencing and EHR data from nearly 100k diverse All of Us Research Program participants. We report 100% of participants carried at least one pharmacogenomics variant and nearly all (99.13%) had a predicted phenotype with prescribing recommendations. Clinical impact was high with over 20% having both an actionable phenotype and a prior exposure to an impacted medication with pharmacogenomic prescribing guidance. Importantly, we also report hundreds of alleles and predicted phenotypes that deviate from known frequencies and/or were previously unreported, including within admixed American and African ancestry groups.

14.
Commun Biol ; 7(1): 174, 2024 Feb 19.
Article in English | MEDLINE | ID: mdl-38374434

ABSTRACT

Disparities in data underlying clinical genomic interpretation is an acknowledged problem, but there is a paucity of data demonstrating it. The All of Us Research Program is collecting data including whole-genome sequences, health records, and surveys for at least a million participants with diverse ancestry and access to healthcare, representing one of the largest biomedical research repositories of its kind. Here, we examine pathogenic and likely pathogenic variants that were identified in the All of Us cohort. The European ancestry subgroup showed the highest overall rate of pathogenic variation, with 2.26% of participants having a pathogenic variant. Other ancestry groups had lower rates of pathogenic variation, including 1.62% for the African ancestry group and 1.32% in the Latino/Admixed American ancestry group. Pathogenic variants were most frequently observed in genes related to Breast/Ovarian Cancer or Hypercholesterolemia. Variant frequencies in many genes were consistent with the data from the public gnomAD database, with some notable exceptions resolved using gnomAD subsets. Differences in pathogenic variant frequency observed between ancestral groups generally indicate biases of ascertainment of knowledge about those variants, but some deviations may be indicative of differences in disease prevalence. This work will allow targeted precision medicine efforts at revealed disparities.


Subject(s)
Genetic Predisposition to Disease; Population Health; Humans; Black People; Genomics; Hispanic or Latino/genetics; United States/epidemiology; European People; African People; Black or African American
15.
medRxiv ; 2024 Apr 12.
Article in English | MEDLINE | ID: mdl-38645101

ABSTRACT

Background: Multiplexed Assays of Variant Effects (MAVEs) can test all possible single variants in a gene of interest. The resulting saturation-style data may help resolve variant classification disparities between populations, especially for variants of uncertain significance (VUS). Methods: We analyzed clinical significance classifications in 213,663 individuals of European-like genetic ancestry versus 206,975 individuals of non-European-like genetic ancestry from All of Us and the Genome Aggregation Database. Then, we incorporated clinically calibrated MAVE data into the Clinical Genome Resource's Variant Curation Expert Panel rules to automate VUS reclassification for BRCA1, TP53, and PTEN. Results: Using two orthogonal statistical approaches, we show a higher prevalence (p ≤ 5.95e-06) of VUS in individuals of non-European-like genetic ancestry across all medical specialties assessed in all three databases. Further, in the non-European-like genetic ancestry group, higher rates of Benign or Likely Benign and variants with no clinical designation (p ≤ 2.5e-05) were found across many medical specialties, whereas Pathogenic or Likely Pathogenic assignments were higher in individuals of European-like genetic ancestry (p ≤ 2.5e-05). Using MAVE data, we reclassified VUS in individuals of non-European-like genetic ancestry at a significantly higher rate in comparison to reclassified VUS from European-like genetic ancestry (p = 9.1e-03), effectively compensating for the VUS disparity. Further, essential code analysis showed equitable impact of MAVE evidence codes but inequitable impact of allele frequency (p = 7.47e-06) and computational predictor (p = 6.92e-05) evidence codes for individuals of non-European-like genetic ancestry. Conclusions: Generation of saturation-style MAVE data should be a priority to reduce VUS disparities and produce equitable training data for future computational predictors.

16.
BMC Res Notes ; 17(1): 62, 2024 Mar 03.
Article in English | MEDLINE | ID: mdl-38433186

ABSTRACT

OBJECTIVE: Data from DNA genotyping via a 96-SNP panel in a study of 25,015 clinical samples were utilized for quality control and tracking of sample identity in a clinical sequencing network. The study aimed to demonstrate the value of both the precise SNP tracking and the utility of the panel for predicting the sex-by-genotype of the participants, to identify possible sample mix-ups. RESULTS: Precise SNP tracking showed no sample swap errors within the clinical testing laboratories. In contrast, when comparing predicted sex-by-genotype to the provided sex on the test requisition, we identified 110 inconsistencies from 25,015 clinical samples (0.44%), that had occurred during sample collection or accessioning. The genetic sex predictions were confirmed using additional SNP sites in the sequencing data or high-density genotyping arrays. It was determined that discrepancies resulted from clerical errors (49.09%), samples from transgender participants (3.64%) and stem cell or bone marrow transplant patients (7.27%) along with undetermined sample mix-ups (40%) for which sample swaps occurred prior to arrival at genome centers, however the exact cause of the events at the sampling sites resulting in the mix-ups were not able to be determined.


Subject(s)
Clinical Laboratory Services; High-Throughput Nucleotide Sequencing; Humans; Bone Marrow Transplantation; Genotype; Laboratories
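
The percentages in this abstract are consistent with whole-number counts, which can be checked with a short calculation. The per-cause counts below are inferred by rounding the reported percentages of the 110 inconsistencies; they are not stated in the abstract, and the category labels are paraphrased.

```python
total_samples = 25_015
mismatches = 110

# Share of all samples with a sex-by-genotype inconsistency.
mismatch_pct = 100 * mismatches / total_samples

# Reported breakdown of the inconsistencies, in percent.
reported_pct = {
    "clerical error": 49.09,
    "transgender participant": 3.64,
    "stem cell or bone marrow transplant": 7.27,
    "undetermined sample mix-up": 40.0,
}

# Inferred counts: nearest whole number of samples for each percentage.
inferred_counts = {
    cause: round(mismatches * pct / 100) for cause, pct in reported_pct.items()
}

print(f"mismatch rate: {mismatch_pct:.2f}%")
for cause, count in inferred_counts.items():
    print(f"{cause}: {count}")
print("sum of inferred counts:", sum(inferred_counts.values()))
```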
17.
BMC Bioinformatics ; 14 Suppl 3: S6, 2013.
Article in English | MEDLINE | ID: mdl-23514548

ABSTRACT

BACKGROUND: Annotating protein function with both high accuracy and sensitivity remains a major challenge in structural genomics. One proven computational strategy has been to group a few key functional amino acids into templates and search for these templates in other protein structures, so as to transfer function when a match is found. To this end, we previously developed Evolutionary Trace Annotation (ETA) and showed that diffusing known annotations over a network of template matches on a structural genomic scale improved predictions of function. In order to further increase sensitivity, we now let each protein contribute multiple templates rather than just one, and also let the template size vary. RESULTS: Retrospective benchmarks in 605 Structural Genomics enzymes showed that multiple templates increased sensitivity by up to 14% when combined with single template predictions even as they maintained the accuracy over 91%. Diffusing function globally on networks of single and multiple template matches marginally increased the area under the ROC curve over 0.97, but in a subset of proteins that could not be annotated by ETA, the network approach recovered annotations for the most confident 20-23 of 91 cases with 100% accuracy. CONCLUSIONS: We improve the accuracy and sensitivity of predictions by using multiple templates per protein structure when constructing networks of ETA matches and diffusing annotations.


Subject(s)
Protein Conformation; Proteins/physiology; Algorithms; Computational Biology; Databases, Protein; Enzymes/chemistry; Evolution, Molecular; Genomics; Molecular Sequence Annotation; Proteins/chemistry; Proteins/genetics
18.
Bioinformatics ; 28(16): 2186-8, 2012 Aug 15.
Article in English | MEDLINE | ID: mdl-22689386

ABSTRACT

UNLABELLED: Most proteins lack experimentally validated functions. To address this problem, we implemented the Evolutionary Trace Annotation (ETA) method in the Cytoscape network visualization environment. The result is the ETAscape plugin, which builds a structural genomics network based on local structural and evolutionary similarities among proteins and then globally diffuses known annotations across the resulting network. The plugin displays these novel functional annotations, their confidence, the molecular basis for individual matches and the set of matches that lead to a prediction. AVAILABILITY: The ETA Network Plugin is available publicly for download at http://mammoth.bcm.tmc.edu/networks/.


Subject(s)
Computational Biology/methods; Proteins/chemistry; Software; Enzymes/analysis; Enzymes/chemistry; Genomics/methods; Proteins/analysis; Substrate Specificity
19.
Circ Genom Precis Med ; 16(2): e003816, 2023 04.
Article in English | MEDLINE | ID: mdl-37071725

ABSTRACT

BACKGROUND: The implications of secondary findings detected in large-scale sequencing projects remain uncertain. We assessed prevalence and penetrance of pathogenic familial hypercholesterolemia (FH) variants, their association with coronary heart disease (CHD), and 1-year outcomes following return of results in phase III of the electronic medical records and genomics network. METHODS: Adult participants (n=18,544) at 7 sites were enrolled in a prospective cohort study to assess the clinical impact of returning results from targeted sequencing of 68 actionable genes, including LDLR, APOB, and PCSK9. FH variant prevalence and penetrance (defined as low-density lipoprotein cholesterol >155 mg/dL) were estimated after excluding participants enrolled on the basis of hypercholesterolemia. Multivariable logistic regression was used to estimate the odds of CHD compared to age- and sex-matched controls without FH-associated variants. Process (eg, referral to a specialist or ordering new tests), intermediate (eg, new diagnosis of FH), and clinical (eg, treatment modification) outcomes within 1 year after return of results were ascertained by electronic health record review. RESULTS: The prevalence of FH-associated pathogenic variants was 1 in 188 (69 of 13,019 unselected participants). Penetrance was 87.5%. The presence of an FH variant was associated with CHD (odds ratio, 3.02 [2.00-4.53]) and premature CHD (odds ratio, 3.68 [2.34-5.78]). At least 1 outcome occurred in 92% of participants; 44% received a new diagnosis of FH and 26% had treatment modified following return of results. CONCLUSIONS: In a multisite cohort of electronic health record-linked biobanks, monogenic FH was prevalent, penetrant, and associated with presence of CHD. Nearly half of participants with an FH-associated variant received a new diagnosis of FH and a quarter had treatment modified after return of results. These results highlight the potential utility of sequencing electronic health record-linked biobanks to detect FH.


Subject(s)
Cardiovascular Diseases; Coronary Artery Disease; Hyperlipoproteinemia Type II; Adult; Humans; Proprotein Convertase 9/genetics; Electronic Health Records; Penetrance; Prevalence; Prospective Studies; Risk Factors; Hyperlipoproteinemia Type II/diagnosis; Hyperlipoproteinemia Type II/epidemiology; Hyperlipoproteinemia Type II/genetics; Coronary Artery Disease/genetics; Heart Disease Risk Factors; Genomics
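
The "1 in 188" prevalence in this abstract follows from the two counts it reports. A minimal check, with illustrative variable names; note that 13,019 / 69 is about 188.7, so the reported figure corresponds to the whole-number part of the ratio.

```python
carriers = 69            # participants with an FH-associated pathogenic variant
unselected = 13_019      # unselected participants

prevalence_pct = 100 * carriers / unselected
one_in_exact = unselected / carriers      # about 188.7
one_in_reported = unselected // carriers  # whole-number part, as reported

print(f"prevalence: {prevalence_pct:.2f}%")
print(f"1 in {one_in_exact:.1f} (reported as 1 in {one_in_reported})")
```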
20.
Res Sq ; 2023 Sep 11.
Article in English | MEDLINE | ID: mdl-37790445

ABSTRACT

Objective: Data from DNA genotyping via a 96-SNP panel in a study of 25,015 clinical samples were utilized for quality control and tracking of sample identity in a clinical sequencing network. The study aimed to demonstrate the value of both the precise SNP tracking and the utility of the panel for predicting the sex-by-genotype of the participants, to identify possible sample mix-ups. Results: Precise SNP tracking showed no sample swap errors within the clinical testing laboratories. In contrast, when comparing predicted sex-by-genotype to the provided sex on the test requisition, we identified 110 inconsistencies from 25,015 clinical samples (0.44%), that had occurred during sample collection or accessioning. The genetic sex predictions were confirmed using additional SNP sites in the sequencing data or high-density genotyping arrays. It was determined that discrepancies resulted from clerical errors, samples from transgender participants and stem cell or bone marrow transplant patients along with undetermined sample mix-ups.
