Results 1 - 20 of 29
1.
Am J Hum Genet ; 98(6): 1067-1076, 2016 06 02.
Article in English | MEDLINE | ID: mdl-27181684

ABSTRACT

Evaluating the pathogenicity of a variant is challenging given the plethora of types of genetic evidence that laboratories consider. Deciding how to weigh each type of evidence is difficult, and standards have been needed. In 2015, the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) published guidelines for the assessment of variants in genes associated with Mendelian diseases. Nine molecular diagnostic laboratories involved in the Clinical Sequencing Exploratory Research (CSER) consortium piloted these guidelines on 99 variants spanning all categories (pathogenic, likely pathogenic, uncertain significance, likely benign, and benign). Nine variants were distributed to all laboratories, and the remaining 90 were evaluated by three laboratories. The laboratories classified each variant by using both the laboratory's own method and the ACMG-AMP criteria. The agreement between the two methods used within laboratories was high (K-alpha = 0.91) with 79% concordance. However, there was only 34% concordance for either classification system across laboratories. After consensus discussions and detailed review of the ACMG-AMP criteria, concordance increased to 71%. Causes of initial discordance in ACMG-AMP classifications were identified, and recommendations on clarification and increased specification of the ACMG-AMP criteria were made. In summary, although an initial pilot of the ACMG-AMP guidelines did not lead to increased concordance in variant interpretation, comparing variant interpretations to identify differences and having a common framework to facilitate resolution of those differences were beneficial for improving agreement, allowing iterative movement toward increased reporting consistency for variants in genes associated with monogenic disease.


Subject(s)
Biomedical Research; Genetic Testing/standards; Genetic Variation/genetics; Genomics/methods; Laboratories/standards; Mutation/genetics; Sequence Analysis, DNA/standards; Data Interpretation, Statistical; Evidence-Based Practice; Exome/genetics; Genome, Human; Guidelines as Topic; High-Throughput Nucleotide Sequencing/methods; Humans; Incidental Findings; Software; United States
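Note on the concordance figures in entry 1: the abstract reports percentages but does not define the calculation. The following is a minimal sketch, assuming concordance means the share of variants on which every reporting laboratory assigned the same category. The function name and the example calls are hypothetical.

```python
from typing import Dict, List


def concordance(calls: Dict[str, List[str]]) -> float:
    """Fraction of variants whose classifications are identical across all labs."""
    if not calls:
        raise ValueError("no variants supplied")
    # A variant counts as concordant when its set of labels has a single member.
    agreed = sum(1 for labels in calls.values() if len(set(labels)) == 1)
    return agreed / len(calls)


# Hypothetical calls from three laboratories for four variants.
example = {
    "variant_1": ["pathogenic", "pathogenic", "pathogenic"],
    "variant_2": ["likely pathogenic", "uncertain significance", "uncertain significance"],
    "variant_3": ["benign", "benign", "benign"],
    "variant_4": ["uncertain significance", "likely benign", "uncertain significance"],
}
print(concordance(example))  # 0.5
```

Other definitions (for example, pairwise agreement between laboratories) would give different numbers for the same data.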
2.
Hum Mutat ; 39(11): 1690-1701, 2018 11.
Article in English | MEDLINE | ID: mdl-30311374

ABSTRACT

Effective exchange of information about genetic variants is currently hampered by the lack of readily available globally unique variant identifiers that would enable aggregation of information from different sources. The ClinGen Allele Registry addresses this problem by providing (1) globally unique "canonical" variant identifiers (CAids) on demand, either individually or in large batches; (2) access to variant-identifying information in a searchable Registry; (3) links to allele-related records in many commonly used databases; and (4) services for adding links to information about registered variants in external sources. A core element of the Registry is a canonicalization service, implemented using an in-memory sequence alignment-based index, which groups variant identifiers denoting the same nucleotide variant and assigns unique and dereferenceable CAids. More than 650 million distinct variants are currently registered, including those from gnomAD, ExAC, dbSNP, and ClinVar, as well as a small number of variants registered by Registry users. The Registry is accessible both via a web interface and programmatically via well-documented Hypertext Transfer Protocol (HTTP) Representational State Transfer Application Programming Interfaces (REST APIs). For programmatic interoperability, the Registry content is accessible in the JavaScript Object Notation for Linked Data (JSON-LD) format. We present several use cases and demonstrate how the linked information may provide raw material for reasoning about a variant's pathogenicity.


Subject(s)
Databases, Genetic; Genetic Variation/genetics; Alleles; Humans; Registries; Software
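Note on the canonicalization idea in entry 2: the sketch below illustrates, in a deliberately simplified way, how different spellings of the same nucleotide change can be grouped under one identifier. It is not the Registry's method (the abstract describes an alignment-based index); it only trims bases shared by both alleles and does not shift indels within repeats. The class name, the `TOY` identifiers, and the tuple layout are all hypothetical.

```python
from typing import Dict, Tuple

# (reference name, 1-based position, reference allele, alternate allele)
Variant = Tuple[str, int, str, str]


def normalize(variant: Variant) -> Variant:
    """Trim bases shared by both alleles so equivalent spellings coincide."""
    name, pos, ref, alt = variant
    # Drop a shared suffix first; the start position is unaffected.
    while ref and alt and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then drop a shared prefix, advancing the start position each time.
    while ref and alt and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return (name, pos, ref, alt)


class ToyRegistry:
    """Assigns one identifier per normalized variant (illustrative only)."""

    def __init__(self) -> None:
        self._ids: Dict[Variant, str] = {}

    def register(self, variant: Variant) -> str:
        key = normalize(variant)
        if key not in self._ids:
            self._ids[key] = f"TOY{len(self._ids) + 1}"
        return self._ids[key]


registry = ToyRegistry()
first = registry.register(("chr1", 100, "CA", "CT"))
second = registry.register(("chr1", 101, "A", "T"))  # same change, different spelling
print(first == second)  # True
```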
3.
Hum Mutat ; 39(11): 1686-1689, 2018 11.
Article in English | MEDLINE | ID: mdl-30311379

ABSTRACT

The Clinical Genome Resource (ClinGen)'s work to develop a knowledge base to support the understanding of genes and variants for use in precision medicine and research depends on robust, broadly applicable, and adaptable technical standards for sharing data and information. To forward this goal, ClinGen has joined with the Global Alliance for Genomics and Health (GA4GH) to support the development of open, freely-available technical standards and regulatory frameworks for secure and responsible sharing of genomic and health-related data. In its capacity as one of the 15 inaugural GA4GH "Driver Projects," ClinGen is providing input on the key standards needs of the global genomics community, and has committed to participate on GA4GH Work Streams to support the development of: (1) a standard model for computer-readable variant representation; (2) a data model for linking variant data to annotations; (3) a specification to enable sharing of genomic variant knowledge and associated clinical interpretations; and (4) a set of best practices for use of phenotype and disease ontologies. ClinGen's participation as a GA4GH Driver Project will provide a robust environment to test drive emerging genomic knowledge sharing standards and prove their utility among the community, while accelerating the construction of the ClinGen evidence base.


Subject(s)
Genome, Human/genetics; Information Dissemination/methods; Computational Biology; Databases, Genetic; Genetic Variation; Genomics; Humans; Precision Medicine
4.
Hum Mutat ; 39(11): 1677-1685, 2018 11.
Article in English | MEDLINE | ID: mdl-30311382

ABSTRACT

The use of genome-scale sequencing allows for identification of genetic findings beyond the original indication for testing (secondary findings). The ClinGen Actionability Working Group's (AWG) protocol for evidence synthesis and semi-quantitative metric scoring evaluates four domains of clinical actionability for potential secondary findings: severity and likelihood of the outcome, and effectiveness and nature of the intervention. As of February 2018, the AWG has scored 127 genes associated with 78 disorders (up-to-date topics/scores are available at www.clinicalgenome.org). Scores across these disorders were assessed to compare genes/disorders recommended for return as secondary findings by the American College of Medical Genetics and Genomics (ACMG) with those not currently recommended. Disorders recommended by the ACMG scored higher on outcome-related domains (severity and likelihood), but not on intervention-related domains (effectiveness and nature of the intervention). Current practices indicate that return of secondary findings will expand beyond those currently recommended by the ACMG. The ClinGen AWG evidence reports and summary scores are not intended as classifications of actionability, rather they provide a resource to aid decision makers as they determine best practices regarding secondary findings. The ClinGen AWG is working with the ACMG Secondary Findings Committee to update future iterations of their secondary findings list.


Subject(s)
Genome, Human/genetics; Databases, Genetic; Exome/genetics; Genetic Testing; Genetic Variation/genetics; High-Throughput Nucleotide Sequencing; Humans
5.
Nucleic Acids Res ; 42(8): 4800-12, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24523353

ABSTRACT

Cys2-His2 zinc finger proteins (ZFPs) are the largest family of transcription factors in higher metazoans. They also represent the most diverse family with regard to the composition of their recognition sequences. Although there are a number of ZFPs with characterized DNA-binding preferences, the specificity of the vast majority of ZFPs is unknown and cannot be directly inferred by homology due to the diversity of recognition residues present within individual fingers. Given the large number of unique zinc fingers and assemblies present across eukaryotes, a comprehensive predictive recognition model that could accurately estimate the DNA-binding specificity of any ZFP based on its amino acid sequence would have great utility. Toward this goal, we have used the DNA-binding specificities of 678 two-finger modules from both natural and artificial sources to construct a random forest-based predictive model for ZFP recognition. We find that our recognition model outperforms previously described determinant-based recognition models for ZFPs, and can successfully estimate the specificity of naturally occurring ZFPs with previously defined specificities.


Subject(s)
DNA-Binding Proteins/metabolism; Regulatory Elements, Transcriptional; Transcription Factors/metabolism; Zinc Fingers; Artificial Intelligence; Binding Sites; DNA/chemistry; DNA-Binding Proteins/chemistry; Models, Biological; Nucleotide Motifs; Transcription Factors/chemistry
6.
Bioinformatics ; 30(7): 941-8, 2014 Apr 01.
Article in English | MEDLINE | ID: mdl-24369152

ABSTRACT

MOTIVATION: Generating accurate transcription factor (TF) binding site motifs from data generated using next-generation sequencing, especially ChIP-seq, is challenging. The challenge arises because a typical experiment reports a large number of sequences bound by a TF, and each sequence is relatively long. Most traditional motif finders are slow in handling such enormous amounts of data. To overcome this limitation, tools have been developed that trade accuracy for speed by using heuristic discrete search strategies or limited optimization of identified seed motifs. However, such strategies may not fully use the information in input sequences to generate motifs. Such motifs often form good seeds and can be further improved with appropriate scoring functions and rapid optimization. RESULTS: We report a tool named discriminative motif optimizer (DiMO). DiMO takes a seed motif along with a positive and a negative database and improves the motif based on a discriminative strategy. We use the area under the receiver-operating characteristic curve (AUC) as a measure of the discriminating power of motifs and a strategy based on perceptron training that maximizes AUC rapidly in a discriminative manner. Using DiMO on a large test set of 87 TFs from human, Drosophila and yeast, we show that it is possible to significantly improve motifs identified by nine motif finders. The motifs are generated/optimized using training sets and evaluated on test sets. The AUC is improved for almost 90% of the TFs on test sets and the magnitude of increase is up to 39%. AVAILABILITY AND IMPLEMENTATION: DiMO is available at http://stormo.wustl.edu/DiMO


Subject(s)
Artificial Intelligence; Transcription Factors/chemistry; Algorithms; Animals; Base Sequence; Binding Sites; DNA/chemistry; DNA/metabolism; Drosophila melanogaster; Humans; Protein Binding; Saccharomyces cerevisiae; Software; Transcription Factors/metabolism
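Note on the AUC measure in entry 6: AUC can be read as the probability that a randomly chosen positive sequence scores higher than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison, counting ties as half. It illustrates only the evaluation measure; the perceptron-based optimization used by DiMO is not reproduced, and the function name and scores are hypothetical.

```python
from typing import Sequence


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise ranking of positives against negatives."""
    if not pos_scores or not neg_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical motif scores for bound (positive) and unbound (negative) sequences.
print(auc([0.9, 0.3], [0.5, 0.1]))  # 0.75
```

The quadratic loop is fine for small illustrations; for ChIP-seq-sized data a rank-based computation would be the usual choice.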
8.
BMC Bioinformatics ; 13 Suppl 15: S3, 2012.
Article in English | MEDLINE | ID: mdl-23046442

ABSTRACT

BACKGROUND: In the context of drug discovery and development, much effort has been exerted to determine which conformers of a given molecule are responsible for the observed biological activity. In this work we aimed to predict bioactive conformers using a variant of supervised learning, named multiple-instance learning. A single molecule, treated as a bag of conformers, is biologically active if and only if at least one of its conformers, treated as an instance, is responsible for the observed bioactivity; and a molecule is inactive if none of its conformers is responsible for the observed bioactivity. The implementation requires instance-based embedding, and joint feature selection and classification. The goal of the present project is to implement multiple-instance learning in drug activity prediction, and subsequently to identify the bioactive conformers for each molecule. METHODS: We encoded the 3-dimensional structures using pharmacophore fingerprints, which are binary strings, and accomplished instance-based embedding using calculated dissimilarity distances. Four dissimilarity measures were employed and their performances were compared. 1-norm SVM was used for joint feature selection and classification. The approach was applied to four data sets, and the best proposed model for each data set was determined by using the dissimilarity measure yielding the smallest number of selected features. RESULTS: The predictive abilities of the proposed approach were compared with three classical predictive models without instance-based embedding. The proposed approach produced the best predictive models for one data set and second best predictive models for the rest of the data sets, based on the external validations. To validate the ability of the proposed approach to find bioactive conformers, 12 small molecules with co-crystallized structures were seeded in one data set. Ten of the 12 co-crystallized structures were indeed identified as significant conformers using the proposed approach. CONCLUSIONS: The proposed approach was proven not to suffer from overfitting and to be highly competitive with classical predictive models, so it is very powerful for drug activity prediction. The approach was also validated as a useful method for pursuit of bioactive conformers.


Subject(s)
Artificial Intelligence; Computational Biology/methods; Drug Discovery; Models, Theoretical; Molecular Conformation; Quantitative Structure-Activity Relationship
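Note on multiple-instance learning in entry 8: the sketch below shows the two ideas the abstract names, the bag-labelling rule and an instance-based embedding built from dissimilarities. It assumes fingerprints are stored as sets of on-bit indices and uses the Tanimoto dissimilarity as one plausible measure; the abstract does not say which four measures were used. All names are hypothetical, and the 1-norm SVM step is omitted.

```python
from typing import List, Sequence, Set

Fingerprint = Set[int]  # indices of the bits that are set


def tanimoto_dissimilarity(a: Fingerprint, b: Fingerprint) -> float:
    """1 minus the Tanimoto (Jaccard) similarity of two binary fingerprints."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def embed_bag(bag: Sequence[Fingerprint], prototypes: Sequence[Fingerprint]) -> List[float]:
    """Map a molecule (bag of conformers) to one feature per prototype conformer.

    Each feature is the smallest dissimilarity between the prototype and any
    conformer in the bag, so a single close conformer is enough to register.
    """
    return [min(tanimoto_dissimilarity(c, p) for c in bag) for p in prototypes]


def bag_is_active(conformer_is_bioactive: Sequence[bool]) -> bool:
    """A molecule is active if and only if at least one conformer is bioactive."""
    return any(conformer_is_bioactive)


molecule = [{1, 2, 3}, {4, 5}]       # two hypothetical conformers
prototypes = [{1, 2, 3}, {4, 6}]     # hypothetical reference conformers
print(embed_bag(molecule, prototypes))
```

After this embedding every molecule is a fixed-length vector, which is what allows a standard classifier with built-in feature selection to be applied.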
9.
Xenobiotica ; 42(2): 139-56, 2012 Feb.
Article in English | MEDLINE | ID: mdl-21970716

ABSTRACT

RATIONALE: The therapeutic promise of trans-resveratrol (tRes) is limited by poor bioavailability following rapid metabolism. We hypothesise that trans-arachidin-1 (tA1) and trans-arachidin-3 (tA3), peanut hairy root-derived isoprenylated analogs of tRes, will exhibit slower metabolism/enhanced bioavailability and retain biological activity via cannabinoid receptor (CBR) binding relative to their non-prenylated parent compounds trans-piceatannol (tPice) and tRes, respectively. RESULTS: The activities of eight human UDP-glucuronosyltransferases (UGTs) toward these compounds were evaluated. The greatest activity was observed for extrahepatic UGTs 1A10 and 1A7, followed by hepatic UGTs 1A1 and 1A9. Importantly, an additional isoprenyl and/or hydroxyl group in tA1 and tA3 slowed overall glucuronidation. CBR binding studies demonstrated that all analogs bound to CB1Rs with similar affinities (5-18 µM); however, only tA1 and tA3 bound appreciably to CB2Rs. Molecular modelling studies confirmed that the isoprenyl moiety of tA1 and tA3 improved binding affinity to CB2Rs. Finally, although tA3 acted as a competitive CB1R antagonist, tA1 antagonised CB1R agonists by both competitive and non-competitive mechanisms. CONCLUSIONS: Prenylated stilbenoids may be preferable alternatives to tRes due to increased bioavailability via slowed metabolism. Similar structural analogs might be developed as novel CB therapeutics for obesity and/or drug dependency.


Subject(s)
Glucuronosyltransferase/chemistry; Hemiterpenes/pharmacology; Receptors, Cannabinoid/chemistry; Stilbenes/chemistry; Stilbenes/pharmacology; Animals; Binding, Competitive; Biological Availability; CHO Cells; Chromatography, High Pressure Liquid; Cricetinae; Hemiterpenes/chemistry; Hemiterpenes/pharmacokinetics; Humans; Kinetics; Mass Spectrometry; Metabolic Detoxication, Phase II; Models, Molecular; Prenylation; Recombinant Proteins/chemistry; Resveratrol; Stilbenes/pharmacokinetics
10.
Genome Med ; 14(1): 6, 2022 01 18.
Article in English | MEDLINE | ID: mdl-35039090

ABSTRACT

BACKGROUND: Identification of clinically significant genetic alterations involved in human disease has been dramatically accelerated by developments in next-generation sequencing technologies. However, the infrastructure and accessible comprehensive curation tools necessary for analyzing an individual patient genome and interpreting genetic variants to inform healthcare management have been lacking. RESULTS: Here we present the ClinGen Variant Curation Interface (VCI), a global open-source variant classification platform for supporting the application of evidence criteria and classification of variants based on the ACMG/AMP variant classification guidelines. The VCI is among a suite of tools developed by the NIH-funded Clinical Genome Resource (ClinGen) Consortium and supports an FDA-recognized human variant curation process. Essential to this is the ability to enable collaboration and peer review across ClinGen Expert Panels supporting users in comprehensively identifying, annotating, and sharing relevant evidence while making variant pathogenicity assertions. To facilitate evidence-based improvements in human variant classification, the VCI is publicly available to the genomics community. Navigation workflows support users providing guidance to comprehensively apply the ACMG/AMP evidence criteria and document provenance for asserting variant classifications. CONCLUSIONS: The VCI offers a central platform for clinical variant classification that fills a gap in the learning healthcare system, facilitates widespread adoption of standards for clinical curation, and is available at https://curation.clinicalgenome.org.


Subject(s)
Genetic Variation; Genome, Human; Humans; Genetic Testing; Genomics
11.
BMC Bioinformatics ; 12 Suppl 10: S22, 2011 Oct 18.
Article in English | MEDLINE | ID: mdl-22166097

ABSTRACT

BACKGROUND: It is commonly believed that including domain knowledge in a prediction model is desirable. However, representing and incorporating domain information in the learning process is, in general, a challenging problem. In this research, we consider domain information encoded by discrete or categorical attributes. A discrete or categorical attribute provides a natural partition of the problem domain, and hence divides the original problem into several non-overlapping sub-problems. In this sense, the domain information is useful if the partition simplifies the learning task. The goal of this research is to develop an algorithm to identify discrete or categorical attributes that maximally simplify the learning task. RESULTS: We consider restructuring a supervised learning problem via a partition of the problem space using a discrete or categorical attribute. A naive approach exhaustively searches all the possible restructured problems. It is computationally prohibitive when the number of discrete or categorical attributes is large. We propose a metric to rank attributes according to their potential to reduce the uncertainty of a classification task. It is quantified as a conditional entropy achieved using a set of optimal classifiers, each of which is built for a sub-problem defined by the attribute under consideration. To avoid high computational cost, we approximate the solution by the expected minimum conditional entropy with respect to random projections. This approach is tested on three artificial data sets, three cheminformatics data sets, and two leukemia gene expression data sets. Empirical results demonstrate that our method is capable of selecting a proper discrete or categorical attribute to simplify the problem, i.e., the performance of the classifier built for the restructured problem always beats that of the original problem. 
CONCLUSIONS: The proposed conditional entropy based metric is effective in identifying good partitions of a classification problem, hence enhancing the prediction performance.


Subject(s)
Artificial Intelligence; Models, Biological; Algorithms; Entropy; Glycogen Synthase Kinase 3/antagonists & inhibitors; Glycogen Synthase Kinase 3 beta; Humans; Precursor Cell Lymphoblastic Leukemia-Lymphoma/drug therapy; Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics; Prognosis; Receptor, Cannabinoid, CB1/metabolism; Receptor, Cannabinoid, CB2/metabolism
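Note on the conditional entropy metric in entry 11: the sketch below computes the plain empirical conditional entropy H(Y | A) of class labels Y given a categorical attribute A, which is the quantity underlying the ranking idea. It is a simplification: the abstract's metric is defined through optimal classifiers per sub-problem and approximated with random projections, neither of which is reproduced here. Function names and data are hypothetical.

```python
import math
from collections import Counter, defaultdict
from typing import Hashable, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of the empirical label distribution."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def conditional_entropy(labels: Sequence[Hashable], attribute: Sequence[Hashable]) -> float:
    """H(labels | attribute): label uncertainty remaining after partitioning by the attribute."""
    if len(labels) != len(attribute):
        raise ValueError("labels and attribute must have the same length")
    groups = defaultdict(list)
    for y, a in zip(labels, attribute):
        groups[a].append(y)
    n = len(labels)
    # Weight each partition's entropy by the fraction of samples it contains.
    return sum(len(g) / n * entropy(g) for g in groups.values())


labels = [0, 0, 1, 1]
informative = ["a", "a", "b", "b"]    # partition separates the classes perfectly
uninformative = ["a", "b", "a", "b"]  # each partition is still a 50/50 mix
print(conditional_entropy(labels, informative), conditional_entropy(labels, uninformative))
```

A lower value means the attribute's partition leaves less uncertainty about the class, so attributes would be ranked in ascending order of this score.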
12.
J Biol Chem ; 285(41): 31796-805, 2010 Oct 08.
Article in English | MEDLINE | ID: mdl-20667825

ABSTRACT

Recently, we found that divalent calcium has no detectable effect on the assembly of Mycobacterium tuberculosis FtsZ (MtbFtsZ), whereas it strongly promoted the assembly of Escherichia coli FtsZ (EcFtsZ). While looking for potential calcium binding residues in EcFtsZ, we found a mutation (E93R) that strongly promoted the assembly of EcFtsZ. The mutation increased the stability and bundling of the FtsZ protofilaments and produced a dominating effect on the assembly of the wild-type FtsZ (WT-FtsZ). Although E93R-FtsZ was found to bind to GTP similarly to the WT-FtsZ, it displayed lower GTPase activity than the WT-FtsZ. E93R-FtsZ complemented its wild-type counterpart, as observed by a complementation test using JKD7-1/pKD3 cells. However, the bacterial cells became elongated upon overexpression of the mutant allele. We modeled the structure of E93R-FtsZ using the structures of MtbFtsZ/Methanococcus jannaschii FtsZ (MjFtsZ) dimers as templates. The MtbFtsZ-based structure suggests that the Arg93-Glu138 salt bridge provides the additional stability, whereas the effect of the mutation appears to be indirect (allosteric) if the EcFtsZ dimer is similar to that of MjFtsZ. The data presented in this study suggest that an increase in the stability of the FtsZ protofilaments is detrimental to bacterial cytokinesis.


Subject(s)
Amino Acid Substitution; Bacterial Proteins/metabolism; Cytokinesis/physiology; Cytoskeletal Proteins/metabolism; Escherichia coli/metabolism; GTP Phosphohydrolases/metabolism; Mutation, Missense; Bacterial Proteins/genetics; Cytoskeletal Proteins/genetics; Escherichia coli/genetics; GTP Phosphohydrolases/genetics; Genetic Complementation Test; Mycobacterium tuberculosis/genetics; Mycobacterium tuberculosis/metabolism; Protein Multimerization/physiology
13.
Arch Biochem Biophys ; 516(1): 45-51, 2011 Dec 01.
Article in English | MEDLINE | ID: mdl-21964243

ABSTRACT

Type II phosphatidylinositol (PtdIns) 4-kinases produce PtdIns 4-phosphate, an early key signaling molecule in the phosphatidylinositol cycle, which is indispensable for T cell activation. Type II PtdIns 4-kinase alpha and beta have similar biochemical properties. To distinguish these isoforms, epigallocatechin gallate (EGCG) has been evaluated as a specific inhibitor. EGCG is the major active catechin in green tea and has anti-inflammatory, antiatherogenic and cancer chemopreventive properties. The precise mechanisms of action and molecular targets of EGCG in early signaling cascades are not well understood. In the present study, we have shown that EGCG inhibits type II PtdIns 4-kinase (α and β isoforms) and PtdIns 3-kinase activity in vitro. EGCG binds directly to both the alpha and beta isoforms of type II PtdIns 4-kinase, with Kd values of 2.62 µM and 1.02 µM, respectively. The type II PtdIns 4-kinase-EGCG complexes have different binding patterns in the excited state. Both isoforms showed a significant change in helicity upon binding EGCG. EGCG exerts its effect by interacting with the ATP binding pocket; the residues likely to be involved in EGCG binding were predicted by Autodock. Our findings suggest that EGCG inhibits the two isoforms and could be a key to regulating T cell activation.


Subject(s)
1-Phosphatidylinositol 4-Kinase/antagonists & inhibitors; 1-Phosphatidylinositol 4-Kinase/metabolism; Anticarcinogenic Agents/pharmacology; Catechin/analogs & derivatives; Enzyme Inhibitors/pharmacology; Phosphatidylinositols/metabolism; 1-Phosphatidylinositol 4-Kinase/chemistry; Amino Acid Sequence; Binding Sites; Camellia sinensis/chemistry; Catechin/pharmacology; Humans; Jurkat Cells; Models, Molecular; Molecular Sequence Data; Neoplasms/prevention & control; Protein Binding; Protein Isoforms/antagonists & inhibitors; Protein Isoforms/chemistry; Protein Isoforms/metabolism; Sequence Alignment
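Note on the Kd values in entry 13: a dissociation constant can be turned into an expected fraction of protein bound at a given ligand concentration. The sketch below assumes simple 1:1 binding with free ligand in excess, which the abstract does not state; the function name and the 5 µM test concentration are hypothetical.

```python
def fraction_bound(ligand_um: float, kd_um: float) -> float:
    """Fractional occupancy for 1:1 binding: [L] / ([L] + Kd), both in micromolar."""
    if ligand_um < 0 or kd_um <= 0:
        raise ValueError("ligand must be >= 0 and Kd must be > 0")
    return ligand_um / (ligand_um + kd_um)


# Kd values reported in the abstract for the alpha and beta isoforms.
KD_ALPHA_UM = 2.62
KD_BETA_UM = 1.02

for name, kd in (("alpha", KD_ALPHA_UM), ("beta", KD_BETA_UM)):
    print(name, round(fraction_bound(5.0, kd), 2))
```

Under this model, occupancy is exactly one half when the ligand concentration equals Kd, so the lower Kd of the beta isoform corresponds to tighter binding.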
14.
Cell Genom ; 1(2), 2021 Nov 10.
Article in English | MEDLINE | ID: mdl-35311178

ABSTRACT

Maximizing the personal, public, research, and clinical value of genomic information will require the reliable exchange of genetic variation data. We report here the Variation Representation Specification (VRS, pronounced "verse"), an extensible framework for the computable representation of variation that complements contemporary human-readable and flat file standards for genomic variation representation. VRS provides semantically precise representations of variation and leverages this design to enable federated identification of biomolecular variation with globally consistent and unique computed identifiers. The VRS framework includes a terminology and information model, machine-readable schema, data sharing conventions, and a reference implementation, each of which is intended to be broadly useful and freely available for community use. VRS was developed by a partnership among national information resource providers, public initiatives, and diagnostic testing laboratories under the auspices of the Global Alliance for Genomics and Health (GA4GH).
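Note on computed identifiers in entry 14: the abstract does not spell out how the identifiers are derived. The sketch below shows the truncated-digest convention commonly associated with GA4GH (often called sha512t24u): the SHA-512 digest of a byte string, truncated to its first 24 bytes and base64url-encoded. Treat this as an assumption about the scheme; real VRS identifiers are computed over a canonical serialization of a variation object and carry a type prefix, neither of which is reproduced here.

```python
import base64
import hashlib


def sha512t24u(blob: bytes) -> str:
    """Base64url encoding of the first 24 bytes of the SHA-512 digest of blob."""
    digest = hashlib.sha512(blob).digest()
    # 24 bytes encode to exactly 32 base64 characters, so no padding appears.
    return base64.urlsafe_b64encode(digest[:24]).decode("ascii")


print(sha512t24u(b""))  # z4PhNX7vuL3xVChQ1m2AB9Yg5AULVxXc
```

Because the identifier depends only on the content being hashed, two parties that serialize the same variation identically arrive at the same identifier without consulting a central registry.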

15.
J Proteome Res ; 9(9): 4433-42, 2010 Sep 03.
Article in English | MEDLINE | ID: mdl-20681595

ABSTRACT

Structure-based drug design of protein-kinase inhibitors has been facilitated by availability of an enormous number of structures in the Protein Databank (PDB), systematic analyses of which can provide insight into the factors that govern ligand-protein kinase interactions and into the conformational variability of the protein kinases. In this study, a nonredundant database containing 755 unique, curated, and annotated PDB protein kinase-inhibitor complexes (each consisting of a single protein kinase chain, a ligand, and water molecules around the ligand) was created. With this dataset, analyses were performed of protein conformational variability and interactions of ligands with 11 P-loop residues. Analysis of ligand-protein interactions included ligand atom preference, ligand-protein hydrogen bonds, and the number and position of crystallographic water molecules around important P-loop residues. Analysis of variability in the conformation of the P-loop considered backbone and side-chain dihedral angles, and solvent accessible surface area (SASA). A distorted conformation of the P-loop was observed for some of the protein kinase structures. Lower SASA was observed for the hydrophobic residue in beta1 of several members of the AGC family of protein kinases. Our systematic studies were performed amino acid-by-amino acid, which is unusual for analyses of protein kinase-inhibitor complexes.


Subject(s)
Computational Biology/methods; Databases, Protein; Protein Kinase Inhibitors/chemistry; Protein Kinases/chemistry; Amino Acid Sequence; Animals; Humans; Models, Molecular; Protein Conformation; Protein Interaction Domains and Motifs; Protein Interaction Mapping; Protein Kinase Inhibitors/metabolism; Protein Kinases/metabolism
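Note on the dihedral-angle analysis in entry 15: backbone and side-chain dihedrals are each defined by four consecutive atoms. The sketch below computes such an angle from Cartesian coordinates using the standard projection-and-atan2 construction. It is a generic geometric routine, not the authors' code; the function name and test coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Vec = Tuple[float, float, float]


def _sub(a: Sequence[float], b: Sequence[float]) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Sequence[float], b: Sequence[float]) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0: Sequence[float], p1: Sequence[float],
                 p2: Sequence[float], p3: Sequence[float]) -> float:
    """Dihedral angle in degrees about the p1-p2 bond, in the range (-180, 180]."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    if norm == 0.0:
        raise ValueError("p1 and p2 must be distinct points")
    axis = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project the two outer bonds onto the plane perpendicular to the central bond.
    d0 = _dot(b0, axis)
    d2 = _dot(b2, axis)
    v = _sub(b0, (axis[0] * d0, axis[1] * d0, axis[2] * d0))
    w = _sub(b2, (axis[0] * d2, axis[1] * d2, axis[2] * d2))
    x = _dot(v, w)
    y = _dot(_cross(axis, v), w)
    return math.degrees(math.atan2(y, x))


# Four hypothetical atoms in an eclipsed (cis) arrangement.
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))
```

For a backbone phi angle the four points would be the C of the preceding residue followed by N, CA and C of the residue of interest.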
16.
Pac Symp Biocomput ; 25: 67-78, 2020.
Article in English | MEDLINE | ID: mdl-31797587

ABSTRACT

As genetic sequencing costs decrease, the lack of clinical interpretation of variants has become the bottleneck in using genetics data. A major rate-limiting step in clinical interpretation is the manual curation of evidence in the genetic literature by highly trained biocurators. What makes curation particularly time-consuming is that the curator needs to identify papers that study variant pathogenicity using different types of approaches and evidence, e.g., biochemical assays or case-control analysis. In collaboration with the Clinical Genome Resource (ClinGen), the flagship NIH program for clinical curation, we propose the first machine learning system, LitGen, that can retrieve papers for a particular variant and filter them by the specific evidence types used by curators to assess pathogenicity. LitGen uses semi-supervised deep learning to predict the type of evidence provided by each paper. It is trained on papers annotated by ClinGen curators and systematically evaluated on new test data collected by ClinGen. LitGen further leverages rich human explanations and unlabeled data to gain a 7.9%-12.6% relative performance improvement over models learned only on the annotated papers. It is a useful framework to improve clinical variant curation.


Subject(s)
Computational Biology; Genetic Variation; Case-Control Studies; Humans
17.
Biochim Biophys Acta ; 1768(6): 1628-40, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17408589

ABSTRACT

The structure and dynamics of a single GM1 (Gal5-beta1,3-GalNAc4-beta1,4-(NeuAc3-alpha2,3)-Gal2-beta1,4-Glc1-beta1,1-Cer) embedded in a DPPC bilayer have been studied by MD simulations. Eleven simulations, each with a 10 ns production run, were performed with different initial conformations of GM1. Simulations of GM1-Os in water and of a DPPC bilayer were also performed to delineate the effects of the bilayer and GM1 on the conformational and orientational dynamics of each other. The conformation of the GM1 headgroup observed in the simulations is in agreement with those reported in the literature, but the headgroup is restricted when embedded in the bilayer. NeuAc3 is the outermost saccharide towards the water phase. Glc1 and Gal2 prefer a parallel, and NeuAc3, GalNAc4 and Gal5 prefer a perpendicular, orientation with respect to the bilayer normal. The overall characteristics of the bilayer are not affected by the presence of GM1; however, GM1 does influence the DPPC molecules in its immediate vicinity. The implications of these observations for the specific recognition and binding of GM1 embedded in a lipid bilayer by exogenous proteins as well as proteins embedded in lipids are discussed.


Subject(s)
1,2-Dipalmitoylphosphatidylcholine/chemistry; G(M1) Ganglioside/chemistry; Lipid Bilayers/chemistry; Models, Molecular; Computer Simulation; Molecular Structure; Protein Conformation
18.
J Phys Chem B ; 112(11): 3346-56, 2008 Mar 20.
Article in English | MEDLINE | ID: mdl-18298108

ABSTRACT

Gangliosides are a group of structurally diverse, sialic acid containing glycosphingolipids embedded into the membrane via their hydrophobic ceramide moiety. To gain atomic level insights into the structural perturbations caused by Galbeta3GalNAcbeta4(NeuAcalpha3)Galbeta4Glc1Cer (GM1), molecular dynamics (MD) simulations of a 1,2-dipalmitoyl-sn-glycero-3-phosphatidylcholine (DPPC) bilayer containing GM1 at five different concentrations have been performed. Biological membranes contain GM1 only on the exoplasmic leaflet. However, vesicles prepared in the laboratory contain GM1 in both the leaflets albeit unequally. Hence, simulations were performed with GM1 present in only one (asymmetric bilayers) or in both of the leaflets (symmetric bilayers) of the bilayer. In symmetric bilayers, there is a decrease in surface area, an increase in deuterium order parameter, and an increase in peak-to-peak distance of DPPC with increasing concentration of GM1. Thus, the overall area of the lipid bilayer decreases (condensation effect) and the thickness increases with increasing concentrations of GM1. Even in asymmetric systems, decrease in surface area and increase in deuterium order parameter of hydrocarbon chains of DPPC are observed. However, the decrease in bilayer area and the increase in bilayer thickness are not as much as in the symmetric bilayer.


Subject(s)
1,2-Dipalmitoylphosphatidylcholine/chemistry; Computer Simulation; G(M1) Ganglioside/chemistry; Lipid Bilayers/chemistry; Membrane Fluidity; Deuterium/chemistry; G(M1) Ganglioside/metabolism; Hydrocarbons/chemistry; Hydrophobic and Hydrophilic Interactions; Lipid Bilayers/metabolism; Membrane Microdomains/chemistry; Membrane Microdomains/metabolism; Molecular Conformation
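Note on the deuterium order parameter in entry 18: the abstract reports its trend but not its definition. The usual definition is S_CD = <(3 cos^2(theta) - 1) / 2>, where theta is the angle between a C-D (or C-H) bond and the bilayer normal and the average runs over lipids and frames. The sketch below evaluates that average for a list of angles; the function name and inputs are hypothetical, and simulation papers often report the magnitude or the negative of this value.

```python
import math
from typing import Sequence


def order_parameter(angles_deg: Sequence[float]) -> float:
    """Average of (3*cos^2(theta) - 1) / 2 over bond angles given in degrees."""
    if not angles_deg:
        raise ValueError("at least one angle is required")
    total = 0.0
    for angle in angles_deg:
        cos_theta = math.cos(math.radians(angle))
        total += 0.5 * (3.0 * cos_theta ** 2 - 1.0)
    return total / len(angles_deg)


# Bonds aligned with the normal give 1.0; bonds perpendicular to it give -0.5.
print(order_parameter([0.0]), order_parameter([90.0]))
```

Larger magnitudes indicate more ordered acyl chains, which is consistent with the condensation effect described in the abstract.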
19.
Science ; 361(6409), 2018 09 28.
Article in English | MEDLINE | ID: mdl-30139913

ABSTRACT

To assess the impact of genetic variation in regulatory loci on human health, we constructed a high-resolution map of allelic imbalances in DNA methylation, histone marks, and gene transcription in 71 epigenomes from 36 distinct cell and tissue types from 13 donors. Deep whole-genome bisulfite sequencing of 49 methylomes revealed sequence-dependent CpG methylation imbalances at thousands of heterozygous regulatory loci. Such loci are enriched for stochastic switching, which is defined as random transitions between fully methylated and unmethylated states of DNA. The methylation imbalances at thousands of loci are explainable by different relative frequencies of the methylated and unmethylated states for the two alleles. Further analyses provided a unifying model that links sequence-dependent allelic imbalances of the epigenome, stochastic switching at gene regulatory loci, and disease-associated genetic variation.


Subject(s)
Allelic Imbalance; DNA Methylation; Disease/genetics; Epigenesis, Genetic; Genome, Human; Polymorphism, Single Nucleotide; Alleles; Binding Sites; CpG Islands; Gene Regulatory Networks; Genetic Loci; Genome-Wide Association Study; Humans; Sequence Analysis, DNA; Sulfites/chemistry; Transcription Factors/metabolism
20.
J Mol Graph Model ; 26(1): 255-68, 2007 Jul.
Article in English | MEDLINE | ID: mdl-17212986

ABSTRACT

beta3GalTs are type II transmembrane proteins that transfer galactose from the UDP-Gal donor substrate to acceptor GlcNAc, GalNAc or Gal in beta1-->3 linkage. beta1-->3-linked galactose has been found to be a part of many glycans, such as glycosphingolipids, the core tetrasaccharide of proteoglycans, and type 1 chains. The 3-D structure of none of the beta3GalTs is known to date. In this study, the 3-D structures of human beta3GalT I, II, IV, V, VI and beta3GalNAcT I have been modeled using fold-recognition and comparative modeling methods. Residues that constitute the UDP-Gal binding site have been predicted. The models are able to qualitatively rationalize data from the site-directed mutagenesis experiments reported in the literature. Residues likely to be involved in conferring differential acceptor substrate specificity have been predicted by a combination of specificity determining positions (SDPs) prediction and subsequent mapping onto the generated 3-D models.


Subject(s)
Galactosyltransferases/chemistry; N-Acetylgalactosaminyltransferases/chemistry; Amino Acid Sequence; Animals; Carbohydrate Sequence; Catalytic Domain; Computer Simulation; Galactosyltransferases/metabolism; Humans; Mice; Models, Molecular; Molecular Sequence Data; N-Acetylgalactosaminyltransferases/metabolism; Oligosaccharides/chemistry; Oligosaccharides/metabolism; Protein Conformation; Protein Folding; Protein Structure, Secondary; Sequence Homology, Amino Acid; Software; Substrate Specificity