Results 1 - 20 of 29
2.
Genome Med ; 14(1): 6, 2022 01 18.
Article in English | MEDLINE | ID: mdl-35039090

ABSTRACT

BACKGROUND: Identification of clinically significant genetic alterations involved in human disease has been dramatically accelerated by developments in next-generation sequencing technologies. However, the infrastructure and accessible comprehensive curation tools necessary for analyzing an individual patient genome and interpreting genetic variants to inform healthcare management have been lacking. RESULTS: Here we present the ClinGen Variant Curation Interface (VCI), a global open-source variant classification platform for supporting the application of evidence criteria and classification of variants based on the ACMG/AMP variant classification guidelines. The VCI is among a suite of tools developed by the NIH-funded Clinical Genome Resource (ClinGen) Consortium and supports an FDA-recognized human variant curation process. Essential to this is the ability to enable collaboration and peer review across ClinGen Expert Panels supporting users in comprehensively identifying, annotating, and sharing relevant evidence while making variant pathogenicity assertions. To facilitate evidence-based improvements in human variant classification, the VCI is publicly available to the genomics community. Navigation workflows support users providing guidance to comprehensively apply the ACMG/AMP evidence criteria and document provenance for asserting variant classifications. CONCLUSIONS: The VCI offers a central platform for clinical variant classification that fills a gap in the learning healthcare system, facilitates widespread adoption of standards for clinical curation, and is available at https://curation.clinicalgenome.org.


Subject(s)
Genetic Variation , Human Genome , Humans , Genetic Testing , Genomics
3.
Cell Genom ; 1(2)2021 Nov 10.
Article in English | MEDLINE | ID: mdl-35311178

ABSTRACT

Maximizing the personal, public, research, and clinical value of genomic information will require the reliable exchange of genetic variation data. We report here the Variation Representation Specification (VRS, pronounced "verse"), an extensible framework for the computable representation of variation that complements contemporary human-readable and flat file standards for genomic variation representation. VRS provides semantically precise representations of variation and leverages this design to enable federated identification of biomolecular variation with globally consistent and unique computed identifiers. The VRS framework includes a terminology and information model, machine-readable schema, data sharing conventions, and a reference implementation, each of which is intended to be broadly useful and freely available for community use. VRS was developed by a partnership among national information resource providers, public initiatives, and diagnostic testing laboratories under the auspices of the Global Alliance for Genomics and Health (GA4GH).
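
The abstract does not say how the computed identifiers are derived. As a minimal sketch, assuming an identifier is a truncated SHA-512 digest (first 24 bytes, base64url-encoded, in the style of the GA4GH digest convention) taken over a canonical serialization of the variation object, the idea can be illustrated as follows. The function names, the JSON layout, and the `example` prefix are hypothetical and are not taken from the VRS reference implementation.

```python
import base64
import hashlib
import json


def sha512t24u(blob: bytes) -> str:
    """Truncated digest: SHA-512, first 24 bytes, base64url-encoded."""
    digest = hashlib.sha512(blob).digest()
    return base64.urlsafe_b64encode(digest[:24]).decode("ascii")


def computed_identifier(obj: dict, prefix: str = "example") -> str:
    """Derive an identifier from the content of an object (illustrative only)."""
    # Assumed canonical form: sorted keys and no insignificant whitespace,
    # so that the same content always serializes to the same bytes.
    blob = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return f"{prefix}:{sha512t24u(blob)}"


# Hypothetical variation object; the field names are made up for this sketch.
variant_a = {"sequence": "seq1", "start": 100, "end": 101, "alt": "T"}
variant_b = {"alt": "T", "end": 101, "start": 100, "sequence": "seq1"}

id_a = computed_identifier(variant_a)
id_b = computed_identifier(variant_b)
```

Because the identifier depends only on the serialized content, two parties that describe the same variation independently arrive at the same identifier without consulting a central registry, which is what makes federated identification possible.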

4.
Pac Symp Biocomput ; 25: 67-78, 2020.
Article in English | MEDLINE | ID: mdl-31797587

ABSTRACT

As genetic sequencing costs decrease, the lack of clinical interpretation of variants has become the bottleneck in using genetics data. A major rate-limiting step in clinical interpretation is the manual curation of evidence in the genetic literature by highly trained biocurators. What makes curation particularly time-consuming is that the curator needs to identify papers that study variant pathogenicity using different types of approaches and evidence, e.g. biochemical assays or case-control analysis. In collaboration with the Clinical Genome Resource (ClinGen), the flagship NIH program for clinical curation, we propose the first machine learning system, LitGen, that can retrieve papers for a particular variant and filter them by specific evidence types used by curators to assess for pathogenicity. LitGen uses semi-supervised deep learning to predict the type of evidence provided by each paper. It is trained on papers annotated by ClinGen curators and systematically evaluated on new test data collected by ClinGen. LitGen further leverages rich human explanations and unlabeled data to gain 7.9%-12.6% relative performance improvement over models learned only on the annotated papers. It is a useful framework to improve clinical variant curation.


Subject(s)
Computational Biology , Genetic Variation , Case-Control Studies , Humans
5.
Hum Mutat ; 39(11): 1690-1701, 2018 11.
Article in English | MEDLINE | ID: mdl-30311374

ABSTRACT

Effective exchange of information about genetic variants is currently hampered by the lack of readily available globally unique variant identifiers that would enable aggregation of information from different sources. The ClinGen Allele Registry addresses this problem by providing (1) globally unique "canonical" variant identifiers (CAids) on demand, either individually or in large batches; (2) access to variant-identifying information in a searchable Registry; (3) links to allele-related records in many commonly used databases; and (4) services for adding links to information about registered variants in external sources. A core element of the Registry is a canonicalization service, implemented using an in-memory sequence alignment-based index, which groups variant identifiers denoting the same nucleotide variant and assigns unique and dereferenceable CAids. More than 650 million distinct variants are currently registered, including those from gnomAD, ExAC, dbSNP, and ClinVar, as well as a small number of variants registered by Registry users. The Registry is accessible both via a web interface and programmatically via well-documented Hypertext Transfer Protocol (HTTP) Representational State Transfer (REST) Application Programming Interfaces (APIs). For programmatic interoperability, the Registry content is accessible in the JavaScript Object Notation for Linked Data (JSON-LD) format. We present several use cases and demonstrate how the linked information may provide raw material for reasoning about a variant's pathogenicity.


Subject(s)
Genetic Databases , Genetic Variation/genetics , Alleles , Humans , Registries , Software
6.
Hum Mutat ; 39(11): 1686-1689, 2018 11.
Article in English | MEDLINE | ID: mdl-30311379

ABSTRACT

The Clinical Genome Resource (ClinGen)'s work to develop a knowledge base to support the understanding of genes and variants for use in precision medicine and research depends on robust, broadly applicable, and adaptable technical standards for sharing data and information. To forward this goal, ClinGen has joined with the Global Alliance for Genomics and Health (GA4GH) to support the development of open, freely-available technical standards and regulatory frameworks for secure and responsible sharing of genomic and health-related data. In its capacity as one of the 15 inaugural GA4GH "Driver Projects," ClinGen is providing input on the key standards needs of the global genomics community, and has committed to participate on GA4GH Work Streams to support the development of: (1) a standard model for computer-readable variant representation; (2) a data model for linking variant data to annotations; (3) a specification to enable sharing of genomic variant knowledge and associated clinical interpretations; and (4) a set of best practices for use of phenotype and disease ontologies. ClinGen's participation as a GA4GH Driver Project will provide a robust environment to test drive emerging genomic knowledge sharing standards and prove their utility among the community, while accelerating the construction of the ClinGen evidence base.


Subject(s)
Human Genome/genetics , Information Dissemination/methods , Computational Biology , Genetic Databases , Genetic Variation , Genomics , Humans , Precision Medicine
7.
Hum Mutat ; 39(11): 1677-1685, 2018 11.
Article in English | MEDLINE | ID: mdl-30311382

ABSTRACT

The use of genome-scale sequencing allows for identification of genetic findings beyond the original indication for testing (secondary findings). The ClinGen Actionability Working Group's (AWG) protocol for evidence synthesis and semi-quantitative metric scoring evaluates four domains of clinical actionability for potential secondary findings: severity and likelihood of the outcome, and effectiveness and nature of the intervention. As of February 2018, the AWG has scored 127 genes associated with 78 disorders (up-to-date topics/scores are available at www.clinicalgenome.org). Scores across these disorders were assessed to compare genes/disorders recommended for return as secondary findings by the American College of Medical Genetics and Genomics (ACMG) with those not currently recommended. Disorders recommended by the ACMG scored higher on outcome-related domains (severity and likelihood), but not on intervention-related domains (effectiveness and nature of the intervention). Current practices indicate that return of secondary findings will expand beyond those currently recommended by the ACMG. The ClinGen AWG evidence reports and summary scores are not intended as classifications of actionability, rather they provide a resource to aid decision makers as they determine best practices regarding secondary findings. The ClinGen AWG is working with the ACMG Secondary Findings Committee to update future iterations of their secondary findings list.


Subject(s)
Human Genome/genetics , Genetic Databases , Exome/genetics , Genetic Testing , Genetic Variation/genetics , High-Throughput Nucleotide Sequencing , Humans
8.
Science ; 361(6409)2018 09 28.
Article in English | MEDLINE | ID: mdl-30139913

ABSTRACT

To assess the impact of genetic variation in regulatory loci on human health, we constructed a high-resolution map of allelic imbalances in DNA methylation, histone marks, and gene transcription in 71 epigenomes from 36 distinct cell and tissue types from 13 donors. Deep whole-genome bisulfite sequencing of 49 methylomes revealed sequence-dependent CpG methylation imbalances at thousands of heterozygous regulatory loci. Such loci are enriched for stochastic switching, which is defined as random transitions between fully methylated and unmethylated states of DNA. The methylation imbalances at thousands of loci are explainable by different relative frequencies of the methylated and unmethylated states for the two alleles. Further analyses provided a unifying model that links sequence-dependent allelic imbalances of the epigenome, stochastic switching at gene regulatory loci, and disease-associated genetic variation.


Subject(s)
Allelic Imbalance , DNA Methylation , Disease/genetics , Genetic Epigenesis , Human Genome , Single Nucleotide Polymorphism , Alleles , Binding Sites , CpG Islands , Gene Regulatory Networks , Genetic Loci , Genome-Wide Association Study , Humans , DNA Sequence Analysis , Sulfites/chemistry , Transcription Factors/metabolism
9.
Genome Med ; 9(1): 3, 2017 01 12.
Article in English | MEDLINE | ID: mdl-28081714

ABSTRACT

BACKGROUND: The success of the clinical use of sequencing based tests (from single gene to genomes) depends on the accuracy and consistency of variant interpretation. Aiming to improve the interpretation process through practice guidelines, the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) have published standards and guidelines for the interpretation of sequence variants. However, manual application of the guidelines is tedious and prone to human error. Web-based tools and software systems may not only address this problem but also document reasoning and supporting evidence, thus enabling transparency of evidence-based reasoning and resolution of discordant interpretations. RESULTS: In this report, we describe the design, implementation, and initial testing of the Clinical Genome Resource (ClinGen) Pathogenicity Calculator, a configurable system and web service for the assessment of pathogenicity of Mendelian germline sequence variants. The system allows users to enter the applicable ACMG/AMP-style evidence tags for a specific allele with links to supporting data for each tag and generate guideline-based pathogenicity assessment for the allele. Through automation and comprehensive documentation of evidence codes, the system facilitates more accurate application of the ACMG/AMP guidelines, improves standardization in variant classification, and facilitates collaborative resolution of discordances. The rules of reasoning are configurable with gene-specific or disease-specific guideline variations (e.g. cardiomyopathy-specific frequency thresholds and functional assays). The software is modular, equipped with robust application program interfaces (APIs), and available under a free open source license and as a cloud-hosted web service, thus facilitating both stand-alone use and integration with existing variant curation and interpretation systems. 
The Pathogenicity Calculator is accessible at http://calculator.clinicalgenome.org. CONCLUSIONS: By enabling evidence-based reasoning about the pathogenicity of genetic variants and by documenting supporting evidence, the Calculator contributes toward the creation of a knowledge commons and more accurate interpretation of sequence variants in research and clinical care.
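
To make the idea of generating a guideline-based assessment from evidence tags concrete, the following is a simplified sketch of the combining rules as they are commonly summarized from the 2015 ACMG/AMP guidelines. It is not the Calculator's code or API: the function name `classify` and the tag-counting approach are assumptions made for illustration, and the abstract notes that the real rules are configurable per gene or disease.

```python
from collections import Counter

# Evidence-strength prefixes; "PVS" is listed before "PS" for readability only,
# since no tag can match both.
PREFIXES = ("PVS", "PS", "PM", "PP", "BA", "BS", "BP")


def classify(tags):
    """Combine ACMG/AMP-style evidence tags (e.g. 'PVS1', 'PM2') into a class."""
    counts = Counter()
    for tag in tags:
        for prefix in PREFIXES:
            if tag.startswith(prefix):
                counts[prefix] += 1
                break
    pvs, ps, pm, pp = counts["PVS"], counts["PS"], counts["PM"], counts["PP"]
    ba, bs, bp = counts["BA"], counts["BS"], counts["BP"]

    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps >= 1 and (pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4)))
    )
    likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps >= 1 and pm >= 1)
        or (ps >= 1 and pp >= 2)
        or pm >= 3
        or (pm >= 2 and pp >= 2)
        or (pm >= 1 and pp >= 4)
    )
    benign = ba >= 1 or bs >= 2
    likely_benign = (bs >= 1 and bp >= 1) or bp >= 2

    # Criteria met on both sides are treated as contradictory evidence.
    if (pathogenic or likely_pathogenic) and (benign or likely_benign):
        return "Uncertain significance"
    if pathogenic:
        return "Pathogenic"
    if likely_pathogenic:
        return "Likely pathogenic"
    if benign:
        return "Benign"
    if likely_benign:
        return "Likely benign"
    return "Uncertain significance"
```

For example, `classify(["PVS1", "PS3"])` yields "Pathogenic", while a single moderate tag such as `classify(["PM2"])` stays at "Uncertain significance". Encoding the rules as data-driven predicates like these is one way such a system could swap in gene-specific variations.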


Subject(s)
Disease/genetics , Genetic Variation , Human Genome , Software , Alleles , Computational Biology , Medical Genetics , Guidelines as Topic , Humans , Mutation
10.
Cell Rep ; 17(8): 2075-2086, 2016 11 15.
Article in English | MEDLINE | ID: mdl-27851969

ABSTRACT

Cancer progression depends on both cell-intrinsic processes and interactions between different cell types. However, large-scale assessment of cell type composition and molecular profiles of individual cell types within tumors remains challenging. To address this, we developed epigenomic deconvolution (EDec), an in silico method that infers cell type composition of complex tissues as well as DNA methylation and gene transcription profiles of constituent cell types. By applying EDec to The Cancer Genome Atlas (TCGA) breast tumors, we detect changes in immune cell infiltration related to patient prognosis, and a striking change in stromal fibroblast-to-adipocyte ratio across breast cancer subtypes. Furthermore, we show that a less adipose stroma tends to display lower levels of mitochondrial activity and to be associated with cancerous cells with higher levels of oxidative metabolism. These findings highlight the role of stromal composition in the metabolic coupling between distinct cell types within tumors.


Subject(s)
Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Epigenomics , Adipose Tissue/pathology , Breast Neoplasms/immunology , Breast Neoplasms/pathology , Carcinogenesis/genetics , Carcinogenesis/pathology , Tumor Cell Line , Computer Simulation , DNA Methylation/genetics , Disease Progression , Female , Neoplastic Gene Expression Regulation , Humans , Oxidation-Reduction , Phenotype , Reproducibility of Results , DNA Sequence Analysis , Stromal Cells/pathology , Tumor Microenvironment/genetics
12.
Am J Hum Genet ; 98(6): 1067-1076, 2016 06 02.
Article in English | MEDLINE | ID: mdl-27181684

ABSTRACT

Evaluating the pathogenicity of a variant is challenging given the plethora of types of genetic evidence that laboratories consider. Deciding how to weigh each type of evidence is difficult, and standards have been needed. In 2015, the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) published guidelines for the assessment of variants in genes associated with Mendelian diseases. Nine molecular diagnostic laboratories involved in the Clinical Sequencing Exploratory Research (CSER) consortium piloted these guidelines on 99 variants spanning all categories (pathogenic, likely pathogenic, uncertain significance, likely benign, and benign). Nine variants were distributed to all laboratories, and the remaining 90 were evaluated by three laboratories. The laboratories classified each variant by using both the laboratory's own method and the ACMG-AMP criteria. The agreement between the two methods used within laboratories was high (K-alpha = 0.91) with 79% concordance. However, there was only 34% concordance for either classification system across laboratories. After consensus discussions and detailed review of the ACMG-AMP criteria, concordance increased to 71%. Causes of initial discordance in ACMG-AMP classifications were identified, and recommendations on clarification and increased specification of the ACMG-AMP criteria were made. In summary, although an initial pilot of the ACMG-AMP guidelines did not lead to increased concordance in variant interpretation, comparing variant interpretations to identify differences and having a common framework to facilitate resolution of those differences were beneficial for improving agreement, allowing iterative movement toward increased reporting consistency for variants in genes associated with monogenic disease.
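
The concordance figures above can be read as the share of variants on which every participating laboratory reached the same classification. A minimal sketch of that calculation follows, assuming this definition; the function name and the three example variants are hypothetical and are not data from the study.

```python
def concordance(classifications):
    """Fraction of variants for which all laboratories assigned the same class.

    classifications maps a variant name to the list of labels it received,
    one label per laboratory.
    """
    if not classifications:
        raise ValueError("at least one variant is required")
    agreeing = sum(
        1 for labels in classifications.values() if len(set(labels)) == 1
    )
    return agreeing / len(classifications)


# Made-up labels for illustration only.
example = {
    "variant_1": ["pathogenic", "pathogenic", "pathogenic"],
    "variant_2": ["uncertain significance", "likely benign", "uncertain significance"],
    "variant_3": ["benign", "benign", "benign"],
}
rate = concordance(example)  # two of the three variants agree across labs
```

Under this strict definition a single dissenting laboratory makes a variant discordant, which helps explain why agreement across many laboratories can be much lower than agreement between two methods within one laboratory.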


Subject(s)
Biomedical Research , Genetic Testing/standards , Genetic Variation/genetics , Genomics/methods , Laboratories/standards , Mutation/genetics , DNA Sequence Analysis/standards , Statistical Data Interpretation , Evidence-Based Practice , Exome/genetics , Human Genome , Guidelines as Topic , High-Throughput Nucleotide Sequencing/methods , Humans , Incidental Findings , Software , United States
13.
Quant Biol ; 3(3): 115-123, 2015 Sep 01.
Article in English | MEDLINE | ID: mdl-26753103

ABSTRACT

Transcription factors (TFs) are major modulators of transcription and subsequent cellular processes. The binding of TFs to specific regulatory elements is governed by their specificity. Considering the gap between known TF sequence and specificity, specificity prediction frameworks are highly desired. Key inputs to such frameworks are the protein residues that modulate the specificity of the TF under consideration. Simple measures like mutual information (MI) to delineate specificity influencing residues (SIRs) from alignments fail due to structural constraints imposed by the three-dimensional structure of the protein. Structural restraints on the evolution of the amino-acid sequence lead to identification of false SIRs. In this manuscript we extended three methods (Direct Information, PSICOV and adjusted mutual information) that have been used to disentangle spurious indirect protein residue-residue contacts from direct contacts, to identify SIRs from joint alignments of amino acids and specificity. We predicted SIRs for homeodomain (HD), helix-loop-helix, LacI and GntR families of TFs using these methods and compared them to MI. Using various measures, we show that the performance of these three methods is comparable but better than MI. The implications of these methods for specificity prediction frameworks are discussed. The methods are implemented as an R package and available along with the alignments at stormo.wustl.edu/SpecPred.
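
For orientation, the MI baseline mentioned above can be sketched as the mutual information between one amino-acid alignment column and one specificity column of a joint alignment. This is an illustrative calculation only, not the authors' R package, and the toy columns are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two aligned columns of symbols."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), n_xy in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (n_xy / n) * math.log2(n_xy * n / (count_x[x] * count_y[y]))
    return mi


# Toy columns: residue at one protein position vs. preferred base at one
# binding-site position, one entry per TF in the alignment.
residues = list("RRKK")
bases_linked = list("GGAA")       # perfectly co-varies with the residue
bases_unlinked = list("GAGA")     # independent of the residue

mi_linked = mutual_information(residues, bases_linked)
mi_unlinked = mutual_information(residues, bases_unlinked)
```

A high score flags a residue as a candidate SIR, but, as the abstract points out, MI cannot tell a direct residue-to-specificity dependence from one that arises indirectly through structurally coupled residues.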

14.
Nucleic Acids Res ; 42(8): 4800-12, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24523353

ABSTRACT

Cys(2)-His(2) zinc finger proteins (ZFPs) are the largest family of transcription factors in higher metazoans. They also represent the most diverse family with regards to the composition of their recognition sequences. Although there are a number of ZFPs with characterized DNA-binding preferences, the specificity of the vast majority of ZFPs is unknown and cannot be directly inferred by homology due to the diversity of recognition residues present within individual fingers. Given the large number of unique zinc fingers and assemblies present across eukaryotes, a comprehensive predictive recognition model that could accurately estimate the DNA-binding specificity of any ZFP based on its amino acid sequence would have great utility. Toward this goal, we have used the DNA-binding specificities of 678 two-finger modules from both natural and artificial sources to construct a random forest-based predictive model for ZFP recognition. We find that our recognition model outperforms previously described determinant-based recognition models for ZFPs, and can successfully estimate the specificity of naturally occurring ZFPs with previously defined specificities.


Subject(s)
DNA-Binding Proteins/metabolism , Transcriptional Regulatory Elements , Transcription Factors/metabolism , Zinc Fingers , Artificial Intelligence , Binding Sites , DNA/chemistry , DNA-Binding Proteins/chemistry , Biological Models , Nucleotide Motifs , Transcription Factors/chemistry
15.
Bioinformatics ; 30(7): 941-8, 2014 Apr 01.
Article in English | MEDLINE | ID: mdl-24369152

ABSTRACT

MOTIVATION: Generating accurate transcription factor (TF) binding site motifs from data generated using next-generation sequencing, especially ChIP-seq, is challenging. The challenge arises because a typical experiment reports a large number of sequences bound by a TF, and each sequence is relatively long. Most traditional motif finders are slow in handling such enormous amounts of data. To overcome this limitation, tools have been developed that trade accuracy for speed by using heuristic discrete search strategies or limited optimization of identified seed motifs. However, such strategies may not fully use the information in input sequences to generate motifs. Such motifs often form good seeds and can be further improved with appropriate scoring functions and rapid optimization. RESULTS: We report a tool named discriminative motif optimizer (DiMO). DiMO takes a seed motif along with a positive and a negative database and improves the motif based on a discriminative strategy. We use the area under the receiver-operating characteristic curve (AUC) as a measure of the discriminating power of motifs and a strategy based on perceptron training that maximizes AUC rapidly in a discriminative manner. Using DiMO, on a large test set of 87 TFs from human, drosophila and yeast, we show that it is possible to significantly improve motifs identified by nine motif finders. The motifs are generated/optimized using training sets and evaluated on test sets. The AUC is improved for almost 90% of the TFs on test sets and the magnitude of increase is up to 39%. AVAILABILITY AND IMPLEMENTATION: DiMO is available at http://stormo.wustl.edu/DiMO
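
As a small illustration of the evaluation measure, the AUC of a motif can be computed as the probability that a randomly chosen positive sequence scores higher than a randomly chosen negative one. The sketch below assumes each sequence has already been reduced to a single motif score; it does not reproduce DiMO's perceptron-based optimization, and the scores are made up.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve from pairwise comparisons (ties count as 0.5)."""
    if not pos_scores or not neg_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical best-match motif scores for bound and unbound sequences.
bound = [8.1, 7.4, 5.0]
unbound = [5.0, 3.2, 2.9]
score = auc(bound, unbound)
```

The pairwise form is quadratic in the number of sequences, so it is meant for clarity; a rank-based computation gives the same value more efficiently on large ChIP-seq sets.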


Subject(s)
Artificial Intelligence , Transcription Factors/chemistry , Algorithms , Animals , Base Sequence , Binding Sites , DNA/chemistry , DNA/metabolism , Drosophila melanogaster , Humans , Protein Binding , Saccharomyces cerevisiae , Software , Transcription Factors/metabolism
16.
BMC Bioinformatics ; 13 Suppl 15: S3, 2012.
Article in English | MEDLINE | ID: mdl-23046442

ABSTRACT

BACKGROUND: In the context of drug discovery and development, much effort has been exerted to determine which conformers of a given molecule are responsible for the observed biological activity. In this work we aimed to predict bioactive conformers using a variant of supervised learning, named multiple-instance learning. A single molecule, treated as a bag of conformers, is biologically active if and only if at least one of its conformers, treated as an instance, is responsible for the observed bioactivity; and a molecule is inactive if none of its conformers is responsible for the observed bioactivity. The implementation requires instance-based embedding, and joint feature selection and classification. The goal of the present project is to implement multiple-instance learning in drug activity prediction, and subsequently to identify the bioactive conformers for each molecule. METHODS: We encoded the 3-dimensional structures using pharmacophore fingerprints which are binary strings, and accomplished instance-based embedding using calculated dissimilarity distances. Four dissimilarity measures were employed and their performances were compared. 1-norm SVM was used for joint feature selection and classification. The approach was applied to four data sets, and the best proposed model for each data set was determined by using the dissimilarity measure yielding the smallest number of selected features. RESULTS: The predictive abilities of the proposed approach were compared with three classical predictive models without instance-based embedding. The proposed approach produced the best predictive models for one data set and second best predictive models for the rest of the data sets, based on the external validations. To validate the ability of the proposed approach to find bioactive conformers, 12 small molecules with co-crystallized structures were seeded in one data set. 
10 out of 12 co-crystallized structures were indeed identified as significant conformers using the proposed approach. CONCLUSIONS: The proposed approach was proven not to suffer from overfitting and to be highly competitive with classical predictive models, so it is very powerful for drug activity prediction. The approach was also validated as a useful method for pursuit of bioactive conformers.
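
The multiple-instance setup described above can be sketched in a few lines. The abstract does not name its four dissimilarity measures, so the Tanimoto dissimilarity used here is an assumption, as is the embedding rule (each bag is mapped to its minimum dissimilarity to each prototype instance); all names and fingerprints are hypothetical.

```python
def tanimoto_dissimilarity(a: str, b: str) -> float:
    """1 - Tanimoto similarity between two equal-length binary fingerprints."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for x, y in zip(a, b) if x == "1" and y == "1")
    either = sum(1 for x, y in zip(a, b) if x == "1" or y == "1")
    return 0.0 if either == 0 else 1.0 - both / either


def embed_bag(bag, prototypes):
    """Map a bag of conformer fingerprints to one feature per prototype."""
    return [
        min(tanimoto_dissimilarity(instance, proto) for instance in bag)
        for proto in prototypes
    ]


def bag_label(instance_is_bioactive):
    """A molecule is active if and only if at least one conformer is."""
    return any(instance_is_bioactive)


# One molecule with two conformers, embedded against three prototype conformers.
molecule = ["1100", "0011"]
prototypes = ["1100", "0011", "1111"]
features = embed_bag(molecule, prototypes)
```

After embedding, every molecule is a fixed-length vector, so a sparse classifier such as the 1-norm SVM mentioned in the abstract can select features; each selected feature corresponds to a prototype conformer, which is how significant conformers could be read off the model.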


Subject(s)
Artificial Intelligence , Computational Biology/methods , Drug Discovery , Theoretical Models , Molecular Conformation , Quantitative Structure-Activity Relationship
17.
IEEE Trans Nanobioscience ; 11(3): 228-36, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22987128

ABSTRACT

There are a vast number of biology related research problems involving a combination of multiple sources of data to achieve a better understanding of the underlying problems. It is important to select and interpret the most important information from these sources. Thus it will be beneficial to have a good algorithm to simultaneously extract rules and select features for better interpretation of the predictive model. We propose an efficient algorithm, Combined Rule Extraction and Feature Elimination (CRF), based on 1-norm regularized random forests. CRF simultaneously extracts a small number of rules generated by random forests and selects important features. We applied CRF to several drug activity prediction and microarray data sets. CRF is capable of producing performance comparable with state-of-the-art prediction algorithms using a small number of decision rules. Some of the decision rules are biologically significant.


Subject(s)
Algorithms , Artificial Intelligence , Computational Biology/methods , Decision Trees , ATP-Binding Cassette Subfamily B Member 1/genetics , Factual Databases , Humans , Theoretical Models , Neoplasms/genetics , Oligonucleotide Array Sequence Analysis , Cannabinoid Receptors/genetics , Reproducibility of Results
18.
Xenobiotica ; 42(2): 139-56, 2012 Feb.
Article in English | MEDLINE | ID: mdl-21970716

ABSTRACT

RATIONALE: The therapeutic promise of trans-resveratrol (tRes) is limited by poor bioavailability following rapid metabolism. We hypothesise that trans-arachidin-1 (tA1) and trans-arachidin-3 (tA3), peanut hairy root-derived isoprenylated analogs of tRes, will exhibit slower metabolism/enhanced bioavailability and retain biological activity via cannabinoid receptor (CBR) binding relative to their non-prenylated parent compounds trans-piceatannol (tPice) and tRes, respectively. RESULTS: The activities of eight human UDP-glucuronosyltransferases (UGTs) toward these compounds were evaluated. The greatest activity was observed for extrahepatic UGTs 1A10 and 1A7, followed by hepatic UGTs 1A1 and 1A9. Importantly, an additional isoprenyl and/or hydroxyl group in tA1 and tA3 slowed overall glucuronidation. CBR binding studies demonstrated that all analogs bound to CB1Rs with similar affinities (5-18 µM); however, only tA1 and tA3 bound appreciably to CB2Rs. Molecular modelling studies confirmed that the isoprenyl moiety of tA1 and tA3 improved binding affinity to CB2Rs. Finally, although tA3 acted as a competitive CB1R antagonist, tA1 antagonised CB1R agonists by both competitive and non-competitive mechanisms. CONCLUSIONS: Prenylated stilbenoids may be preferable alternatives to tRes due to increased bioavailability via slowed metabolism. Similar structural analogs might be developed as novel CB therapeutics for obesity and/or drug dependency.


Subject(s)
Glucuronosyltransferase/chemistry , Hemiterpenes/pharmacology , Cannabinoid Receptors/chemistry , Stilbenes/chemistry , Stilbenes/pharmacology , Animals , Competitive Binding , Biological Availability , CHO Cells , High Pressure Liquid Chromatography , Cricetinae , Hemiterpenes/chemistry , Hemiterpenes/pharmacokinetics , Humans , Kinetics , Mass Spectrometry , Phase II Metabolic Detoxication , Molecular Models , Prenylation , Recombinant Proteins/chemistry , Resveratrol , Stilbenes/pharmacokinetics
19.
BMC Bioinformatics ; 12 Suppl 10: S22, 2011 Oct 18.
Article in English | MEDLINE | ID: mdl-22166097

ABSTRACT

BACKGROUND: It is commonly believed that including domain knowledge in a prediction model is desirable. However, representing and incorporating domain information in the learning process is, in general, a challenging problem. In this research, we consider domain information encoded by discrete or categorical attributes. A discrete or categorical attribute provides a natural partition of the problem domain, and hence divides the original problem into several non-overlapping sub-problems. In this sense, the domain information is useful if the partition simplifies the learning task. The goal of this research is to develop an algorithm to identify discrete or categorical attributes that maximally simplify the learning task. RESULTS: We consider restructuring a supervised learning problem via a partition of the problem space using a discrete or categorical attribute. A naive approach exhaustively searches all the possible restructured problems. It is computationally prohibitive when the number of discrete or categorical attributes is large. We propose a metric to rank attributes according to their potential to reduce the uncertainty of a classification task. It is quantified as a conditional entropy achieved using a set of optimal classifiers, each of which is built for a sub-problem defined by the attribute under consideration. To avoid high computational cost, we approximate the solution by the expected minimum conditional entropy with respect to random projections. This approach is tested on three artificial data sets, three cheminformatics data sets, and two leukemia gene expression data sets. Empirical results demonstrate that our method is capable of selecting a proper discrete or categorical attribute to simplify the problem, i.e., the performance of the classifier built for the restructured problem always beats that of the original problem. 
CONCLUSIONS: The proposed conditional entropy based metric is effective in identifying good partitions of a classification problem, hence enhancing the prediction performance.
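
The ranking idea can be illustrated with the plain conditional entropy of the class label given a categorical attribute. This is a deliberate simplification: the metric in the abstract is computed from optimal classifiers built per sub-problem and approximated with random projections, which this sketch does not attempt. The function names and toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def conditional_entropy(attribute, labels):
    """H(label | attribute): label entropy averaged over the attribute's partition."""
    if len(attribute) != len(labels) or not labels:
        raise ValueError("inputs must be non-empty and of equal length")
    groups = defaultdict(list)
    for value, label in zip(attribute, labels):
        groups[value].append(label)
    n = len(labels)
    return sum(len(group) / n * entropy(group) for group in groups.values())


# Toy data: attribute_a splits the samples into pure sub-problems,
# attribute_b leaves each sub-problem as mixed as the original.
labels = [1, 1, 0, 0]
attribute_a = ["x", "x", "y", "y"]
attribute_b = ["x", "y", "x", "y"]

h_a = conditional_entropy(attribute_a, labels)
h_b = conditional_entropy(attribute_b, labels)
```

Attributes would then be ranked by ascending conditional entropy, so `attribute_a` is preferred here because partitioning on it leaves no residual uncertainty about the class.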


Subject(s)
Artificial Intelligence , Biological Models , Algorithms , Entropy , Glycogen Synthase Kinase 3/antagonists & inhibitors , Glycogen Synthase Kinase 3 beta , Humans , Precursor Cell Lymphoblastic Leukemia-Lymphoma/drug therapy , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Prognosis , Cannabinoid Receptor CB1/metabolism , Cannabinoid Receptor CB2/metabolism
20.
Arch Biochem Biophys ; 516(1): 45-51, 2011 Dec 01.
Article in English | MEDLINE | ID: mdl-21964243

ABSTRACT

Type II phosphatidylinositol (PtdIns) 4-kinases produce PtdIns 4-phosphate, an early key signaling molecule in the phosphatidylinositol cycle, which is indispensable for T cell activation. Type II PtdIns 4-kinase alpha and beta have similar biochemical properties. To distinguish these isoforms, epigallocatechin gallate (EGCG) was evaluated as a specific inhibitor. EGCG is the major active catechin in green tea, with anti-inflammatory, antiatherogenic and cancer chemopreventive properties. The precise mechanisms of action and molecular targets of EGCG in early signaling cascades are not well understood. In the present study, we have shown that EGCG inhibits type II PtdIns 4-kinases (α and β isoforms) and PtdIns 3-kinase activity in vitro. EGCG binds directly to both the alpha and beta isoforms of type II PtdIns 4-kinases with a Kd of 2.62 µM and 1.02 µM, respectively. The type II PtdIns 4-kinase-EGCG complexes have different binding patterns in their excited states. Both isoforms showed a significant change in helicity upon binding with EGCG. EGCG exerts its effect by interacting with the ATP binding pocket; the residues likely to be involved in EGCG binding were predicted by Autodock. Our findings suggest that EGCG inhibits the two isoforms and could be a key to regulating T cell activation.


Subject(s)
1-Phosphatidylinositol 4-Kinase/antagonists & inhibitors , 1-Phosphatidylinositol 4-Kinase/metabolism , Anticarcinogenic Agents/pharmacology , Catechin/analogs & derivatives , Enzyme Inhibitors/pharmacology , Phosphatidylinositols/metabolism , 1-Phosphatidylinositol 4-Kinase/chemistry , Amino Acid Sequence , Binding Sites , Camellia sinensis/chemistry , Catechin/pharmacology , Humans , Jurkat Cells , Molecular Models , Molecular Sequence Data , Neoplasms/prevention & control , Protein Binding , Protein Isoforms/antagonists & inhibitors , Protein Isoforms/chemistry , Protein Isoforms/metabolism , Sequence Alignment