Results 1 - 20 of 33
1.
Cell ; 170(1): 199-212.e20, 2017 Jun 29.
Article in English | MEDLINE | ID: mdl-28666119

ABSTRACT

Type 2 diabetes (T2D) affects Latinos at twice the rate seen in populations of European descent. We recently identified a risk haplotype spanning SLC16A11 that explains ∼20% of the increased T2D prevalence in Mexico. Here, through genetic fine-mapping, we define a set of tightly linked variants likely to contain the causal allele(s). We show that variants on the T2D-associated haplotype have two distinct effects: (1) decreasing SLC16A11 expression in liver and (2) disrupting a key interaction with basigin, thereby reducing cell-surface localization. Both independent mechanisms reduce SLC16A11 function and suggest SLC16A11 is the causal gene at this locus. To gain insight into how SLC16A11 disruption impacts T2D risk, we demonstrate that SLC16A11 is a proton-coupled monocarboxylate transporter and that genetic perturbation of SLC16A11 induces changes in fatty acid and lipid metabolism that are associated with increased T2D risk. Our findings suggest that increasing SLC16A11 function could be therapeutically beneficial for T2D. VIDEO ABSTRACT.


Subject(s)
Diabetes Mellitus, Type 2/metabolism , Monocarboxylic Acid Transporters/genetics , Monocarboxylic Acid Transporters/metabolism , Basigin/metabolism , Cell Membrane/metabolism , Chromosomes, Human, Pair 17/metabolism , Gene Knockdown Techniques , Haplotypes , Hepatocytes/metabolism , Heterozygote , Histone Code , Humans , Liver/metabolism , Models, Molecular , Monocarboxylic Acid Transporters/chemistry
2.
Diabetologia ; 66(3): 495-507, 2023 03.
Article in English | MEDLINE | ID: mdl-36538063

ABSTRACT

AIMS/HYPOTHESIS: Type 2 diabetes is highly polygenic and influenced by multiple biological pathways. Rapid expansion in the number of type 2 diabetes loci can be leveraged to identify such pathways. METHODS: We developed a high-throughput pipeline to enable clustering of type 2 diabetes loci based on variant-trait associations. Our pipeline extracted summary statistics from genome-wide association studies (GWAS) for type 2 diabetes and related traits to generate a matrix of 323 variants × 64 trait associations and applied Bayesian non-negative matrix factorisation (bNMF) to identify genetic components of type 2 diabetes. Epigenomic enrichment analysis was performed in 28 cell types and single pancreatic cells. We generated cluster-specific polygenic scores and performed regression analysis in an independent cohort (N=25,419) to assess for clinical relevance. RESULTS: We identified ten clusters of genetic loci, recapturing the five from our prior analysis as well as novel clusters related to beta cell dysfunction, pronounced insulin secretion, and levels of alkaline phosphatase, lipoprotein A and sex hormone-binding globulin. Four clusters related to mechanisms of insulin deficiency, five to insulin resistance and one had an unclear mechanism. The clusters displayed tissue-specific epigenomic enrichment, notably with the two beta cell clusters differentially enriched in functional and stressed pancreatic beta cell states. Additionally, cluster-specific polygenic scores were differentially associated with patient clinical characteristics and outcomes. The pipeline was applied to coronary artery disease and chronic kidney disease, identifying multiple overlapping clusters with type 2 diabetes. CONCLUSIONS/INTERPRETATION: Our approach stratifies type 2 diabetes loci into physiologically interpretable genetic clusters associated with distinct tissues and clinical outcomes. 
The pipeline allows for efficient updating as additional GWAS become available and can be readily applied to other conditions, facilitating clinical translation of GWAS findings. Software to perform this clustering pipeline is freely available.


Subject(s)
Diabetes Mellitus, Type 2 , Humans , Diabetes Mellitus, Type 2/genetics , Genome-Wide Association Study , Genetic Predisposition to Disease/genetics , Bayes Theorem , Cluster Analysis , Polymorphism, Single Nucleotide
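
Note on entry 2: the clustering step described in that abstract factorises a non-negative variant × trait matrix. As a rough illustration only, the sketch below uses plain (non-Bayesian) NMF with Lee-Seung multiplicative updates in pure Python. It is not the authors' bNMF pipeline; the function names (`nmf`, `squared_error`), the toy matrix, and the choice of two components are assumptions made for this example.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def squared_error(v, w, h):
    """Squared Frobenius distance between v and the product w @ h."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw))

def nmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix v (n x m) into w (n x k) and h (k x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)] for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)] for i in range(n)]
    return w, h

# Toy data (invented): 4 variants x 4 traits forming two blocks.
toy = [[1.0, 1.0, 0.0, 0.0],
       [2.0, 2.0, 0.0, 0.0],
       [0.0, 0.0, 3.0, 3.0],
       [0.0, 0.0, 1.0, 1.0]]
w, h = nmf(toy, k=2)
# Assign each variant to the component where its weight is largest.
clusters = [max(range(2), key=lambda a: row[a]) for row in w]
print(clusters, squared_error(toy, w, h))
```

A Bayesian variant would additionally place priors on `w` and `h` so that the number of components can be inferred rather than fixed in advance; that part is not shown here.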
3.
Mol Cell ; 60(6): 941-52, 2015 Dec 17.
Article in English | MEDLINE | ID: mdl-26698662

ABSTRACT

In insects, brain-derived Prothoracicotropic hormone (PTTH) activates the receptor tyrosine kinase (RTK) Torso to initiate metamorphosis through the release of ecdysone. We have determined the crystal structure of silkworm PTTH in complex with the ligand-binding region of Torso. Here we show that ligand-induced Torso dimerization results from the sequential and negatively cooperative formation of asymmetric heterotetramers. Mathematical modeling of receptor activation based upon our biophysical studies shows that ligand pulses are "buffered" at low receptor levels, leading to a sustained signal. By contrast, high levels of Torso develop the signal intensity and duration of a noncooperative system. We propose that this may allow Torso to coordinate widely different functions from a single ligand by tuning receptor levels. Phylogenic analysis indicates that Torso is found outside arthropods, including human parasitic roundworms. Together, our findings provide mechanistic insight into how this receptor system, with roles in embryonic and adult development, is regulated.


Subject(s)
Bombyx/metabolism , Insect Hormones/chemistry , Insect Hormones/metabolism , Receptor Protein-Tyrosine Kinases/chemistry , Receptor Protein-Tyrosine Kinases/metabolism , Animals , Binding Sites , Bombyx/chemistry , Crystallography, X-Ray , Gene Expression Regulation, Developmental , Humans , Insect Proteins/chemistry , Insect Proteins/metabolism , Models, Molecular , Phylogeny , Protein Multimerization , Receptors, Interleukin-17/chemistry , Signal Transduction
4.
Diabetologia ; 61(6): 1315-1324, 2018 06.
Article in English | MEDLINE | ID: mdl-29626220

ABSTRACT

AIMS/HYPOTHESIS: Identifying the metabolite profile of individuals with normal fasting glucose (NFG [<5.55 mmol/l]) who progressed to type 2 diabetes may give novel insights into early type 2 diabetes disease interception and detection. METHODS: We conducted a population-based prospective study among 1150 Framingham Heart Study Offspring cohort participants, age 40-65 years, with NFG. Plasma metabolites were profiled by LC-MS/MS. Penalised regression models were used to select measured metabolites for type 2 diabetes incidence classification (training dataset) and to internally validate the discriminatory capability of selected metabolites beyond conventional type 2 diabetes risk factors (testing dataset). RESULTS: Over a follow-up period of 20 years, 95 individuals with NFG developed type 2 diabetes. Nineteen metabolites were selected repeatedly in the training dataset for type 2 diabetes incidence classification and were found to improve type 2 diabetes risk prediction beyond conventional type 2 diabetes risk factors (AUC was 0.81 for risk factors vs 0.90 for risk factors + metabolites, p = 1.1 × 10⁻⁴). Using pathway enrichment analysis, the nitrogen metabolism pathway, which includes three prioritised metabolites (glycine, taurine and phenylalanine), was significantly enriched for association with type 2 diabetes risk at the false discovery rate of 5% (p = 0.047). In adjusted Cox proportional hazard models, the type 2 diabetes risk per 1 SD increase in glycine, taurine and phenylalanine was 0.65 (95% CI 0.54, 0.78), 0.73 (95% CI 0.59, 0.9) and 1.35 (95% CI 1.11, 1.65), respectively. Mendelian randomisation demonstrated a similar relationship for type 2 diabetes risk per 1 SD genetically increased glycine (OR 0.89 [95% CI 0.8, 0.99]) and phenylalanine (OR 1.6 [95% CI 1.08, 2.4]). CONCLUSIONS/INTERPRETATION: In individuals with NFG, information from a discrete set of 19 metabolites improved prediction of type 2 diabetes beyond conventional risk factors.
In addition, the nitrogen metabolism pathway and its components emerged as a potential effector of earliest stages of type 2 diabetes pathophysiology.


Subject(s)
Blood Glucose/metabolism , Diabetes Mellitus, Type 2/blood , Glycated Hemoglobin/metabolism , Metabolomics , Adult , Aged , Area Under Curve , Computational Biology , Diabetes Mellitus, Type 2/metabolism , Female , Glycine/metabolism , Humans , Incidence , Male , Mendelian Randomization Analysis , Middle Aged , Phenylalanine/metabolism , Prospective Studies , ROC Curve , Risk Factors , Tandem Mass Spectrometry , Taurine/metabolism
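
Note on entry 4: the abstract compares two risk models by AUC (0.81 vs 0.90). As a reminder of what that number measures, the sketch below computes AUC as the fraction of (case, non-case) pairs in which the case receives the higher score, with ties counted as half. The function name `auc` and all scores are invented for illustration and are unrelated to the study data.

```python
def auc(scores, labels):
    """AUC as the share of (case, non-case) pairs ranked correctly; ties count 0.5."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))

# Invented example: 2 people who develop the outcome (1) and 3 who do not (0).
labels = [1, 1, 0, 0, 0]
baseline_scores = [0.6, 0.3, 0.5, 0.2, 0.1]   # hypothetical "risk factors only" model
extended_scores = [0.7, 0.6, 0.5, 0.2, 0.1]   # hypothetical "risk factors + markers" model

print(round(auc(baseline_scores, labels), 3))  # 0.833 (5 of 6 pairs ranked correctly)
print(auc(extended_scores, labels))            # 1.0
```

The pairwise loop is quadratic in sample size; for large cohorts a rank-based formulation gives the same value more efficiently.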
5.
PLoS Med ; 15(9): e1002654, 2018 09.
Article in English | MEDLINE | ID: mdl-30240442

ABSTRACT

BACKGROUND: Type 2 diabetes (T2D) is a heterogeneous disease for which (1) disease-causing pathways are incompletely understood and (2) subclassification may improve patient management. Unlike other biomarkers, germline genetic markers do not change with disease progression or treatment. In this paper, we test whether a germline genetic approach informed by physiology can be used to deconstruct T2D heterogeneity. First, we aimed to categorize genetic loci into groups representing likely disease mechanistic pathways. Second, we asked whether the novel clusters of genetic loci we identified have any broad clinical consequence, as assessed in four separate subsets of individuals with T2D. METHODS AND FINDINGS: In an effort to identify mechanistic pathways driven by established T2D genetic loci, we applied Bayesian nonnegative matrix factorization (bNMF) clustering to genome-wide association study (GWAS) results for 94 independent T2D genetic variants and 47 diabetes-related traits. We identified five robust clusters of T2D loci and traits, each with distinct tissue-specific enhancer enrichment based on analysis of epigenomic data from 28 cell types. Two clusters contained variant-trait associations indicative of reduced beta cell function, differing from each other by high versus low proinsulin levels. The three other clusters displayed features of insulin resistance: obesity mediated (high body mass index [BMI] and waist circumference [WC]), "lipodystrophy-like" fat distribution (low BMI, adiponectin, and high-density lipoprotein [HDL] cholesterol, and high triglycerides), and disrupted liver lipid metabolism (low triglycerides). Increased cluster genetic risk scores were associated with distinct clinical outcomes, including increased blood pressure, coronary artery disease (CAD), and stroke. 
We evaluated the potential for clinical impact of these clusters in four studies containing individuals with T2D (Metabolic Syndrome in Men Study [METSIM], N = 487; Ashkenazi, N = 509; Partners Biobank, N = 2,065; UK Biobank [UKBB], N = 14,813). Individuals with T2D in the top genetic risk score decile for each cluster reproducibly exhibited the predicted cluster-associated phenotypes, with approximately 30% of all individuals assigned to just one cluster top decile. Limitations of this study include that the genetic variants used in the cluster analysis were restricted to those associated with T2D in populations of European ancestry. CONCLUSION: Our approach identifies salient T2D genetically anchored and physiologically informed pathways, and supports the use of genetics to deconstruct T2D heterogeneity. Classification of patients by these genetic pathways may offer a step toward genetically informed T2D patient management.


Subject(s)
Diabetes Mellitus, Type 2/classification , Diabetes Mellitus, Type 2/genetics , Genetic Loci , Multigene Family , Algorithms , Bayes Theorem , Cluster Analysis , Cohort Studies , Cross-Sectional Studies , Databases, Genetic , Female , Founder Effect , Genetic Markers , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Insulin/deficiency , Insulin/genetics , Insulin Resistance/genetics , Male , Phenotype , Prospective Studies , Risk Factors
6.
Mol Biol Evol ; 31(10): 2557-72, 2014 Oct.
Article in English | MEDLINE | ID: mdl-24951729

ABSTRACT

MicroRNAs (miRNAs) are endogenous RNA molecules that regulate gene expression posttranscriptionally. To date, the emergence of miRNAs and their patterns of sequence evolution have been analyzed in great detail. However, the extent to which miRNA expression levels have evolved over time, the role different evolutionary forces play in shaping these changes, and whether this variation in miRNA expression can reveal the interplay between miRNAs and mRNAs remain poorly understood. This is especially true for miRNA expressed during key developmental transitions. Here, we assayed miRNA expression levels immediately before (≥18BPF [18 h before puparium formation]) and after (PF) the increase in the hormone ecdysone responsible for triggering metamorphosis. We did so in four strains of Drosophila melanogaster and two closely related species. In contrast to their sequence conservation, approximately 25% of miRNAs analyzed showed significant within-species variation in male expression levels at ≥18BPF and/or PF. Additionally, approximately 33% showed modifications in their pattern of expression bias between developmental timepoints. A separate analysis of the ≥18BPF and PF stages revealed that changes in miRNA abundance accumulate linearly over evolutionary time at PF but not at ≥18BPF. Importantly, ≥18BPF-enriched miRNAs showed the greatest variation in expression levels both within and between species, so are the less likely to evolve under stabilizing selection. Functional attributes, such as expression ubiquity, appeared more tightly associated with lower levels of miRNA expression polymorphism at PF than at ≥18BPF. Furthermore, ≥18BPF- and PF-enriched miRNAs showed opposite patterns of covariation in expression with mRNAs, which denoted the type of regulatory relationship between miRNAs and mRNAs. Collectively, our results show contrasting patterns of functional divergence associated with miRNA expression levels during Drosophila ontogeny.


Subject(s)
Drosophila melanogaster/growth & development , Metamorphosis, Biological , MicroRNAs/genetics , Animals , Conserved Sequence , Drosophila melanogaster/classification , Drosophila melanogaster/genetics , Evolution, Molecular , Female , Gene Expression Profiling , Gene Expression Regulation, Developmental , Genetic Variation , Male , Molecular Sequence Data , Phylogeny , Sex Characteristics
8.
Genome Res ; 20(8): 1084-96, 2010 Aug.
Article in English | MEDLINE | ID: mdl-20601587

ABSTRACT

During evolution, gene repatterning across eukaryotic genomes is not uniform. Some genomic regions exhibit a gene organization conserved phylogenetically, while others are recurrently involved in chromosomal rearrangement, resulting in breakpoint reuse. Both gene order conservation and breakpoint reuse can result from the existence of functional constraints on where chromosomal breakpoints occur or from the existence of regions that are susceptible to breakage. The balance between these two mechanisms is still poorly understood. Drosophila species have very dynamic genomes and, therefore, can be very informative. We compared the gene organization of the main five chromosomal elements (Muller's elements A-E) of nine Drosophila species. Under a parsimonious evolutionary scenario, we estimate that 6116 breakpoints differentiate the gene orders of the species and that breakpoint reuse is associated with approximately 80% of the orthologous landmarks. The comparison of the observed patterns of change in gene organization with those predicted under different simulated modes of evolution shows that fragile regions alone can explain the observed key patterns of Muller's element A (X chromosome) more often than for any other Muller's element. High levels of fragility plus constraints operating on approximately 15% of the genome are sufficient to explain the observed patterns of change and conservation across species. The orthologous landmarks more likely to be under constraint exhibit both a remarkable internal functional heterogeneity and a lack of common functional themes with the exception of the presence of highly conserved noncoding elements. Fragile regions rather than functional constraints have been the main determinant of the evolution of the Drosophila chromosomes.


Subject(s)
Chromosome Fragile Sites/genetics , Drosophila/genetics , Gene Order , Genome, Insect , Animals , Base Sequence , Chromosome Breakpoints , Chromosome Inversion/genetics , Evolution, Molecular , Female , Gene Expression , Male , X Chromosome/genetics
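
Note on entry 8: the abstract counts breakpoints between gene orders and discusses breakpoint reuse. The sketch below shows the standard pairwise definition on a signed gene order compared against a reference order 1..n; it is a toy illustration, not the multi-species parsimony estimate the authors describe, and the helper names `count_breakpoints` and `invert` are assumptions.

```python
def count_breakpoints(signed_order):
    """Count breakpoints of a signed gene order relative to the reference 1..n.

    The order is framed with 0 and n+1; an adjacency (a, b) is conserved
    only when b - a == 1, which also covers inverted blocks such as (-3, -2).
    """
    n = len(signed_order)
    framed = [0] + list(signed_order) + [n + 1]
    return sum(1 for a, b in zip(framed, framed[1:]) if b - a != 1)

def invert(order, i, j):
    """Invert the segment order[i..j]: reverse it and flip every sign."""
    return order[:i] + [-g for g in reversed(order[i:j + 1])] + order[j + 1:]

reference = [1, 2, 3, 4]
one_inversion = invert(reference, 1, 2)        # [1, -3, -2, 4]
two_inversions = invert(one_inversion, 1, 3)   # [1, -4, 2, 3]

print(count_breakpoints(reference))       # 0
print(count_breakpoints(one_inversion))   # 2
print(count_breakpoints(two_inversions))  # 3
```

Two inversions with four distinct endpoints would leave four breakpoints; here the second inversion starts at the same position as the first, so only three remain, which is the signature of breakpoint reuse.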
9.
J Med Genet ; 49(12): 747-52, 2012 Dec.
Article in English | MEDLINE | ID: mdl-23118445

ABSTRACT

BACKGROUND: Musical abilities such as recognising music and singing performance serve as means for communication and are instruments in sexual selection. Specific regions of the brain have been found to be activated by musical stimuli, but these have rarely been extended to the discovery of genes and molecules associated with musical ability. METHODS: A total of 1008 individuals from 73 families were enrolled and a pitch-production accuracy test was applied to determine musical ability. To identify genetic loci and variants that contribute to musical ability, we conducted family-based linkage and association analyses, and incorporated the results with data from exome sequencing and array comparative genomic hybridisation analyses. RESULTS: We found significant evidence of linkage at 4q23 with the nearest marker D4S2986 (LOD=3.1), whose supporting interval overlaps a previous study in Finnish families, and identified an intergenic single nucleotide polymorphism (SNP) (rs1251078, p = 8.4 × 10⁻¹⁷) near UGT8, a gene highly expressed in the central nervous system and known to act in brain organisation. In addition, a non-synonymous SNP in UGT8 was revealed to be highly associated with musical ability (rs4148254, p = 8.0 × 10⁻¹⁷), and a 6.2 kb copy number loss near UGT8 showed a plausible association with musical ability (p = 2.9 × 10⁻⁶). CONCLUSIONS: This study provides new insight into the genetics of musical ability, exemplifying a methodology to assign functional significance to synonymous and non-coding alleles by integrating multiple experimental methods.


Subject(s)
Asian People/genetics , Ganglioside Galactosyltransferase/genetics , Music , Polymorphism, Single Nucleotide , Psychomotor Performance , Adolescent , Adult , Comparative Genomic Hybridization , Exome , Family , Female , Genetic Association Studies , Genetic Linkage , Humans , Male , Middle Aged , Mongolia , Young Adult
10.
Cell Metab ; 35(4): 695-710.e6, 2023 04 04.
Article in English | MEDLINE | ID: mdl-36963395

ABSTRACT

Associations between human genetic variation and clinical phenotypes have become a foundation of biomedical research. Most repositories of these data seek to be disease-agnostic and therefore lack disease-focused views. The Type 2 Diabetes Knowledge Portal (T2DKP) is a public resource of genetic datasets and genomic annotations dedicated to type 2 diabetes (T2D) and related traits. Here, we seek to make the T2DKP more accessible to prospective users and more useful to existing users. First, we evaluate the T2DKP's comprehensiveness by comparing its datasets with those of other repositories. Second, we describe how researchers unfamiliar with human genetic data can begin using and correctly interpreting them via the T2DKP. Third, we describe how existing users can extend their current workflows to use the full suite of tools offered by the T2DKP. We finally discuss the lessons offered by the T2DKP toward the goal of democratizing access to complex disease genetic results.


Subject(s)
Diabetes Mellitus, Type 2 , Humans , Diabetes Mellitus, Type 2/genetics , Access to Information , Prospective Studies , Genomics/methods , Phenotype
11.
Dev Cell ; 57(3): 387-397.e4, 2022 02 07.
Article in English | MEDLINE | ID: mdl-35134345

ABSTRACT

Lipid droplets (LDs) are organelles of cellular lipid storage with fundamental roles in energy metabolism and cell membrane homeostasis. There has been an explosion of research into the biology of LDs, in part due to their relevance in diseases of lipid storage, such as atherosclerosis, obesity, type 2 diabetes, and hepatic steatosis. Consequently, there is an increasing need for a resource that combines datasets from systematic analyses of LD biology. Here, we integrate high-confidence, systematically generated human, mouse, and fly data from studies on LDs in the framework of an online platform named the "Lipid Droplet Knowledge Portal" (https://lipiddroplet.org/). This scalable and interactive portal includes comprehensive datasets, across a variety of cell types, for LD biology, including transcriptional profiles of induced lipid storage, organellar proteomics, genome-wide screen phenotypes, and ties to human genetics. This resource is a powerful platform that can be utilized to identify determinants of lipid storage.


Subject(s)
Databases as Topic , Lipid Droplets/metabolism , Animals , Cholesterol Esters/metabolism , Data Mining , Genome , Humans , Inflammation/pathology , Lipid Metabolism , Liver/metabolism , Male , Mice, Inbred C57BL , Phenotype , Phosphorylation , RNA Interference
12.
J Comput Chem ; 32(4): 568-81, 2011 Mar.
Article in English | MEDLINE | ID: mdl-20812324

ABSTRACT

Molecular recognition plays a fundamental role in all biological processes, which is why great efforts have been made to understand and predict protein-ligand interactions. Finding a molecule that can potentially bind to a target protein is particularly essential in drug discovery and remains an expensive and time-consuming task. In silico tools are frequently used to screen molecular libraries to identify new lead compounds, and if the protein structure is known, various protein-ligand docking programs can be used. The aim of a docking procedure is to predict correct poses of the ligand in the binding site of the protein, as well as to score them according to the strength of interaction, in a reasonable time frame. The purpose of our studies was to present a novel consensus approach to predict both the protein-ligand complex structure and its corresponding binding affinity. Our method used as input the results from seven docking programs (Surflex, LigandFit, Glide, GOLD, FlexX, eHiTS, and AutoDock) that are widely used for docking of ligands. We evaluated it on an extensive benchmark dataset of 1300 protein-ligand pairs from the refined PDBbind database for which structural and affinity data were available. We independently compared its scoring and posing ability with that of previously proposed methods. In most cases, our method is able to dock properly approximately 20% of pairs more than docking methods on average, and over 10% of pairs more than the best single program. The RMSD value of the predicted complex conformation versus its native one is reduced by a factor of 0.5 Å. Finally, we were able to increase the Pearson correlation of the predicted binding affinity in comparison with the experimental value up to 0.5.


Subject(s)
Drug Design , Proteins/antagonists & inhibitors , Proteins/metabolism , Software , Algorithms , Databases, Protein , Ligands , Protein Binding
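
Note on entry 12: the abstract evaluates docked poses by RMSD and combines the output of several docking programs. The sketch below shows the RMSD formula for two poses with identical atom ordering in the same receptor frame, plus one simple consensus heuristic (choose the pose closest on average to all the others). The heuristic is an assumption for illustration, not the scoring scheme of the paper, and the program labels and coordinates are invented.

```python
import math

def rmsd(pose_a, pose_b):
    """Root-mean-square deviation between two poses with matching atom order."""
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    total = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b))
    return math.sqrt(total / len(pose_a))

def consensus_pose(poses):
    """Return the label of the pose with the smallest mean RMSD to all other poses."""
    if len(poses) < 2:
        raise ValueError("need at least two poses to form a consensus")

    def mean_rmsd(label):
        others = [rmsd(poses[label], p) for other, p in poses.items() if other != label]
        return sum(others) / len(others)

    return min(poses, key=mean_rmsd)

# Invented two-atom ligand docked by three hypothetical programs (coordinates in Å).
poses = {
    "program_a": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "program_b": [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)],
    "program_c": [(5.0, 0.0, 0.0), (6.0, 0.0, 0.0)],
}

print(round(rmsd(poses["program_a"], poses["program_b"]), 3))  # 0.1
print(consensus_pose(poses))                                   # program_b
```

No superposition is performed before computing RMSD, which is appropriate only when all poses share the receptor's coordinate frame, as docking output normally does.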
13.
PLoS Biol ; 5(6): e152, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17550304

ABSTRACT

That closely related species often differ by chromosomal inversions was discovered by Sturtevant and Plunkett in 1926. Our knowledge of how these inversions originate is still very limited, although a prevailing view is that they are facilitated by ectopic recombination events between inverted repetitive sequences. The availability of genome sequences of related species now allows us to study in detail the mechanisms that generate interspecific inversions. We have analyzed the breakpoint regions of the 29 inversions that differentiate the chromosomes of Drosophila melanogaster and two closely related species, D. simulans and D. yakuba, and reconstructed the molecular events that underlie their origin. Experimental and computational analysis revealed that the breakpoint regions of 59% of the inversions (17/29) are associated with inverted duplications of genes or other nonrepetitive sequences. In only two cases do we find evidence for inverted repetitive sequences in inversion breakpoints. We propose that the presence of inverted duplications associated with inversion breakpoint regions is the result of staggered breaks, either isochromatid or chromatid, and that this, rather than ectopic exchange between inverted repetitive sequences, is the prevalent mechanism for the generation of inversions in the melanogaster species group. Outgroup analysis also revealed evidence for widespread breakpoint recycling. Lastly, we have found that expression domains in D. melanogaster may be disrupted in D. yakuba, bringing into question their potential adaptive significance.


Subject(s)
Biological Evolution , Chromosome Inversion , Drosophila/genetics , Genome, Insect , Animals , Chromosome Breakage , Gene Duplication , Molecular Sequence Data
14.
Nucleic Acids Res ; 36(Web Server issue): W303-7, 2008 Jul 01.
Article in English | MEDLINE | ID: mdl-18515349

ABSTRACT

The 'omics' revolution is causing a flurry of data that all needs to be annotated for it to become useful. Sequences of proteins of unknown function can be annotated with a putative function by comparing them with proteins of known function. This form of annotation is typically performed with BLAST or similar software. Structural genomics is nowadays also bringing us three dimensional structures of proteins with unknown function. We present here software that can be used when sequence comparisons fail to determine the function of a protein with known structure but unknown function. The software, called 3D-Fun, is implemented as a server that runs at several European institutes and is freely available for everybody at all these sites. The 3D-Fun servers accept protein coordinates in the standard PDB format and compare them with all known protein structures by 3D structural superposition using the 3D-Hit software. If structural hits are found with proteins with known function, these are listed together with their function and some vital comparison statistics. This is conceptually very similar in 3D to what BLAST does in 1D. Additionally, the superposition results are displayed using interactive graphics facilities. Currently, the 3D-Fun system only predicts enzyme function but an expanded version with Gene Ontology predictions will be available soon. The server can be accessed at http://3dfun.bioinfo.pl/ or at http://3dfun.cmbi.ru.nl/.


Subject(s)
Enzymes/chemistry , Software , Algorithms , Enzymes/metabolism , Internet , Models, Molecular , Structural Homology, Protein
15.
Comb Chem High Throughput Screen ; 10(3): 189-96, 2007 Mar.
Article in English | MEDLINE | ID: mdl-17346118

ABSTRACT

In many cases at the beginning of an HTS-campaign, some information about active molecules is already available. Often known active compounds (such as substrate analogues, natural products, inhibitors of a related protein or ligands published by a pharmaceutical company) are identified in low-throughput validation studies of the biochemical target. In this study we evaluate the effectiveness of a support vector machine applied for those compounds and used to classify a collection with unknown activity. This approach was aimed at reducing the number of compounds to be tested against the given target. Our method predicts the biological activity of chemical compounds based on only the atom pairs (AP) two dimensional topological descriptors. The supervised support vector machine (SVM) method herein is trained on compounds from the MDL drug data report (MDDR) known to be active for specific protein target. For detailed analysis, five different biological targets were selected including cyclooxygenase-2, dihydrofolate reductase, thrombin, HIV-reverse transcriptase and antagonists of the estrogen receptor. The accuracy of compound identification was estimated using the recall and precision values. The sensitivities for all protein targets exceeded 80% and the classification performance reached 100% for selected targets. In another application of the method, we addressed the absence of an initial set of active compounds for a selected protein target at the beginning of an HTS-campaign. In such a case, virtual high-throughput screening (vHTS) is usually applied by using a flexible docking procedure. However, the vHTS experiment typically contains a large percentage of false positives that should be verified by costly and time-consuming experimental follow-up assays. 
The subsequent use of our machine learning method was found to improve the speed (since the docking procedure was not required for all compounds from the database) and also the accuracy of the HTS hit lists (the enrichment factor).


Subject(s)
Artificial Intelligence , Drug Delivery Systems/methods , Drug Evaluation, Preclinical/methods , Enzyme Inhibitors , Humans , Protein Binding , Quantitative Structure-Activity Relationship , Receptors, Cell Surface/antagonists & inhibitors
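
Note on entry 15: the abstract reports classifier quality as recall and precision and mentions the enrichment factor of hit lists. The sketch below computes all three from binary predictions. The function names and the ten-compound toy library are invented; the enrichment factor is taken here as the hit rate among selected compounds divided by the hit rate in the whole library, which is a common definition but an assumption about what the authors used.

```python
def confusion_counts(predicted, actual):
    """Return (true positives, false positives, false negatives) for binary labels."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    return tp, fp, fn

def precision_recall(predicted, actual):
    """Precision = TP / (TP + FP); recall (sensitivity) = TP / (TP + FN)."""
    tp, fp, fn = confusion_counts(predicted, actual)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def enrichment_factor(predicted, actual):
    """Hit rate among selected compounds divided by the hit rate in the full library."""
    tp, fp, _ = confusion_counts(predicted, actual)
    selected, actives = tp + fp, sum(1 for a in actual if a)
    if selected == 0 or actives == 0:
        raise ValueError("need at least one selected compound and one active")
    return (tp / selected) / (actives / len(actual))

# Invented library of 10 compounds: 3 are truly active, 3 are selected by the model.
actual    = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

precision, recall = precision_recall(predicted, actual)
print(round(precision, 3), round(recall, 3))           # 0.667 0.667
print(round(enrichment_factor(predicted, actual), 3))  # 2.222
```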
16.
BMC Bioinformatics ; 7: 53, 2006 Feb 06.
Article in English | MEDLINE | ID: mdl-16460560

ABSTRACT

BACKGROUND: The number of protein structures from structural genomics centers in the Protein Data Bank (PDB) is increasing dramatically. Many of these structures are functionally unannotated because they have no sequence similarity to proteins of known function. However, it is possible to successfully infer function using only structural similarity. RESULTS: Here we present the PDB-UF database, a web-accessible collection of predictions of enzymatic properties based on structure-function relationships. The assignments were made for three-dimensional protein structures of unknown function that come from structural genomics initiatives. We show that 4 hypothetical proteins (with PDB accession codes: 1VH0, 1NS5, 1O6D, and 1TO0), for which standard BLAST tools such as PSI-BLAST or RPS-BLAST failed to assign any function, are probably methyltransferase enzymes. CONCLUSION: We suggest that the structure-based prediction of an EC number should use a different similarity score cutoff for different protein folds. Moreover, performing the annotation using two different algorithms can reduce the rate of false positive assignments. We believe that the presented web-based repository will help to decrease the number of protein structures that have functions marked as "unknown" in the PDB file. AVAILABILITY: http://paradox.harvard.edu/PDB-UF and http://bioinfo.pl/PDB-UF.


Subject(s)
Chromosome Mapping/methods , Database Management Systems , Databases, Protein , Enzymes/chemistry , Enzymes/metabolism , Sequence Alignment/methods , Sequence Analysis, Protein/methods , User-Computer Interface , Computer Simulation , Enzyme Activation , Enzymes/analysis , Enzymes/classification , Enzymes/ultrastructure , Information Storage and Retrieval/methods , Models, Chemical , Models, Molecular , Structure-Activity Relationship
17.
Nucleic Acids Res ; 32(Web Server issue): W576-81, 2004 Jul 01.
Article in English | MEDLINE | ID: mdl-15215454

ABSTRACT

Meta-BASIC (http://basic.bioinfo.pl) is a novel sensitive approach for recognition of distant similarity between proteins based on consensus alignments of meta profiles. Specifically, Meta-BASIC compares sequence profiles combined with predicted secondary structure by utilizing several scoring systems and alignment algorithms. In our benchmarking tests, Meta-BASIC outperforms many individual servers, including fold recognition servers, and it can compete with meta predictors that base their strength on the structural comparison of models. In addition, Meta-BASIC, which enables detection of very distant relationships even if the tertiary structure for the reference protein is not known, has a high-throughput capability. This new method is applied to 860 PfamA protein families with unknown function (DUF) and provides many novel structure-functional assignments available on-line at http://basic.bioinfo.pl/duf.pl. Detailed discussion is provided for two of the most interesting assignments. DUF271 and DUF431 are predicted to be a nucleotide-diphospho-sugar transferase and an alpha/beta-knot SAM-dependent RNA methyltransferase, respectively.


Subject(s)
Software , Structural Homology, Protein , Algorithms , Glycosyltransferases/chemistry , Glycosyltransferases/classification , Internet , Models, Molecular , Protein Structure, Secondary , Proteins/chemistry , Proteins/classification , Proteins/physiology , Sequence Alignment , Sequence Analysis, Protein , Sequence Homology, Amino Acid , tRNA Methyltransferases/chemistry , tRNA Methyltransferases/classification
18.
Nucleic Acids Res ; 31(13): 3804-7, 2003 Jul 01.
Article in English | MEDLINE | ID: mdl-12824423

ABSTRACT

ORFeus is a fully automated, sensitive protein sequence similarity search server available to the academic community via the Structure Prediction Meta Server (http://BioInfo.PL/Meta/). The goal of the development of ORFeus was to increase the sensitivity of the detection of distantly related protein families. Predicted secondary structure information was added to the information about sequence conservation and variability, a technique known from hybrid threading approaches. The accuracy of the meta profiles created this way is compared with profiles containing only sequence information and with the standard approach of aligning a single sequence with a profile. Additionally, the alignment of meta profiles is more sensitive in detecting remote homology between protein families than aligning two sequence-only profiles or aligning a profile with a sequence. The specificity of the alignment score is improved in the lower specificity range compared with the robust sequence-only profiles.
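
The idea of a meta profile, in which each alignment column carries both sequence and predicted secondary-structure information, can be sketched in Python as follows. This is only an illustration under stated assumptions, not the ORFeus scoring function: the dot-product scoring, the `ss_weight` parameter, and the column layout (`"aa"` for amino-acid frequencies, `"ss"` for helix/strand/coil probabilities) are hypothetical.

```python
def column_score(col_a: dict, col_b: dict, ss_weight: float = 0.5) -> float:
    """Score two meta-profile columns.

    Each column is a dict with:
      "aa": amino-acid frequencies (same ordering in both columns)
      "ss": predicted probabilities for helix, strand and coil
    The score is the sequence term plus a weighted secondary-structure term.
    """
    seq_term = sum(x * y for x, y in zip(col_a["aa"], col_b["aa"]))
    ss_term = sum(x * y for x, y in zip(col_a["ss"], col_b["ss"]))
    return seq_term + ss_weight * ss_term


# Toy example with a two-letter alphabet; real profiles have 20 amino acids.
helix_col = {"aa": [0.5, 0.5], "ss": [1.0, 0.0, 0.0]}
strand_col = {"aa": [0.5, 0.5], "ss": [0.0, 1.0, 0.0]}
same = column_score(helix_col, helix_col)
different = column_score(helix_col, strand_col)
```

With identical sequence frequencies, the two columns score higher against each other when their predicted secondary structure also matches, which is how the extra term can separate remote homologues from chance sequence similarity. Setting `ss_weight` to 0 reduces the score to a sequence-only profile comparison.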


Subject(s)
Protein Structure, Secondary , Sequence Homology, Amino Acid , Software , Algorithms , Internet , Sequence Alignment
19.
Nat Commun ; 6: 6509, 2015 Mar 05.
Article in English | MEDLINE | ID: mdl-25739651

ABSTRACT

Genome clustering of homeobox genes is often thought to reflect arrangements of tandem gene duplicates maintained by advantageous coordinated gene regulation. Here we analyse the chromosomal organization of the NK homeobox genes, presumed to be part of a single cluster in the Bilaterian ancestor, across 20 arthropods. We find that the ProtoNK cluster was extensively fragmented in some lineages, showing that NK clustering in Drosophila species does not reflect selectively maintained gene arrangements. More importantly, the arrangement of NK and neighbouring genes across the phylogeny supports that, in two instances within the Drosophila genus, some cluster remnants became reunited via large-scale chromosomal rearrangements. Simulated scenarios of chromosome evolution indicate that these reunion events are unlikely unless the genome neighbourhoods harbouring the participating genes tend to colocalize in the nucleus. Our results underscore how mechanisms other than tandem gene duplication can result in paralogous gene clustering during genome evolution.


Subject(s)
Drosophila/genetics , Evolution, Molecular , Gene Expression Regulation/genetics , Genes, Homeobox/genetics , Multigene Family/genetics , Translocation, Genetic/physiology , Amino Acid Sequence , Animals , Arthropods/genetics , Chromosome Mapping , Computational Biology , Gene Duplication/genetics , In Situ Hybridization , Likelihood Functions , Models, Genetic , Molecular Sequence Annotation , Molecular Sequence Data , Phylogeny , Species Specificity , Translocation, Genetic/genetics
20.
Proteins ; 53 Suppl 6: 418-23, 2003.
Article in English | MEDLINE | ID: mdl-14579330

ABSTRACT

In CASP5, the BioInfo.PL group used the structure prediction Meta Server and the associated newly developed flexible meta-predictor, called 3D-Jury, as the main structure prediction tools. The most important feature of the meta-predictor is a high (86%) correlation between the reported confidence score and the quality of the selected model. The Gene Relational Database (GRDB) was used to confirm the fold recognition results by selecting distant homologues and subsequent structure prediction with the Meta Server. A fragment-splicing procedure was performed as a final processing step with large fragments extracted from selected models using model quality control provided by Verify3D. The comparison of submitted models with the native structure conducted after the CASP meeting showed that the GRDB-supported structure prediction led to a satisfactory template fold selection, whereas the fragment-splicing procedure must be improved in the future.


Subject(s)
Computational Biology/methods , Protein Folding , Proteins/chemistry , Amino Acid Sequence , Binding Sites/genetics , Molecular Sequence Data , Protein Conformation , Protein Structure, Tertiary , Proteins/genetics , Sequence Alignment , Sequence Homology, Amino Acid