1.
Cell ; 170(1): 199-212.e20, 2017 Jun 29.
Article in English | MEDLINE | ID: mdl-28666119

ABSTRACT

Type 2 diabetes (T2D) affects Latinos at twice the rate seen in populations of European descent. We recently identified a risk haplotype spanning SLC16A11 that explains ∼20% of the increased T2D prevalence in Mexico. Here, through genetic fine-mapping, we define a set of tightly linked variants likely to contain the causal allele(s). We show that variants on the T2D-associated haplotype have two distinct effects: (1) decreasing SLC16A11 expression in liver and (2) disrupting a key interaction with basigin, thereby reducing cell-surface localization. Both independent mechanisms reduce SLC16A11 function and suggest SLC16A11 is the causal gene at this locus. To gain insight into how SLC16A11 disruption impacts T2D risk, we demonstrate that SLC16A11 is a proton-coupled monocarboxylate transporter and that genetic perturbation of SLC16A11 induces changes in fatty acid and lipid metabolism that are associated with increased T2D risk. Our findings suggest that increasing SLC16A11 function could be therapeutically beneficial for T2D.


Subjects
Diabetes Mellitus, Type 2/metabolism; Monocarboxylic Acid Transporters/genetics; Monocarboxylic Acid Transporters/metabolism; Basigin/metabolism; Cell Membrane/metabolism; Chromosomes, Human, Pair 17/metabolism; Gene Knockdown Techniques; Haplotypes; Hepatocytes/metabolism; Heterozygote; Histone Code; Humans; Liver/metabolism; Models, Molecular; Monocarboxylic Acid Transporters/chemistry
2.
Diabetologia ; 66(3): 495-507, 2023 03.
Article in English | MEDLINE | ID: mdl-36538063

ABSTRACT

AIMS/HYPOTHESIS: Type 2 diabetes is highly polygenic and influenced by multiple biological pathways. Rapid expansion in the number of type 2 diabetes loci can be leveraged to identify such pathways. METHODS: We developed a high-throughput pipeline to enable clustering of type 2 diabetes loci based on variant-trait associations. Our pipeline extracted summary statistics from genome-wide association studies (GWAS) for type 2 diabetes and related traits to generate a matrix of 323 variants × 64 trait associations and applied Bayesian non-negative matrix factorisation (bNMF) to identify genetic components of type 2 diabetes. Epigenomic enrichment analysis was performed in 28 cell types and single pancreatic cells. We generated cluster-specific polygenic scores and performed regression analysis in an independent cohort (N=25,419) to assess for clinical relevance. RESULTS: We identified ten clusters of genetic loci, recapturing the five from our prior analysis as well as novel clusters related to beta cell dysfunction, pronounced insulin secretion, and levels of alkaline phosphatase, lipoprotein A and sex hormone-binding globulin. Four clusters related to mechanisms of insulin deficiency, five to insulin resistance and one had an unclear mechanism. The clusters displayed tissue-specific epigenomic enrichment, notably with the two beta cell clusters differentially enriched in functional and stressed pancreatic beta cell states. Additionally, cluster-specific polygenic scores were differentially associated with patient clinical characteristics and outcomes. The pipeline was applied to coronary artery disease and chronic kidney disease, identifying multiple overlapping clusters with type 2 diabetes. CONCLUSIONS/INTERPRETATION: Our approach stratifies type 2 diabetes loci into physiologically interpretable genetic clusters associated with distinct tissues and clinical outcomes. 
The pipeline allows for efficient updating as additional GWAS become available and can be readily applied to other conditions, facilitating clinical translation of GWAS findings. Software to perform this clustering pipeline is freely available.


Subjects
Diabetes Mellitus, Type 2; Humans; Diabetes Mellitus, Type 2/genetics; Genome-Wide Association Study; Genetic Predisposition to Disease/genetics; Bayes Theorem; Cluster Analysis; Polymorphism, Single Nucleotide
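
The clustering step in the abstract above factorises a variant × trait association matrix into non-negative components. The sketch below is not the authors' Bayesian NMF (which also infers the number of clusters). It is plain non-negative matrix factorisation with Lee-Seung multiplicative updates, with the number of components `k` fixed by hand, and it assumes the input has already been made non-negative. The function names and the toy 4 × 4 matrix are illustrative assumptions, not data from the paper.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (n x m) into W (n x k) and H (k x m).

    Uses Lee-Seung multiplicative updates; plain NMF, not Bayesian NMF.
    Returns W, H and the reconstruction error before and after each iteration.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    errors = [frobenius_error(v, w, h)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
        errors.append(frobenius_error(v, w, h))
    return w, h, errors


# Toy matrix (hypothetical): rows are variants, columns are traits.
toy = [
    [1.0, 0.9, 0.0, 0.0],
    [0.8, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.7],
    [0.0, 0.0, 0.9, 1.0],
]
w, h, errors = nmf(toy, k=2)
for variant, weights in enumerate(w):
    # The component with the largest weight is read as the variant's cluster.
    print(variant, weights.index(max(weights)))
```

Each row of `W` gives a variant's weight in each component, and each row of `H` gives the trait profile of that component. Assigning a variant to its highest-weight component is one simple way to read clusters off the factorisation.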
3.
Mol Cell ; 60(6): 941-52, 2015 Dec 17.
Article in English | MEDLINE | ID: mdl-26698662

ABSTRACT

In insects, brain-derived Prothoracicotropic hormone (PTTH) activates the receptor tyrosine kinase (RTK) Torso to initiate metamorphosis through the release of ecdysone. We have determined the crystal structure of silkworm PTTH in complex with the ligand-binding region of Torso. Here we show that ligand-induced Torso dimerization results from the sequential and negatively cooperative formation of asymmetric heterotetramers. Mathematical modeling of receptor activation based upon our biophysical studies shows that ligand pulses are "buffered" at low receptor levels, leading to a sustained signal. By contrast, high levels of Torso develop the signal intensity and duration of a noncooperative system. We propose that this may allow Torso to coordinate widely different functions from a single ligand by tuning receptor levels. Phylogenic analysis indicates that Torso is found outside arthropods, including human parasitic roundworms. Together, our findings provide mechanistic insight into how this receptor system, with roles in embryonic and adult development, is regulated.


Subjects
Bombyx/metabolism; Insect Hormones/chemistry; Insect Hormones/metabolism; Receptor Protein-Tyrosine Kinases/chemistry; Receptor Protein-Tyrosine Kinases/metabolism; Animals; Binding Sites; Bombyx/chemistry; Crystallography, X-Ray; Gene Expression Regulation, Developmental; Humans; Insect Proteins/chemistry; Insect Proteins/metabolism; Models, Molecular; Phylogeny; Protein Multimerization; Receptors, Interleukin-17/chemistry; Signal Transduction
4.
Diabetologia ; 61(6): 1315-1324, 2018 06.
Article in English | MEDLINE | ID: mdl-29626220

ABSTRACT

AIMS/HYPOTHESIS: Identifying the metabolite profile of individuals with normal fasting glucose (NFG [<5.55 mmol/l]) who progressed to type 2 diabetes may give novel insights into early type 2 diabetes disease interception and detection. METHODS: We conducted a population-based prospective study among 1150 Framingham Heart Study Offspring cohort participants, age 40-65 years, with NFG. Plasma metabolites were profiled by LC-MS/MS. Penalised regression models were used to select measured metabolites for type 2 diabetes incidence classification (training dataset) and to internally validate the discriminatory capability of selected metabolites beyond conventional type 2 diabetes risk factors (testing dataset). RESULTS: Over a follow-up period of 20 years, 95 individuals with NFG developed type 2 diabetes. Nineteen metabolites were selected repeatedly in the training dataset for type 2 diabetes incidence classification and were found to improve type 2 diabetes risk prediction beyond conventional type 2 diabetes risk factors (AUC was 0.81 for risk factors vs 0.90 for risk factors + metabolites, p = 1.1 × 10⁻⁴). Using pathway enrichment analysis, the nitrogen metabolism pathway, which includes three prioritised metabolites (glycine, taurine and phenylalanine), was significantly enriched for association with type 2 diabetes risk at the false discovery rate of 5% (p = 0.047). In adjusted Cox proportional hazard models, the type 2 diabetes risk per 1 SD increase in glycine, taurine and phenylalanine was 0.65 (95% CI 0.54, 0.78), 0.73 (95% CI 0.59, 0.9) and 1.35 (95% CI 1.11, 1.65), respectively. Mendelian randomisation demonstrated a similar relationship for type 2 diabetes risk per 1 SD genetically increased glycine (OR 0.89 [95% CI 0.8, 0.99]) and phenylalanine (OR 1.6 [95% CI 1.08, 2.4]). CONCLUSIONS/INTERPRETATION: In individuals with NFG, information from a discrete set of 19 metabolites improved prediction of type 2 diabetes beyond conventional risk factors.
In addition, the nitrogen metabolism pathway and its components emerged as a potential effector of earliest stages of type 2 diabetes pathophysiology.


Subjects
Blood Glucose/metabolism; Diabetes Mellitus, Type 2/blood; Glycated Hemoglobin/metabolism; Metabolomics; Adult; Aged; Area Under Curve; Computational Biology; Diabetes Mellitus, Type 2/metabolism; Female; Glycine/metabolism; Humans; Incidence; Male; Mendelian Randomization Analysis; Middle Aged; Phenylalanine/metabolism; Prospective Studies; ROC Curve; Risk Factors; Tandem Mass Spectrometry; Taurine/metabolism
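
The abstract above reports discrimination as an AUC (0.81 vs 0.90). As a reminder of what that number measures, here is a minimal sketch of the AUC computed as the probability that a randomly chosen case receives a higher risk score than a randomly chosen non-case. The function name and the toy scores are illustrative assumptions and are unrelated to the study data.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (case, control) pairs in which the case has
    the higher score; tied pairs count as half a win.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted risks for people who did / did not develop disease.
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2, 0.1]
print(auc(cases, controls))  # 11 of the 12 pairs are ordered correctly
```

An AUC of 0.5 corresponds to scores that carry no information about outcome, and 1.0 to perfect separation of cases from non-cases.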
5.
PLoS Med ; 15(9): e1002654, 2018 09.
Article in English | MEDLINE | ID: mdl-30240442

ABSTRACT

BACKGROUND: Type 2 diabetes (T2D) is a heterogeneous disease for which (1) disease-causing pathways are incompletely understood and (2) subclassification may improve patient management. Unlike other biomarkers, germline genetic markers do not change with disease progression or treatment. In this paper, we test whether a germline genetic approach informed by physiology can be used to deconstruct T2D heterogeneity. First, we aimed to categorize genetic loci into groups representing likely disease mechanistic pathways. Second, we asked whether the novel clusters of genetic loci we identified have any broad clinical consequence, as assessed in four separate subsets of individuals with T2D. METHODS AND FINDINGS: In an effort to identify mechanistic pathways driven by established T2D genetic loci, we applied Bayesian nonnegative matrix factorization (bNMF) clustering to genome-wide association study (GWAS) results for 94 independent T2D genetic variants and 47 diabetes-related traits. We identified five robust clusters of T2D loci and traits, each with distinct tissue-specific enhancer enrichment based on analysis of epigenomic data from 28 cell types. Two clusters contained variant-trait associations indicative of reduced beta cell function, differing from each other by high versus low proinsulin levels. The three other clusters displayed features of insulin resistance: obesity mediated (high body mass index [BMI] and waist circumference [WC]), "lipodystrophy-like" fat distribution (low BMI, adiponectin, and high-density lipoprotein [HDL] cholesterol, and high triglycerides), and disrupted liver lipid metabolism (low triglycerides). Increased cluster genetic risk scores were associated with distinct clinical outcomes, including increased blood pressure, coronary artery disease (CAD), and stroke. 
We evaluated the potential for clinical impact of these clusters in four studies containing individuals with T2D (Metabolic Syndrome in Men Study [METSIM], N = 487; Ashkenazi, N = 509; Partners Biobank, N = 2,065; UK Biobank [UKBB], N = 14,813). Individuals with T2D in the top genetic risk score decile for each cluster reproducibly exhibited the predicted cluster-associated phenotypes, with approximately 30% of all individuals assigned to just one cluster top decile. Limitations of this study include that the genetic variants used in the cluster analysis were restricted to those associated with T2D in populations of European ancestry. CONCLUSION: Our approach identifies salient T2D genetically anchored and physiologically informed pathways, and supports the use of genetics to deconstruct T2D heterogeneity. Classification of patients by these genetic pathways may offer a step toward genetically informed T2D patient management.


Subjects
Diabetes Mellitus, Type 2/classification; Diabetes Mellitus, Type 2/genetics; Genetic Loci; Multigene Family; Algorithms; Bayes Theorem; Cluster Analysis; Cohort Studies; Cross-Sectional Studies; Databases, Genetic; Female; Founder Effect; Genetic Markers; Genetic Predisposition to Disease; Genome-Wide Association Study; Humans; Insulin/deficiency; Insulin/genetics; Insulin Resistance/genetics; Male; Phenotype; Prospective Studies; Risk Factors
6.
Mol Biol Evol ; 31(10): 2557-72, 2014 Oct.
Article in English | MEDLINE | ID: mdl-24951729

ABSTRACT

MicroRNAs (miRNAs) are endogenous RNA molecules that regulate gene expression posttranscriptionally. To date, the emergence of miRNAs and their patterns of sequence evolution have been analyzed in great detail. However, the extent to which miRNA expression levels have evolved over time, the role different evolutionary forces play in shaping these changes, and whether this variation in miRNA expression can reveal the interplay between miRNAs and mRNAs remain poorly understood. This is especially true for miRNAs expressed during key developmental transitions. Here, we assayed miRNA expression levels immediately before (≥18BPF [18 h before puparium formation]) and after (PF) the increase in the hormone ecdysone responsible for triggering metamorphosis. We did so in four strains of Drosophila melanogaster and two closely related species. In contrast to their sequence conservation, approximately 25% of miRNAs analyzed showed significant within-species variation in male expression levels at ≥18BPF and/or PF. Additionally, approximately 33% showed modifications in their pattern of expression bias between developmental timepoints. A separate analysis of the ≥18BPF and PF stages revealed that changes in miRNA abundance accumulate linearly over evolutionary time at PF but not at ≥18BPF. Importantly, ≥18BPF-enriched miRNAs showed the greatest variation in expression levels both within and between species, and so are the least likely to evolve under stabilizing selection. Functional attributes, such as expression ubiquity, appeared more tightly associated with lower levels of miRNA expression polymorphism at PF than at ≥18BPF. Furthermore, ≥18BPF- and PF-enriched miRNAs showed opposite patterns of covariation in expression with mRNAs, which denoted the type of regulatory relationship between miRNAs and mRNAs. Collectively, our results show contrasting patterns of functional divergence associated with miRNA expression levels during Drosophila ontogeny.


Subjects
Drosophila melanogaster/growth & development; Metamorphosis, Biological; MicroRNAs/genetics; Animals; Conserved Sequence; Drosophila melanogaster/classification; Drosophila melanogaster/genetics; Evolution, Molecular; Female; Gene Expression Profiling; Gene Expression Regulation, Developmental; Genetic Variation; Male; Molecular Sequence Data; Phylogeny; Sex Characteristics
8.
Genome Res ; 20(8): 1084-96, 2010 Aug.
Article in English | MEDLINE | ID: mdl-20601587

ABSTRACT

During evolution, gene repatterning across eukaryotic genomes is not uniform. Some genomic regions exhibit a gene organization conserved phylogenetically, while others are recurrently involved in chromosomal rearrangement, resulting in breakpoint reuse. Both gene order conservation and breakpoint reuse can result from the existence of functional constraints on where chromosomal breakpoints occur or from the existence of regions that are susceptible to breakage. The balance between these two mechanisms is still poorly understood. Drosophila species have very dynamic genomes and, therefore, can be very informative. We compared the gene organization of the main five chromosomal elements (Muller's elements A-E) of nine Drosophila species. Under a parsimonious evolutionary scenario, we estimate that 6116 breakpoints differentiate the gene orders of the species and that breakpoint reuse is associated with approximately 80% of the orthologous landmarks. The comparison of the observed patterns of change in gene organization with those predicted under different simulated modes of evolution shows that fragile regions alone can explain the observed key patterns of Muller's element A (X chromosome) more often than for any other Muller's element. High levels of fragility plus constraints operating on approximately 15% of the genome are sufficient to explain the observed patterns of change and conservation across species. The orthologous landmarks more likely to be under constraint exhibit both a remarkable internal functional heterogeneity and a lack of common functional themes with the exception of the presence of highly conserved noncoding elements. Fragile regions rather than functional constraints have been the main determinant of the evolution of the Drosophila chromosomes.


Subjects
Chromosome Fragile Sites/genetics; Drosophila/genetics; Gene Order; Genome, Insect; Animals; Base Sequence; Chromosome Breakpoints; Chromosome Inversion/genetics; Evolution, Molecular; Female; Gene Expression; Male; X Chromosome/genetics
9.
J Med Genet ; 49(12): 747-52, 2012 Dec.
Article in English | MEDLINE | ID: mdl-23118445

ABSTRACT

BACKGROUND: Musical abilities such as recognising music and singing performance serve as means for communication and are instruments in sexual selection. Specific regions of the brain have been found to be activated by musical stimuli, but these have rarely been extended to the discovery of genes and molecules associated with musical ability. METHODS: A total of 1008 individuals from 73 families were enrolled and a pitch-production accuracy test was applied to determine musical ability. To identify genetic loci and variants that contribute to musical ability, we conducted family-based linkage and association analyses, and incorporated the results with data from exome sequencing and array comparative genomic hybridisation analyses. RESULTS: We found significant evidence of linkage at 4q23 with the nearest marker D4S2986 (LOD=3.1), whose supporting interval overlaps a previous study in Finnish families, and identified an intergenic single nucleotide polymorphism (SNP) (rs1251078, p = 8.4 × 10⁻¹⁷) near UGT8, a gene highly expressed in the central nervous system and known to act in brain organisation. In addition, a non-synonymous SNP in UGT8 was revealed to be highly associated with musical ability (rs4148254, p = 8.0 × 10⁻¹⁷), and a 6.2 kb copy number loss near UGT8 showed a plausible association with musical ability (p = 2.9 × 10⁻⁶). CONCLUSIONS: This study provides new insight into the genetics of musical ability, exemplifying a methodology to assign functional significance to synonymous and non-coding alleles by integrating multiple experimental methods.


Subjects
Asian People/genetics; Ganglioside Galactosyltransferase/genetics; Music; Polymorphism, Single Nucleotide; Psychomotor Performance; Adolescent; Adult; Comparative Genomic Hybridization; Exome; Family; Female; Genetic Association Studies; Genetic Linkage; Humans; Male; Middle Aged; Mongolia; Young Adult
10.
Cell Metab ; 35(4): 695-710.e6, 2023 04 04.
Article in English | MEDLINE | ID: mdl-36963395

ABSTRACT

Associations between human genetic variation and clinical phenotypes have become a foundation of biomedical research. Most repositories of these data seek to be disease-agnostic and therefore lack disease-focused views. The Type 2 Diabetes Knowledge Portal (T2DKP) is a public resource of genetic datasets and genomic annotations dedicated to type 2 diabetes (T2D) and related traits. Here, we seek to make the T2DKP more accessible to prospective users and more useful to existing users. First, we evaluate the T2DKP's comprehensiveness by comparing its datasets with those of other repositories. Second, we describe how researchers unfamiliar with human genetic data can begin using and correctly interpreting them via the T2DKP. Third, we describe how existing users can extend their current workflows to use the full suite of tools offered by the T2DKP. We finally discuss the lessons offered by the T2DKP toward the goal of democratizing access to complex disease genetic results.


Subjects
Diabetes Mellitus, Type 2; Humans; Diabetes Mellitus, Type 2/genetics; Access to Information; Prospective Studies; Genomics/methods; Phenotype
11.
Dev Cell ; 57(3): 387-397.e4, 2022 02 07.
Article in English | MEDLINE | ID: mdl-35134345

ABSTRACT

Lipid droplets (LDs) are organelles of cellular lipid storage with fundamental roles in energy metabolism and cell membrane homeostasis. There has been an explosion of research into the biology of LDs, in part due to their relevance in diseases of lipid storage, such as atherosclerosis, obesity, type 2 diabetes, and hepatic steatosis. Consequently, there is an increasing need for a resource that combines datasets from systematic analyses of LD biology. Here, we integrate high-confidence, systematically generated human, mouse, and fly data from studies on LDs in the framework of an online platform named the "Lipid Droplet Knowledge Portal" (https://lipiddroplet.org/). This scalable and interactive portal includes comprehensive datasets, across a variety of cell types, for LD biology, including transcriptional profiles of induced lipid storage, organellar proteomics, genome-wide screen phenotypes, and ties to human genetics. This resource is a powerful platform that can be utilized to identify determinants of lipid storage.


Subjects
Databases as Topic; Lipid Droplets/metabolism; Animals; Cholesterol Esters/metabolism; Data Mining; Genome; Humans; Inflammation/pathology; Lipid Metabolism; Liver/metabolism; Male; Mice, Inbred C57BL; Phenotype; Phosphorylation; RNA Interference
12.
J Comput Chem ; 32(4): 568-81, 2011 Mar.
Article in English | MEDLINE | ID: mdl-20812324

ABSTRACT

Molecular recognition plays a fundamental role in all biological processes, which is why great efforts have been made to understand and predict protein-ligand interactions. Finding a molecule that can potentially bind to a target protein is particularly essential in drug discovery and remains an expensive and time-consuming task. In silico tools are frequently used to screen molecular libraries to identify new lead compounds, and if the protein structure is known, various protein-ligand docking programs can be used. The aim of a docking procedure is to predict the correct poses of a ligand in the binding site of the protein and to score them according to the strength of interaction within a reasonable time frame. The purpose of our studies was to present a novel consensus approach to predict both the protein-ligand complex structure and its corresponding binding affinity. Our method used as input the results from seven docking programs (Surflex, LigandFit, Glide, GOLD, FlexX, eHiTS, and AutoDock) that are widely used for docking of ligands. We evaluated it on an extensive benchmark dataset of 1300 protein-ligand pairs from the refined PDBbind database for which structural and affinity data were available. We independently compared its scoring and posing ability with those of previously proposed methods. In most cases, our method correctly docks approximately 20% more pairs than the docking methods do on average, and over 10% more pairs than the best single program. The RMSD of the predicted complex conformation relative to its native one is reduced by 0.5 Å. Finally, we were able to increase the Pearson correlation between the predicted binding affinity and the experimental value up to 0.5.


Subjects
Drug Design; Proteins/antagonists & inhibitors; Proteins/metabolism; Software; Algorithms; Databases, Protein; Ligands; Protein Binding
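
The docking abstract above judges pose quality by the RMSD between a predicted ligand conformation and the native one. A minimal sketch of that metric follows, assuming both poses list the same atoms in the same order and are already in the receptor's coordinate frame, so no superposition is applied. The function name and the toy coordinates are illustrative assumptions, not taken from the paper or from PDBbind.

```python
import math


def rmsd(predicted, native):
    """Root-mean-square deviation (same units as the input, e.g. angstroms).

    Both arguments are sequences of (x, y, z) tuples with atoms in
    matching order; no fitting or symmetry correction is performed.
    """
    if len(predicted) != len(native) or not predicted:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    total = 0.0
    for (xp, yp, zp), (xn, yn, zn) in zip(predicted, native):
        total += (xp - xn) ** 2 + (yp - yn) ** 2 + (zp - zn) ** 2
    return math.sqrt(total / len(predicted))


# Hypothetical three-atom ligand; the predicted pose is the native pose
# translated by (0.3, 0.4, 0.0), so every atom is displaced by 0.5.
native_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
predicted_pose = [(x + 0.3, y + 0.4, z) for x, y, z in native_pose]
print(round(rmsd(predicted_pose, native_pose), 3))
```

Docking benchmarks commonly count a pose as correct when its RMSD to the crystal pose falls below a fixed cutoff. The cutoff used in the study is not stated in the abstract.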
13.
PLoS Biol ; 5(6): e152, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17550304

ABSTRACT

That closely related species often differ by chromosomal inversions was discovered by Sturtevant and Plunkett in 1926. Our knowledge of how these inversions originate is still very limited, although a prevailing view is that they are facilitated by ectopic recombination events between inverted repetitive sequences. The availability of genome sequences of related species now allows us to study in detail the mechanisms that generate interspecific inversions. We have analyzed the breakpoint regions of the 29 inversions that differentiate the chromosomes of Drosophila melanogaster and two closely related species, D. simulans and D. yakuba, and reconstructed the molecular events that underlie their origin. Experimental and computational analysis revealed that the breakpoint regions of 59% of the inversions (17/29) are associated with inverted duplications of genes or other nonrepetitive sequences. In only two cases do we find evidence for inverted repetitive sequences in inversion breakpoints. We propose that the presence of inverted duplications associated with inversion breakpoint regions is the result of staggered breaks, either isochromatid or chromatid, and that this, rather than ectopic exchange between inverted repetitive sequences, is the prevalent mechanism for the generation of inversions in the melanogaster species group. Outgroup analysis also revealed evidence for widespread breakpoint recycling. Lastly, we have found that expression domains in D. melanogaster may be disrupted in D. yakuba, bringing into question their potential adaptive significance.


Subjects
Biological Evolution; Chromosome Inversion; Drosophila/genetics; Genome, Insect; Animals; Chromosome Breakage; Gene Duplication; Molecular Sequence Data
14.
Nucleic Acids Res ; 36(Web Server issue): W303-7, 2008 Jul 01.
Article in English | MEDLINE | ID: mdl-18515349

ABSTRACT

The 'omics' revolution is causing a flurry of data that all needs to be annotated for it to become useful. Sequences of proteins of unknown function can be annotated with a putative function by comparing them with proteins of known function. This form of annotation is typically performed with BLAST or similar software. Structural genomics is nowadays also bringing us three dimensional structures of proteins with unknown function. We present here software that can be used when sequence comparisons fail to determine the function of a protein with known structure but unknown function. The software, called 3D-Fun, is implemented as a server that runs at several European institutes and is freely available for everybody at all these sites. The 3D-Fun servers accept protein coordinates in the standard PDB format and compare them with all known protein structures by 3D structural superposition using the 3D-Hit software. If structural hits are found with proteins with known function, these are listed together with their function and some vital comparison statistics. This is conceptually very similar in 3D to what BLAST does in 1D. Additionally, the superposition results are displayed using interactive graphics facilities. Currently, the 3D-Fun system only predicts enzyme function but an expanded version with Gene Ontology predictions will be available soon. The server can be accessed at http://3dfun.bioinfo.pl/ or at http://3dfun.cmbi.ru.nl/.


Subjects
Enzymes/chemistry; Software; Algorithms; Enzymes/metabolism; Internet; Models, Molecular; Structural Homology, Protein
15.
Comb Chem High Throughput Screen ; 10(3): 189-96, 2007 Mar.
Article in English | MEDLINE | ID: mdl-17346118

ABSTRACT

In many cases at the beginning of an HTS campaign, some information about active molecules is already available. Often known active compounds (such as substrate analogues, natural products, inhibitors of a related protein or ligands published by a pharmaceutical company) are identified in low-throughput validation studies of the biochemical target. In this study we evaluate the effectiveness of a support vector machine applied to those compounds and used to classify a collection with unknown activity. This approach was aimed at reducing the number of compounds to be tested against the given target. Our method predicts the biological activity of chemical compounds based only on atom pair (AP) two-dimensional topological descriptors. The supervised support vector machine (SVM) method herein is trained on compounds from the MDL drug data report (MDDR) known to be active for a specific protein target. For detailed analysis, five different biological targets were selected including cyclooxygenase-2, dihydrofolate reductase, thrombin, HIV-reverse transcriptase and antagonists of the estrogen receptor. The accuracy of compound identification was estimated using the recall and precision values. The sensitivities for all protein targets exceeded 80% and the classification performance reached 100% for selected targets. In another application of the method, we addressed the absence of an initial set of active compounds for a selected protein target at the beginning of an HTS campaign. In such a case, virtual high-throughput screening (vHTS) is usually applied by using a flexible docking procedure. However, the vHTS experiment typically contains a large percentage of false positives that should be verified by costly and time-consuming experimental follow-up assays.
The subsequent use of our machine learning method was found to improve the speed (since the docking procedure was not required for all compounds from the database) and also the accuracy of the HTS hit lists (the enrichment factor).


Subjects
Artificial Intelligence; Drug Delivery Systems/methods; Drug Evaluation, Preclinical/methods; Enzyme Inhibitors; Humans; Protein Binding; Quantitative Structure-Activity Relationship; Receptors, Cell Surface/antagonists & inhibitors
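
The classifier in the abstract above takes atom pair (AP) topological descriptors as its only input. The sketch below shows the general idea in simplified form: every pair of atoms in a molecular graph is recorded as (atom type, topological distance, atom type) and the occurrences are counted. It rests on several assumptions not drawn from the paper: atom types are reduced to element symbols, distance is counted in bonds along the shortest path, hydrogens are omitted, and the function names are hypothetical. The SVM training step is not shown.

```python
from collections import Counter, deque


def bond_distances(adjacency, start):
    """Shortest path length in bonds from `start` to every reachable atom (BFS)."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbour in adjacency[atom]:
            if neighbour not in distance:
                distance[neighbour] = distance[atom] + 1
                queue.append(neighbour)
    return distance


def atom_pair_counts(elements, bonds):
    """Count simplified atom-pair descriptors for one molecule.

    elements: list of element symbols, indexed by atom number.
    bonds: list of (i, j) index pairs for bonded atoms.
    Returns a Counter keyed by (element_a, distance_in_bonds, element_b),
    with the two elements sorted so that each pair has a single key.
    """
    adjacency = {i: [] for i in range(len(elements))}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    counts = Counter()
    for i in range(len(elements)):
        for j, d in bond_distances(adjacency, i).items():
            if j > i:  # visit each unordered pair once
                first, second = sorted((elements[i], elements[j]))
                counts[(first, d, second)] += 1
    return counts


# Heavy atoms of ethanol as a toy example: C0 - C1 - O2.
print(atom_pair_counts(["C", "C", "O"], [(0, 1), (1, 2)]))
```

The resulting counts can be laid out as a fixed-length vector over all observed keys, which is the kind of input a support vector machine expects.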
16.
BMC Bioinformatics ; 7: 53, 2006 Feb 06.
Article in English | MEDLINE | ID: mdl-16460560

ABSTRACT

BACKGROUND: The number of protein structures from structural genomics centers in the Protein Data Bank (PDB) is increasing dramatically. Many of these structures are functionally unannotated because they have no sequence similarity to proteins of known function. However, it is possible to successfully infer function using only structural similarity. RESULTS: Here we present the PDB-UF database, a web-accessible collection of predictions of enzymatic properties using structure-function relationships. The assignments were conducted for three-dimensional protein structures of unknown function that come from structural genomics initiatives. We show that 4 hypothetical proteins (with PDB accession codes: 1VH0, 1NS5, 1O6D, and 1TO0), for which standard BLAST tools such as PSI-BLAST or RPS-BLAST failed to assign any function, are probably methyltransferase enzymes. CONCLUSION: We suggest that the structure-based prediction of an EC number should be conducted using a different similarity score cutoff for different protein folds. Moreover, performing the annotation using two different algorithms can reduce the rate of false positive assignments. We believe that the presented web-based repository will help to decrease the number of protein structures that have functions marked as "unknown" in the PDB file. AVAILABILITY: http://paradox.harvard.edu/PDB-UF and http://bioinfo.pl/PDB-UF.


Subjects
Chromosome Mapping/methods; Database Management Systems; Databases, Protein; Enzymes/chemistry; Enzymes/metabolism; Sequence Alignment/methods; Sequence Analysis, Protein/methods; User-Computer Interface; Computer Simulation; Enzyme Activation; Enzymes/analysis; Enzymes/classification; Enzymes/ultrastructure; Information Storage and Retrieval/methods; Models, Chemical; Models, Molecular; Structure-Activity Relationship
17.
Nucleic Acids Res ; 32(Web Server issue): W576-81, 2004 Jul 01.
Article in English | MEDLINE | ID: mdl-15215454

ABSTRACT

Meta-BASIC (http://basic.bioinfo.pl) is a novel sensitive approach for recognition of distant similarity between proteins based on consensus alignments of meta profiles. Specifically, Meta-BASIC compares sequence profiles combined with predicted secondary structure by utilizing several scoring systems and alignment algorithms. In our benchmarking tests, Meta-BASIC outperforms many individual servers, including fold recognition servers, and it can compete with meta predictors that base their strength on the structural comparison of models. In addition, Meta-BASIC, which enables detection of very distant relationships even if the tertiary structure for the reference protein is not known, has a high-throughput capability. This new method is applied to 860 PfamA protein families with unknown function (DUF) and provides many novel structure-functional assignments available on-line at http://basic.bioinfo.pl/duf.pl. Detailed discussion is provided for two of the most interesting assignments. DUF271 and DUF431 are predicted to be a nucleotide-diphospho-sugar transferase and an alpha/beta-knot SAM-dependent RNA methyltransferase, respectively.


Subjects
Software , Structural Homology, Protein , Algorithms , Glycosyltransferases/chemistry , Glycosyltransferases/classification , Internet , Models, Molecular , Protein Structure, Secondary , Proteins/chemistry , Proteins/classification , Proteins/physiology , Sequence Alignment , Sequence Analysis, Protein , Sequence Homology, Amino Acid , tRNA Methyltransferases/chemistry , tRNA Methyltransferases/classification
18.
Nucleic Acids Res ; 31(13): 3804-7, 2003 Jul 01.
Article in English | MEDLINE | ID: mdl-12824423

ABSTRACT

ORFeus is a fully automated, sensitive protein sequence similarity search server available to the academic community via the Structure Prediction Meta Server (http://BioInfo.PL/Meta/). ORFeus was developed to increase the sensitivity of detecting distantly related protein families. Predicted secondary structure information was added to the information about sequence conservation and variability, a technique known from hybrid threading approaches. The accuracy of the meta profiles created this way is compared with that of profiles containing only sequence information and with the standard approach of aligning a single sequence with a profile. Additionally, aligning meta profiles is more sensitive in detecting remote homology between protein families than aligning two sequence-only profiles or aligning a profile with a sequence. The specificity of the alignment score is improved in the lower specificity range compared with the robust sequence-only profiles.
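
As a rough illustration of the meta-profile idea, the Python sketch below scores two profile columns by combining a sequence term with a predicted secondary structure term, then computes a local alignment score by dynamic programming. It is not the ORFeus scoring scheme: the dot-product score, the `ss_weight`, `shift`, and `gap` parameters, and all function names are assumptions made for this example.

```python
from typing import List, Sequence, Tuple

# One meta-profile column: (amino-acid frequencies, secondary-structure probabilities).
Column = Tuple[Sequence[float], Sequence[float]]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def column_score(x: Column, y: Column, ss_weight: float = 0.5, shift: float = 0.3) -> float:
    """Sequence term plus weighted secondary-structure term, shifted so that
    unrelated columns score below zero (all parameters are assumed values)."""
    return dot(x[0], y[0]) + ss_weight * dot(x[1], y[1]) - shift


def local_alignment_score(p: List[Column], q: List[Column], gap: float = 0.5) -> float:
    """Smith-Waterman-style best local score with a linear gap penalty."""
    best = 0.0
    prev = [0.0] * (len(q) + 1)
    for x in p:
        curr = [0.0]
        for j, y in enumerate(q, start=1):
            s = max(0.0,
                    prev[j - 1] + column_score(x, y),  # align the two columns
                    prev[j] - gap,                     # gap in q
                    curr[j - 1] - gap)                 # gap in p
            curr.append(s)
            best = max(best, s)
        prev = curr
    return best
```

Setting `ss_weight` to zero reduces the score to a sequence-only profile comparison, which makes it easy to compare the two variants on the same pair of profiles.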


Subjects
Protein Structure, Secondary , Sequence Homology, Amino Acid , Software , Algorithms , Internet , Sequence Alignment
19.
Nat Commun ; 6: 6509, 2015 Mar 05.
Article in English | MEDLINE | ID: mdl-25739651

ABSTRACT

Genome clustering of homeobox genes is often thought to reflect arrangements of tandem gene duplicates maintained by advantageous coordinated gene regulation. Here we analyse the chromosomal organization of the NK homeobox genes, presumed to be part of a single cluster in the Bilaterian ancestor, across 20 arthropods. We find that the ProtoNK cluster was extensively fragmented in some lineages, showing that NK clustering in Drosophila species does not reflect selectively maintained gene arrangements. More importantly, the arrangement of NK and neighbouring genes across the phylogeny supports that, in two instances within the Drosophila genus, some cluster remnants became reunited via large-scale chromosomal rearrangements. Simulated scenarios of chromosome evolution indicate that these reunion events are unlikely unless the genome neighbourhoods harbouring the participating genes tend to colocalize in the nucleus. Our results underscore how mechanisms other than tandem gene duplication can result in paralogous gene clustering during genome evolution.


Subjects
Drosophila/genetics , Evolution, Molecular , Gene Expression Regulation/genetics , Genes, Homeobox/genetics , Multigene Family/genetics , Translocation, Genetic/physiology , Amino Acid Sequence , Animals , Arthropods/genetics , Chromosome Mapping , Computational Biology , Gene Duplication/genetics , In Situ Hybridization , Likelihood Functions , Models, Genetic , Molecular Sequence Annotation , Molecular Sequence Data , Phylogeny , Species Specificity , Translocation, Genetic/genetics
20.
Proteins ; 53 Suppl 6: 418-23, 2003.
Article in English | MEDLINE | ID: mdl-14579330

ABSTRACT

In CASP5, the BioInfo.PL group used the structure prediction Meta Server and the associated, newly developed flexible meta-predictor, called 3D-Jury, as its main structure prediction tools. The most important feature of the meta-predictor is a high (86%) correlation between the reported confidence score and the quality of the selected model. The Gene Relational Database (GRDB) was used to confirm the fold recognition results by selecting distant homologues and subsequently predicting their structure with the Meta Server. As a final processing step, a fragment-splicing procedure was performed with large fragments extracted from selected models, using model quality control provided by Verify3D. A comparison of the submitted models with the native structures, conducted after the CASP meeting, showed that the GRDB-supported structure prediction led to satisfactory template fold selection, whereas the fragment-splicing procedure needs further improvement.
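
The general idea of a consensus meta-predictor can be sketched in Python: each candidate model is scored by how similar it is to the other models, and the model with the highest consensus score is selected. This is only a loose illustration and not the published 3D-Jury procedure: the precomputed pairwise similarity matrix, the `threshold` value, the averaging over the other models, and the function names are all assumptions.

```python
from typing import List


def consensus_scores(sim: List[List[float]], threshold: float = 40.0) -> List[float]:
    """Score each model by its average similarity to the other models,
    ignoring pairs whose similarity does not exceed the (assumed) threshold."""
    n = len(sim)
    scores = []
    for i in range(n):
        total = sum(sim[i][j] for j in range(n) if j != i and sim[i][j] > threshold)
        scores.append(total / max(n - 1, 1))
    return scores


def select_model(sim: List[List[float]], threshold: float = 40.0) -> int:
    """Return the index of the model with the highest consensus score."""
    scores = consensus_scores(sim, threshold)
    return max(range(len(scores)), key=scores.__getitem__)
```

In this sketch the consensus score doubles as a confidence estimate: a model supported by many similar models from independent servers scores high, while an outlier scores near zero.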


Subjects
Computational Biology/methods , Protein Folding , Proteins/chemistry , Amino Acid Sequence , Binding Sites/genetics , Molecular Sequence Data , Protein Conformation , Protein Structure, Tertiary , Proteins/genetics , Sequence Alignment , Sequence Homology, Amino Acid