Results 1 - 20 of 33
1.
Cell Metab ; 35(4): 695-710.e6, 2023 04 04.
Article in English | MEDLINE | ID: mdl-36963395

ABSTRACT

Associations between human genetic variation and clinical phenotypes have become a foundation of biomedical research. Most repositories of these data seek to be disease-agnostic and therefore lack disease-focused views. The Type 2 Diabetes Knowledge Portal (T2DKP) is a public resource of genetic datasets and genomic annotations dedicated to type 2 diabetes (T2D) and related traits. Here, we seek to make the T2DKP more accessible to prospective users and more useful to existing users. First, we evaluate the T2DKP's comprehensiveness by comparing its datasets with those of other repositories. Second, we describe how researchers unfamiliar with human genetic data can begin using and correctly interpreting them via the T2DKP. Third, we describe how existing users can extend their current workflows to use the full suite of tools offered by the T2DKP. We finally discuss the lessons offered by the T2DKP toward the goal of democratizing access to complex disease genetic results.


Subjects
Diabetes Mellitus, Type 2; Humans; Diabetes Mellitus, Type 2/genetics; Access to Information; Prospective Studies; Genomics/methods; Phenotype
2.
Diabetologia ; 66(3): 495-507, 2023 03.
Article in English | MEDLINE | ID: mdl-36538063

ABSTRACT

AIMS/HYPOTHESIS: Type 2 diabetes is highly polygenic and influenced by multiple biological pathways. Rapid expansion in the number of type 2 diabetes loci can be leveraged to identify such pathways. METHODS: We developed a high-throughput pipeline to enable clustering of type 2 diabetes loci based on variant-trait associations. Our pipeline extracted summary statistics from genome-wide association studies (GWAS) for type 2 diabetes and related traits to generate a matrix of 323 variants × 64 trait associations and applied Bayesian non-negative matrix factorisation (bNMF) to identify genetic components of type 2 diabetes. Epigenomic enrichment analysis was performed in 28 cell types and single pancreatic cells. We generated cluster-specific polygenic scores and performed regression analysis in an independent cohort (N=25,419) to assess for clinical relevance. RESULTS: We identified ten clusters of genetic loci, recapturing the five from our prior analysis as well as novel clusters related to beta cell dysfunction, pronounced insulin secretion, and levels of alkaline phosphatase, lipoprotein A and sex hormone-binding globulin. Four clusters related to mechanisms of insulin deficiency, five to insulin resistance and one had an unclear mechanism. The clusters displayed tissue-specific epigenomic enrichment, notably with the two beta cell clusters differentially enriched in functional and stressed pancreatic beta cell states. Additionally, cluster-specific polygenic scores were differentially associated with patient clinical characteristics and outcomes. The pipeline was applied to coronary artery disease and chronic kidney disease, identifying multiple overlapping clusters with type 2 diabetes. CONCLUSIONS/INTERPRETATION: Our approach stratifies type 2 diabetes loci into physiologically interpretable genetic clusters associated with distinct tissues and clinical outcomes. 
The pipeline allows for efficient updating as additional GWAS become available and can be readily applied to other conditions, facilitating clinical translation of GWAS findings. Software to perform this clustering pipeline is freely available.


Subjects
Diabetes Mellitus, Type 2; Humans; Diabetes Mellitus, Type 2/genetics; Genome-Wide Association Study; Genetic Predisposition to Disease/genetics; Bayes Theorem; Cluster Analysis; Polymorphism, Single Nucleotide
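
As a rough illustration of the matrix-factorisation step described in the abstract above, the sketch below factorises a small non-negative variant × trait matrix with plain multiplicative-update NMF. It is not the Bayesian variant (bNMF) used in the paper, and it does not reproduce the authors' pipeline. The toy matrix `V`, the fixed rank `k=2`, and all function names are assumptions made for illustration.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def frobenius_error(v, w, h):
    """Squared Frobenius distance between V and the product W x H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Plain (non-Bayesian) NMF with multiplicative updates; k is fixed by the caller."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return w, h

# Toy matrix (invented): rows are variants, columns are trait association strengths.
V = [
    [3.0, 2.5, 0.0, 0.1],
    [2.0, 2.2, 0.1, 0.0],
    [0.0, 0.1, 4.0, 3.0],
    [0.1, 0.0, 2.0, 2.5],
]
W, H = nmf(V, k=2)

# Assign each variant to the component with the largest weight in its row of W.
clusters = [max(range(2), key=lambda a: row[a]) for row in W]
```

In this sketch each row of `W` gives a variant's loading on each component and each row of `H` gives the trait profile of a component; a Bayesian formulation would additionally place priors on the factors rather than fixing `k` by hand.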
3.
Dev Cell ; 57(3): 387-397.e4, 2022 02 07.
Article in English | MEDLINE | ID: mdl-35134345

ABSTRACT

Lipid droplets (LDs) are organelles of cellular lipid storage with fundamental roles in energy metabolism and cell membrane homeostasis. There has been an explosion of research into the biology of LDs, in part due to their relevance in diseases of lipid storage, such as atherosclerosis, obesity, type 2 diabetes, and hepatic steatosis. Consequently, there is an increasing need for a resource that combines datasets from systematic analyses of LD biology. Here, we integrate high-confidence, systematically generated human, mouse, and fly data from studies on LDs in the framework of an online platform named the "Lipid Droplet Knowledge Portal" (https://lipiddroplet.org/). This scalable and interactive portal includes comprehensive datasets, across a variety of cell types, for LD biology, including transcriptional profiles of induced lipid storage, organellar proteomics, genome-wide screen phenotypes, and ties to human genetics. This resource is a powerful platform that can be utilized to identify determinants of lipid storage.


Subjects
Databases as Topic; Lipid Droplets/metabolism; Animals; Cholesterol Esters/metabolism; Data Mining; Genome; Humans; Inflammation/pathology; Lipid Metabolism; Liver/metabolism; Male; Mice, Inbred C57BL; Phenotype; Phosphorylation; RNA Interference
4.
PLoS Med ; 15(9): e1002654, 2018 09.
Article in English | MEDLINE | ID: mdl-30240442

ABSTRACT

BACKGROUND: Type 2 diabetes (T2D) is a heterogeneous disease for which (1) disease-causing pathways are incompletely understood and (2) subclassification may improve patient management. Unlike other biomarkers, germline genetic markers do not change with disease progression or treatment. In this paper, we test whether a germline genetic approach informed by physiology can be used to deconstruct T2D heterogeneity. First, we aimed to categorize genetic loci into groups representing likely disease mechanistic pathways. Second, we asked whether the novel clusters of genetic loci we identified have any broad clinical consequence, as assessed in four separate subsets of individuals with T2D. METHODS AND FINDINGS: In an effort to identify mechanistic pathways driven by established T2D genetic loci, we applied Bayesian nonnegative matrix factorization (bNMF) clustering to genome-wide association study (GWAS) results for 94 independent T2D genetic variants and 47 diabetes-related traits. We identified five robust clusters of T2D loci and traits, each with distinct tissue-specific enhancer enrichment based on analysis of epigenomic data from 28 cell types. Two clusters contained variant-trait associations indicative of reduced beta cell function, differing from each other by high versus low proinsulin levels. The three other clusters displayed features of insulin resistance: obesity mediated (high body mass index [BMI] and waist circumference [WC]), "lipodystrophy-like" fat distribution (low BMI, adiponectin, and high-density lipoprotein [HDL] cholesterol, and high triglycerides), and disrupted liver lipid metabolism (low triglycerides). Increased cluster genetic risk scores were associated with distinct clinical outcomes, including increased blood pressure, coronary artery disease (CAD), and stroke. 
We evaluated the potential for clinical impact of these clusters in four studies containing individuals with T2D (Metabolic Syndrome in Men Study [METSIM], N = 487; Ashkenazi, N = 509; Partners Biobank, N = 2,065; UK Biobank [UKBB], N = 14,813). Individuals with T2D in the top genetic risk score decile for each cluster reproducibly exhibited the predicted cluster-associated phenotypes, with approximately 30% of all individuals assigned to just one cluster top decile. Limitations of this study include that the genetic variants used in the cluster analysis were restricted to those associated with T2D in populations of European ancestry. CONCLUSION: Our approach identifies salient T2D genetically anchored and physiologically informed pathways, and supports the use of genetics to deconstruct T2D heterogeneity. Classification of patients by these genetic pathways may offer a step toward genetically informed T2D patient management.


Subjects
Diabetes Mellitus, Type 2/classification; Diabetes Mellitus, Type 2/genetics; Genetic Loci; Multigene Family; Algorithms; Bayes Theorem; Cluster Analysis; Cohort Studies; Cross-Sectional Studies; Databases, Genetic; Female; Founder Effect; Genetic Markers; Genetic Predisposition to Disease; Genome-Wide Association Study; Humans; Insulin/deficiency; Insulin/genetics; Insulin Resistance/genetics; Male; Phenotype; Prospective Studies; Risk Factors
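
The abstract above describes cluster-specific genetic risk scores and the individuals in the top decile of each score. A minimal sketch of that bookkeeping follows, assuming a score is a weighted sum of risk-allele counts. The cluster names, weights, and genotypes are invented for illustration and are not taken from the study.

```python
import random

def genetic_risk_score(allele_counts, weights):
    """Weighted sum of risk-allele counts (0, 1 or 2 per variant)."""
    return sum(count * weight for count, weight in zip(allele_counts, weights))

def top_decile_members(scores):
    """Return the indices of the individuals with the highest 10% of scores."""
    n_top = max(1, len(scores) // 10)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:n_top])

# Hypothetical per-cluster variant weights (illustrative only).
cluster_weights = {
    "cluster_a": [0.8, 0.1, 0.0],
    "cluster_b": [0.0, 0.2, 0.9],
}

# Simulated genotypes for 50 individuals at 3 variants.
rng = random.Random(0)
genotypes = [[rng.randint(0, 2) for _ in range(3)] for _ in range(50)]

# Top-decile membership for each cluster score.
top_deciles = {
    name: top_decile_members([genetic_risk_score(g, w) for g in genotypes])
    for name, w in cluster_weights.items()
}

# Individuals who fall in the top decile of exactly one cluster.
membership_counts = [
    sum(i in members for members in top_deciles.values())
    for i in range(len(genotypes))
]
in_exactly_one = [i for i, n in enumerate(membership_counts) if n == 1]
```

Counting `in_exactly_one` against the cohort size corresponds, in spirit, to the proportion of individuals assigned to a single cluster's top decile that the abstract reports.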
5.
Diabetologia ; 61(6): 1315-1324, 2018 06.
Article in English | MEDLINE | ID: mdl-29626220

ABSTRACT

AIMS/HYPOTHESIS: Identifying the metabolite profile of individuals with normal fasting glucose (NFG [<5.55 mmol/l]) who progressed to type 2 diabetes may give novel insights into early type 2 diabetes disease interception and detection. METHODS: We conducted a population-based prospective study among 1150 Framingham Heart Study Offspring cohort participants, age 40-65 years, with NFG. Plasma metabolites were profiled by LC-MS/MS. Penalised regression models were used to select measured metabolites for type 2 diabetes incidence classification (training dataset) and to internally validate the discriminatory capability of selected metabolites beyond conventional type 2 diabetes risk factors (testing dataset). RESULTS: Over a follow-up period of 20 years, 95 individuals with NFG developed type 2 diabetes. Nineteen metabolites were selected repeatedly in the training dataset for type 2 diabetes incidence classification and were found to improve type 2 diabetes risk prediction beyond conventional type 2 diabetes risk factors (AUC was 0.81 for risk factors vs 0.90 for risk factors + metabolites, p = 1.1 × 10⁻⁴). Using pathway enrichment analysis, the nitrogen metabolism pathway, which includes three prioritised metabolites (glycine, taurine and phenylalanine), was significantly enriched for association with type 2 diabetes risk at the false discovery rate of 5% (p = 0.047). In adjusted Cox proportional hazard models, the type 2 diabetes risk per 1 SD increase in glycine, taurine and phenylalanine was 0.65 (95% CI 0.54, 0.78), 0.73 (95% CI 0.59, 0.9) and 1.35 (95% CI 1.11, 1.65), respectively. Mendelian randomisation demonstrated a similar relationship for type 2 diabetes risk per 1 SD genetically increased glycine (OR 0.89 [95% CI 0.8, 0.99]) and phenylalanine (OR 1.6 [95% CI 1.08, 2.4]). CONCLUSIONS/INTERPRETATION: In individuals with NFG, information from a discrete set of 19 metabolites improved prediction of type 2 diabetes beyond conventional risk factors. In addition, the nitrogen metabolism pathway and its components emerged as a potential effector of earliest stages of type 2 diabetes pathophysiology.


Subjects
Blood Glucose/metabolism; Diabetes Mellitus, Type 2/blood; Glycated Hemoglobin/metabolism; Metabolomics; Adult; Aged; Area Under Curve; Computational Biology; Diabetes Mellitus, Type 2/metabolism; Female; Glycine/metabolism; Humans; Incidence; Male; Mendelian Randomization Analysis; Middle Aged; Phenylalanine/metabolism; Prospective Studies; ROC Curve; Risk Factors; Tandem Mass Spectrometry; Taurine/metabolism
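
The abstract above compares the AUC of a risk-factor model with that of a model that also includes metabolites. As a reminder of what the statistic measures, the sketch below computes AUC as the probability that a randomly chosen case is scored higher than a randomly chosen non-case. The labels and predicted scores are toy values, not data from the study, and the variable names are assumptions.

```python
def auc(scores, labels):
    """Rank-based AUC: share of (case, non-case) pairs ranked correctly; ties count half."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = sum(
        1.0 if c > d else 0.5 if c == d else 0.0
        for c in cases
        for d in controls
    )
    return wins / (len(cases) * len(controls))

# Toy outcome labels (1 = developed type 2 diabetes) and two sets of predicted risks.
labels = [0, 0, 0, 1, 0, 1, 1, 0]
risk_factors_only = [0.1, 0.2, 0.6, 0.4, 0.3, 0.7, 0.5, 0.8]
with_metabolites = [0.1, 0.2, 0.3, 0.6, 0.4, 0.9, 0.7, 0.65]

auc_baseline = auc(risk_factors_only, labels)  # 10 of 15 pairs ranked correctly
auc_extended = auc(with_metabolites, labels)   # 14 of 15 pairs ranked correctly
```

A real comparison would evaluate both models on held-out data and test the difference formally, as the abstract's reported p-value implies.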
7.
Cell ; 170(1): 199-212.e20, 2017 Jun 29.
Article in English | MEDLINE | ID: mdl-28666119

ABSTRACT

Type 2 diabetes (T2D) affects Latinos at twice the rate seen in populations of European descent. We recently identified a risk haplotype spanning SLC16A11 that explains ∼20% of the increased T2D prevalence in Mexico. Here, through genetic fine-mapping, we define a set of tightly linked variants likely to contain the causal allele(s). We show that variants on the T2D-associated haplotype have two distinct effects: (1) decreasing SLC16A11 expression in liver and (2) disrupting a key interaction with basigin, thereby reducing cell-surface localization. Both independent mechanisms reduce SLC16A11 function and suggest SLC16A11 is the causal gene at this locus. To gain insight into how SLC16A11 disruption impacts T2D risk, we demonstrate that SLC16A11 is a proton-coupled monocarboxylate transporter and that genetic perturbation of SLC16A11 induces changes in fatty acid and lipid metabolism that are associated with increased T2D risk. Our findings suggest that increasing SLC16A11 function could be therapeutically beneficial for T2D.


Subjects
Diabetes Mellitus, Type 2/metabolism; Monocarboxylic Acid Transporters/genetics; Monocarboxylic Acid Transporters/metabolism; Basigin/metabolism; Cell Membrane/metabolism; Chromosomes, Human, Pair 17/metabolism; Gene Knockdown Techniques; Haplotypes; Hepatocytes/metabolism; Heterozygote; Histone Code; Humans; Liver/metabolism; Models, Molecular; Monocarboxylic Acid Transporters/chemistry
8.
Mol Cell ; 60(6): 941-52, 2015 Dec 17.
Article in English | MEDLINE | ID: mdl-26698662

ABSTRACT

In insects, brain-derived Prothoracicotropic hormone (PTTH) activates the receptor tyrosine kinase (RTK) Torso to initiate metamorphosis through the release of ecdysone. We have determined the crystal structure of silkworm PTTH in complex with the ligand-binding region of Torso. Here we show that ligand-induced Torso dimerization results from the sequential and negatively cooperative formation of asymmetric heterotetramers. Mathematical modeling of receptor activation based upon our biophysical studies shows that ligand pulses are "buffered" at low receptor levels, leading to a sustained signal. By contrast, high levels of Torso develop the signal intensity and duration of a noncooperative system. We propose that this may allow Torso to coordinate widely different functions from a single ligand by tuning receptor levels. Phylogenic analysis indicates that Torso is found outside arthropods, including human parasitic roundworms. Together, our findings provide mechanistic insight into how this receptor system, with roles in embryonic and adult development, is regulated.


Subjects
Bombyx/metabolism; Insect Hormones/chemistry; Insect Hormones/metabolism; Receptor Protein-Tyrosine Kinases/chemistry; Receptor Protein-Tyrosine Kinases/metabolism; Animals; Binding Sites; Bombyx/chemistry; Crystallography, X-Ray; Gene Expression Regulation, Developmental; Humans; Insect Proteins/chemistry; Insect Proteins/metabolism; Models, Molecular; Phylogeny; Protein Multimerization; Receptors, Interleukin-17/chemistry; Signal Transduction
9.
Nat Commun ; 6: 6509, 2015 Mar 05.
Article in English | MEDLINE | ID: mdl-25739651

ABSTRACT

Genome clustering of homeobox genes is often thought to reflect arrangements of tandem gene duplicates maintained by advantageous coordinated gene regulation. Here we analyse the chromosomal organization of the NK homeobox genes, presumed to be part of a single cluster in the Bilaterian ancestor, across 20 arthropods. We find that the ProtoNK cluster was extensively fragmented in some lineages, showing that NK clustering in Drosophila species does not reflect selectively maintained gene arrangements. More importantly, the arrangement of NK and neighbouring genes across the phylogeny supports that, in two instances within the Drosophila genus, some cluster remnants became reunited via large-scale chromosomal rearrangements. Simulated scenarios of chromosome evolution indicate that these reunion events are unlikely unless the genome neighbourhoods harbouring the participating genes tend to colocalize in the nucleus. Our results underscore how mechanisms other than tandem gene duplication can result in paralogous gene clustering during genome evolution.


Subjects
Drosophila/genetics; Evolution, Molecular; Gene Expression Regulation/genetics; Genes, Homeobox/genetics; Multigene Family/genetics; Translocation, Genetic/physiology; Amino Acid Sequence; Animals; Arthropods/genetics; Chromosome Mapping; Computational Biology; Gene Duplication/genetics; In Situ Hybridization; Likelihood Functions; Models, Genetic; Molecular Sequence Annotation; Molecular Sequence Data; Phylogeny; Species Specificity; Translocation, Genetic/genetics
10.
Mol Biol Evol ; 31(10): 2557-72, 2014 Oct.
Article in English | MEDLINE | ID: mdl-24951729

ABSTRACT

MicroRNAs (miRNAs) are endogenous RNA molecules that regulate gene expression posttranscriptionally. To date, the emergence of miRNAs and their patterns of sequence evolution have been analyzed in great detail. However, the extent to which miRNA expression levels have evolved over time, the role different evolutionary forces play in shaping these changes, and whether this variation in miRNA expression can reveal the interplay between miRNAs and mRNAs remain poorly understood. This is especially true for miRNA expressed during key developmental transitions. Here, we assayed miRNA expression levels immediately before (≥18BPF [18 h before puparium formation]) and after (PF) the increase in the hormone ecdysone responsible for triggering metamorphosis. We did so in four strains of Drosophila melanogaster and two closely related species. In contrast to their sequence conservation, approximately 25% of miRNAs analyzed showed significant within-species variation in male expression levels at ≥18BPF and/or PF. Additionally, approximately 33% showed modifications in their pattern of expression bias between developmental timepoints. A separate analysis of the ≥18BPF and PF stages revealed that changes in miRNA abundance accumulate linearly over evolutionary time at PF but not at ≥18BPF. Importantly, ≥18BPF-enriched miRNAs showed the greatest variation in expression levels both within and between species, so are the less likely to evolve under stabilizing selection. Functional attributes, such as expression ubiquity, appeared more tightly associated with lower levels of miRNA expression polymorphism at PF than at ≥18BPF. Furthermore, ≥18BPF- and PF-enriched miRNAs showed opposite patterns of covariation in expression with mRNAs, which denoted the type of regulatory relationship between miRNAs and mRNAs. Collectively, our results show contrasting patterns of functional divergence associated with miRNA expression levels during Drosophila ontogeny.


Subjects
Drosophila melanogaster/growth & development; Metamorphosis, Biological; MicroRNAs/genetics; Animals; Conserved Sequence; Drosophila melanogaster/classification; Drosophila melanogaster/genetics; Evolution, Molecular; Female; Gene Expression Profiling; Gene Expression Regulation, Developmental; Genetic Variation; Male; Molecular Sequence Data; Phylogeny; Sex Characteristics
11.
J Comput Biol ; 21(3): 247-56, 2014 Mar.
Article in English | MEDLINE | ID: mdl-21091053

ABSTRACT

Molecular docking is a widely used method for lead optimization. However, docking tools often fail to predict how a ligand (the smaller molecule, such as a substrate or drug candidate) binds to a receptor (the accepting part of a protein). We present here HarmonyDOCK, a novel method for assessing the accuracy of docking software and for creating a scoring function that determines the consensus protein-ligand pose among those generated by available docking programs. Conformations for a few hundred protein-ligand complexes with known three-dimensional structure were predicted on a benchmark set by a set of different docking programs. On the basis of the derived ranking, the point of reference and the lower score limit were determined for subsequent investigations. The focus of the methodology is on the top-ranked poses, with the assumption being that the conformation of the docked molecules is the most accurate. We found that some docking programs perform considerably better than others, yet in all cases the proper selection of decoys, namely HarmonyDOCK, is needed for a successful docking procedure.


Subjects
Ligands; Molecular Docking Simulation; Protein Conformation; Proteins/chemistry; Binding Sites; Drug Design; Protein Binding; Software
12.
J Med Genet ; 49(12): 747-52, 2012 Dec.
Article in English | MEDLINE | ID: mdl-23118445

ABSTRACT

BACKGROUND: Musical abilities such as recognising music and singing performance serve as means for communication and are instruments in sexual selection. Specific regions of the brain have been found to be activated by musical stimuli, but these have rarely been extended to the discovery of genes and molecules associated with musical ability. METHODS: A total of 1008 individuals from 73 families were enrolled and a pitch-production accuracy test was applied to determine musical ability. To identify genetic loci and variants that contribute to musical ability, we conducted family-based linkage and association analyses, and incorporated the results with data from exome sequencing and array comparative genomic hybridisation analyses. RESULTS: We found significant evidence of linkage at 4q23 with the nearest marker D4S2986 (LOD=3.1), whose supporting interval overlaps a previous study in Finnish families, and identified an intergenic single nucleotide polymorphism (SNP) (rs1251078, p = 8.4 × 10⁻¹⁷) near UGT8, a gene highly expressed in the central nervous system and known to act in brain organisation. In addition, a non-synonymous SNP in UGT8 was revealed to be highly associated with musical ability (rs4148254, p = 8.0 × 10⁻¹⁷), and a 6.2 kb copy number loss near UGT8 showed a plausible association with musical ability (p = 2.9 × 10⁻⁶). CONCLUSIONS: This study provides new insight into the genetics of musical ability, exemplifying a methodology to assign functional significance to synonymous and non-coding alleles by integrating multiple experimental methods.


Subjects
Asian People/genetics; Ganglioside Galactosyltransferase/genetics; Music; Polymorphism, Single Nucleotide; Psychomotor Performance; Adolescent; Adult; Comparative Genomic Hybridization; Exome; Family; Female; Genetic Association Studies; Genetic Linkage; Humans; Male; Middle Aged; Mongolia; Young Adult
13.
J Comput Chem ; 32(4): 568-81, 2011 Mar.
Article in English | MEDLINE | ID: mdl-20812324

ABSTRACT

Molecular recognition plays a fundamental role in all biological processes, which is why great efforts have been made to understand and predict protein-ligand interactions. Finding a molecule that can potentially bind to a target protein is particularly essential in drug discovery and remains an expensive and time-consuming task. In silico tools are frequently used to screen molecular libraries to identify new lead compounds, and if the protein structure is known, various protein-ligand docking programs can be used. The aim of the docking procedure is to predict correct poses of the ligand in the binding site of the protein and to score them according to the strength of interaction in a reasonable time frame. The purpose of our studies was to present a novel consensus approach to predict both the protein-ligand complex structure and its corresponding binding affinity. Our method used as input the results from seven docking programs (Surflex, LigandFit, Glide, GOLD, FlexX, eHiTS, and AutoDock) that are widely used for docking of ligands. We evaluated it on an extensive benchmark dataset of 1300 protein-ligand pairs from the refined PDBbind database for which structural and affinity data were available. We independently compared its scoring and posing ability with that of previously proposed methods. In most cases, our method is able to dock properly approximately 20% of pairs more than docking methods on average, and over 10% of pairs more than the best single program. The RMSD value of the predicted complex conformation versus its native one is reduced by a factor of 0.5 Å. Finally, we were able to increase the Pearson correlation of the predicted binding affinity in comparison with the experimental value up to 0.5.


Subjects
Drug Design; Proteins/antagonists & inhibitors; Proteins/metabolism; Software; Algorithms; Databases, Protein; Ligands; Protein Binding
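
The abstract above combines poses from several docking programs and reports accuracy as RMSD against the native conformation. The sketch below shows one simple way such a consensus could be formed: pick the pose with the lowest mean RMSD to the poses proposed by the other programs. This selection rule is an assumption for illustration and is not claimed to be the authors' scoring function; the program names and coordinates are invented, and no structural superposition is performed.

```python
import math

def rmsd(pose_a, pose_b):
    """Root-mean-square deviation between two poses with matched atom order."""
    if len(pose_a) != len(pose_b):
        raise ValueError("poses must contain the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(pose_a, pose_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(pose_a))

def consensus_pose(poses):
    """Return the name of the pose with the lowest mean RMSD to all other poses."""
    def mean_rmsd(name):
        others = [rmsd(poses[name], p) for other, p in poses.items() if other != name]
        return sum(others) / len(others)
    return min(poses, key=mean_rmsd)

# Hypothetical two-atom ligand poses (x, y, z in angstroms) from four programs.
poses = {
    "program_a": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "program_b": [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)],
    "program_c": [(-0.1, 0.0, 0.0), (0.9, 0.0, 0.0)],
    "program_d": [(0.0, 5.0, 0.0), (1.0, 5.0, 0.0)],  # outlier pose
}

best = consensus_pose(poses)
```

Here the outlier from `program_d` is far from every other pose, so the pose closest to the centre of the agreeing group is selected.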
14.
Genome Res ; 20(8): 1084-96, 2010 Aug.
Article in English | MEDLINE | ID: mdl-20601587

ABSTRACT

During evolution, gene repatterning across eukaryotic genomes is not uniform. Some genomic regions exhibit a gene organization conserved phylogenetically, while others are recurrently involved in chromosomal rearrangement, resulting in breakpoint reuse. Both gene order conservation and breakpoint reuse can result from the existence of functional constraints on where chromosomal breakpoints occur or from the existence of regions that are susceptible to breakage. The balance between these two mechanisms is still poorly understood. Drosophila species have very dynamic genomes and, therefore, can be very informative. We compared the gene organization of the main five chromosomal elements (Muller's elements A-E) of nine Drosophila species. Under a parsimonious evolutionary scenario, we estimate that 6116 breakpoints differentiate the gene orders of the species and that breakpoint reuse is associated with approximately 80% of the orthologous landmarks. The comparison of the observed patterns of change in gene organization with those predicted under different simulated modes of evolution shows that fragile regions alone can explain the observed key patterns of Muller's element A (X chromosome) more often than for any other Muller's element. High levels of fragility plus constraints operating on approximately 15% of the genome are sufficient to explain the observed patterns of change and conservation across species. The orthologous landmarks more likely to be under constraint exhibit both a remarkable internal functional heterogeneity and a lack of common functional themes with the exception of the presence of highly conserved noncoding elements. Fragile regions rather than functional constraints have been the main determinant of the evolution of the Drosophila chromosomes.


Subjects
Chromosome Fragile Sites/genetics; Drosophila/genetics; Gene Order; Genome, Insect; Animals; Base Sequence; Chromosome Breakpoints; Chromosome Inversion/genetics; Evolution, Molecular; Female; Gene Expression; Male; X Chromosome/genetics
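
The abstract above counts the breakpoints that differentiate gene orders between species. A minimal sketch of the underlying idea follows: a breakpoint is an adjacency between two genes in one order that is not preserved in the other. This simplified version ignores gene orientation and chromosome ends, which is an assumption made here and not a description of the paper's method; the gene names are invented.

```python
def count_breakpoints(reference, derived):
    """Count adjacent gene pairs in `reference` that are not adjacent in `derived`."""
    if sorted(reference) != sorted(derived):
        raise ValueError("both orders must contain the same genes")
    # Unordered pairs, so an adjacency read in either direction counts as conserved.
    adjacent = {frozenset(pair) for pair in zip(derived, derived[1:])}
    return sum(
        frozenset(pair) not in adjacent
        for pair in zip(reference, reference[1:])
    )

# Hypothetical gene order and a derived order carrying one inversion of g3..g5.
reference_order = ["g1", "g2", "g3", "g4", "g5", "g6"]
inverted_order = ["g1", "g2", "g5", "g4", "g3", "g6"]

n_breakpoints = count_breakpoints(reference_order, inverted_order)
```

A single inversion disrupts the adjacency at each of its two ends, so `n_breakpoints` is 2 for this example. When later rearrangements break the same adjacency again, the count stays lower than twice the number of inversions, which is the signature of breakpoint reuse.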
15.
Comb Chem High Throughput Screen ; 12(5): 484-9, 2009 Jun.
Article in English | MEDLINE | ID: mdl-19519327

ABSTRACT

We present here the random forest supervised machine learning algorithm applied to flexible docking results from five typical virtual high throughput screening (HTS) studies. Our approach is aimed at: i) reducing the number of compounds to be tested experimentally against the given protein target and ii) extending results of flexible docking experiments performed only on a subset of a chemical library in order to select promising inhibitors from the whole dataset. The random forest (RF) method is applied and tested here on compounds from the MDL drug data report (MDDR). The recall values for the five selected diverse protein targets are over 90% and the performance reaches 100%. This machine learning method combined with flexible docking is capable of finding 60% of the active compounds for most protein targets by docking only 10% of screened ligands. Therefore our in silico approach is able to scan very large databases rapidly in order to predict biological activity of small molecule inhibitors and provides an effective alternative to more computationally demanding methods in virtual HTS.


Subjects
Algorithms; Artificial Intelligence; Drug Design; Proteins/antagonists & inhibitors; Proteins/metabolism; Humans; Ligands; Proteins/chemistry; Small Molecule Libraries/chemistry; Small Molecule Libraries/metabolism; Structure-Activity Relationship
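
The abstract above summarises performance as the share of active compounds recovered when only a fraction of the library is examined. The sketch below computes that quantity, recall at a screened fraction, from a ranked list. It illustrates the evaluation metric only and does not reproduce the random forest model; the scores and activity labels are toy values chosen for illustration.

```python
def recall_at_fraction(scores, is_active, fraction):
    """Share of all active compounds found among the top-scoring `fraction` of the library."""
    total_actives = sum(is_active)
    if total_actives == 0:
        raise ValueError("the library contains no active compounds")
    n_selected = max(1, round(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    found = sum(is_active[i] for i in ranked[:n_selected])
    return found / total_actives

# Toy library of 20 compounds: predicted scores (higher = more promising) and true activity.
scores = [20 - i for i in range(20)]
is_active = [1 if i in (0, 1, 2, 10, 15) else 0 for i in range(20)]

recall_top_10_percent = recall_at_fraction(scores, is_active, 0.10)  # 2 of 5 actives
recall_top_50_percent = recall_at_fraction(scores, is_active, 0.50)  # 3 of 5 actives
```

In a real screen the scores would come from the trained classifier, and the same function would show how recall grows as a larger fraction of the library is docked.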
16.
Nucleic Acids Res ; 36(Web Server issue): W303-7, 2008 Jul 01.
Article in English | MEDLINE | ID: mdl-18515349

ABSTRACT

The 'omics' revolution is causing a flurry of data that all needs to be annotated for it to become useful. Sequences of proteins of unknown function can be annotated with a putative function by comparing them with proteins of known function. This form of annotation is typically performed with BLAST or similar software. Structural genomics is nowadays also bringing us three dimensional structures of proteins with unknown function. We present here software that can be used when sequence comparisons fail to determine the function of a protein with known structure but unknown function. The software, called 3D-Fun, is implemented as a server that runs at several European institutes and is freely available for everybody at all these sites. The 3D-Fun servers accept protein coordinates in the standard PDB format and compare them with all known protein structures by 3D structural superposition using the 3D-Hit software. If structural hits are found with proteins with known function, these are listed together with their function and some vital comparison statistics. This is conceptually very similar in 3D to what BLAST does in 1D. Additionally, the superposition results are displayed using interactive graphics facilities. Currently, the 3D-Fun system only predicts enzyme function but an expanded version with Gene Ontology predictions will be available soon. The server can be accessed at http://3dfun.bioinfo.pl/ or at http://3dfun.cmbi.ru.nl/.


Subjects
Enzymes/chemistry; Software; Algorithms; Enzymes/metabolism; Internet; Models, Molecular; Structural Homology, Protein
17.
PLoS Biol ; 5(6): e152, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17550304

ABSTRACT

That closely related species often differ by chromosomal inversions was discovered by Sturtevant and Plunkett in 1926. Our knowledge of how these inversions originate is still very limited, although a prevailing view is that they are facilitated by ectopic recombination events between inverted repetitive sequences. The availability of genome sequences of related species now allows us to study in detail the mechanisms that generate interspecific inversions. We have analyzed the breakpoint regions of the 29 inversions that differentiate the chromosomes of Drosophila melanogaster and two closely related species, D. simulans and D. yakuba, and reconstructed the molecular events that underlie their origin. Experimental and computational analysis revealed that the breakpoint regions of 59% of the inversions (17/29) are associated with inverted duplications of genes or other nonrepetitive sequences. In only two cases do we find evidence for inverted repetitive sequences in inversion breakpoints. We propose that the presence of inverted duplications associated with inversion breakpoint regions is the result of staggered breaks, either isochromatid or chromatid, and that this, rather than ectopic exchange between inverted repetitive sequences, is the prevalent mechanism for the generation of inversions in the melanogaster species group. Outgroup analysis also revealed evidence for widespread breakpoint recycling. Lastly, we have found that expression domains in D. melanogaster may be disrupted in D. yakuba, bringing into question their potential adaptive significance.


Subjects
Biological Evolution; Chromosome Inversion; Drosophila/genetics; Genome, Insect; Animals; Chromosome Breakage; Gene Duplication; Molecular Sequence Data
18.
Chem Biol Drug Des ; 69(4): 269-79, 2007 Apr.
Article in English | MEDLINE | ID: mdl-17461975

ABSTRACT

A structure-based in silico virtual drug discovery procedure was assessed with severe acute respiratory syndrome coronavirus main protease serving as a case study. First, potential compounds were extracted from protein-ligand complexes selected from the Protein Data Bank database based on structural similarity to the target protein. Later, the set of compounds was ranked by docking scores using an Electronic High-Throughput Screening flexible docking procedure to select the most promising molecules. The set of best performing compounds was then used for similarity search over the 1 million entries in the Ligand.Info Meta-Database. Selected molecules having a close structural relationship to 2-methyl-2,4-pentanediol may provide candidate lead compounds toward the development of novel allosteric severe acute respiratory syndrome protease inhibitors.


Subjects
Computational Biology/methods , Computer Simulation , Preclinical Drug Evaluation/methods , Endopeptidases/metabolism , Protease Inhibitors/pharmacology , Severe Acute Respiratory Syndrome-Related Coronavirus/drug effects , Severe Acute Respiratory Syndrome-Related Coronavirus/enzymology , Protein Databases , Drug Design , Ligands , Molecular Models , Molecular Conformation , Protease Inhibitors/chemistry , Structure-Activity Relationship
19.
Comb Chem High Throughput Screen ; 10(3): 189-96, 2007 Mar.
Article in English | MEDLINE | ID: mdl-17346118

ABSTRACT

In many cases, some information about active molecules is already available at the beginning of an HTS campaign. Known active compounds (such as substrate analogues, natural products, inhibitors of a related protein or ligands published by a pharmaceutical company) are often identified in low-throughput validation studies of the biochemical target. In this study we evaluate the effectiveness of a support vector machine applied to those compounds and used to classify a collection with unknown activity. This approach was aimed at reducing the number of compounds to be tested against the given target. Our method predicts the biological activity of chemical compounds based only on atom pair (AP) two-dimensional topological descriptors. The supervised support vector machine (SVM) method herein is trained on compounds from the MDL Drug Data Report (MDDR) known to be active for a specific protein target. For detailed analysis, five different biological targets were selected, including cyclooxygenase-2, dihydrofolate reductase, thrombin, HIV reverse transcriptase and antagonists of the estrogen receptor. The accuracy of compound identification was estimated using the recall and precision values. The sensitivities for all protein targets exceeded 80% and the classification performance reached 100% for selected targets. In another application of the method, we addressed the absence of an initial set of active compounds for a selected protein target at the beginning of an HTS campaign. In such a case, virtual high-throughput screening (vHTS) is usually applied by using a flexible docking procedure. However, the vHTS experiment typically yields a large percentage of false positives that must be verified by costly and time-consuming experimental follow-up assays. The subsequent use of our machine learning method was found to improve the speed (since the docking procedure was not required for all compounds from the database) and also the accuracy of the HTS hit lists (the enrichment factor).
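
The evaluation measures named in this abstract (precision, recall and the enrichment factor) can be made concrete with a short sketch. The abstract does not give its formulas, so the enrichment factor is assumed here to follow the common definition: the fraction of actives among the selected compounds divided by the fraction of actives in the whole library. The function name and toy compound IDs are hypothetical.

```python
def screening_metrics(predicted_active, truly_active, library_size):
    """Return (precision, recall, enrichment_factor) for a virtual-screening hit list.

    predicted_active: IDs the classifier selected as active
    truly_active:     IDs known to be active
    library_size:     total number of compounds screened
    """
    predicted = set(predicted_active)
    actives = set(truly_active)
    hits = len(predicted & actives)

    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actives) if actives else 0.0
    # EF = (hits / selected) / (actives / library), rearranged so that the
    # toy example below evaluates exactly in floating point.
    if predicted and actives:
        enrichment = hits * library_size / (len(predicted) * len(actives))
    else:
        enrichment = 0.0
    return precision, recall, enrichment


# Toy example: 1000 compounds, 10 true actives (IDs 0-9),
# classifier selects 10 compounds, 8 of which are true actives.
precision, recall, ef = screening_metrics(
    predicted_active=[0, 1, 2, 3, 4, 5, 6, 7, 100, 101],
    truly_active=range(10),
    library_size=1000,
)
print(precision, recall, ef)
```

An enrichment factor above 1 means the selected subset is richer in actives than a random draw of the same size from the library.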


Subjects
Artificial Intelligence , Drug Delivery Systems/methods , Preclinical Drug Evaluation/methods , Enzyme Inhibitors , Humans , Protein Binding , Quantitative Structure-Activity Relationship , Cell Surface Receptors/antagonists & inhibitors
20.
J Comput Aided Mol Des ; 20(5): 305-19, 2006 May.
Article in English | MEDLINE | ID: mdl-16972168

ABSTRACT

The severe acute respiratory syndrome coronavirus helicase ATPase catalytic domain was modeled using the protein structure prediction Meta Server and the 3D Jury method for model selection, which identified the 1JPR, 1UAA and 1W36 PDB structures as suitable templates for creating a full-atom 3D model. This model was further utilized to design small molecules that are expected to block the ATPase catalytic pocket and thus inhibit the enzymatic activity. Binding sites for various functional groups were identified in a series of molecular dynamics calculations. Their positions in the catalytic pocket were used as constraints in a Cambridge Structural Database search for molecules having the pharmacophores that interacted most strongly with the enzyme in a desired position. Subsequent MD simulations, followed by calculations of the binding energies of the designed molecules relative to ATP, identified the most promising candidate inhibitors: molecules possessing two phosphonic acid moieties at their distal ends.


Subjects
Catalytic Domain , DNA Helicases/antagonists & inhibitors , DNA Helicases/chemistry , Drug Design , Enzyme Inhibitors/pharmacology , Molecular Models , Severe Acute Respiratory Syndrome-Related Coronavirus/enzymology , Amino Acid Sequence , Conserved Sequence , Molecular Sequence Data , Sequence Alignment , Protein Structural Homology , Thermodynamics