1.
J Chem Inf Model ; 64(9): 3826-3840, 2024 May 13.
Article in English | MEDLINE | ID: mdl-38696451

ABSTRACT

Recent advances in computational methods provide the promise of dramatically accelerating drug discovery. While mathematical modeling and machine learning have become vital in predicting drug-target interactions and properties, there is untapped potential in computational drug discovery due to the vast and complex chemical space. This paper builds on our recently published computational fragment-based drug discovery (FBDD) method called fragment databases from screened ligand drug discovery (FDSL-DD). FDSL-DD uses in silico screening to identify ligands from a vast library, fragmenting them while attaching specific attributes based on predicted binding affinity and interaction with the target subdomain. In this paper, we further propose a two-stage optimization method that utilizes the information from prescreening to optimize computational ligand synthesis. We hypothesize that using prescreening information for optimization shrinks the search space and focuses on promising regions, thereby improving the optimization for candidate ligands. The first optimization stage assembles these fragments into larger compounds using genetic algorithms, followed by a second stage of iterative refinement to produce compounds with enhanced bioactivity. To demonstrate broad applicability, the methodology is demonstrated on three diverse protein targets found in human solid cancers, bacterial antimicrobial resistance, and the SARS-CoV-2 virus. Combined, the proposed FDSL-DD and a two-stage optimization approach yield high-affinity ligand candidates more efficiently than other state-of-the-art computational FBDD methods. We further show that a multiobjective optimization method accounting for drug-likeness can still produce potential candidate ligands with a high binding affinity. 
Overall, the results demonstrate that integrating detailed chemical information with a constrained search framework can markedly optimize the initial drug discovery process, offering a more precise and efficient route to developing new therapeutics.


Subject(s)
Drug Discovery , Ligands , Drug Discovery/methods , Humans , SARS-CoV-2/metabolism , Algorithms , COVID-19 Drug Treatment , COVID-19/virology
2.
J Mol Graph Model ; 127: 108669, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38011826

ABSTRACT

Fragment-based drug design (FBDD) is a major drug discovery method employed in computer-aided drug discovery. Due to its inherent limitations, this process experiences long processing times and limited success rates. Here we present a new Fragment Databases from Screened Ligands Drug Design method (FDSL-DD) that intelligently incorporates information about fragment characteristics into a fragment-based design approach to the drug development process. The initial step of the FDSL-DD is the creation of a fragment database from a library of docked, drug-like ligands for a specific target, which deviates from the traditional in silico FBDD strategy, incorporating structure-based design screening techniques to combine the advantages of both approaches. Three different protein targets have been tested in this study to demonstrate the potential of the created fragment library and FDSL-DD. Utilizing the FDSL-DD led to an increase in binding affinity for each protein target. The most substantial increase was exhibited by the ligand designed for TIPE2, with a 3.6 kcal mol⁻¹ difference between the top ligand from the FDSL-DD and the top ligand from the high throughput virtual screening (HTVS). Using drug-like ligands in the initial HTVS allows for a greater search of chemical space, with higher efficiency in fragment selection, fewer grid boxes, and the potential to identify more interactions.


Subject(s)
Drug Design , Drug Discovery , Ligands , Drug Discovery/methods , High-Throughput Screening Assays , Databases, Factual
3.
PeerJ ; 11: e14779, 2023.
Article in English | MEDLINE | ID: mdl-36785708

ABSTRACT

A major challenge for clustering algorithms is to balance the trade-off between homogeneity, i.e., the degree to which an individual cluster includes only related sequences, and completeness, i.e., the degree to which related sequences are kept together rather than broken up into multiple clusters. Most algorithms are conservative in grouping sequences with other sequences. Remote homologs may fail to be clustered together and instead form unnecessarily distinct clusters. The resulting clusters have high homogeneity but completeness that is too low. We propose Complet+, a computationally scalable post-processing method to increase the completeness of clusters without an undue cost in homogeneity. Complet+ proves to effectively merge closely-related clusters of proteins that have verified structural relationships in the SCOPe classification scheme, improving the completeness of clustering results at little cost to homogeneity. Applying Complet+ to clusters obtained using MMseqs2's clusterupdate achieves an increased V-measure of 0.09 and 0.05 at the SCOPe superfamily and family levels, respectively. Complet+ also creates more biologically representative clusters, as shown by a substantial increase in Adjusted Mutual Information (AMI) and Adjusted Rand Index (ARI) metrics when comparing predicted clusters to biological classifications. Complet+ similarly improves clustering metrics when applied to other methods, such as CD-HIT and linclust. Finally, we show that Complet+ runtime scales linearly with respect to the number of clusters being post-processed on a COG dataset of over 3 million sequences. Code and supplementary information are available on GitHub: https://github.com/EESI/Complet-Plus.


Subject(s)
Algorithms , Proteins , Sequence Alignment , Amino Acid Sequence , Proteins/chemistry , Cluster Analysis
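
The homogeneity/completeness trade-off described in the abstract above can be made concrete with a small calculation. The following is a minimal sketch of the standard entropy-based homogeneity, completeness, and V-measure definitions in plain Python; it is not code from the Complet+ repository, and the function names and toy labels are hypothetical.

```python
import math
from collections import Counter


def _entropy(labels):
    """Shannon entropy (natural log) of a labeling."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def _conditional_entropy(target, given):
    """H(target | given) for two equal-length labelings."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum((c / n) * math.log(c / given_counts[g])
                for (g, _), c in joint.items())


def v_measure(classes, clusters):
    """Return (homogeneity, completeness, V-measure).

    classes  : reference labels (e.g. a curated family per sequence)
    clusters : predicted cluster id per sequence
    """
    h_c = _entropy(classes)
    h_k = _entropy(clusters)
    homogeneity = 1.0 if h_c == 0 else 1.0 - _conditional_entropy(classes, clusters) / h_c
    completeness = 1.0 if h_k == 0 else 1.0 - _conditional_entropy(clusters, classes) / h_k
    if homogeneity + completeness == 0:
        return homogeneity, completeness, 0.0
    v = 2 * homogeneity * completeness / (homogeneity + completeness)
    return homogeneity, completeness, v


# Toy data (hypothetical): family "A" is over-split into clusters 0 and 1.
families = ["A", "A", "A", "A", "B", "B"]
over_split = [0, 0, 1, 1, 2, 2]
merged = [0, 0, 0, 0, 2, 2]
```

On the toy labels, `v_measure(families, over_split)` gives perfect homogeneity but reduced completeness, while merging the two "A" clusters (`merged`) restores completeness without lowering homogeneity. This is the kind of improvement a post-processing merge step aims for.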
4.
Biology (Basel) ; 11(12)2022 Dec 08.
Article in English | MEDLINE | ID: mdl-36552295

ABSTRACT

Through the COVID-19 pandemic, SARS-CoV-2 has gained and lost multiple mutations in novel or unexpected combinations. Predicting how complex mutations affect COVID-19 disease severity is critical in planning public health responses as the virus continues to evolve. This paper presents a novel computational framework to complement conventional lineage classification and applies it to predict the severe disease potential of viral genetic variation. The transformer-based neural network model architecture has additional layers that provide sample embeddings and sequence-wide attention for interpretation and visualization. First, training a model to predict SARS-CoV-2 taxonomy validates the architecture's interpretability. Second, an interpretable predictive model of disease severity is trained on spike protein sequence and patient metadata from GISAID. Confounding effects of changing patient demographics, increasing vaccination rates, and improving treatment over time are addressed by including demographics and case date as independent input to the neural network model. The resulting model can be interpreted to identify potentially significant virus mutations and proves to be a robust predictive tool. Although trained on sequence data obtained entirely before the availability of empirical data for Omicron, the model can predict Omicron's reduced risk of severe disease, in accord with epidemiological and experimental data.

5.
Comput Biol Med ; 149: 105969, 2022 Oct.
Article in English | MEDLINE | ID: mdl-36041271

ABSTRACT

Epidemiological studies show that COVID-19 variants-of-concern, like Delta and Omicron, pose different risks for severe disease, but they typically lack sequence-level information for the virus. Studies which do obtain viral genome sequences are generally limited in time, location, and population scope. Retrospective meta-analyses require time-consuming data extraction from heterogeneous formats and are limited to publicly available reports. Fortuitously, a subset of GISAID, the global SARS-CoV-2 sequence repository, includes "patient status" metadata that can indicate whether a sequence record is associated with mild or severe disease. While GISAID lacks data on comorbidities relevant to severity, such as obesity and chronic disease, it does include metadata for age and sex to use as additional attributes in modeling. With these caveats, previous efforts have demonstrated that genotype-patient status models can be fit to GISAID data, particularly when country-of-origin is used as an additional feature. But are these models robust and biologically meaningful? This paper shows that, in fact, temporal and geographic biases in sequences submitted to GISAID, as well as the evolving pandemic response, particularly reduction in severe disease due to vaccination, create complex issues for model development and interpretation. This paper poses a potential solution: efficient mixed effects machine learning using GPBoost, treating country as a random effect group. Training and validation using temporally split GISAID data and emerging Omicron variants demonstrates that GPBoost models are more predictive of the impact of spike protein mutations on patient outcomes than fixed effect XGBoost, LightGBM, random forests, and elastic net logistic regression models.


Subject(s)
COVID-19 , Spike Glycoprotein, Coronavirus , COVID-19/epidemiology , Humans , Machine Learning , Mutation , Phylogeny , Retrospective Studies , SARS-CoV-2 , Severity of Illness Index , Spike Glycoprotein, Coronavirus/genetics
6.
mSystems ; 7(2): e0003522, 2022 Apr 26.
Article in English | MEDLINE | ID: mdl-35311562

ABSTRACT

Next-generation sequencing has been essential to the global response to the COVID-19 pandemic. As of January 2022, nearly 7 million severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) sequences were available to researchers in public databases. Sequence databases are an abundant resource from which to extract biologically relevant and clinically actionable information. As the pandemic has gone on, SARS-CoV-2 has rapidly evolved, involving complex genomic changes that challenge current approaches to classifying SARS-CoV-2 variants. Deep sequence learning could be a powerful way to build complex sequence-to-phenotype models. Unfortunately, while it can be predictive, deep learning typically produces "black box" models that cannot directly provide biological and clinical insight. Researchers should therefore consider implementing emerging methods for visualizing and interpreting deep sequence models. Finally, researchers should address important data limitations, including (i) global sequencing disparities, (ii) insufficient sequence metadata, and (iii) screening artifacts due to poor sequence quality control.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , Pandemics/prevention & control , High-Throughput Nucleotide Sequencing
7.
PLoS Comput Biol ; 17(9): e1009345, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34550967

ABSTRACT

Recurrent neural networks with memory and attention mechanisms are widely used in natural language processing because they can capture short and long term sequential information for diverse tasks. We propose an integrated deep learning model for microbial DNA sequence data, which exploits convolutional neural networks, recurrent neural networks, and attention mechanisms to predict taxonomic classifications and sample-associated attributes, such as the relationship between the microbiome and host phenotype, on the read/sequence level. In this paper, we develop this novel deep learning approach and evaluate its application to amplicon sequences. We apply our approach to short DNA reads and full sequences of 16S ribosomal RNA (rRNA) marker genes, which identify the heterogeneity of a microbial community sample. We demonstrate that our implementation of a novel attention-based deep network architecture, Read2Pheno, achieves read-level phenotypic prediction. Training Read2Pheno models will encode sequences (reads) into dense, meaningful representations: learned embedded vectors output from the intermediate layer of the network model, which can provide biological insight when visualized. The attention layer of Read2Pheno models can also automatically identify nucleotide regions in reads/sequences which are particularly informative for classification. As such, this novel approach can avoid pre/post-processing and manual interpretation required with conventional approaches to microbiome sequence classification. We further show, as proof-of-concept, that aggregating read-level information can robustly predict microbial community properties, host phenotype, and taxonomic classification, with performance at least comparable to conventional approaches. An implementation of the attention-based deep learning network is available at https://github.com/EESI/sequence_attention (a python package) and https://github.com/EESI/seq2att (a command line tool).


Subject(s)
Deep Learning , Microbiota/genetics , Neural Networks, Computer , RNA, Ribosomal, 16S/genetics , Algorithms , Computational Biology , Databases, Genetic , Gastrointestinal Microbiome/genetics , Host Microbial Interactions/genetics , Humans , Inflammatory Bowel Diseases/microbiology , Natural Language Processing , Phenotype , Prevotella/classification , Prevotella/genetics , Prevotella/isolation & purification , Proof of Concept Study , RNA, Ribosomal, 16S/classification
8.
Biology (Basel) ; 9(11)2020 Oct 28.
Article in English | MEDLINE | ID: mdl-33126516

ABSTRACT

Machine learning algorithms can learn mechanisms of antimicrobial resistance (AMR) from DNA sequence data without any a priori information. Interpreting a trained machine learning model can be exploited to validate the model and to obtain new information about resistance mechanisms. Different feature extraction methods, such as SNP calling and counting nucleotide k-mers, have been proposed for presenting DNA sequences to the model. However, there are trade-offs between interpretability, computational complexity, and accuracy for different feature extraction methods. In this study, we propose a new feature extraction method, counting amino acid k-mers or oligopeptides, which provides easier model interpretation than counting nucleotide k-mers and reaches the same or even better accuracy in comparison with other methods. Additionally, we trained machine learning algorithms using different feature extraction methods and compared the results in terms of accuracy, model interpretability, and computational complexity. We built a new feature selection pipeline for extraction of important features so that new AMR determinants can be discovered by analyzing these features. This pipeline allows the construction of models that use only a small number of features and can predict resistance accurately.
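
The k-mer counting idea in the abstract above can be sketched as follows. This is an illustrative example only, not the authors' pipeline: the function names, the toy peptide, and the fixed vocabulary are hypothetical, and a real workflow would first translate coding regions into amino acid sequences.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count overlapping k-mers in a sequence string (protein or DNA)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def kmer_feature_vector(sequence, k, vocabulary):
    """Turn a sequence into a count vector over a fixed k-mer vocabulary.

    Each position of the vector corresponds to one oligopeptide, which is
    what makes a model trained on these features comparatively easy to read.
    """
    counts = count_kmers(sequence, k)
    return [counts[kmer] for kmer in vocabulary]


# Toy peptide (hypothetical) and a small hand-picked vocabulary.
peptide = "MKTMKT"
vocab = ["MKT", "KTM", "TMK", "AAA"]
features = kmer_feature_vector(peptide, 3, vocab)
```

Here `features` holds one count per vocabulary entry, so a feature that a trained model weights heavily maps directly back to a specific oligopeptide.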

9.
PLoS Comput Biol ; 16(9): e1008269, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32941419

ABSTRACT

We propose an efficient framework for genetic subtyping of SARS-CoV-2, the novel coronavirus that causes the COVID-19 pandemic. Efficient viral subtyping enables visualization and modeling of the geographic distribution and temporal dynamics of disease spread. Subtyping thereby advances the development of effective containment strategies and, potentially, therapeutic and vaccine strategies. However, identifying viral subtypes in real-time is challenging: SARS-CoV-2 is a novel virus, and the pandemic is rapidly expanding. Viral subtypes may be difficult to detect due to rapid evolution; founder effects are more significant than selection pressure; and the clustering threshold for subtyping is not standardized. We propose to identify mutational signatures of available SARS-CoV-2 sequences using a population-based approach: an entropy measure followed by frequency analysis. These signatures, Informative Subtype Markers (ISMs), define a compact set of nucleotide sites that characterize the most variable (and thus most informative) positions in the viral genomes sequenced from different individuals. Through ISM compression, we find that certain distant nucleotide variants covary, including non-coding and ORF1ab sites covarying with the D614G spike protein mutation which has become increasingly prevalent as the pandemic has spread. ISMs are also useful for downstream analyses, such as spatiotemporal visualization of viral dynamics. By analyzing sequence data available in the GISAID database, we validate the utility of ISM-based subtyping by comparing spatiotemporal analyses using ISMs to epidemiological studies of viral transmission in Asia, Europe, and the United States. In addition, we show the relationship of ISMs to phylogenetic reconstructions of SARS-CoV-2 evolution, and therefore, ISMs can play an important complementary role to phylogenetic tree-based analysis, such as is done in the Nextstrain project. 
The developed pipeline dynamically generates ISMs for newly added SARS-CoV-2 sequences and updates the visualization of pandemic spatiotemporal dynamics, and is available on GitHub at https://github.com/EESI/ISM (Jupyter notebook), https://github.com/EESI/ncov_ism (command line tool) and via an interactive website at https://covid19-ism.coe.drexel.edu/.


Subject(s)
Betacoronavirus/classification , Betacoronavirus/genetics , Coronavirus Infections , Genomics/methods , Pandemics , Pneumonia, Viral , COVID-19 , Coronavirus Infections/epidemiology , Coronavirus Infections/transmission , Coronavirus Infections/virology , Evolution, Molecular , Genetic Markers/genetics , Genome, Viral/genetics , Humans , Mutation/genetics , Phylogeny , Pneumonia, Viral/epidemiology , Pneumonia, Viral/transmission , Pneumonia, Viral/virology , RNA, Viral/genetics , SARS-CoV-2 , Sequence Alignment , Sequence Analysis, RNA , Spatio-Temporal Analysis
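
The "entropy measure followed by frequency analysis" described in the abstract above can be illustrated with a short sketch. It assumes pre-aligned, equal-length sequences and simply keeps the highest-entropy columns; the published ISM pipeline may include further steps not shown here, and all function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the nucleotides at one alignment column."""
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in Counter(column).values())


def informative_sites(aligned_seqs, top_n):
    """Return indices of the top_n highest-entropy columns, in genome order."""
    entropies = [column_entropy(col) for col in zip(*aligned_seqs)]
    ranked = sorted(range(len(entropies)), key=lambda i: entropies[i], reverse=True)
    return sorted(ranked[:top_n])


def subtype_marker(seq, sites):
    """Compress a sequence to the nucleotides at the informative sites."""
    return "".join(seq[i] for i in sites)


# Toy alignment (hypothetical): only columns 1 and 4 vary.
alignment = ["ACGTA", "ACGTC", "ATGTC", "ATGTA"]
sites = informative_sites(alignment, top_n=2)
marker_counts = Counter(subtype_marker(s, sites) for s in alignment)
```

The resulting `marker_counts` is the frequency table of compact markers, which is the quantity one would then track over time and geography.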
10.
J Biomed Biotechnol ; 2011: 495849, 2011.
Article in English | MEDLINE | ID: mdl-21541181

ABSTRACT

High-throughput sequencing technologies enable metagenome profiling, the simultaneous sequencing of multiple microbial species present within an environmental sample. Since metagenomic data include sequence fragments ("reads") from organisms that are absent from any database, new algorithms must be developed for the identification and annotation of novel sequence fragments. Homology-based techniques have been modified to detect novel species and genera, but composition-based methods have not been adapted. We develop a detection technique that can discriminate between "known" and "unknown" taxa, which can be used with composition-based methods, as well as a hybrid method. Unlike previous studies, we rigorously evaluate all algorithms for their ability to detect novel taxa. First, we show that the integration of a detector with a composition-based method performs significantly better than homology-based methods for the detection of novel species and genera, with best performance at finer taxonomic resolutions. Most importantly, we evaluate all the algorithms by introducing an "unknown" class and show that the modified version of PhymmBL has similar or better overall classification performance than the other modified algorithms, especially for the species level and ultrashort reads. Finally, we evaluate the performance of several algorithms on a real acid mine drainage dataset.


Subject(s)
DNA Barcoding, Taxonomic/methods , Sequence Analysis, DNA/methods , Algorithms , Bacteria/genetics , Databases, Nucleic Acid , Genome/genetics , Metagenomics , Mining , Open Reading Frames/genetics , ROC Curve , Species Specificity , Waste Disposal, Fluid
11.
IEEE Trans Inf Technol Biomed ; 13(2): 184-94, 2009 Mar.
Article in English | MEDLINE | ID: mdl-19272861

ABSTRACT

In this paper, we present a novel method to mine, model, and evaluate a regulatory system executing cellular functions that can be represented as a biomolecular network. Our method consists of two steps. First, a novel scale-free network clustering approach is applied to such a biomolecular network to obtain various subnetworks. Second, computational models are generated for the subnetworks and simulated to predict their behavior in the cellular context. We discuss and evaluate some of the advanced computational modeling approaches, in particular, state-space modeling, probabilistic Boolean network modeling, and fuzzy logic modeling. The modeling and simulation results represent hypotheses that are tested against high-throughput biological datasets (microarrays and/or genetic screens) under normal and perturbation conditions. Experimental results on time-series gene expression data for the human cell cycle indicate that our approach is promising for subnetwork mining and simulation from large biomolecular networks.


Subject(s)
Databases, Genetic , Fuzzy Logic , Gene Regulatory Networks , Algorithms , Cluster Analysis , Computer Simulation , Humans , Markov Chains , Microarray Analysis , Models, Molecular
12.
Curr Genomics ; 10(7): 493-510, 2009 Nov.
Article in English | MEDLINE | ID: mdl-20436876

ABSTRACT

Traditionally, studies in microbial genomics have focused on single genomes from cultured species, thereby limiting their focus to the small percentage of species that can be cultured outside their natural environment. Fortunately, recent advances in high-throughput sequencing and computational analyses have ushered in the new field of metagenomics, which aims to decode the genomes of microbes from natural communities without the need for cultivation. Although metagenomic studies have provided a great deal of insight into bacterial diversity and coding capacity, several computational challenges remain due to the massive size and complexity of metagenomic sequence data. This paper reviews current tools and techniques that address challenges in 1) genomic fragment annotation, 2) phylogenetic reconstruction, 3) functional classification of samples, and 4) interpreting complementary metaproteomics and metametabolomics data. Also surveyed are important applications of metagenomic studies, including microbial forensics and the roles of microbial communities in shaping human health and soil ecology.

13.
Mol Biosyst ; 4(10): 1015-23, 2008 Oct.
Article in English | MEDLINE | ID: mdl-19082141

ABSTRACT

We describe a multi-platform (¹H NMR, LC-MS, microarray) investigation of metabolic disturbances associated with the leptin receptor defective (db/db) mouse model of type 2 diabetes using novel assignment methodologies. For the first time, several urinary metabolites were found to be associated with diabetes and/or diabetes progression and confirmed in both NMR and LC-MS datasets. The confirmed metabolites were trimethylamine-N-oxide (TMAO), creatine, carnitine, and phenylalanine. TMAO and phenylalanine were both elevated in db/db mice and decreased in these mice with age. Levels of both creatine and carnitine increased in diabetic mice with age, and creatine was also significantly decreased in db/db mice. Additionally, many metabolic markers were found by either NMR or LC-MS, but could not be found in both, due to instrumental limitations. This indicates that the combined use of NMR and LC-MS instrumentation provides complementary information that would be otherwise unattainable. Pathway analyses of urinary metabolites and liver, muscle, and adipose tissue transcripts from the db/db model were also performed to identify altered biochemical processes in the diabetic mice. Metabolite and liver transcript levels associated with the TCA cycle and steroid processes were altered in db/db mice. In addition, gene expression in muscle and liver associated with fatty acid processing was altered in the diabetic mice and similar evidence was observed in the LC-MS data. Our findings highlight the importance of a number of processes known to be associated with diabetes and reveal tissue-specific responses to the condition. When studying metabolic disorders such as diabetes, multiple-platform integrated profiling of metabolite alterations in biofluids can provide important insights into the processes underlying the disease.


Subject(s)
Diabetes Mellitus, Type 2/metabolism , Disease Models, Animal , Metabolome , Receptors, Leptin/deficiency , Animals , Diabetes Mellitus, Type 2/genetics , Magnetic Resonance Spectroscopy , Male , Mass Spectrometry , Mice , Receptors, Leptin/genetics
14.
Comput Methods Programs Biomed ; 88(1): 18-25, 2007 Oct.
Article in English | MEDLINE | ID: mdl-17707543

ABSTRACT

Microtubule dynamics play a critical role in cell function and stress response, modulating mitosis, morphology, signaling, and transport. Drugs such as paclitaxel (Taxol) can impact tubulin polymerization and affect microtubule dynamics. While theoretical methods have been previously proposed to simulate microtubule dynamics, we develop a methodology here that can be used to compare model predictions with experimental data. Our model is a hybrid of (1) a simple two-state stochastic formulation of tubulin polymerization kinetics and (2) an equilibrium approximation for the chemical kinetics of Taxol drug binding to microtubule ends. Model parameters are biologically realistic, with values taken directly from experimental measurements. Model validation is conducted against published experimental data comparing optical measurements of microtubule dynamics in cultured cells under normal and Taxol-treated conditions. Comparing model predictions with experimental data requires applying a "windowing" strategy on the spatiotemporal resolution of the simulation. From a biological perspective, this is consistent with interpreting the microtubule "pause" phenomenon as at least partially an artifact of spatiotemporal resolution limits on experimental measurement.


Subject(s)
Computer Simulation , Microtubules/drug effects , Paclitaxel/pharmacokinetics , Tubulin Modulators/pharmacokinetics , Humans , Models, Theoretical , Qualitative Research , Stochastic Processes
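
The two-state stochastic formulation and the "windowing" idea from the abstract above can be sketched as a toy simulation. This is a generic growth/shortening model with switching probabilities, not the published model: it omits the Taxol binding equilibrium, and every parameter value, unit, and function name below is a hypothetical placeholder rather than a measured quantity.

```python
import random


def simulate_microtubule(steps, dt=0.1, v_grow=0.1, v_shrink=0.3,
                         f_cat=0.05, f_res=0.1, seed=0):
    """Simulate microtubule length with a two-state (grow/shrink) model.

    f_cat and f_res are catastrophe and rescue frequencies per unit time;
    all default values are illustrative only.
    """
    rng = random.Random(seed)
    length, growing = 0.0, True
    trajectory = [length]
    for _ in range(steps):
        if growing:
            length += v_grow * dt
            if rng.random() < f_cat * dt:   # switch to shortening
                growing = False
        else:
            length = max(0.0, length - v_shrink * dt)
            if length == 0.0 or rng.random() < f_res * dt:  # switch to growth
                growing = True
        trajectory.append(length)
    return trajectory


def windowed_states(trajectory, window, threshold):
    """Classify each observation window as growth, shortening, or pause.

    A window whose net length change is below the threshold is labeled
    'pause', even though the underlying model has no pause state.
    """
    states = []
    for start in range(0, len(trajectory) - window, window):
        delta = trajectory[start + window] - trajectory[start]
        if delta > threshold:
            states.append("growth")
        elif delta < -threshold:
            states.append("shortening")
        else:
            states.append("pause")
    return states


traj = simulate_microtubule(1000)
observed = windowed_states(traj, window=20, threshold=0.05)
```

Because the simulated process only ever grows or shrinks, any "pause" label in `observed` arises purely from the coarse window and detection threshold, which mirrors the interpretation of pauses as a resolution artifact.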
15.
BMC Bioinformatics ; 8: 258, 2007 Jul 18.
Article in English | MEDLINE | ID: mdl-17640351

ABSTRACT

BACKGROUND: The functions of human cells are carried out by biomolecular networks, which include proteins, genes, and regulatory sites within DNA that encode and control protein expression. Models of biomolecular network structure and dynamics can be inferred from high-throughput measurements of gene and protein expression. We build on our previously developed fuzzy logic method for bridging quantitative and qualitative biological data to address the challenges of noisy, low resolution high-throughput measurements, i.e., from gene expression microarrays. We employ an evolutionary search algorithm to accelerate the search for hypothetical fuzzy biomolecular network models consistent with a biological data set. We also develop a method to estimate the probability of a potential network model fitting a set of data by chance. The resulting metric provides an estimate of both model quality and dataset quality, identifying data that are too noisy to identify meaningful correlations between the measured variables. RESULTS: Optimal parameters for the evolutionary search were identified based on artificial data, and the algorithm showed scalable and consistent performance for as many as 150 variables. The method was tested on previously published human cell cycle gene expression microarray data sets. The evolutionary search method was found to converge to the results of exhaustive search. The randomized evolutionary search was able to converge on a set of similar best-fitting network models on different training data sets after 30 generations running 30 models per generation. Consistent results were found regardless of which of the published data sets were used to train or verify the quantitative predictions of the best-fitting models for cell cycle gene dynamics. CONCLUSION: Our results demonstrate the capability of scalable evolutionary search for fuzzy network models to address the problem of inferring models based on complex, noisy biomolecular data sets. 
This approach yields multiple alternative models that are consistent with the data, yielding a constrained set of hypotheses that can be used to optimally design subsequent experiments.


Subject(s)
Algorithms , Fuzzy Logic , Gene Regulatory Networks , Models, Biological , Cell Cycle/genetics , Gene Expression Profiling/methods , HeLa Cells , Humans , Oligonucleotide Array Sequence Analysis/methods , Probability , Software
16.
Ageing Res Rev ; 5(4): 434-48, 2006 Nov.
Article in English | MEDLINE | ID: mdl-16904954

ABSTRACT

The aging of an organism is the result of complex changes in structure and function of molecules, cells, tissues, and whole body systems. To increase our understanding of how aging works, we have to analyze and integrate quantitative evidence from multiple levels of biological organization. Here, we define a broader conceptual framework for a quantitative, computational systems biology approach to aging. Initially, we consider fractal supply networks that give rise to scaling laws relating body mass, metabolism and lifespan. This approach provides a top-down view of constrained cellular processes. Concomitantly, multi-omics data generation builds such a framework from the bottom up, using modeling strategies to identify key pathways and their physiological capacity. Multiscale spatio-temporal representations finally connect molecular processes with structural organization. As aging manifests on a systems level, it emerges as a highly networked process regulated through feedback loops between levels of biological organization.


Subject(s)
Aging/physiology , Computational Biology , Models, Biological , Oxidative Stress/physiology , Animals , Humans
17.
Cancer Epidemiol Biomarkers Prev ; 15(5): 1000-8, 2006 May.
Article in English | MEDLINE | ID: mdl-16702383

ABSTRACT

Epidemiologic studies have revealed a complex association between human genetic variance and cancer risk. Quantitative biological modeling based on experimental data can play a critical role in interpreting the effect of genetic variation on biochemical pathways relevant to cancer development and progression. Defects in human DNA base excision repair (BER) proteins can reduce cellular tolerance to oxidative DNA base damage caused by endogenous and exogenous sources, such as exposure to toxins and ionizing radiation. If not repaired, DNA base damage leads to cell dysfunction and mutagenesis, consequently leading to cancer, disease, and aging. Population screens have identified numerous single-nucleotide polymorphism variants in many BER proteins and some have been purified and found to exhibit mild kinetic defects. Epidemiologic studies have led to conflicting conclusions on the association between single-nucleotide polymorphism variants in BER proteins and cancer risk. Using experimental data for cellular concentration and the kinetics of normal and variant BER proteins, we apply a previously developed and tested human BER pathway model to (i) estimate the effect of mild variants on BER of abasic sites and 8-oxoguanine, a prominent oxidative DNA base modification, (ii) identify ranges of variation associated with substantial BER capacity loss, and (iii) reveal nonintuitive consequences of multiple simultaneous variants. Our findings support previous work suggesting that mild BER variants have a minimal effect on pathway capacity whereas more severe defects and simultaneous variation in several BER proteins can lead to inefficient repair and potentially deleterious consequences of cellular damage.


Subject(s)
DNA Ligases/physiology; DNA Repair/physiology; Neoplasms/genetics; DNA Damage; DNA Glycosylases/physiology; DNA Ligases/genetics; Genetic Predisposition to Disease; Guanine/analogs & derivatives; Humans; Mathematics; Models, Genetic; Molecular Epidemiology; Polymorphism, Genetic
18.
J Bacteriol ; 186(18): 6298-305, 2004 Sep.
Article in English | MEDLINE | ID: mdl-15342600

ABSTRACT

DNA microarrays encompassing the entire genome of Yersinia pestis were used to characterize global regulatory changes during steady-state vegetative growth after a shift from 26 to 37 °C in the presence and absence of Ca2+. Transcriptional profiles revealed that 51, 4, and 13 genes and open reading frames (ORFs) on pCD, pPCP, and pMT, respectively, were thermoinduced and that the majority of those carried by pCD were downregulated by Ca2+. In contrast, Ca2+ had little effect on chromosomal genes and ORFs, of which 235 were thermally upregulated and 274 were thermally downregulated. The primary consequence of these regulatory events is profligate catabolism of numerous metabolites available in the mammalian host.


Subject(s)
Gene Expression Profiling; Gene Expression Regulation, Bacterial; Yersinia pestis/genetics; Adaptation, Physiological; Calcium/metabolism; Chromosomes, Bacterial; Genes, Bacterial; Oligonucleotide Array Sequence Analysis; Plasmids; Temperature
19.
BMC Bioinformatics ; 5: 108, 2004 Aug 10.
Article in English | MEDLINE | ID: mdl-15304201

ABSTRACT

BACKGROUND: Recent technological advances in high-throughput data collection allow for experimental study of increasingly complex systems on the scale of the whole cellular genome and proteome. Gene network models are needed to interpret the resulting large and complex data sets. Rationally designed perturbations (e.g., gene knock-outs) can be used to iteratively refine hypothetical models, suggesting an approach for high-throughput biological system analysis. We introduce an approach to gene network modeling based on a scalable linear variant of fuzzy logic: a framework with greater resolution than Boolean logic models, but which, while still semi-quantitative, does not require the precise parameter measurement needed for chemical kinetics-based modeling. RESULTS: We demonstrated our approach with exhaustive search for fuzzy gene interaction models that best fit transcription measurements by microarray of twelve selected genes regulating the yeast cell cycle. Applying an efficient, universally applicable data normalization and fuzzification scheme, the search converged to a small number of models that individually predict experimental data within an error tolerance. Because only gene transcription levels are used to develop the models, they include both direct and indirect regulation of genes. CONCLUSION: Biological relationships in the best-fitting fuzzy gene network models successfully recover direct and indirect interactions predicted from previous knowledge to result in transcriptional correlation. Fuzzy models fit on one yeast cell cycle data set robustly predict another experimental data set for the same system. Linear fuzzy gene networks and exhaustive rule search are the first steps towards a framework for an integrated modeling and experiment approach to high-throughput "reverse engineering" of complex biological systems.
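The idea of fitting fuzzy gene interaction rules by exhaustive search can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the scheme used in the article: min-max normalization, three piecewise-linear fuzzy states (low, medium, high), a single regulator per target, and the helper names `normalize`, `fuzzify`, `apply_rule`, and `best_rules` are all hypothetical.

```python
from itertools import product

# Assumed centers of the three fuzzy states: low, medium, high.
CENTERS = (0.0, 0.5, 1.0)

def normalize(values):
    """Min-max scale a series of expression values to [0, 1] (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

def fuzzify(x):
    """Map x in [0, 1] to (low, medium, high) memberships that sum to 1."""
    low = max(0.0, 1.0 - 2.0 * x)
    high = max(0.0, 2.0 * x - 1.0)
    return (low, 1.0 - low - high, high)

def defuzzify(memberships):
    """Collapse memberships back to a number as a weighted sum of centers."""
    return sum(w * c for w, c in zip(memberships, CENTERS))

def apply_rule(rule, memberships):
    """rule[i] is the output state assigned to input state i (0, 1, or 2)."""
    out = [0.0, 0.0, 0.0]
    for in_state, out_state in enumerate(rule):
        out[out_state] += memberships[in_state]
    return tuple(out)

def fit_error(rule, regulator, target):
    """Sum of squared errors between predicted and observed target levels."""
    return sum(
        (defuzzify(apply_rule(rule, fuzzify(r))) - t) ** 2
        for r, t in zip(regulator, target)
    )

def best_rules(regulator, target, tolerance):
    """Try all 3**3 rule tables; keep those fitting within the tolerance."""
    r, t = normalize(regulator), normalize(target)
    scored = [(fit_error(rule, r, t), rule) for rule in product(range(3), repeat=3)]
    return sorted(item for item in scored if item[0] <= tolerance)

if __name__ == "__main__":
    # Toy data in which the target falls as the regulator rises.
    print(best_rules([1, 2, 3, 4], [8, 6, 4, 2], tolerance=1e-9))
```

With one regulator the search space is only 27 rule tables. Allowing several regulators per gene multiplies the number of candidate rules quickly, which is presumably why the abstract stresses a scalable linear variant and an error tolerance that narrows the result to a small number of models.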


Subject(s)
Artificial Intelligence; Fuzzy Logic; Models, Genetic; Oligonucleotide Array Sequence Analysis/statistics & numerical data; Cell Cycle/genetics; Computational Biology/statistics & numerical data; Linear Models; Saccharomyces cerevisiae/genetics
20.
Free Radic Biol Med ; 37(3): 422-7, 2004 Aug 01.
Article in English | MEDLINE | ID: mdl-15223076

ABSTRACT

Human DNA can be damaged by natural metabolism through free radical production. It has been suggested that the equilibrium between innate damage and cellular DNA repair results in an oxidative DNA damage background that potentially contributes to disease and aging. Efforts to quantitatively characterize the human oxidative DNA damage background level, based on measuring 8-oxoguanine lesions as a biomarker, have led to estimates that vary over three to four orders of magnitude, depending on the method of measurement. We applied a previously developed and validated quantitative pathway model of human DNA base excision repair, integrating experimentally determined endogenous damage rates and model parameters from multiple sources. Our estimates of at most 100 8-oxoguanine lesions per cell are consistent with the low end of data from biochemical and cell biology experiments, a result robust to model limitations and parameter variation. Our findings show the power of quantitative system modeling to interpret composite experimental data and make biologically and physiologically relevant predictions for complex human DNA repair pathway mechanisms and capacity.
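The balance between damage and repair described above can be written as a one-line worked example. This is a deliberately simplified single-compartment sketch, not the multi-step pathway model used in the article. The symbols are assumptions introduced here: L(t) is the number of lesions per cell, k_d a constant endogenous damage rate, and k_r an effective first-order repair rate constant.

```latex
\frac{dL}{dt} = k_d - k_r L
\qquad\Longrightarrow\qquad
L^{*} = \frac{k_d}{k_r} \quad \text{at steady state, where } \frac{dL}{dt} = 0 .
```

Under these assumptions the background lesion level is set by the ratio of the two rates, so a measured damage rate together with a modeled repair capacity bounds the steady-state lesion count.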


Subject(s)
DNA Damage; DNA Repair/physiology; Models, Biological; Oxidative Stress/physiology; Free Radicals/metabolism; Humans; Oxidation-Reduction