Results 1 - 20 of 33
1.
Nucleic Acids Res ; 50(1): 46-56, 2022 01 11.
Article in English | MEDLINE | ID: mdl-34850940

ABSTRACT

Clustering cells and depicting the lineage relationship among cell subpopulations are fundamental tasks in single-cell omics studies. However, existing analytical methods face challenges in stratifying cells, tracking cellular trajectories, and identifying critical points of cell transitions. To overcome these challenges, we proposed a novel Markov hierarchical clustering algorithm (MarkovHC), a topological clustering method that leverages the metastability of exponentially perturbed Markov chains to systematically reconstruct the cellular landscape. Briefly, MarkovHC starts with local connectivity and density derived from the input and outputs a hierarchical structure for the data. We first benchmarked MarkovHC on five simulated datasets and ten public single-cell datasets with known labels. Then, we used MarkovHC to investigate the multi-level architectures and transition processes during human embryo preimplantation development and gastric cancer progression. MarkovHC found heterogeneous cell states and sub-cell types in lineage-specific progenitor cells and revealed the most probable transition paths and critical points in the cellular processes. These results demonstrated MarkovHC's effectiveness in facilitating the stratification of cells, identification of cell populations, and characterization of cellular trajectories and critical points.


Subject(s)
Computational Biology/methods , Single-Cell Analysis/methods , Blastocyst/cytology , Blastocyst/metabolism , Carcinogenesis/genetics , Carcinogenesis/metabolism , Cell Lineage , Humans , Markov Chains
2.
J Theor Biol ; 532: 110923, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34606876

ABSTRACT

Dynamic models of gene expression are urgently required. In this paper, we describe the time evolution of gene expression by learning a jump diffusion process to model the biological process directly. Our algorithm needs aggregate gene expression data as input and outputs the parameters of the jump diffusion process. The learned jump diffusion process can predict population distributions of gene expression at any developmental stage, obtain long-time trajectories for individual cells, and offer a novel approach to computing RNA velocity. Moreover, it studies biological systems from a stochastic dynamic perspective. Gene expression data at a time point, which is a snapshot of a cellular process, is treated as an empirical marginal distribution of a stochastic process. The Wasserstein distance between the empirical distribution and predicted distribution by the jump diffusion process is minimized to learn the dynamics. For the learned jump diffusion process, its trajectories correspond to the development process of cells, the stochasticity determines the heterogeneity of cells, its instantaneous rate of state change can be taken as "RNA velocity", and the changes in scales and orientations of clusters can be noticed too. We demonstrate that our method can recover the underlying nonlinear dynamics better compared to previous parametric models and the diffusion processes driven by Brownian motion for both synthetic and real world datasets. Our method is also robust to perturbations of data because the computation involves only population expectations.


Subject(s)
Models, Biological , Nonlinear Dynamics , Diffusion , Gene Expression , Stochastic Processes
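
The abstract above learns the dynamics by minimizing the Wasserstein distance between an empirical distribution and a predicted one. As a minimal sketch of that quantity only, assuming one-dimensional samples of equal size with uniform weights (the function name and the sample values are hypothetical and not taken from the paper), the Wasserstein-1 distance reduces to the mean absolute difference between sorted values:

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-sized 1-D empirical samples."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    # With uniform weights in 1-D, the optimal coupling pairs sorted values in order.
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


# Hypothetical values: observed expression at one time point vs. model samples.
observed = [0.0, 1.0, 3.0]
predicted = [1.0, 2.0, 4.0]
distance = wasserstein_1d(observed, predicted)
```

The paper's setting is multivariate gene expression, where the distance generally has no such closed form; this sketch only illustrates the loss being minimized.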
3.
PLoS One ; 13(4): e0196226, 2018.
Article in English | MEDLINE | ID: mdl-29702671

ABSTRACT

Copy number variations (CNVs) are gains and losses of DNA sequence in a genome. High-throughput platforms such as microarrays and next-generation sequencing (NGS) technologies have been applied to detect genome-wide copy number losses. Although progress has been made in both approaches, the accuracy and consistency of CNV calling from the two platforms remain in dispute. In this study, we perform a deep analysis of copy number losses in 254 human DNA samples that have both SNP microarray data and NGS data publicly available from the HapMap Project and the 1000 Genomes Project, respectively. We show that the copy number losses reported by the HapMap Project and the 1000 Genomes Project overlap by < 30%, even though both projects required cross-platform experimental support (e.g. PCR, microarray and high-throughput sequencing) for their reports and employed state-of-the-art calling methods. On the other hand, almost all of the copy number losses found directly from HapMap microarray data by an accurate algorithm, CNVhac, have lower read mapping depth in NGS data; furthermore, 88% of them are supported by sequences with breakpoints in NGS data. Our results suggest that microarrays are capable of calling CNVs and that the unnecessary requirement of additional cross-platform support may introduce false negatives. The inconsistency of CNV reports from the HapMap Project and the 1000 Genomes Project might result from inadequate information contained in microarray data, inconsistent detection criteria, or the filtration effect of cross-platform support. Statistical tests on CNVs called by CNVhac show that microarray data can offer reliable CNV reports and that the majority of CNV candidates can be confirmed by raw sequences. Therefore, the CNV candidates given by a good caller can be highly reliable without cross-platform support, so additional experimental information should be applied when needed rather than as a requirement.


Subject(s)
Computational Biology/methods , DNA Copy Number Variations , Polymorphism, Single Nucleotide , Sequence Analysis, DNA/methods , Algorithms , Genome, Human , HapMap Project , High-Throughput Nucleotide Sequencing/methods , Human Genome Project , Humans , Oligonucleotide Array Sequence Analysis/methods
4.
Nat Commun ; 8(1): 1622, 2017 11 20.
Article in English | MEDLINE | ID: mdl-29158486

ABSTRACT

In human cells, DNA is hierarchically organized and assembled with histones and DNA-binding proteins in three dimensions. Chromatin interactions play important roles in genome architecture and gene regulation, including robustness in the developmental stages and flexibility during the cell cycle. Here we propose an in situ Hi-C method named Bridge Linker-Hi-C (BL-Hi-C) for capturing structural and regulatory chromatin interactions by restriction enzyme targeting and two-step proximity ligation. This method improves the sensitivity and specificity of active chromatin loop detection and can reveal the regulatory enhancer-promoter architecture better than conventional methods at a lower sequencing depth and with a simpler protocol. We demonstrate its utility with two well-studied developmental loci: the beta-globin and HOXC cluster regions.


Subject(s)
Chromatin/chemistry , Chromatin/metabolism , High-Throughput Screening Assays/trends , Cell Line, Tumor , Chromatin/genetics , Chromosomes/chemistry , Chromosomes/genetics , DNA/genetics , DNA/metabolism , Gene Expression Regulation , Histones/metabolism , Humans , Protein Binding , Regulatory Sequences, Nucleic Acid
5.
PLoS One ; 11(5): e0155838, 2016.
Article in English | MEDLINE | ID: mdl-27195482

ABSTRACT

Adaptation is a crucial biological function possessed by many sensory systems. Early work has shown that some influential equilibrium models can achieve accurate adaptation. However, recent studies indicate that there are close relationships between adaptation and nonequilibrium. In this paper, we provide an explanation of these two seemingly contradictory results based on Markov models with relatively simple networks. We show that as the nonequilibrium driving becomes stronger, the system under consideration will undergo a phase transition along a fixed direction: from non-adaptation to simple adaptation then to oscillatory adaptation, while the transition in the opposite direction is forbidden. This indicates that although adaptation may be observed in equilibrium systems, it tends to occur in systems far away from equilibrium. In addition, we find that nonequilibrium will improve the performance of adaptation by enhancing the adaptation efficiency. All these results provide a deeper insight into the connection between adaptation and nonequilibrium. Finally, we use a more complicated network model of bacterial chemotaxis to validate the main results of this paper.


Subject(s)
Adaptation, Biological , Algorithms , Bacterial Physiological Phenomena , Chemotaxis , Environment , Escherichia coli/metabolism , Markov Chains , Models, Biological , Models, Statistical , Oscillometry , Stochastic Processes , Temperature , Thermodynamics
6.
BMC Med Genomics ; 8 Suppl 2: S14, 2015.
Article in English | MEDLINE | ID: mdl-26044773

ABSTRACT

BACKGROUND: RNA-Seq is a powerful new technology to comprehensively analyze the transcriptome of any given cells. An important task in RNA-Seq data analysis is quantifying the expression levels of all transcripts. Although many methods have been introduced and much progress has been made, a satisfactory solution remains elusive. RESULTS: In this article, we borrow the idea of the Positional Dependent Nearest Neighborhood (PDNN) model, originally developed for analyzing microarray data, to model the non-uniformity of read distribution in RNA-Seq data. We propose a robust nonlinear regression model named PDEGEM, a Positional Dependent Energy Guided Expression Model, to estimate the abundance of transcripts. We find that PDEGEM fits the data better than mseq in all three real datasets we tested. We also find that the expression measure obtained using PDEGEM shows higher correlation with that obtained from alternative assays for quantifying gene and isoform expression. CONCLUSIONS: Based on these results, we believe that PDEGEM can improve the accuracy of modeling and estimating transcript abundance and isoform expression in RNA-Seq data. Additionally, although the stacking energy and positional weight of PDEGEM depend to some extent on sequencing platform and species, they share some common trends, which indicates that PDEGEM could partly reflect the mechanism of DNA binding between the template strand and the newly synthesized read.


Subject(s)
Algorithms , Databases, Genetic , Models, Statistical , Sequence Analysis, RNA/methods , Animals , Humans , Mice , RNA, Messenger/genetics , RNA, Messenger/metabolism , Thermodynamics
8.
mBio ; 5(6): e01867, 2014 Nov 25.
Article in English | MEDLINE | ID: mdl-25425232

ABSTRACT

The prokaryotic pangenome partitions genes into core and dispensable genes. The order of core genes, albeit assumed to be stable under selection in general, is frequently interrupted by horizontal gene transfer and rearrangement, but how a core-gene-defined genome maintains its stability or flexibility remains to be investigated. Based on data from 30 species, including 425 genomes from six phyla, we grouped core genes into syntenic blocks in the context of a pangenome according to their stability across multiple isolates. A subset of the core genes, often species specific and lineage associated, formed a core-gene-defined genome organizational framework (cGOF). Such cGOFs are either single segmental (one-third of the species analyzed) or multisegmental (the rest). Multisegment cGOFs were further classified into symmetric or asymmetric according to segment orientations toward the origin-terminus axis. The cGOFs in Gram-positive species are exclusively symmetric and often reversible in orientation, as opposed to those of the Gram-negative bacteria, which are all asymmetric and irreversible. Meanwhile, all species showing strong strand-biased gene distribution contain symmetric cGOFs and often specific DnaE (α subunit of DNA polymerase III) isoforms. Furthermore, functional evaluations revealed that cGOF genes are hub associated with regard to cellular activities, and the stability of cGOF provides efficient indexes for scaffold orientation as demonstrated by assembling virtual and empirical genome drafts. cGOFs show species specificity, and the symmetry of multisegmental cGOFs is conserved among taxa and constrained by DNA polymerase-centric strand-biased gene distribution. The definition of species-specific cGOFs provides powerful guidance for genome assembly and other structure-based analysis. IMPORTANCE: Prokaryotic genomes are frequently interrupted by horizontal gene transfer (HGT) and rearrangement. It is important to know whether there is a set of genes that is not only conserved in position among isolates but also functionally essential for a given species, and to further evaluate the stability or flexibility of such genome structures across lineages. Based on a large body of multi-isolate pangenomic data, our analysis reveals that a subset of core genes is organized into a core-gene-defined genome organizational framework, or cGOF. Furthermore, the lineage-associated cGOFs among Gram-positive and Gram-negative bacteria behave differently: the former, composed of 2 to 4 segments, have their fragments symmetrically rearranged around the origin-terminus axis, whereas the latter show more complex segmentation and are partitioned asymmetrically into chromosomal structures. The definition of cGOFs provides new insights into prokaryotic genome organization and efficient guidance for genome assembly and analysis.


Subject(s)
Archaea/genetics , Bacteria/genetics , Genes, Essential , Genomic Structural Variation , Computational Biology , Gene Rearrangement , Genome, Archaeal , Genome, Bacterial , Genomic Instability , Synteny
9.
Phys Biol ; 11(5): 056001, 2014 Aug 14.
Article in English | MEDLINE | ID: mdl-25118617

ABSTRACT

The inositol trisphosphate receptor (IPR) is a crucial ion channel that regulates the Ca(2+) influx from the endoplasmic reticulum (ER) to the cytoplasm. A thorough study of the IPR channel contributes to a better understanding of calcium oscillations and waves. It has long been observed that the IPR channel is a typical biological system which performs adaptation. However, recent advances on the physical essence of adaptation show that adaptation systems with a negative feedback mechanism, such as the IPR channel, must break detailed balance and always operate out of equilibrium with energy dissipation. Almost all previous IPR models are equilibrium models assuming detailed balance and thus violate the dissipative nature of adaptation. In this article, we constructed a nonequilibrium allosteric model of single IPR channels based on the patch-clamp experimental data obtained from the IPR in the outer membranes of isolated nuclei of the Xenopus oocyte. It turns out that our model reproduces the patch-clamp experimental data reasonably well and produces both the correct steady-state and dynamic properties of the channel. Particularly, our model successfully describes the complicated bimodal [Ca(2+)] dependence of the mean open duration at high [IP3], a steady-state behavior which fails to be correctly described in previous IPR models. Finally, we used the patch-clamp experimental data to validate that the IPR channel indeed breaks detailed balance and thus is a nonequilibrium system which consumes energy.


Subject(s)
Calcium/physiology , Inositol 1,4,5-Trisphosphate Receptors/chemistry , Models, Biological , Allosteric Site , Animals , Cell Nucleus/chemistry , Computer Simulation , Female , Inositol 1,4,5-Trisphosphate Receptors/physiology , Oocytes/physiology , Patch-Clamp Techniques , Protein Subunits/chemistry , Protein Subunits/physiology , Xenopus
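
The abstract above turns on whether a Markov model satisfies detailed balance. One standard way to test this is Kolmogorov's cycle criterion: along any cycle of states, the product of transition rates in one direction must equal the product in the reverse direction. The sketch below checks a single cycle. The three-state rate tables are hypothetical illustrations and are not the IPR model from the paper.

```python
import math


def satisfies_detailed_balance(rates, cycle, rel_tol=1e-9):
    """Check Kolmogorov's criterion on one cycle of states.

    rates[(i, j)] is the transition rate from state i to state j;
    cycle is a list of states visited in order, e.g. [0, 1, 2].
    """
    forward = 1.0
    backward = 1.0
    # Pair each state with its successor, wrapping around to close the cycle.
    for i, j in zip(cycle, cycle[1:] + cycle[:1]):
        forward *= rates[(i, j)]
        backward *= rates[(j, i)]
    return math.isclose(forward, backward, rel_tol=rel_tol)


# Hypothetical rates: forward product 2*3*1 equals backward product 1*6*1.
equilibrium_rates = {(0, 1): 2.0, (1, 0): 1.0, (1, 2): 3.0,
                     (2, 1): 6.0, (2, 0): 1.0, (0, 2): 1.0}
# Same table with one rate changed, which creates a net flux around the cycle.
driven_rates = dict(equilibrium_rates)
driven_rates[(0, 1)] = 4.0

equilibrium_ok = satisfies_detailed_balance(equilibrium_rates, [0, 1, 2])
driven_ok = satisfies_detailed_balance(driven_rates, [0, 1, 2])
```

For a fully connected three-state chain this single cycle is sufficient; larger models would need the check repeated over a set of independent cycles.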
10.
IET Syst Biol ; 8(3): 87-95, 2014 Jun.
Article in English | MEDLINE | ID: mdl-25014375

ABSTRACT

Discovering the regulation of cancer-related genes is of great importance in cancer biology. Transcription factors and microRNAs are two kinds of crucial regulators of gene expression, and together with their target genes they compose a combinatorial regulatory network. Revealing the structure of this network could improve understanding of gene regulation and help explore molecular pathways in cancer. In this article, the authors propose a novel approach, graphical adaptive lasso (GALASSO), to construct the regulatory network in breast cancer. GALASSO uses a Gaussian graphical model with adaptive lasso penalties to integrate sequence information with gene expression profiles. A simulation study and experimental profiles verify the accuracy of the authors' approach. The authors further reveal the structure of the regulatory network, and explore the role of feedforward loops in gene regulation. In addition, the authors discuss the combinatorial regulatory effect between transcription factors and microRNAs, and select miR-155 for detailed analysis of microRNA's role in cancer. The proposed GALASSO approach is an efficient method to construct the combinatorial regulatory network. It also provides a new way to integrate different data sources and could find more applications in meta-analysis problems.


Subject(s)
Breast Neoplasms/genetics , MicroRNAs/genetics , Transcription Factors/metabolism , Algorithms , Breast Neoplasms/metabolism , Computational Biology/methods , Computer Graphics , Computer Simulation , Female , Gene Expression Profiling , Gene Expression Regulation , Gene Expression Regulation, Neoplastic , Gene Regulatory Networks , Humans , MicroRNAs/metabolism , Neoplasms/genetics , Normal Distribution , ROC Curve
11.
IET Syst Biol ; 8(4): 138-45, 2014 Aug.
Article in English | MEDLINE | ID: mdl-25075526

ABSTRACT

A number of biological systems can be modelled by Markov chains. Recently, there has been an increasing concern about when biological systems modelled by Markov chains will perform a dynamic phenomenon called overshoot. In this study, the authors found that the steady-state behaviour of the system will have a great effect on the occurrence of overshoot. They showed that overshoot in general cannot occur in systems that will finally approach an equilibrium steady state. They further classified overshoot into two types, named as simple overshoot and oscillating overshoot. They showed that except for extreme cases, oscillating overshoot will occur if the system is far from equilibrium. All these results clearly show that overshoot is a non-equilibrium dynamic phenomenon with energy consumption. In addition, the main result in this study is validated with real experimental data.


Subject(s)
Breast Neoplasms/physiopathology , Cell Transformation, Neoplastic/metabolism , Energy Metabolism , Markov Chains , Models, Biological , Models, Statistical , Neoplastic Stem Cells/physiology , Breast Neoplasms/pathology , Cell Proliferation , Cell Transformation, Neoplastic/pathology , Computer Simulation , Humans , Neoplastic Stem Cells/pathology , Thermodynamics , Tumor Cells, Cultured
12.
Nucleic Acids Res ; 42(5): 3009-16, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24343027

ABSTRACT

DNA methylation is an important defense and regulatory mechanism. In mammals, most DNA methylation occurs at CpG sites, and asymmetric non-CpG methylation has only been detected at appreciable levels in a few cell types. We are the first to systematically study the strand-specific distribution of non-CpG methylation. With the divide-and-compare strategy, we show that CHG and CHH methylation are not intrinsically different in human embryonic stem cells (ESCs) and induced pluripotent stem cells (iPSCs). We also find that non-CpG methylation is skewed between the two strands in introns, especially at intron boundaries and in highly expressed genes. Controlling for the proximal sequences of non-CpG sites, we show that the skew of non-CpG methylation in introns is mainly guided by sequence skew. By studying subgroups of transposable elements, we also found that non-CpG methylation is distributed in a strand-specific manner in both short interspersed nuclear elements (SINE) and long interspersed nuclear elements (LINE), but not in long terminal repeats (LTR). Finally, we show that on the antisense strand of Alus, a non-CpG site just downstream of the A-box is highly methylated. Together, the divide-and-compare strategy leads us to identify regions with strand-specific distributions of non-CpG methylation in humans.


Subject(s)
DNA Methylation , Pluripotent Stem Cells/metabolism , Cell Line , CpG Islands , Humans , Introns , Long Interspersed Nucleotide Elements , Sequence Analysis, DNA , Short Interspersed Nucleotide Elements , Terminal Repeat Sequences , Transcription, Genetic
13.
BMC Genomics ; 14: 31, 2013 Jan 16.
Article in English | MEDLINE | ID: mdl-23324182

ABSTRACT

BACKGROUND: Microarray technology is widely utilized for monitoring the expression changes of thousands of genes simultaneously. However, the requirement of a relatively large amount of RNA for labeling and hybridization makes it difficult to perform microarray experiments with limited biological materials, which has led to the development of many methods for preparing and amplifying mRNA. It has been noted that amplification methods usually introduce bias, which may strongly hamper the subsequent interpretation of the results. A big challenge is how to correct for the bias before further analysis. RESULTS: In this article, we observed the bias in rice gene expression microarray data generated with the Affymetrix one-cycle and two-cycle RNA labeling protocols, followed by validation with Real Time PCR. Based on these data, we proposed a statistical framework to model the processes of mRNA two-cycle linear amplification, and established a linear model for probe level correction. Maximum Likelihood Estimation (MLE) was applied to perform robust estimation of the Retaining Rate for each probe. After bias correction, known pre-processing methods, such as PDNN, could be applied to finish preprocessing. We then evaluated our model, and the results suggest that it can effectively increase the quality of the microarray raw data: (i) it decreases the Coefficient of Variation for PM intensities of probe sets; (ii) it distinguishes the microarray samples of five stages of rice stamen development more clearly; (iii) it improves the correlation coefficients among stamen microarray samples. We also discussed the necessity of model adjustment by comparing with another simple adjustment method. CONCLUSION: We conclude that the adjustment model is necessary and could effectively increase the quality of estimation for gene expression from the microarray raw data.


Subject(s)
Oligonucleotide Array Sequence Analysis/methods , RNA/genetics , RNA/metabolism , Flowers/genetics , Gene Expression Profiling , Models, Statistical , Oryza/genetics , RNA/analysis , RNA, Messenger/genetics , RNA, Messenger/metabolism , Real-Time Polymerase Chain Reaction , Staining and Labeling
14.
Quant Biol ; 1(3): 201-208, 2013 Sep.
Article in English | MEDLINE | ID: mdl-26085954

ABSTRACT

Cancer stem cell (CSC) theory suggests a cell-lineage structure in tumor cells in which CSCs are capable of giving rise to the other non-stem cancer cells (NSCCs) but not vice versa. However, an alternative scenario of bidirectional interconversions between CSCs and NSCCs was proposed very recently. Here we present a general population model of cancer cells by integrating conventional cell divisions with direct conversions between different cell states, namely, not only can CSCs differentiate into NSCCs by asymmetric cell division, NSCCs can also dedifferentiate into CSCs by cell state conversion. Our theoretical model is validated when applying the model to recent experimental data. It is also found that the transient increase in CSCs proportion initiated from the purified NSCCs subpopulation cannot be well predicted by the conventional CSC model where the conversion from NSCCs to CSCs is forbidden, implying that the cell state conversion is required especially for the transient dynamics. The theoretical analysis also gives the condition such that our general model can be equivalently reduced into a simple Markov chain with only cell state transitions keeping the same cell proportion dynamics.
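
The abstract above states that, under a certain condition, the general model reduces to a simple Markov chain with only cell state transitions. A minimal sketch of such a reduced two-state chain follows, assuming discrete time steps and hypothetical per-step conversion probabilities (none of the numbers come from the paper). It contrasts bidirectional conversion with the conventional one-way case when starting from a purified NSCC population:

```python
def csc_fraction_over_time(csc_to_nscc, nscc_to_csc, csc_start, steps):
    """Iterate a two-state Markov chain and return the CSC fraction at each step."""
    fractions = [csc_start]
    csc = csc_start
    for _ in range(steps):
        # CSCs that remain CSCs, plus NSCCs that convert back into CSCs.
        csc = csc * (1.0 - csc_to_nscc) + (1.0 - csc) * nscc_to_csc
        fractions.append(csc)
    return fractions


# Hypothetical probabilities; both runs start from 100% NSCCs (CSC fraction 0).
bidirectional = csc_fraction_over_time(0.1, 0.05, csc_start=0.0, steps=50)
one_way = csc_fraction_over_time(0.1, 0.0, csc_start=0.0, steps=50)
```

With the reverse conversion allowed, the CSC fraction rises from zero toward the stationary value nscc_to_csc / (csc_to_nscc + nscc_to_csc). With it forbidden, the fraction stays at zero, which is why a one-way model cannot reproduce an increase in CSC proportion from purified NSCCs.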

15.
PLoS One ; 7(6): e38743, 2012.
Article in English | MEDLINE | ID: mdl-22719933

ABSTRACT

Etiologic diagnoses of lower respiratory tract infections (LRTI) have relied primarily on bacterial cultures, which often fail to return useful results in time. Although DNA-based assays are more sensitive than bacterial cultures in detecting pathogens, the molecular results are often inconsistent and open to doubts about false positives, such as those due to system- and environment-derived contamination. Here we report a nationwide cohort study of 2986 suspected LRTI patients across P. R. China. We compared the performance of a DNA-based assay, qLAMP (quantitative Loop-mediated isothermal AMPlification), with that of standard bacterial cultures in detecting a panel of eight common respiratory bacterial pathogens from sputum samples. Our qLAMP assay detected the panel of pathogens in 1047 (69.28%) of the 1533 patients who ultimately qualified. We found that the bacterial titer quantified by qLAMP is a predictor of the probability that the bacterium in the sample can be detected by culture assay. The relationship between the two assays fits a logistic regression curve. We used a piecewise linear function to define breakpoints where a latent pathogen abruptly changes its competitive relationship with others in the panel. These breakpoints, where pathogens start to propagate abnormally, are used as cutoffs to eliminate the influence of contamination from normal flora. With the help of the cutoffs derived from statistical analysis, we were able to identify causative pathogens in 750 (48.92%) of the qualified patients. In conclusion, qLAMP is a reliable method for quantifying bacterial titer. Although sputum samples are always contaminated with latent bacteria, we can identify causative pathogens based on cutoffs derived from statistical analysis of competitive relationships. TRIAL REGISTRATION: ClinicalTrials.gov NCT00567827.


Subject(s)
Bacterial Infections/diagnosis , Respiratory Tract Infections/diagnosis , Sputum/microbiology , Bacterial Infections/microbiology , Humans , Probability , Respiratory Tract Infections/microbiology
16.
BMC Med Genomics ; 5: 24, 2012 Jun 12.
Article in English | MEDLINE | ID: mdl-22691279

ABSTRACT

BACKGROUND: Copy number variation (CNV) is essential to understand the pathology of many complex diseases at the DNA level. Affymetrix SNP arrays, which are widely used for CNV studies, significantly depend on accurate copy number (CN) estimation. Nevertheless, CN estimation may be biased by several factors, including cross-hybridization and training sample batch, as well as genomic waves of intensities induced by sequence-dependent hybridization rate and amplification efficiency. Since many available algorithms only address one or two of the three factors, a high false discovery rate (FDR) often results when identifying CNV. Therefore, we have developed a new CNV detection pipeline which is based on hybridization and amplification rate correction (CNVhac). METHODS: CNVhac first estimates the allelic concentrations (ACs) of target sequences by using the sample independent parameters trained through physicochemical hybridization law. Then the raw CN is estimated by taking the ratio of AC to the corresponding average AC from a reference sample set for one specific site. Finally, a hidden Markov model (HMM) segmentation process is implemented to detect CNV regions. RESULTS: Based on public HapMap data, the results show that CNVhac effectively smoothes the genomic waves and facilitates more accurate raw CN estimates compared to other methods. Moreover, CNVhac alleviates, to a certain extent, the sample dependence of inference and makes CNV calling with appreciable low FDRs. CONCLUSION: CNVhac is an effective approach to address the common difficulties in SNP array analysis, and the working principles of CNVhac can be easily extended to other platforms.


Subject(s)
Nucleic Acid Amplification Techniques/methods , Nucleic Acid Hybridization/methods , Oligonucleotide Array Sequence Analysis/methods , Polymorphism, Single Nucleotide/genetics , Artifacts , Calibration , DNA Copy Number Variations/genetics , Female , HapMap Project , Humans , Male
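
The METHODS section of the abstract above describes the raw copy number estimate as the ratio of a sample's allelic concentration (AC) to the average AC of a reference sample set at the same site. A minimal sketch of that one step, with a hypothetical function name and made-up AC values (the AC estimation and the HMM segmentation are not shown, and any further scaling used by CNVhac is not specified in the abstract):

```python
from statistics import mean


def raw_copy_number_ratio(sample_ac, reference_acs):
    """Ratio of one sample's AC to the mean AC of a reference set at one site."""
    reference_mean = mean(reference_acs)
    if reference_mean <= 0:
        raise ValueError("reference mean AC must be positive")
    return sample_ac / reference_mean


# Hypothetical values: the sample shows half the reference concentration.
ratio = raw_copy_number_ratio(1.0, [1.5, 2.0, 2.5])
```

A ratio well below 1 at consecutive sites would be consistent with a copy number loss relative to the reference set; the segmentation step is what turns such per-site ratios into called regions.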
17.
PLoS One ; 7(3): e33160, 2012.
Article in English | MEDLINE | ID: mdl-22432002

ABSTRACT

Phosphorylation and transcriptional regulation events are critical for cells to transmit and respond to signals. In spite of their importance, systems-level strategies that couple these two networks have yet to be presented. Here we introduce a novel approach that integrates the physical and functional aspects of the phosphorylation network together with the transcription network in S. cerevisiae, and demonstrate that different network motifs are involved in these networks, which should be considered in interpreting and integrating large-scale datasets. Based on this understanding, we introduce a HeRS score (hetero-regulatory similarity score) to systematically characterize the functional relevance of kinase/phosphatase involvement with transcription factors, and present an algorithm that predicts hetero-regulatory modules. When extended to the signaling network, this approach confirmed the structure and cross talk of MAPK pathways, inferred a novel functional transcription factor, Sok2, in the high osmolarity glycerol pathway, and explained the mechanism of reduced mating efficiency upon Fus3 deletion. This strategy is applicable to other organisms as large-scale datasets become available, providing a means to identify the functional relationships between kinases/phosphatases and transcription factors.


Subject(s)
Gene Regulatory Networks/genetics , Signal Transduction/genetics , Databases, Genetic , Feedback, Physiological , Gene Expression Regulation , Nucleotide Motifs/genetics , Phosphorylation , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/metabolism
18.
J Theor Biol ; 296: 13-20, 2012 Mar 07.
Article in English | MEDLINE | ID: mdl-22100501

ABSTRACT

In this paper, we perform a complete analysis of the kinetic behavior of the general modifier mechanism of Botts and Morales in both equilibrium steady states and non-equilibrium steady states (NESS). Enlightened by the non-equilibrium theory of Markov chains, we introduce the net flux into discussion and acquire an expression of the rate of product formation in NESS, which has clear biophysical significance. Up till now, it is a general belief that being an activator or an inhibitor is an intrinsic property of the modifier. However, we reveal that this traditional point of view is based on the equilibrium assumption. A modifier may no longer be an overall activator or inhibitor when the reaction system is not in equilibrium. Based on the regulation of enzyme activity by the modifier concentration, we classify the kinetic behavior of the modifier into three categories, which are named hyperbolic behavior, bell-shaped behavior, and switching behavior, respectively. We show that the switching phenomenon, in which a modifier may convert between an activator and an inhibitor when the modifier concentration varies, occurs only in NESS. Effects of drugs on the Pgp ATPase activity, where drugs may convert from activators to inhibitors with the increase of the drug concentration, are taken as a typical example to demonstrate the occurrence of the switching phenomenon.


Subject(s)
Catalysis , Enzyme Activation , Enzyme Inhibitors/chemistry , Models, Chemical , ATP Binding Cassette Transporter, Subfamily B, Member 1/chemistry , Adenosine Triphosphatases/chemistry , Enzymes/chemistry
19.
Front Biosci (Elite Ed) ; 4(6): 2150-61, 2012 01 01.
Article in English | MEDLINE | ID: mdl-22202027

ABSTRACT

The goal of network clustering algorithms is to detect dense clusters in a network, providing a first step towards understanding large-scale biological networks. With numerous recent advances in biotechnologies, large-scale genetic interaction data are widely available, but there is limited understanding of which clustering algorithms may be most effective. To address this problem, we conducted a systematic study to compare and evaluate six clustering algorithms in analyzing genetic interaction networks, and investigated the factors that influence the choice of algorithm. The algorithms considered in this comparison include hierarchical clustering, topological overlap matrix, bi-clustering, Markov clustering, Bayesian discriminant analysis based community detection, and the variational Bayes approach to modularity. Both experimentally identified and synthetically constructed networks were used in this comparison. The accuracy of the algorithms is measured by the Jaccard index, comparing predicted gene modules with benchmark gene sets. The results suggest that the best choice differs according to the network topology and evaluation criteria. Hierarchical clustering proved best at predicting protein complexes; Bayesian discriminant analysis based community detection proved best on epistatic miniarray profile (EMAP) datasets; the variational Bayes approach to modularity was noticeably better than the other algorithms on the genome-scale networks.
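
The Jaccard index mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how predicted modules were paired with benchmark gene sets, so the best-match pairing in `best_match_scores` is an assumption made here for illustration, and the function names and gene identifiers are hypothetical.

```python
def jaccard_index(predicted, benchmark):
    """Jaccard index |A ∩ B| / |A ∪ B| between two gene collections."""
    a, b = set(predicted), set(benchmark)
    union = a | b
    if not union:
        # Convention chosen here: two empty sets score 0.0.
        return 0.0
    return len(a & b) / len(union)


def best_match_scores(modules, benchmarks):
    """Score each predicted module by its best-overlapping benchmark set.

    Assumption: best-match pairing; the abstract does not specify
    how modules and benchmark sets were paired.
    """
    return [
        max((jaccard_index(m, b) for b in benchmarks), default=0.0)
        for m in modules
    ]


if __name__ == "__main__":
    # Hypothetical gene identifiers, for illustration only.
    predicted_module = {"GENE_A", "GENE_B", "GENE_C"}
    benchmark_set = {"GENE_B", "GENE_C", "GENE_D"}
    # Two shared genes out of four distinct genes.
    print(jaccard_index(predicted_module, benchmark_set))
```

A score of 1 means a predicted module coincides exactly with a benchmark set, and a score of 0 means they share no genes.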


Subject(s)
Algorithms , Gene Regulatory Networks , Cluster Analysis
20.
BMC Syst Biol ; 5 Suppl 1: S9, 2011 Jun 20.
Article in English | MEDLINE | ID: mdl-21689484

ABSTRACT

BACKGROUND: Cellular functions depend on genetic, physical and other types of interactions. As such, derived interaction networks can be utilized to discover novel genes involved in specific biological processes. Epistatic Miniarray Profile, or E-MAP, an experimental platform that measures genetic interactions on a genome-wide scale, has successfully recovered known pathways and revealed novel protein complexes in Saccharomyces cerevisiae (budding yeast). RESULTS: By combining E-MAP data with co-expression data, we first predicted a potential cell cycle-related gene set. Using Gene Ontology (GO) function annotation as a benchmark, we demonstrated that the prediction obtained by combining microarray and E-MAP data is generally >50% more accurate in identifying co-functional gene pairs than the prediction using either data source alone. We also used transcription factor (TF)-DNA binding data (ChIP-chip) and protein phosphorylation data to construct a local cell cycle regulation network based on the potential cell cycle-related gene set we predicted. Finally, based on the E-MAP screening of 48 cell cycle genes crossed with 1536 library strains, we predicted four unknown genes (YPL158C, YPR174C, YJR054W, and YPR045C) as potential cell cycle genes, and analyzed them in detail. CONCLUSION: By integrating E-MAP and DNA microarray data, potential cell cycle-related genes were detected in budding yeast. This integrative method significantly improves the reliability of identifying co-functional gene pairs. In addition, the reconstructed network sheds light on the function of both known and predicted genes in the cell cycle process. Finally, our strategy can be applied to other biological processes and species, given the availability of relevant data.


Subject(s)
Cell Cycle/genetics , Epigenesis, Genetic/genetics , Gene Expression Profiling , Genes, Fungal/genetics , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae/genetics , Systems Integration , CDC28 Protein Kinase, S cerevisiae/metabolism , Cluster Analysis , Gene Regulatory Networks , Genomics , Saccharomyces cerevisiae/metabolism , Transcription Factors/metabolism