1.
Front Genet ; 15: 1354208, 2024.
Article in English | MEDLINE | ID: mdl-38463168

ABSTRACT

CTCF-mediated chromatin loops create insulated neighborhoods that constrain promoter-enhancer interactions and serve as units of gene regulation. Disruption of CTCF binding sites (CBS) leads to the destruction of insulated neighborhoods, which in turn can cause dysregulation of the genes they contain. A recent study found that CTCF/cohesin binding sites are a major mutational hotspot in the cancer genome. Mutations can affect CTCF binding, causing the disruption of insulated neighborhoods. Our analysis reveals a significant enrichment of well-known proto-oncogenes in insulated neighborhoods with mutations occurring specifically in anchor regions. It can be assumed that some mutations disrupt CTCF binding, leading to the disruption of insulated neighborhoods and the subsequent activation of proto-oncogenes within them. To explore the consequences of such mutations, we develop DeepCBS, a computational tool capable of analyzing mutations at CTCF binding sites, predicting their influence on insulated neighborhoods, and investigating the potential activation of proto-oncogenes. Furthermore, DeepCBS is applied to somatic mutation data of liver cancer. As a result, 87 mutations that disrupt CTCF binding sites are identified, which leads to the identification of 237 disrupted insulated neighborhoods containing a total of 135 genes. Integrative analysis of gene expression differences in liver cancer further highlights three genes: ARHGEF39, UBE2C and DQX1. Among them, ARHGEF39 and UBE2C have been reported in the literature as potential oncogenes involved in the development of liver cancer. The results indicate that DQX1 may be a potential oncogene in liver cancer and may contribute to tumor immune escape. In conclusion, DeepCBS is a promising method for analyzing the impact of mutations at CTCF binding sites on the insulator function of CTCF, with potential extensions to shed light on the effects of mutations on other functions of CTCF.

2.
Front Cell Dev Biol ; 10: 978962, 2022.
Article in English | MEDLINE | ID: mdl-36393848

ABSTRACT

Early embryonic cell cycles usually alternate between S and M phases without any gap phase. When gap phases are developmentally introduced in various cell types remains poorly defined, especially during embryogenesis. To establish the cell-specific introduction of gap phases in the embryo, we generate multiple fluorescent ubiquitination-based cell cycle indicators (FUCCI) in C. elegans. Time-lapse 3D imaging followed by lineal expression profiling reveals sharp and differential accumulation of the FUCCI reporters, allowing the systematic demarcation of cell cycle phases throughout embryogenesis. Accumulation of the reporters reliably identifies both G1 and G2 phases only in two embryonic cells with an extended cell cycle length, suggesting that the remaining cells divide either without a G1 phase or with a G1 phase that is too brief to be picked up by our reporters. In summary, we provide an initial picture of gap phase introduction in a metazoan embryo. The newly developed FUCCI reporters pave the way for further characterization of the developmental control of cell cycle progression.

3.
BMC Bioinformatics ; 23(1): 304, 2022 Jul 27.
Article in English | MEDLINE | ID: mdl-35896971

ABSTRACT

BACKGROUND: Previous studies have demonstrated the value of re-analysing publicly available genetics data with recent analytical approaches. Publicly available datasets, such as the Women's Health Initiative (WHI) offered by the database of Genotypes and Phenotypes (dbGaP), provide a rich resource for researchers to perform multiple analyses, including genome-wide association studies (GWAS). Often, the genetic information of individuals in these datasets is stored in imputed dosage files output by MaCH: mldose and mlinfo files. For researchers to perform GWAS with these data, the files must first be converted to a format compatible with their tool of choice, e.g., PLINK. Currently, there is no published tool that easily converts the datasets provided in MaCH dosage files into PLINK-ready files. RESULTS: Herein, we present Canary, a Singularity-based tool that converts MaCH dosage files into PLINK-compatible files with a single line of user input at the command line. Further, we provide a detailed tutorial on the preparation of phenotype files. Moreover, Canary comes with preinstalled software often used in GWAS, to further increase the ease of use of HPC systems for researchers. CONCLUSIONS: Until now, conversion of imputed data in the form of MaCH mldose and mlinfo files needed to be completed manually. Canary uses Singularity container technology to allow users to automatically convert these MaCH files into PLINK-compatible files. Additionally, Canary provides researchers with a platform to conduct GWAS more easily, as it contains essential software for such analyses, such as PLINK and Bioconductor. We hope that this tool will greatly increase the ease with which researchers can perform GWAS with imputed data, particularly in HPC environments.
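
The kind of conversion described above can be sketched in a few lines. This is only an illustration and not Canary's implementation: the function name `mach_to_plink_dosage` is hypothetical, and the file layouts are assumptions (an mldose row of the form `FAMILY->INDIVIDUAL ML_DOSE d1 d2 ...`, an mlinfo table whose first three columns are SNP, Al1 and Al2, and a per-SNP output row with one dosage per individual).

```python
def mach_to_plink_dosage(mldose_lines, mlinfo_lines):
    """Transpose per-individual MaCH dosages into per-SNP rows (hypothetical helper)."""
    # Assumed mlinfo layout: one header row, then "SNP Al1 Al2 ..." per marker.
    snps = [line.split()[:3] for line in mlinfo_lines[1:] if line.strip()]

    ids, rows = [], []
    for line in mldose_lines:
        if not line.strip():
            continue
        fields = line.split()
        fid, iid = fields[0].split("->")  # assumed "FAMILY->INDIVIDUAL" identifier
        ids.append((fid, iid))
        rows.append(fields[2:])           # fields[1] is assumed to be the "ML_DOSE" tag

    for row in rows:
        if len(row) != len(snps):
            raise ValueError("dosage count does not match the number of SNPs")

    # One header row, then one row per SNP with a dosage for each individual.
    header = ["SNP", "A1", "A2"] + [name for pair in ids for name in pair]
    out = [" ".join(header)]
    for j, (snp, a1, a2) in enumerate(snps):
        out.append(" ".join([snp, a1, a2] + [row[j] for row in rows]))
    return out


mlinfo = [
    "SNP Al1 Al2 Freq1 MAF Quality Rsq",
    "rs1 A G 0.8 0.2 0.99 0.98",
    "rs2 C T 0.6 0.4 0.95 0.90",
]
mldose = [
    "F1->I1 ML_DOSE 1.950 0.120",
    "F2->I2 ML_DOSE 1.000 2.000",
]
converted = mach_to_plink_dosage(mldose, mlinfo)
```

The core of the step is a transposition: MaCH stores one row per individual, while the assumed target layout stores one row per SNP.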


Subjects
Genome-Wide Association Study, Female, Humans, Phenotype, Single Nucleotide Polymorphism, Software

4.
Brief Bioinform ; 23(2), 2022 03 10.
Article in English | MEDLINE | ID: mdl-35021191

ABSTRACT

Networks consisting of molecular interactions are intrinsically dynamical systems of an organism. The interactions curated in molecular interaction databases are still incomplete and contain false positives introduced by high-throughput screening experiments. In this study, we propose a framework to integrate interactions of functionally associated protein-coding genes from 31 data sources to reconstruct a network with high coverage and quality. For each interaction, 369 features were constructed, including properties of both the interaction and the involved genes. The training and validation sets were built from pathway interactions as positives and from potential negative instances produced by our proposed semi-supervised strategy. A random forest classifier was then trained and used for prediction multiple times to give a score for each interaction. After setting a threshold estimated from a binomial distribution, a Human protein-coding Gene Functional Association Network (HuGFAN) was reconstructed with 20 383 genes and 1 185 429 high-confidence interactions. HuGFAN was then compared with other networks from the data sources with respect to network properties, suggesting that HuGFAN is more function and pathway related. Finally, HuGFAN was applied to identify cancer drivers with two well-known network-based methods (DriverNet and HotNet2) to show its outstanding performance compared with other networks. HuGFAN and other supplementary files are freely available at https://github.com/xthuang226/HuGFAN.
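
The abstract does not give the scoring details, so the following is only one plausible reading, sketched under stated assumptions: each of `n_runs` classifier runs casts a 0/1 vote for an interaction, the score is the fraction of positive votes, and the threshold is the smallest vote count that would be unlikely under a null Binomial(`n_runs`, `p_null`) model. The function names and the values of `p_null` and `alpha` are hypothetical.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def vote_threshold(n_runs, p_null=0.5, alpha=0.05):
    """Smallest vote count k with P(X >= k) <= alpha under the assumed null model."""
    for k in range(n_runs + 1):
        if binomial_tail(k, n_runs, p_null) <= alpha:
            return k
    return n_runs + 1  # no vote count is extreme enough


def interaction_score(votes):
    """Fraction of runs that predicted the interaction as positive."""
    return sum(votes) / len(votes)


threshold = vote_threshold(10)                               # votes needed out of 10 runs
score = interaction_score([1, 1, 1, 0, 1, 1, 1, 1, 1, 1])    # 9 of 10 runs voted positive
```

With ten runs and these assumed parameters, an interaction would need at least nine positive votes to pass, because eight or more votes still occur by chance with probability of about 0.055.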


Subjects
Gene Regulatory Networks, Machine Learning, Factual Databases, Humans

5.
Dev Genes Evol ; 230(4): 265-278, 2020 07.
Article in English | MEDLINE | ID: mdl-32556563

ABSTRACT

hlh-1 is a myogenic transcription factor required for body-wall muscle specification during embryogenesis in Caenorhabditis elegans. Despite its well-known role in muscle specification, comprehensive regulatory control upstream of hlh-1 remains poorly defined. Here, we first established a statistical reference for the spatiotemporal expression of hlh-1 at single-cell resolution up to the second last round of divisions for most of the cell lineages (from 4- to 350-cell stage) using 13 wild-type embryos. We next generated lineal expression of hlh-1 after RNA interference (RNAi) perturbation of 65 genes, which were selected based on their degree of conservation, mutant phenotypes, and known roles in development. We then compared the expression profiles between wild-type and RNAi embryos by clustering according to their lineal expression patterns using mean-shift and density-based clustering algorithms, which not only confirmed the roles of existing genes but also uncovered the potential functions of novel genes in muscle specification at multiple levels, including cellular, lineal, and embryonic levels. By combining the public data on protein-protein interactions, protein-DNA interactions, and genetic interactions with our RNAi data, we inferred regulatory pathways upstream of hlh-1 that function globally or locally. This work not only revealed diverse and multilevel regulatory mechanisms coordinating muscle differentiation during C. elegans embryogenesis but also laid a foundation for further characterizing the regulatory pathways controlling muscle specification at the cellular, lineal (local), or embryonic (global) level.


Subjects
Caenorhabditis elegans Proteins/metabolism, Caenorhabditis elegans/embryology, Caenorhabditis elegans/metabolism, Muscle Development/genetics, Muscle Proteins/metabolism, Nuclear Proteins/metabolism, Transcription Factors/metabolism, Animals, Caenorhabditis elegans/genetics, Caenorhabditis elegans Proteins/genetics, Cell Lineage/genetics, Developmental Gene Expression Regulation/genetics, Multigene Family, Muscle Proteins/genetics, Nuclear Proteins/genetics, Phenotype, RNA Interference, Signal Transduction/genetics, Single-Cell Analysis, Transcription Factors/genetics

6.
Database (Oxford) ; 2018: 1-6, 2018 01 01.
Article in English | MEDLINE | ID: mdl-30219837

ABSTRACT

While long non-coding RNAs (lncRNAs) may play important roles in cellular functions and biological processes, we still know little about them. Growing evidence indicates that the subcellular localization of lncRNAs may provide clues to their functionality. To help researchers functionally characterize thousands of lncRNAs, we developed a database-driven application, lncSLdb, which stores and manages user-collected qualitative and quantitative subcellular localization information on lncRNAs obtained through literature mining. The current release contains >11 000 transcripts from three species. Based on the region in which lncRNAs accumulate, we classify transcripts into three basic localization types (nucleus, cytoplasm and nucleus/cytoplasm). In some conditions, the nucleus and cytoplasm types can be divided into three more specific subtypes (chromosome, nucleoplasm and ribosome). Besides browsing and downloading data in lncSLdb, our system provides a comprehensive set of tools to search by gene symbols, genome coordinates or sequence similarity. We hope that lncSLdb will provide a convenient platform for researchers to investigate the functions and molecular mechanisms of lncRNAs from the perspective of subcellular localization.


Subjects
Long Noncoding RNA/genetics, Software, Genetic Databases, Search Engine, Statistics as Topic, Subcellular Fractions/metabolism

7.
Genetics ; 209(1): 37-49, 2018 05.
Article in English | MEDLINE | ID: mdl-29567658

ABSTRACT

Intercellular signaling interactions play a key role in breaking fate symmetry during animal development. Identification of signaling interactions at cellular resolution is technically challenging, especially in a developing embryo. Here, we develop a platform that allows automated inference and validation of signaling interactions for every cell cycle of Caenorhabditis elegans embryogenesis. This is achieved by the generation of a systems-level cell contact map, which consists of 1114 highly confident intercellular contacts, by modeling analysis and is validated through cell membrane labeling coupled with cell lineage analysis. We apply the map to identify cell pairs between which a Notch signaling interaction takes place. By generating expression patterns for two ligands and two receptors of the Notch signaling pathway with cellular resolution using the automated expression profiling technique, we are able to refine existing and identify novel Notch interactions during C. elegans embryogenesis. Targeted cell ablation followed by cell lineage analysis demonstrates the roles of signaling interactions during cell division in breaking fate symmetry. Finally, we describe the development of a website that allows online access to the cell-cell contact map for mapping of other signaling interactions by the community. The platform can be adapted to establish cellular interactions from any other signaling pathway.


Subjects
Cell Cycle, Embryonic Development, Signal Transduction, Animals, Genetically Modified Animals, Biomarkers, Caenorhabditis elegans/embryology, Caenorhabditis elegans/genetics, Caenorhabditis elegans/metabolism, Cell Communication, Cell Lineage, Drosophila Proteins/metabolism, Gene Dosage, Protein Binding, Notch Receptors/metabolism, Reproducibility of Results, Transgenes

8.
BMC Bioinformatics ; 18(1): 72, 2017 Jan 31.
Article in English | MEDLINE | ID: mdl-28137264

ABSTRACT

BACKGROUND: With the increase in the amount of DNA methylation and gene expression data, the epigenetic mechanisms of cancers can be extensively investigated. Available methods integrate DNA methylation and gene expression data into a network by specifying the anti-correlation between them. However, the correlation between methylation and expression is usually unknown and difficult to determine. RESULTS: To address this issue, we present a novel multiple-network framework for epigenetic modules, namely the Epigenetic Module based on Differential Networks (EMDN) algorithm, which analyzes DNA methylation and gene expression data simultaneously. The EMDN algorithm avoids the need to specify the correlation between methylation and expression. The EMDN algorithm is more accurate than current approaches. On the basis of The Cancer Genome Atlas (TCGA) breast cancer data, we observe that the EMDN algorithm can recognize positively and negatively correlated modules, and that these modules are significantly more enriched in known pathways than those obtained by other algorithms. These modules can serve as biomarkers to predict breast cancer subtypes from methylation profiles, where positively and negatively correlated modules are of equal importance in the classification of cancer subtypes. Epigenetic modules also estimate the survival time of patients, a factor that is critical for cancer therapy. CONCLUSIONS: The proposed model and algorithm provide an effective method for the integrative analysis of DNA methylation and gene expression. The algorithm is freely available as an R package at https://github.com/william0701/EMDN.


Subjects
Algorithms, Breast Neoplasms/genetics, DNA Methylation, Genetic Epigenesis, Transcriptome, Breast Neoplasms/metabolism, Female, Gene Expression Profiling, Gene Regulatory Networks, Genomics, Humans

9.
Bioinformatics ; 33(10): 1528-1535, 2017 May 15.
Article in English | MEDLINE | ID: mdl-28011782

ABSTRACT

MOTIVATION: Cell fate specification plays a key role in generating distinct cell types during metazoan development. However, most of the underlying signaling networks at the cellular level are not well understood. The availability of time-lapse single-cell gene expression data collected throughout Caenorhabditis elegans embryogenesis provides an excellent opportunity for investigating the signaling networks underlying cell fate specification at the systems, cellular and molecular levels. RESULTS: We propose a framework to infer signaling networks at the cellular level by exploring single-cell gene expression data. By analyzing the expression of nhr-25, a hypodermis-specific transcription factor, in every cell of both wild-type C. elegans embryos and embryos perturbed by RNAi against 55 genes, we have inferred a total of 23 genes that regulate (activate or inhibit) nhr-25 expression in a cell-specific fashion. We also infer the signaling pathways consisting of each of these genes and nhr-25, based on a probabilistic graphical model, for five selected founder cells, 'ABarp', 'ABpla', 'ABpra', 'Caa' and 'Cpa', which express nhr-25 and mostly develop into hypodermis. By integrating the inferred pathways, we reconstruct five signaling networks, one for each of the five founder cells. Using RNAi gene knockdown as a validation method, we show that the inferred networks are able to predict the effects of the knocked-down genes. These signaling networks in the five founder cells are likely to ensure faithful hypodermis cell fate specification in C. elegans at the cellular level. AVAILABILITY AND IMPLEMENTATION: All source code and data are available at the GitHub repository https://github.com/xthuang226/Worm_Single_Cell_Data_and_Codes.git. CONTACT: zhuyuan@cug.edu.cn. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Algorithms, Caenorhabditis elegans/growth & development, Embryonic Development/genetics, Gene Expression Profiling/methods, Developmental Gene Expression Regulation, Signal Transduction/genetics, Single-Cell Analysis/methods, Animals, Caenorhabditis elegans/genetics, Caenorhabditis elegans/metabolism, Caenorhabditis elegans Proteins/genetics, Caenorhabditis elegans Proteins/metabolism, Cell Differentiation/genetics, DNA-Binding Proteins/genetics, RNA Interference, Transcription Factors/genetics

10.
Mol Biosyst ; 12(1): 85-92, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26555698

ABSTRACT

In Caenorhabditis elegans, a large number of protein-protein interactions (PPIs) have been identified by different experiments. However, a comprehensive weighted PPI network, which is essential for signaling pathway inference, is not yet available in this model organism. Therefore, we first construct an integrative PPI network in C. elegans with 12,951 interactions involving 5039 proteins from seven molecular interaction databases. Then, a reliability score based on a probabilistic graphical model (RSPGM) is proposed to assess PPIs. It assumes that the random number of interactions between two proteins follows a Bernoulli distribution, which avoids multi-links. The main parameter of the RSPGM score contains a few latent variables, which can be considered as several common properties of the two proteins. Validations on high-confidence yeast datasets show that RSPGM provides more accurate evaluation than other approaches, and that the PPIs in the reconstructed PPI network have higher biological relevance than those in the original network in terms of gene ontology, gene expression, essentiality and the prediction of known protein complexes. Furthermore, this weighted integrative PPI network in C. elegans is used to infer the interaction path of the canonical Wnt/β-catenin pathway. Most genes on the inferred interaction path have been validated as Wnt pathway components. Therefore, RSPGM is essential and effective for evaluating PPIs and inferring interaction paths. Finally, the PPI network with RSPGM scores can be queried and visualized on a user-interactive website, which is freely available.


Subjects
Caenorhabditis elegans Proteins/metabolism, Computational Biology/methods, Biological Models, Statistical Models, Protein Interaction Mapping, Protein Interaction Maps, Algorithms, Animals, Protein Databases, Protein Interaction Mapping/methods, Reproducibility of Results, Web Browser

11.
Mol Syst Biol ; 11(6): 814, 2015 Jun 10.
Article in English | MEDLINE | ID: mdl-26063786

ABSTRACT

Coordination of cell division timing is crucial for proper cell fate specification and tissue growth. However, the differential regulation of cell division timing across or within cell types during metazoan development remains poorly understood. To elucidate the systems-level genetic architecture coordinating division timing, we performed a high-content screening for genes whose depletion produced a significant reduction in the asynchrony of division between sister cells (ADS) compared to that of wild-type during Caenorhabditis elegans embryogenesis. We quantified division timing using 3D time-lapse imaging followed by computer-aided lineage analysis. A total of 822 genes were selected for perturbation based on their conservation and known roles in development. Surprisingly, we find that cell fate determinants are not only essential for establishing fate asymmetry, but also are imperative for setting the ADS regardless of cellular context, indicating a common genetic architecture used by both cellular processes. The fate determinants demonstrate either coupled or separate regulation between the two processes. The temporal coordination appears to facilitate cell migration during fate specification or tissue growth. Our quantitative dataset with cellular resolution provides a resource for future analyses of the genetic control of spatial and temporal coordination during metazoan development.


Subjects
Caenorhabditis elegans Proteins/biosynthesis, Cell Differentiation/genetics, Cell Division/genetics, Embryonic Development, Animals, Caenorhabditis elegans/embryology, Caenorhabditis elegans/genetics, Cell Lineage/genetics, Cell Movement, Developmental Gene Expression Regulation

12.
Biomed Eng Online ; 12 Suppl 1: S1, 2013.
Article in English | MEDLINE | ID: mdl-24564942

ABSTRACT

BACKGROUND: In the Caenorhabditis elegans early embryo, cell cycles have only two phases, DNA synthesis and mitosis, which differs from the typical four-phase cell cycle. Modeling this cell-cycle process as a network can fill a gap in C. elegans cell-cycle studies and provide a thorough understanding of cell-cycle regulation and progression at the network level. METHODS: In this paper, a C. elegans early embryonic cell-cycle network has been constructed based on knowledge of key regulators and their interactions drawn from the literature. A discrete dynamical Boolean model has been applied in computer simulations to study the dynamical properties of this network. The cell-cycle network is compared with random networks and tested under several perturbations to analyze its robustness. To investigate whether our proposed network could explain biological experiment results, we have also compared the network simulation results with gene knockdown experiment data. RESULTS: With the Boolean model, this study showed that the cell-cycle network was stable with a set of attractors (fixed points). A biological pathway was observed in the simulation, which corresponded to a whole cell-cycle progression. The C. elegans network was significantly more robust than random networks of the same size because it had fewer attractors and larger basins. Moreover, the network was also robust under perturbations, with no significant change in basin size. In addition, the smaller number of attractors and the shorter biological pathway in the gene knockdown network simulation explained the shorter cell-cycle lengths in mutants observed in the RNAi gene knockdown experiment data. Hence, we demonstrated that the results of the network simulation could be verified by the RNAi gene knockdown experiment data. CONCLUSIONS: A C. elegans early embryonic cell-cycle network was constructed, and its properties were analyzed and compared with those of random networks. Computer simulation results provided biologically meaningful interpretations of RNAi gene knockdown experiment data.
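
The attractor and basin analysis described in this abstract can be illustrated with a toy model. The sketch below is not the network from the paper: the three nodes A, B and C and their update rules are hypothetical, and synchronous updating is an assumption. It enumerates every initial state, follows each trajectory until a state repeats, and counts how many initial states fall into each attractor.

```python
from itertools import product

NODES = ("A", "B", "C")  # hypothetical regulators, not those of the published network


def step(state):
    """Synchronously update all nodes from the current state."""
    a, b, c = state
    return (
        int(not c),   # A is repressed by C
        a,            # B is activated by A
        int(b or c),  # C is activated by B and sustains itself
    )


def find_attractor(state):
    """Follow the trajectory from `state` until a state repeats; return the cycle."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    cycle = seen[seen.index(state):]
    # Rotate the cycle so it starts at its smallest state (a canonical form).
    k = cycle.index(min(cycle))
    return tuple(cycle[k:] + cycle[:k])


def basin_sizes():
    """Map each attractor to the number of initial states that reach it."""
    sizes = {}
    for state in product((0, 1), repeat=len(NODES)):
        attractor = find_attractor(state)
        sizes[attractor] = sizes.get(attractor, 0) + 1
    return sizes


basins = basin_sizes()
```

In this toy network all eight states converge on the single fixed point (0, 0, 1), and the trajectory from (0, 0, 0) passes through an ordered sequence of states before settling, which is the kind of pathway the abstract relates to a cell-cycle progression. Comparing the number of attractors and basin sizes against randomly wired networks of the same size would follow the same enumeration.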


Subjects
Caenorhabditis elegans/embryology, Cell Cycle/physiology, Gene Regulatory Networks, Biological Models, Animals, Computer Simulation, Gene Knockdown Techniques, RNA Interference

13.
Mol Biol Evol ; 28(1): 501-11, 2011 Jan.
Article in English | MEDLINE | ID: mdl-20724380

ABSTRACT

Serum response factor (SRF) and myocyte enhancer factor 2 (MEF2) represent two types of members of the MCM1, AGAMOUS, DEFICIENS, and SRF (MADS)-box transcription factor family present in animals and fungi. Each type has distinct biological functions, which are reflected by the distinct specificities of the proteins bound to their cognate DNA-binding sites and activated by their respective cofactors. However, little is known about the evolution of MADS domains and their DNA-binding sites. Here, we report on the conservation and evolution of the two types of MADS domains and their cognate DNA-binding sites using phylogenetic analyses. First, there are great similarities between the two types of proteins, with highly conserved amino acid positions that are critical for binding to the DNA sequence and for maintaining the 3D structure. Second, in contrast to MEF2-type MADS domains, distinct conserved residues are present at some positions in SRF-type MADS domains, determining the specificity and the configuration of the MADS domain bound to DNA sequences. Furthermore, the ancestral sequence of SRF- and MEF2-type MADS domains is more similar to MEF2-type MADS domains than to SRF-type MADS domains. In the case of DNA-binding sites, the MEF2 site has a T-rich core in one DNA sequence and an A-rich core in the reverse sequence, whereas in the SRF site either A or T may be present in the two complementary sequences. In addition, comparing SRF sites in the human and mouse genomes reveals that CArG-boxes evolve faster in mouse than in human. Interestingly, a CArG-like sequence, which is probably functionless, could potentially mutate into a functional CArG-box that can be bound by SRF, and vice versa. Together, these results significantly improve our knowledge of the conservation and evolution of the MADS domains and their binding sites and provide new insights into the MADS family, covering not only the evolution of MADS factors but also the evolution of their binding sites and even the coevolution of MADS factors with their binding sites.
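
The binding-site observations above can be made concrete with a short sketch. The CArG-box consensus CC(A/T)6GG is standard; everything else here is an assumption for illustration, including the function names, the choice to call a 10-mer with exactly one mismatch "CArG-like", and the example MEF2-type site CTAAAAATAG.

```python
import re

CARG_BOX = re.compile(r"CC[AT]{6}GG")  # consensus CC(A/T)6GG
ALLOWED = ["C", "C"] + ["AT"] * 6 + ["G", "G"]


def find_carg_boxes(sequence):
    """Return the start positions of exact CArG-box matches."""
    return [m.start() for m in CARG_BOX.finditer(sequence)]


def mismatches(site):
    """Count positions where a 10-mer deviates from CC(A/T)6GG."""
    if len(site) != 10:
        raise ValueError("expected a 10-mer")
    return sum(base not in ok for base, ok in zip(site, ALLOWED))


def classify(site):
    """Label a 10-mer; treating one mismatch as 'CArG-like' is an assumption."""
    n = mismatches(site)
    if n == 0:
        return "CArG-box"
    if n == 1:
        return "CArG-like"
    return "other"


def reverse_complement(sequence):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(sequence))


positions = find_carg_boxes("TTCCATATATGGTT")
labels = [classify(s) for s in ("CCATATATGG", "CCATGTATGG", "GGATATATCC")]
mef2_other_strand = reverse_complement("CTAAAAATAG")
```

A single substitution separates the second example from the consensus, which is the sense in which a CArG-like sequence could mutate into a CArG-box. The reverse complement of the A-rich example site is CTATTTTTAG, showing the A-rich core on one strand and the T-rich core on the other.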


Subjects
Binding Sites/genetics, Molecular Evolution, Myogenic Regulatory Factors/genetics, Serum Response Factor/genetics, Amino Acid Sequence, Animals, Base Sequence, DNA/genetics, DNA/metabolism, Humans, MEF2 Transcription Factors, Mice, Molecular Sequence Data, Phylogeny, Tertiary Protein Structure