Results 1 - 14 of 14
1.
Genome Res ; 34(1): 119-133, 2024 02 07.
Article in English | MEDLINE | ID: mdl-38190633

ABSTRACT

Single-cell technologies offer unprecedented opportunities to dissect gene regulatory mechanisms in context-specific ways. Although there are computational methods for extracting gene regulatory relationships from scRNA-seq and scATAC-seq data, the data integration problem, essential for accurate cell type identification, has been mostly treated as a standalone challenge. Here we present scTIE, a unified method that integrates temporal multimodal data and infers regulatory relationships predictive of cellular state changes. scTIE uses an autoencoder to embed cells from all time points into a common space by using iterative optimal transport, followed by extracting interpretable information to predict cell trajectories. Using a variety of synthetic and real temporal multimodal data sets, we show scTIE achieves effective data integration while preserving more biological signals than existing methods, particularly in the presence of batch effects and noise. Furthermore, on the exemplar multiome data set we generated from differentiating mouse embryonic stem cells over time, we show scTIE captures regulatory elements highly predictive of cell transition probabilities, providing new potentials to understand the regulatory landscape driving developmental processes.


Subject(s)
Gene Expression Profiling , Single-Cell Analysis , Animals , Mice , Gene Expression Profiling/methods , Single-Cell Analysis/methods , Gene Expression Regulation
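
Note on optimal transport: the abstract above says cells from different time points are coupled by iterative optimal transport but gives no algorithmic detail. As a rough illustration only, the following sketch computes an entropy-regularized transport plan between two tiny sets of cells with the standard Sinkhorn iteration. The one-dimensional embeddings, the squared-distance cost, and the values of `eps` and `n_iter` are all assumptions made for this example; this is not the scTIE implementation.

```python
import math

def sinkhorn(cost, a, b, eps=0.1, n_iter=500):
    """Entropy-regularized optimal transport plan between histograms a and b.

    cost[i][j] is the cost of moving mass from source i to target j.
    eps and n_iter are illustrative defaults, not values from the paper.
    """
    n, m = len(a), len(b)
    # Gibbs kernel derived from the cost matrix.
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Hypothetical 1-D embeddings of two cells at time t and two cells at time t+1.
cells_t = [0.0, 1.0]
cells_t1 = [0.1, 0.9]
cost = [[(x - y) ** 2 for y in cells_t1] for x in cells_t]

# Uniform mass on each cell.
plan = sinkhorn(cost, [0.5, 0.5], [0.5, 0.5])
```

In this toy case most of the mass of each cell at time t goes to its nearest cell at time t+1, which is the kind of coupling a trajectory model would read as a likely transition.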
2.
Proc Natl Acad Sci U S A ; 119(40): e2200421119, 2022 10 04.
Article in English | MEDLINE | ID: mdl-36161951

ABSTRACT

Strong ultraviolet (UV) radiation at high altitude imposes a serious selective pressure, which may induce skin pigmentation adaptation of indigenous populations. We conducted skin pigmentation phenotyping and genome-wide analysis of Tibetans in order to understand the underlying mechanism of adaptation to UV radiation. We observe that Tibetans have darker baseline skin color compared with lowland Han Chinese, as well as an improved tanning ability, suggesting a two-level adaptation to boost their melanin production. A genome-wide search for the responsible genes identifies GNPAT showing strong signals of positive selection in Tibetans. An enhancer mutation (rs75356281) located in GNPAT intron 2 is enriched in Tibetans (58%) but rare in other world populations (0 to 18%). The adaptive allele of rs75356281 is associated with darker skin in Tibetans and, under UVB treatment, it displays higher enhancer activities compared with the wild-type allele in in vitro luciferase assays. Transcriptome analyses of gene-edited cells clearly show that with UVB treatment, the adaptive variant of GNPAT promotes melanin synthesis, likely through the interactions of CAT and ACAA1 in peroxisomes with other pigmentation genes, and they act synergistically, leading to an improved tanning ability in Tibetans for UV protection.


Subject(s)
Adaptation, Physiological , Altitude , Skin Pigmentation , Acyltransferases/genetics , Adaptation, Physiological/genetics , Ethnicity , Humans , Melanins/genetics , Phenotype , Skin Pigmentation/genetics , Tibet , Transcriptome , Ultraviolet Rays
3.
Genome Res ; 30(4): 622-634, 2020 04.
Article in English | MEDLINE | ID: mdl-32188700

ABSTRACT

A time course experiment is a widely used design in the study of cellular processes such as differentiation or response to stimuli. In this paper, we propose time course regulatory analysis (TimeReg) as a method for the analysis of gene regulatory networks based on paired gene expression and chromatin accessibility data from a time course. TimeReg can be used to prioritize regulatory elements, to extract core regulatory modules at each time point, to identify key regulators driving changes of the cellular state, and to causally connect the modules across different time points. We applied the method to analyze paired chromatin accessibility and gene expression data from a retinoic acid (RA)-induced mouse embryonic stem cell (mESC) differentiation experiment. The analysis identified 57,048 novel regulatory elements regulating cerebellar development, synapse assembly, and hindbrain morphogenesis, which substantially extended our knowledge of cis-regulatory elements during differentiation. Using single-cell RNA-seq data, we showed that the core regulatory modules can reflect the properties of different subpopulations of cells. Finally, the driver regulators are shown to be important in clarifying the relations between modules across adjacent time points. As a second example, applying our method to time course data from Ascl1-induced direct reprogramming of fibroblasts to neurons identified Id1/2 as driver regulators of the early stage of reprogramming.


Subject(s)
Chromatin Assembly and Disassembly , Chromatin/genetics , Gene Expression Regulation , Mouse Embryonic Stem Cells/metabolism , Algorithms , Animals , Cell Differentiation/drug effects , Cell Differentiation/genetics , Cell Lineage , Cellular Reprogramming/genetics , Cellular Reprogramming Techniques , Chromatin/metabolism , Computational Biology/methods , Gene Expression Profiling/methods , Gene Regulatory Networks , Mice , Mouse Embryonic Stem Cells/drug effects , Transcription Factors/metabolism , Transcriptome , Tretinoin/pharmacology
4.
PLoS Genet ; 13(3): e1006664, 2017 03.
Article in English | MEDLINE | ID: mdl-28273089

ABSTRACT

The general transcription factor TBP (TATA-box binding protein) and its associated factors (TAFs) together form the TFIID complex, which directs transcription initiation. Through RNAi and mutant analysis, we identified a specific TBP family protein, TRF2, and a set of TAFs that regulate lipid droplet (LD) size in the Drosophila larval fat body. Among the three Drosophila TBP genes, trf2, tbp and trf1, only loss of function of trf2 results in increased LD size. Moreover, TRF2 and TAF9 regulate fatty acid composition of several classes of phospholipids. Through RNA profiling, we found that TRF2 and TAF9 affect the transcription of a common set of genes, including peroxisomal fatty acid β-oxidation-related genes that affect phospholipid fatty acid composition. We also found that knockdown of several TRF2 and TAF9 target genes results in large LDs, a phenotype similar to that of trf2 mutants. Together, these findings provide new insights into the specific role of the general transcription machinery in lipid homeostasis.


Subject(s)
Drosophila Proteins/metabolism , Drosophila/genetics , Fatty Acids/chemistry , Lipids/chemistry , TATA-Binding Protein Associated Factors/metabolism , Telomeric Repeat Binding Protein 2/metabolism , Transcription Factor TFIID/metabolism , Alleles , Amino Acid Motifs , Animals , Drosophila/metabolism , Homeostasis , Mutation , Oxygen/chemistry , Peroxisomes/chemistry , Phenotype , Phospholipids/chemistry , RNA Interference , Sequence Analysis, RNA , Transcription Factor TFIID/chemistry
5.
bioRxiv ; 2023 May 22.
Article in English | MEDLINE | ID: mdl-37292801

ABSTRACT

Single-cell technologies offer unprecedented opportunities to dissect gene regulatory mechanisms in context-specific ways. Although there are computational methods for extracting gene regulatory relationships from scRNA-seq and scATAC-seq data, the data integration problem, essential for accurate cell type identification, has been mostly treated as a standalone challenge. Here we present scTIE, a unified method that integrates temporal multimodal data and infers regulatory relationships predictive of cellular state changes. scTIE uses an autoencoder to embed cells from all time points into a common space using iterative optimal transport, followed by extracting interpretable information to predict cell trajectories. Using a variety of synthetic and real temporal multimodal datasets, we demonstrate scTIE achieves effective data integration while preserving more biological signals than existing methods, particularly in the presence of batch effects and noise. Furthermore, on the exemplar multiome dataset we generated from differentiating mouse embryonic stem cells over time, we demonstrate scTIE captures regulatory elements highly predictive of cell transition probabilities, providing new potentials to understand the regulatory landscape driving developmental processes.

6.
Genome Biol ; 23(1): 114, 2022 05 16.
Article in English | MEDLINE | ID: mdl-35578363

ABSTRACT

Technological development has enabled the profiling of gene expression and chromatin accessibility from the same cell. We develop scREG, a dimension reduction methodology, based on the concept of cis-regulatory potential, for single cell multiome data. This concept is further used for the construction of subpopulation-specific cis-regulatory networks. The capability of inferring useful regulatory networks is demonstrated by a two-fold increase in network inference accuracy compared to the Pearson correlation-based method and a 27-fold enrichment of GWAS variants for inflammatory bowel disease in the cis-regulatory elements. The R package scREG provides comprehensive functions for single cell multiome data analysis.


Subject(s)
Chromatin , Regulatory Sequences, Nucleic Acid , Chromatin/genetics , Gene Expression , Gene Regulatory Networks , Single-Cell Analysis
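
Note on the baseline: the abstract above compares scREG against a Pearson correlation-based method without describing it. A minimal sketch of such a baseline, assuming it scores each peak-gene pair by the correlation of accessibility and expression across cells, might look like the following. The function name and the toy vectors are hypothetical, and scREG itself is an R package with a different scoring scheme.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sd_x = math.sqrt(sum((xi - mean_x) ** 2 for xi in x))
    sd_y = math.sqrt(sum((yi - mean_y) ** 2 for yi in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant vector carries no correlation signal
    return cov / (sd_x * sd_y)

# Toy data: accessibility of one peak and expression of two genes in four cells.
peak_accessibility = [0.0, 1.0, 2.0, 3.0]
gene_up = [1.0, 3.0, 5.0, 7.0]    # rises with accessibility
gene_down = [7.0, 5.0, 3.0, 1.0]  # falls with accessibility

score_up = pearson(peak_accessibility, gene_up)
score_down = pearson(peak_accessibility, gene_down)
```

A correlation-only score of this kind ignores genomic distance and transcription factor information, which is one plausible reason a regulatory-potential approach could outperform it.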
7.
Elife ; 11, 2022 12 16.
Article in English | MEDLINE | ID: mdl-36525361

ABSTRACT

Systems genetics holds the promise to decipher complex traits by interpreting their associated SNPs through gene regulatory networks derived from comprehensive multi-omics data of cell types, tissues, and organs. Here, we propose SpecVar to integrate paired chromatin accessibility and gene expression data into a context-specific regulatory network atlas and regulatory categories, conduct heritability enrichment analysis with genome-wide association studies (GWAS) summary statistics, identify relevant tissues, and estimate relevance correlation to depict common genetic factors acting in the shared regulatory networks between traits. Our method improves power upon existing approaches by associating SNPs with context-specific regulatory elements to assess heritability enrichments and by explicitly prioritizing gene regulations underlying relevant tissues. Ablation studies, independent data validation, and comparison experiments with existing methods on GWAS of six phenotypes show that SpecVar can improve heritability enrichment, accurately detect relevant tissues, and reveal causal regulations. Furthermore, SpecVar correlates the relevance patterns for pairs of phenotypes and better reveals shared SNP-associated regulations of phenotypes than existing methods. Studying GWAS of 206 phenotypes in UK Biobank demonstrates that SpecVar leverages the context-specific regulatory network atlas to prioritize phenotypes' relevant tissues and shared heritability for biological and therapeutic insights. SpecVar provides a powerful way to interpret SNPs via context-specific regulatory networks and is available at https://github.com/AMSSwanglab/SpecVar, copy archived at swh:1:rev:cf27438d3f8245c34c357ec5f077528e6befe829.


Subject(s)
Gene Regulatory Networks , Genome-Wide Association Study , Phenotype , Gene Expression Regulation , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide
8.
Nat Commun ; 12(1): 4763, 2021 08 06.
Article in English | MEDLINE | ID: mdl-34362918

ABSTRACT

The comparison of gene regulatory networks between diseased versus healthy individuals or between two different treatments is an important scientific problem. Here, we propose sc-compReg as a method for the comparative analysis of gene expression regulatory networks between two conditions using single cell gene expression (scRNA-seq) and single cell chromatin accessibility data (scATAC-seq). Our software, sc-compReg, can be used as a stand-alone package that provides joint clustering and embedding of the cells from both scRNA-seq and scATAC-seq, and the construction of differential regulatory networks across two conditions. We apply the method to compare the gene regulatory networks of an individual with chronic lymphocytic leukemia (CLL) versus a healthy control. The analysis reveals a tumor-specific B cell subpopulation in the CLL patient and identifies TOX2 as a potential regulator of this subpopulation.


Subject(s)
Gene Regulatory Networks , Leukemia, Lymphocytic, Chronic, B-Cell/genetics , Single-Cell Analysis/methods , B-Lymphocytes , Chromatin , Gene Expression Regulation, Neoplastic , HMGB Proteins , Humans , RNA, Small Cytoplasmic , Software
9.
Nat Commun ; 11(1): 4928, 2020 10 01.
Article in English | MEDLINE | ID: mdl-33004791

ABSTRACT

High-altitude adaptation of Tibetans represents a remarkable case of natural selection during recent human evolution. Previous genome-wide scans found many non-coding variants under selection, suggesting a pressing need to understand the functional role of non-coding regulatory elements (REs). Here, we generate time courses of paired ATAC-seq and RNA-seq data on cultured HUVECs under hypoxic and normoxic conditions. We further develop a variant interpretation methodology (vPECA) to identify active selected REs (ASREs) and the associated regulatory network. We discover three causal SNPs of EPAS1, the key adaptive gene for Tibetans. These SNPs decrease the accessibility of ASREs with weakened binding strength of relevant TFs, and cooperatively down-regulate EPAS1 expression. We further construct the downstream network of EPAS1, elucidating its roles in hypoxic response and angiogenesis. Collectively, we provide a systematic approach to interpret phenotype-associated noncoding variants in proper cell types and relevant dynamic conditions, to model their impact on gene regulation.


Subject(s)
Acclimatization/genetics , Chromatin/metabolism , Ethnicity/genetics , Gene Regulatory Networks , Models, Genetic , Altitude , Altitude Sickness/ethnology , Altitude Sickness/genetics , Altitude Sickness/metabolism , Basic Helix-Loop-Helix Transcription Factors/genetics , Cell Hypoxia/genetics , Cells, Cultured , Chromatin/genetics , Chromatin Immunoprecipitation Sequencing , Disease Resistance/genetics , Female , Gene Expression Regulation , Human Umbilical Vein Endothelial Cells , Humans , Hypoxia/genetics , Hypoxia/metabolism , Oxygen/metabolism , Polymorphism, Single Nucleotide , Pregnancy , Primary Cell Culture , RNA-Seq , Regulatory Elements, Transcriptional/genetics , Selection, Genetic , Tibet/ethnology , Transcription Factors/metabolism , Whole Genome Sequencing
10.
IEEE Trans Neural Netw Learn Syst ; 30(1): 269-283, 2019 01.
Article in English | MEDLINE | ID: mdl-29994273

ABSTRACT

Multidomain network classification has attracted significant attention in data integration and machine learning, which can enhance network classification or prediction performance by integrating information from different sources. Despite the previous success, existing multidomain network learning methods usually assume that different views are available for the same set of instances, and thus, they seek a consistent classification result for all domains. However, in many real-world problems, each domain has its specific instance set, and one instance in one domain may correspond to multiple instances in another domain. Moreover, due to the rapid growth of data sources, different domains may not be relevant to each other, which calls for selecting domains relevant to the target/focused domain. A key challenge under this setting is how to achieve accurate prediction by integrating different data representations without losing data information. In this paper, we propose a semisupervised classification approach for a multidomain network based on label propagation, i.e., multidomain classification with domain selection (MCS), which can deal with the cross-domain information and different instance sets in domains. In particular, with sparse weight properties, the proposed MCS can automatically identify those domains relevant to our target domain by assigning them higher weights than the other irrelevant domains. This not only significantly improves classification accuracy but also helps to obtain an optimal network partition for the target domain. From the theoretical viewpoint, we equivalently decompose MCS into two simpler subproblems with analytical solutions, which can be efficiently solved by their computational procedures. Extensive experimental results on both synthetic and real-world data sets empirically demonstrate the advantages of the proposed approach in terms of both prediction performance and domain selection ability.
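
Note on label propagation: the abstract above builds on label propagation but does not spell out the update rule. The sketch below shows the classic single-network version, in which scores are repeatedly averaged over neighbors using a symmetrically normalized adjacency matrix and pulled back toward the known labels. It is a generic illustration rather than MCS: it has no domain weights or cross-domain links, and `alpha`, `n_iter`, and the four-node chain graph are assumptions chosen for the example.

```python
import math

def label_propagation(W, Y, alpha=0.8, n_iter=200):
    """Single-network label propagation on a weighted graph.

    W is a symmetric adjacency matrix with non-negative weights.
    Y[i][c] is 1.0 if node i is known to be in class c, else 0.0.
    Returns the predicted class index for every node.
    """
    n, k = len(W), len(Y[0])
    deg = [sum(row) for row in W]
    # Symmetric normalization: S = D^(-1/2) W D^(-1/2).
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) if W[i][j] else 0.0
          for j in range(n)] for i in range(n)]
    F = [row[:] for row in Y]
    for _ in range(n_iter):
        # Blend neighbor scores with the initial labels.
        F = [[alpha * sum(S[i][j] * F[j][c] for j in range(n))
              + (1 - alpha) * Y[i][c] for c in range(k)] for i in range(n)]
    return [max(range(k), key=lambda c: F[i][c]) for i in range(n)]

# Chain graph 0 - 1 - 2 - 3; node 0 is labeled class 0, node 3 class 1.
W = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
Y = [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 1.0]]

predicted = label_propagation(W, Y)
```

Each unlabeled node ends up with the class of the labeled node it is closest to on the graph. A multidomain method would additionally have to decide how much weight each domain's graph receives.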

11.
Biochim Biophys Acta Mol Cell Biol Lipids ; 1864(2): 168-180, 2019 02.
Article in English | MEDLINE | ID: mdl-30521938

ABSTRACT

Lipid homeostasis is important for executing normal cellular functions and maintaining physiological conditions. The biophysical properties and intricate metabolic network of lipids underlie the coordinated regulation of different lipid species in lipid homeostasis. To reveal the homeostatic response among different lipids, we systematically knocked down 40 lipid metabolism genes in Drosophila S2 cells by RNAi and profiled the lipidomic changes. Clustering analyses of lipids reveal that many pairs of genes acting in a sequential fashion or sharing the same substrate are tightly clustered. Through a lipid-gene regulatory network analysis, we further found that a reduction of triacylglycerol (TAG) is associated with an increase of phosphatidylinositol (PI) and lysophosphatidylinositol (LPI) or a reduction of hexosyl-ceramide (HexCer) and hydroxylated hexosyl-ceramide (OH-HexCer). Importantly, negative coregulation between TAG and LPI/PI, and positive coregulation between TAG and HexCer, were also found in human HeLa cells. Together, our results reveal coregulations of TAG with PI/LPI and with HexCer in lipid homeostasis.


Subject(s)
Lipids/genetics , Phosphatidylinositols/metabolism , Triglycerides/metabolism , Animals , Cell Line , Ceramides/metabolism , Ceramides/physiology , Drosophila , Gene Regulatory Networks/genetics , HeLa Cells , Homeostasis , Humans , Lipid Metabolism/genetics , Lipids/physiology , Lysophospholipids/metabolism , Signal Transduction , Triglycerides/genetics
12.
Cell Stem Cell ; 24(2): 271-284.e8, 2019 02 07.
Article in English | MEDLINE | ID: mdl-30686763

ABSTRACT

Tissue development results from lineage-specific transcription factors (TFs) programming a dynamic chromatin landscape through progressive cell fate transitions. Here, we define the epigenomic landscape during epidermal differentiation of human pluripotent stem cells (PSCs) and create inference networks that integrate gene expression, chromatin accessibility, and TF binding to define regulatory mechanisms during keratinocyte specification. We found two critical chromatin networks during surface ectoderm initiation and keratinocyte maturation, which are driven by TFAP2C and p63, respectively. Consistently, TFAP2C, but not p63, is sufficient to initiate surface ectoderm differentiation, and TFAP2C-initiated progenitor cells are capable of maturing into functional keratinocytes. Mechanistically, TFAP2C primes the surface ectoderm chromatin landscape and induces p63 expression and binding sites, thus allowing maturation factor p63 to positively autoregulate its own expression and close a subset of the TFAP2C-initiated surface ectoderm program. Our work provides a general framework to infer TF networks controlling chromatin transitions that will facilitate future regenerative medicine advances.


Subject(s)
Cell Lineage , Chromatin/metabolism , Epidermis/metabolism , Gene Regulatory Networks , Transcription Factor AP-2/metabolism , Transcription Factors/metabolism , Tumor Suppressor Proteins/metabolism , Cell Differentiation , Ectoderm/cytology , Epigenesis, Genetic , Feedback, Physiological , Humans , Keratinocytes/cytology , Transcriptome/genetics
14.
BMC Med Genomics ; 8 Suppl 2: S11, 2015.
Article in English | MEDLINE | ID: mdl-26044366

ABSTRACT

Identifying effective biomarkers to battle complex diseases is an important but challenging task in biomedical research today. Molecular data of complex diseases is increasingly abundant due to the rapid advance of high throughput technologies. However, a great gap remains in linking the massive molecular data to phenotypic changes, in particular at a network level; a novel method for identifying network biomarkers is urgently needed to accurately classify and diagnose diseases from molecular data and to shed light on the mechanisms of disease pathogenesis. Rather than seeking differential genes at an individual-molecule level, here we propose a novel method for identifying network biomarkers based on protein-protein interaction affinity (PPIA), which identifies the differential interactions at a network level. Specifically, we first define PPIAs by estimating the concentrations of protein complexes from gene expression data based on the law of mass action. Then we select a small and non-redundant group of protein-protein interactions and single proteins according to the PPIAs that maximizes the ability to discern cases from controls. This method is mathematically formulated as a linear program, which can be efficiently solved and guarantees a globally optimal solution. Extensive results on experimental data in breast cancer demonstrate the effectiveness and efficiency of the proposed method for identifying network biomarkers, which not only can accurately distinguish the phenotypes but also provides significant biological insights at a network or pathway level. In addition, our method provides a new way to integrate static protein-protein interaction information with dynamic gene expression data.


Subject(s)
Biomarkers, Tumor/metabolism , Databases, Protein , Protein Interaction Maps , Statistics as Topic , Algorithms , Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Female , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Humans , Software
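
Note on the law of mass action: the abstract above estimates protein complex concentrations from expression data but does not give the formula. For a single reversible binding A + B ⇌ AB with dissociation constant Kd, the textbook equilibrium condition (A_total - c)(B_total - c) = Kd · c has one physically valid root c, which the sketch below computes. Treating expression levels as proxies for total protein concentration, and the value of `kd`, are assumptions made for this illustration. The paper's actual PPIA definition may differ.

```python
import math

def complex_concentration(a_total, b_total, kd=1.0):
    """Equilibrium concentration of the complex AB for A + B <-> AB.

    Solves (a_total - c) * (b_total - c) = kd * c for the smaller root,
    which is the only one not exceeding either total concentration.
    kd is a hypothetical dissociation constant in the same units.
    """
    s = a_total + b_total + kd
    return (s - math.sqrt(s * s - 4.0 * a_total * b_total)) / 2.0

# Hypothetical expression levels used as stand-ins for total concentrations.
c_balanced = complex_concentration(2.0, 2.0, kd=1.0)
c_limited = complex_concentration(0.5, 10.0, kd=1.0)
```

With both partners at 2.0 and Kd at 1.0, the complex concentration is 1.0. When one partner is scarce, the complex is capped by that partner, so an interaction score of this kind reflects joint abundance rather than either gene alone.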