1.
Nat Biotechnol ; 2024 Oct 11.
Article in English | MEDLINE | ID: mdl-39394483

ABSTRACT

A systematic evaluation of how model architectures and training strategies impact genomics model performance is needed. To address this gap, we held a DREAM Challenge where competitors trained models on a dataset of millions of random promoter DNA sequences and corresponding expression levels, experimentally determined in yeast. For a robust evaluation of the models, we designed a comprehensive suite of benchmarks encompassing various sequence types. All top-performing models used neural networks but diverged in architectures and training strategies. To dissect how architectural and training choices impact performance, we developed the Prix Fixe framework to divide models into modular building blocks. We tested all possible combinations for the top three models, further improving their performance. The DREAM Challenge models not only achieved state-of-the-art results on our comprehensive yeast dataset but also consistently surpassed existing benchmarks on Drosophila and human genomic datasets, demonstrating the progress that can be driven by gold-standard genomics datasets.

2.
bioRxiv ; 2024 Feb 17.
Article in English | MEDLINE | ID: mdl-38405704

ABSTRACT

Neural networks have emerged as immensely powerful tools in predicting functional genomic regions, notably evidenced by recent successes in deciphering gene regulatory logic. However, a systematic evaluation of how model architectures and training strategies impact genomics model performance is lacking. To address this gap, we held a DREAM Challenge where competitors trained models on a dataset of millions of random promoter DNA sequences and corresponding expression levels, experimentally determined in yeast, to best capture the relationship between regulatory DNA and gene expression. For a robust evaluation of the models, we designed a comprehensive suite of benchmarks encompassing various sequence types. While some benchmarks produced similar results across the top-performing models, others differed substantially. All top-performing models used neural networks, but diverged in architectures and novel training strategies, tailored to genomics sequence data. To dissect how architectural and training choices impact performance, we developed the Prix Fixe framework to divide any given model into logically equivalent building blocks. We tested all possible combinations for the top three models and observed performance improvements for each. The DREAM Challenge models not only achieved state-of-the-art results on our comprehensive yeast dataset but also consistently surpassed existing benchmarks on Drosophila and human genomic datasets. Overall, we demonstrate that high-quality gold-standard genomics datasets can drive significant progress in model development.

3.
Sci Rep ; 13(1): 9567, 2023 06 13.
Article in English | MEDLINE | ID: mdl-37311768

ABSTRACT

With the advent of multiplex fluorescence in situ hybridization (FISH) and in situ RNA sequencing technologies, spatial transcriptomics analysis is advancing rapidly, providing spatial location and gene expression information about cells in tissue sections at single-cell resolution. Cell type classification of these spatially-resolved cells can be inferred by matching the spatial transcriptomics data to reference atlases derived from single-cell RNA-sequencing (scRNA-seq) in which cell types are defined by differences in their gene expression profiles. However, robust cell type matching of the spatially-resolved cells to reference scRNA-seq atlases is challenging due to the intrinsic differences in resolution between the spatial and scRNA-seq data. In this study, we systematically evaluated six computational algorithms for cell type matching across four image-based spatial transcriptomics experimental protocols (MERFISH, smFISH, BaristaSeq, and ExSeq) conducted on the same mouse primary visual cortex (VISp) brain region. We find that many cells are assigned as the same type by multiple cell type matching algorithms and are present in spatial patterns previously reported from scRNA-seq studies in VISp. Furthermore, by combining the results of individual matching strategies into consensus cell type assignments, we see even greater alignment with biological expectations. We present two ensemble meta-analysis strategies used in this study and share the consensus cell type matching results in the Cytosplore Viewer (https://viewer.cytosplore.org) for interactive visualization and data exploration. The consensus matching can also guide spatial data analysis using SSAM, allowing segmentation-free cell type assignment.


Subjects
Primary Visual Cortex; Transcriptome; Animals; Mice; In Situ Hybridization, Fluorescence; Gene Expression Profiling; Algorithms
4.
bioRxiv ; 2023 Mar 23.
Article in English | MEDLINE | ID: mdl-36993643

ABSTRACT

Tissue biology involves an intricate balance between cell-intrinsic processes and interactions between cells organized in specific spatial patterns, which can be respectively captured by single-cell profiling methods, such as single-cell RNA-seq (scRNA-seq), and histology imaging data, such as Hematoxylin-and-Eosin (H&E) stains. While single-cell profiles provide rich molecular information, they can be challenging to collect routinely and do not have spatial resolution. Conversely, histological H&E assays have been a cornerstone of tissue pathology for decades, but do not directly report on molecular details, although the observed structure they capture arises from molecules and cells. Here, we leverage adversarial machine learning to develop SCHAF (Single-Cell omics from Histology Analysis Framework), to generate a tissue sample's spatially-resolved single-cell omics dataset from its H&E histology image. We demonstrate SCHAF on two types of human tumors, from lung and metastatic breast cancer, training with matched samples analyzed by both sc/snRNA-seq and by H&E staining. SCHAF generated appropriate single-cell profiles from histology images in test data, related them spatially, and compared well to ground-truth scRNA-Seq, expert pathologist annotations, or direct MERFISH measurements. SCHAF opens the way to next-generation H&E2.0 analyses and an integrated understanding of cell and tissue biology in health and disease.

5.
Nature ; 603(7901): 455-463, 2022 03.
Article in English | MEDLINE | ID: mdl-35264797

ABSTRACT

Mutations in non-coding regulatory DNA sequences can alter gene expression, organismal phenotype and fitness [1-3]. Constructing complete fitness landscapes, in which DNA sequences are mapped to fitness, is a long-standing goal in biology, but has remained elusive because it is challenging to generalize reliably to vast sequence spaces [4-6]. Here we build sequence-to-expression models that capture fitness landscapes and use them to decipher principles of regulatory evolution. Using millions of randomly sampled promoter DNA sequences and their measured expression levels in the yeast Saccharomyces cerevisiae, we learn deep neural network models that generalize with excellent prediction performance, and enable sequence design for expression engineering. Using our models, we study expression divergence under genetic drift and strong-selection weak-mutation regimes to find that regulatory evolution is rapid and subject to diminishing returns epistasis; that conflicting expression objectives in different environments constrain expression adaptation; and that stabilizing selection on gene expression leads to the moderation of regulatory complexity. We present an approach for using such models to detect signatures of selection on expression from natural variation in regulatory sequences and use it to discover an instance of convergent regulatory evolution. We assess mutational robustness, finding that regulatory mutation effect sizes follow a power law, characterize regulatory evolvability, visualize promoter fitness landscapes, discover evolvability archetypes and illustrate the mutational robustness of natural regulatory sequence populations. Our work provides a general framework for designing regulatory sequences and addressing fundamental questions in regulatory evolution.


Subjects
Genetic Drift; Models, Genetic; Biological Evolution; DNA; Evolution, Molecular; Gene Expression Regulation; Mutation/genetics; Phenotype; Saccharomyces cerevisiae/genetics
6.
Nat Med ; 27(3): 546-559, 2021 03.
Article in English | MEDLINE | ID: mdl-33654293

ABSTRACT

Angiotensin-converting enzyme 2 (ACE2) and accessory proteases (TMPRSS2 and CTSL) are needed for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) cellular entry, and their expression may shed light on viral tropism and impact across the body. We assessed the cell-type-specific expression of ACE2, TMPRSS2 and CTSL across 107 single-cell RNA-sequencing studies from different tissues. ACE2, TMPRSS2 and CTSL are coexpressed in specific subsets of respiratory epithelial cells in the nasal passages, airways and alveoli, and in cells from other organs associated with coronavirus disease 2019 (COVID-19) transmission or pathology. We performed a meta-analysis of 31 lung single-cell RNA-sequencing studies with 1,320,896 cells from 377 nasal, airway and lung parenchyma samples from 228 individuals. This revealed cell-type-specific associations of age, sex and smoking with expression levels of ACE2, TMPRSS2 and CTSL. Expression of entry factors increased with age and in males, including in airway secretory cells and alveolar type 2 cells. Expression programs shared by ACE2+TMPRSS2+ cells in nasal, lung and gut tissues included genes that may mediate viral entry, key immune functions and epithelial-macrophage cross-talk, such as genes involved in the interleukin-6, interleukin-1, tumor necrosis factor and complement pathways. Cell-type-specific expression patterns may contribute to the pathogenesis of COVID-19, and our work highlights putative molecular pathways for therapeutic intervention.


Subjects
COVID-19/epidemiology; COVID-19/genetics; Host-Pathogen Interactions/genetics; SARS-CoV-2/physiology; Sequence Analysis, RNA/statistics & numerical data; Single-Cell Analysis/statistics & numerical data; Virus Internalization; Adult; Aged; Aged, 80 and over; Alveolar Epithelial Cells/metabolism; Alveolar Epithelial Cells/virology; Angiotensin-Converting Enzyme 2/genetics; Angiotensin-Converting Enzyme 2/metabolism; COVID-19/pathology; COVID-19/virology; Cathepsin L/genetics; Cathepsin L/metabolism; Datasets as Topic/statistics & numerical data; Demography; Female; Gene Expression Profiling/statistics & numerical data; Humans; Lung/metabolism; Lung/virology; Male; Middle Aged; Organ Specificity/genetics; Respiratory System/metabolism; Respiratory System/virology; Sequence Analysis, RNA/methods; Serine Endopeptidases/genetics; Serine Endopeptidases/metabolism; Single-Cell Analysis/methods
7.
Nat Biotechnol ; 38(10): 1211, 2020 Oct.
Article in English | MEDLINE | ID: mdl-32792646

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

8.
Nat Biotechnol ; 38(1): 56-65, 2020 01.
Article in English | MEDLINE | ID: mdl-31792407

ABSTRACT

How transcription factors (TFs) interpret cis-regulatory DNA sequence to control gene expression remains unclear, largely because past studies using native and engineered sequences had insufficient scale. Here, we measure the expression output of >100 million synthetic yeast promoter sequences that are fully random. These sequences yield diverse, reproducible expression levels that can be explained by their chance inclusion of functional TF binding sites. We use machine learning to build interpretable models of transcriptional regulation that predict ~94% of the expression driven from independent test promoters and ~89% of the expression driven from native yeast promoter fragments. These models allow us to characterize each TF's specificity, activity and interactions with chromatin. TF activity depends on binding-site strand, position, DNA helical face and chromatin context. Notably, expression level is influenced by weak regulatory interactions, which confound designed-sequence studies. Our analyses show that massive-throughput assays of fully random DNA can provide the big data necessary to develop complex, predictive models of gene regulation.


Subjects
Eukaryota/genetics; Gene Expression Regulation; Logic; Promoter Regions, Genetic; Binding Sites; DNA/metabolism; Genes, Reporter; Models, Genetic; Saccharomyces cerevisiae/genetics; Transcription Factors/metabolism
9.
Cell ; 177(7): 1915-1932.e16, 2019 06 13.
Article in English | MEDLINE | ID: mdl-31130381

ABSTRACT

Stroma is a poorly defined non-parenchymal component of virtually every organ with key roles in organ development, homeostasis, and repair. Studies of the bone marrow stroma have defined individual populations in the stem cell niche regulating hematopoietic regeneration and capable of initiating leukemia. Here, we use single-cell RNA sequencing (scRNA-seq) to define a cellular taxonomy of the mouse bone marrow stroma and its perturbation by malignancy. We identified seventeen stromal subsets expressing distinct hematopoietic regulatory genes spanning new fibroblastic and osteoblastic subpopulations including distinct osteoblast differentiation trajectories. Emerging acute myeloid leukemia impaired mesenchymal osteogenic differentiation and reduced regulatory molecules necessary for normal hematopoiesis. These data suggest that tissue stroma responds to malignant cells by disadvantaging normal parenchymal cells. Our taxonomy of the stromal compartment provides a comprehensive bone marrow cell census and experimental support for cancer cell crosstalk with specific stromal elements to impair normal tissue function and thereby enable emergent cancer.


Subjects
Bone Marrow Cells/metabolism; Cell Differentiation; Homeostasis; Leukemia, Myeloid, Acute/metabolism; Osteoblasts/metabolism; Osteogenesis; Tumor Microenvironment; Animals; Bone Marrow Cells/pathology; Humans; Leukemia, Myeloid, Acute/pathology; Mice; Osteoblasts/pathology; Stromal Cells/metabolism; Stromal Cells/pathology
10.
Nat Commun ; 8: 15014, 2017 05 15.
Article in English | MEDLINE | ID: mdl-28504247

ABSTRACT

Sculpting organism shape requires that cells produce forces with proper directionality. Thus, it is critical to understand how cells orient the cytoskeleton to produce forces that deform tissues. During Drosophila gastrulation, actomyosin contraction in ventral cells generates a long, narrow epithelial furrow, termed the ventral furrow, in which actomyosin fibres and tension are directed along the length of the furrow. Using a combination of genetic and mechanical perturbations that alter tissue shape, we demonstrate that geometrical and mechanical constraints act as cues to orient the cytoskeleton and tension during ventral furrow formation. We developed an in silico model of two-dimensional actomyosin meshwork contraction, demonstrating that actomyosin meshworks exhibit an inherent force orienting mechanism in response to mechanical constraints. Together, our in vivo and in silico data provide a framework for understanding how cells orient force generation, establishing a role for geometrical and mechanical patterning of force production in tissues.


Subjects
Actin Cytoskeleton/physiology; Actomyosin/physiology; Cell Shape/physiology; Models, Biological; Animals; Animals, Genetically Modified; Computer Simulation; Drosophila; Embryo, Nonmammalian; Female; Gastrulation/physiology; Intravital Microscopy; Luminescent Proteins/chemistry; Microtubules/physiology; Stress, Physiological/physiology