ABSTRACT
Retrotransposons mediate gene regulation in important developmental and pathological processes. Here, we characterized the transient retrotransposon induction during preimplantation development of eight mammals. Induced retrotransposons exhibit similar preimplantation profiles across species, conferring gene regulatory activities, particularly through long terminal repeat (LTR) retrotransposon promoters. A mouse-specific MT2B2 retrotransposon promoter generates an N-terminally truncated Cdk2ap1ΔN that peaks in preimplantation embryos and promotes proliferation. In contrast, the canonical Cdk2ap1 peaks in mid-gestation and represses cell proliferation. This MT2B2 promoter, whose deletion abolishes Cdk2ap1ΔN production, reduces cell proliferation and impairs embryo implantation, is developmentally essential. Intriguingly, Cdk2ap1ΔN is evolutionarily conserved in sequence and function yet is driven by different promoters across mammals. The distinct preimplantation Cdk2ap1ΔN expression in each mammalian species correlates with the duration of its preimplantation development. Hence, species-specific transposon promoters can yield evolutionarily conserved, alternative protein isoforms, bestowing them with new functions and species-specific expression to govern essential biological divergence.
Subject(s)
Conserved Sequence , Embryonic Development/genetics , Protein Kinases/metabolism , Retroelements/genetics , Tumor Suppressor Proteins/metabolism , Animals , Base Sequence , Blastocyst/metabolism , Cell Proliferation , Evolution, Molecular , Female , Gene Expression Regulation, Developmental , Human Embryonic Stem Cells/metabolism , Humans , Mammals/genetics , Mice, Inbred C57BL , Mice, Knockout , Models, Biological , Promoter Regions, Genetic , Protein Isoforms/metabolism
ABSTRACT
Tissue-resident memory T cells (TRM cells) provide rapid and superior control of localized infections. While the transcription factor Runx3 is a critical regulator of CD8+ T cell tissue residency, its expression is repressed in CD4+ T cells. Here, we show that, as a direct consequence of this Runx3-deficiency, CD4+ TRM cells lacked the transforming growth factor (TGF)-β-responsive transcriptional network that underpins the tissue residency of epithelial CD8+ TRM cells. While CD4+ TRM cell formation required Runx1, this, along with the modest expression of Runx3 in CD4+ TRM cells, was insufficient to engage the TGF-β-driven residency program. Ectopic expression of Runx3 in CD4+ T cells incited this TGF-β-transcriptional network to promote prolonged survival, decreased tissue egress, a microanatomical redistribution towards epithelial layers and enhanced effector functionality. Thus, our results reveal distinct programming of tissue residency in CD8+ and CD4+ TRM cell subsets that is attributable to divergent Runx3 activity.
Subject(s)
Immunologic Memory , CD4-Positive T-Lymphocytes/metabolism , CD8-Positive T-Lymphocytes/metabolism , Transforming Growth Factor beta/metabolism
ABSTRACT
Tissue-resident memory T (TRM) cells are non-recirculating cells that exist throughout the body. Although TRM cells in various organs rely on common transcriptional networks to establish tissue residency, location-specific factors adapt these cells to their tissue of lodgment. Here we analyze TRM cell heterogeneity between organs and find that the different environments in which these cells differentiate dictate TRM cell function, durability and malleability. We find that unequal responsiveness to TGFβ is a major driver of this diversity. Notably, dampened TGFβ signaling results in CD103- TRM cells with increased proliferative potential, enhanced function and reduced longevity compared with their TGFβ-responsive CD103+ TRM counterparts. Furthermore, whereas CD103- TRM cells readily modified their phenotype upon relocation, CD103+ TRM cells were comparatively resistant to transdifferentiation. Thus, despite common requirements for TRM cell development, tissue adaptation of these cells confers discrete functional properties such that TRM cells exist along a spectrum of differentiation potential that is governed by their local tissue microenvironment.
Subject(s)
CD8-Positive T-Lymphocytes/immunology , Cell Differentiation/immunology , Cell Plasticity/immunology , Cellular Microenvironment/immunology , Immunologic Memory/immunology , Animals , Antigens, CD/immunology , CD8-Positive T-Lymphocytes/cytology , Female , Integrin alpha Chains/immunology , Mice , Mice, Inbred C57BL , Mice, Knockout , Signal Transduction/immunology , Transforming Growth Factor beta1/metabolism
ABSTRACT
Tissue-resident memory T (TRM) cells are integral to tissue immunity, persisting in diverse anatomical sites where they adhere to a common transcriptional framework. How these cells integrate distinct local cues to adopt the common TRM cell fate remains poorly understood. Here, we show that whereas skin TRM cells strictly require transforming growth factor β (TGF-β) for tissue residency, those in other locations utilize the metabolite retinoic acid (RA) to drive an alternative differentiation pathway, directing a TGF-β-independent tissue residency program in the liver and synergizing with TGF-β to drive TRM cells in the small intestine. We found that RA was required for the long-term maintenance of intestinal TRM populations, in part by impeding their retrograde migration. Moreover, enhanced RA signaling modulated TRM cell phenotype and function, a phenomenon mirrored in mice with increased microbial diversity. Together, our findings reveal RA as a fundamental component of the host-environment interaction that directs immunosurveillance in tissues.
ABSTRACT
T cell responses are guided by cytokines that induce transcriptional regulators, which ultimately control differentiation of effector and memory T cells. However, it is unknown how the activities of these molecular regulators are coordinated and integrated during the differentiation process. Using genetic approaches and transcriptional profiling of antigen-specific CD8(+) T cells, we reveal a common program of effector differentiation that is regulated by IL-2 and IL-12 signaling and the combined activities of the transcriptional regulators Blimp-1 and T-bet. The loss of both T-bet and Blimp-1 leads to abrogated cytotoxic function and ectopic IL-17 production in CD8(+) T cells. Overall, our data reveal two major overlapping pathways of effector differentiation governed by the availability of Blimp-1 and T-bet and suggest a model for cytokine-induced transcriptional changes that combine, quantitatively and qualitatively, to promote robust effector CD8(+) T cell differentiation.
Subject(s)
CD8-Positive T-Lymphocytes/immunology , Cell Differentiation/immunology , Interleukin-12/immunology , Interleukin-2/immunology , T-Box Domain Proteins/immunology , Transcription Factors/immunology , Animals , Arenaviridae Infections/immunology , Chromatin Immunoprecipitation , Cytokines/immunology , Flow Cytometry , Gene Expression Profiling , Influenza A Virus, H1N1 Subtype , Interleukin-17/immunology , Lymphocytic choriomeningitis virus , Mice , Orthomyxoviridae Infections/immunology , Positive Regulatory Domain I-Binding Factor 1 , Real-Time Polymerase Chain Reaction , STAT4 Transcription Factor/immunology , STAT5 Transcription Factor/immunology , Sequence Analysis, RNA , Signal Transduction
ABSTRACT
Venetoclax, a first-in-class BH3 mimetic drug targeting BCL-2, has improved outcomes for patients with chronic lymphocytic leukemia (CLL). Early measurements of the depth of the venetoclax treatment response, assessed by minimal residual disease, are strong predictors of long-term clinical outcomes. Yet, there are limited data concerning the early changes induced by venetoclax treatment that might inform strategies to improve responses. To address this gap, we conducted longitudinal mass cytometric profiling of blood cells from patients with CLL during the first five weeks of venetoclax monotherapy. At baseline, we resolved CLL heterogeneity at the single-cell level to define multiple subpopulations in all patients distinguished by proliferative, metabolic and cell survival proteins. Venetoclax induced a significant reduction in all CLL subpopulations coincident with rapid upregulation of pro-survival BCL-2, BCL-XL and MCL-1 proteins in surviving cells, which had reduced sensitivity to the drug. Mouse models recapitulated the venetoclax-induced elevation of survival proteins in B cells and CLL-like cells that persisted in vivo, with genetic models demonstrating that extensive apoptosis and access to the B cell cytokine, BAFF, were essential. Accordingly, patients with CLL who were treated with venetoclax or the anti-CD20 antibody obinutuzumab exhibited marked elevation of BAFF and increased pro-survival proteins in leukemic cells that persisted. Overall, these data highlight the rapid adaptation of CLL cells to targeted therapies via homeostatic factors and support co-targeting of cytokine signals to achieve deeper and more durable long-term responses.
ABSTRACT
Cellular omics such as single-cell genomics, proteomics, and microbiomics allow the characterization of tissue and microbial community composition, which can be compared between conditions to identify biological drivers. This strategy has been critical to revealing markers of disease progression, such as cancer and pathogen infection. A dedicated statistical method for differential variability analysis is lacking for cellular omics data, and existing methods for differential composition analysis do not model some compositional data properties, suggesting there is room to improve model performance. Here, we introduce sccomp, a method for differential composition and variability analyses that jointly models data count distribution, compositionality, group-specific variability, and proportion mean-variability association, being aware of outliers. sccomp provides a comprehensive analysis framework that offers realistic data simulation and cross-study knowledge transfer. Here, we demonstrate that mean-variability association is ubiquitous across technologies, highlighting the inadequacy of the very popular Dirichlet-multinomial distribution. We show that sccomp accurately fits experimental data, significantly improving performance over state-of-the-art algorithms. Using sccomp, we identified differential constraints and composition in the microenvironment of primary breast cancer.
Subject(s)
Genomics , Microbiota , Proteomics/methods , Computer Simulation , Algorithms
ABSTRACT
Normalization of single cell RNA-seq data remains a challenging task. The performance of different methods can vary greatly between datasets when unwanted factors and biology are associated. Most normalization methods also only remove the effects of unwanted variation for the cell embedding but not from gene-level data typically used for differential expression (DE) analysis to identify marker genes. We propose RUV-III-NB, a method that can be used to remove unwanted variation from both the cell embedding and gene-level counts. Using pseudo-replicates, RUV-III-NB explicitly takes into account potential association with biology when removing unwanted variation. The method can be used for both UMI or read counts and returns adjusted counts that can be used for downstream analyses such as clustering, DE and pseudotime analyses. Using published datasets with different technological platforms, kinds of biology and levels of association between biology and unwanted variation, we show that RUV-III-NB manages to remove library size and batch effects, strengthen biological signals, improve DE analyses, and lead to results exhibiting greater concordance with independent datasets of the same kind. The performance of RUV-III-NB is consistent and is not sensitive to the number of factors assumed to contribute to the unwanted variation.
Subject(s)
Gene Expression Profiling , Gene Expression Profiling/methods , Gene Library , RNA-Seq , Sequence Analysis, RNA/methods
ABSTRACT
T cell receptor repertoires can be profiled using next generation sequencing (NGS) to measure and monitor adaptive dynamical changes in response to disease and other perturbations. Genomic DNA-based bulk sequencing is cost-effective but necessitates multiplex target amplification using multiple primer pairs with highly variable amplification efficiencies. Here, we utilize an equimolar primer mixture and propose a single statistical normalization step that efficiently corrects for amplification bias post sequencing. Using samples analyzed by both our open protocol and a commercial solution, we show high concordance between bulk clonality metrics. This approach is an inexpensive and open-source alternative to commercial solutions.
Subject(s)
High-Throughput Nucleotide Sequencing , T-Lymphocytes , Base Sequence , Chromosome Mapping , High-Throughput Nucleotide Sequencing/methods , Receptors, Antigen, T-Cell, alpha-beta/genetics
ABSTRACT
Concerted examination of multiple collections of single-cell RNA sequencing (RNA-seq) data promises further biological insights that cannot be uncovered with individual datasets. Here we present scMerge, an algorithm that integrates multiple single-cell RNA-seq datasets using factor analysis of stably expressed genes and pseudoreplicates across datasets. Using a large collection of public datasets, we benchmark scMerge against published methods and demonstrate that it consistently provides improved cell type separation by removing unwanted factors; scMerge can also enhance biological discovery through robust data integration, which we show through the inference of development trajectory in a liver dataset collection.
Subject(s)
Meta-Analysis as Topic , Sequence Analysis, RNA , Single-Cell Analysis , Software , Algorithms , Animals , Embryonic Development , Factor Analysis, Statistical , Gene Expression , Humans , Mice
ABSTRACT
Automated cell type identification is a key computational challenge in single-cell RNA-sequencing (scRNA-seq) data. To capitalise on the large collection of well-annotated scRNA-seq datasets, we developed scClassify, a multiscale classification framework based on ensemble learning and cell type hierarchies constructed from single or multiple annotated datasets as references. scClassify enables the estimation of sample size required for accurate classification of cell types in a cell type hierarchy and allows joint classification of cells when multiple references are available. We show that scClassify consistently performs better than other supervised cell type classification methods across 114 pairs of reference and testing data, representing a diverse combination of sizes, technologies and levels of complexity, and further demonstrate the unique components of scClassify through simulations and compendia of experimental datasets. Finally, we demonstrate the scalability of scClassify on large single-cell atlases and highlight a novel application of identifying subpopulations of cells from the Tabula Muris data that were unidentified in the original publication. Together, scClassify represents state-of-the-art methodology in automated cell type identification from scRNA-seq data.
Subject(s)
Cells/metabolism , Animals , Cluster Analysis , Databases as Topic , Humans , Leukocytes, Mononuclear/metabolism , Machine Learning , Mice , Pancreas/metabolism , Sample Size , Software
ABSTRACT
The Nanostring nCounter gene expression assay uses molecular barcodes and single molecule imaging to detect and count hundreds of unique transcripts in a single reaction. These counts need to be normalized to adjust for the amount of sample, variations in assay efficiency and other factors. Most users adopt the normalization approach described in the nSolver analysis software, which involves background correction based on the observed values of negative control probes, a within-sample normalization using the observed values of positive control probes and normalization across samples using reference (housekeeping) genes. Here we present a new normalization method, Removing Unwanted Variation-III (RUV-III), which makes vital use of technical replicates and suitable control genes. We also propose an approach using pseudo-replicates when technical replicates are not available. The effectiveness of RUV-III is illustrated on four different datasets. We also offer suggestions on the design and analysis of studies involving this technology.
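The three-step conventional normalization summarized in this abstract (background correction from negative control probes, within-sample scaling from positive control probes, across-sample scaling from housekeeping genes) could be sketched roughly as follows. This is an illustrative Python sketch only, not the nSolver or RUV-III implementation: the function names, the use of the mean of negative controls as background, the floor of 1 after subtraction, and the geometric-mean scaling factors are all assumptions made for the example.

```python
from statistics import geometric_mean, mean

def background_correct(counts, negative_probes):
    """Subtract the mean negative-control count (assumed background estimate)."""
    background = mean(counts[p] for p in negative_probes)
    # Floor at 1 (an assumption) so later geometric means stay defined.
    return {probe: max(value - background, 1.0) for probe, value in counts.items()}

def scale_factors(samples, probes):
    """Per-sample factor: average geometric mean across samples / own geometric mean."""
    gms = {name: geometric_mean([counts[p] for p in probes])
           for name, counts in samples.items()}
    target = mean(list(gms.values()))
    return {name: target / gm for name, gm in gms.items()}

def apply_factors(samples, factors):
    """Multiply every probe count in each sample by that sample's factor."""
    return {name: {probe: value * factors[name] for probe, value in counts.items()}
            for name, counts in samples.items()}

def normalize(samples, negative, positive, housekeeping):
    """Hypothetical three-step pipeline: background, positive controls, housekeeping."""
    corrected = {name: background_correct(counts, negative)
                 for name, counts in samples.items()}
    scaled = apply_factors(corrected, scale_factors(corrected, positive))
    return apply_factors(scaled, scale_factors(scaled, housekeeping))

# Toy data: sample B has exactly twice the signal of sample A.
toy = {
    "A": {"NEG": 0, "POS": 100, "HK": 50, "GENE": 10},
    "B": {"NEG": 0, "POS": 200, "HK": 100, "GENE": 20},
}
normalized = normalize(toy, negative=["NEG"], positive=["POS"], housekeeping=["HK"])
```

In this toy case the twofold difference between the samples is treated entirely as technical and removed, which illustrates why a scaling approach of this kind depends on the control probes and reference genes behaving as assumed.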
Subject(s)
Gene Expression Profiling/methods , Adenocarcinoma of Lung/genetics , Adenocarcinoma of Lung/metabolism , Dendritic Cells/metabolism , Humans , Inflammatory Bowel Diseases/genetics , Inflammatory Bowel Diseases/metabolism , Lung Neoplasms/genetics , Lung Neoplasms/metabolism , Lymphocyte Activation/genetics , Single Molecule Imaging
ABSTRACT
Systematic variation in the methylation of cytosines at CpG sites plays a critical role in early development of humans and other mammals. Of particular interest are regions of differential methylation between parental alleles, as these often dictate monoallelic gene expression, resulting in parent of origin specific control of the embryonic transcriptome and subsequent development, in a phenomenon known as genomic imprinting. Using long-read nanopore sequencing we show that, with an average genomic coverage of ∼10, it is possible to determine both the level of methylation of CpG sites and the haplotype from which each read arises. The long-read property is exploited to characterize, using novel methods, both methylation and haplotype for reads that have reduced basecalling precision compared to Sanger sequencing. We validate the analysis both through comparison of nanopore-derived methylation patterns with those from Reduced Representation Bisulfite Sequencing data and through comparison with previously reported data. Our analysis successfully identifies known imprinting control regions (ICRs) as well as some novel differentially methylated regions which, due to their proximity to hitherto unknown monoallelically expressed genes, may represent new ICRs.
Subject(s)
Genome , Genomic Imprinting , Genotyping Techniques , Haplotypes , Sequence Analysis, DNA/statistics & numerical data , Alleles , Animals , Chromosome Mapping , CpG Islands , DNA Methylation , Embryo, Mammalian/chemistry , Embryo, Mammalian/metabolism , Female , High-Throughput Nucleotide Sequencing , Male , Mice , Placenta/chemistry , Placenta/metabolism , Pregnancy
ABSTRACT
BACKGROUND: RNA sequencing allows the study of both gene expression changes and transcribed mutations, providing a highly effective way to gain insight into cancer biology. When planning the sequencing of a large cohort of samples, library size is a fundamental factor affecting both the overall cost and the quality of the results. Here we specifically address how overall library size influences the detection of somatic mutations in RNA-seq data in two acute myeloid leukaemia datasets. RESULTS: We simulated shallower sequencing depths by downsampling 45 acute myeloid leukaemia samples (100 bp PE) that are part of the Leucegene project, which were originally sequenced at high depth. We compared the sensitivity of six methods of recovering validated mutations on the same samples. The methods compared are a combination of three popular callers (MuTect, VarScan, and VarDict) and two filtering strategies. We observed an incremental loss in sensitivity when simulating libraries of 80M, 50M, 40M, 30M and 20M fragments, with the largest loss detected with fewer than 30M fragments (below 90%, average loss of 7%). The sensitivity in recovering insertions and deletions varied markedly between callers, with VarDict showing the highest sensitivity (60%). Single nucleotide variant sensitivity is relatively consistent across methods, apart from MuTect, whose default filters need adjustment when using RNA-Seq. We also analysed 136 RNA-Seq samples from the TCGA-LAML cohort (50 bp PE) and assessed the change in sensitivity between the initial libraries (average 59M fragments) and after downsampling to 40M fragments. When considering single nucleotide variants in recurrently mutated myeloid genes we found a comparable performance, with a 6% average loss in sensitivity using 40M fragments. CONCLUSIONS: Between 30M and 40M 100 bp PE reads are needed to recover 90-95% of the initial variants on recurrently mutated myeloid genes. To extend this result to another cancer type, an exploration of the characteristics of its mutations and gene expression patterns is suggested.
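The two quantities at the heart of this abstract, a downsampled library and the sensitivity of recovering validated variants, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the helper names `downsample_fragments` and `sensitivity`, the variant-key strings, and the toy call sets are hypothetical, and a real analysis would subsample sequencing reads and run variant callers such as those named in the abstract.

```python
import random

def downsample_fragments(fragments, target, seed=0):
    """Return a random subset of `target` fragments (hypothetical helper)."""
    fragments = list(fragments)
    if target >= len(fragments):
        return fragments
    # Seeded generator so the simulated shallower library is reproducible.
    return random.Random(seed).sample(fragments, target)

def sensitivity(called, validated):
    """Fraction of validated variants that appear in the called set."""
    validated = set(validated)
    if not validated:
        raise ValueError("no validated variants to compare against")
    return len(validated & set(called)) / len(validated)

# Toy illustration: variants are keyed as "gene:change" strings (made up).
validated_variants = {"GENE1:c.1A>T", "GENE2:c.2C>G", "GENE3:c.3G>A", "GENE4:c.4T>C"}
calls_full_depth = {"GENE1:c.1A>T", "GENE2:c.2C>G", "GENE3:c.3G>A", "GENE4:c.4T>C"}
calls_downsampled = {"GENE1:c.1A>T", "GENE2:c.2C>G", "GENE3:c.3G>A"}

loss = sensitivity(calls_full_depth, validated_variants) - sensitivity(
    calls_downsampled, validated_variants
)
subset = downsample_fragments(range(100), 30)
```

Here `loss` is the drop in sensitivity between the full and the downsampled call sets, which is the kind of difference the abstract reports as a percentage across library sizes.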
Subject(s)
Gene Library , Polymorphism, Single Nucleotide/genetics , RNA-Seq/methods , Base Sequence , Databases, Genetic , High-Throughput Nucleotide Sequencing , Humans , Neoplasms/genetics
ABSTRACT
The identification of genomic rearrangements with high sensitivity and specificity using massively parallel sequencing remains a major challenge, particularly in precision medicine and cancer research. Here, we describe a new method for detecting rearrangements, GRIDSS (Genome Rearrangement IDentification Software Suite). GRIDSS is a multithreaded structural variant (SV) caller that performs efficient genome-wide break-end assembly prior to variant calling using a novel positional de Bruijn graph-based assembler. By combining assembly, split read, and read pair evidence using a probabilistic scoring, GRIDSS achieves high sensitivity and specificity on simulated, cell line, and patient tumor data, recently winning SV subchallenge #5 of the ICGC-TCGA DREAM8.5 Somatic Mutation Calling Challenge. On human cell line data, GRIDSS halves the false discovery rate compared to other recent methods while matching or exceeding their sensitivity. GRIDSS identifies nontemplate sequence insertions, microhomologies, and large imperfect homologies, estimates a quality score for each breakpoint, stratifies calls into high or low confidence, and supports multisample analysis.
Subject(s)
Gene Rearrangement , Genomics/methods , Software , Cell Line , Computer Simulation , Genome , Genomic Structural Variation , Humans , Neoplasms/genetics , Plasmodium falciparum/genetics , Sensitivity and Specificity
ABSTRACT
MOTIVATION: Dropout is a common phenomenon in single-cell RNA-seq (scRNA-seq) data, and when left unaddressed it affects the validity of the statistical analyses. Despite this, few current methods for differential expression (DE) analysis of scRNA-seq data explicitly model the process that gives rise to the dropout events. We develop DECENT, a method for DE analysis of scRNA-seq data that explicitly and accurately models the molecule capture process in scRNA-seq experiments. RESULTS: We show that DECENT demonstrates improved DE performance over existing DE methods that do not explicitly model dropout. This improvement is consistently observed across several public scRNA-seq datasets generated using different technological platforms. The improvement is especially large when the capture process is overdispersed. DECENT maintains type I error well while achieving better sensitivity. Its performance without spike-ins is almost as good as when spike-ins are used to calibrate the capture model. AVAILABILITY AND IMPLEMENTATION: The method is implemented as an R package publicly available from https://github.com/cz-ye/DECENT. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Single-Cell Analysis , Software , Gene Expression Profiling , RNA-Seq , Sequence Analysis, RNA
ABSTRACT
MOTIVATION: A synoptic view of the human genome benefits chiefly from the application of nucleic acid sequencing and microarray technologies. These platforms allow interrogation of patterns such as gene expression and DNA methylation at the vast majority of canonical loci, allowing granular insights and opportunities for validation of original findings. However, problems arise when validating against a "gold standard" measurement, since this immediately biases all subsequent measurements towards that particular technology or protocol. Since all genomic measurements are estimates, in the absence of a "gold standard" we instead empirically assess the measurement precision and sensitivity of a large suite of genomic technologies via a consensus modelling method called the row-linear model. This method is an application of the American Society for Testing and Materials Standard E691 for assessing interlaboratory precision and sources of variability across multiple testing sites. Both cross-platform and cross-locus comparisons can be made across all common loci, allowing identification of technology- and locus-specific tendencies. RESULTS: We assess technologies including the Infinium MethylationEPIC BeadChip, whole genome bisulfite sequencing (WGBS), two different RNA-Seq protocols (PolyA+ and Ribo-Zero) and five different gene expression array platforms. Each technology thus is characterised herein, relative to the consensus. We showcase a number of applications of the row-linear model, including correlation with known interfering traits. We demonstrate a clear effect of cross-hybridisation on the sensitivity of Infinium methylation arrays. Additionally, we perform a true interlaboratory test on a set of samples interrogated on the same platform across twenty-one separate testing laboratories. 
AVAILABILITY AND IMPLEMENTATION: A full implementation of the row-linear model, plus extra functions for visualisation, is available in the R package consensus at https://github.com/timpeters82/consensus. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Computational Biology , DNA Methylation , Genomics , Genome, Human , Humans , Oligonucleotide Array Sequence Analysis , Software
ABSTRACT
For the reference citation '[57]' in the second paragraph of the Results section of the original article there was no corresponding entry in the References section. It should have referred to the below mentioned article by Ebrahimkhani et al. (2018).
ABSTRACT
PURPOSE: A circulating biomarker has potential to provide more accurate information for glioma progression post treatment; however, no such biomarker is currently available. We aimed to discover a microRNA serum biomarker for longitudinal monitoring of glioma patients. METHODS: A prospectively collected cohort of 91 glioma patients and 17 healthy controls underwent pre- and post-operative serum miRNA profiling using Nanostring®. Differentially expressed miRNAs were discovered using a machine learning random forest analysis. Candidate miRNAs were then assessed by droplet digital PCR in 11 patients with multiple follow-up samples and compared to tumor volume based on magnetic resonance imaging. RESULTS: A 9-gene miRNA signature was identified that could distinguish between glioma and healthy controls with 99.8% accuracy. Two miRNAs, miR-223 and miR-320e, best demonstrated dynamic changes that correlated closely with tumor volume in LGG and GBM respectively. Importantly, miRNA levels did not increase in two cases of pseudo-progression, indicating the potential utility of this test in guiding treatment decisions. CONCLUSIONS: We identified a highly accurate 9-miRNA signature associated with glioma serum. Additionally, we observed dynamic changes in specific miRNAs correlating with tumor volume over long-term follow-up. These results support a large prospective validation study of serum miRNA biomarkers in glioma.
Subject(s)
Biomarkers, Tumor/genetics , Brain Neoplasms/blood , Glioma/blood , MicroRNAs/genetics , Neoplasm Recurrence, Local/blood , Adult , Aged , Aged, 80 and over , Biomarkers, Tumor/blood , Brain Neoplasms/genetics , Brain Neoplasms/pathology , Brain Neoplasms/surgery , Female , Follow-Up Studies , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Glioma/genetics , Glioma/pathology , Glioma/surgery , Humans , Male , MicroRNAs/blood , Middle Aged , Neoplasm Recurrence, Local/genetics , Neoplasm Recurrence, Local/pathology , Neoplasm Recurrence, Local/surgery , Prognosis , Prospective Studies , Young Adult
ABSTRACT
New approaches to lineage tracking have allowed the study of differentiation in multicellular organisms over many generations of cells. Understanding the phenotypic variability observed in these lineage trees requires new statistical methods. Whereas an invariant cell lineage, such as that for the nematode Caenorhabditis elegans, can be described by a lineage map, defined as the pattern of phenotypes overlaid onto the binary tree, a traditional lineage map is static and does not describe the variability inherent in the cell lineages of higher organisms. Here, we introduce lineage variability maps which describe the pattern of second-order variation in lineage trees. These maps can be undirected graphs of the partial correlations between every lineal position, or directed graphs showing the dynamics of bifurcated patterns in each subtree. We show how to infer these graphical models for lineages of any depth from sample sizes of only a few pedigrees. This required developing the generalized spectral analysis for a binary tree, the natural framework for describing tree-structured variation. When tested on pedigrees from C. elegans expressing a marker for pharyngeal differentiation potential, the variability maps recover essential features of the known lineage map. When applied to highly-variable pedigrees monitoring cell size in T lymphocytes, the maps show that most of the phenotype is set by the founder naive T cell. Lineage variability maps thus elevate the concept of the lineage map to the population level, addressing questions about the potency and dynamics of cell lineages and providing a way to quantify the progressive restriction of cell fate with increasing depth in the tree.