Results 1 - 20 of 43
1.
Proc Natl Acad Sci U S A ; 120(1): e2209856120, 2023 01 03.
Article in English | MEDLINE | ID: mdl-36574653

ABSTRACT

Breast cancer (BC) is a complex disease comprising multiple distinct subtypes with different genetic features and pathological characteristics. Although a large number of antineoplastic compounds have been approved for clinical use, patient-to-patient variability in drug response is frequently observed, highlighting the need for efficient treatment prediction for individualized therapy. Several patient-derived models have been established lately for the prediction of drug response. However, each of these models has its limitations that impede their clinical application. Here, we report that the whole-tumor cell culture (WTC) ex vivo model could be stably established from all breast tumors with a high success rate (98 out of 116), and it could reassemble the parental tumors with the endogenous microenvironment. We observed strong clinical associations and predictive values from the investigation of a broad range of BC therapies with WTCs derived from a patient cohort. The accuracy was further supported by the correlation between WTC-based test results and patients' clinical responses in a separate validation study, where the neoadjuvant treatment regimens of 15 BC patients were mimicked. Collectively, the WTC model allows us to accomplish personalized drug testing within 10 d, even for small-sized tumors, highlighting its potential for individualized BC therapy. Furthermore, coupled with genomic and transcriptomic analyses, WTC-based testing can also help to stratify specific patient groups for assignment into appropriate clinical trials, as well as validate potential biomarkers during drug development.


Subjects
Antineoplastic Agents , Breast Neoplasms , Humans , Female , Breast Neoplasms/drug therapy , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Antineoplastic Agents/pharmacology , Antineoplastic Agents/therapeutic use , Gene Expression Profiling , Biomarkers , Cell Culture Techniques , Tumor Microenvironment
2.
Bioinformatics ; 40(5)2024 May 02.
Article in English | MEDLINE | ID: mdl-38676578

ABSTRACT

MOTIVATION: Copy number variations (CNVs) are common genetic alterations in tumour cells. The delineation of CNVs holds promise for enhancing our comprehension of cancer progression. Moreover, accurate inference of CNVs from single-cell sequencing data is essential for unravelling intratumoral heterogeneity. However, existing inference methods face limitations in resolution and sensitivity. RESULTS: To address these challenges, we present CopyVAE, a deep learning framework based on a variational autoencoder architecture. Through experiments, we demonstrated that CopyVAE can accurately and reliably detect CNVs from data obtained using single-cell RNA sequencing. CopyVAE surpasses existing methods in terms of sensitivity and specificity. We also discussed CopyVAE's potential to advance our understanding of genetic alterations and their impact on disease advancement. AVAILABILITY AND IMPLEMENTATION: CopyVAE is implemented and freely available under MIT license at https://github.com/kurtsemih/copyVAE.


Subjects
DNA Copy Number Variations , Single-Cell Analysis , Single-Cell Analysis/methods , Humans , Deep Learning , Software , Transcriptome/genetics , Sequence Analysis, RNA/methods , Neoplasms/genetics
3.
PLoS Comput Biol ; 20(5): e1012094, 2024 May.
Article in English | MEDLINE | ID: mdl-38723024

ABSTRACT

Cell lineage tree reconstruction methods are developed for various tasks, such as investigating development, differentiation, and cancer progression. Single-cell sequencing technologies enable more thorough analysis with higher resolution. We present Scuphr, a distance-based cell lineage tree reconstruction method using bulk and single-cell DNA sequencing data from healthy tissues. Common challenges of single-cell DNA sequencing, such as allelic dropouts and amplification errors, are accounted for in Scuphr. Scuphr computes the distance between cell pairs and reconstructs the lineage tree using the neighbor-joining algorithm. With its embarrassingly parallel design, Scuphr can perform analysis faster than the state-of-the-art methods while obtaining better accuracy. The method's robustness is investigated using various synthetic datasets and a biological dataset of 18 cells.
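
The neighbor-joining step mentioned in this abstract can be illustrated with a minimal sketch. It is a textbook version of the algorithm, not Scuphr's code: the function name `neighbor_joining`, the nested-tuple output, and the four-taxon distance table are assumptions made for illustration, and Scuphr's own cell-pair distance computation is not reproduced. The sketch also keeps joining until two nodes remain, which fixes an arbitrary root on what is really an unrooted tree.

```python
def neighbor_joining(names, dist):
    """Build a tree topology (nested tuples) from pairwise distances.

    names: list of leaf labels; dist: dict of dicts with symmetric distances.
    """
    nodes = list(names)
    d = {a: dict(dist[a]) for a in names}  # working copy, extended with merged nodes
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of every active node.
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Join the pair that minimises the neighbor-joining Q criterion.
        pairs = [(a, b) for i, a in enumerate(nodes) for b in nodes[i + 1:]]
        a, b = min(pairs, key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]])
        merged = (a, b)
        d[merged] = {}
        for c in nodes:
            if c != a and c != b:
                # Distance from the new internal node to every remaining node.
                dist_c = 0.5 * (d[a][c] + d[b][c] - d[a][b])
                d[merged][c] = dist_c
                d[c][merged] = dist_c
        nodes = [c for c in nodes if c != a and c != b] + [merged]
    return (nodes[0], nodes[1])


# Hypothetical additive distances for four cells (illustrative values only).
names = ["A", "B", "C", "D"]
dist = {
    "A": {"B": 5, "C": 7, "D": 11},
    "B": {"A": 5, "C": 8, "D": 12},
    "C": {"A": 7, "B": 8, "D": 6},
    "D": {"A": 11, "B": 12, "C": 6},
}
print(neighbor_joining(names, dist))  # groups A with B and C with D
```

Because each pairwise distance can be computed independently of the others, the distance stage is the part that parallelises naturally; the joining loop above is sequential.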


Subjects
Algorithms , Cell Lineage , Computational Biology , Single-Cell Analysis , Single-Cell Analysis/methods , Cell Lineage/genetics , Humans , Computational Biology/methods , Sequence Analysis, DNA/methods , Software , Models, Statistical
4.
Bioinformatics ; 38(5): 1235-1243, 2022 02 07.
Article in English | MEDLINE | ID: mdl-34718417

ABSTRACT

MOTIVATION: DNA methylation plays a key role in a variety of biological processes. Recently, Nanopore long-read sequencing has enabled direct detection of these modifications. As a consequence, a range of computational methods have been developed to exploit Nanopore data for methylation detection. However, current approaches rely on a human-defined threshold to detect the methylation status of a genomic position and are not optimized to detect sites methylated at low frequency. Furthermore, most methods use either the Nanopore signals or the basecalling errors as the model input and do not take advantage of their combination. RESULTS: Here, we present DeepMP, a convolutional neural network-based model that takes information from Nanopore signals and basecalling errors to detect whether a given motif in a read is methylated or not. Besides, DeepMP introduces a threshold-free position modification calling model sensitive to sites methylated at low frequency across cells. We comprehensively benchmarked DeepMP against state-of-the-art methods on Escherichia coli, human and pUC19 datasets. DeepMP outperforms current approaches at read-based and position-based methylation detection across sites methylated at different frequencies in the three datasets. AVAILABILITY AND IMPLEMENTATION: DeepMP is implemented and freely available under MIT license at https://github.com/pepebonet/DeepMP. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Deep Learning , Nanopore Sequencing , Nanopores , Humans , Software , Sequence Analysis, DNA , High-Throughput Nucleotide Sequencing/methods , Escherichia coli/genetics , DNA/genetics
5.
PLoS Comput Biol ; 18(12): e1010732, 2022 12.
Article in English | MEDLINE | ID: mdl-36469540

ABSTRACT

Identifying the interrelations among cancer driver genes and the patterns in which the driver genes get mutated is critical for understanding cancer. In this paper, we study cross-sectional data from cohorts of tumors to identify the cancer-type (or subtype) specific process in which the cancer driver genes accumulate critical mutations. We model this mutation accumulation process using a tree, where each node includes a driver gene or a set of driver genes. A mutation in each node enables its children to have a chance of mutating. This model simultaneously explains the mutual exclusivity patterns observed in mutations in specific cancer genes (by its nodes) and the temporal order of events (by its edges). We introduce a computationally efficient dynamic programming procedure for calculating the likelihood of our noisy datasets and use it to build our Markov Chain Monte Carlo (MCMC) inference algorithm, ToMExO. Together with a set of engineered MCMC moves, our fast likelihood calculations enable us to work with datasets with hundreds of genes and thousands of tumors, which cannot be dealt with using available cancer progression analysis methods. We demonstrate our method's performance on several synthetic datasets covering various scenarios for cancer progression dynamics. Then, a comparison against two state-of-the-art methods on a moderate-size biological dataset shows the merits of our algorithm in identifying significant and valid patterns. Finally, we present our analyses of several large biological datasets, including colorectal cancer, glioblastoma, and pancreatic cancer. In all the analyses, we validate the results using a set of method-independent metrics testing the causality and significance of the relations identified by ToMExO or competing methods.


Subjects
Glioblastoma , Neoplasms , Child , Humans , Cross-Sectional Studies , Neoplasms/genetics , Neoplasms/pathology , Neoplastic Processes , Algorithms , Monte Carlo Method , Mutation , Glioblastoma/genetics
6.
PLoS Comput Biol ; 16(10): e1008183, 2020 10.
Article in English | MEDLINE | ID: mdl-33035204

ABSTRACT

Identification of mutations of the genes that give cancer a selective advantage is an important step towards research and clinical objectives. As such, there has been a growing interest in developing methods for identification of driver genes and their temporal order within a single patient (intra-tumor) as well as across a cohort of patients (inter-tumor). In this paper, we develop a probabilistic model for tumor progression, in which the driver genes are clustered into several ordered driver pathways. We develop an efficient inference algorithm that exhibits favorable scalability to the number of genes and samples compared to a previously introduced ILP-based method. Adopting a probabilistic approach also allows principled approaches to model selection and uncertainty quantification. Using a large set of experiments on synthetic datasets, we demonstrate our superior performance compared to the ILP-based method. We also analyze two biological datasets of colorectal and glioblastoma cancers. We emphasize that while the ILP-based method puts many seemingly passenger genes in the driver pathways, our algorithm stays focused on true driver genes and outputs more accurate models for cancer progression.


Subjects
Genes, Neoplasm/genetics , Models, Statistical , Neoplasms/genetics , Neoplasms/pathology , Algorithms , Computational Biology , Databases, Genetic , Disease Progression , Humans , Mutation/genetics
7.
Semin Cell Dev Biol ; 79: 123-130, 2018 07.
Article in English | MEDLINE | ID: mdl-29146145

ABSTRACT

Cancer arises when pathways that control cell functions such as proliferation and migration are dysregulated to such an extent that cells start to divide uncontrollably and eventually spread throughout the body, ultimately endangering the survival of an affected individual. It is well established that somatic mutations are important in cancer initiation and progression as well as in creation of tumor diversity. Modifications of the transcriptome are now also emerging as a significant force during the transition from normal cell to malignant tumor. Editing of adenosine (A) to inosine (I) in double-stranded RNA, catalyzed by adenosine deaminases acting on RNA (ADARs), is one dynamic modification that in a combinatorial manner can give rise to a very diverse transcriptome. Since the cell interprets inosine as guanosine (G), editing can result in non-synonymous codon changes in transcripts as well as yield alternative splicing, but also affect targeting and disrupt maturation of microRNA. ADAR editing is essential for survival in mammals but its dysregulation can lead to cancer. ADAR1 is, for instance, overexpressed in lung cancer, liver cancer, esophageal cancer and chronic myelogenous leukemia, which with few exceptions promotes cancer progression. In contrast, ADAR2 is lowly expressed in, e.g., glioblastoma, where the lower levels of ADAR2 editing lead to malignant phenotypes. Altogether, RNA editing by the ADAR enzymes is a powerful regulatory mechanism during tumorigenesis. Depending on the cell type, cancer progression seems to mainly be induced by ADAR1 upregulation or ADAR2 downregulation, although in a few cases ADAR1 is instead downregulated. In this review, we discuss how aberrant editing of specific substrates contributes to malignancy.
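
The statement that inosine is read as guanosine, so that editing can cause non-synonymous codon changes, can be made concrete with a small sketch. It is illustrative only: the function names are hypothetical, codons are written in the DNA alphabet (T instead of U) for convenience, and the two example codons were picked to show one non-synonymous and one synonymous outcome rather than taken from the review.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def edit_codon(codon, position):
    """Return the codon as the cell reads it after A-to-I editing at `position` (0-based)."""
    if codon[position] != "A":
        raise ValueError("only adenosine can be edited to inosine")
    # Inosine is interpreted as guanosine, so the edited base is read as G.
    return codon[:position] + "G" + codon[position + 1:]


def editing_effect(codon, position):
    """Return (amino acid before, amino acid after, whether the change is non-synonymous)."""
    before = CODON_TABLE[codon]
    after = CODON_TABLE[edit_codon(codon, position)]
    return before, after, before != after


print(editing_effect("CAG", 1))  # glutamine codon read as an arginine codon
print(editing_effect("AAA", 2))  # lysine codon still read as lysine
```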


Subjects
Adenosine Deaminase/metabolism , Neoplasms/genetics , RNA Editing , RNA, Double-Stranded/genetics , RNA-Binding Proteins/metabolism , Animals , Disease Progression , Gene Expression Regulation, Neoplastic , Humans , Neoplasms/metabolism , Neoplasms/pathology , RNA Isoforms/genetics , RNA Isoforms/metabolism , RNA, Double-Stranded/metabolism
8.
PLoS Comput Biol ; 13(6): e1005556, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28586362

ABSTRACT

A complex disease has, by definition, multiple genetic causes. In theory, these causes could be identified individually, but their identification will likely benefit from informed use of anticipated interactions between causes. In addition, characterizing and understanding interactions must be considered key to revealing the etiology of any complex disease. Large-scale collaborative efforts are now paving the way for comprehensive studies of interaction. As a consequence, there is a need for methods with a computational efficiency sufficient for modern data sets as well as for improvements of statistical accuracy and power. Another issue is that, currently, the relation between different methods for interaction inference is in many cases not transparent, complicating the comparison and interpretation of results between different interaction studies. In this paper we present computationally efficient tests of interaction for the complete family of generalized linear models (GLMs). The tests can be applied for inference of single or multiple interaction parameters, but we show, by simulation, that jointly testing the full set of interaction parameters yields superior power and control of false positive rate. Based on these tests we also describe how to combine results from multiple independent studies of interaction in a meta-analysis. We investigate the impact of several assumptions commonly made when modeling interactions. We also show that, across the important class of models with a full set of interaction parameters, jointly testing the interaction parameters yields identical results. Further, we apply our method to genetic data for cardiovascular disease. This allowed us to identify a putative interaction involved in Lp(a) plasma levels between two 'tag' variants in the LPA locus (p = 2.42 ⋅ 10⁻⁹) as well as replicate the interaction (p = 6.97 ⋅ 10⁻⁷). Finally, our meta-analysis method is used in a small (N = 16,181) study of interactions in myocardial infarction.


Subjects
Chromosome Mapping/methods , Epistasis, Genetic/genetics , Genetic Association Studies/methods , Genome-Wide Association Study/methods , Linear Models , Models, Genetic , Algorithms , Animals , Humans , Models, Theoretical
9.
PLoS Genet ; 11(9): e1005502, 2015 Sep.
Article in English | MEDLINE | ID: mdl-26402789

ABSTRACT

Despite the success of genome-wide association studies in medical genetics, the underlying genetics of many complex diseases remains enigmatic. One plausible reason for this could be the failure to account for the presence of genetic interactions in current analyses. Exhaustive investigations of interactions are typically infeasible because the vast number of possible interactions imposes hard statistical and computational challenges. There is, therefore, a need for computationally efficient methods that build on models appropriately capturing interaction. We introduce a new methodology where we augment the interaction hypothesis with a set of simpler hypotheses that are tested, in order of their complexity, against a saturated alternative hypothesis representing interaction. This sequential testing provides an efficient way to reduce the number of non-interacting variant pairs before the final interaction test. We devise two different methods, one that relies on a priori estimated numbers of marginally associated variants to correct for multiple tests, and a second that does this adaptively. We show that our methodology in general has an improved statistical power in comparison to seven other methods, and, using the idea of closed testing, that it controls the family-wise error rate. We apply our methodology to genetic data from the PROCARDIS coronary artery disease case/control cohort and discover three distinct interactions. While analyses on simulated data suggest that the statistical power may suffice for an exhaustive search of all variant pairs in ideal cases, we explore strategies for a priori selecting subsets of variant pairs to test. Our new methodology facilitates identification of new disease-relevant interactions from existing and future genome-wide association data, which may involve genes with previously unknown association to the disease. Moreover, it enables construction of interaction networks that provide a systems biology view of complex diseases, serving as a basis for more comprehensive understanding of disease pathophysiology and its clinical consequences.


Subjects
Epistasis, Genetic , Genome-Wide Association Study , Likelihood Functions , Humans , Models, Theoretical
10.
BMC Bioinformatics ; 17(Suppl 14): 431, 2016 Nov 11.
Article in English | MEDLINE | ID: mdl-28185583

ABSTRACT

BACKGROUND: Lateral gene transfer (LGT) is an evolutionary process that has an important role in biology. It challenges the traditional binary tree-like evolution of species and is attracting increasing attention from molecular biologists due to its involvement in antibiotic resistance. A number of attempts have been made to model LGT in the presence of gene duplication and loss, but reliably placing LGT events in the species tree has remained a challenge. RESULTS: In this paper, we propose probabilistic methods that sample reconciliations of the gene tree with a dated species tree and compute maximum a posteriori probabilities. The MCMC-based method uses the probabilistic model DLTRS, which integrates LGT, gene duplication, gene loss, and sequence evolution under a relaxed molecular clock for substitution rates. We can estimate posterior distributions on gene trees and, in contrast to previous work, the actual placement of potential LGT, which can be used to, e.g., identify "highways" of LGT. CONCLUSIONS: Based on a simulation study, we conclude that the method is able to infer the true LGT events on the gene tree and reconcile it to the correct edges on the species tree in most cases. Applied to two biological datasets, containing gene families from Cyanobacteria and Mollicutes, we find potential LGT highways that corroborate other studies as well as previously undetected examples.


Subjects
Gene Transfer, Horizontal/genetics , Models, Genetic , Biological Evolution , Entomoplasmataceae/classification , Entomoplasmataceae/genetics , Phylogeny
11.
Mol Biol Evol ; 32(9): 2469-82, 2015 Sep.
Article in English | MEDLINE | ID: mdl-25963975

ABSTRACT

Species tree reconstruction has been a subject of substantial research due to its central role across biology and medicine. A species tree is often reconstructed using a set of gene trees or by directly using sequence data. In either of these cases, one of the main confounding phenomena is the discordance between a species tree and a gene tree due to evolutionary events such as duplications and losses. Probabilistic methods can resolve the discordance by coestimating gene trees and the species tree but this approach poses a scalability problem for larger data sets. We present MixTreEM-DLRS: A two-phase approach for reconstructing a species tree in the presence of gene duplications and losses. In the first phase, MixTreEM, a novel structural expectation maximization algorithm based on a mixture model is used to reconstruct a set of candidate species trees, given sequence data for monocopy gene families from the genomes under study. In the second phase, PrIME-DLRS, a method based on the DLRS model (Åkerborg O, Sennblad B, Arvestad L, Lagergren J. 2009. Simultaneous Bayesian gene tree reconstruction and reconciliation analysis. Proc Natl Acad Sci U S A. 106(14):5714-5719), is used for selecting the best species tree. PrIME-DLRS can handle multicopy gene families since DLRS, apart from modeling sequence evolution, models gene duplication and loss using a gene evolution model (Arvestad L, Lagergren J, Sennblad B. 2009. The gene evolution model and computing its associated probabilities. J ACM. 56(2):1-44). We evaluate MixTreEM-DLRS using synthetic and biological data, and compare its performance with a recent genome-scale species tree reconstruction method PHYLDOG (Boussau B, Szöllosi GJ, Duret L, Gouy M, Tannier E, Daubin V. 2013. Genome-scale coestimation of species and gene trees. Genome Res. 23(2):323-330) as well as with a fast parsimony-based algorithm Duptree (Wehe A, Bansal MS, Burleigh JG, Eulenstein O. 2008. Duptree: a program for large-scale phylogenetic analyses using gene tree parsimony. Bioinformatics 24(13):1540-1541). Our method is competitive with PHYLDOG in terms of accuracy and runs significantly faster and our method outperforms Duptree in accuracy. The analysis constituted by MixTreEM without DLRS may also be used for selecting the target species tree, yielding a fast and yet accurate algorithm for larger data sets. MixTreEM is freely available at http://prime.scilifelab.se/mixtreem/.


Subjects
Models, Genetic , Phylogeny , Algorithms , Animals , Cluster Analysis , Genes , Humans
12.
Syst Biol ; 64(6): 969-82, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26130236

ABSTRACT

Orthology analysis, that is, finding out whether a pair of homologous genes are orthologs - stemming from a speciation - or paralogs - stemming from a gene duplication - is of central importance in computational biology, genome annotation, and phylogenetic inference. In particular, an orthologous relationship makes functional equivalence of the two genes highly likely. A major approach to orthology analysis is to reconcile a gene tree to the corresponding species tree (most commonly performed using the most parsimonious reconciliation, MPR). However, most such phylogenetic orthology methods infer the gene tree without considering the constraints implied by the species tree and, perhaps even more importantly, only allow the gene sequences to influence the orthology analysis through the a priori reconstructed gene tree. We propose a sound, comprehensive Bayesian Markov chain Monte Carlo-based method, DLRSOrthology, to compute orthology probabilities. It efficiently sums over the possible gene trees and jointly takes into account the current gene tree, all possible reconciliations to the species tree, and the, typically strong, signal conveyed by the sequences. We compare our method with PrIME-GEM, a probabilistic orthology approach built on a probabilistic duplication-loss model, and MrBayesMPR, a probabilistic orthology approach that is based on conventional Bayesian inference coupled with MPR. We find that DLRSOrthology outperforms these competing approaches on synthetic data as well as on biological data sets and is robust to incomplete taxon sampling artifacts.
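
As background for the reconciliation-based approach this abstract contrasts itself with, the following is a minimal sketch of the most parsimonious reconciliation via last-common-ancestor mapping. It is not DLRSOrthology and involves no probabilities. The nested-tuple tree encoding, the function names, and the toy gene family are assumptions made for illustration. A gene-tree node is labelled a duplication when it maps to the same species-tree node as one of its children, and a speciation otherwise; two genes are then called orthologs when their last common ancestor in the gene tree is a speciation.

```python
def clades_of(species_tree):
    """Return the leaf set (frozenset) under every node of a nested-tuple species tree."""
    clades = []

    def visit(node):
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = visit(node[0]) | visit(node[1])
        clades.append(leaves)
        return leaves

    visit(species_tree)
    return clades


def reconcile(gene_tree, species_tree, gene_to_species):
    """Label each internal gene-tree node (post-order) as duplication or speciation."""
    clades = clades_of(species_tree)
    events = []

    def visit(node):
        if isinstance(node, str):
            return frozenset([gene_to_species[node]])
        left, right = visit(node[0]), visit(node[1])
        # LCA mapping: the smallest species clade containing both child mappings.
        mapped = min((c for c in clades if left | right <= c), key=len)
        kind = "duplication" if mapped in (left, right) else "speciation"
        events.append((kind, sorted(mapped)))
        return mapped

    visit(gene_tree)
    return events


# Hypothetical example: a duplication before the human/mouse split.
species_tree = (("human", "mouse"), "chicken")
gene_tree = ((("h1", "m1"), ("h2", "m2")), "c1")
gene_to_species = {"h1": "human", "h2": "human", "m1": "mouse", "m2": "mouse", "c1": "chicken"}
for event in reconcile(gene_tree, species_tree, gene_to_species):
    print(event)
```

In this toy family h1 and m1 come out as orthologs, while h1 and m2 are paralogs because their common ancestor is the duplication node. The sketch depends entirely on a fixed gene tree, which is the limitation the abstract points out.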


Subjects
Classification/methods , Phylogeny , Algorithms , Computer Simulation , Sequence Homology , Software
13.
BMC Genomics ; 16 Suppl 10: S12, 2015.
Article in English | MEDLINE | ID: mdl-26449131

ABSTRACT

Over the last decade, methods have been developed for the reconstruction of gene trees that take into account the species tree. Many of these methods have been based on the probabilistic duplication-loss model, which describes how a gene tree evolves over a species tree with respect to duplications and losses, as well as extensions of this model, e.g., the DLRS (Duplication, Loss, Rate and Sequence evolution) model that also includes sequence evolution under a relaxed molecular clock. A disjoint, almost as recent, and very important line of research has been focused on non-protein-coding, yet functional, DNA. For instance, DNA sequences that are pseudogenes in the sense that they are not translated may still be transcribed, and the RNA thereby produced may be functional.


Subjects
DNA/genetics , Evolution, Molecular , Phylogeny , Pseudogenes/genetics , Gene Duplication
14.
Genome Res ; 22(8): 1477-87, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22645261

ABSTRACT

Adenosine-to-inosine (A-to-I) RNA editing targets double-stranded RNA stem-loop structures in the mammalian brain. It has previously been shown that miRNAs are substrates for A-to-I editing. For the first time, we show that for several definitions of edited miRNA, the level of editing increases with development, thereby indicating a regulatory role for editing during brain maturation. We use high-throughput RNA sequencing to determine editing levels in mature miRNA, from the mouse transcriptome, and compare these with the levels of editing in pri-miRNA. We show that increased editing during development gradually changes the proportions of the two miR-376a isoforms, which previously have been shown to have different targets. Several other miRNAs that also are edited in the seed sequence show an increased level of editing through development. By comparing editing of pri-miRNA with editing and expression of the corresponding mature miRNA, we also show an editing-induced developmental regulation of miRNA expression. Taken together, our results imply that RNA editing influences the miRNA repertoire during brain maturation.


Subjects
Adenosine/metabolism , Brain/metabolism , Gene Expression Regulation, Developmental , Inosine/metabolism , MicroRNAs/metabolism , RNA Editing , Adenosine/genetics , Animals , Base Sequence , Brain/embryology , Brain/growth & development , Computational Biology , Dendrites/genetics , Dendrites/metabolism , Embryo, Mammalian/metabolism , Embryonic Development/genetics , High-Throughput Nucleotide Sequencing , Inosine/genetics , Mice , MicroRNAs/genetics , RNA Isoforms/genetics , RNA Isoforms/metabolism , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism , Transcriptome
15.
Syst Biol ; 63(3): 409-20, 2014 May.
Article in English | MEDLINE | ID: mdl-24562812

ABSTRACT

Lateral gene transfer (LGT)--which transfers DNA between two non-vertically related individuals belonging to the same or different species--is recognized as a major force in prokaryotic evolution, and evidence of its impact on eukaryotic evolution is ever increasing. LGT has attracted much public attention for its potential to transfer pathogenic elements and antibiotic resistance in bacteria, and to transfer pesticide resistance from genetically modified crops to other plants. In a wider perspective, there is a growing body of studies highlighting the role of LGT in enabling organisms to occupy new niches or adapt to environmental changes. The challenge LGT poses to the standard tree-based conception of evolution is also being debated. Studies of LGT have, however, been severely limited by a lack of computational tools. The best currently available LGT algorithms are parsimony-based phylogenetic methods, which require a pre-computed gene tree and cannot choose between sometimes wildly differing most parsimonious solutions. Moreover, in many studies, simple heuristics are applied that can only handle putative orthologs and completely disregard gene duplications (GDs). Consequently, proposed LGT among specific gene families, and the rate of LGT in general, remain debated. We present a Bayesian Markov-chain Monte Carlo-based method that integrates GD, gene loss, LGT, and sequence evolution, and apply the method in a genome-wide analysis of two groups of bacteria: Mollicutes and Cyanobacteria. Our analyses show that although the LGT rate between distant species is high, the net combined rate of duplication and close-species LGT is on average higher. We also show that the common practice of disregarding reconcilability in gene tree inference overestimates the number of LGT and duplication events.


Subjects
Classification/methods , Gene Transfer, Horizontal , Bayes Theorem , Cyanobacteria/classification , Cyanobacteria/genetics , Evolution, Molecular , Models, Theoretical , Phylogeny , Tenericutes/classification , Tenericutes/genetics
16.
Cell Syst ; 15(2): 149-165.e10, 2024 Feb 21.
Article in English | MEDLINE | ID: mdl-38340731

ABSTRACT

Cell types can be classified according to shared patterns of transcription. Non-genetic variability among individual cells of the same type has been ascribed to stochastic transcriptional bursting and transient cell states. Using high-coverage single-cell RNA profiling, we asked whether long-term, heritable differences in gene expression can impart diversity within cells of the same type. Studying clonal human lymphocytes and mouse brain cells, we uncovered a vast diversity of heritable gene expression patterns among different clones of cells of the same type in vivo. We combined chromatin accessibility and RNA profiling on different lymphocyte clones to reveal thousands of regulatory regions exhibiting interclonal variation, which could be directly linked to interclonal variation in gene expression. Our findings identify a source of cellular diversity, which may have important implications for how cellular populations are shaped by selective processes in development, aging, and disease. A record of this paper's transparent peer review process is included in the supplemental information.


Subjects
Chromatin , RNA , Humans , Mice , Animals , Aging , Gene Expression
17.
BMC Bioinformatics ; 14: 209, 2013 Jun 27.
Article in English | MEDLINE | ID: mdl-23803001

ABSTRACT

BACKGROUND: PrIME-GenPhyloData is a suite of tools for creating realistic simulated phylogenetic trees, in particular for families of homologous genes. It supports generation of trees based on a birth-death process and--perhaps more interestingly--also supports generation of gene family trees guided by a known (synthetic or biological) species tree while accounting for events such as gene duplication, gene loss, and lateral gene transfer (LGT). The suite also supports a wide range of branch rate models enabling relaxation of the molecular clock. RESULT: Simulated data created with PrIME-GenPhyloData can be used for benchmarking phylogenetic approaches, or for characterizing models or model parameters with respect to biological data. CONCLUSION: The concept of tree-in-tree evolution can also be used to model, for instance, biogeography or host-parasite co-evolution.
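
The birth-death process named in this abstract can be sketched in a much reduced form. The code below is not PrIME-GenPhyloData and does not build trees: it only tracks how many lineages are alive over time under a linear birth-death process, using standard event-by-event (Gillespie-style) simulation. The function name, parameter names, and the rate values in the usage lines are assumptions made for illustration.

```python
import random


def simulate_lineage_count(birth_rate, death_rate, max_time, rng, n0=1):
    """Simulate a linear birth-death process and return the lineage count at max_time.

    Each lineage independently splits at `birth_rate` and goes extinct at
    `death_rate` (both per unit time).
    """
    n, t = n0, 0.0
    while n > 0:
        total_rate = n * (birth_rate + death_rate)
        if total_rate == 0:
            break  # nothing can happen any more
        # Waiting time to the next event is exponential in the total rate.
        t += rng.expovariate(total_rate)
        if t > max_time:
            break
        # Choose the event type in proportion to its rate.
        if rng.random() < birth_rate / (birth_rate + death_rate):
            n += 1
        else:
            n -= 1
    return n


rng = random.Random(42)  # fixed seed so repeated runs give the same draws
counts = [simulate_lineage_count(1.0, 0.5, 3.0, rng) for _ in range(1000)]
print(sum(counts) / len(counts))  # empirical mean number of surviving lineages
```

A tree generator would additionally record which lineage each event hits and when, and a gene-family simulator of the kind described would run such a process inside the branches of a species tree.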


Subjects
Gene Duplication/genetics , Multigene Family/genetics , Phylogeny , Biological Clocks/genetics , Computer Simulation , Evolution, Molecular , Gene Transfer Techniques , Humans , Models, Biological , Species Specificity
18.
BMC Bioinformatics ; 14 Suppl 15: S10, 2013.
Article in English | MEDLINE | ID: mdl-24564421

ABSTRACT

Gene duplication is considered to be a major driving force in evolution that enables the genome of a species to acquire new functions. A reconciliation--a mapping of gene tree vertices to the edges or vertices of a species tree--explains where gene duplications have occurred on the species tree. In this study, we sample reconciliations from a posterior over reconciliations, gene trees, edge lengths and other parameters, given a species tree and gene sequences. We employ a Bayesian analysis tool, based on the probabilistic model DLRS that integrates gene duplication, gene loss and sequence evolution under a relaxed molecular clock for substitution rates, to obtain this posterior.


Subjects
Genome , Vertebrates/genetics , Algorithms , Animals , Bayes Theorem , Evolution, Molecular , Gene Duplication , Humans , Phylogeny
19.
BMC Bioinformatics ; 14: 334, 2013 Nov 20.
Article in English | MEDLINE | ID: mdl-24255987

ABSTRACT

BACKGROUND: Distance methods are ubiquitous tools in phylogenetics. Their primary purpose may be to reconstruct evolutionary history, but they are also used as components in bioinformatic pipelines. However, poor computational efficiency has been a constraint on the applicability of distance methods on very large problem instances. RESULTS: We present fastphylo, a software package containing implementations of efficient algorithms for two common problems in phylogenetics: estimating DNA/protein sequence distances and reconstructing a phylogeny from a distance matrix. We compare fastphylo with other neighbor joining based methods and report the results in terms of speed and memory efficiency. CONCLUSIONS: Fastphylo is a fast, memory efficient, and easy to use software suite. Due to its modular architecture, fastphylo is a flexible tool for many phylogenetic studies.
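
The first of the two problems named here, estimating a distance between two aligned DNA sequences, can be illustrated with the textbook Jukes-Cantor (JC69) correction. This is a generic sketch and makes no claim about which substitution models or algorithms fastphylo implements. The function name and the example sequences are hypothetical, and skipping sites with gaps or ambiguous characters is an assumption of this sketch.

```python
import math


def jc69_distance(seq_a, seq_b):
    """Jukes-Cantor distance (expected substitutions per site) between aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only sites where both sequences carry an unambiguous nucleotide.
    sites = [(x, y) for x, y in zip(seq_a.upper(), seq_b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not sites:
        raise ValueError("no comparable sites")
    p = sum(x != y for x, y in sites) / len(sites)  # observed proportion of differences
    if p >= 0.75:
        return math.inf  # saturated: the correction is undefined
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


print(jc69_distance("ACGTACGTAC", "ACGTACGTAA"))  # one difference in ten sites
```

Applying such a function to every pair of sequences yields the distance matrix that a neighbor-joining style method then turns into a phylogeny.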


Subjects
Computational Biology/instrumentation , Computational Biology/methods , Phylogeny , Algorithms , Amino Acid Sequence , Biological Evolution , Language , Memory , Multigene Family , Software
20.
Bioinformatics ; 28(22): 2994-5, 2012 Nov 15.
Article in English | MEDLINE | ID: mdl-22982573

ABSTRACT

SUMMARY: PrIME-DLRS (or colloquially: 'Delirious') is a phylogenetic software tool to simultaneously infer and reconcile a gene tree given a species tree. It accounts for duplication and loss events, a relaxed molecular clock and is intended for the study of homologous gene families, for example in a comparative genomics setting involving multiple species. PrIME-DLRS uses a Bayesian MCMC framework, where the input is a known species tree with divergence times and a multiple sequence alignment, and the output is a posterior distribution over gene trees and model parameters. AVAILABILITY AND IMPLEMENTATION: PrIME-DLRS is available for Java SE 6+ under the New BSD License, and JAR files and source code can be downloaded from http://code.google.com/p/jprime/. There is also a slightly older C++ version available as a binary package for Ubuntu, with download instructions at http://prime.sbc.su.se. The C++ source code is available upon request. CONTACT: joel.sjostrand@scilifelab.se or jens.lagergren@scilifelab.se. SUPPLEMENTARY INFORMATION: PrIME-DLRS is based on a sound probabilistic model (Åkerborg et al., 2009) and has been thoroughly validated on synthetic and biological datasets (Supplementary Material online).


Subjects
Evolution, Molecular , Phylogeny , Software , Algorithms , Animals , Bayes Theorem , Models, Statistical , Programming Languages , Sequence Alignment