Results 1 - 20 of 131
1.
Genome Res ; 31(4): 635-644, 2021 04.
Article in English | MEDLINE | ID: mdl-33602693

ABSTRACT

The COVID-19 pandemic has sparked an urgent need to uncover the underlying biology of this devastating disease. Though RNA viruses mutate more rapidly than DNA viruses, there are a relatively small number of single nucleotide polymorphisms (SNPs) that differentiate the main SARS-CoV-2 lineages that have spread throughout the world. In this study, we investigated 129 RNA-seq data sets and 6928 consensus genomes to contrast the intra-host and inter-host diversity of SARS-CoV-2. Our analyses yielded three major observations. First, the mutational profile of SARS-CoV-2 highlights intra-host single nucleotide variant (iSNV) and SNP similarity, albeit with differences in C > U changes. Second, iSNV and SNP patterns in SARS-CoV-2 are more similar to MERS-CoV than SARS-CoV-1. Third, a significant fraction of insertions and deletions contribute to the genetic diversity of SARS-CoV-2. Altogether, our findings provide insight into SARS-CoV-2 genomic diversity, inform the design of detection tests, and highlight the potential of iSNVs for tracking the transmission of SARS-CoV-2.


Subjects
COVID-19/diagnosis, COVID-19/transmission, Genetic Variation, Viral Genome, Real-Time Polymerase Chain Reaction/methods, SARS-CoV-2/genetics, COVID-19/virology, Host-Pathogen Interactions, Humans, Single Nucleotide Polymorphism
2.
Nat Rev Genet ; 19(11): 733, 2018 11.
Article in English | MEDLINE | ID: mdl-30283054

ABSTRACT

The originally published article contained errors in Fig. 1 (Decision tree for the selection of a suitable NGS genomic simulator), whereby the labels 'Variants' and 'No Variants' had been switched. The correct figure is presented in this notice.

3.
Bioinformatics ; 38(Suppl 1): i195-i202, 2022 06 24.
Article in English | MEDLINE | ID: mdl-35758771

ABSTRACT

MOTIVATION: Single-nucleotide variants (SNVs) are the most common variations in the human genome. Recently developed methods for SNV detection from single-cell DNA sequencing data, such as SCIΦ and scVILP, leverage the evolutionary history of the cells to overcome the technical errors associated with single-cell sequencing protocols. Despite being accurate, these methods are not scalable to the extensive genomic breadth of single-cell whole-genome (scWGS) and whole-exome sequencing (scWES) data. RESULTS: Here, we report on a new scalable method, Phylovar, which extends the phylogeny-guided variant calling approach to sequencing datasets containing millions of loci. Through benchmarking on simulated datasets under different settings, we show that Phylovar outperforms SCIΦ in terms of running time while being more accurate than Monovar (which is not phylogeny-aware) in terms of SNV detection. Furthermore, we applied Phylovar to two real biological datasets: an scWES triple-negative breast cancer dataset consisting of 32 cells and 3375 loci, and an scWGS dataset of neuron cells from a normal human brain containing 16 cells and approximately 2.5 million loci. For the cancer data, Phylovar detected somatic SNVs with high or moderate functional impact that were also supported by a bulk sequencing dataset, and for the neuron dataset, Phylovar identified 5745 SNVs with non-synonymous effects, some of which were associated with neurodegenerative diseases. AVAILABILITY AND IMPLEMENTATION: Phylovar is implemented in Python and is publicly available at https://github.com/NakhlehLab/Phylovar.


Subjects
High-Throughput Nucleotide Sequencing, Nucleotides, Human Genome, High-Throughput Nucleotide Sequencing/methods, Humans, Phylogeny, DNA Sequence Analysis
4.
Genomics ; 114(2): 110315, 2022 03.
Article in English | MEDLINE | ID: mdl-35181467

ABSTRACT

Human mitochondria can be genetically distinct within the same individual, a phenomenon known as heteroplasmy. In cancer, this phenomenon seems exacerbated, and most mitochondrial mutations seem to be heteroplasmic. How this genetic variation is arranged within and among normal and tumor cells is not well understood. To address this question, here we sequenced single-cell mitochondrial genomes from multiple normal and tumoral locations in four colorectal cancer patients. Our results suggest that single cells, both normal and tumoral, can carry various mitochondrial haplotypes. Remarkably, this intra-cell heteroplasmy can arise before tumor development and be maintained afterward in specific tumoral cell subpopulations. At least in the colorectal patients studied here, the somatic mutations in the single cells do not seem to have a prominent role in tumorigenesis.


Subjects
Colorectal Neoplasms, Mitochondrial DNA, Colorectal Neoplasms/genetics, Mitochondrial DNA/genetics, Haplotypes, Heteroplasmy, Humans, Mitochondria/genetics
5.
Genomics ; 114(6): 110500, 2022 11.
Article in English | MEDLINE | ID: mdl-36202322

ABSTRACT

The genomic profiling of circulating tumor cells (CTCs) in the bloodstream should provide clinically relevant information on therapeutic efficacy and help predict cancer survival. Here, we contrasted the genomic profiles of CTC pools recovered from metastatic colorectal cancer (mCRC) patients using different enrichment strategies (CellSearch, Parsortix, and FACS). Mutations inferred in the CTC pools differed depending on the enrichment strategy and, in all cases, represented a subset of the mutations detected in the matched primary tumor samples. However, the CTC pools from Parsortix, and in part, CellSearch, showed diversity estimates, mutational signatures, and drug-suitability scores remarkably close to those found in matching primary tumor samples. In addition, FACS CTC pools were enriched in apparent sequencing artifacts, leading to much higher genomic diversity estimates. Our results highlight the utility of CTCs to assess the genomic heterogeneity of individual tumors and help clinicians prioritize drugs in mCRC.


Subjects
Colorectal Neoplasms, Circulating Neoplastic Cells, Humans, Genomics, Colorectal Neoplasms/genetics
6.
Nat Rev Genet ; 17(8): 459-69, 2016 08.
Article in English | MEDLINE | ID: mdl-27320129

ABSTRACT

Computer simulation of genomic data has become increasingly popular for assessing and validating biological models or for gaining an understanding of specific data sets. Several computational tools for the simulation of next-generation sequencing (NGS) data have been developed in recent years, which could be used to compare existing and new NGS analytical pipelines. Here we review 23 of these tools, highlighting their distinct functionality, requirements and potential applications. We also provide a decision tree for the informed selection of an appropriate NGS simulation tool for the specific question at hand.


Subjects
Computational Biology/methods, Genomics/methods, High-Throughput Nucleotide Sequencing/methods, Mutation/genetics, Animals, Computer Simulation, Human Genome, Humans, Software
7.
Mol Biol Evol ; 37(5): 1535-1542, 2020 05 01.
Article in English | MEDLINE | ID: mdl-32027371

ABSTRACT

Our capacity to study individual cells has enabled a new level of resolution for understanding complex biological systems such as multicellular organisms or microbial communities. Not surprisingly, several methods have been developed in recent years with a formidable potential to investigate the somatic evolution of single cells in both healthy and pathological tissues. However, single-cell sequencing data can be quite noisy due to different technical biases, so inferences resulting from these new methods need to be carefully contrasted. Here, I introduce CellCoal, a software tool for the coalescent simulation of single-cell sequencing genotypes. CellCoal simulates the history of single-cell samples obtained from somatic cell populations with different demographic histories and produces single-nucleotide variants under a variety of mutation models, sequencing read counts, and genotype likelihoods, considering allelic imbalance, allelic dropout, amplification, and sequencing errors, typical of this type of data. CellCoal is a flexible tool that can be used to understand the implications of different somatic evolutionary processes at the single-cell level, and to benchmark dedicated bioinformatic tools for the analysis of single-cell sequencing data. CellCoal is available at https://github.com/dapogon/cellcoal.


Subjects
Genetic Techniques, Genotype, Single-Cell Analysis, Software, Molecular Evolution, DNA Sequence Analysis
8.
Mol Biol Evol ; 37(1): 291-294, 2020 Jan 01.
Article in English | MEDLINE | ID: mdl-31432070

ABSTRACT

ModelTest-NG is a reimplementation from scratch of jModelTest and ProtTest, two popular tools for selecting the best-fit nucleotide and amino acid substitution models, respectively. ModelTest-NG is one to two orders of magnitude faster than jModelTest and ProtTest but equally accurate, and it introduces several new features, such as ascertainment bias correction, mixture and free-rate models, and the automatic processing of single partitions. ModelTest-NG is available under a GNU GPL3 license at https://github.com/ddarriba/modeltest, last accessed September 2, 2019.


Subjects
Amino Acid Substitution, Molecular Evolution, Genetic Techniques, Genetic Models, Software
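
As a rough illustration of what selecting a best-fit substitution model can involve, the sketch below ranks candidate models by two widely used information criteria, AIC and BIC, computed from maximized log-likelihoods. The log-likelihood values, parameter counts, and alignment length are made-up placeholders, and none of this is ModelTest-NG's own code or interface.

```python
import math

def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 lnL (lower is better)."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n_sites):
    """Bayesian information criterion: k ln(n) - 2 lnL (lower is better)."""
    return k * math.log(n_sites) - 2 * log_lik

def rank_models(fits, n_sites):
    """Return (name, AIC, BIC) tuples sorted by BIC, best model first."""
    scored = [
        (name, aic(lnl, k), bic(lnl, k, n_sites))
        for name, (lnl, k) in fits.items()
    ]
    return sorted(scored, key=lambda row: row[2])

# Hypothetical fits: model name -> (maximized log-likelihood, free model parameters).
# Branch lengths are left out of k because they are shared by all models on a
# fixed topology and therefore do not change the ranking.
example_fits = {
    "JC69": (-5230.4, 0),
    "HKY85": (-5101.7, 4),
    "GTR": (-5098.9, 8),
}

ranking = rank_models(example_fits, n_sites=1200)
for name, aic_score, bic_score in ranking:
    print(f"{name:6s} AIC={aic_score:.1f} BIC={bic_score:.1f}")
```

With these placeholder numbers, the extra parameters of the richest model do not buy enough likelihood to offset the penalty, which is the trade-off that model selection tools automate.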
9.
J Mol Evol ; 89(3): 134-145, 2021 04.
Article in English | MEDLINE | ID: mdl-33438113

ABSTRACT

In 1981, the Journal of Molecular Evolution (JME) published an article entitled "Evolutionary trees from DNA sequences: A maximum likelihood approach" by Joseph (Joe) Felsenstein (J Mol Evol 17:368-376, 1981). This groundbreaking work laid the foundation for the emerging field of statistical phylogenetics, providing a tractable way of finding maximum likelihood (ML) estimates of evolutionary trees from DNA sequence data. This paper is the second most cited (more than 9000 citations) in JME after Kimura's (J Mol Evol 16:111-120, 1980) seminal paper on a model of nucleotide substitution (with nearly 20,000 citations). On the occasion of the 50th anniversary of JME, we elaborate on the significance of Felsenstein's ML approach to estimating phylogenetic trees.


Subjects
Molecular Evolution, Base Sequence, Likelihood Functions, Phylogeny
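
To give a feel for how the likelihood of a tree can be computed from DNA sequence data, here is a minimal sketch of the recursion commonly known as Felsenstein's pruning algorithm, applied to a single alignment column. The Jukes-Cantor (JC69) substitution model, the taxon names, the tree, and the branch lengths are all assumptions chosen for brevity; this is an illustration, not a reproduction of the original paper's model or program.

```python
import math

BASES = "ACGT"

def jc69_prob(i, j, t):
    """JC69 probability that base i is replaced by base j along a branch of
    length t (expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e

def partial_likelihoods(node, site):
    """Conditional likelihood of the data below `node` for each state at `node`.

    A leaf is a taxon name (str); an internal node is a list of
    (child, branch_length) pairs. `site` maps taxon name -> observed base.
    """
    if isinstance(node, str):
        return [1.0 if b == site[node] else 0.0 for b in BASES]
    result = [1.0] * len(BASES)
    for child, t in node:
        child_l = partial_likelihoods(child, site)
        for x, parent_base in enumerate(BASES):
            # Sum over the child's possible states, weighted by transition probability.
            result[x] *= sum(
                jc69_prob(parent_base, child_base, t) * child_l[y]
                for y, child_base in enumerate(BASES)
            )
    return result

def site_likelihood(tree, site):
    """Likelihood of one alignment column, with equal base frequencies at the root."""
    return sum(0.25 * l for l in partial_likelihoods(tree, site))

# Hypothetical rooted tree ((t1:0.1, t2:0.2):0.05, t3:0.3)
tree = [([("t1", 0.1), ("t2", 0.2)], 0.05), ("t3", 0.3)]
lik = site_likelihood(tree, {"t1": "G", "t2": "G", "t3": "T"})
print(lik)
```

The likelihood of a full alignment would be the product of such per-site values (in practice a sum of logarithms), and a maximum likelihood search would then optimize the branch lengths and topology.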
10.
Theor Popul Biol ; 142: 1-11, 2021 12.
Article in English | MEDLINE | ID: mdl-34563554

ABSTRACT

A coalescent model of a sample of size n is derived from a birth-death process that originates at a random time in the past from a single founder individual. Over time, the descendants of the founder evolve into a population of large (infinite) size from which a sample of size n is taken. The parameters and time of the birth-death process are scaled in N0, the size of the present-day population, while letting N0→∞, similarly to how the standard Kingman coalescent process arises from the Wright-Fisher model. The model is named the Limit Birth-Death (LBD) coalescent model. Simulations from the LBD coalescent model with sample size n are computationally slow compared to standard coalescent models. Therefore, we suggest different approximations to the LBD coalescent model assuming the population size is a deterministic function of time rather than a stochastic process. Furthermore, we introduce a hybrid LBD coalescent model that combines the exactness of the LBD coalescent model with the speed of the approximations.


Subjects
Population Genetics, Genetic Models, Population Density, Sample Size, Stochastic Processes
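
For context, the standard Kingman coalescent that the abstract uses as its point of comparison can be simulated in a few lines. The sketch below draws the waiting times between coalescent events for a sample of size n under a constant population size; it does not implement the LBD model or its approximations, and the function names and seed are illustrative assumptions.

```python
import random

def kingman_coalescence_times(n, rng):
    """Waiting times between successive coalescent events for a sample of size n.

    While k lineages remain, the time to the next coalescence is exponentially
    distributed with rate k(k-1)/2, measured in coalescent time units.
    """
    times = []
    for k in range(n, 1, -1):
        rate = k * (k - 1) / 2
        times.append(rng.expovariate(rate))
    return times

def mean_tmrca(n, replicates, seed=1):
    """Average time to the most recent common ancestor over many replicates."""
    rng = random.Random(seed)
    total = sum(sum(kingman_coalescence_times(n, rng)) for _ in range(replicates))
    return total / replicates

# Theory gives E[TMRCA] = 2 * (1 - 1/n), i.e. 1.8 for n = 10.
print(mean_tmrca(10, 20000))
```

Each replicate needs only n - 1 exponential draws, which is why standard coalescent simulations serve as the fast baseline in this kind of comparison.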
11.
BMC Biol ; 18(1): 116, 2020 09 07.
Article in English | MEDLINE | ID: mdl-32895052

ABSTRACT

BACKGROUND: Colorectal cancer (CRC) development is generally accepted as a sequential process, with genetic mutations determining phenotypic tumor progression. However, matching genetic profiles with histological transition requires the analyses of temporal samples from the same patient at key stages of progression. RESULTS: Here, we compared the genetic profiles of 34 early carcinomas with their respective adenomatous precursors to assess timing and heterogeneity of driver alterations accompanying the switch from benign adenoma to malignant carcinoma. In almost half of the cases, driver mutations specific to the carcinoma stage were not observed. In samples where carcinoma-specific alterations were present, TP53 mutations and chromosome 20 copy gains commonly accompanied the switch from adenomatous tissue to carcinoma. Remarkably, 40% and 50% of high-grade adenomas shared TP53 mutations and chromosome 20 gains, respectively, with their matched carcinomas. In addition, multi-regional analyses revealed greater heterogeneity of driver mutations in adenomas compared to their matched carcinomas. CONCLUSION: Genetic alterations in TP53 and chromosome 20 occur at the earliest histological stage in colorectal carcinomas (pTis and pT1). However, high-grade adenomas can share these alterations despite their histological distinction. Based on the well-defined sequence of CRC development, we suggest that the timing of genetic changes during neoplastic progression is frequently uncoupled from histological progression.


Subjects
Adenoma/pathology, Carcinoma/pathology, Neoplastic Cell Transformation/pathology, Colorectal Neoplasms/pathology, Mutation, Adenoma/genetics, Carcinoma/genetics, Colorectal Neoplasms/genetics, Disease Progression, Humans
12.
J Mol Evol ; 88(3): 211-226, 2020 04.
Article in English | MEDLINE | ID: mdl-32060574

ABSTRACT

A collection of the editors of Journal of Molecular Evolution have gotten together to pose a set of key challenges and future directions for the field of molecular evolution. Topics include challenges and new directions in prebiotic chemistry and the RNA world, reconstruction of early cellular genomes and proteins, macromolecular and functional evolution, evolutionary cell biology, genome evolution, molecular evolutionary ecology, viral phylodynamics, theoretical population genomics, somatic cell molecular evolution, and directed evolution. While our effort is not meant to be exhaustive, it reflects research questions and problems in the field of molecular evolution that are exciting to our editors.


Subjects
Molecular Evolution, Origin of Life, RNA/genetics, Ecology, Population Genetics, Genome, Periodicals as Topic, Proteins/genetics, Genetic Selection
13.
Bioinformatics ; 34(14): 2506-2507, 2018 07 15.
Article in English | MEDLINE | ID: mdl-29534152

ABSTRACT

Motivation: Advances in sequencing technologies have made it feasible to obtain massive datasets for phylogenomic inference, often consisting of large numbers of loci from multiple species and individuals. The phylogenomic analysis of next-generation sequencing (NGS) data requires a complex computational pipeline where multiple technical and methodological decisions are necessary that can influence the final tree obtained, like those related to coverage, assembly, mapping, variant calling and/or phasing. Results: To assess the influence of these variables we introduce NGSphy, an open-source tool for the simulation of Illumina reads/read counts obtained from haploid/diploid individual genomes with thousands of independent gene families evolving under a common species tree. In order to resemble real NGS experiments, NGSphy includes multiple options to model sequencing coverage (depth) heterogeneity across species, individuals and loci, including off-target or uncaptured loci. For comprehensive simulations covering multiple evolutionary scenarios, parameter values for the different replicates can be sampled from user-defined statistical distributions. Availability and implementation: Source code, full documentation and tutorials including a 'Getting started' guide are available at http://github.com/merlyescalona/ngsphy. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Computer Simulation, Genome, High-Throughput Nucleotide Sequencing/methods, Phylogeny, DNA Sequence Analysis/methods, Software, Biological Evolution, Genomics/methods
14.
Bioinformatics ; 34(21): 3646-3652, 2018 11 01.
Article in English | MEDLINE | ID: mdl-29762653

ABSTRACT

Motivation: A reconciliation is an annotation of the nodes of a gene tree with evolutionary events (for example, speciation, gene duplication, transfer, or loss), along with a mapping onto a species tree. Many algorithms and software packages produce or use reconciliations, but often in different reconciliation formats that vary in the types of events considered or in whether the species tree is dated. This complicates comparison and communication between different programs. Results: Here, we gather a consortium of software developers in gene tree/species tree reconciliation to propose and endorse a format that aims to promote an integrative, albeit flexible, specification of phylogenetic reconciliations. This format, named recPhyloXML, is accompanied by several tools such as a reconciled tree visualizer and conversion utilities. Availability and implementation: http://phylariane.univ-lyon1.fr/recphyloxml/.


Subjects
Molecular Evolution, Gene Duplication, Algorithms, Phylogeny, Software
15.
Syst Biol ; 65(3): 397-416, 2016 May.
Article in English | MEDLINE | ID: mdl-25281847

ABSTRACT

Current phylogenomic data sets highlight the need for species tree methods able to deal with several sources of gene tree/species tree incongruence. At the same time, we need to make the most of all available data. Most species tree methods deal with single processes of phylogenetic discordance, namely, gene duplication and loss, incomplete lineage sorting (ILS) or horizontal gene transfer. In this manuscript, we address the problem of species tree inference from multilocus, genome-wide data sets regardless of the presence of gene duplication and loss and ILS, and therefore without the need to identify orthologs or to use a single individual per species. We do this by extending the idea of Maximum Likelihood (ML) supertrees to a hierarchical Bayesian model where several sources of gene tree/species tree disagreement can be accounted for in a modular manner. We implemented this model in a computer program called guenomu whose inputs are posterior distributions of unrooted gene tree topologies for multiple gene families, and whose output is the posterior distribution of rooted species tree topologies. We conducted extensive simulations to evaluate the performance of our approach in comparison with other species tree approaches able to deal with more than one leaf from the same species. Our method ranked best under simulated data sets, in spite of ignoring branch lengths, and performed well on empirical data, as well as being fast enough to analyze relatively large data sets. Our Bayesian supertree method was also very successful in obtaining better estimates of gene trees, by reducing the uncertainty in their distributions. In addition, our results show that under complex simulation scenarios, gene tree parsimony is also a competitive approach once we consider its speed, in contrast to more sophisticated models.


Subjects
Classification/methods, Genetic Models, Phylogeny, Bayes Theorem, Computer Simulation, Genome/genetics, Software
16.
Syst Biol ; 65(2): 334-44, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26526427

ABSTRACT

We present a fast and flexible software package, SimPhy, for the simulation of multiple gene families evolving under incomplete lineage sorting, gene duplication and loss, and horizontal gene transfer (all three potentially leading to species tree/gene tree discordance), as well as gene conversion. SimPhy implements a hierarchical phylogenetic model in which the evolution of species, locus, and gene trees is governed by global and local parameters (e.g., genome-wide, species-specific, locus-specific) that can be fixed or sampled from a priori statistical distributions. SimPhy also incorporates comprehensive models of substitution rate variation among lineages (uncorrelated relaxed clocks) and the capability of simulating partitioned nucleotide, codon, and protein multilocus sequence alignments under a plethora of substitution models using the program INDELible. We validate SimPhy's output using theoretical expectations and other programs, and show that it scales extremely well with complex models and/or large trees, being an order of magnitude faster than the most similar program (DLCoal-Sim). In addition, we demonstrate how SimPhy can be useful to understand interactions among different evolutionary processes, conducting a simulation study to characterize the systematic overestimation of the duplication time when using standard reconciliation methods. SimPhy is available at https://github.com/adamallo/SimPhy, where users can find the source code, precompiled executables, a detailed manual and example cases.


Subjects
Classification/methods, Computer Simulation, Phylogeny, Software/standards, Genes/genetics, Genetic Loci/genetics, Genetic Speciation, Reproducibility of Results
17.
Mol Biol Evol ; 32(4): 1109-12, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25577191

ABSTRACT

The estimation of substitution and recombination rates can provide important insights into the molecular evolution of protein-coding sequences. Here, we present a new computational framework, called "CodABC," to jointly estimate recombination, substitution and synonymous and nonsynonymous rates from coding data. CodABC uses approximate Bayesian computation with and without regression adjustment and implements a variety of codon models, intracodon recombination, and longitudinal sampling. CodABC can provide accurate joint parameter estimates from recombining coding sequences, often outperforming maximum-likelihood methods based on more approximate models. In addition, CodABC allows for the inclusion of several nuisance parameters such as those representing codon frequencies, transition matrices, heterogeneity across sites or invariable sites. CodABC is freely available from http://code.google.com/p/codabc/, includes a GUI, extensive documentation and ready-to-use examples, and can run in parallel on multicore machines.


Subjects
Computer Simulation, Mutation Rate, Open Reading Frames/genetics, Genetic Recombination, Bayes Theorem, Likelihood Functions, Software
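
The general idea behind approximate Bayesian computation without regression adjustment can be sketched as a rejection sampler: draw a parameter from the prior, simulate data, and keep the draw only if a summary statistic of the simulation is close to the observed one. The toy simulator, the uniform prior, the tolerance, and every name below are illustrative assumptions; this is not CodABC's model, code, or interface.

```python
import random

def simulate_differences(rate, n_sites, rng):
    """Toy simulator: count the sites (out of n_sites) that differ, each site
    changing independently with probability `rate`."""
    return sum(1 for _ in range(n_sites) if rng.random() < rate)

def abc_rejection(observed, n_sites, n_draws, tolerance, seed=1):
    """Keep prior draws whose simulated summary statistic falls within
    `tolerance` of the observed value."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        rate = rng.uniform(0.0, 0.2)  # prior: Uniform(0, 0.2), an arbitrary choice
        simulated = simulate_differences(rate, n_sites, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(rate)
    return accepted

# Hypothetical observation: 25 differing sites out of 500.
posterior = abc_rejection(observed=25, n_sites=500, n_draws=5000, tolerance=2)
posterior_mean = sum(posterior) / len(posterior)
print(len(posterior), posterior_mean)
```

The accepted draws approximate the posterior distribution of the rate. A regression adjustment, as mentioned in the abstract, would additionally correct the accepted values for the remaining gap between simulated and observed summaries.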
18.
BMC Genomics ; 16: 728, 2015 Sep 24.
Article in English | MEDLINE | ID: mdl-26400066

ABSTRACT

BACKGROUND: The Mediterranean mussel (Mytilus galloprovincialis) is a cosmopolitan, cultured bivalve with worldwide commercial and ecological importance. However, there is a qualitative and quantitative lack of knowledge of the molecular mechanisms involved in the physiology and immune response of this mollusc. In order to start filling this gap, we have studied the transcriptome of mantle, muscle and gills from naïve Mediterranean mussels and hemocytes exposed to distinct stimuli. RESULTS: A total of 393,316 million raw RNA-Seq reads were obtained and assembled into 151,320 non-redundant transcripts with an average length of 570 bp. Only 55% of the transcripts were shared across all tissues. Hemocyte and gill transcriptomes shared 60% of the transcripts while mantle and muscle transcriptomes were most similar, with 77% shared transcripts. Stimulated hemocytes showed abundant defense and immune-related proteins, in particular, an extremely high amount of antimicrobial peptides. Gills expressed many transcripts assigned to both structure and recognition of non-self patterns, while in mantle many transcripts were related to reproduction and shell formation. Moreover, this tissue presented additional and interesting hematopoietic, antifungal and sensorial functions. Finally, muscle expressed many myofibril and calcium-related proteins and was found to be unexpectedly associated with defense functions. In addition, many metabolic routes related to cancer were represented. CONCLUSIONS: Our analyses indicate that whereas the transcriptomes of these four tissues have characteristic expression profiles in agreement with their biological structures and expected functions, tissue-specific transcriptomes reveal complex and specialized functions.


Subjects
Gills, Hemocytes, High-Throughput Nucleotide Sequencing/methods, Transcriptome/genetics, Animals, Gene Expression Regulation, Mytilus/genetics, Tissue Distribution/genetics
19.
Mol Biol Evol ; 31(5): 1295-301, 2014 May.
Article in English | MEDLINE | ID: mdl-24557445

ABSTRACT

Genomic evolution can be highly heterogeneous. Here, we introduce a new framework to simulate genome-wide sequence evolution under a variety of substitution models that may change along the genome and the phylogeny, following complex multispecies coalescent histories that can include recombination, demographics, longitudinal sampling, population subdivision/species history, and migration. A key aspect of our simulation strategy is that the heterogeneity of the whole evolutionary process can be parameterized according to statistical prior distributions specified by the user. We used this framework to carry out a study of the impact of variable codon frequencies across genomic regions on the estimation of the genome-wide nonsynonymous/synonymous ratio. We found that both variable codon frequencies across genes and rate variation among sites and regions can lead to severe underestimation of the global dN/dS values. The program SGWE (Simulation of Genome-Wide Evolution) is freely available from http://code.google.com/p/sgwe-project/, including extensive documentation and detailed examples.


Subjects
Molecular Evolution, Genome, Genetic Models, Codon/genetics, Computer Simulation, Sequence Alignment, Software
20.
Bioinformatics ; 30(9): 1310-1, 2014 May 01.
Article in English | MEDLINE | ID: mdl-24451621

ABSTRACT

The selection of models of nucleotide substitution is one of the major steps of modern phylogenetic analysis. Different tools exist to accomplish this task, among which jModelTest 2 (jMT2) is one of the most popular. Still, to deal with large DNA alignments with hundreds or thousands of loci, users of jMT2 need access to High Performance Computing clusters, including installation and configuration capabilities, conditions that are not always met. Here we present jmodeltest.org, a novel web server for the transparent execution of jMT2 across different platforms and for a wide range of users. Its main benefit is straightforward execution, which avoids configuration/execution issues and, in most cases, significantly reduces the time required to complete the analysis.


Subjects
Nucleotides/genetics, Cluster Analysis, Internet, Genetic Models, Phylogeny, Software