Results 1 - 10 of 10
1.
Microb Genom ; 10(5), 2024 May.
Article in English | MEDLINE | ID: mdl-38785221

ABSTRACT

Wastewater-based surveillance (WBS) is an important epidemiological and public health tool for tracking pathogens across the scale of a building, neighbourhood, city, or region. WBS gained widespread adoption globally during the SARS-CoV-2 pandemic for estimating community infection levels by qPCR. Sequencing pathogen genes or genomes from wastewater adds information about pathogen genetic diversity, which can be used to identify viral lineages (including variants of concern) that are circulating in a local population. Capturing the genetic diversity by WBS sequencing is not trivial, as wastewater samples often contain a diverse mixture of viral lineages with real mutations and sequencing errors, which must be deconvoluted computationally from short sequencing reads. In this study we assess nine different computational tools that have recently been developed to address this challenge. We simulated 100 wastewater sequence samples consisting of SARS-CoV-2 BA.1, BA.2, and Delta lineages, in various mixtures, as well as a Delta-Omicron recombinant and a synthetic 'novel' lineage. Most tools performed well in identifying the true lineages present and estimating their relative abundances and were generally robust to variation in sequencing depth and read length. While many tools identified lineages present down to 1 % frequency, results were more reliable above a 5 % threshold. The presence of an unknown synthetic lineage, which represents an unclassified SARS-CoV-2 lineage, increases the error in relative abundance estimates of other lineages, but the magnitude of this effect was small for most tools. The tools also varied in how they labelled novel synthetic lineages and recombinants. While our simulated dataset represents just one of many possible use cases for these methods, we hope it helps users understand potential sources of error or bias in wastewater sequencing analysis and to appreciate the commonalities and differences across methods.


Subjects
COVID-19, Viral Genome, SARS-CoV-2, Wastewater, Wastewater/virology, SARS-CoV-2/genetics, SARS-CoV-2/classification, COVID-19/virology, COVID-19/epidemiology, Humans, Computational Biology/methods, Genomics/methods, Wastewater-Based Epidemiological Monitoring, Phylogeny
2.
BMC Bioinformatics ; 24(1): 439, 2023 Nov 22.
Article in English | MEDLINE | ID: mdl-37990302

ABSTRACT

BACKGROUND: Cancer is a collection of diseases caused by the deregulation of cell processes, which is triggered by somatic mutations. The search for patterns in somatic mutations, known as mutational signatures, is a growing field of study that has already become a useful tool in oncology. Several algorithms have been proposed to perform one or both of the following tasks: (1) de novo estimation of signatures and their exposures; (2) estimation of the exposures of each of a set of pre-defined signatures. RESULTS: Our group developed signeR, a Bayesian approach to both of these tasks. Here we present a new version of the software, signeR 2.0, which extends the possibilities of previous analyses to explore the relation of signature exposures to other data of clinical relevance. signeR 2.0 includes a user-friendly interface developed using the R-Shiny framework and improvements in performance. This version allows the analysis of submitted data or public TCGA data, which is embedded in the package for easy access. CONCLUSION: signeR 2.0 is a valuable tool to generate and explore exposure data from both de novo and fitting analyses, and is an open-source R package available through the Bioconductor project at https://doi.org/10.18129/B9.bioc.signeR.


Subjects
Neoplasms, Humans, Bayes Theorem, Neoplasms/genetics, Mutation, Software, Algorithms
3.
Bioinformatics ; 38(7): 1809-1815, 2022 Mar 28.
Article in English | MEDLINE | ID: mdl-35104309

ABSTRACT

MOTIVATION: Despite the fast development of highly effective vaccines to control the current COVID-19 pandemic, the unequal distribution and availability of these vaccines worldwide and the number of people infected in the world lead to the continuous emergence of Severe Acute Respiratory Syndrome coronavirus 2 (SARS-CoV-2) variants of concern. Therefore, it is likely that real-time genomic surveillance will be continuously needed as an unceasing monitoring tool, necessary to follow the spread of the disease and the evolution of the virus. In this context, new genomic variants of SARS-CoV-2, including variants refractory to current vaccines, make genomic surveillance programs tools of utmost importance. Nevertheless, the lack of appropriate analytical tools to quickly and effectively assess the viral composition in meta-transcriptomic sequencing data, including environmental surveillance, represents a possible challenge that may impact the fast adoption of this approach to mitigate the spread and transmission of viruses. RESULTS: We propose a statistical model for the estimation of the relative frequencies of SARS-CoV-2 variants in pooled samples. This model is built by considering a previously defined selection of genomic polymorphisms that characterize SARS-CoV-2 variants. The methods described here support both raw sequencing reads for polymorphism-based marker calling and predefined markers in the variant call format. Results obtained using simulated data show that our method is quite effective in recovering the correct variant proportions. Further, results obtained by considering longitudinal data from wastewater samples of two locations in Switzerland agree well with those describing the epidemiological evolution of COVID-19 variants in clinical samples of these locations. Our results show that the described method can be a valuable tool for tracking the proportions of SARS-CoV-2 variants in complex mixtures such as wastewater and environmental samples.
AVAILABILITY AND IMPLEMENTATION: http://github.com/rvalieris/LCS. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
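
The idea of estimating variant proportions in a pooled sample from marker polymorphisms can be illustrated with a minimal sketch. This is not the model implemented in LCS; it is a generic expectation-maximization (EM) fit of a mixture, written under stated assumptions: each marker site has an alt-allele read count and a depth, and each variant has a known probability of showing the alt allele at each site. The function name `estimate_proportions` and the `profiles` layout are hypothetical.

```python
def estimate_proportions(alt_counts, depths, profiles, n_iter=200):
    """EM estimate of variant proportions in a pooled sample.

    alt_counts[j], depths[j]: alt-allele reads and total reads at marker j.
    profiles[k][j]: assumed probability that a read from variant k shows
    the alt allele at marker j (hypothetical values, strictly between 0
    and 1 so that every read can be assigned to some variant).
    """
    n_variants = len(profiles)
    pi = [1.0 / n_variants] * n_variants  # start from a uniform mixture
    total_reads = float(sum(depths))
    for _ in range(n_iter):
        expected = [0.0] * n_variants  # expected reads assigned to each variant
        for j, (alt, depth) in enumerate(zip(alt_counts, depths)):
            ref = depth - alt
            # E-step: posterior weight of each variant for alt and ref reads
            w_alt = [pi[k] * profiles[k][j] for k in range(n_variants)]
            w_ref = [pi[k] * (1.0 - profiles[k][j]) for k in range(n_variants)]
            s_alt, s_ref = sum(w_alt), sum(w_ref)
            for k in range(n_variants):
                if s_alt > 0:
                    expected[k] += alt * w_alt[k] / s_alt
                if s_ref > 0:
                    expected[k] += ref * w_ref[k] / s_ref
        # M-step: proportions are the normalized expected read counts
        pi = [e / total_reads for e in expected]
    return pi


# Toy example: two variants, four marker sites, 1% assumed error rate.
profiles = [[0.99, 0.99, 0.01, 0.01],
            [0.01, 0.01, 0.99, 0.99]]
proportions = estimate_proportions([696, 696, 304, 304], [1000] * 4, profiles)
```

In this toy input the counts were constructed from a 70/30 mixture, so the fitted proportions should land close to those values; real data would also need a treatment of sequencing error and marker selection that this sketch leaves out.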


Subjects
COVID-19, SARS-CoV-2, Humans, SARS-CoV-2/genetics, Gene Expression Profiling, Genomics
4.
Bioinformatics ; 33(1): 8-16, 2017 Jan 1.
Article in English | MEDLINE | ID: mdl-27591080

ABSTRACT

MOTIVATION: Mutational signatures can be used to understand cancer origins and provide a unique opportunity to group tumor types that share the same origins and result from similar processes. These signatures have been identified from high-throughput sequencing data generated from cancer genomes by using non-negative matrix factorisation (NMF) techniques. Current methods based on optimization techniques are strongly sensitive to initial conditions due to the high dimensionality and nonconvexity of the NMF paradigm. In this context, an important question is the determination of the actual number of signatures that best represent the data. The extraction of mutational signatures from high-throughput data remains a daunting task. RESULTS: Here we present a new method for the statistical estimation of mutational signatures based on an empirical Bayesian treatment of the NMF model. While requiring minimal intervention from the user, our method addresses the determination of the number of signatures directly as a model selection problem. In addition, we introduce two new concepts of significant clinical relevance for evaluating the mutational profile. The advantages brought by our approach are shown by the analysis of real and synthetic data. The latter is used to compare our approach against two alternative methods mostly used in the literature and with the same NMF parametrization as the one considered here. Our approach is robust to initial conditions and more accurate than competing alternatives. It also estimates the correct number of signatures even when other methods fail. Results on real data agree well with current knowledge. AVAILABILITY AND IMPLEMENTATION: signeR is implemented in R and C++, and is available as an R package at http://bioconductor.org/packages/signeR. CONTACT: itojal@cipe.accamargo.org.br. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
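
For readers unfamiliar with the NMF model that this abstract builds on, the following is a minimal sketch of the classical optimization-based approach (Lee-Seung multiplicative updates), not the empirical Bayesian method of signeR. It factorizes a non-negative count matrix V into signatures W and exposures H so that V is approximately W times H. All function names are hypothetical, and the rank is assumed to be given rather than selected by the model.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(r) for r in zip(*m)]


def frob_error(v, w, h):
    """Squared Frobenius distance between V and the product W H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw))


def nmf(v, rank, n_iter=500, seed=0, eps=1e-12):
    """Lee-Seung multiplicative updates for V ~ W H with W, H >= 0."""
    rng = random.Random(seed)  # the result depends on this random start
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# Toy data: 4 mutation types x 3 samples built from 2 made-up signatures.
signatures = [[5.0, 0.0], [3.0, 1.0], [0.0, 4.0], [1.0, 2.0]]
exposures = [[2.0, 0.5, 1.0], [1.0, 3.0, 0.0]]
counts = matmul(signatures, exposures)
w_hat, h_hat = nmf(counts, rank=2)
```

Because the updates start from a random point and the objective is nonconvex, different seeds can give different factorizations, which is the sensitivity to initial conditions that the abstract describes.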


Subjects
DNA Mutational Analysis/methods, Mutation, Neoplasms/genetics, Software, Algorithms, Animals, Bayes Theorem, High-Throughput Nucleotide Sequencing, Humans
5.
Cell ; 160(3): 420-32, 2015 Jan 29.
Article in English | MEDLINE | ID: mdl-25635456

ABSTRACT

The barrier to curing HIV-1 is thought to reside primarily in CD4(+) T cells containing silent proviruses. To characterize these latently infected cells, we studied the integration profile of HIV-1 in viremic progressors, individuals receiving antiretroviral therapy, and viremic controllers. Clonally expanded T cells represented the majority of all integrations and increased during therapy. However, none of the 75 expanded T cell clones assayed contained intact virus. In contrast, the cells bearing single integration events decreased in frequency over time on therapy, and the surviving cells were enriched for HIV-1 integration in silent regions of the genome. Finally, there was a strong preference for integration into, or in close proximity to, Alu repeats, which were also enriched in local hotspots for integration. The data indicate that dividing clonally expanded T cells contain defective proviruses and that the replication-competent reservoir is primarily found in CD4(+) T cells that remain relatively quiescent.


Subjects
CD4-Positive T-Lymphocytes/virology, HIV Infections/virology, HIV-1/physiology, Virus Integration, Virus Latency, Alu Elements, Clone Cells, Defective Viruses/genetics, Defective Viruses/physiology, HIV Infections/drug therapy, HIV-1/genetics, Humans, Immunologic Memory, Proviruses/physiology, Single-Cell Analysis
6.
Bioinformatics ; 30(18): 2551-8, 2014 Sep 15.
Article in English | MEDLINE | ID: mdl-24860160

ABSTRACT

MOTIVATION: The detection of genomic regions unusually rich in a given pattern is an important undertaking in the analysis of next-generation sequencing data. Recent studies of chromosomal translocations in activated B lymphocytes have identified regions that are frequently translocated to the c-myc oncogene. A quantitative method for the identification of translocation hotspots was crucial to this study. Here we improve this analysis by using a simple probabilistic model and the framework provided by scan statistics to define the number and location of translocation breakpoint hotspots. A key feature of our method is that it provides a global chromosome-wide nominal control level to clustering, as opposed to previous methods based on local criteria. While being motivated by a specific application, the detection of unusual clusters is a widespread problem in bioinformatics. We expect our method to be useful in the analysis of data from other experimental approaches such as ChIP-seq and 4C-seq. RESULTS: The analysis of translocations from B lymphocytes with the method described here reveals the presence of longer hotspots when compared with those defined previously. Further, we show that the hotspot size changes substantially in the absence of DNA repair protein 53BP1. When 53BP1 deficiency is combined with overexpression of activation-induced cytidine deaminase, the hotspot length increases even further. These changes are not detected by previous methods that use local significance criteria for clustering. Our method is also able to identify several exclusive translocation hotspots located in genes of known tumor suppressors. AVAILABILITY AND IMPLEMENTATION: The detection of translocation hotspots is done with hot_scan, a program implemented in R and Perl. Source code and documentation are freely available for download at https://github.com/itojal/hot_scan.
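
The contrast between a global, chromosome-wide significance level and local criteria can be sketched with a simple scan statistic. The code below is not hot_scan and does not reproduce its probabilistic model; it assumes a uniform background along a region of known length and calibrates the maximum window count by Monte Carlo simulation. The function names and the fixed window width are hypothetical choices for illustration.

```python
import random


def max_window_count(positions, width):
    """Scan statistic: largest number of points in any window of the given width."""
    pts = sorted(positions)
    best, j = 0, 0
    for i in range(len(pts)):
        # advance j to the first point outside the window starting at pts[i]
        while j < len(pts) and pts[j] - pts[i] <= width:
            j += 1
        best = max(best, j - i)
    return best


def scan_p_value(positions, length, width, n_sims=199, seed=0):
    """Monte Carlo p-value of the scan statistic under a uniform null on [0, length]."""
    rng = random.Random(seed)
    observed = max_window_count(positions, width)
    n = len(positions)
    hits = 0
    for _ in range(n_sims):
        null = [rng.uniform(0, length) for _ in range(n)]
        if max_window_count(null, width) >= observed:
            hits += 1
    # the +1 terms count the observed data as one draw from the null
    return observed, (1 + hits) / (1 + n_sims)


# Toy breakpoints: 15 clustered near position 500 plus 15 spread-out events.
breakpoints = [500 + i for i in range(15)] + [3 + 65 * i for i in range(15)]
stat, p_value = scan_p_value(breakpoints, length=1000, width=20)
```

Because the null distribution is that of the maximum over all windows, the resulting p-value applies to the whole region at once rather than to each window separately.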


Subjects
Biometry/methods, Genomics/methods, Genetic Translocation/genetics, B-Lymphocytes/metabolism, Chromosome Breakpoints, Cluster Analysis, Cytidine Deaminase/genetics, DNA Repair, High-Throughput Nucleotide Sequencing, Statistical Models
7.
Biophys J ; 96(10): 3987-96, 2009 May 20.
Article in English | MEDLINE | ID: mdl-19450470

ABSTRACT

Large-conductance Ca(2+)-activated K(+) channels (BK) play a fundamental role in modulating membrane potential in many cell types. The gating of BK channels and its modulation by Ca(2+) and voltage have been the subject of intensive research over almost three decades, yielding several of the most complicated kinetic mechanisms ever proposed. A large number of open and closed states disposed, respectively, in two planes, named tiers, characterize these mechanisms. Transitions between states in the same plane are cooperative and modulated by Ca(2+). Transitions across planes are highly concerted and voltage-dependent. Here we reexamine the validity of the two-tiered hypothesis by restricting attention to the modulation by Ca(2+). Large single-channel data sets at five Ca(2+) concentrations were simultaneously analyzed from a Bayesian perspective by using hidden Markov models and Markov chain Monte Carlo stochastic integration techniques. Our results support a dramatic reduction in model complexity, favoring a simple mechanism derived from the Monod-Wyman-Changeux allosteric model for homotetramers, able to explain the Ca(2+) modulation of the gating process. This model differs from the standard Monod-Wyman-Changeux scheme in that it distinguishes whether two Ca(2+) ions are bound to adjacent or diagonal subunits of the tetramer.


Subjects
Ion Channel Gating, Large-Conductance Calcium-Activated Potassium Channel alpha Subunits/metabolism, Allosteric Regulation, Calcium/metabolism, Drug Dose-Response Relationship, Markov Chains, Biological Models, Monte Carlo Method, Probability, Reproducibility of Results
8.
Bull Math Biol ; 66(5): 1173-99, 2004 Sep.
Article in English | MEDLINE | ID: mdl-15294422

ABSTRACT

This paper is concerned with the statistical analysis of single ion channel records. Single channels are modelled by using hidden Markov models and a combination of Bayesian statistics and Markov chain Monte Carlo methods. The techniques presented here provide a straightforward generalization of those in Rosales et al. (2001, Biophys. J., 80, 1088-1103), allowing constraints imposed by a gating mechanism, such as the aggregation of states into classes, to be taken into account. This paper also presents an extension that accommodates correlated background noise and filtered data, extending the scope of the analysis toward real experimental conditions. The methods described here rest on a solid probabilistic basis and are less computationally intensive than alternative Bayesian treatments or frequentist approaches that consider correlated data.
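
As background for the hidden Markov model formulation of single-channel records, the sketch below computes the log-likelihood of a current trace with the scaled forward algorithm. It is only the likelihood building block, under simplifying assumptions that the paper relaxes: white Gaussian noise with a common standard deviation, no filtering, and no aggregation of states into classes. It does not perform the Bayesian MCMC inference described in the abstract, and all names and numbers are hypothetical.

```python
import math


def gaussian_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def hmm_log_likelihood(record, init, trans, means, sd):
    """Scaled forward algorithm for an HMM with Gaussian emissions.

    record: sampled current values; init[s]: initial state probabilities;
    trans[r][s]: probability of moving from state r to state s per sample;
    means[s]: mean current of state s; sd: common noise standard deviation.
    """
    n_states = len(init)
    alpha = [init[s] * gaussian_pdf(record[0], means[s], sd) for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(record)):
        if t > 0:
            # propagate one step through the transition matrix, then weight
            # each state by how well it explains the current sample
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n_states))
                * gaussian_pdf(record[t], means[s], sd)
                for s in range(n_states)
            ]
        scale = sum(alpha)  # rescaling avoids numerical underflow on long records
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


# Toy record: closed level near 0, open level near 1 (arbitrary units).
record = [0.0, 0.1, -0.1, 1.0, 0.9, 1.1, 0.0]
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.2, 0.8]]
log_lik = hmm_log_likelihood(record, init, trans, means=[0.0, 1.0], sd=0.2)
```

Comparing this quantity across candidate parameter values or gating schemes is what a sampler or optimizer would do repeatedly; the forward recursion keeps that cost linear in the record length.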


Subjects
Ion Channels/physiology, Markov Chains, Biological Models, Statistical Models, Monte Carlo Method, Bayes Theorem
9.
J Gen Physiol ; 123(5): 533-53, 2004 May.
Article in English | MEDLINE | ID: mdl-15111644

ABSTRACT

Type-II ryanodine receptor channels (RYRs) play a fundamental role in intracellular Ca(2+) dynamics in the heart. The processes of activation, inactivation, and regulation of these channels have been the subject of intensive research and the focus of recent debates. Typically, approaches to understand these processes involve statistical analysis of single RYRs, involving signal restoration, model estimation, and selection. These tasks are usually performed by following rather phenomenological criteria that turn models into self-fulfilling prophecies. Here, a thorough statistical treatment is applied by modeling single RYRs using aggregated hidden Markov models. Inferences are made using Bayesian statistics and stochastic search methods known as Markov chain Monte Carlo. These methods allow extension of the temporal resolution of the analysis far beyond the limits of previous approaches and provide a direct measure of the uncertainties associated with every estimation step, together with a direct assessment of why and where a particular model fails. Analyses of single RYRs at several Ca(2+) concentrations are made by considering 16 models, some of them previously reported in the literature. Results clearly show that single RYRs have Ca(2+)-dependent gating modes. Moreover, our results demonstrate that single RYRs responding to a sudden change in Ca(2+) display adaptation kinetics. Interestingly, the best-ranked models predict microscopic reversibility when monovalent cations are used as the main permeating species. Finally, the extended bandwidth revealed the existence of a novel fast buzz-mode at low Ca(2+) concentrations.


Subjects
Algorithms, Calcium/metabolism, Ion Channel Gating/physiology, Biological Models, Statistical Models, Patch-Clamp Techniques/methods, Ryanodine Receptor Calcium Release Channel/physiology, Animals, Calcium Signaling/physiology, Cultured Cells, Computer Simulation, Statistical Data Interpretation, Dogs, Homeostasis/physiology, Markov Chains, Membrane Potentials/physiology, Microsomes/physiology, Monte Carlo Method, Cardiac Myocytes/physiology, Reproducibility of Results, Sensitivity and Specificity, Stochastic Processes
10.
Biophys J ; 82(1 Pt 1): 29-35, 2002 Jan.
Article in English | MEDLINE | ID: mdl-11751293

ABSTRACT

In this paper, we compare nonparametric kernel estimates with smoothed histograms as methods for displaying logarithmically transformed dwell-time distributions. Kernel density plots provide a simpler means for producing estimates of the probability density function (pdf) and they have the advantage of being smoothed in a well-specified, carefully controlled manner. Smoothing is essential for multidimensional plots because, with realistic amounts of data, the number of counts per bin is small. Examples are presented for a 2-dimensional pdf and its associated dependency-difference plot that display the correlations between successive dwell times.
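
A kernel estimate of a log-transformed dwell-time density can be sketched in a few lines. The example below is a one-dimensional Gaussian kernel estimate on log10 dwell times; the kernel, the bandwidth value, and the function name `log_dwell_kde` are assumptions for illustration and are not taken from the paper.

```python
import math


def log_dwell_kde(dwell_times, grid, bandwidth):
    """Gaussian kernel density estimate of log10(dwell time) on a grid.

    dwell_times: positive durations (e.g. in ms); grid: log10 values at
    which to evaluate the density; bandwidth: kernel standard deviation
    in log10 units, which controls the amount of smoothing.
    """
    logs = [math.log10(t) for t in dwell_times]
    norm = len(logs) * bandwidth * math.sqrt(2.0 * math.pi)
    return [
        sum(math.exp(-0.5 * ((x - l) / bandwidth) ** 2) for l in logs) / norm
        for x in grid
    ]


# Toy dwell times in ms, evaluated from 10^-3 to 10^4 ms.
dwells = [0.5, 1.2, 3.0, 8.0, 20.0]
grid = [-3.0 + 0.01 * i for i in range(701)]
density = log_dwell_kde(dwells, grid, bandwidth=0.2)
```

Unlike a histogram, the amount of smoothing here is set by a single explicit parameter, so the same bandwidth can be reported and reused across data sets.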


Subjects
Ion Channels/physiology, Kinetics, Biological Models, Nonparametric Statistics, Time Factors