Results 1 - 18 of 18
1.
ArXiv ; 2024 Feb 13.
Article in English | MEDLINE | ID: mdl-37744463

ABSTRACT

Neurophysiology research has demonstrated that it is possible and valuable to investigate sensory processing in scenarios involving continuous sensory streams, such as speech and music. Over the past 10 years or so, novel analytic frameworks combined with the growing participation in data sharing have led to a surge of publicly available datasets involving continuous sensory experiments. However, open science efforts in this domain of research remain scattered, lacking a cohesive set of guidelines. This paper presents an end-to-end open science framework for the storage, analysis, sharing, and re-analysis of neural data recorded during continuous sensory experiments. The framework has been designed to interface easily with existing toolboxes, such as EelBrain, NapLib, MNE, and the mTRF-Toolbox. We present guidelines by taking both the user view (how to rapidly re-analyse existing data) and the experimenter view (how to store, analyse, and share), making the process as straightforward and accessible as possible for all users. Additionally, we introduce a web-based data browser that enables the effortless replication of published results and data re-analysis.

2.
Nat Commun ; 14(1): 2829, 2023 05 17.
Article in English | MEDLINE | ID: mdl-37198156

ABSTRACT

Human cellular reprogramming to induced pluripotency is still an inefficient process, which has hindered studying the role of critical intermediate stages. Here we take advantage of high efficiency reprogramming in microfluidics and temporal multi-omics to identify and resolve distinct sub-populations and their interactions. We perform secretome analysis and single-cell transcriptomics to show functional extrinsic pathways of protein communication between reprogramming sub-populations and the re-shaping of a permissive extracellular environment. We pinpoint the HGF/MET/STAT3 axis as a potent enhancer of reprogramming, which acts via HGF accumulation within the confined system of microfluidics, and in conventional dishes needs to be supplied exogenously to enhance efficiency. Our data suggest that human cellular reprogramming is a transcription factor-driven process that is deeply dependent on extracellular context and cell population determinants.


Subject(s)
Induced Pluripotent Stem Cells, Humans, Induced Pluripotent Stem Cells/metabolism, Cellular Reprogramming, Gene Expression Regulation, Transcription Factors/genetics, Transcription Factors/metabolism, Cultured Cells
3.
Sci Rep ; 13(1): 6303, 2023 04 18.
Article in English | MEDLINE | ID: mdl-37072468

ABSTRACT

A growing body of evidence links gut microbiota changes with inflammatory bowel disease (IBD), raising the potential benefit of exploiting metagenomics data for non-invasive IBD diagnostics. The sbv IMPROVER metagenomics diagnosis for inflammatory bowel disease challenge investigated computational metagenomics methods for discriminating IBD and nonIBD subjects. Participants in this challenge were given independent training and test metagenomics data from IBD and nonIBD subjects, which could be either raw read data (sub-challenge 1, SC1) or processed Taxonomy- and Function-based profiles (sub-challenge 2, SC2). A total of 81 anonymized submissions were received between September 2019 and March 2020. Most participants' predictions performed better than random predictions in classifying IBD versus nonIBD, Ulcerative Colitis (UC) versus nonIBD, and Crohn's Disease (CD) versus nonIBD. However, discrimination between UC and CD remains challenging, with the classification quality similar to the set of random predictions. We analyzed the class prediction accuracy, the metagenomics features used by the teams, and the computational methods used. These results will be openly shared with the scientific community to help advance IBD research and illustrate the application of a range of computational methodologies for effective metagenomic classification.


Subject(s)
Ulcerative Colitis, Crohn Disease, Gastrointestinal Microbiome, Inflammatory Bowel Diseases, Humans, Inflammatory Bowel Diseases/diagnosis, Inflammatory Bowel Diseases/genetics, Ulcerative Colitis/diagnosis, Crohn Disease/diagnosis, Crohn Disease/genetics, Metagenomics, Gastrointestinal Microbiome/genetics
4.
Microbiol Spectr ; : e0294422, 2023 Mar 22.
Article in English | MEDLINE | ID: mdl-36946740

ABSTRACT

Bacteria respond to nutrient starvation by implementing the stringent response, a stress signaling system resulting in metabolic remodeling leading to decreased growth rate and energy requirements. A well-characterized model of stringent response in Mycobacterium tuberculosis is the one induced by growth in low phosphate. The extracytoplasmic function (ECF) sigma factor SigE was previously suggested as having a key role in the activation of stringent response. In this study, we challenge this hypothesis by analyzing the temporal dynamics of the transcriptional response of a sigE mutant and its wild-type parental strain to low phosphate using RNA sequencing. We found that both strains responded to low phosphate with a typical stringent response trait, including the downregulation of genes encoding ribosomal proteins and RNA polymerase. We also observed transcriptional changes that support the occurrence of an energetics imbalance, compensated by a reduced activity of the electron transport chain, decreased export of protons, and a remodeling of central metabolism. The most striking difference between the two strains was the induction in the sigE mutant of several stress-related genes, in particular, the genes encoding the ECF sigma factor SigH and the transcriptional regulator WhiB6. Since both proteins respond to redox imbalances, their induction suggests that the sigE mutant is not able to maintain redox homeostasis in response to the energetics imbalance induced by low phosphate. In conclusion, our data suggest that SigE is not directly involved in initiating stringent response but in protecting the cell from stress consequent to the low phosphate exposure and activation of stringent response. IMPORTANCE: Mycobacterium tuberculosis can enter a dormant state enabling it to establish latent infections and to become tolerant to antibacterial drugs. Dormant bacteria's physiology and the mechanism(s) used by bacteria to enter dormancy during infection are still unknown due to the lack of reliable animal models. However, several in vitro models, mimicking conditions encountered during infection, can reproduce different aspects of dormancy (growth arrest, metabolic slowdown, drug tolerance). The stringent response, a stress response program enabling bacteria to cope with nutrient starvation, is one of them. In this study, we provide evidence suggesting that the sigma factor SigE is not directly involved in the activation of stringent response as previously hypothesized, but it is important to help the bacteria to handle the metabolic stress related to the adaptation to low phosphate and activation of stringent response, thus giving an important contribution to our understanding of the mechanism behind stringent response development.

5.
PLoS Comput Biol ; 18(9): e1010467, 2022 09.
Article in English | MEDLINE | ID: mdl-36074761

ABSTRACT

The development of increasingly efficient and cost-effective high throughput DNA sequencing techniques has enhanced the possibility of studying complex microbial systems. Recently, researchers have shown great interest in studying the microorganisms that characterise different ecological niches. Differential abundance analysis aims to find the differences in the abundance of each taxon between two classes of subjects or samples, assigning a significance value to each comparison. Several bioinformatic methods have been specifically developed, taking into account the challenges of microbiome data, such as sparsity, the different sequencing depth constraint between samples and compositionality. Differential abundance analysis has led to important conclusions in different fields, from health to the environment. However, the lack of a known biological truth makes it difficult to validate the results obtained. In this work we exploit metaSPARSim, a microbial sequencing count data simulator, to simulate data with differential abundance features between experimental groups. We perform a complete comparison of recently developed and established methods on a common benchmark, with great attention to the reliability of both the simulated scenarios and the evaluation metrics. The performance overview includes the investigation of numerous scenarios, studying the effect on methods' results of the main covariates such as sample size, percentage of differentially abundant features, sequencing depth, feature variability, normalisation approach and ecological niches. Mainly, we find that methods show a good control of the type I error and, generally, also of the false discovery rate at high sample size, while recall seems to depend on the dataset and sample size.
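
When the differentially abundant features are known from simulation, the metrics named in the abstract can be computed by comparing each method's calls with that ground truth. The following is a minimal sketch of such a comparison, not the benchmark's actual code; the function name, feature labels, and example values are hypothetical.

```python
def evaluate_calls(called, truth, all_features):
    """Compare features called differentially abundant with the simulated truth."""
    called, truth = set(called), set(truth)
    true_pos = len(called & truth)    # truly differential and called
    false_pos = len(called - truth)   # called but not truly differential
    false_neg = len(truth - called)   # truly differential but missed
    negatives = len(set(all_features) - truth)

    recall = true_pos / (true_pos + false_neg) if truth else 0.0
    # Observed false discovery proportion: share of calls that are wrong.
    fdr = false_pos / len(called) if called else 0.0
    # Observed false positive rate, an empirical proxy for the type I error.
    fpr = false_pos / negatives if negatives else 0.0
    return {"recall": recall, "fdr": fdr, "fpr": fpr}


# Hypothetical example: 10 simulated taxa, 4 of them truly differential.
features = [f"taxon_{i}" for i in range(10)]
truth = ["taxon_0", "taxon_1", "taxon_2", "taxon_3"]
called = ["taxon_0", "taxon_1", "taxon_7"]
metrics = evaluate_calls(called, truth, features)
```

In a real benchmark these proportions would be averaged over many simulated datasets per scenario before comparing methods.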


Subject(s)
Benchmarking, Microbiota, Computational Biology/methods, High-Throughput Nucleotide Sequencing/methods, Reproducibility of Results
6.
BMC Bioinformatics ; 22(Suppl 15): 618, 2022 Feb 07.
Article in English | MEDLINE | ID: mdl-35130833

ABSTRACT

BACKGROUND: 16S rRNA-gene sequencing is a valuable approach to characterize the taxonomic content of the whole bacterial population inhabiting a metabolic and spatial niche, providing an important opportunity to study bacteria and their role in many health and environmental mechanisms. The analysis of data produced by amplicon sequencing, however, brings very specific methodological issues that need to be properly addressed to obtain reliable biological conclusions. Among these, 16S count data tend to be very sparse, with many null values reflecting species that are present but went unobserved due to the multiplexing constraints. However, current data workflows do not consider a step in which the information about unobserved species is recovered. RESULTS: In this work, we evaluate for the first time the effects of introducing in the 16S data workflow a new preprocessing step, zero-imputation, to recover this lost information. Due to the lack of published zero-imputation methods specifically designed for 16S count data, we considered a set of zero-imputation strategies available for other frameworks, and benchmarked them using in silico 16S count data reflecting different experimental designs. Additionally, we assessed the effect of combining zero-imputation and normalization, i.e. the only preprocessing step in the current 16S workflow. Overall, we benchmarked 35 16S preprocessing pipelines, assessing their ability to handle data sparsity, identify species presence/absence, recover sample proportional abundance distributions, and improve typical downstream analyses such as computation of alpha and beta diversity indices and differential abundance analysis. CONCLUSIONS: The results clearly show that 16S data analysis greatly benefits from a properly-performed zero-imputation step, although the choice of the right zero-imputation method has a pivotal role. In addition, we identify a set of best-performing pipelines that could be a valuable indication for data analysts.
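
To make the idea of chaining zero-imputation with normalization concrete, here is a deliberately simple sketch. It assumes a fixed pseudocount as the imputation rule and total-sum scaling as the normalization; both are illustrative stand-ins and are not claimed to be among the strategies benchmarked in the paper. The function names and the sample counts are hypothetical.

```python
def impute_zeros(counts, pseudocount=0.5):
    """Replace each zero count with a small pseudocount (simplest possible rule)."""
    return [c if c > 0 else pseudocount for c in counts]


def total_sum_scale(counts):
    """Normalize one sample's counts to proportions that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


# Hypothetical counts for five taxa in a single sample.
sample = [120, 0, 30, 0, 50]
imputed = impute_zeros(sample)
proportions = total_sum_scale(imputed)
```

After imputation every taxon has a strictly positive proportion, which matters for downstream steps that take logarithms or ratios of abundances.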


Subject(s)
Bacteria, Data Analysis, Bacteria/genetics, rRNA Genes, High-Throughput Nucleotide Sequencing, 16S Ribosomal RNA/genetics, DNA Sequence Analysis
7.
Bioinformatics ; 38(7): 1920-1929, 2022 03 28.
Article in English | MEDLINE | ID: mdl-35043939

ABSTRACT

MOTIVATION: Recently, single-cell RNA-seq (scRNA-seq) data have been used to study cellular communication. Most bioinformatics methods infer only the intercellular signaling between groups of cells, mainly exploiting ligand-receptor expression levels. Only a few methods consider the entire intercellular + intracellular signaling, mainly inferring lists/networks of signaling involved genes. RESULTS: Here, we present scSeqComm, a computational method to identify and quantify the evidence of ongoing intercellular and intracellular signaling from scRNA-seq data, while at the same time providing a functional characterization of the inferred cellular communication. The possibility to quantify the evidence of ongoing communication assists the prioritization of the results, while the combined evidence of both intercellular and intracellular signaling increases the reliability of inferred communication. The application to a scRNA-seq dataset of tumor microenvironment, the agreement with independent bioinformatics analysis, the validation using spatial transcriptomics data and the comparison with state-of-the-art intercellular scoring schemes confirmed the robustness and reliability of the proposed method. AVAILABILITY AND IMPLEMENTATION: scSeqComm R package is freely available at https://gitlab.com/sysbiobig/scseqcomm and https://sysbiobig.dei.unipd.it/software/#scSeqComm. Submitted software version and test data are available in Zenodo, at https://dx.doi.org/10.5281/zenodo.5833298. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Single-Cell Analysis, Software, RNA Sequence Analysis/methods, Reproducibility of Results, Single-Cell Analysis/methods, Gene Expression Profiling/methods, Communication
8.
Bioinform Adv ; 2(1): vbac092, 2022.
Article in English | MEDLINE | ID: mdl-36699399

ABSTRACT

Motivation: Recently, several computational modeling approaches, such as agent-based models, have been applied to study the interaction dynamics between immune and tumor cells in human cancer. However, each tumor is characterized by a specific and unique tumor microenvironment, emphasizing the need for specialized and personalized studies of each cancer scenario. Results: We present MAST, a hybrid Multi-Agent Spatio-Temporal model which can be informed using a data-driven approach to simulate unique tumor subtypes and tumor-immune dynamics starting from high-throughput sequencing data. It captures essential components of the tumor microenvironment by coupling a discrete agent-based model with a continuous partial differential equations-based model. The application to real data of human colorectal cancer tissue, investigating the spatio-temporal evolution and emergent properties of four simulated human colorectal cancer subtypes, along with their agreement with current biological knowledge of tumors and clinical outcome endpoints in a patient cohort, supports the validity of our approach. Availability and implementation: MAST, implemented in Python language, is freely available with an open-source license through GitLab (https://gitlab.com/sysbiobig/mast), and a Docker image is provided to ease its deployment. The submitted software version and test data are available in Zenodo at https://dx.doi.org/10.5281/zenodo.7267745. Supplementary information: Supplementary data are available at Bioinformatics Advances online.

9.
Bioinformatics ; 37(22): 4253-4254, 2021 11 18.
Article in English | MEDLINE | ID: mdl-34117876

ABSTRACT

SUMMARY: ITSoneWB (ITSone WorkBench) is a Galaxy-based bioinformatic environment where comprehensive and high-quality reference data are connected with established pipelines and new tools in an automated and easy-to-use service targeted at global taxonomic analysis of eukaryotic communities based on Internal Transcribed Spacer 1 variants high-throughput sequencing. AVAILABILITY AND IMPLEMENTATION: ITSoneWB has been deployed on the INFN-Bari ReCaS cloud facility and is freely available on the web at http://itsonewb.cloud.ba.infn.it/galaxy. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Eukaryota, Software, Computational Biology, High-Throughput Nucleotide Sequencing, Data Accuracy
10.
Eur J Haematol ; 107(4): 436-448, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34139044

ABSTRACT

Chronic Myeloid Leukemia is a clonal disorder characterized by the presence of the Ph-chromosome and the BCR-ABL tyrosine-kinase (TK). Targeted therapy with Imatinib has greatly improved its outcome. Deeper and faster responses are reported with the second-generation TKI Nilotinib. Sustained responses may enable TKI discontinuation. However, even in a complete molecular response, some patients experience disease recurrence possibly due to persistence of quiescent leukemic CD34+/lin-Ph+ stem cells (LSCs). Degree and mechanisms of LSCs clearance during TKI treatment are not clearly established. The PhilosoPhi34 study was designed to verify the in-vivo activity and timecourse of first-line Nilotinib therapy on BM CD34+/lin-Ph+ cells clearance. Eighty-seven CP-CML patients were enrolled. BM cells were collected and tested for Ph+ residual cells at diagnosis and at 3, 6 and 12 months of treatment. FISH analysis of unstimulated CD34+/lin- cells in CCyR patients was positive in 8/65 (12.3%), 5/71 (7%), 0/69 (0%) evaluable tests, respectively. Per-Protocol analysis response rates were as follows: CCyR 95% at 12 months, MR4.5 31% and 46% at 12 and 36 months, respectively. An exploratory Gene Expression Profiling (GEP) study of CD34+/lin- cells was performed on 30 patients at diagnosis and after, on 79 patients at diagnosis vs 12 months of nilotinib treatment vs 10 healthy subjects. Data demonstrated that some genes were significantly differentially expressed: NFKBIA, many cell cycle genes, ABC transporters, JAK-STAT signaling pathway (JAK2). In addition, a correlation between differential expression of some genes (JAK2, OLFM4, ICAM1, NFKBIA) among patients at diagnosis and their achievement of an early and deeper MR was observed.


Subject(s)
Antineoplastic Agents/therapeutic use, Leukemic Gene Expression Regulation/drug effects, BCR-ABL Positive Chronic Myelogenous Leukemia/drug therapy, Neoplastic Stem Cells/drug effects, Protein Kinase Inhibitors/therapeutic use, Pyrimidines/therapeutic use, ATP-Binding Cassette Transporters/genetics, ATP-Binding Cassette Transporters/metabolism, Adolescent, Adult, Aged, Aged 80 and over, Pharmacological Biomarkers, Bone Marrow/drug effects, Bone Marrow/metabolism, Bone Marrow/pathology, Case-Control Studies, Cell Cycle Proteins/genetics, Cell Cycle Proteins/metabolism, Female, Gene Expression Profiling, Granulocyte Colony-Stimulating Factor/genetics, Granulocyte Colony-Stimulating Factor/metabolism, Humans, Intercellular Adhesion Molecule-1/genetics, Intercellular Adhesion Molecule-1/metabolism, Janus Kinase 2/genetics, Janus Kinase 2/metabolism, BCR-ABL Positive Chronic Myelogenous Leukemia/diagnosis, BCR-ABL Positive Chronic Myelogenous Leukemia/genetics, BCR-ABL Positive Chronic Myelogenous Leukemia/pathology, Male, Middle Aged, NF-KappaB Inhibitor alpha/genetics, NF-KappaB Inhibitor alpha/metabolism, Neoplastic Stem Cells/metabolism, Neoplastic Stem Cells/pathology, Philadelphia Chromosome, Prospective Studies, Recurrence
11.
Curr Genomics ; 22(4): 267-290, 2021 Dec 16.
Article in English | MEDLINE | ID: mdl-35273458

ABSTRACT

In the current research landscape, microbiota composition studies are of extreme interest, since it has been widely shown that resident microorganisms affect and shape the ecological niche they inhabit. This complex micro-world is characterized by different types of interactions. Understanding these relationships provides a useful tool for decoding the causes and effects of communities' organizations. Next-Generation Sequencing technologies make it possible to reconstruct the internal composition of the whole microbial community present in a sample. Sequencing data can then be investigated through statistical and computational methods drawn from network theory to infer the network of interactions among microbial species. Since there are several network inference approaches in the literature, in this paper we tried to shed light on their main characteristics and challenges, providing a useful tool not only to those interested in using the methods, but also to those who want to develop new ones. In addition, we focused on the frameworks used to produce synthetic data, starting from the simulation of network structures up to their integration with abundance models, with the aim of clarifying the key points of the entire generative process.

12.
Bioinformatics ; 36(5): 1468-1475, 2020 03 01.
Article in English | MEDLINE | ID: mdl-31598633

ABSTRACT

MOTIVATION: Single cell RNA-seq (scRNA-seq) count data show many differences compared with bulk RNA-seq count data, making the application of many RNA-seq pre-processing/analysis methods not straightforward or even inappropriate. For this reason, the development of new methods for handling scRNA-seq count data is currently one of the most active research fields in bioinformatics. To help the development of such new methods, the availability of simulated data could play a pivotal role. However, only a few scRNA-seq count data simulators are available, often showing poor or undemonstrated similarity with real data. RESULTS: In this article we present SPARSim, a scRNA-seq count data simulator based on a Gamma-Multivariate Hypergeometric model. We demonstrate that SPARSim can generate count data that resemble real data in terms of count intensity, variability and sparsity, performing comparably or better than one of the most used scRNA-seq simulators, Splat. In particular, SPARSim simulated count matrices closely resemble the distribution of zeros across different expression intensities observed in real count data. AVAILABILITY AND IMPLEMENTATION: SPARSim R package is freely available at http://sysbiobig.dei.unipd.it/?q=SPARSim and at https://gitlab.com/sysbiobig/sparsim. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Single-Cell Analysis, Software, Gene Expression Profiling, RNA-Seq, RNA Sequence Analysis
13.
BMC Bioinformatics ; 20(Suppl 9): 416, 2019 Nov 22.
Article in English | MEDLINE | ID: mdl-31757204

ABSTRACT

BACKGROUND: In the last few years, 16S rRNA gene sequencing (16S rDNA-seq) has seen a surprisingly rapid increase in adoption rate as a methodology to perform microbial community studies. Despite the considerable popularity of this technique, only a small number of specific tools are currently available for proper 16S rDNA-seq count data preprocessing and simulation. Indeed, the great majority of tools have been developed adapting methodologies previously used for bulk RNA-seq data, with poor assessment of their applicability in the metagenomics field. For such tools and the few ones specifically developed for 16S rDNA-seq data, performance assessment is challenging, mainly due to the complex nature of the data and the lack of realistic simulation models. In fact, to the best of our knowledge, no software designed for data simulation is available to directly obtain synthetic 16S rDNA-seq count tables that properly model the heavy sparsity and compositionality typical of these data. RESULTS: In this paper we present metaSPARSim, a sparse count matrix simulator intended for usage in development of 16S rDNA-seq metagenomic data processing pipelines. metaSPARSim implements a new generative process that models the sequencing process with a Multivariate Hypergeometric distribution in order to realistically simulate 16S rDNA-seq count tables, resembling real experimental data compositionality and sparsity. It provides ready-to-use count matrices and comes with the possibility to reproduce different pre-coded scenarios and to estimate simulation parameters from real experimental data. The tool is made available at http://sysbiobig.dei.unipd.it/?q=Software#metaSPARSim and https://gitlab.com/sysbiobig/metasparsim. CONCLUSION: metaSPARSim is able to generate count matrices resembling real 16S rDNA-seq data. The availability of count data simulators is extremely valuable both for methods developers, for whom a ground truth for tools validation is needed, and for users who want to assess state of the art analysis tools for choosing the most accurate one. Thus, we believe that metaSPARSim is a valuable tool for researchers involved in developing, testing and using robust and reliable data analysis methods in the context of 16S rRNA gene sequencing.
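
Drawing a fixed number of reads without replacement from a finite pool of molecules is exactly a multivariate hypergeometric draw, which is why it yields sparse, compositional counts. The sketch below illustrates that idea with the Python standard library only. It is not metaSPARSim's implementation: the Gamma step for biological variability, the parameter names, and the example abundances are all assumptions made for illustration.

```python
import random
from collections import Counter


def simulate_sample(mean_abundance, shape, depth, seed=0):
    """Simulate one count vector: Gamma variability, then reads drawn without replacement."""
    rng = random.Random(seed)
    urn = []
    for taxon, mean in mean_abundance.items():
        # Gamma with the given shape and scale = mean / shape has expectation `mean`.
        level = rng.gammavariate(shape, mean / shape)
        urn.extend([taxon] * round(level))
    # Sequencing step: sampling from the urn without replacement is a
    # multivariate hypergeometric draw over taxa.
    depth = min(depth, len(urn))
    reads = rng.sample(urn, depth)
    observed = Counter(reads)
    return {taxon: observed.get(taxon, 0) for taxon in mean_abundance}


# Hypothetical mean abundances; rare taxa will often receive zero reads.
means = {"taxon_A": 500, "taxon_B": 50, "taxon_C": 2}
counts = simulate_sample(means, shape=5.0, depth=100)
```

Because the total number of reads is fixed by `depth`, a higher count for one taxon necessarily leaves fewer reads for the others, which is the compositional effect the abstract refers to.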


Subject(s)
High-Throughput Nucleotide Sequencing/methods, Metagenomics, 16S Ribosomal RNA/genetics, Software, Animals, Computer Simulation, Ribosomal DNA/genetics, Genetic Databases, Humans, Metagenome
14.
PLoS One ; 14(7): e0218444, 2019.
Article in English | MEDLINE | ID: mdl-31318870

ABSTRACT

Chronic myeloid leukemia (CML) is characterized by the constitutive tyrosine kinase activity of the oncoprotein BCR-ABL1 in myeloid progenitor cells that activates multiple signal transduction pathways leading to the leukemic phenotype. The tyrosine-kinase inhibitor (TKI) nilotinib inhibits the tyrosine kinase activity of BCR-ABL1 in CML patients. Despite the success of nilotinib treatment in patients with chronic-phase (CP) CML, a population of Philadelphia-positive (Ph+) quiescent stem cells escapes the drug activity and can lead to drug resistance. The molecular mechanism by which these quiescent cells remain insensitive is poorly understood. The aim of this study was to compare the gene expression profiling (GEP) of bone marrow (BM) CD34+/lin- cells from CP-CML patients at diagnosis and after 12 months of nilotinib treatment by microarray, in order to identify gene expression changes and the dysregulation of pathways due to nilotinib action. We selected BM CD34+/lin- cells from 78 CP-CML patients at diagnosis and after 12 months of first-line nilotinib therapy and microarray analysis was performed. GEP bioinformatic analyses identified 2,959 differentially expressed probes and functional clustering determined some significantly enriched pathways between diagnosis and 12 months of nilotinib treatment. Among these pathways, we observed the underexpression of 26 genes encoding proteins belonging to the cell cycle after 12 months of nilotinib treatment which led to the up-regulation of chromosome replication, cell proliferation, DNA replication, and DNA damage checkpoint at diagnosis. We demonstrated the underexpression of the ATP-binding cassette (ABC) transporters ABCC4, ABCC5, and ABCD3 encoding proteins which pumped drugs out of the cells after 12 months of nilotinib. Moreover, GEP data demonstrated the deregulation of genes involved in the JAK-STAT signaling pathway. The down-regulation of JAK2, IL7, STAM, PIK3CA, PTPN11, RAF1, and SOS1 key genes after 12 months of nilotinib could demonstrate the up-regulation of cell cycle, proliferation and differentiation via MAPK and PI3K-AKT signaling pathways at diagnosis.


Subject(s)
ATP-Binding Cassette Transporters/blood, Cell Cycle/drug effects, Leukemic Gene Expression Regulation/drug effects, Janus Kinases/blood, BCR-ABL Positive Chronic Myelogenous Leukemia/blood, Neoplasm Proteins/blood, Pyrimidines/administration & dosage, STAT Transcription Factors/blood, Signal Transduction/drug effects, Female, Humans, Male, Middle Aged, Time Factors
15.
BMC Bioinformatics ; 19(1): 343, 2018 Sep 29.
Article in English | MEDLINE | ID: mdl-30268091

ABSTRACT

BACKGROUND: Targeted amplicon sequencing of the 16S ribosomal RNA gene is one of the key tools for studying microbial diversity. The accuracy of this approach strongly depends on the choice of primer pairs and, in particular, on the balance between efficiency, specificity and sensitivity in the amplification of the different bacterial 16S sequences contained in a sample. There is thus the need for computational methods to design optimal bacterial 16S primers able to take into account the knowledge provided by the new sequencing technologies. RESULTS: We propose here a computational method for optimizing the choice of primer sets, based on multi-objective optimization, which simultaneously: 1) maximizes efficiency and specificity of target amplification; 2) maximizes the number of different bacterial 16S sequences matched by at least one primer; 3) minimizes the differences in the number of primers matching each bacterial 16S sequence. Our algorithm can be applied to any desired amplicon length without affecting computational performance. The source code of the developed algorithm is released as the mopo16S software tool (Multi-Objective Primer Optimization for 16S experiments) under the GNU General Public License and is available at http://sysbiobig.dei.unipd.it/?q=Software#mopo16S . CONCLUSIONS: Results show that our strategy is able to find better primer pairs than the ones available in the literature according to all three optimization criteria. We also experimentally validated three of the primer pairs identified by our method on multiple bacterial species, belonging to different genera and phyla. Results confirm the predicted efficiency and the ability to maximize the number of different bacterial 16S sequences matched by primers.
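
With several objectives optimized at once, candidate primer sets are usually compared by Pareto dominance rather than by a single score. The sketch below shows only that comparison step, under stated assumptions: each candidate is scored on the three criteria listed in the abstract, the third (variability in matches) is negated so that all three are maximized, and the candidate names and scores are invented for illustration. It does not reproduce how mopo16S searches the space of primer sets.

```python
def dominates(a, b):
    """True if score tuple a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return {
        name: score
        for name, score in candidates.items()
        if not any(
            dominates(other, score)
            for other_name, other in candidates.items()
            if other_name != name
        )
    }


# Hypothetical scores: (efficiency/specificity, coverage, -match variability).
candidates = {
    "set_1": (0.90, 0.80, -0.10),
    "set_2": (0.85, 0.95, -0.20),
    "set_3": (0.80, 0.75, -0.30),
    "set_4": (0.90, 0.80, -0.25),
}
front = pareto_front(candidates)
```

Here `set_3` and `set_4` are both dominated by `set_1`, while `set_1` and `set_2` trade efficiency against coverage, so both remain on the front.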


Subject(s)
Bacteria/genetics, DNA Primers/standards, Polymerase Chain Reaction/standards, Bacterial RNA/genetics, 16S Ribosomal RNA/genetics, Software, DNA Primers/genetics
16.
BMC Genomics ; 18(1): 602, 2017 08 10.
Article in English | MEDLINE | ID: mdl-28797240

ABSTRACT

BACKGROUND: Though Illumina has largely dominated the RNA-Seq field, the simultaneous availability of Ion Torrent has left scientists wondering which platform is most effective for differential gene expression (DGE) analysis. Previous investigations of this question have typically used reference samples derived from cell lines and brain tissue, and do not involve biological variability. While these comparisons might inform studies of tissue-specific expression, marked by large-scale transcriptional differences, this is not the common use case. RESULTS: Here we employ a standard treatment/control experimental design, which enables us to evaluate these platforms in the context of the expression differences common in differential gene expression experiments. Specifically, we assessed the hepatic inflammatory response of mice by assaying liver RNA from control and IL-1β treated animals with both the Illumina HiSeq and the Ion Torrent Proton sequencing platforms. We found the greatest difference between the platforms at the level of read alignment, a moderate level of concordance at the level of DGE analysis, and nearly identical results at the level of differentially affected pathways. Interestingly, we also observed a strong interaction between sequencing platform and choice of aligner. By aligning both real and simulated Illumina and Ion Torrent data with the twelve most commonly-cited aligners in the literature, we observed that different aligner and platform combinations were better suited to probing different genomic features; for example, disentangling the source of expression in gene-pseudogene pairs. CONCLUSIONS: Taken together, our results indicate that while Illumina and Ion Torrent have similar capacities to detect changes in biology from a treatment/control experiment, these platforms may be tailored to interrogate different transcriptional phenomena through careful selection of alignment software.


Subject(s)
Gene Expression Profiling, RNA Sequence Analysis/methods, Algorithms, High-Throughput Nucleotide Sequencing
17.
Front Genet ; 8: 62, 2017.
Article in English | MEDLINE | ID: mdl-28588607

ABSTRACT

The sequencing of the transcriptomes of single-cells, or single-cell RNA-sequencing, has now become the dominant technology for the identification of novel cell types and for the study of stochastic gene expression. In recent years, various tools for analyzing single-cell RNA-sequencing data have been proposed, many of them with the purpose of performing differential expression analysis. In this work, we compare four different tools for single-cell RNA-sequencing differential expression, together with two popular methods originally developed for the analysis of bulk RNA-sequencing data, but largely applied to single-cell data. We discuss results obtained on two real and one synthetic dataset, along with considerations about the perspectives of single-cell differential expression analysis. In particular, we explore the methods' performance in four different scenarios, mimicking different unimodal or bimodal distributions of the data, as characteristic of single-cell transcriptomics. We observed marked differences between the selected methods in terms of precision and recall, the number of detected differentially expressed genes and the overall performance. Globally, the results obtained in our study suggest that it is difficult to identify a best performing tool and that efforts are needed to improve the methodologies for single-cell RNA-sequencing data analysis and gain better accuracy of results.

18.
Nat Methods ; 14(2): 135-139, 2017 02.
Article in English | MEDLINE | ID: mdl-27941783

ABSTRACT

Alignment is the first step in most RNA-seq analysis pipelines, and the accuracy of downstream analyses depends heavily on it. Unlike most steps in the pipeline, alignment is particularly amenable to benchmarking with simulated data. We performed a comprehensive benchmarking of 14 common splice-aware aligners for base, read, and exon junction-level accuracy and compared default with optimized parameters. We found that performance varied by genome complexity, and accuracy and popularity were poorly correlated. The most widely cited tool underperforms for most metrics, particularly when using default settings.


Subject(s)
Plasmodium falciparum/genetics, Sequence Alignment/methods, RNA Sequence Analysis/methods, Benchmarking, Computer Simulation, Exons, Human Genome, Humans, Introns, Molecular Sequence Annotation, Genetic Polymorphism, RNA Splicing, Software