1.
Environ Microbiol ; 22(1): 32-44, 2020 01.
Article in English | MEDLINE | ID: mdl-31602783

ABSTRACT

Horizontal gene transfer via plasmids plays a pivotal role in microbial evolution. The forces that shape plasmidome functionality and distribution in natural environments are insufficiently understood. Here, we present a comparative study of plasmidomes across adjacent microbial environments present in different individual rumen microbiomes. Our findings show that the rumen plasmidome displays enormous unknown functional potential that is currently unannotated in available databases. Nevertheless, this unknown functionality is conserved and shared with published rat gut plasmidome data. Moreover, the rumen plasmidome is highly diverse compared with the microbiome that hosts these plasmids, across both similar and different rumen habitats. Our analysis demonstrates that its structure is shaped more by stochasticity than by selection. Nevertheless, the plasmidome is an active partner in its intricate relationship with the host microbiome, with both interacting with and responding to their environment.


Subject(s)
Bacteria/genetics, Microbiota/genetics, Plasmids/genetics, Rumen/microbiology, Animals, Horizontal Gene Transfer
2.
Bioinformatics ; 34(1): 147-154, 2018 01 01.
Article in English | MEDLINE | ID: mdl-29036597

ABSTRACT

Motivation: We present Faucet, a two-pass streaming algorithm for assembly graph construction. Faucet builds an assembly graph incrementally as each read is processed. Thus, reads need not be stored locally, as they can be processed while downloading data and then discarded. We demonstrate this functionality by performing streaming graph assembly of publicly available data, and observe that the ratio of disk use to raw data size decreases as coverage is increased. Results: Faucet pairs the de Bruijn graph obtained from the reads with additional metadata derived from them. We show that these metadata (coverage counts collected at junction k-mers and connections bridging between junction pairs) contain the most salient information needed for assembly, and demonstrate that they enable cleaning of metagenome assembly graphs, greatly improving contiguity while maintaining accuracy. We compared Faucet's resource use and assembly quality to state-of-the-art metagenome assemblers, as well as leading resource-efficient genome assemblers. Faucet used orders of magnitude less time and disk space than the specialized metagenome assemblers MetaSPAdes and Megahit, while also improving on their memory use; this broadly matched the performance of other assemblers optimizing resource efficiency, namely Minia and LightAssembler. However, on the metagenomes tested, Faucet's outputs had 14-110% higher mean NGA50 lengths compared with Minia, and 2- to 11-fold higher mean NGA50 lengths compared with LightAssembler, the only other streaming assembler available. Availability and implementation: Faucet is available at https://github.com/Shamir-Lab/Faucet. Contact: rshamir@tau.ac.il or eranhalperin@gmail.com. Supplementary information: Supplementary data are available at Bioinformatics online.
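
The idea of collecting coverage counts at junction k-mers while streaming reads can be illustrated with a minimal Python sketch. It is not Faucet's implementation: the function names are hypothetical, reverse complements are ignored, a junction is assumed to be any k-mer with more than one distinct neighbor on either side, and exact in-memory tables are used for clarity, which a resource-efficient tool would likely avoid.

```python
from collections import Counter, defaultdict


def kmers(read, k):
    """Yield every k-mer of a read, left to right."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def junction_coverage(reads, k):
    """Stream reads once and return coverage counts for junction k-mers.

    Assumption (illustrative only): a k-mer is a junction when it has more
    than one distinct successor or more than one distinct predecessor.
    """
    coverage = Counter()
    successors = defaultdict(set)
    predecessors = defaultdict(set)
    for read in reads:  # each read can be discarded once it is processed
        prev = None
        for kmer in kmers(read, k):
            coverage[kmer] += 1
            if prev is not None:
                successors[prev].add(kmer)
                predecessors[kmer].add(prev)
            prev = kmer
    return {
        kmer: count
        for kmer, count in coverage.items()
        if len(successors.get(kmer, ())) > 1
        or len(predecessors.get(kmer, ())) > 1
    }


# Two reads that share a prefix and then diverge after "CGT".
junctions = junction_coverage(["ACGTA", "ACGTC"], k=3)
print(junctions)
```

Because `reads` can be any iterable, a generator that yields reads from a download stream would work without storing the raw data, which is the property the abstract emphasizes.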


Subject(s)
Genomics/methods, Metagenome, Microbiota/genetics, DNA Sequence Analysis/methods, Software, Algorithms, Humans
3.
Bioinformatics ; 33(4): 475-482, 2017 02 15.
Article in English | MEDLINE | ID: mdl-28003256

ABSTRACT

Motivation: Plasmids and other mobile elements are central contributors to microbial evolution and genome innovation. Recently, they have been found to have important roles in antibiotic resistance and in affecting production of metabolites used in industrial and agricultural applications. However, their characterization through deep sequencing remains challenging, in spite of rapid drops in cost and throughput increases for sequencing. Here, we attempt to ameliorate this situation by introducing a new circular element assembly algorithm, leveraging assembly graphs provided by a conventional de novo assembler and alignments of paired-end reads to assemble cyclic sequences likely to be plasmids, phages and other circular elements. Results: We introduce Recycler, the first tool that can extract complete circular contigs from sequence data of isolate microbial genomes, plasmidomes and metagenomes. We show that Recycler greatly increases the number of true plasmids recovered relative to other approaches while remaining highly accurate. We demonstrate this trend via simulations of plasmidomes, comparisons of predictions with reference data for isolate samples, and assessments of annotation accuracy on metagenome data. In addition, we provide validation by DNA amplification of 77 plasmids predicted by Recycler from the different sequenced samples, in which Recycler showed mean accuracy of 89% across all data types: isolate, microbiome and plasmidome. Availability and Implementation: Recycler is available at http://github.com/Shamir-Lab/Recycler. Contact: imizrahi@bgu.ac.il. Supplementary information: Supplementary data are available at Bioinformatics online.
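
The notion of extracting cyclic sequences from an assembly graph can be sketched as plain cycle enumeration in a directed graph of contigs. This is only a hedged illustration: the `simple_cycles` function and the toy graph are hypothetical, and the sketch ignores the coverage and paired-end evidence that the abstract says Recycler leverages to decide which cycles are likely plasmids.

```python
def simple_cycles(graph):
    """Enumerate simple directed cycles in an adjacency-list graph.

    Each cycle is reported once, starting at its smallest node. The search
    is exponential in the worst case, so it suits small toy graphs only.
    """
    cycles = []

    def extend(start, node, path, on_path):
        for nxt in graph.get(node, ()):
            if nxt == start:
                cycles.append(list(path))  # closed a cycle back to start
            elif nxt > start and nxt not in on_path:
                path.append(nxt)
                on_path.add(nxt)
                extend(start, nxt, path, on_path)
                on_path.remove(nxt)
                path.pop()

    for start in sorted(graph):
        extend(start, start, [start], {start})
    return cycles


# Hypothetical contig graph: a -> b -> c -> a forms a candidate circular
# element, while d is a dead-end branch.
contig_graph = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": []}
print(simple_cycles(contig_graph))
```

Restricting each search to nodes larger than the start node avoids reporting the same cycle once per rotation.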


Subject(s)
Bacteria/genetics, Bacterial Genome, Metagenome, Plasmids, DNA Sequence Analysis/methods, Software, Algorithms, Escherichia coli/genetics, High-Throughput Nucleotide Sequencing/methods
4.
Nature ; 465(7299): 808-12, 2010 Jun 10.
Article in English | MEDLINE | ID: mdl-20535210

ABSTRACT

The generation of reprogrammed induced pluripotent stem cells (iPSCs) from patients with defined genetic disorders holds the promise of increased understanding of the aetiologies of complex diseases and may also facilitate the development of novel therapeutic interventions. We have generated iPSCs from patients with LEOPARD syndrome (an acronym formed from its main features; that is, lentigines, electrocardiographic abnormalities, ocular hypertelorism, pulmonary valve stenosis, abnormal genitalia, retardation of growth and deafness), an autosomal-dominant developmental disorder belonging to a relatively prevalent class of inherited RAS-mitogen-activated protein kinase signalling diseases, which also includes Noonan syndrome, with pleomorphic effects on several tissues and organ systems. The patient-derived cells have a mutation in the PTPN11 gene, which encodes the SHP2 phosphatase. The iPSCs have been extensively characterized and produce multiple differentiated cell lineages. A major disease phenotype in patients with LEOPARD syndrome is hypertrophic cardiomyopathy. We show that in vitro-derived cardiomyocytes from LEOPARD syndrome iPSCs are larger, have a higher degree of sarcomeric organization and preferential localization of NFATC4 in the nucleus when compared with cardiomyocytes derived from human embryonic stem cells or wild-type iPSCs derived from a healthy brother of one of the LEOPARD syndrome patients. These features correlate with a potential hypertrophic state. We also provide molecular insights into signalling pathways that may promote the disease phenotype.


Subject(s)
Induced Pluripotent Stem Cells/pathology, LEOPARD Syndrome/pathology, Biological Models, Precision Medicine, Adult, Cell Differentiation, Cell Line, Cell Lineage, Cultured Cells, Embryonic Stem Cells/metabolism, Enzyme Activation, Female, Fibroblasts/metabolism, Fibroblasts/pathology, Gene Expression Profiling, Homeodomain Proteins/genetics, Humans, Induced Pluripotent Stem Cells/enzymology, Induced Pluripotent Stem Cells/metabolism, LEOPARD Syndrome/drug therapy, LEOPARD Syndrome/metabolism, Male, Mitogen-Activated Protein Kinases/metabolism, Cardiac Myocytes/metabolism, Cardiac Myocytes/pathology, NFATC Transcription Factors/genetics, NFATC Transcription Factors/metabolism, Nanog Homeobox Protein, Octamer Transcription Factor-3/genetics, Phosphoproteins/analysis, Polymerase Chain Reaction, Protein Tyrosine Phosphatase Non-Receptor Type 11/genetics, Protein Tyrosine Phosphatase Non-Receptor Type 11/metabolism, SOXB1 Transcription Factors/genetics
5.
Nature ; 462(7271): 358-62, 2009 Nov 19.
Article in English | MEDLINE | ID: mdl-19924215

ABSTRACT

Molecular regulation of embryonic stem cell (ESC) fate involves a coordinated interaction between epigenetic, transcriptional and translational mechanisms. It is unclear how these different molecular regulatory mechanisms interact to regulate changes in stem cell fate. Here we present a dynamic systems-level study of cell fate change in murine ESCs following a well-defined perturbation. Global changes in histone acetylation, chromatin-bound RNA polymerase II, messenger RNA (mRNA), and nuclear protein levels were measured over 5 days after downregulation of Nanog, a key pluripotency regulator. Our data demonstrate how a single genetic perturbation leads to progressive widespread changes in several molecular regulatory layers, and provide a dynamic view of information flow in the epigenome, transcriptome and proteome. We observe that a large proportion of changes in nuclear protein levels are not accompanied by concordant changes in the expression of corresponding mRNAs, indicating important roles for translational and post-translational regulation of ESC fate. Gene-ontology analysis across different molecular layers indicates that although chromatin reconfiguration is important for altering cell fate, it is preceded by transcription-factor-mediated regulatory events. The temporal order of gene expression alterations shows the order of the regulatory network reconfiguration and offers further insight into the gene regulatory network. Our studies extend the conventional systems biology approach to include many molecular species, regulatory layers and temporal series, and underscore the complexity of the multilayer regulatory mechanisms responsible for changes in protein expression that determine stem cell fate.


Subject(s)
Cell Differentiation, Embryonic Stem Cells/cytology, Embryonic Stem Cells/metabolism, Animals, Genetic Epigenesis, Gene Expression Profiling, Developmental Gene Expression Regulation, Mice, Proteome, Time Factors
6.
BMC Bioinformatics ; 15 Suppl 9: S7, 2014.
Article in English | MEDLINE | ID: mdl-25252952

ABSTRACT

BACKGROUND: Data from large Next Generation Sequencing (NGS) experiments present challenges both in terms of costs associated with storage and in time required for file transfer. It is sometimes possible to store only a summary relevant to particular applications, but generally it is desirable to keep all information needed to revisit experimental results in the future. Thus, the need for efficient lossless compression methods for NGS reads arises. It has been shown that NGS-specific compression schemes can improve results over generic compression methods, such as the Lempel-Ziv algorithm, Burrows-Wheeler transform, or Arithmetic Coding. When a reference genome is available, effective compression can be achieved by first aligning the reads to the reference genome, and then encoding each read using the alignment position combined with the differences in the read relative to the reference. These reference-based methods have been shown to compress better than reference-free schemes, but the alignment step they require demands several hours of CPU time on a typical dataset, whereas reference-free methods can usually compress in minutes. RESULTS: We present a new approach that achieves highly efficient compression by using a reference genome, but completely circumvents the need for alignment, affording a great reduction in the time needed to compress. In contrast to reference-based methods that first align reads to the genome, we hash all reads into Bloom filters to encode, and decode by querying the same Bloom filters using read-length subsequences of the reference genome. Further compression is achieved by using a cascade of such filters. CONCLUSIONS: Our method, called BARCODE, runs an order of magnitude faster than reference-based methods, while compressing an order of magnitude better than reference-free methods, over a broad range of sequencing coverage. At high coverage (50-100 fold), compared to the best tested compressors, BARCODE saves 80-90% of the running time while only increasing space slightly.
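
The encode and decode steps described above (hash reads into a Bloom filter, then query it with read-length windows of the reference) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not BARCODE itself: the class, function names, filter size and hash count are hypothetical, only a single filter is used rather than a cascade, and reads that differ from the reference, read multiplicity and false positives are not handled.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter over strings, using salted SHA-256 hashes."""

    def __init__(self, size_bits, num_hashes):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # One bit position per hash function, derived from a seeded digest.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def encode(reads, size_bits=1 << 16, num_hashes=4):
    """Encode reads by inserting each one into a Bloom filter."""
    bloom = BloomFilter(size_bits, num_hashes)
    for read in reads:
        bloom.add(read)
    return bloom


def decode(bloom, reference, read_length):
    """Recover candidate reads by querying every reference window."""
    windows = (
        reference[i:i + read_length]
        for i in range(len(reference) - read_length + 1)
    )
    return {window for window in windows if window in bloom}


reference = "ACGTACGGTCA"
reads = {"ACGTA", "CGGTC"}
recovered = decode(encode(reads), reference, read_length=5)
print(reads <= recovered)
```

A Bloom filter never produces false negatives, so every encoded read that occurs in the reference is recovered; the recovered set may also contain false positives, which a complete scheme would need to resolve.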


Subject(s)
Data Compression/methods, High-Throughput Nucleotide Sequencing/methods, Algorithms, Data Compression/economics, Genome, High-Throughput Nucleotide Sequencing/economics, Software
7.
BMC Bioinformatics ; 13 Suppl 6: S2, 2012 Apr 19.
Article in English | MEDLINE | ID: mdl-22537041

ABSTRACT

BACKGROUND: RNA-Seq is a technique that uses Next Generation Sequencing to identify transcripts and estimate transcription levels. When applying this technique for quantification, one must contend with reads that align to multiple positions in the genome (multireads). Previous efforts to resolve multireads have shown that RNA-Seq expression estimation can be improved using probabilistic allocation of reads to genes. These methods use a probabilistic generative model for data generation and resolve ambiguity using likelihood-based approaches. In many instances, RNA-Seq experiments are performed in the context of a population. The generative models of current methods do not take into account such population information, and it is an open question whether this information can improve quantification of the individual samples. RESULTS: In order to explore the contribution of population-level information in RNA-Seq quantification, we apply a hierarchical probabilistic generative model, which assumes that expression levels of different individuals are sampled from a Dirichlet distribution with parameters specific to the population, and reads are sampled from the distribution of expression levels. We introduce an optimization procedure for the estimation of the model parameters, and use HapMap data and simulated data to demonstrate that the model yields a significant improvement in the accuracy of expression levels of paralogous genes. CONCLUSIONS: We provide a proof of principle of the benefit of drawing on population commonalities to estimate expression. The results of our experiments demonstrate that this approach can be beneficial, primarily for estimation at the gene level.
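
Probabilistic allocation of multireads, as mentioned in the background, is commonly done with an expectation-maximization loop. The Python sketch below shows that single-sample baseline only; it is not the hierarchical Dirichlet population model of this paper. The function name and input layout are hypothetical, and gene length and alignment quality are ignored for brevity.

```python
def em_allocate(read_hits, genes, iterations=50):
    """Estimate relative expression by fractionally allocating multireads.

    read_hits: one list per read, naming the genes that read aligns to.
    Returns a dict mapping each gene to its estimated share of reads.
    """
    theta = {gene: 1.0 / len(genes) for gene in genes}
    for _ in range(iterations):
        counts = {gene: 0.0 for gene in genes}
        # E-step: split each read among its genes in proportion to theta.
        for hits in read_hits:
            total = sum(theta[gene] for gene in hits)
            if total == 0:
                continue
            for gene in hits:
                counts[gene] += theta[gene] / total
        # M-step: re-estimate shares from the fractional counts.
        allocated = sum(counts.values())
        if allocated == 0:
            break
        theta = {gene: counts[gene] / allocated for gene in genes}
    return theta


# Three reads unique to gene A, one unique to gene B, one multiread.
hits = [["A"], ["A"], ["A"], ["B"], ["A", "B"]]
estimate = em_allocate(hits, genes=["A", "B"])
print(estimate)
```

In this toy input the multiread ends up assigned mostly to gene A, because the uniquely aligned reads already indicate that A is more highly expressed.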


Subject(s)
Gene Expression Profiling/methods, Population Genetics/methods, Statistical Models, Genome, HapMap Project, High-Throughput Nucleotide Sequencing, Humans, RNA Sequence Analysis