Results 1 - 20 of 46
1.
Alzheimers Dement ; 17(9): 1509-1527, 2021 09.
Article in English | MEDLINE | ID: mdl-33797837

ABSTRACT

INTRODUCTION: Genome-wide association studies have led to numerous genetic loci associated with Alzheimer's disease (AD). Whole-genome sequencing (WGS) now permits genome-wide analyses to identify rare variants contributing to AD risk. METHODS: We performed single-variant and spatial clustering-based testing on rare variants (minor allele frequency [MAF] ≤1%) in a family-based WGS-based association study of 2247 subjects from 605 multiplex AD families, followed by replication in 1669 unrelated individuals. RESULTS: We identified 13 new AD candidate loci that yielded consistent rare-variant signals in discovery and replication cohorts (4 from single-variant, 9 from spatial-clustering), implicating these genes: FNBP1L, SEL1L, LINC00298, PRKCH, C15ORF41, C2CD3, KIF2A, APC, LHX9, NALCN, CTNNA2, SYTL3, and CLSTN2. DISCUSSION: Downstream analyses of these novel loci highlight synaptic function, in contrast to common AD-associated variants, which implicate innate immunity and amyloid processing. These loci have not been associated previously with AD, emphasizing the ability of WGS to identify AD-associated rare variants, particularly outside of the exome.


Subject(s)
Alzheimer Disease/genetics, Gene Frequency/genetics, Genetic Predisposition to Disease, Whole Genome Sequencing, Genome-Wide Association Study, Humans, Ion Channels/genetics, Kinesins/genetics, Membrane Proteins/genetics, Microtubule-Associated Proteins/genetics, Proteins/genetics
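
The rare-variant cutoff in the abstract above (MAF ≤1%) can be illustrated with a minimal sketch. The genotype counts, variant names, and function names below are hypothetical, and the calculation treats samples as unrelated, which a family-based study would not do; it shows only how a minor allele frequency threshold selects variants, not the authors' pipeline.

```python
from typing import Dict, Tuple

# Hypothetical genotype counts per biallelic variant:
# (homozygous reference, heterozygous, homozygous alternate) carriers.
GenotypeCounts = Tuple[int, int, int]


def minor_allele_frequency(counts: GenotypeCounts) -> float:
    """Return the frequency of the less common allele at a biallelic site."""
    hom_ref, het, hom_alt = counts
    total_alleles = 2 * (hom_ref + het + hom_alt)
    if total_alleles == 0:
        raise ValueError("no genotyped samples")
    alt_alleles = het + 2 * hom_alt
    # Work with integer counts so the minor allele is chosen exactly.
    minor_alleles = min(alt_alleles, total_alleles - alt_alleles)
    return minor_alleles / total_alleles


def rare_variants(variants: Dict[str, GenotypeCounts],
                  max_maf: float = 0.01) -> Dict[str, float]:
    """Keep variants whose MAF is at or below max_maf (1% by default)."""
    kept = {}
    for name, counts in variants.items():
        maf = minor_allele_frequency(counts)
        if maf <= max_maf:
            kept[name] = maf
    return kept


# Toy data, invented for illustration only.
example = {
    "var_a": (990, 10, 0),    # 10 alternate alleles out of 2000
    "var_b": (800, 180, 20),  # 220 alternate alleles out of 2000
    "var_c": (0, 20, 980),    # alternate allele is the major allele here
}
rare = rare_variants(example)
```

In this toy input, `var_a` and `var_c` pass the filter and `var_b` does not; `var_c` shows why the minor allele, not simply the alternate allele, is the one compared against the threshold.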
2.
Nature ; 514(7522): 322-7, 2014 Oct 16.
Article in English | MEDLINE | ID: mdl-25296256

ABSTRACT

It is currently thought that life-long blood cell production is driven by the action of a small number of multipotent haematopoietic stem cells. Evidence supporting this view has been largely acquired through the use of functional assays involving transplantation. However, whether these mechanisms also govern native non-transplant haematopoiesis is entirely unclear. Here we have established a novel experimental model in mice where cells can be uniquely and genetically labelled in situ to address this question. Using this approach, we have performed longitudinal analyses of clonal dynamics in adult mice that reveal unprecedented features of native haematopoiesis. In contrast to what occurs following transplantation, steady-state blood production is maintained by the successive recruitment of thousands of clones, each with a minimal contribution to mature progeny. Our results demonstrate that a large number of long-lived progenitors, rather than classically defined haematopoietic stem cells, are the main drivers of steady-state haematopoiesis during most of adulthood. Our results also have implications for understanding the cellular origin of haematopoietic disease.


Subject(s)
Cell Lineage, Clone Cells/cytology, Hematopoiesis, Animals, Cellular Senescence, Clone Cells/metabolism, DNA Transposable Elements/genetics, Female, Genetic Markers/genetics, Hematopoietic Stem Cell Transplantation, Hematopoietic Stem Cells/cytology, Hematopoietic Stem Cells/metabolism, Male, Mice, Myelopoiesis, Staining and Labeling, Time Factors
3.
Hum Mol Genet ; 26(8): 1472-1482, 2017 04 15.
Article in English | MEDLINE | ID: mdl-28186563

ABSTRACT

SOX5 encodes a transcription factor that is expressed in multiple tissues including heart, lung and brain. Mutations in SOX5 have been previously found in patients with amyotrophic lateral sclerosis (ALS) and developmental delay, intellectual disability and dysmorphic features. To characterize the neuronal role of SOX5, we silenced the Drosophila ortholog of SOX5, Sox102F, by RNAi in various neuronal subtypes in Drosophila. Silencing of Sox102F led to misorientated and disorganized microchaetes, neurons with shorter dendritic arborization (DA) and reduced complexity, diminished larval peristaltic contractions, loss of neuromuscular junction bouton structures, impaired olfactory perception, and severe neurodegeneration in brain. Silencing of SOX5 in human SH-SY5Y neuroblastoma cells resulted in a significant repression of WNT signaling activity and altered expression of WNT-related genes. Genetic association and meta-analyses of the results in several large family-based and case-control late-onset familial Alzheimer's disease (LOAD) samples of SOX5 variants revealed several variants that show significant association with AD disease status. In addition, analysis for rare and highly penetrant functional variants revealed four novel variants/mutations in SOX5, which, taken together with functional prediction analysis, suggest a strong role of SOX5 causing AD in the carrier families. Collectively, these findings indicate that SOX5 is a novel candidate gene for LOAD with an important role in neuronal function. The genetic findings warrant further studies to identify and characterize SOX5 variants that confer risk for AD, ALS and intellectual disability.


Subject(s)
Alzheimer Disease/genetics, Amyotrophic Lateral Sclerosis/genetics, Developmental Disabilities/genetics, Drosophila Proteins/genetics, SOXD Transcription Factors/genetics, Alzheimer Disease/pathology, Amyotrophic Lateral Sclerosis/pathology, Animals, Developmental Disabilities/pathology, Drosophila/genetics, Gene Silencing, Genetic Association Studies, Humans, Neuromuscular Junction/genetics, Neuromuscular Junction/pathology, Neuronal Plasticity/genetics, Neurons/metabolism, Neurons/pathology, RNA Interference, Wnt Signaling Pathway/genetics
4.
Nature ; 493(7434): 694-8, 2013 Jan 31.
Article in English | MEDLINE | ID: mdl-23364702

ABSTRACT

Genetic and biochemical analyses of RNA interference (RNAi) and microRNA (miRNA) pathways have revealed proteins such as Argonaute and Dicer as essential cofactors that process and present small RNAs to their targets. Well-validated small RNA pathway cofactors such as these show distinctive patterns of conservation or divergence in particular animal, plant, fungal and protist species. We compared 86 divergent eukaryotic genome sequences to discern sets of proteins that show similar phylogenetic profiles with known small RNA cofactors. A large set of additional candidate small RNA cofactors have emerged from functional genomic screens for defects in miRNA- or short interfering RNA (siRNA)-mediated repression in Caenorhabditis elegans and Drosophila melanogaster, and from proteomic analyses of proteins co-purifying with validated small RNA pathway proteins. The phylogenetic profiles of many of these candidate small RNA pathway proteins are similar to those of known small RNA cofactor proteins. We used a Bayesian approach to integrate the phylogenetic profile analysis with predictions from diverse transcriptional coregulation and proteome interaction data sets to assign a probability for each protein for a role in a small RNA pathway. Testing high-confidence candidates from this analysis for defects in RNAi silencing, we found that about one-half of the predicted small RNA cofactors are required for RNAi silencing. Many of the newly identified small RNA pathway proteins are orthologues of proteins implicated in RNA splicing. In support of a deep connection between the mechanism of RNA splicing and small-RNA-mediated gene silencing, the presence of the Argonaute proteins and other small RNA components in the many species analysed strongly correlates with the number of introns in those species.


Subject(s)
Caenorhabditis elegans/genetics, Genetic Variation, Phylogeny, Small Interfering RNA/genetics, Animals, Caenorhabditis elegans/classification, Caenorhabditis elegans Proteins/genetics, Eukaryota/classification, Eukaryota/genetics, Genome/genetics, MicroRNAs/genetics, Proteome, RNA Splicing
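
The abstract above describes a Bayesian integration of several evidence sources into one probability per protein, without giving the model. As a rough sketch of the general idea only, the example below uses a naive Bayes combination of likelihood ratios; this is a substituted, simplified technique, and the prior, the ratios, and the function name are all hypothetical.

```python
import math
from typing import Sequence


def posterior_probability(prior: float,
                          likelihood_ratios: Sequence[float]) -> float:
    """Combine evidence sources into one posterior probability.

    Each likelihood ratio is P(evidence | cofactor) / P(evidence | not cofactor).
    Multiplying the ratios assumes the sources are conditionally independent,
    which is a simplifying assumption of this sketch.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        if ratio <= 0.0:
            raise ValueError("likelihood ratios must be positive")
        log_odds += math.log(ratio)
    # Convert the accumulated log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


# Invented numbers: a 1% prior and three supporting evidence sources
# (for example profile similarity, co-expression, protein interaction).
candidate_posterior = posterior_probability(0.01, [8.0, 3.0, 2.0])
```

With these invented inputs the prior odds of 1:99 are multiplied by 48, giving posterior odds of 16:33, or a probability of 16/49. Working in log-odds keeps the arithmetic stable when many sources are combined.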
5.
Genes Dev ; 25(20): 2210-21, 2011 Oct 15.
Article in English | MEDLINE | ID: mdl-22012622

ABSTRACT

Polycomb group (PcG) proteins are required for the epigenetic maintenance of developmental genes in a silent state. Proteins in the Polycomb-repressive complex 1 (PRC1) class of the PcG are conserved from flies to humans and inhibit transcription. One hypothesis for PRC1 mechanism is that it compacts chromatin, based in part on electron microscopy experiments demonstrating that Drosophila PRC1 compacts nucleosomal arrays. We show that this function is conserved between Drosophila and mouse PRC1 complexes and requires a region with an overrepresentation of basic amino acids. While the active region is found in the Posterior Sex Combs (PSC) subunit in Drosophila, it is unexpectedly found in a different PRC1 subunit, a Polycomb homolog called M33, in mice. We provide experimental support for the general importance of a charged region by predicting the compacting capability of PcG proteins from species other than Drosophila and mice and by testing several of these proteins using solution assays and microscopy. We infer that the ability of PcG proteins to compact chromatin in vitro can be predicted by the presence of domains of high positive charge and that PRC1 components from a variety of species conserve this highly charged region. This supports the hypothesis that compaction is a key aspect of PcG function.


Subject(s)
Chromatin/metabolism, Repressor Proteins/chemistry, Repressor Proteins/metabolism, Animals, Cell Line, Conserved Sequence/genetics, Drosophila melanogaster/classification, Drosophila melanogaster/genetics, Drosophila melanogaster/metabolism, Molecular Evolution, Mice, Mutation, Phylogeny, Polycomb Repressive Complex 1, Polycomb-Group Proteins, Repressor Proteins/genetics, Structure-Activity Relationship
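
The abstract above links compaction to regions with an overrepresentation of basic amino acids. One simple way to look for such regions, sketched here under stated assumptions, is a sliding-window net charge scan. The charge table (Lys/Arg +1, Asp/Glu -1, histidine ignored), the window size, and the toy sequence are illustrative choices, not the authors' prediction method.

```python
from typing import List, Tuple

# Simplified residue charges near neutral pH; histidine is ignored here,
# which is an assumption of this sketch.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def window_charges(sequence: str, window: int) -> List[int]:
    """Net charge of every window of the given length along the sequence."""
    seq = sequence.upper()
    if window <= 0 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    charges = [CHARGE.get(residue, 0) for residue in seq]
    current = sum(charges[:window])
    totals = [current]
    # Slide by one residue: add the entering charge, drop the leaving one.
    for i in range(window, len(charges)):
        current += charges[i] - charges[i - window]
        totals.append(current)
    return totals


def most_positive_window(sequence: str, window: int) -> Tuple[int, int]:
    """Return (start index, net charge) of the most positively charged window."""
    totals = window_charges(sequence, window)
    best_start = max(range(len(totals)), key=totals.__getitem__)
    return best_start, totals[best_start]


# Invented toy peptide with a basic stretch in the middle.
toy_sequence = "MDEAGKRKKRSAEDL"
best = most_positive_window(toy_sequence, 5)
```

For the toy peptide, the scan picks out the `KRKKR` stretch starting at index 5. A real analysis would use longer windows and actual protein sequences.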
6.
Nucleic Acids Res ; 44(11): e108, 2016 06 20.
Article in English | MEDLINE | ID: mdl-27060149

ABSTRACT

Accurate variant calling in next generation sequencing (NGS) is critical to understand cancer genomes better. Here we present VarDict, a novel and versatile variant caller for both DNA- and RNA-sequencing data. VarDict simultaneously calls SNV, MNV, InDels, complex and structural variants, expanding the detected genetic driver landscape of tumors. It performs local realignments on the fly for more accurate allele frequency estimation. VarDict performance scales linearly to sequencing depth, enabling ultra-deep sequencing used to explore tumor evolution or detect tumor DNA circulating in blood. In addition, VarDict performs amplicon aware variant calling for polymerase chain reaction (PCR)-based targeted sequencing often used in diagnostic settings, and is able to detect PCR artifacts. Finally, VarDict also detects differences in somatic and loss of heterozygosity variants between paired samples. VarDict reprocessing of The Cancer Genome Atlas (TCGA) Lung Adenocarcinoma dataset called known driver mutations in KRAS, EGFR, BRAF, PIK3CA and MET in 16% more patients than previously published variant calls. We believe VarDict will greatly facilitate application of NGS in clinical cancer research.


Subject(s)
Computational Biology/methods, High-Throughput Nucleotide Sequencing, DNA Sequence Analysis, Software, Alleles, Gene Frequency, Genetic Variation, Humans, INDEL Mutation, Loss of Heterozygosity, Lung Neoplasms/genetics, Neoplasms/genetics, ROC Curve, Research
7.
PLoS Comput Biol ; 12(2): e1004691, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26914653

ABSTRACT

The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included "Data Science;" "Standards and Interoperability;" "Open Science and Reproducibility;" "Translational Bioinformatics;" "Visualization;" and "Bioinformatics Open Source Project Updates". In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled "Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community," that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule.


Subject(s)
Computational Biology/organization & administration, Congresses as Topic, Humans, Ireland
8.
Breast Cancer Res ; 18(1): 109, 2016 11 05.
Article in English | MEDLINE | ID: mdl-27814745

ABSTRACT

BACKGROUND: Although genome-wide association studies (GWASs) have identified thousands of disease susceptibility regions, the underlying causal mechanism in these regions is not fully known. It is likely that the GWAS signal originates from one or many as yet unidentified causal variants. METHODS: Using next-generation sequencing, we characterized 12 breast cancer susceptibility regions identified by GWASs in 2288 breast cancer cases and 2323 controls across four populations of African American, European, Japanese, and Hispanic ancestry. RESULTS: After genotype calling and quality control, we identified 137,530 single-nucleotide variants (SNVs); of those, 87.2 % had a minor allele frequency (MAF) <0.005. For SNVs with MAF >0.005, we calculated the smallest number of SNVs needed to obtain a posterior probability set (PPS) such that there is 90 % probability that the causal SNV is included. We found that the PPS for two regions, 2q35 and 11q13, contained less than 5 % of the original SNVs, dramatically decreasing the number of potentially causal SNVs. However, we did not find strong evidence supporting a causal role for any individual SNV. In addition, there were no significant gene-based rare SNV associations after correcting for multiple testing. CONCLUSIONS: This study illustrates some of the challenges faced in fine-mapping studies in the post-GWAS era, most importantly the large sample sizes needed to identify rare-variant associations or to distinguish the effects of strongly correlated common SNVs.


Subject(s)
Breast Neoplasms/genetics, Ethnicity/genetics, Genetic Predisposition to Disease, High-Throughput Nucleotide Sequencing, Adult, Case-Control Studies, Female, Gene Frequency, Genetic Variation, Genome-Wide Association Study, Humans, Middle Aged, Molecular Sequence Annotation, Nurses, Open Reading Frames, Single Nucleotide Polymorphism
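
The posterior probability set (PPS) in the abstract above is the smallest group of SNVs whose combined posterior probability of containing the causal variant reaches 90%. A minimal sketch of that selection step follows. It assumes per-SNV posterior probabilities are already available, and the SNV names, values, and function name are hypothetical; how the study derived its posteriors is not stated in the abstract.

```python
from typing import Dict, List


def posterior_probability_set(posteriors: Dict[str, float],
                              coverage: float = 0.90) -> List[str]:
    """Smallest set of SNVs whose summed posterior reaches the coverage.

    Posteriors are renormalised to sum to 1, then SNVs are added in
    decreasing order of posterior until the cumulative total is reached.
    """
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    total = sum(posteriors.values())
    if total <= 0.0:
        raise ValueError("posteriors must sum to a positive value")
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    chosen: List[str] = []
    cumulative = 0.0
    for snv, probability in ranked:
        chosen.append(snv)
        cumulative += probability / total
        if cumulative >= coverage:
            break
    return chosen


# Invented posteriors for five SNVs in one region.
region = {"snv1": 0.55, "snv2": 0.25, "snv3": 0.12, "snv4": 0.05, "snv5": 0.03}
pps_90 = posterior_probability_set(region)
```

Here three of the five SNVs are enough to reach 90%. When many strongly correlated SNVs share the posterior almost evenly, the set stays large, which is the fine-mapping difficulty the abstract points to.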
10.
BMC Bioinformatics ; 15 Suppl 14: S7, 2014.
Article in English | MEDLINE | ID: mdl-25472764

ABSTRACT

BACKGROUND: Computational biology comprises a wide range of technologies and approaches. Multiple technologies can be combined to create more powerful workflows if the individuals contributing the data or providing tools for its interpretation can find mutual understanding and consensus. Much conversation and joint investigation are required in order to identify and implement the best approaches. Traditionally, scientific conferences feature talks presenting novel technologies or insights, followed up by informal discussions during coffee breaks. In multi-institution collaborations, in order to reach agreement on implementation details or to transfer deeper insights in a technology and practical skills, a representative of one group typically visits the other. However, this does not scale well when the number of technologies or research groups is large. Conferences have responded to this issue by introducing Birds-of-a-Feather (BoF) sessions, which offer an opportunity for individuals with common interests to intensify their interaction. However, parallel BoF sessions often make it hard for participants to join multiple BoFs and find common ground between the different technologies, and BoFs are generally too short to allow time for participants to program together. RESULTS: This report summarises our experience with computational biology Codefests, Hackathons and Sprints, which are interactive developer meetings. They are structured to reduce the limitations of traditional scientific meetings described above by strengthening the interaction among peers and letting the participants determine the schedule and topics. These meetings are commonly run as loosely scheduled "unconferences" (self-organized identification of participants and topics for meetings) over at least two days, with early introductory talks to welcome and organize contributors, followed by intensive collaborative coding sessions. 
We summarise some prominent achievements of those meetings and describe differences in how these are organised, how their audience is addressed, and their outreach to their respective communities. CONCLUSIONS: Hackathons, Codefests and Sprints share a stimulating atmosphere that encourages participants to jointly brainstorm and tackle problems of shared interest in a self-driven proactive environment, as well as providing an opportunity for new participants to get involved in collaborative projects.


Subject(s)
Computational Biology, Cooperative Behavior, Software, Communication, Internet
11.
PLoS Comput Biol ; 9(7): e1003153, 2013.
Article in English | MEDLINE | ID: mdl-23874191

ABSTRACT

Modern DNA sequencing technologies enable geneticists to rapidly identify genetic variation among many human genomes. However, isolating the minority of variants underlying disease remains an important, yet formidable challenge for medical genetics. We have developed GEMINI (GEnome MINIng), a flexible software package for exploring all forms of human genetic variation. Unlike existing tools, GEMINI integrates genetic variation with a diverse and adaptable set of genome annotations (e.g., dbSNP, ENCODE, UCSC, ClinVar, KEGG) into a unified database to facilitate interpretation and data exploration. Whereas other methods provide an inflexible set of variant filters or prioritization methods, GEMINI allows researchers to compose complex queries based on sample genotypes, inheritance patterns, and both pre-installed and custom genome annotations. GEMINI also provides methods for ad hoc queries and data exploration, a simple programming interface for custom analyses that leverage the underlying database, and both command line and graphical tools for common analyses. We demonstrate GEMINI's utility for exploring variation in personal genomes and family based genetic studies, and illustrate its ability to scale to studies involving thousands of human samples. GEMINI is designed for reproducibility and flexibility and our goal is to provide researchers with a standard framework for medical genomics.


Subject(s)
Genetic Databases, Genetic Variation, Human Genome, Genomics/methods, Software, Data Mining, Genotype, Humans
12.
Nucleic Acids Res ; 40(Database issue): D984-91, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22121217

ABSTRACT

Mounting evidence suggests that malignant tumors are initiated and maintained by a subpopulation of cancerous cells with biological properties similar to those of normal stem cells. However, descriptions of stem-like gene and pathway signatures in cancers are inconsistent across experimental systems. Driven by a need to improve our understanding of molecular processes that are common and unique across cancer stem cells (CSCs), we have developed the Stem Cell Discovery Engine (SCDE), an online database of curated CSC experiments coupled to the Galaxy analytical framework. The SCDE allows users to consistently describe, share and compare CSC data at the gene and pathway level. Our initial focus has been on carefully curating tissue and cancer stem cell-related experiments from blood, intestine and brain to create a high quality resource containing 53 public studies and 1098 assays. The experimental information is captured and stored in the multi-omics Investigation/Study/Assay (ISA-Tab) format and can be queried in the data repository. A linked Galaxy framework provides a comprehensive, flexible environment populated with novel tools for gene list comparisons against molecular signatures in GeneSigDB and MSigDB, curated experiments in the SCDE and pathways in WikiPathways. The SCDE is available at http://discovery.hsci.harvard.edu.


Subject(s)
Genetic Databases, Neoplastic Stem Cells/metabolism, Animals, Gene Expression Profiling, Humans, Mice, Systems Integration
13.
Proc Natl Acad Sci U S A ; 108(51): 20497-502, 2011 Dec 20.
Article in English | MEDLINE | ID: mdl-22143764

ABSTRACT

Long noncoding RNAs (lncRNAs) have important regulatory roles and can function at the level of chromatin. To determine where lncRNAs bind to chromatin, we developed capture hybridization analysis of RNA targets (CHART), a hybridization-based technique that specifically enriches endogenous RNAs along with their targets from reversibly cross-linked chromatin extracts. CHART was used to enrich the DNA and protein targets of endogenous lncRNAs from flies and humans. This analysis was extended to genome-wide mapping of roX2, a well-studied ncRNA involved in dosage compensation in Drosophila. CHART revealed that roX2 binds at specific genomic sites that coincide with the binding sites of proteins from the male-specific lethal complex that affects dosage compensation. These results reveal the genomic targets of roX2 and demonstrate how CHART can be used to study RNAs in a manner analogous to chromatin immunoprecipitation for proteins.


Subject(s)
Drosophila Proteins/genetics, Drosophila/genetics, Genomics, Untranslated RNA/genetics, RNA-Binding Proteins/genetics, Amino Acid Motifs, Animals, Binding Sites, Chromatin/chemistry, Chromatin/genetics, Chromatin Immunoprecipitation, Genetic Dosage Compensation, Male, Genetic Models, Nucleic Acid Hybridization, Ribonuclease H/chemistry
14.
BMC Bioinformatics ; 13: 315, 2012 Nov 27.
Article in English | MEDLINE | ID: mdl-23181507

ABSTRACT

BACKGROUND: Cloud computing provides an infrastructure that facilitates large scale computational analysis in a scalable, democratized fashion. However, in this context it is difficult to ensure sharing of an analysis environment and associated data in a scalable and precisely reproducible way. RESULTS: CloudMan (usecloudman.org) enables individual researchers to easily deploy, customize, and share their entire cloud analysis environment, including data, tools, and configurations. CONCLUSIONS: With the enabled customization and sharing of instances, CloudMan can be used as a platform for collaboration. The presented solution improves accessibility of cloud resources, tools, and data to the level of an individual researcher and contributes toward reproducibility and transparency of research solutions.


Subject(s)
Information Storage and Retrieval, Software
15.
BMC Bioinformatics ; 13: 209, 2012 Aug 21.
Article in English | MEDLINE | ID: mdl-22909249

ABSTRACT

BACKGROUND: Ongoing innovation in phylogenetics and evolutionary biology has been accompanied by a proliferation of software tools, data formats, analytical techniques and web servers. This brings with it the challenge of integrating phylogenetic and other related biological data found in a wide variety of formats, and underlines the need for reusable software that can read, manipulate and transform this information into the various forms required to build computational pipelines. RESULTS: We built a Python software library for working with phylogenetic data that is tightly integrated with Biopython, a broad-ranging toolkit for computational biology. Our library, Bio.Phylo, is highly interoperable with existing libraries, tools and standards, and is capable of parsing common file formats for phylogenetic trees, performing basic transformations and manipulations, attaching rich annotations, and visualizing trees. We unified the modules for working with the standard file formats Newick, NEXUS and phyloXML behind a consistent and simple API, providing a common set of functionality independent of the data source. CONCLUSIONS: Bio.Phylo meets a growing need in bioinformatics for working with heterogeneous types of phylogenetic data. By supporting interoperability with multiple file formats and leveraging existing Biopython features, this library simplifies the construction of phylogenetic workflows. We also provide examples of the benefits of building a community around a shared open-source project. Bio.Phylo is included with Biopython, available through the Biopython website, http://biopython.org.


Subject(s)
Phylogeny, Software, Computational Biology/methods
16.
BMC Bioinformatics ; 13: 42, 2012 Mar 19.
Article in English | MEDLINE | ID: mdl-22429538

ABSTRACT

BACKGROUND: A steep drop in the cost of next-generation sequencing during recent years has made the technology affordable to the majority of researchers, but downstream bioinformatic analysis still poses a resource bottleneck for smaller laboratories and institutes that do not have access to substantial computational resources. Sequencing instruments are typically bundled with only the minimal processing and storage capacity required for data capture during sequencing runs. Given the scale of sequence datasets, scientific value cannot be obtained from acquiring a sequencer unless it is accompanied by an equal investment in informatics infrastructure. RESULTS: Cloud BioLinux is a publicly accessible Virtual Machine (VM) that enables scientists to quickly provision on-demand infrastructures for high-performance bioinformatics computing using cloud platforms. Users have instant access to a range of pre-configured command line and graphical software applications, including a full-featured desktop interface, documentation and over 135 bioinformatics packages for applications including sequence alignment, clustering, assembly, display, editing, and phylogeny. Each tool's functionality is fully described in the documentation directly accessible from the graphical interface of the VM. Besides the Amazon EC2 cloud, we have started instances of Cloud BioLinux on a private Eucalyptus cloud installed at the J. Craig Venter Institute, and demonstrated access to the bioinformatic tools interface through a remote connection to EC2 instances from a local desktop computer. Documentation for using Cloud BioLinux on EC2 is available from our project website, while a Eucalyptus cloud image and VirtualBox Appliance is also publicly available for download and use by researchers with access to private clouds. CONCLUSIONS: Cloud BioLinux provides a platform for developing bioinformatics infrastructures on the cloud. 
An automated and configurable process builds Virtual Machines, allowing the development of highly customized versions from a shared code base. This shared community toolkit enables application specific analysis platforms on the cloud by minimizing the effort required to prepare and maintain them.


Subject(s)
Computing Methodologies, Genomics/methods, Animals, Computers, Humans, Sequence Alignment, Software
17.
Nucleic Acids Res ; 38(8): 2594-602, 2010 May.
Article in English | MEDLINE | ID: mdl-20194119

ABSTRACT

The engineering of biological components has been facilitated by de novo synthesis of gene-length DNA. Biological engineering at the level of pathways and genomes, however, requires a scalable and cost-effective assembly of DNA molecules that are longer than approximately 10 kb, and this remains a challenge. Here we present the development of pairwise selection assembly (PSA), a process that involves hierarchical construction of long-length DNA through the use of a standard set of components and operations. In PSA, activation tags at the termini of assembly sub-fragments are reused throughout the assembly process to activate vector-encoded selectable markers. Marker activation enables stringent selection for a correctly assembled product in vivo, often obviating the need for clonal isolation. Importantly, construction via PSA is sequence-independent, and does not require primary sequence modification (e.g. the addition or removal of restriction sites). The utility of PSA is demonstrated in the construction of a completely synthetic 91-kb chromosome arm from Saccharomyces cerevisiae.


Subject(s)
DNA/chemical synthesis, Genetic Engineering/methods, Saccharomyces cerevisiae/genetics, Base Sequence, Fungal Chromosomes, DNA/chemistry
19.
BMC Bioinformatics ; 11 Suppl 12: S4, 2010 Dec 21.
Article in English | MEDLINE | ID: mdl-21210983

ABSTRACT

BACKGROUND: Widespread adoption of high-throughput sequencing has greatly increased the scale and sophistication of computational infrastructure needed to perform genomic research. An alternative to building and maintaining local infrastructure is "cloud computing", which, in principle, offers on demand access to flexible computational infrastructure. However, cloud computing resources are not yet suitable for immediate "as is" use by experimental biologists. RESULTS: We present a cloud resource management system that makes it possible for individual researchers to compose and control an arbitrarily sized compute cluster on Amazon's EC2 cloud infrastructure without any informatics requirements. Within this system, an entire suite of biological tools packaged by the NERC Bio-Linux team (http://nebc.nerc.ac.uk/tools/bio-linux) is available for immediate consumption. The provided solution makes it possible, using only a web browser, to create a completely configured compute cluster ready to perform analysis in less than five minutes. Moreover, we provide an automated method for building custom deployments of cloud resources. This approach promotes reproducibility of results and, if desired, allows individuals and labs to add or customize an otherwise available cloud system to better meet their needs. CONCLUSIONS: The expected knowledge and associated effort with deploying a compute cluster in the Amazon EC2 cloud is not trivial. The solution presented in this paper eliminates these barriers, making it possible for researchers to deploy exactly the amount of computing power they need, combined with a wealth of existing analysis software, to handle the ongoing data deluge.


Subject(s)
Computational Biology/methods, Software, Cluster Analysis, Internet
20.
Bioinformatics ; 25(11): 1422-3, 2009 Jun 01.
Article in English | MEDLINE | ID: mdl-19304878

ABSTRACT

SUMMARY: The Biopython project is a mature open source international collaboration of volunteer developers, providing Python libraries for a wide range of bioinformatics problems. Biopython includes modules for reading and writing different sequence file formats and multiple sequence alignments, dealing with 3D macromolecular structures, interacting with common tools such as BLAST, ClustalW and EMBOSS, accessing key online databases, as well as providing numerical methods for statistical learning. AVAILABILITY: Biopython is freely available, with documentation and source code at (www.biopython.org) under the Biopython license.


Subject(s)
Computational Biology/methods, Software, Factual Databases, Internet, Programming Languages