Results 1 - 20 of 46
1.
Alzheimers Dement ; 17(9): 1509-1527, 2021 09.
Article in English | MEDLINE | ID: mdl-33797837

ABSTRACT

INTRODUCTION: Genome-wide association studies have led to numerous genetic loci associated with Alzheimer's disease (AD). Whole-genome sequencing (WGS) now permits genome-wide analyses to identify rare variants contributing to AD risk. METHODS: We performed single-variant and spatial clustering-based testing on rare variants (minor allele frequency [MAF] ≤1%) in a family-based WGS-based association study of 2247 subjects from 605 multiplex AD families, followed by replication in 1669 unrelated individuals. RESULTS: We identified 13 new AD candidate loci that yielded consistent rare-variant signals in discovery and replication cohorts (4 from single-variant, 9 from spatial-clustering), implicating these genes: FNBP1L, SEL1L, LINC00298, PRKCH, C15ORF41, C2CD3, KIF2A, APC, LHX9, NALCN, CTNNA2, SYTL3, and CLSTN2. DISCUSSION: Downstream analyses of these novel loci highlight synaptic function, in contrast to common AD-associated variants, which implicate innate immunity and amyloid processing. These loci have not been associated previously with AD, emphasizing the ability of WGS to identify AD-associated rare variants, particularly outside of the exome.


Subjects
Alzheimer Disease/genetics , Gene Frequency/genetics , Genetic Predisposition to Disease , Whole Genome Sequencing , Genome-Wide Association Study , Humans , Ion Channels/genetics , Kinesins/genetics , Membrane Proteins/genetics , Microtubule-Associated Proteins/genetics , Proteins/genetics
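
The rare-variant cutoff used in the abstract above (MAF ≤1%) can be illustrated with a minimal sketch. The function names and the genotype encoding (alternate-allele counts 0, 1 or 2 per diploid subject) are assumptions made for illustration and are not taken from the paper. The naive estimate below also ignores relatedness between subjects, which a family-based study would need to account for.

```python
def minor_allele_frequency(genotypes):
    """Return the minor allele frequency of one biallelic variant.

    `genotypes` holds alternate-allele counts (0, 1 or 2) per diploid
    subject; None marks a missing call and is skipped.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    # Each diploid subject contributes two alleles.
    alt_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is less common.
    return min(alt_freq, 1.0 - alt_freq)


def is_rare(genotypes, threshold=0.01):
    """True when the variant meets the MAF <= threshold criterion."""
    return minor_allele_frequency(genotypes) <= threshold


# Hypothetical data: one heterozygous carrier among 100 subjects.
example = [0] * 99 + [1]
print(minor_allele_frequency(example), is_rare(example))
```

With one heterozygote among 100 subjects the alternate allele accounts for 1 of 200 alleles, so the variant falls under the 1% threshold.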
2.
Nature ; 514(7522): 322-7, 2014 Oct 16.
Article in English | MEDLINE | ID: mdl-25296256

ABSTRACT

It is currently thought that life-long blood cell production is driven by the action of a small number of multipotent haematopoietic stem cells. Evidence supporting this view has been largely acquired through the use of functional assays involving transplantation. However, whether these mechanisms also govern native non-transplant haematopoiesis is entirely unclear. Here we have established a novel experimental model in mice where cells can be uniquely and genetically labelled in situ to address this question. Using this approach, we have performed longitudinal analyses of clonal dynamics in adult mice that reveal unprecedented features of native haematopoiesis. In contrast to what occurs following transplantation, steady-state blood production is maintained by the successive recruitment of thousands of clones, each with a minimal contribution to mature progeny. Our results demonstrate that a large number of long-lived progenitors, rather than classically defined haematopoietic stem cells, are the main drivers of steady-state haematopoiesis during most of adulthood. Our results also have implications for understanding the cellular origin of haematopoietic disease.


Subjects
Cell Lineage , Clone Cells/cytology , Hematopoiesis , Animals , Cellular Senescence , Clone Cells/metabolism , DNA Transposable Elements/genetics , Female , Genetic Markers/genetics , Hematopoietic Stem Cell Transplantation , Hematopoietic Stem Cells/cytology , Hematopoietic Stem Cells/metabolism , Male , Mice , Myelopoiesis , Staining and Labeling , Time Factors
3.
Hum Mol Genet ; 26(8): 1472-1482, 2017 04 15.
Article in English | MEDLINE | ID: mdl-28186563

ABSTRACT

SOX5 encodes a transcription factor that is expressed in multiple tissues including heart, lung and brain. Mutations in SOX5 have been previously found in patients with amyotrophic lateral sclerosis (ALS) and developmental delay, intellectual disability and dysmorphic features. To characterize the neuronal role of SOX5, we silenced the Drosophila ortholog of SOX5, Sox102F, by RNAi in various neuronal subtypes in Drosophila. Silencing of Sox102F led to misorientated and disorganized microchaetes, neurons with shorter dendritic arborization (DA) and reduced complexity, diminished larval peristaltic contractions, loss of neuromuscular junction bouton structures, impaired olfactory perception, and severe neurodegeneration in brain. Silencing of SOX5 in human SH-SY5Y neuroblastoma cells resulted in a significant repression of WNT signaling activity and altered expression of WNT-related genes. Genetic association and meta-analyses of the results in several large family-based and case-control late-onset familial Alzheimer's disease (LOAD) samples of SOX5 variants revealed several variants that show significant association with AD status. In addition, analysis for rare and highly penetrant functional variants revealed four novel variants/mutations in SOX5, which, taken together with functional prediction analysis, suggest a strong role for SOX5 in causing AD in the carrier families. Collectively, these findings indicate that SOX5 is a novel candidate gene for LOAD with an important role in neuronal function. The genetic findings warrant further studies to identify and characterize SOX5 variants that confer risk for AD, ALS and intellectual disability.


Subjects
Alzheimer Disease/genetics , Amyotrophic Lateral Sclerosis/genetics , Developmental Disabilities/genetics , Drosophila Proteins/genetics , SOXD Transcription Factors/genetics , Alzheimer Disease/pathology , Amyotrophic Lateral Sclerosis/pathology , Animals , Developmental Disabilities/pathology , Drosophila/genetics , Gene Silencing , Genetic Association Studies , Humans , Neuromuscular Junction/genetics , Neuromuscular Junction/pathology , Neuronal Plasticity/genetics , Neurons/metabolism , Neurons/pathology , RNA Interference , Wnt Signaling Pathway/genetics
4.
Nature ; 493(7434): 694-8, 2013 Jan 31.
Article in English | MEDLINE | ID: mdl-23364702

ABSTRACT

Genetic and biochemical analyses of RNA interference (RNAi) and microRNA (miRNA) pathways have revealed proteins such as Argonaute and Dicer as essential cofactors that process and present small RNAs to their targets. Well-validated small RNA pathway cofactors such as these show distinctive patterns of conservation or divergence in particular animal, plant, fungal and protist species. We compared 86 divergent eukaryotic genome sequences to discern sets of proteins that show similar phylogenetic profiles with known small RNA cofactors. A large set of additional candidate small RNA cofactors have emerged from functional genomic screens for defects in miRNA- or short interfering RNA (siRNA)-mediated repression in Caenorhabditis elegans and Drosophila melanogaster, and from proteomic analyses of proteins co-purifying with validated small RNA pathway proteins. The phylogenetic profiles of many of these candidate small RNA pathway proteins are similar to those of known small RNA cofactor proteins. We used a Bayesian approach to integrate the phylogenetic profile analysis with predictions from diverse transcriptional coregulation and proteome interaction data sets to assign a probability for each protein for a role in a small RNA pathway. Testing high-confidence candidates from this analysis for defects in RNAi silencing, we found that about one-half of the predicted small RNA cofactors are required for RNAi silencing. Many of the newly identified small RNA pathway proteins are orthologues of proteins implicated in RNA splicing. In support of a deep connection between the mechanism of RNA splicing and small-RNA-mediated gene silencing, the presence of the Argonaute proteins and other small RNA components in the many species analysed strongly correlates with the number of introns in those species.


Subjects
Caenorhabditis elegans/genetics , Genetic Variation , Phylogeny , RNA, Small Interfering/genetics , Animals , Caenorhabditis elegans/classification , Caenorhabditis elegans Proteins/genetics , Eukaryota/classification , Eukaryota/genetics , Genome/genetics , MicroRNAs/genetics , Proteome , RNA Splicing
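
The phylogenetic-profile comparison described in the abstract above can be sketched roughly as follows. A profile here is a presence/absence vector over a set of genomes, and candidates are ranked by similarity to the profile of a known cofactor. The Jaccard index, the function names, and the toy profiles are all assumptions for illustration; the abstract does not say which similarity measure the authors used.

```python
def jaccard_similarity(profile_a, profile_b):
    """Jaccard index of two presence/absence profiles (sequences of 0/1)."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomes")
    pairs = list(zip(profile_a, profile_b))
    both = sum(1 for a, b in pairs if a and b)      # present in both
    either = sum(1 for a, b in pairs if a or b)     # present in at least one
    return both / either if either else 0.0


def rank_candidates(reference, candidates):
    """Sort candidate proteins by profile similarity to a reference protein."""
    scores = {name: jaccard_similarity(reference, profile)
              for name, profile in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical profiles over six genomes (1 = ortholog detected).
known_cofactor = [1, 1, 0, 1, 0, 0]
candidates = {
    "cand_A": [1, 1, 0, 1, 0, 0],
    "cand_B": [1, 0, 0, 0, 1, 1],
    "cand_C": [0, 0, 0, 0, 0, 0],
}
for name, score in rank_candidates(known_cofactor, candidates):
    print(name, round(score, 2))
```

A similarity score like this would be only one line of evidence; the abstract describes combining it with coregulation and interaction data in a Bayesian framework, which this sketch does not attempt.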
5.
Genes Dev ; 25(20): 2210-21, 2011 Oct 15.
Article in English | MEDLINE | ID: mdl-22012622

ABSTRACT

Polycomb group (PcG) proteins are required for the epigenetic maintenance of developmental genes in a silent state. Proteins in the Polycomb-repressive complex 1 (PRC1) class of the PcG are conserved from flies to humans and inhibit transcription. One hypothesis for PRC1 mechanism is that it compacts chromatin, based in part on electron microscopy experiments demonstrating that Drosophila PRC1 compacts nucleosomal arrays. We show that this function is conserved between Drosophila and mouse PRC1 complexes and requires a region with an overrepresentation of basic amino acids. While the active region is found in the Posterior Sex Combs (PSC) subunit in Drosophila, it is unexpectedly found in a different PRC1 subunit, a Polycomb homolog called M33, in mice. We provide experimental support for the general importance of a charged region by predicting the compacting capability of PcG proteins from species other than Drosophila and mice and by testing several of these proteins using solution assays and microscopy. We infer that the ability of PcG proteins to compact chromatin in vitro can be predicted by the presence of domains of high positive charge and that PRC1 components from a variety of species conserve this highly charged region. This supports the hypothesis that compaction is a key aspect of PcG function.


Subjects
Chromatin/metabolism , Repressor Proteins/chemistry , Repressor Proteins/metabolism , Animals , Cell Line , Conserved Sequence/genetics , Drosophila melanogaster/classification , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Evolution, Molecular , Mice , Mutation , Phylogeny , Polycomb Repressive Complex 1 , Polycomb-Group Proteins , Repressor Proteins/genetics , Structure-Activity Relationship
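
The idea of locating a region with an overrepresentation of basic amino acids, as in the abstract above, can be illustrated with a sliding-window net-charge scan. This is a simplified sketch, not the authors' prediction method: the charge table (K and R as +1, D and E as -1, histidine ignored), the window length, and the sequence are all assumptions chosen for illustration.

```python
# Assumed side-chain charges at neutral pH; histidine is treated as neutral.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def window_charges(sequence, window=10):
    """Net charge of every window of length `window` along the sequence."""
    seq = sequence.upper()
    if window <= 0 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    charges = [CHARGE.get(residue, 0) for residue in seq]
    current = sum(charges[:window])
    totals = [current]
    # Slide the window one residue at a time, updating the running sum.
    for i in range(window, len(charges)):
        current += charges[i] - charges[i - window]
        totals.append(current)
    return totals


def most_basic_window(sequence, window=10):
    """Return (start index, net charge) of the first most positive window."""
    totals = window_charges(sequence, window)
    start = max(range(len(totals)), key=totals.__getitem__)
    return start, totals[start]


# Made-up sequence with a lysine/arginine-rich stretch.
print(most_basic_window("AAAAKKRKRKAADE", window=5))
```

On a real protein the window length and a charge cutoff would need to be calibrated against proteins with known compaction activity.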
6.
Nucleic Acids Res ; 44(11): e108, 2016 06 20.
Article in English | MEDLINE | ID: mdl-27060149

ABSTRACT

Accurate variant calling in next generation sequencing (NGS) is critical to understand cancer genomes better. Here we present VarDict, a novel and versatile variant caller for both DNA- and RNA-sequencing data. VarDict simultaneously calls SNV, MNV, InDels, complex and structural variants, expanding the detected genetic driver landscape of tumors. It performs local realignments on the fly for more accurate allele frequency estimation. VarDict performance scales linearly to sequencing depth, enabling ultra-deep sequencing used to explore tumor evolution or detect tumor DNA circulating in blood. In addition, VarDict performs amplicon aware variant calling for polymerase chain reaction (PCR)-based targeted sequencing often used in diagnostic settings, and is able to detect PCR artifacts. Finally, VarDict also detects differences in somatic and loss of heterozygosity variants between paired samples. VarDict reprocessing of The Cancer Genome Atlas (TCGA) Lung Adenocarcinoma dataset called known driver mutations in KRAS, EGFR, BRAF, PIK3CA and MET in 16% more patients than previously published variant calls. We believe VarDict will greatly facilitate application of NGS in clinical cancer research.


Subjects
Computational Biology/methods , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA , Software , Alleles , Gene Frequency , Genetic Variation , Humans , INDEL Mutation , Loss of Heterozygosity , Lung Neoplasms/genetics , Neoplasms/genetics , ROC Curve , Research
7.
PLoS Comput Biol ; 12(2): e1004691, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26914653

ABSTRACT

The Bioinformatics Open Source Conference (BOSC) is organized by the Open Bioinformatics Foundation (OBF), a nonprofit group dedicated to promoting the practice and philosophy of open source software development and open science within the biological research community. Since its inception in 2000, BOSC has provided bioinformatics developers with a forum for communicating the results of their latest efforts to the wider research community. BOSC offers a focused environment for developers and users to interact and share ideas about standards; software development practices; practical techniques for solving bioinformatics problems; and approaches that promote open science and sharing of data, results, and software. BOSC is run as a two-day special interest group (SIG) before the annual Intelligent Systems in Molecular Biology (ISMB) conference. BOSC 2015 took place in Dublin, Ireland, and was attended by over 125 people, about half of whom were first-time attendees. Session topics included "Data Science;" "Standards and Interoperability;" "Open Science and Reproducibility;" "Translational Bioinformatics;" "Visualization;" and "Bioinformatics Open Source Project Updates". In addition to two keynote talks and dozens of shorter talks chosen from submitted abstracts, BOSC 2015 included a panel, titled "Open Source, Open Door: Increasing Diversity in the Bioinformatics Open Source Community," that provided an opportunity for open discussion about ways to increase the diversity of participants in BOSC in particular, and in open source bioinformatics in general. The complete program of BOSC 2015 is available online at http://www.open-bio.org/wiki/BOSC_2015_Schedule.


Subjects
Computational Biology/organization & administration , Congresses as Topic , Humans , Ireland
8.
Breast Cancer Res ; 18(1): 109, 2016 11 05.
Article in English | MEDLINE | ID: mdl-27814745

ABSTRACT

BACKGROUND: Although genome-wide association studies (GWASs) have identified thousands of disease susceptibility regions, the underlying causal mechanism in these regions is not fully known. It is likely that the GWAS signal originates from one or many as yet unidentified causal variants. METHODS: Using next-generation sequencing, we characterized 12 breast cancer susceptibility regions identified by GWASs in 2288 breast cancer cases and 2323 controls across four populations of African American, European, Japanese, and Hispanic ancestry. RESULTS: After genotype calling and quality control, we identified 137,530 single-nucleotide variants (SNVs); of those, 87.2 % had a minor allele frequency (MAF) <0.005. For SNVs with MAF >0.005, we calculated the smallest number of SNVs needed to obtain a posterior probability set (PPS) such that there is 90 % probability that the causal SNV is included. We found that the PPS for two regions, 2q35 and 11q13, contained less than 5 % of the original SNVs, dramatically decreasing the number of potentially causal SNVs. However, we did not find strong evidence supporting a causal role for any individual SNV. In addition, there were no significant gene-based rare SNV associations after correcting for multiple testing. CONCLUSIONS: This study illustrates some of the challenges faced in fine-mapping studies in the post-GWAS era, most importantly the large sample sizes needed to identify rare-variant associations or to distinguish the effects of strongly correlated common SNVs.


Subjects
Breast Neoplasms/genetics , Ethnicity/genetics , Genetic Predisposition to Disease , High-Throughput Nucleotide Sequencing , Adult , Case-Control Studies , Female , Gene Frequency , Genetic Variation , Genome-Wide Association Study , Humans , Middle Aged , Molecular Sequence Annotation , Nurses , Open Reading Frames , Polymorphism, Single Nucleotide
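
The posterior probability set (PPS) described in the abstract above, the smallest set of SNVs with a 90% probability of containing the causal variant, can be sketched as a greedy selection over per-SNV posterior probabilities. The abstract does not state how those posteriors were computed, so this sketch assumes they are already available; the function name and the example values are hypothetical.

```python
def posterior_probability_set(posteriors, coverage=0.90):
    """Return the smallest list of SNVs whose posteriors sum to `coverage`.

    `posteriors` maps SNV identifiers to posterior probabilities of being
    causal; values are renormalised so that they sum to 1 over the region.
    """
    total = sum(posteriors.values())
    if total <= 0:
        raise ValueError("posterior probabilities must sum to a positive value")
    # Taking SNVs in order of decreasing posterior gives the smallest set.
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    selected, cumulative = [], 0.0
    for snv, probability in ranked:
        selected.append(snv)
        cumulative += probability / total
        if cumulative >= coverage:
            break
    return selected


# Hypothetical region with four SNVs.
region = {"snv1": 0.60, "snv2": 0.25, "snv3": 0.10, "snv4": 0.05}
print(posterior_probability_set(region))
```

Here three of the four SNVs are needed to reach 90% coverage. When several SNVs are strongly correlated, their posteriors tend to be similar and the set shrinks little, which is consistent with the difficulty the abstract reports in singling out individual causal SNVs.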
10.
BMC Bioinformatics ; 15 Suppl 14: S7, 2014.
Article in English | MEDLINE | ID: mdl-25472764

ABSTRACT

BACKGROUND: Computational biology comprises a wide range of technologies and approaches. Multiple technologies can be combined to create more powerful workflows if the individuals contributing the data or providing tools for its interpretation can find mutual understanding and consensus. Much conversation and joint investigation are required in order to identify and implement the best approaches. Traditionally, scientific conferences feature talks presenting novel technologies or insights, followed up by informal discussions during coffee breaks. In multi-institution collaborations, in order to reach agreement on implementation details or to transfer deeper insights in a technology and practical skills, a representative of one group typically visits the other. However, this does not scale well when the number of technologies or research groups is large. Conferences have responded to this issue by introducing Birds-of-a-Feather (BoF) sessions, which offer an opportunity for individuals with common interests to intensify their interaction. However, parallel BoF sessions often make it hard for participants to join multiple BoFs and find common ground between the different technologies, and BoFs are generally too short to allow time for participants to program together. RESULTS: This report summarises our experience with computational biology Codefests, Hackathons and Sprints, which are interactive developer meetings. They are structured to reduce the limitations of traditional scientific meetings described above by strengthening the interaction among peers and letting the participants determine the schedule and topics. These meetings are commonly run as loosely scheduled "unconferences" (self-organized identification of participants and topics for meetings) over at least two days, with early introductory talks to welcome and organize contributors, followed by intensive collaborative coding sessions. We summarise some prominent achievements of those meetings and describe differences in how these are organised, how their audience is addressed, and their outreach to their respective communities. CONCLUSIONS: Hackathons, Codefests and Sprints share a stimulating atmosphere that encourages participants to jointly brainstorm and tackle problems of shared interest in a self-driven proactive environment, as well as providing an opportunity for new participants to get involved in collaborative projects.


Subjects
Computational Biology , Cooperative Behavior , Software , Communication , Internet
11.
PLoS Comput Biol ; 9(7): e1003153, 2013.
Article in English | MEDLINE | ID: mdl-23874191

ABSTRACT

Modern DNA sequencing technologies enable geneticists to rapidly identify genetic variation among many human genomes. However, isolating the minority of variants underlying disease remains an important, yet formidable challenge for medical genetics. We have developed GEMINI (GEnome MINIng), a flexible software package for exploring all forms of human genetic variation. Unlike existing tools, GEMINI integrates genetic variation with a diverse and adaptable set of genome annotations (e.g., dbSNP, ENCODE, UCSC, ClinVar, KEGG) into a unified database to facilitate interpretation and data exploration. Whereas other methods provide an inflexible set of variant filters or prioritization methods, GEMINI allows researchers to compose complex queries based on sample genotypes, inheritance patterns, and both pre-installed and custom genome annotations. GEMINI also provides methods for ad hoc queries and data exploration, a simple programming interface for custom analyses that leverage the underlying database, and both command line and graphical tools for common analyses. We demonstrate GEMINI's utility for exploring variation in personal genomes and family based genetic studies, and illustrate its ability to scale to studies involving thousands of human samples. GEMINI is designed for reproducibility and flexibility and our goal is to provide researchers with a standard framework for medical genomics.


Subjects
Databases, Genetic , Genetic Variation , Genome, Human , Genomics/methods , Software , Data Mining , Genotype , Humans
12.
Nucleic Acids Res ; 40(Database issue): D984-91, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22121217

ABSTRACT

Mounting evidence suggests that malignant tumors are initiated and maintained by a subpopulation of cancerous cells with biological properties similar to those of normal stem cells. However, descriptions of stem-like gene and pathway signatures in cancers are inconsistent across experimental systems. Driven by a need to improve our understanding of molecular processes that are common and unique across cancer stem cells (CSCs), we have developed the Stem Cell Discovery Engine (SCDE), an online database of curated CSC experiments coupled to the Galaxy analytical framework. The SCDE allows users to consistently describe, share and compare CSC data at the gene and pathway level. Our initial focus has been on carefully curating tissue and cancer stem cell-related experiments from blood, intestine and brain to create a high quality resource containing 53 public studies and 1098 assays. The experimental information is captured and stored in the multi-omics Investigation/Study/Assay (ISA-Tab) format and can be queried in the data repository. A linked Galaxy framework provides a comprehensive, flexible environment populated with novel tools for gene list comparisons against molecular signatures in GeneSigDB and MSigDB, curated experiments in the SCDE and pathways in WikiPathways. The SCDE is available at http://discovery.hsci.harvard.edu.


Subjects
Databases, Genetic , Neoplastic Stem Cells/metabolism , Animals , Gene Expression Profiling , Humans , Mice , Systems Integration
13.
Proc Natl Acad Sci U S A ; 108(51): 20497-502, 2011 Dec 20.
Article in English | MEDLINE | ID: mdl-22143764

ABSTRACT

Long noncoding RNAs (lncRNAs) have important regulatory roles and can function at the level of chromatin. To determine where lncRNAs bind to chromatin, we developed capture hybridization analysis of RNA targets (CHART), a hybridization-based technique that specifically enriches endogenous RNAs along with their targets from reversibly cross-linked chromatin extracts. CHART was used to enrich the DNA and protein targets of endogenous lncRNAs from flies and humans. This analysis was extended to genome-wide mapping of roX2, a well-studied ncRNA involved in dosage compensation in Drosophila. CHART revealed that roX2 binds at specific genomic sites that coincide with the binding sites of proteins from the male-specific lethal complex that affects dosage compensation. These results reveal the genomic targets of roX2 and demonstrate how CHART can be used to study RNAs in a manner analogous to chromatin immunoprecipitation for proteins.


Subjects
Drosophila Proteins/genetics , Drosophila/genetics , Genomics , RNA, Untranslated/genetics , RNA-Binding Proteins/genetics , Amino Acid Motifs , Animals , Binding Sites , Chromatin/chemistry , Chromatin/genetics , Chromatin Immunoprecipitation , Dosage Compensation, Genetic , Male , Models, Genetic , Nucleic Acid Hybridization , Ribonuclease H/chemistry
14.
BMC Bioinformatics ; 13: 315, 2012 Nov 27.
Article in English | MEDLINE | ID: mdl-23181507

ABSTRACT

BACKGROUND: Cloud computing provides an infrastructure that facilitates large scale computational analysis in a scalable, democratized fashion. However, in this context it is difficult to ensure sharing of an analysis environment and associated data in a scalable and precisely reproducible way. RESULTS: CloudMan (usecloudman.org) enables individual researchers to easily deploy, customize, and share their entire cloud analysis environment, including data, tools, and configurations. CONCLUSIONS: With the enabled customization and sharing of instances, CloudMan can be used as a platform for collaboration. The presented solution improves accessibility of cloud resources, tools, and data to the level of an individual researcher and contributes toward reproducibility and transparency of research solutions.


Subjects
Information Storage and Retrieval , Software
15.
BMC Bioinformatics ; 13: 209, 2012 Aug 21.
Article in English | MEDLINE | ID: mdl-22909249

ABSTRACT

BACKGROUND: Ongoing innovation in phylogenetics and evolutionary biology has been accompanied by a proliferation of software tools, data formats, analytical techniques and web servers. This brings with it the challenge of integrating phylogenetic and other related biological data found in a wide variety of formats, and underlines the need for reusable software that can read, manipulate and transform this information into the various forms required to build computational pipelines. RESULTS: We built a Python software library for working with phylogenetic data that is tightly integrated with Biopython, a broad-ranging toolkit for computational biology. Our library, Bio.Phylo, is highly interoperable with existing libraries, tools and standards, and is capable of parsing common file formats for phylogenetic trees, performing basic transformations and manipulations, attaching rich annotations, and visualizing trees. We unified the modules for working with the standard file formats Newick, NEXUS and phyloXML behind a consistent and simple API, providing a common set of functionality independent of the data source. CONCLUSIONS: Bio.Phylo meets a growing need in bioinformatics for working with heterogeneous types of phylogenetic data. By supporting interoperability with multiple file formats and leveraging existing Biopython features, this library simplifies the construction of phylogenetic workflows. We also provide examples of the benefits of building a community around a shared open-source project. Bio.Phylo is included with Biopython, available through the Biopython website, http://biopython.org.


Subjects
Phylogeny , Software , Computational Biology/methods
16.
BMC Bioinformatics ; 13: 42, 2012 Mar 19.
Article in English | MEDLINE | ID: mdl-22429538

ABSTRACT

BACKGROUND: A steep drop in the cost of next-generation sequencing during recent years has made the technology affordable to the majority of researchers, but downstream bioinformatic analysis still poses a resource bottleneck for smaller laboratories and institutes that do not have access to substantial computational resources. Sequencing instruments are typically bundled with only the minimal processing and storage capacity required for data capture during sequencing runs. Given the scale of sequence datasets, scientific value cannot be obtained from acquiring a sequencer unless it is accompanied by an equal investment in informatics infrastructure. RESULTS: Cloud BioLinux is a publicly accessible Virtual Machine (VM) that enables scientists to quickly provision on-demand infrastructures for high-performance bioinformatics computing using cloud platforms. Users have instant access to a range of pre-configured command line and graphical software applications, including a full-featured desktop interface, documentation and over 135 bioinformatics packages for applications including sequence alignment, clustering, assembly, display, editing, and phylogeny. Each tool's functionality is fully described in the documentation directly accessible from the graphical interface of the VM. Besides the Amazon EC2 cloud, we have started instances of Cloud BioLinux on a private Eucalyptus cloud installed at the J. Craig Venter Institute, and demonstrated access to the bioinformatic tools interface through a remote connection to EC2 instances from a local desktop computer. Documentation for using Cloud BioLinux on EC2 is available from our project website, while a Eucalyptus cloud image and VirtualBox Appliance are also publicly available for download and use by researchers with access to private clouds. CONCLUSIONS: Cloud BioLinux provides a platform for developing bioinformatics infrastructures on the cloud. An automated and configurable process builds Virtual Machines, allowing the development of highly customized versions from a shared code base. This shared community toolkit enables application specific analysis platforms on the cloud by minimizing the effort required to prepare and maintain them.


Subjects
Computing Methodologies , Genomics/methods , Animals , Computers , Humans , Sequence Alignment , Software
17.
Nucleic Acids Res ; 38(8): 2594-602, 2010 May.
Article in English | MEDLINE | ID: mdl-20194119

ABSTRACT

The engineering of biological components has been facilitated by de novo synthesis of gene-length DNA. Biological engineering at the level of pathways and genomes, however, requires a scalable and cost-effective assembly of DNA molecules that are longer than approximately 10 kb, and this remains a challenge. Here we present the development of pairwise selection assembly (PSA), a process that involves hierarchical construction of long-length DNA through the use of a standard set of components and operations. In PSA, activation tags at the termini of assembly sub-fragments are reused throughout the assembly process to activate vector-encoded selectable markers. Marker activation enables stringent selection for a correctly assembled product in vivo, often obviating the need for clonal isolation. Importantly, construction via PSA is sequence-independent, and does not require primary sequence modification (e.g. the addition or removal of restriction sites). The utility of PSA is demonstrated in the construction of a completely synthetic 91-kb chromosome arm from Saccharomyces cerevisiae.


Subjects
DNA/chemical synthesis , Genetic Engineering/methods , Saccharomyces cerevisiae/genetics , Base Sequence , Chromosomes, Fungal , DNA/chemistry
19.
BMC Bioinformatics ; 11 Suppl 12: S4, 2010 Dec 21.
Article in English | MEDLINE | ID: mdl-21210983

ABSTRACT

BACKGROUND: Widespread adoption of high-throughput sequencing has greatly increased the scale and sophistication of computational infrastructure needed to perform genomic research. An alternative to building and maintaining local infrastructure is "cloud computing", which, in principle, offers on demand access to flexible computational infrastructure. However, cloud computing resources are not yet suitable for immediate "as is" use by experimental biologists. RESULTS: We present a cloud resource management system that makes it possible for individual researchers to compose and control an arbitrarily sized compute cluster on Amazon's EC2 cloud infrastructure without any informatics requirements. Within this system, an entire suite of biological tools packaged by the NERC Bio-Linux team (http://nebc.nerc.ac.uk/tools/bio-linux) is available for immediate consumption. The provided solution makes it possible, using only a web browser, to create a completely configured compute cluster ready to perform analysis in less than five minutes. Moreover, we provide an automated method for building custom deployments of cloud resources. This approach promotes reproducibility of results and, if desired, allows individuals and labs to add or customize an otherwise available cloud system to better meet their needs. CONCLUSIONS: The expected knowledge and associated effort with deploying a compute cluster in the Amazon EC2 cloud is not trivial. The solution presented in this paper eliminates these barriers, making it possible for researchers to deploy exactly the amount of computing power they need, combined with a wealth of existing analysis software, to handle the ongoing data deluge.


Subjects
Computational Biology/methods , Software , Cluster Analysis , Internet
20.
Bioinformatics ; 25(11): 1422-3, 2009 Jun 01.
Article in English | MEDLINE | ID: mdl-19304878

ABSTRACT

SUMMARY: The Biopython project is a mature open source international collaboration of volunteer developers, providing Python libraries for a wide range of bioinformatics problems. Biopython includes modules for reading and writing different sequence file formats and multiple sequence alignments, dealing with 3D macromolecular structures, interacting with common tools such as BLAST, ClustalW and EMBOSS, accessing key online databases, as well as providing numerical methods for statistical learning. AVAILABILITY: Biopython is freely available, with documentation and source code at (www.biopython.org) under the Biopython license.


Subjects
Computational Biology/methods , Software , Databases, Factual , Internet , Programming Languages