Results 1 - 20 of 106
1.
Development ; 148(6), 2021 03 21.
Article in English | MEDLINE | ID: mdl-33653874

ABSTRACT

To gain a deeper understanding of pancreatic β-cell development, we used iterative weighted gene correlation network analysis to calculate a gene co-expression network (GCN) from 11 temporally and genetically defined murine cell populations. The GCN, which contained 91 distinct modules, was then used to gain three new biological insights. First, we found that the clustered protocadherin genes are differentially expressed during pancreas development. Pcdhγ genes are preferentially expressed in pancreatic endoderm, Pcdhβ genes in nascent islets, and Pcdhα genes in mature β-cells. Second, after extracting sub-networks of transcriptional regulators for each developmental stage, we identified 81 zinc finger protein (ZFP) genes that are preferentially expressed during endocrine specification and β-cell maturation. Third, we used the GCN to select three ZFPs for further analysis by CRISPR mutagenesis of mice. Zfp800 null mice exhibited early postnatal lethality, and at E18.5 their pancreata exhibited a reduced number of pancreatic endocrine cells, alterations in exocrine cell morphology, and marked changes in expression of genes involved in protein translation, hormone secretion and developmental pathways in the pancreas. Together, our results suggest that developmentally oriented GCNs have utility for gaining new insights into gene regulation during organogenesis.


Subject(s)
Cell Differentiation/genetics , Homeodomain Proteins/genetics , Organogenesis/genetics , Pancreas/growth & development , Animals , Cadherins/genetics , Cell Lineage/genetics , Gene Expression Regulation, Developmental/genetics , Insulin/metabolism , Islets of Langerhans/cytology , Islets of Langerhans/metabolism , Mice , Pancreas/metabolism
2.
Alzheimers Dement ; 20(2): 1123-1136, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37881831

ABSTRACT

INTRODUCTION: The National Institute on Aging Genetics of Alzheimer's Disease Data Storage Site Alzheimer's Genomics Database (GenomicsDB) is a public knowledge base of Alzheimer's disease (AD) genetic datasets and genomic annotations. METHODS: GenomicsDB uses a custom systems architecture to adopt and enforce rigorous standards that facilitate harmonization of AD-relevant genome-wide association study summary statistics datasets with functional annotations, including over 230 million annotated variants from the AD Sequencing Project. RESULTS: GenomicsDB generates interactive reports compiled from the harmonized datasets and annotations. These reports contextualize AD-risk associations in a broader functional genomic setting and summarize them in the context of functionally annotated genes and variants. DISCUSSION: Created to make AD-genetics knowledge more accessible to AD researchers, the GenomicsDB is designed to guide users unfamiliar with genetic data in not only exploring but also interpreting this ever-growing volume of data. Scalable and interoperable with other genomics resources using data technology standards, the GenomicsDB can serve as a central hub for research and data analysis on AD and related dementias. HIGHLIGHTS: The National Institute on Aging Genetics of Alzheimer's Disease Data Storage Site (NIAGADS) offers to the public a unique, disease-centric collection of AD-relevant GWAS summary statistics datasets. Interpreting these data is challenging and requires significant bioinformatics expertise to standardize datasets and harmonize them with functional annotations on genome-wide scales. The NIAGADS Alzheimer's GenomicsDB helps overcome these challenges by providing a user-friendly public knowledge base for AD-relevant genetics that shares harmonized, annotated summary statistics datasets from the NIAGADS repository in an interpretable, easily searchable format.


Subject(s)
Alzheimer Disease , United States , Humans , Alzheimer Disease/genetics , Genome-Wide Association Study , National Institute on Aging (U.S.) , Genomics , Databases, Factual , Genetic Predisposition to Disease/genetics
3.
J Biomed Inform ; 112S: 100086, 2020.
Article in English | MEDLINE | ID: mdl-34417005

ABSTRACT

Standardizing clinical information in a semantically rich data model is useful for promoting interoperability and facilitating high quality research. Semantic Web technologies such as Resource Description Framework can be utilized to their full potential when a model accurately reflects the semantics of the clinical situation it describes. To this end, ontologies that abide by sound organizational principles can be used as the building blocks of a semantically rich model for the storage of clinical data. However, it is a challenge to programmatically define such a model and load data from disparate sources. The PennTURBO Semantic Engine is a tool developed at the University of Pennsylvania that transforms concise RDF data into a source-independent, semantically rich model. This system sources classes from an application ontology and specifically defines how instances of those classes may relate to each other. Additionally, the system defines and executes RDF data transformations by launching dynamically generated SPARQL update statements. The Semantic Engine was designed as a generalizable data standardization tool, and is able to work with various data models and incoming data sources. Its human-readable configuration files can easily be shared between institutions, providing the basis for collaboration on a standard data model.

4.
Nucleic Acids Res ; 46(D1): D684-D691, 2018 01 04.
Article in English | MEDLINE | ID: mdl-29106667

ABSTRACT

MicrobiomeDB (http://microbiomeDB.org) is a data discovery and analysis platform that empowers researchers to fully leverage experimental variables to interrogate microbiome datasets. MicrobiomeDB was developed in collaboration with the Eukaryotic Pathogens Bioinformatics Resource Center (http://EuPathDB.org) and leverages the infrastructure and user interface of EuPathDB, which allows users to construct in silico experiments using an intuitive graphical 'strategy' approach. The current release of the database integrates microbial census data with sample details for nearly 14 000 samples originating from human, animal and environmental sources, including over 9000 samples from healthy human subjects in the Human Microbiome Project (http://portal.ihmpdcc.org/). Query results can be statistically analyzed and graphically visualized via interactive web applications launched directly in the browser, providing insight into microbial community diversity and allowing users to identify taxa associated with any experimental covariate.


Subject(s)
Data Mining/methods , Databases, Genetic , Microbiota , Systems Biology , Animals , Computer Simulation , Datasets as Topic , Environmental Microbiology , Genetic Variation , Humans , Internet , Mobile Applications , User-Computer Interface , Workflow
5.
Nucleic Acids Res ; 45(D1): D581-D591, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27903906

ABSTRACT

The Eukaryotic Pathogen Genomics Database Resource (EuPathDB, http://eupathdb.org) is a collection of databases covering 170+ eukaryotic pathogens (protists & fungi), along with relevant free-living and non-pathogenic species, and select pathogen hosts. To facilitate the discovery of meaningful biological relationships, the databases couple preconfigured searches with visualization and analysis tools for comprehensive data mining via intuitive graphical interfaces and APIs. All data are analyzed with the same workflows, including creation of gene orthology profiles, so data are easily compared across data sets, data types and organisms. EuPathDB is updated with numerous new analysis tools, features, data sets and data types. New tools include GO, metabolic pathway and word enrichment analyses plus an online workspace for analysis of personal, non-public, large-scale data. Expanded data content is mostly genomic and functional genomic data while new data types include protein microarray, metabolic pathways, compounds, quantitative proteomics, copy number variation, and polysomal transcriptomics. New features include consistent categorization of searches, data sets and genome browser tracks; redesigned gene pages; effective integration of alternative transcripts; and a EuPathDB Galaxy instance for private analyses of a user's data. Forthcoming upgrades include user workspaces for private integration of data with existing EuPathDB data and improved integration and presentation of host-pathogen interactions.


Subject(s)
Databases, Genetic , Eukaryota , Genomics/methods , Host-Pathogen Interactions/genetics , Metagenome , Metagenomics/methods , Software , Computational Biology/methods , DNA Copy Number Variations , Gene Expression Profiling , Proteomics , Web Browser
6.
Genes Dev ; 24(10): 1035-44, 2010 May 15.
Article in English | MEDLINE | ID: mdl-20478996

ABSTRACT

The transcriptional mechanisms by which temporary exposure to developmental signals instigates adipocyte differentiation are unknown. During early adipogenesis, we find transient enrichment of the glucocorticoid receptor (GR), CCAAT/enhancer-binding protein beta (CEBPbeta), p300, mediator subunit 1, and histone H3 acetylation near genes involved in cell proliferation, development, and differentiation, including the gene encoding the master regulator of adipocyte differentiation, peroxisome proliferator-activated receptor gamma2 (PPARgamma2). Occupancy and enhancer function are triggered by adipogenic signals, and diminish upon their removal. GR, which is important for adipogenesis but need not be active in the mature adipocyte, functions transiently with other enhancer proteins to propagate a new program of gene expression that includes induction of PPARgamma2, thereby providing a memory of the earlier adipogenic signal. Thus, the conversion of preadipocyte to adipocyte involves the formation of an epigenomic transition state that is not observed in cells at the beginning or end of the differentiation process.


Subject(s)
Adipogenesis/physiology , Epigenesis, Genetic , Signal Transduction , Acetylation , Animals , CCAAT-Enhancer-Binding Protein-beta/metabolism , Cell Line , Histones/metabolism , Mice , Peroxisome Proliferator-Activated Receptors/metabolism , Receptors, Glucocorticoid/metabolism
7.
Development ; 141(15): 2939-49, 2014 Aug.
Article in English | MEDLINE | ID: mdl-25053427

ABSTRACT

Insulinoma associated 1 (Insm1) plays an important role in regulating the development of cells in the central and peripheral nervous systems, olfactory epithelium and endocrine pancreas. To better define the role of Insm1 in pancreatic endocrine cell development we generated mice with an Insm1(GFPCre) reporter allele and used them to study Insm1-expressing and null populations. Endocrine progenitor cells lacking Insm1 were less differentiated and exhibited broad defects in hormone production, cell proliferation and cell migration. Embryos lacking Insm1 contained greater amounts of a non-coding Neurog3 mRNA splice variant and had fewer Neurog3/Insm1 co-expressing progenitor cells, suggesting that Insm1 positively regulates Neurog3. Moreover, endocrine progenitor cells that express either high or low levels of Pdx1, and thus may be biased towards the formation of specific cell lineages, exhibited cell type-specific differences in the genes regulated by Insm1. Analysis of the function of Ripply3, an Insm1-regulated gene enriched in the Pdx1-high cell population, revealed that it negatively regulates the proliferation of early endocrine cells. Taken together, these findings indicate that in developing pancreatic endocrine cells Insm1 promotes the transition from a ductal progenitor to a committed endocrine cell by repressing a progenitor cell program and activating genes essential for RNA splicing, cell migration, controlled cellular proliferation, vasculogenesis, extracellular matrix and hormone secretion.


Subject(s)
Basic Helix-Loop-Helix Transcription Factors/metabolism , DNA-Binding Proteins/physiology , Endocrine Cells/cytology , Gene Expression Regulation, Developmental , Nerve Tissue Proteins/metabolism , Repressor Proteins/metabolism , Transcription Factors/physiology , Alleles , Alternative Splicing , Animals , Cell Differentiation , Cell Lineage , Cell Movement , Cell Proliferation , Cell Separation , Extracellular Matrix/metabolism , Flow Cytometry , Gene Regulatory Networks , Genes, Reporter , Green Fluorescent Proteins/metabolism , Mice , Mice, Knockout , Pancreas/embryology , RNA/metabolism , RNA Splicing , Stem Cells/cytology , Time Factors , Transcription, Genetic
8.
BMC Genomics ; 16: 506, 2015 Jul 07.
Article in English | MEDLINE | ID: mdl-26148682

ABSTRACT

BACKGROUND: Atherosclerosis is a heterogeneously distributed disease of arteries in which the endothelium plays an important central role. Spatial transcriptome profiling of endothelium in pre-lesional arteries has demonstrated differential phenotypes primed for athero-susceptibility at hemodynamic sites associated with disturbed blood flow. DNA methylation is a powerful epigenetic regulator of endothelial transcription recently associated with flow characteristics. We investigated differential DNA methylation in flow region-specific aortic endothelial cells in vivo in adult domestic male and female swine. RESULTS: Genome-wide DNA methylation was profiled in endothelial cells (EC) isolated from two robust locations of differing patho-susceptibility: an athero-susceptible site located at the inner curvature of the aortic arch (AA) and an athero-protected region in the descending thoracic (DT) aorta. Complete methylated DNA immunoprecipitation sequencing (MeDIP-seq) identified over 5500 endothelial differentially methylated regions (DMRs). DMR density was significantly enriched in exons and 5'UTR sequences of annotated genes, 60 of which are linked to cardiovascular disease. The set of DMR-associated genes was enriched in transcriptional regulation, pattern specification HOX loci, oxidative stress and the ER stress adaptive pathway, all categories linked to athero-susceptible endothelium. Examination of the relationship between DMR and mRNA in HOXA genes demonstrated a significant inverse relationship between CpG island promoter methylation and gene expression. Methylation-specific PCR (MSP) confirmed differential CpG methylation of HOXA genes, the ER stress gene ATF4, inflammatory regulator microRNA-10a and ARHGAP25 that encodes a negative regulator of Rho GTPases involved in cytoskeleton remodeling. Gender-specific DMRs associated with ciliogenesis that may be linked to defects in cilia development were also identified in AA DMRs.
CONCLUSIONS: An endothelial methylome analysis identifies epigenetic DMR characteristics associated with transcriptional regulation in regions of atherosusceptibility in swine aorta in vivo. The data represent the first methylome blueprint for spatio-temporal analyses of lesion susceptibility predisposing to endothelial dysfunction in complex flow environments in vivo.


Subject(s)
Aorta/metabolism , DNA Methylation/genetics , Endothelium, Vascular/metabolism , Transcriptome/genetics , Animals , Atherosclerosis/genetics , CpG Islands/genetics , Endothelial Cells/metabolism , Female , Gene Expression Profiling/methods , Gene Expression Regulation/genetics , Male , Phenotype , Promoter Regions, Genetic/genetics , RNA, Messenger/genetics , Spatio-Temporal Analysis , Swine
9.
Bioinformatics ; 30(9): 1340-2, 2014 May 01.
Article in English | MEDLINE | ID: mdl-24413522

ABSTRACT

Biomedical ontologies are often very large and complex. Only a subset of the ontology may be needed for a specified application or community. For ontology end users, it is desirable to have community-based labels rather than the labels generated by ontology developers. Ontodog is a web-based system that can generate an ontology subset based on Excel input, and support generation of an ontology community view, which is defined as the whole or a subset of the source ontology with user-specified annotations including user-preferred labels. Ontodog allows users to easily generate community views with minimal ontology knowledge and no programming skills or installation required. Currently >100 ontologies including all OBO Foundry ontologies are available to generate the views based on user needs. We demonstrate the application of Ontodog for the generation of community views using the Ontology for Biomedical Investigations as the source ontology.


Subject(s)
Biological Ontologies , Internet , Software
10.
Blood ; 121(6): e5-e13, 2013 Feb 07.
Article in English | MEDLINE | ID: mdl-23243273

ABSTRACT

Erythroid ontogeny is characterized by overlapping waves of primitive and definitive erythroid lineages that share many morphologic features during terminal maturation but have marked differences in cell size and globin expression. In the present study, we compared global gene expression in primitive, fetal definitive, and adult definitive erythroid cells at morphologically equivalent stages of maturation purified from embryonic, fetal, and adult mice. Surprisingly, most transcriptional complexity in erythroid precursors is already present by the proerythroblast stage. Transcript levels are markedly modulated during terminal erythroid maturation, but housekeeping genes are not preferentially lost. Although primitive and definitive erythroid lineages share a large set of nonhousekeeping genes, annotation of lineage-restricted genes shows that alternate gene usage occurs within shared functional categories, as exemplified by the selective expression of aquaporins 3 and 8 in primitive erythroblasts and aquaporins 1 and 9 in adult definitive erythroblasts. Consistent with the known functions of Aqp3 and Aqp8 as H2O2 transporters, primitive, but not definitive, erythroblasts preferentially accumulate reactive oxygen species after exogenous H2O2 exposure. We have created a user-friendly Web site (http://www.cbil.upenn.edu/ErythronDB) to make these global expression data readily accessible and amenable to complex search strategies by the scientific community.


Subject(s)
Erythroid Cells/metabolism , Erythropoiesis/genetics , Gene Expression Profiling , Gene Expression Regulation, Developmental , Animals , Aquaporin 1/genetics , Aquaporin 3/genetics , Aquaporins/genetics , Cell Lineage/genetics , Cells, Cultured , Erythroblasts/metabolism , Erythrocytes/metabolism , Female , Hematopoietic System/cytology , Hematopoietic System/embryology , Hematopoietic System/growth & development , Mice , Mice, Inbred ICR , Reactive Oxygen Species/metabolism , Reverse Transcriptase Polymerase Chain Reaction , Time Factors
11.
Nucleic Acids Res ; 41(Database issue): D684-91, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23175615

ABSTRACT

EuPathDB (http://eupathdb.org) resources include 11 databases supporting eukaryotic pathogen genomic and functional genomic data, isolate data and phylogenomics. EuPathDB resources are built using the same infrastructure and provide a sophisticated search strategy system enabling complex interrogations of underlying data. Recent advances in EuPathDB resources include the design and implementation of a new data loading workflow, a new database supporting Piroplasmida (i.e. Babesia and Theileria), the addition of large amounts of new data and data types and the incorporation of new analysis tools. New data include genome sequences and annotation, strand-specific RNA-seq data, splice junction predictions (based on RNA-seq), phosphoproteomic data, high-throughput phenotyping data, single nucleotide polymorphism data based on high-throughput sequencing (HTS) and expression quantitative trait loci data. New analysis tools enable users to search for DNA motifs and define genes based on their genomic colocation, view results from searches graphically (i.e. genes mapped to chromosomes or isolates displayed on a map) and analyze data from columns in result tables (word cloud and histogram summaries of column content). The manuscript herein describes updates to EuPathDB since the previous report published in NAR in 2010.


Subject(s)
Databases, Genetic , Parasites/genetics , Animals , Genomics , Internet , Molecular Sequence Annotation , Phenotype , Piroplasmida/genetics , Polymorphism, Single Nucleotide , Proteomics , Quantitative Trait Loci , RNA Splice Sites , Sequence Analysis, RNA , Software
12.
Nature ; 455(7214): 757-63, 2008 Oct 09.
Article in English | MEDLINE | ID: mdl-18843361

ABSTRACT

The human malaria parasite Plasmodium vivax is responsible for 25-40% of the approximately 515 million annual cases of malaria worldwide. Although seldom fatal, the parasite elicits severe and incapacitating clinical symptoms and often causes relapses months after a primary infection has cleared. Despite its importance as a major human pathogen, P. vivax is little studied because it cannot be propagated continuously in the laboratory except in non-human primates. We sequenced the genome of P. vivax to shed light on its distinctive biological features, and as a means to drive development of new drugs and vaccines. Here we describe the synteny and isochore structure of P. vivax chromosomes, and show that the parasite resembles other malaria parasites in gene content and metabolic potential, but possesses novel gene families and potential alternative invasion pathways not recognized previously. Completion of the P. vivax genome provides the scientific community with a valuable resource that can be used to advance investigation into this neglected species.


Subject(s)
Genome, Protozoan/genetics , Genomics , Malaria, Vivax/parasitology , Plasmodium vivax/genetics , Amino Acid Motifs , Animals , Artemisinins/metabolism , Artemisinins/pharmacology , Atovaquone/metabolism , Atovaquone/pharmacology , Cell Nucleus/genetics , Chromosomes/genetics , Conserved Sequence/genetics , Erythrocytes/parasitology , Evolution, Molecular , Haplorhini/parasitology , Humans , Isochores/genetics , Ligands , Malaria, Vivax/metabolism , Multigene Family , Plasmodium vivax/drug effects , Plasmodium vivax/pathogenicity , Plasmodium vivax/physiology , Sequence Analysis, DNA , Species Specificity , Synteny/genetics
13.
Stem Cells ; 30(10): 2297-308, 2012 Oct.
Article in English | MEDLINE | ID: mdl-22865702

ABSTRACT

Sox17 is essential for both endoderm development and fetal hematopoietic stem cell (HSC) maintenance. While endoderm-derived organs are well known to originate from Sox17-expressing cells, it is less certain whether fetal HSCs also originate from Sox17-expressing cells. By generating a Sox17(GFPCre) allele and using it to assess the fate of Sox17-expressing cells during embryogenesis, we confirmed that both endodermal and a part of definitive hematopoietic cells are derived from Sox17-positive cells. Prior to E9.5, the expression of Sox17 is restricted to the endoderm lineage. However, at E9.5 Sox17 is expressed in the endothelial cells (ECs) at the para-aortic splanchnopleural region that contribute to the formation of HSCs at a later stage. The identification of two distinct progenitor cell populations that express Sox17 at E9.5 was confirmed using fluorescence-activated cell sorting together with RNA-Seq to determine the gene expression profiles of the two cell populations. Interestingly, this analysis revealed differences in the RNA processing of the Sox17 mRNA during embryogenesis. Taken together, these results indicate that Sox17 is expressed in progenitor cells derived from two different germ layers, further demonstrating the complex expression pattern of this gene and suggesting caution when using Sox17 as a lineage-specific marker.


Subject(s)
Fetal Stem Cells/metabolism , Gene Expression Regulation, Developmental , HMGB Proteins/genetics , Hematopoietic Stem Cells/metabolism , SOXF Transcription Factors/genetics , Animals , Cell Differentiation , Cell Lineage , Embryo, Mammalian , Embryonic Development , Endoderm/cytology , Endoderm/metabolism , Fetal Stem Cells/cytology , Flow Cytometry , Green Fluorescent Proteins/genetics , HMGB Proteins/metabolism , Hematopoietic Stem Cells/cytology , Mice , Mice, Transgenic , RNA, Messenger/biosynthesis , SOXF Transcription Factors/metabolism
14.
Nucleic Acids Res ; 39(Database issue): D612-9, 2011 Jan.
Article in English | MEDLINE | ID: mdl-20974635

ABSTRACT

AmoebaDB (http://AmoebaDB.org) and MicrosporidiaDB (http://MicrosporidiaDB.org) are new functional genomic databases serving the amoebozoa and microsporidia research communities, respectively. AmoebaDB contains the genomes of three Entamoeba species (E. dispar, E. invadens and E. histolytica) and microarray expression data for E. histolytica. MicrosporidiaDB contains the genomes of Encephalitozoon cuniculi, E. intestinalis and E. bieneusi. The databases belong to the National Institute of Allergy and Infectious Diseases (NIAID) funded EuPathDB (http://EuPathDB.org) Bioinformatics Resource Center family of integrated databases and assume the same architectural and graphical design as other EuPathDB resources such as PlasmoDB and TriTrypDB. Importantly they utilize the graphical strategy builder that affords a database user the ability to ask complex multi-data-type questions with relative ease and versatility. Genomic scale data can be queried based on BLAST searches, annotation keywords and gene ID searches, GO terms, sequence motifs, protein characteristics, phylogenetic relationships and functional data such as transcript (microarray and EST evidence) and protein expression data. Search strategies can be saved within a user's profile for future retrieval and may also be shared with other researchers using a unique strategy web address.


Subject(s)
Databases, Genetic , Encephalitozoon/genetics , Entamoeba/genetics , Genome, Fungal , Genome, Protozoan , Genomics
15.
Nat Genet ; 32 Suppl: 469-73, 2002 Dec.
Article in English | MEDLINE | ID: mdl-12454640

ABSTRACT

A single microarray can provide information on the expression of tens of thousands of genes. The amount of information generated by a microarray-based experiment is sufficiently large that no single study can be expected to mine each nugget of scientific information. As a consequence, the scale and complexity of microarray experiments require that computer software programs do much of the data processing, storage, visualization, analysis and transfer. The adoption of common standards and ontologies for the management and sharing of microarray data is essential and will provide immediate benefit to the research community.


Subject(s)
Database Management Systems , Databases, Genetic/standards , Gene Expression Profiling/standards , Oligonucleotide Array Sequence Analysis/standards , Electronic Data Processing , Gene Expression Profiling/methods , Humans , Information Storage and Retrieval , Internet , Models, Biological , Oligonucleotide Array Sequence Analysis/methods , Programming Languages , Quality Control , Sequence Analysis, DNA , Software
16.
Bioinformatics ; 27(18): 2518-28, 2011 Sep 15.
Article in English | MEDLINE | ID: mdl-21775302

ABSTRACT

MOTIVATION: A critical task in high-throughput sequencing is aligning millions of short reads to a reference genome. Alignment is especially complicated for RNA sequencing (RNA-Seq) because of RNA splicing. A number of RNA-Seq algorithms are available, and claim to align reads with high accuracy and efficiency while detecting splice junctions. RNA-Seq data are discrete in nature; therefore, with reasonable gene models and comparative metrics RNA-Seq data can be simulated to sufficient accuracy to enable meaningful benchmarking of alignment algorithms. The exercise to rigorously compare all viable published RNA-Seq algorithms has not been performed previously. RESULTS: We developed an RNA-Seq simulator that models the main impediments to RNA alignment, including alternative splicing, insertions, deletions, substitutions, sequencing errors and intron signal. We used this simulator to measure the accuracy and robustness of available algorithms at the base and junction levels. Additionally, we used reverse transcription-polymerase chain reaction (RT-PCR) and Sanger sequencing to validate the ability of the algorithms to detect novel transcript features such as novel exons and alternative splicing in RNA-Seq data from mouse retina. A pipeline based on BLAT was developed to explore the performance of established tools for this problem, and to compare it to the recently developed methods. This pipeline, the RNA-Seq Unified Mapper (RUM), performs comparably to the best current aligners and provides an advantageous combination of accuracy, speed and usability. AVAILABILITY: The RUM pipeline is distributed via the Amazon Cloud and for computing clusters using the Sun Grid Engine (http://cbil.upenn.edu/RUM). CONTACT: ggrant@pcbi.upenn.edu; epierce@mail.med.upenn.edu SUPPLEMENTARY INFORMATION: The RNA-Seq sequence reads described in the article are deposited at GEO, accession GSE26248.


Subject(s)
Sequence Analysis, RNA/methods , Algorithms , Animals , Base Sequence , Benchmarking , Cluster Analysis , Exons , Gene Library , Genome , High-Throughput Nucleotide Sequencing , Mice , Models, Genetic , Molecular Sequence Data , RNA/genetics , RNA Splicing , Sequence Alignment , Software
17.
Nucleic Acids Res ; 38(Database issue): D415-9, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19914931

ABSTRACT

EuPathDB (http://EuPathDB.org; formerly ApiDB) is an integrated database covering the eukaryotic pathogens of the genera Cryptosporidium, Giardia, Leishmania, Neospora, Plasmodium, Toxoplasma, Trichomonas and Trypanosoma. While each of these groups is supported by a taxon-specific database built upon the same infrastructure, the EuPathDB portal offers an entry point to all these resources, and the opportunity to leverage orthology for searches across genera. The most recent release of EuPathDB includes updates and changes affecting data content, infrastructure and the user interface, improving data access and enhancing the user experience. EuPathDB currently supports more than 80 searches and the recently-implemented 'search strategy' system enables users to construct complex multi-step searches via a graphical interface. Search results are dynamically displayed as the strategy is constructed or modified, and can be downloaded, saved, revised, or shared with other database users.


Subject(s)
Computational Biology/methods , Databases, Genetic , Databases, Nucleic Acid , Protozoan Infections/parasitology , Protozoan Proteins/genetics , Animals , Computational Biology/trends , Databases, Protein , Genome, Protozoan , Humans , Information Storage and Retrieval/methods , Internet , Protein Structure, Tertiary , Protozoan Infections/genetics , Software
18.
Nucleic Acids Res ; 38(Database issue): D457-62, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19843604

ABSTRACT

TriTrypDB (http://tritrypdb.org) is an integrated database providing access to genome-scale datasets for kinetoplastid parasites, and supporting a variety of complex queries driven by research and development needs. TriTrypDB is a collaborative project, utilizing the GUS/WDK computational infrastructure developed by the Eukaryotic Pathogen Bioinformatics Resource Center (EuPathDB.org) to integrate genome annotation and analyses from GeneDB and elsewhere with a wide variety of functional genomics datasets made available by members of the global research community, often pre-publication. Currently, TriTrypDB integrates datasets from Leishmania braziliensis, L. infantum, L. major, L. tarentolae, Trypanosoma brucei and T. cruzi. Users may examine individual genes or chromosomal spans in their genomic context, including syntenic alignments with other kinetoplastid organisms. Data within TriTrypDB can be interrogated utilizing a sophisticated search strategy system that enables a user to construct complex queries combining multiple data types. All search strategies are stored, allowing future access and integrated searches. 'User Comments' may be added to any gene page, enhancing available annotation; such comments become immediately searchable via the text search, and are forwarded to curators for incorporation into the reference annotation when appropriate.


Subject(s)
Computational Biology/methods , Genetic Databases , Nucleic Acid Databases , Leishmania/genetics , Trypanosoma/genetics , Animals , Computational Biology/trends , Protein Databases , Protozoan Genome , Information Storage and Retrieval/methods , Internet , Tertiary Protein Structure , Protozoan Proteins/genetics , Software , User-Computer Interface
19.
Bioinformatics ; 26(19): 2470-1, 2010 Oct 01.
Article in English | MEDLINE | ID: mdl-20733062

ABSTRACT

Computational methods in molecular biology will increasingly depend on standards-based annotations that describe biological experiments in an unambiguous manner. Annotare is a software tool that enables biologists to easily annotate their high-throughput experiments, biomaterials and data in a standards-compliant way that facilitates meaningful search and analysis. AVAILABILITY AND IMPLEMENTATION: Annotare is available from http://code.google.com/p/annotare/ under the terms of the open-source MIT License (http://www.opensource.org/licenses/mit-license.php). It has been tested on both Mac and Windows.


Subject(s)
Gene Expression Profiling , Oligonucleotide Array Sequence Analysis , Software , Computational Biology/methods , Factual Databases , Molecular Sequence Annotation , User-Computer Interface
20.
Circ Res ; 105(5): 453-61, 2009 Aug 28.
Article in English | MEDLINE | ID: mdl-19661457

ABSTRACT

RATIONALE: Endothelial function and dysfunction are central to the focal origin and regional development of atherosclerosis; however, an in vivo endothelial phenotypic footprint of susceptibility to atherosclerosis preceding pathological change remains elusive. OBJECTIVE: To conduct a comparative multi-site genomics study of arterial endothelial phenotype in atherosusceptible and atheroprotected regions. METHODS AND RESULTS: Transcript profiles of freshly isolated endothelial cells from 7 discrete arterial regions in normal swine were analyzed to determine the steady state in vivo endothelial phenotypes in regions of varying susceptibilities to atherosclerosis. The most abundant common feature of the endothelium of all atherosusceptible regions was the upregulation of genes associated with endoplasmic reticulum (ER) stress. The unfolded protein response pathway, induced by ER stress, was therefore investigated in detail in endothelium of the atherosusceptible aortic arch and was found to be partially activated. ER transmembrane signal transducers IRE1alpha and ATF6alpha and their downstream effectors, but not PERK, were activated concomitant with a higher transcript expression of protein folding enzymes and chaperones, indicative of ER stress in vivo. CONCLUSIONS: The findings demonstrate the prevalence of chronic endothelial ER stress and activated unfolded protein response in vivo at atherosusceptible arterial sites. We propose that chronic localized biological stress is linked to spatial susceptibility of the endothelium to the initiation of atherosclerosis.


Subject(s)
Atherosclerosis/genetics , Endoplasmic Reticulum/chemistry , Vascular Endothelium/chemistry , Physiological Stress/genetics , Animals , Aorta/chemistry , Atherosclerosis/metabolism , Carotid Arteries/chemistry , Gene Expression Profiling/methods , Gene Regulatory Networks , Genetic Predisposition to Disease , Oligonucleotide Array Sequence Analysis , Phenotype , Protein Biosynthesis/genetics , Protein Folding , Messenger RNA/analysis , Signal Transduction/genetics , Swine