1.
BMC Bioinformatics ; 22(1): 459, 2021 Sep 25.
Article in English | MEDLINE | ID: mdl-34563119

ABSTRACT

BACKGROUND: We present ARCHes, a fast and accurate haplotype-based approach for inferring an individual's ancestry composition. Our approach works by modeling haplotype diversity from a large, admixed cohort of hundreds of thousands, then annotating those models with population information from reference panels of known ancestry. RESULTS: The running time of ARCHes does not depend on the size of a reference panel because training and testing are separate processes, and the inferred population-annotated haplotype models can be written to disk and reused to label large test sets in parallel (in our experiments, it averages less than one minute to assign ancestry from 32 populations using 10 CPUs). We test ARCHes on public data from the 1000 Genomes Project and the Human Genome Diversity Project (HGDP) as well as simulated examples of known admixture. CONCLUSIONS: Our results demonstrate that ARCHes outperforms RFMix at correctly assigning both global and local ancestry at finer population scales regardless of the amount of population admixture.


Subject(s)
Genetics, Population ; Genome, Human ; Haplotypes ; Humans ; Polymorphism, Single Nucleotide
2.
Nucleic Acids Res ; 42(Database issue): D677-84, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24285306

ABSTRACT

PortEco (http://porteco.org) aims to collect, curate and provide data and analysis tools to support basic biological research in Escherichia coli (and eventually other bacterial systems). PortEco is implemented as a 'virtual' model organism database that provides a single unified interface to the user, while integrating information from a variety of sources. The main focus of PortEco is to enable broad use of the growing number of high-throughput experiments available for E. coli, and to leverage community annotation through the EcoliWiki and GONUTS systems. Currently, PortEco includes curated data from hundreds of genome-wide RNA expression studies, from high-throughput phenotyping of single-gene knockouts under hundreds of annotated conditions, from chromatin immunoprecipitation experiments for tens of different DNA-binding factors and from ribosome profiling experiments that yield insights into protein expression. Conditions have been annotated with a consistent vocabulary, and data have been consistently normalized to enable users to find, compare and interpret relevant experiments. PortEco includes tools for data analysis, including clustering, enrichment analysis and exploration via genome browsers. PortEco search and data analysis tools are extensively linked to the curated gene, metabolic pathway and regulation content at its sister site, EcoCyc.


Subject(s)
Databases, Genetic ; Escherichia coli/genetics ; Alleles ; DNA-Binding Proteins/metabolism ; Escherichia coli/metabolism ; Escherichia coli Proteins/metabolism ; Genes, Bacterial ; Genome, Bacterial ; High-Throughput Nucleotide Sequencing ; Internet ; Phenotype ; RNA, Messenger/metabolism ; Ribosomes/metabolism ; Software
3.
Nat Genet ; 32 Suppl: 469-73, 2002 Dec.
Article in English | MEDLINE | ID: mdl-12454640

ABSTRACT

A single microarray can provide information on the expression of tens of thousands of genes. The amount of information generated by a microarray-based experiment is sufficiently large that no single study can be expected to mine each nugget of scientific information. As a consequence, the scale and complexity of microarray experiments require that computer software programs do much of the data processing, storage, visualization, analysis and transfer. The adoption of common standards and ontologies for the management and sharing of microarray data is essential and will provide immediate benefit to the research community.


Subject(s)
Database Management Systems ; Databases, Genetic/standards ; Gene Expression Profiling/standards ; Oligonucleotide Array Sequence Analysis/standards ; Electronic Data Processing ; Gene Expression Profiling/methods ; Humans ; Information Storage and Retrieval ; Internet ; Models, Biological ; Oligonucleotide Array Sequence Analysis/methods ; Programming Languages ; Quality Control ; Sequence Analysis, DNA ; Software
4.
BMJ Open ; 12(10): e049657, 2022 Oct 12.
Article in English | MEDLINE | ID: mdl-36223959

ABSTRACT

OBJECTIVES: The enormous toll of the COVID-19 pandemic has heightened the urgency of collecting and analysing population-scale datasets in real time to monitor and better understand the evolving pandemic. The objectives of this study were to examine the relationship of risk factors to COVID-19 susceptibility and severity and to develop risk models to accurately predict COVID-19 outcomes using rapidly obtained self-reported data. DESIGN: A cross-sectional study. SETTING: AncestryDNA customers in the USA who consented to research. PARTICIPANTS: The AncestryDNA COVID-19 Study collected self-reported survey data on symptoms, outcomes, risk factors and exposures for over 563 000 adult individuals in the USA in just under 4 months, including over 4700 COVID-19 cases as measured by a self-reported positive test. RESULTS: We replicated previously reported associations between several risk factors and COVID-19 susceptibility and severity outcomes, and additionally found that differences in known exposures accounted for many of the susceptibility associations. A notable exception was elevated susceptibility for men even after adjusting for known exposures and age (adjusted OR=1.36, 95% CI=1.19 to 1.55). We also demonstrated that self-reported data can be used to build accurate risk models to predict individualised COVID-19 susceptibility (area under the curve (AUC)=0.84) and severity outcomes including hospitalisation and critical illness (AUC=0.87 and 0.90, respectively). The risk models achieved robust discriminative performance across different age, sex and genetic ancestry groups within the study. CONCLUSIONS: The results highlight the value of self-reported epidemiological data to rapidly provide public health insights into the evolving COVID-19 pandemic.
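
The AUC values quoted above can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case. The following is a minimal Python sketch of that interpretation, not the study's code: the function name `auc` and the scores and labels are hypothetical and chosen only for illustration.

```python
def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # case ranked above non-case
            elif p == n:
                wins += 0.5   # tie: half credit
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks and outcomes (1 = case); not study data.
example_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
example_labels = [1, 1, 1, 0, 0, 0]
example_auc = auc(example_scores, example_labels)
```

In this toy example, 8 of the 9 case/non-case pairs are ranked correctly, so the value is 8/9. The pairwise loop is quadratic and is meant for clarity; a rank-based formula would be the usual choice for large datasets.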


Subject(s)
COVID-19 ; Adult ; COVID-19/epidemiology ; Cross-Sectional Studies ; Humans ; Male ; Pandemics ; Risk Factors ; SARS-CoV-2
5.
Nat Genet ; 54(4): 374-381, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35410379

ABSTRACT

Multiple COVID-19 genome-wide association studies (GWASs) have identified reproducible genetic associations indicating that there is a genetic component to susceptibility and severity risk. To complement these studies, we collected deep coronavirus disease 2019 (COVID-19) phenotype data from a survey of 736,723 AncestryDNA research participants. With these data, we defined eight phenotypes related to COVID-19 outcomes: four phenotypes that align with previously studied COVID-19 definitions and four 'expanded' phenotypes that focus on susceptibility given exposure, mild clinical manifestations and an aggregate score of symptom severity. We performed a replication analysis of 12 previously reported COVID-19 genetic associations with all eight phenotypes in a trans-ancestry meta-analysis of AncestryDNA research participants. In this analysis, we show distinct patterns of association at the 12 loci with the eight outcomes that we assessed. We also performed a genome-wide discovery analysis of all eight phenotypes, which did not yield new genome-wide significant loci but did suggest that three of the four 'expanded' COVID-19 phenotypes have enhanced power to capture protective genetic associations relative to the previously studied phenotypes. Thus, we conclude that continued large-scale ascertainment of deep COVID-19 phenotype data would likely represent a boon for COVID-19 therapeutic target identification.
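
The abstract does not state which meta-analysis model was used. As an illustration only, a common way to combine per-cohort effect estimates is an inverse-variance-weighted fixed-effect model, sketched below in Python. The function name `fixed_effect_meta` and the input numbers are hypothetical and are not taken from the study.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect estimate.

    Returns (combined beta, combined standard error, z-score).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and the same length")
    weights = [1.0 / (se * se) for se in ses]   # weight = 1 / variance
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    return beta, se, beta / se


# Hypothetical per-cohort log-odds ratios and standard errors.
meta_beta, meta_se, meta_z = fixed_effect_meta([0.10, 0.20], [0.05, 0.10])
```

The more precise cohort (smaller standard error) dominates the combined estimate, which is why the result here sits closer to 0.10 than to 0.20.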


Subject(s)
COVID-19 ; Genome-Wide Association Study ; COVID-19/genetics ; Genetic Predisposition to Disease ; Humans ; Phenotype ; Polymorphism, Single Nucleotide/genetics
6.
Bioinformatics ; 26(19): 2470-1, 2010 Oct 01.
Article in English | MEDLINE | ID: mdl-20733062

ABSTRACT

Computational methods in molecular biology will increasingly depend on standards-based annotations that describe biological experiments in an unambiguous manner. Annotare is a software tool that enables biologists to easily annotate their high-throughput experiments, biomaterials and data in a standards-compliant way that facilitates meaningful search and analysis. AVAILABILITY AND IMPLEMENTATION: Annotare is available from http://code.google.com/p/annotare/ under the terms of the open-source MIT License (http://www.opensource.org/licenses/mit-license.php). It has been tested on both Mac and Windows.


Subject(s)
Gene Expression Profiling ; Oligonucleotide Array Sequence Analysis ; Software ; Computational Biology/methods ; Databases, Factual ; Molecular Sequence Annotation ; User-Computer Interface
7.
Nucleic Acids Res ; 37(Database issue): D898-901, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18953035

ABSTRACT

Hundreds of researchers across the world use the Stanford Microarray Database (SMD; http://smd.stanford.edu/) to store, annotate, view, analyze and share microarray data. In addition to providing registered users at Stanford access to their own data, SMD also provides access to public data, and tools with which to analyze those data, to any public user anywhere in the world. Previously, the addition of new microarray data analysis tools to SMD has been limited by available engineering resources, and in addition, the existing suite of tools did not provide a simple way to design, execute and share analysis pipelines, or to document such pipelines for the purposes of publication. To address this, we have incorporated the GenePattern software package directly into SMD, providing access to many new analysis tools, as well as a plug-in architecture that allows users to directly integrate and share additional tools through SMD. In this article, we describe our implementation of the GenePattern microarray analysis software package into the SMD code base. This extension is available with the SMD source code that is fully and freely available to others under an Open Source license, enabling other groups to create a local installation of SMD with an enriched data analysis capability.


Subject(s)
Databases, Genetic ; Gene Expression Profiling ; Oligonucleotide Array Sequence Analysis ; Animals ; Humans ; Mice ; Software
8.
Nucleic Acids Res ; 37(Database issue): D499-508, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18835847

ABSTRACT

The effective control of tuberculosis (TB) has been thwarted by the need for prolonged, complex and potentially toxic drug regimens, by reliance on an inefficient vaccine and by the absence of biomarkers of clinical status. The promise of the genomics era for TB control is substantial, but has been hindered by the lack of a central repository that collects and integrates genomic and experimental data about this organism in a way that can be readily accessed and analyzed. The Tuberculosis Database (TBDB) is an integrated database providing access to TB genomic data and resources, relevant to the discovery and development of TB drugs, vaccines and biomarkers. The current release of TBDB houses genome sequence data and annotations for 28 different Mycobacterium tuberculosis strains and related bacteria. TBDB stores pre- and post-publication gene-expression data from M. tuberculosis and its close relatives. TBDB currently hosts data for nearly 1500 public tuberculosis microarrays and 260 arrays for Streptomyces. In addition, TBDB provides access to a suite of comparative genomics and microarray analysis software. By bringing together M. tuberculosis genome annotation and gene-expression data with a suite of analysis tools, TBDB (http://www.tbdb.org/) provides a unique discovery platform for TB research.


Subject(s)
Databases, Genetic ; Mycobacterium tuberculosis/genetics ; Tuberculosis/microbiology ; Biomedical Research ; Computer Graphics ; Gene Expression ; Genome, Bacterial ; Genomics ; Humans ; Mycobacterium tuberculosis/metabolism ; Systems Integration ; Tuberculosis/diagnosis ; Tuberculosis/drug therapy
9.
Nat Commun ; 12(1): 6442, 2021 Nov 08.
Article in English | MEDLINE | ID: mdl-34750360

ABSTRACT

The genetic architecture of atrial fibrillation (AF) encompasses low impact, common genetic variants and high impact, rare variants. Here, we characterize a high impact AF-susceptibility allele, KCNQ1 R231H, and describe its transcontinental geographic distribution and history. Induced pluripotent stem cell-derived cardiomyocytes procured from risk allele carriers exhibit abbreviated action potential duration, consistent with a gain-of-function effect. Using identity-by-descent (IBD) networks, we estimate the broad- and fine-scale population ancestry of risk allele carriers and their relatives. Analysis of ancestral migration routes reveals ancestors who inhabited Denmark in the 1700s, migrated to the Northeastern United States in the early 1800s, and traveled across the Midwest to arrive in Utah in the late 1800s. IBD/coalescent-based allele dating analysis reveals a relatively recent origin of the AF risk allele (~5000 years). Thus, our approach broadens the scope of study for disease susceptibility alleles to the context of human migration and ancestral origins.


Subject(s)
Atrial Fibrillation/genetics ; Genetic Predisposition to Disease/genetics ; KCNQ1 Potassium Channel/genetics ; Mutation, Missense ; Polymorphism, Single Nucleotide ; Action Potentials ; Alleles ; Denmark ; Emigrants and Immigrants ; Female ; Genotype ; Geography ; Humans ; Induced Pluripotent Stem Cells/cytology ; Induced Pluripotent Stem Cells/metabolism ; Male ; Middle Aged ; Myocytes, Cardiac/cytology ; Myocytes, Cardiac/metabolism ; Myocytes, Cardiac/physiology ; Pedigree ; Risk Factors ; Utah
10.
Nat Biotechnol ; 25(10): 1127-33, 2007 Oct.
Article in English | MEDLINE | ID: mdl-17921998

ABSTRACT

The Functional Genomics Experiment data model (FuGE) has been developed to facilitate convergence of data standards for high-throughput, comprehensive analyses in biology. FuGE models the components of an experimental activity that are common across different technologies, including protocols, samples and data. FuGE provides a foundation for describing entire laboratory workflows and for the development of new data formats. The Microarray Gene Expression Data society and the Proteomics Standards Initiative have committed to using FuGE as the basis for defining their respective standards, and other standards groups, including the Metabolomics Standards Initiative, are evaluating FuGE in their development efforts. Adoption of FuGE by multiple standards bodies will enable uniform reporting of common parts of functional genomics workflows, simplify data-integration efforts and ease the burden on researchers seeking to fulfill multiple minimum reporting requirements. Such advances are important for transparent data management and mining in functional genomics and systems biology.


Subject(s)
Computational Biology ; Computer Simulation/standards ; Genomics/standards ; Models, Biological ; Oligonucleotide Array Sequence Analysis/standards ; Proteomics/standards ; Databases, Factual
11.
Nucleic Acids Res ; 36(Database issue): D871-7, 2008 Jan.
Article in English | MEDLINE | ID: mdl-17989087

ABSTRACT

The Stanford Tissue Microarray Database (TMAD; http://tma.stanford.edu) is a public resource for disseminating annotated tissue images and associated expression data. Stanford University pathologists, researchers and their collaborators worldwide use TMAD for designing, viewing, scoring and analyzing their tissue microarrays. The use of tissue microarrays allows hundreds of human tissue cores to be simultaneously probed by antibodies to detect protein abundance (Immunohistochemistry; IHC), or by labeled nucleic acids (in situ hybridization; ISH) to detect transcript abundance. TMAD archives multi-wavelength fluorescence and bright-field images of tissue microarrays for scoring and analysis. As of July 2007, TMAD contained 205 161 images archiving 349 distinct probes on 1488 tissue microarray slides. Of these, 31 306 images for 68 probes on 125 slides have been released to the public. To date, 12 publications have been based on these raw public data. TMAD incorporates the NCI Thesaurus ontology for searching tissues in the cancer domain. Image processing researchers can extract images and scores for training and testing classification algorithms. The production server uses the Apache HTTP Server, Oracle Database and Perl application code. Source code is available to interested researchers under a no-cost license.


Subject(s)
Databases, Genetic ; Immunohistochemistry ; In Situ Hybridization ; Tissue Array Analysis ; Humans ; Internet ; Proteins/analysis ; RNA, Messenger/analysis ; Software ; User-Computer Interface
12.
Nat Biotechnol ; 24(11): 1374-6, 2006 Nov.
Article in English | MEDLINE | ID: mdl-17093487
13.
Nucleic Acids Res ; 35(Database issue): D766-70, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17182626

ABSTRACT

The Stanford Microarray Database (SMD; http://smd.stanford.edu/) is a research tool and archive that allows hundreds of researchers worldwide to store, annotate, analyze and share data generated by microarray technology. SMD supports most major microarray platforms, and is MIAME-supportive and can export or import MAGE-ML. The primary mission of SMD is to be a research tool that supports researchers from the point of data generation to data publication and dissemination, but it also provides unrestricted access to analysis tools and public data from 300 publications. In addition to supporting ongoing research, SMD makes its source code fully and freely available to others under an Open Source license, enabling other groups to create a local installation of SMD. In this article, we describe several data analysis tools implemented in SMD and we discuss features of our software release.


Subject(s)
Databases, Genetic ; Gene Expression Profiling ; Oligonucleotide Array Sequence Analysis ; Software ; Animals ; Humans ; Internet ; Mice ; User-Computer Interface
14.
G3 (Bethesda) ; 9(9): 2863-2878, 2019 Sep 04.
Article in English | MEDLINE | ID: mdl-31484785

ABSTRACT

We present a massive investigation into the genetic basis of human lifespan. Beginning with a genome-wide association (GWA) study using a de-identified snapshot of the unique AncestryDNA database - more than 300,000 genotyped individuals linked to pedigrees of over 400,000,000 people - we mapped six genome-wide significant loci associated with parental lifespan. We compared these results to a GWA analysis of the traditional lifespan proxy trait, age, and found only one locus, APOE, to be associated with both age and lifespan. By combining the AncestryDNA results with those of an independent UK Biobank dataset, we conducted a meta-analysis of more than 650,000 individuals and identified fifteen parental lifespan-associated loci. Beyond just those significant loci, our genome-wide set of polymorphisms accounts for up to 8% of the variance in human lifespan; this value represents a large fraction of the heritability estimated from phenotypic correlations between relatives.


Subject(s)
Genome-Wide Association Study/methods ; Longevity/genetics ; Aged ; Aged, 80 and over ; Apolipoproteins E/genetics ; Carrier Proteins/genetics ; Databases, Genetic ; Female ; Humans ; Male ; Nuclear Proteins/genetics ; Pedigree ; Polymorphism, Single Nucleotide ; Prospective Studies ; Proto-Oncogene Proteins/genetics
15.
BMC Bioinformatics ; 9: 28, 2008 Jan 18.
Article in English | MEDLINE | ID: mdl-18205924

ABSTRACT

BACKGROUND: MAGE-ML has been promoted as a standard format for describing microarray experiments and the data they produce. Two characteristics of the MAGE-ML format compromise its use as a universal standard: First, MAGE-ML files are exceptionally large - too large to be easily read by most people, and often too large to be read by most software programs. Second, the MAGE-ML standard permits many ways of representing the same information. As a result, different producers of MAGE-ML create different documents describing the same experiment and its data. Recognizing all the variants is an unwieldy software engineering task, resulting in software packages that can read and process MAGE-ML from some, but not all producers. This Tower of MAGE-ML Babel bars the unencumbered exchange of microarray experiment descriptions couched in MAGE-ML. RESULTS: We have developed XBabelPhish - an XQuery-based technology for translating one MAGE-ML variant into another. XBabelPhish's use is not restricted to translating MAGE-ML documents. It can transform XML files independent of their DTD, XML schema, or semantic content. Moreover, it is designed to work on very large (> 200 Mb.) files, which are common in the world of MAGE-ML. CONCLUSION: XBabelPhish provides a way to inter-translate MAGE-ML variants for improved interchange of microarray experiment information. More generally, it can be used to transform most XML files, including very large ones that exceed the capacity of most XML tools.
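
XBabelPhish itself is XQuery-based, and its rules are not described in the abstract. The Python sketch below only illustrates the general idea of translating one XML variant into another without loading the whole file into memory, using the standard library's event-driven SAX parser instead of XQuery. The class name `RenamingFilter`, the helper `translate`, and the element names `sample` and `biomaterial` are hypothetical and are not MAGE-ML vocabulary.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator


class RenamingFilter(xml.sax.ContentHandler):
    """Stream XML events to an output, renaming elements on the way through."""

    def __init__(self, out, renames):
        super().__init__()
        self._gen = XMLGenerator(out, encoding="utf-8")
        self._renames = renames  # hypothetical mapping: old tag -> new tag

    def startDocument(self):
        self._gen.startDocument()

    def endDocument(self):
        self._gen.endDocument()

    def startElement(self, name, attrs):
        self._gen.startElement(self._renames.get(name, name), attrs)

    def endElement(self, name):
        self._gen.endElement(self._renames.get(name, name))

    def characters(self, content):
        self._gen.characters(content)


def translate(source, out, renames):
    """Parse `source` (file name or binary stream) and write renamed XML to `out`."""
    xml.sax.parse(source, RenamingFilter(out, renames))


# Tiny illustrative document; real inputs would be files on disk.
src = io.BytesIO(b'<experiment><sample name="s1">liver</sample></experiment>')
dst = io.StringIO()
translate(src, dst, {"sample": "biomaterial"})
translated = dst.getvalue()
```

Because each event is written out as soon as it is read, memory use stays roughly constant regardless of file size, which is the property the abstract emphasizes for very large files.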


Subject(s)
Database Management Systems ; Hypermedia ; User-Computer Interface ; Animals ; Gene Expression Profiling/methods ; Humans ; Information Storage and Retrieval/methods ; Oligonucleotide Array Sequence Analysis/methods ; Systems Integration ; Work Simplification
16.
BMC Bioinformatics ; 8: 338, 2007 Sep 13.
Article in English | MEDLINE | ID: mdl-17854506

ABSTRACT

BACKGROUND: Biomedical ontologies are being widely used to annotate biological data in a computer-accessible, consistent and well-defined manner. However, due to their size and complexity, annotating data with appropriate terms from an ontology is often challenging for experts and non-experts alike, because there exist few tools that allow one to quickly find relevant ontology terms to easily populate a web form. RESULTS: We have produced a tool, OntologyWidget, which allows users to rapidly search for and browse ontology terms. OntologyWidget can easily be embedded in other web-based applications. OntologyWidget is written using AJAX (Asynchronous JavaScript and XML) and has two related elements. The first is a dynamic auto-complete ontology search feature. As a user enters characters into the search box, the appropriate ontology is queried remotely for terms that match the typed-in text, and the query results populate a drop-down list with all potential matches. Upon selection of a term from the list, the user can locate this term within a generic and dynamic ontology browser, which comprises the second element of the tool. The ontology browser shows the paths from a selected term to the root as well as parent/child tree hierarchies. We have implemented web services at the Stanford Microarray Database (SMD), which provide the OntologyWidget with access to over 40 ontologies from the Open Biological Ontology (OBO) website 1. Each ontology is updated weekly. Adopters of the OntologyWidget can either use SMD's web services, or elect to rely on their own. Deploying the OntologyWidget can be accomplished in three simple steps: (1) install Apache Tomcat 2 on one's web server, (2) download and install the OntologyWidget servlet stub that provides access to the SMD ontology web services, and (3) create an html (HyperText Markup Language) file that refers to the OntologyWidget using a simple, well-defined format. 
CONCLUSION: We have developed OntologyWidget, an easy-to-use ontology search and display tool that can be used on any web page by creating a simple html description. OntologyWidget provides a rapid auto-complete search function paired with an interactive tree display. We have developed a web service layer that communicates between the web page interface and a database of ontology terms. We currently store 40 of the ontologies from the OBO website 1, as well as several others. These ontologies are automatically updated on a weekly basis. OntologyWidget can be used in any web-based application to take advantage of the ontologies we provide via web services or any other ontology that is provided elsewhere in the correct format. The full source code for the JavaScript and description of the OntologyWidget is available from http://smd.stanford.edu/ontologyWidget/.
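
The two elements described above (auto-complete search, then locating a term in a tree) can be sketched server-side in a few lines. This Python sketch is not the OntologyWidget code, which is JavaScript backed by web services. The tiny ontology, the single-parent simplification (real ontologies may give a term several parents), and the function names are all assumptions made for illustration.

```python
# Hypothetical mini-ontology: term -> parent (None marks the root).
PARENT = {
    "cell": None,
    "blood cell": "cell",
    "leukocyte": "blood cell",
    "lymphocyte": "leukocyte",
    "erythrocyte": "blood cell",
}


def autocomplete(typed, limit=10):
    """Terms starting with the typed text, case-insensitive, alphabetical."""
    typed = typed.lower()
    return sorted(t for t in PARENT if t.lower().startswith(typed))[:limit]


def path_to_root(term):
    """Walk parent links from `term` up to the root."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return path


def children(term):
    """Direct children of `term`, for rendering one level of the tree."""
    return sorted(t for t, p in PARENT.items() if p == term)


matches = autocomplete("l")
lineage = path_to_root("lymphocyte")
kids = children("blood cell")
```

A drop-down list would be filled from `autocomplete`, and selecting an entry would trigger `path_to_root` and `children` to position the term in the browser view.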


Subject(s)
Computational Biology/methods ; Software ; Terminology as Topic ; Programming Languages ; User-Computer Interface ; Vocabulary, Controlled
17.
Nucleic Acids Res ; 33(Database issue): D580-2, 2005 Jan 01.
Article in English | MEDLINE | ID: mdl-15608265

ABSTRACT

The Stanford Microarray Database (SMD) (http://smd.stanford.edu) is a research tool for hundreds of Stanford researchers and their collaborators. In addition, SMD functions as a resource for the entire biological research community by providing unrestricted access to microarray data published by SMD users and by disseminating its source code. In addition to storing GenePix (Axon Instruments) and ScanAlyze output from spotted microarrays, SMD has recently added the ability to store, retrieve, display and analyze the complete raw data produced by several additional microarray platforms and image analysis software packages, so that we can also now accept data from Affymetrix GeneChips (MAS5/GCOS or dChip), Agilent Catalog or Custom arrays (using Agilent's Feature Extraction software) or data created by SpotReader (Niles Scientific). We have implemented software that allows us to accept MAGE-ML documents from array manufacturers and to submit MIAME-compliant data in MAGE-ML format directly to ArrayExpress and GEO, greatly increasing the ease with which data from SMD can be published adhering to accepted standards and also increasing the accessibility of published microarray data to the general public. We have introduced a new tool to facilitate data sharing among our users, so that datasets can be shared during, before or after the completion of data analysis. The latest version of the source code for the complete database package was released in November 2004 (http://smd.stanford.edu/download/), allowing researchers around the world to deploy their own installations of SMD.


Subject(s)
Databases, Genetic ; Gene Expression Profiling ; Oligonucleotide Array Sequence Analysis ; California ; Database Management Systems ; Systems Integration
18.
Mol Biol Cell ; 13(6): 1977-2000, 2002 Jun.
Article in English | MEDLINE | ID: mdl-12058064

ABSTRACT

The genome-wide program of gene expression during the cell division cycle in a human cancer cell line (HeLa) was characterized using cDNA microarrays. Transcripts of >850 genes showed periodic variation during the cell cycle. Hierarchical clustering of the expression patterns revealed coexpressed groups of previously well-characterized genes involved in essential cell cycle processes such as DNA replication, chromosome segregation, and cell adhesion along with genes of uncharacterized function. Most of the genes whose expression had previously been reported to correlate with the proliferative state of tumors were found herein also to be periodically expressed during the HeLa cell cycle. However, some of the genes periodically expressed in the HeLa cell cycle do not have a consistent correlation with tumor proliferation. Cell cycle-regulated transcripts of genes involved in fundamental processes such as DNA replication and chromosome segregation seem to be more highly expressed in proliferative tumors simply because they contain more cycling cells. The data in this report provide a comprehensive catalog of cell cycle regulated genes that can serve as a starting point for functional discovery. The full dataset is available at http://genome-www.stanford.edu/Human-CellCycle/HeLa/.
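
The abstract does not specify the distance measure or linkage used for the hierarchical clustering. The Python sketch below assumes one plausible combination, a correlation-based distance (1 - Pearson r) with average linkage, and applies it to hypothetical expression profiles. The gene names `g1` to `g4` and their values are invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters until `n_clusters` remain."""
    clusters = [[name] for name in profiles]

    def dist(c1, c2):
        # Average of (1 - r) over all cross-cluster gene pairs.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(1 - pearson(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical time-course profiles: g1/g2 peak together, g3/g4 peak together.
profiles = {
    "g1": [0.0, 1.0, 0.0, -1.0, 0.0, 1.0],
    "g2": [0.1, 0.9, 0.1, -1.1, 0.0, 1.2],
    "g3": [1.0, 0.0, -1.0, 0.0, 1.0, 0.0],
    "g4": [0.9, 0.1, -1.2, 0.0, 1.1, -0.1],
}
groups = average_linkage(profiles, 2)
```

Genes whose profiles rise and fall together end up in the same group, which is the sense in which clustering reveals coexpressed genes.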


Subject(s)
Cell Cycle/genetics ; Gene Expression Regulation, Neoplastic ; Gene Expression Regulation ; Neoplasms/genetics ; Cell Division/genetics ; DNA Replication/genetics ; Enzymes/genetics ; Genetic Variation ; Genome, Human ; HeLa Cells ; Humans ; Mitosis ; Multigene Family ; Neoplasms/pathology ; Oligonucleotide Array Sequence Analysis ; Proteins/genetics ; Transcription, Genetic ; Transfection
19.
Nat Commun ; 8: 14238, 2017 Feb 07.
Article in English | MEDLINE | ID: mdl-28169989

ABSTRACT

Despite strides in characterizing human history from genetic polymorphism data, progress in identifying genetic signatures of recent demography has been limited. Here we identify very recent fine-scale population structure in North America from a network of over 500 million genetic (identity-by-descent, IBD) connections among 770,000 genotyped individuals of US origin. We detect densely connected clusters within the network and annotate these clusters using a database of over 20 million genealogical records. Recent population patterns captured by IBD clustering include immigrants such as Scandinavians and French Canadians; groups with continental admixture such as Puerto Ricans; settlers such as the Amish and Appalachians who experienced geographic or cultural isolation; and broad historical trends, including reduced north-south gene flow. Our results yield a detailed historical portrait of North America after European settlement and support substantial genetic heterogeneity in the United States beyond that uncovered by previous studies.
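
The clustering algorithm used on the IBD network is not named in the abstract. As a deliberately simplified stand-in, the Python sketch below keeps only edges above a shared-IBD threshold and returns connected components. This is cruder than detecting densely connected clusters, and it is offered only to make the network representation concrete. The person identifiers, centimorgan values, and the 12 cM threshold are hypothetical.

```python
from collections import defaultdict


def components(edges, min_cm):
    """Connected components after dropping edges with shared IBD below `min_cm`."""
    adj = defaultdict(set)
    nodes = set()
    for a, b, cm in edges:
        nodes.update((a, b))
        if cm >= min_cm:
            adj[a].add(b)
            adj[b].add(a)
    seen, out = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            comp.append(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        out.append(sorted(comp))
    return out


# Hypothetical pairwise IBD sharing in centimorgans.
ibd_edges = [
    ("p1", "p2", 40.0),
    ("p2", "p3", 25.0),
    ("p3", "p4", 6.0),   # weak link, dropped at the threshold below
    ("p4", "p5", 30.0),
]
ibd_groups = components(ibd_edges, min_cm=12.0)
```

A real analysis at this scale would need a community-detection method that tolerates sparse links between groups; connected components merge any two groups joined by a single strong edge.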


Subject(s)
Demography/statistics & numerical data ; Genetics, Population/methods ; Population Dynamics/trends ; Population/genetics ; Cluster Analysis ; Demography/methods ; Emigrants and Immigrants ; Gene Flow/genetics ; Genotyping Techniques ; Haplotypes/genetics ; Humans ; Polymorphism, Single Nucleotide ; Population Dynamics/statistics & numerical data ; Sequence Analysis, DNA ; United States/ethnology
20.
BMC Bioinformatics ; 7: 489, 2006 Nov 06.
Article in English | MEDLINE | ID: mdl-17087822

ABSTRACT

BACKGROUND: Sharing of microarray data within the research community has been greatly facilitated by the development of the disclosure and communication standards MIAME and MAGE-ML by the MGED Society. However, the complexity of the MAGE-ML format has made its use impractical for laboratories lacking dedicated bioinformatics support. RESULTS: We propose a simple tab-delimited, spreadsheet-based format, MAGE-TAB, which will become a part of the MAGE microarray data standard and can be used for annotating and communicating microarray data in a MIAME compliant fashion. CONCLUSION: MAGE-TAB will enable laboratories without bioinformatics experience or support to manage, exchange and submit well-annotated microarray data in a standard format using a spreadsheet. The MAGE-TAB format is self-contained, and does not require an understanding of MAGE-ML or XML.
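
Because the format is tab-delimited, a spreadsheet export can be read with standard tools and no XML processing. The Python sketch below shows this with the standard `csv` module. The column headers and sample rows are illustrative assumptions and may not match the MAGE-TAB specification exactly; consult the specification for the real column names.

```python
import csv
import io

# Illustrative tab-delimited sheet; headers are assumptions, not the spec.
TABLE_TEXT = (
    "Source Name\tCharacteristics[organism]\tArray Data File\n"
    "sample_1\tHomo sapiens\tsample_1.txt\n"
    "sample_2\tHomo sapiens\tsample_2.txt\n"
)


def read_tab_delimited(handle):
    """Return one dict per row, keyed by the header line."""
    return list(csv.DictReader(handle, delimiter="\t"))


rows = read_tab_delimited(io.StringIO(TABLE_TEXT))
# Map each sample to its data file (both column names are hypothetical).
files_by_source = {r["Source Name"]: r["Array Data File"] for r in rows}
```

For a file on disk, the same function can be called with `open(path, newline="", encoding="utf-8")` in place of the `io.StringIO` object.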


Subject(s)
Computational Biology/methods ; Oligonucleotide Array Sequence Analysis/methods ; Software ; Databases, Genetic ; Humans