1.
Cell ; 174(4): 1015-1030.e16, 2018 08 09.
Article in English | MEDLINE | ID: mdl-30096299

ABSTRACT

The mammalian brain is composed of diverse, specialized cell populations. To systematically ascertain and learn from these cellular specializations, we used Drop-seq to profile RNA expression in 690,000 individual cells sampled from 9 regions of the adult mouse brain. We identified 565 transcriptionally distinct groups of cells using computational approaches developed to distinguish biological from technical signals. Cross-region analysis of these 565 cell populations revealed features of brain organization, including a gene-expression module for synthesizing axonal and presynaptic components, patterns in the co-deployment of voltage-gated ion channels, functional distinctions among the cells of the vasculature and specialization of glutamatergic neurons across cortical regions. Systematic neuronal classifications for two complex basal ganglia nuclei and the striatum revealed a rare population of spiny projection neurons. This adult mouse brain cell atlas, accessible through interactive online software (DropViz), serves as a reference for development, disease, and evolution.


Subject(s)
Brain/metabolism; Cell Lineage; Gene Expression Regulation, Developmental; Gene Regulatory Networks; Single-Cell Analysis/methods; Transcriptome; Animals; Brain/growth & development; Gene Expression Profiling; High-Throughput Nucleotide Sequencing; Male; Mice; Mice, Inbred C57BL
2.
Nature ; 586(7828): 262-269, 2020 10.
Article in English | MEDLINE | ID: mdl-32999462

ABSTRACT

Primates and rodents, which descended from a common ancestor around 90 million years ago [1], exhibit profound differences in behaviour and cognitive capacity; the cellular basis for these differences is unknown. Here we use single-nucleus RNA sequencing to profile RNA expression in 188,776 individual interneurons across homologous brain regions from three primates (human, macaque and marmoset), a rodent (mouse) and a weasel (ferret). Homologous interneuron types, which were readily identified by their RNA-expression patterns, varied in abundance and RNA expression among ferrets, mice and primates, but varied less among primates. Only a modest fraction of the genes identified as 'markers' of specific interneuron subtypes in any one species had this property in another species. In the primate neocortex, dozens of genes showed spatial expression gradients among interneurons of the same type, which suggests that regional variation in cortical contexts shapes the RNA expression patterns of adult neocortical interneurons. We found that an interneuron type that was previously associated with the mouse hippocampus (the 'ivy cell', which has neurogliaform characteristics) has become abundant across the neocortex of humans, macaques and marmosets but not mice or ferrets. We also found a notable subcortical innovation: an abundant striatal interneuron type in primates that had no molecularly homologous counterpart in mice or ferrets. These interneurons expressed a unique combination of genes that encode transcription factors, receptors and neuropeptides and constituted around 30% of striatal interneurons in marmosets and humans.


Subject(s)
Interneurons/cytology; Primates; Animals; Callithrix; Cerebral Cortex/cytology; Female; Ferrets; Hippocampus/cytology; Humans; Interneurons/metabolism; LIM-Homeodomain Proteins/metabolism; Lysosomal Membrane Proteins/metabolism; Macaca; Male; Mice; Neostriatum/cytology; Nerve Tissue Proteins/metabolism; RNA/genetics; Species Specificity; Transcription Factors/metabolism
4.
Stat Med ; 34(28): 3769-92, 2015 Dec 10.
Article in English | MEDLINE | ID: mdl-26343929

ABSTRACT

This tutorial is a learning resource that outlines the basic process and provides specific software tools for implementing a complete genome-wide association analysis. Approaches to post-analytic visualization and interrogation of potentially novel findings are also presented. Applications are illustrated using the free and open-source R statistical computing and graphics software environment, Bioconductor software for bioinformatics and the UCSC Genome Browser. Complete genome-wide association data on 1401 individuals across 861,473 typed single nucleotide polymorphisms from the PennCATH study of coronary artery disease are used for illustration. All data and code, as well as additional instructional resources, are publicly available through the Open Resources in Statistical Genomics project: http://www.stat-gen.org.


Subject(s)
Computational Biology; Genome-Wide Association Study; Databases, Genetic; Genome-Wide Association Study/statistics & numerical data; Humans; Software
5.
Mol Cancer Ther ; 23(3): 285-300, 2024 Mar 04.
Article in English | MEDLINE | ID: mdl-38102750

ABSTRACT

The estrogen receptor (ER) is a well-established target for the treatment of breast cancer, with the majority of patients presenting as ER-positive (ER+). Endocrine therapy is a mainstay of breast cancer treatment but the development of resistance mutations in response to aromatase inhibitors, poor pharmacokinetic properties of fulvestrant, agonist activity of tamoxifen, and limited benefit for elacestrant leave unmet needs for patients with or without resistance mutations in ESR1, the gene that encodes the ER protein. Here we describe palazestrant (OP-1250), a novel, orally bioavailable complete ER antagonist and selective ER degrader. OP-1250, like fulvestrant, has no agonist activity on the ER and completely blocks estrogen-induced transcriptional activity. In addition, OP-1250 demonstrates favorable biochemical binding affinity, ER degradation, and antiproliferative activity in ER+ breast cancer models that is comparable or superior to other agents of interest. OP-1250 has superior pharmacokinetic properties relative to fulvestrant, including oral bioavailability and brain penetrance, as well as superior performance in wild-type and ESR1-mutant breast cancer xenograft studies. OP-1250 combines well with cyclin-dependent kinase 4 and 6 inhibitors in xenograft studies of ER+ breast cancer models and effectively shrinks intracranially implanted tumors, resulting in prolonged animal survival. With demonstrated preclinical efficacy exceeding fulvestrant in wild-type models, elacestrant in ESR1-mutant models, and tamoxifen in intracranial xenografts, OP-1250 has the potential to benefit patients with ER+ breast cancer.


Subject(s)
Breast Neoplasms; Tetrahydronaphthalenes; Animals; Humans; Female; Breast Neoplasms/drug therapy; Breast Neoplasms/genetics; Breast Neoplasms/pathology; Fulvestrant/pharmacology; Fulvestrant/therapeutic use; Estrogen Receptor Antagonists/therapeutic use; Xenograft Model Antitumor Assays; Tamoxifen; Estrogens; Estrogen Receptor alpha/genetics; Estrogen Receptor alpha/metabolism
6.
BMC Genomics ; 11: 603, 2010 Oct 25.
Article in English | MEDLINE | ID: mdl-20974003

ABSTRACT

BACKGROUND: Microarrays are invaluable tools for genome interrogation, SNP detection, and expression analysis, among other applications. Such broad capabilities would be of value to many pathogen research communities, although the development and use of genome-scale microarrays is often a costly undertaking. Therefore, effective methods for reducing unnecessary probes while maintaining or expanding functionality would be relevant to many investigators. RESULTS: Taking advantage of available genome sequences and annotation for Toxoplasma gondii (a pathogenic parasite responsible for illness in immunocompromised individuals) and Plasmodium falciparum (a related parasite responsible for severe human malaria), we designed a single oligonucleotide microarray capable of supporting a wide range of applications at relatively low cost, including genome-wide expression profiling for Toxoplasma, and single-nucleotide polymorphism (SNP)-based genotyping of both T. gondii and P. falciparum. Expression profiling of the three clonotypic lineages dominating T. gondii populations in North America and Europe provides a first comprehensive view of the parasite transcriptome, revealing that ~49% of all annotated genes are expressed in parasite tachyzoites (the acutely lytic stage responsible for pathogenesis) and 26% of genes are differentially expressed among strains. A novel design utilizing few probes provided high confidence genotyping, used here to resolve recombination points in the clonal progeny of sexual crosses. Recent sequencing of additional T. gondii isolates identifies >620 K new SNPs, including ~11 K that intersect with expression profiling probes, yielding additional markers for genotyping studies, and further validating the utility of a combined expression profiling/genotyping array design. Additional applications facilitating SNP and transcript discovery, alternative statistical methods for quantifying gene expression, etc. are also pursued at pilot scale to inform future array designs. CONCLUSIONS: In addition to providing an initial global view of the T. gondii transcriptome across major lineages and permitting detailed resolution of recombination points in a historical sexual cross, the multifunctional nature of this array also allowed opportunities to exploit probes for purposes beyond their intended use, enhancing analyses. This array is in widespread use by the T. gondii research community, and several aspects of the design strategy are likely to be useful for other pathogens.


Subject(s)
Oligonucleotide Array Sequence Analysis/methods; Toxoplasma/genetics; Animals; Exons/genetics; Gene Expression Profiling; Gene Expression Regulation; Genotype; Host-Parasite Interactions/genetics; Humans; Mice; Models, Genetic; Parasites/genetics; Phylogeny; Polymorphism, Single Nucleotide/genetics; RNA, Messenger/genetics; RNA, Messenger/metabolism; Reproducibility of Results; Species Specificity
7.
BMC Genomics ; 10: 210, 2009 May 07.
Article in English | MEDLINE | ID: mdl-19422688

ABSTRACT

BACKGROUND: The availability of whole-genome sequences allows for the identification of the entire set of protein coding genes as well as their regulatory regions. This can be accomplished using multiple complementary methods that include ESTs, homology searches and ab initio gene predictions. Previously, the Genie gene-finding algorithm was trained on a small set of Chlamydomonas genes and shown to improve the accuracy of gene prediction in this species compared to other available programs. To improve ab initio gene finding in Chlamydomonas, we assemble a new training set consisting of over 2,300 cDNAs by assembling over 167,000 Chlamydomonas EST entries in GenBank using the EST assembly tool PASA. RESULTS: The prediction accuracy of our cDNA-trained gene-finder, GreenGenie2, attains 83% sensitivity and 83% specificity for exons on short-sequence predictions. We predict about 12,000 genes in the version v3 Chlamydomonas genome assembly, most of which (78%) are either identical to or significantly overlap the published catalog of Chlamydomonas genes [1]. 22% of the published catalog is absent from the GreenGenie2 predictions; there is also a fraction (23%) of GreenGenie2 predictions that are absent from the published gene catalog. Randomly chosen gene models were tested by RT-PCR and most support the GreenGenie2 predictions. CONCLUSION: These data suggest that training with EST assemblies is highly effective and that GreenGenie2 is a valuable, complementary tool for predicting genes in Chlamydomonas reinhardtii.
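The exon-level accuracy figures quoted above can be read with a small worked example. The sketch below assumes the convention common in gene-finding evaluations, where sensitivity is the fraction of annotated exons that are predicted exactly and specificity is the fraction of predicted exons that match an annotated exon. The function name and the exon coordinates are made up for illustration; this is not the GreenGenie2 evaluation code.

```python
def exon_accuracy(annotated, predicted):
    """Return (sensitivity, specificity) using exact (start, end) matches.

    Assumed convention: sensitivity = TP / annotated exons,
    specificity = TP / predicted exons.
    """
    annotated, predicted = set(annotated), set(predicted)
    true_pos = len(annotated & predicted)
    sensitivity = true_pos / len(annotated) if annotated else 0.0
    specificity = true_pos / len(predicted) if predicted else 0.0
    return sensitivity, specificity


# Hypothetical exon coordinates, for illustration only.
annotated_exons = [(100, 200), (300, 450), (600, 700), (900, 1000)]
predicted_exons = [(100, 200), (300, 450), (610, 700), (900, 1000), (1200, 1300)]

sn, sp = exon_accuracy(annotated_exons, predicted_exons)
print(f"sensitivity={sn:.2f} specificity={sp:.2f}")
```

Here three of four annotated exons are recovered and three of five predictions are correct, so a predictor can trade one measure against the other by predicting more or fewer exons.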


Subject(s)
Chlamydomonas reinhardtii/genetics; Computational Biology/methods; Genes, Protozoan; Software; Algorithms; Animals; Expressed Sequence Tags; Models, Genetic; Sensitivity and Specificity; Sequence Alignment; Sequence Analysis, DNA
8.
BMC Bioinformatics ; 8 Suppl 10: S5, 2007.
Article in English | MEDLINE | ID: mdl-18269699

ABSTRACT

BACKGROUND: Many important high-throughput projects use in situ hybridization and may require the analysis of images of spatial cross sections of organisms taken with cellular-level resolution. Projects creating gene expression atlases at unprecedented scales for the embryonic fruit fly as well as the embryonic and adult mouse already involve the analysis of hundreds of thousands of high-resolution experimental images mapping mRNA expression patterns. Challenges include accurate registration of highly deformed tissues, associating cells with known anatomical regions, and identifying groups of genes whose expression is coordinately regulated with respect to both concentration and spatial location. Solutions to these and other challenges will lead to a richer understanding of the complex system aspects of gene regulation in heterogeneous tissue. RESULTS: We present an end-to-end approach for processing raw in situ expression imagery and performing subsequent analysis. We use a non-linear, information-theoretic image registration technique specifically adapted for mapping expression images to anatomical annotations and a method for extracting expression information within an anatomical region. Our method consists of coarse registration, fine registration, and expression feature extraction steps. From this we obtain a matrix for expression characteristics with rows corresponding to genes and columns corresponding to anatomical sub-structures. We perform matrix block cluster analysis using a novel row-column mixture model and we relate clustered patterns to Gene Ontology (GO) annotations. CONCLUSION: Resulting registrations suggest that our method is robust over intensity levels and shape variations in ISH imagery. Functional enrichment studies from both simple analysis and block clustering indicate that gene relationships consistent with biological knowledge of neuronal gene functions can be extracted from large ISH image databases such as the Allen Brain Atlas [1] and the Max-Planck Institute [2] using our method. While we focus here on imagery and experiments of the mouse brain, our approach should be applicable to a variety of in situ experiments.


Subject(s)
Brain Chemistry/genetics; Brain Mapping/methods; Cluster Analysis; Gene Expression Regulation/physiology; In Situ Hybridization/methods; Animals; Drosophila melanogaster/embryology; Drosophila melanogaster/genetics; Gene Expression Regulation/genetics; Mice
9.
PLoS Comput Biol ; 2(1): e4, 2006 Jan.
Article in English | MEDLINE | ID: mdl-16424921

ABSTRACT

Alternative splicing contributes to both gene regulation and protein diversity. To discover broad relationships between regulation of alternative splicing and sequence conservation, we applied a systems approach, using oligonucleotide microarrays designed to capture splicing information across the mouse genome. In a set of 22 adult tissues, we observe differential expression of RNA containing at least two alternative splice junctions for about 40% of the 6,216 alternative events we could detect. Statistical comparisons identify 171 cassette exons whose inclusion or skipping is different in brain relative to other tissues and another 28 exons whose splicing is different in muscle. A subset of these exons is associated with unusual blocks of intron sequence whose conservation in vertebrates rivals that of protein-coding exons. By focusing on sets of exons with similar regulatory patterns, we have identified new sequence motifs implicated in brain and muscle splicing regulation. Of note is a motif that is strikingly similar to the branchpoint consensus but is located downstream of the 5' splice site of exons included in muscle. Analysis of three paralogous membrane-associated guanylate kinase genes reveals that each contains a paralogous tissue-regulated exon with a similar tissue inclusion pattern. While the intron sequences flanking these exons remain highly conserved among mammalian orthologs, the paralogous flanking intron sequences have diverged considerably, suggesting unusually complex evolution of the regulation of alternative splicing in multigene families.


Subject(s)
Alternative Splicing/genetics; Exons/genetics; Introns/genetics; Adaptor Proteins, Signal Transducing/classification; Adaptor Proteins, Signal Transducing/genetics; Animals; Base Sequence; Brain/metabolism; Conserved Sequence; Evolution, Molecular; Humans; Membrane Proteins/classification; Membrane Proteins/genetics; Mice; Molecular Sequence Data; Muscles/metabolism; Neuropeptides/classification; Neuropeptides/genetics; Oligonucleotide Array Sequence Analysis; Organ Specificity; Protein Isoforms/genetics; Sequence Alignment
10.
BMC Genomics ; 7: 125, 2006 May 24.
Article in English | MEDLINE | ID: mdl-16719927

ABSTRACT

BACKGROUND: Correlations between polymorphic markers and observed phenotypes provide the basis for mapping traits in quantitative genetics. When the phenotype is gene expression, then loci involved in regulatory control can theoretically be implicated. Recent efforts to construct gene regulatory networks from genotype and gene expression data have shown that biologically relevant networks can be achieved from an integrative approach. In this paper, we consider the problem of identifying individual pairs of genes in a direct or indirect, causal, trans-acting relationship. RESULTS: Inspired by epistatic models of multi-locus quantitative trait locus (QTL) mapping, we propose a unified model of expression and genotype to identify quantitative trait genes (QTG) by extending the conventional linear model to include both genotype and expression of regulator genes and their interactions. The model provides mapping of specific genes in contrast to standard linkage approaches that implicate large QTL intervals typically containing tens of genes. In simulations, we found that the method can often detect weak trans-acting regulators amid the background noise of thousands of traits and is robust to transcription models containing multiple regulator genes. We reanalyze several pleiotropic loci derived from a large set of yeast matings and identify a likely alternative regulator not previously published. However, we also found that many regulators cannot be so easily mapped due to the presence of cis-acting QTLs on the regulators, which induce close linkage among small neighborhoods of genes. QTG-mapped regulator-target pairs linked to ARN1 were combined to form a regulatory module, which we observed to be highly enriched in iron-homeostasis-related genes and contained several causally directed links that had not been identified in other automatic reconstructions of that regulatory module. Finally, we also confirm the surprising, previously published results that regulators controlling gene expression are not enriched for transcription factors, but we do show that our more precise mapping model reveals functional enrichment for several other biological processes related to the regulation of the cell. CONCLUSION: By incorporating interacting expression and genotype, our QTG mapping method can identify specific regulator genes in contrast to standard QTL interval mapping. We have shown that the method can recover biologically significant regulator-target pairs and the approach leads to a general framework for inducing a regulatory module network topology of directed and undirected edges that can be used to identify leads in pathway analysis.
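The extended linear model described in RESULTS can be written out as a sketch. The abstract does not give the equation, so the form and every symbol below are assumptions chosen for illustration: y_t is the expression of a target gene, g the genotype at a marker, and x_r the expression of a candidate regulator gene.

```latex
y_t = \beta_0 + \beta_1\, g + \beta_2\, x_r + \beta_3\, (g \cdot x_r) + \varepsilon
```

Under this reading, a conventional QTL scan corresponds to the terms \beta_0 + \beta_1 g alone, and a candidate regulator is implicated when adding the x_r and g \cdot x_r terms improves the fit significantly.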


Subject(s)
Genes, Regulator; Models, Genetic; Phenotype; Quantitative Trait Loci; Algorithms; Animals; Crosses, Genetic; Gene Expression Regulation/genetics; Genes, Fungal; Genotype; Likelihood Functions; Mice; Mice, Inbred Strains; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae Proteins/genetics; Transcription Factors/genetics
11.
Bioinformatics ; 21 Suppl 2: ii182-9, 2005 Sep 01.
Article in English | MEDLINE | ID: mdl-16204100

ABSTRACT

MOTIVATION: Basecalling is a critical step of the analysis of DNA resequencing microarray data for single nucleotide polymorphism discovery and genotyping. For microarrays hybridized with DNA derived from diploid organisms, basecalling with high accuracy at high call rates is a challenging task. Current methods sometimes do not produce satisfactory results. RESULTS: We explored using physical models based on the sequences of the probe and the target to predict feature intensities in resequencing microarrays. Based on these intensity-predicting models, a new basecalling method (Model-P), which takes into consideration the expected feature intensities for different potential genotypes, was developed. Model-P is shown to have better performance at high call rates compared with ABACUS, the current state-of-the-art method, on a test dataset and on relatively AT-rich regions. AVAILABILITY: Model-P is available upon request.


Subject(s)
Algorithms; Base Pair Mismatch/genetics; DNA Probes/genetics; Diploidy; Gene Targeting/methods; Oligonucleotide Array Sequence Analysis/methods; Sequence Analysis, DNA/methods; AT Rich Sequence/genetics; Computer Simulation; Models, Genetic; Sample Size; Software
12.
Nucleic Acids Res ; 31(1): 82-6, 2003 Jan 01.
Article in English | MEDLINE | ID: mdl-12519953

ABSTRACT

NetAffx (http://www.affymetrix.com) details and annotates probesets on Affymetrix GeneChip microarrays. These annotations include (i) static information specific to the probeset composition; (ii) sequence annotations extracted from public databases; and (iii) protein sequence-level annotations derived from public domain programs, as well as libraries of hidden Markov models (HMMs) developed at Affymetrix. For each probeset, NetAffx lists the probe sequences, and the consensus sequence interrogated by the probes; for the larger chip sets, interactive maps display this sequence data in genomic context. Sequence annotations include Gene Ontology (GO) terms and depiction of GO graph relationships; predicted protein domains and motifs; orthologous sequences; links to relevant pathways; and links to public databases including UniGene, LocusLink, SWISS-PROT and OMIM.


Subject(s)
Databases, Genetic; Gene Expression Profiling; Oligonucleotide Array Sequence Analysis; Animals; Consensus Sequence; Information Storage and Retrieval; Markov Chains; Proteins/chemistry; Sequence Analysis, Protein; Software
13.
Article in English | MEDLINE | ID: mdl-17951819

ABSTRACT

Statistical relations between genome-wide mRNA transcript levels have been successfully used to infer regulatory relations among the genes; however, the most successful methods have relied on additional data and focused on small sub-networks of genes. Along these lines, we recently demonstrated a model for simultaneously incorporating micro-array expression data with whole genome genotype marker data to identify causal pairwise relationships among genes. In this paper we extend this methodology to the principled construction of networks describing local regulatory modules. Our method is a two-step process: starting with a seed gene of interest, a Markov Blanket over genotype and gene expression observations is inferred according to differential entropy estimation; a Bayes Net is then constructed from the resulting variables with important biological constraints yielding causally correct relationships. We tested our method by simulating a regulatory network within the background of a real data set. We found that 45% of the genes in a regulatory module can be identified and the relations among the genes can be recovered with moderately high accuracy (> 70%). Since sample size is a practical and economic limitation, we considered the impact of increasing the number of samples and found that recovery of true gene-gene relationships only doubled with ten times the number of samples, suggesting that useful networks can be achieved with current experimental designs, but that significant improvements are not expected without major increases in the number of samples. When we applied this method to an actual data set of 111 back-crossed mice, we were able to recover local gene regulatory networks supported by the biological literature.


Subject(s)
Computational Biology/methods; Gene Expression Profiling/methods; Gene Expression Regulation/physiology; Gene Regulatory Networks/genetics; Genetic Markers/genetics; Models, Genetic; Signal Transduction/genetics; Transcription Factors/genetics; Computer Simulation; Information Theory; Sample Size
14.
Science ; 316(5832): 1718-23, 2007 Jun 22.
Article in English | MEDLINE | ID: mdl-17510324

ABSTRACT

We present a draft sequence of the genome of Aedes aegypti, the primary vector for yellow fever and dengue fever, which at approximately 1376 million base pairs is about 5 times the size of the genome of the malaria vector Anopheles gambiae. Nearly 50% of the Ae. aegypti genome consists of transposable elements. These contribute to a factor of approximately 4 to 6 increase in average gene length and in sizes of intergenic regions relative to An. gambiae and Drosophila melanogaster. Nonetheless, chromosomal synteny is generally maintained among all three insects, although conservation of orthologous gene order is higher (by a factor of approximately 2) between the mosquito species than between either of them and the fruit fly. An increase in genes encoding odorant binding, cytochrome P450, and cuticle domains relative to An. gambiae suggests that members of these protein families underpin some of the biological differences between the two mosquito species.


Subject(s)
Aedes/genetics; Genome, Insect; Insect Vectors/genetics; Aedes/metabolism; Animals; Anopheles/genetics; Anopheles/metabolism; Arboviruses; Base Sequence; DNA Transposable Elements; Dengue/prevention & control; Dengue/transmission; Drosophila melanogaster/genetics; Female; Genes, Insect; Humans; Insect Proteins/genetics; Insect Vectors/metabolism; Male; Membrane Transport Proteins/genetics; Molecular Sequence Data; Multigene Family; Protein Structure, Tertiary/genetics; Sequence Analysis, DNA; Sex Characteristics; Sex Determination Processes; Species Specificity; Synteny; Transcription, Genetic; Yellow Fever/prevention & control; Yellow Fever/transmission
15.
Bioinformatics ; 21(9): 1958-63, 2005 May 01.
Article in English | MEDLINE | ID: mdl-15657097

ABSTRACT

MOTIVATION: A high density of single nucleotide polymorphism (SNP) coverage on the genome is desirable and often an essential requirement for population genetics studies. Region-specific or chromosome-specific linkage studies also benefit from the availability of as many high quality SNPs as possible. The availability of millions of SNPs from both Perlegen and the public domain and the development of an efficient microarray-based assay for genotyping SNPs has brought up some interesting analytical challenges. Effective methods for the selection of optimal subsets of SNPs spanning the genome and methods for accurately calling genotypes from probe hybridization patterns have enabled the development of a new microarray-based system for robustly genotyping over 100,000 SNPs per sample. RESULTS: We introduce a new dynamic model-based algorithm (DM) for screening over 3 million SNPs and genotyping over 100,000 SNPs. The model is based on four possible underlying states: Null, A, AB and B for each probe quartet. We calculate a probe-level log likelihood for each model and then select between the four competing models with an SNP-level statistical aggregation across multiple probe quartets to provide a high-quality genotype call along with a quality measure of the call. We assess performance with HapMap reference genotypes, informative Mendelian inheritance relationship in families, and consistency between DM and another genotype classification method. At a call rate of 95.91% the concordance with reference genotypes from the HapMap Project is 99.81% based on over 1.5 million genotypes, the Mendelian error rate is 0.018% based on 10 trios, and the consistency between DM and MPAM is 99.90% at a comparable rate of 97.18%. We also develop methods for SNP selection and optimal probe selection. AVAILABILITY: The DM algorithm is available in Affymetrix's Genotyping Tools software package and in Affymetrix's GDAS software package. See http://www.affymetrix.com for further information. 10 K and 100 K mapping array data are available on the Affymetrix website.
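The model-selection idea in RESULTS (a log likelihood per probe quartet under each of the states Null, A, AB and B, then aggregation across quartets) can be illustrated with a toy sketch. Everything concrete here is an assumption for illustration: the quartet layout (PM_A, MM_A, PM_B, MM_B), the Gaussian intensity model with fixed `background`, `signal` and `sd` values, and the use of a plain sum as the aggregation step. The abstract does not specify these details, and this is not the Affymetrix implementation.

```python
import math

# Assumed pattern of "bright" probes per state, in the order
# (PM_A, MM_A, PM_B, MM_B). Hypothetical layout for illustration.
STATES = {
    "Null": (0, 0, 0, 0),
    "A": (1, 0, 0, 0),
    "AB": (1, 0, 1, 0),
    "B": (0, 0, 1, 0),
}


def quartet_loglik(quartet, pattern, background=100.0, signal=1000.0, sd=150.0):
    """Gaussian log likelihood of one probe quartet under one state.

    The means and standard deviation are made-up constants.
    """
    total = 0.0
    for intensity, bright in zip(quartet, pattern):
        mean = signal if bright else background
        z = (intensity - mean) / sd
        total += -0.5 * z * z - math.log(sd * math.sqrt(2.0 * math.pi))
    return total


def call_genotype(quartets):
    """Sum log likelihoods over quartets; return (best_state, margin).

    The margin between the best and second-best state stands in for a
    quality measure of the call.
    """
    scores = {
        state: sum(quartet_loglik(q, pattern) for q in quartets)
        for state, pattern in STATES.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[0], scores[ranked[0]] - scores[ranked[1]]


# Hypothetical intensities for two SNPs, each measured by several quartets.
het_quartets = [(950, 120, 1100, 90), (1020, 80, 980, 130), (900, 110, 1050, 100)]
hom_quartets = [(1000, 100, 120, 90), (940, 130, 80, 110)]

het_call, het_margin = call_genotype(het_quartets)
hom_call, hom_margin = call_genotype(hom_quartets)
print(het_call, hom_call)
```

A small margin would flag a low-confidence call that could be left uncalled, which is one way a call rate below 100% can arise.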


Subject(s)
Algorithms; DNA Mutational Analysis/methods; Genetic Testing/methods; Models, Genetic; Oligonucleotide Array Sequence Analysis/methods; Polymorphism, Single Nucleotide/genetics; Sequence Alignment/methods; Sequence Analysis, DNA/methods; Computer Simulation; Genotype; Humans; Software
16.
J Biopharm Stat ; 14(3): 687-700, 2004 Aug.
Article in English | MEDLINE | ID: mdl-15468759

ABSTRACT

We have developed an algorithm for inferring the degree of similarity between genes by using the graph-based structure of Gene Ontology (GO). We applied this knowledge-based similarity metric to a clique-finding algorithm for detecting sets of related genes with biological classifications. We also combined it with an expression-based distance metric to produce a co-cluster analysis, which accentuates genes with both similar expression profiles and similar biological characteristics and identifies gene clusters that are more stable and biologically meaningful. These algorithms are demonstrated in the analysis of MPRO cell differentiation time series experiments.
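One simple way to derive a gene-to-gene similarity from the graph structure of an ontology is sketched below. It is a stand-in, not the authors' metric: it scores two terms by the Jaccard overlap of their ancestor sets and two genes by their best-matching pair of terms. The toy ontology, its term names and the function names are all hypothetical.

```python
# Toy ontology: each term maps to its parent terms (hypothetical names).
PARENTS = {
    "root": [],
    "signaling": ["root"],
    "metabolism": ["root"],
    "kinase_signaling": ["signaling"],
    "gpcr_signaling": ["signaling"],
    "lipid_metabolism": ["metabolism"],
}


def ancestors(term):
    """Return the term together with all of its ancestors in the graph."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def term_similarity(a, b):
    """Jaccard overlap of the two terms' ancestor sets (assumed metric)."""
    anc_a, anc_b = ancestors(a), ancestors(b)
    return len(anc_a & anc_b) / len(anc_a | anc_b)


def gene_similarity(terms_a, terms_b):
    """Best term-to-term similarity between two genes' annotations."""
    return max(term_similarity(a, b) for a in terms_a for b in terms_b)


close = term_similarity("kinase_signaling", "gpcr_signaling")
far = term_similarity("kinase_signaling", "lipid_metabolism")
genes = gene_similarity(["kinase_signaling", "lipid_metabolism"], ["gpcr_signaling"])
print(close, far, genes)
```

A similarity of this kind can be converted to a distance (for example, one minus the similarity) and blended with an expression-based distance before clustering, which is the general shape of the co-cluster analysis the abstract mentions.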


Subject(s)
Algorithms; Artificial Intelligence; Cluster Analysis; Oligonucleotide Array Sequence Analysis/statistics & numerical data; Cell Differentiation/drug effects; Cell Differentiation/physiology; Humans; Neutrophils/drug effects; Tretinoin/pharmacology
17.
Pac Symp Biocomput ; : 127-38, 2002.
Article in English | MEDLINE | ID: mdl-11928469

ABSTRACT

The field of comparative genomics allows us to elucidate the molecular mechanisms necessary for the machinery of an organism by contrasting its genome against those of other organisms. In this paper, we contrast the genome of Homo sapiens against C. elegans, Drosophila melanogaster, and S. cerevisiae to gain insights on what structural domains are present in each organism. Previous work has assessed this using sequence-based homology recognition systems such as Pfam [1] and InterPro [2]. Here, we pursue a structure-based assessment, analyzing genomes according to domains in the SCOP structural domain dictionary. Compared to other eukaryotic genomes, we observe additional domains in the human genome relating to signal transduction, immune response, transport, and certain enzymes. Compared to the metazoan genomes, the yeast genome shows an absence of domains relating to immune response, cell-cell interactions, and cell signaling.


Subject(s)
Genome; Genomics/methods; Sequence Analysis, DNA/methods; Animals; Caenorhabditis elegans/genetics; Computer Simulation; Drosophila melanogaster/genetics; Enzymes/genetics; Humans; Models, Genetic; Saccharomyces cerevisiae/genetics; Zinc Fingers/genetics
18.
J Eukaryot Microbiol ; 50(3): 145-55, 2003.
Article in English | MEDLINE | ID: mdl-12836870

ABSTRACT

Chlamydomonas reinhardtii is a unicellular green alga that has been used as a model organism for the study of flagella and basal bodies as well as photosynthesis. This report analyzes finished genomic DNA sequence for 0.5% of the nuclear genome. We have used three gene prediction programs as well as EST and protein homology data to estimate the total number of genes in Chlamydomonas to be between 12,000 and 16,400. Chlamydomonas appears to have many more genes than any other unicellular organism sequenced to date. Twenty-seven percent of the predicted genes have significant identity to both ESTs and known proteins in other organisms, 32% have significant identity to ESTs alone, and 14% have significant similarity to known proteins in other organisms. For gene prediction in Chlamydomonas, GreenGenie appeared to have the highest sensitivity and specificity at the exon level, scoring 71% and 82%, respectively. Two new alternative splicing events were predicted by aligning Chlamydomonas ESTs to the genomic sequence. Finally, recombination differs between the two sequenced contigs. The 350 kb of the linkage group III contig is devoid of recombination, whereas the linkage group I contig is 30 map units long over 33 kb.


Subject(s)
Chlamydomonas reinhardtii/genetics , Genome , Amino Acid Sequence , Animals , Base Sequence , DNA, Protozoan/analysis , DNA, Protozoan/isolation & purification , Genetic Linkage , Molecular Sequence Data , RNA, Transfer , Repetitive Sequences, Nucleic Acid , Sequence Alignment
19.
Bioinformatics ; 20(9): 1462-3, 2004 Jun 12.
Article in English | MEDLINE | ID: mdl-14962933

ABSTRACT

SUMMARY: The NetAffx Gene Ontology (GO) Mining Tool is a web-based, interactive tool that permits traversal of the GO graph in the context of microarray data. It accepts a list of Affymetrix probe sets and renders a GO graph as a heat map colored according to significance measurements. The rendered graph is interactive, with nodes linked to public web sites and to lists of the relevant probe sets. The GO Mining Tool provides visualization combining biological annotation with expression data, encompassing thousands of genes in one interactive view. AVAILABILITY: GO Mining Tool is freely available at http://www.affymetrix.com/analysis/query/go_analysis.affx


Subject(s)
Algorithms , Documentation/methods , Information Storage and Retrieval/methods , Natural Language Processing , Oligonucleotide Array Sequence Analysis , Software , User-Computer Interface , Abstracting and Indexing/methods , Computer Graphics , Database Management Systems , Gene Expression Profiling/methods
20.
Proc Natl Acad Sci U S A ; 100(20): 11237-42, 2003 Sep 30.
Article in English | MEDLINE | ID: mdl-14500916

ABSTRACT

High-density oligonucleotide microarrays enable simultaneous monitoring of expression levels of tens of thousands of transcripts. For accurate detection and quantitation of transcripts in the presence of cellular mRNA, it is essential to design microarrays whose oligonucleotide probes produce hybridization intensities that accurately reflect the concentration of the original mRNA. We present a model-based approach that predicts optimal probes by using sequence and empirical information. We constructed a thermodynamic model for hybridization behavior and determined the influence of empirical factors on the effective fitting parameters. We designed Affymetrix GeneChip probe arrays that contained all 25-mer probes for hundreds of human and yeast transcripts and collected data over a 4,000-fold concentration range. Multiple linear regression models were built to predict the hybridization intensity of each probe at given target concentrations, and each intensity profile was summarized by a probe response metric. We selected probe sets to represent each transcript that were optimized with respect to responsiveness, independence (degree to which probe sequences are nonoverlapping), and uniqueness (lack of similarity to sequences in the expressed genomic background). We show that this approach is capable of selecting probes with high sensitivity and specificity for high-density oligonucleotide arrays.


Subject(s)
Oligonucleotide Array Sequence Analysis , RNA Probes , Cell Line , Humans , Models, Molecular , Open Reading Frames