ABSTRACT
The laboratory rat (Rattus norvegicus) is an indispensable tool in experimental medicine and drug development, having made inestimable contributions to human health. We report here the genome sequence of the Brown Norway (BN) rat strain. The sequence represents a high-quality 'draft' covering over 90% of the genome. The BN rat sequence is the third complete mammalian genome to be deciphered, and three-way comparisons with the human and mouse genomes resolve details of mammalian evolution. This first comprehensive analysis includes genes and proteins and their relation to human disease, repeated sequences, comparative genome-wide studies of mammalian orthologous chromosomal regions and rearrangement breakpoints, reconstruction of ancestral karyotypes and the events leading to existing species, rates of variation, and lineage-specific and lineage-independent evolutionary events such as expansion of gene families, orthology relations and protein evolution.
Subjects
Molecular Evolution , Genome , Genomics , Inbred BN Rats/genetics , Animals , Base Composition , Centromere/genetics , Mammalian Chromosomes/genetics , CpG Islands/genetics , DNA Transposable Elements/genetics , Mitochondrial DNA/genetics , Gene Duplication , Humans , Introns/genetics , Male , Mice , Molecular Models , Mutagenesis , Single Nucleotide Polymorphism/genetics , RNA Splice Sites/genetics , Untranslated RNA/genetics , Rats , Nucleic Acid Regulatory Sequences/genetics , Retroelements/genetics , DNA Sequence Analysis , Telomere/genetics
ABSTRACT
OBJECTIVE: In isolated populations, 'background' linkage disequilibrium (LD) has been shown to extend over large genetic distances. This, together with their reduced environmental and genetic heterogeneity, has stimulated interest in the potential of such populations for association mapping. We compared LD unit map distances with pair-wise measurements of LD in a dense single nucleotide polymorphism (SNP) set. METHODS: We genotyped 771 SNPs in an 8 Mb segment of chromosome 22 on 101 individuals from the isolated village of Talana, Sardinia, and compared them with outbred European populations. RESULTS: Heterozygosity was remarkably similar in both populations. In contrast, the extent of LD observed was quite different. The decay of LD with distance is slower in the isolate. The differences in LD map lengths suggest that useful LD extends up to three times farther in the Sardinian population; smaller differences are seen with pairwise LD metrics. While LD map length slightly decreases with average relatedness, cryptic relatedness does not explain the decrease in LD map length. Haplotypes, block boundaries, and patterns of LD are similar in both populations, suggesting a shared distribution of recombination hotspots. CONCLUSIONS: About 15% fewer haplotype tagging SNPs need to be genotyped in the isolate, and possibly 70% fewer if selecting SNPs evenly spaced on the metric LD map.
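The pairwise LD metrics mentioned in this abstract are commonly summarized as D, r² and D'. The following is a minimal sketch of how those quantities can be computed for one SNP pair, assuming phased two-locus haplotypes coded 0/1. The function name `pairwise_ld` and the input layout are illustrative assumptions, and the sketch does not reproduce the LD unit map used in the study.

```python
def pairwise_ld(haplotypes):
    """Return (D, r2, D') for one SNP pair.

    haplotypes: list of (a, b) tuples, each allele coded 0 or 1.
    This input layout is an assumption made for illustration.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    d = p_ab - p_a * p_b                             # raw disequilibrium coefficient
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    r2 = d * d / denom

    # D' rescales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return d, r2, d_prime


# Hypothetical data: two SNPs in complete LD, then two independent SNPs.
complete = [(1, 1)] * 5 + [(0, 0)] * 5
independent = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(pairwise_ld(complete))
print(pairwise_ld(independent))
```

Plotting r² or D' against the physical distance between SNPs for each population is one way to compare how quickly LD decays, which is the kind of contrast the abstract describes between the isolate and the outbred samples.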
Subjects
Genetic Markers , Linkage Disequilibrium , Single Nucleotide Polymorphism , Human Chromosomes Pair 22 , Haplotypes , Heterozygote , Humans , Italy
ABSTRACT
Expression of prolactin and of prolactin and estrogen receptors in lymphocytes, bone marrow, and lymphoma cell lines suggests that hormonal modulation may influence lymphoma risk. Prolactin and estrogen promote the proliferation and survival of B cells, factors that may increase non-Hodgkin lymphoma risk, and effects of estrogen may be modified by catechol-O-methyltransferase (COMT), an enzyme that alters estrogenic activity. Cytochrome P450 17A1 (CYP17A1), a key enzyme in estrogen biosynthesis, has been associated with increased cancer risk and may affect lymphoma susceptibility. We studied the polymorphisms prolactin (PRL) -1149G>T, CYP17A1 -34T>C, and COMT 108/158Val>Met, and predicted haplotypes among a subset of participants (n = 308 cases, n = 684 controls) in a San Francisco Bay Area population-based non-Hodgkin lymphoma study (n = 1,593 cases, n = 2,515 controls) conducted from 1988 to 1995. Oral contraceptive and other hormone use also was analyzed. Odds ratios (OR) for non-Hodgkin lymphoma and follicular lymphoma were reduced for carriers of the PRL -1149TT genotype [OR, 0.64; 95% confidence interval (95% CI), 0.41-1.0; OR, 0.53; 95% CI, 0.26-1.0, respectively]. Diffuse large-cell lymphoma risk was increased for those with CYP17A1 polymorphisms including CYP17A1 -34CC (OR, 2.0; 95% CI, 1.1-3.5). ORs for all non-Hodgkin lymphoma and follicular lymphoma among women were decreased for COMT IVS1 701A>G [rs737865; variant allele: OR, 0.53; 95% CI, 0.34-0.82; OR, 0.42; 95% CI, 0.23-0.78, respectively]. Compared with never users of oral contraceptives, a 35% reduced risk was observed among oral contraceptive users in the total population. Reduced ORs for all non-Hodgkin lymphoma were observed with use of exogenous estrogens among genotyped women although 95% CIs included unity. These results suggest that PRL, CYP17A1, and COMT may be relevant genetic loci for non-Hodgkin lymphoma and indicate a possible role for prolactin and estrogen in lymphoma pathogenesis.
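The odds ratios and 95% confidence intervals reported above can be illustrated with the standard unadjusted calculation from a 2x2 table. This is only a sketch under stated assumptions: the counts below are hypothetical, the function name `odds_ratio_ci` is invented for illustration, and the study's own estimates may well come from adjusted models rather than this simple Wald interval.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval (z=1.96 gives about 95%)."""
    a, b = exposed_cases, unexposed_cases
    c, d = exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
or_, lo, hi = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_:.2f}, 95% CI = {lo:.2f}-{hi:.2f}")
```

An interval whose lower bound falls below 1 and whose upper bound lies above it "includes unity", the situation the abstract notes for exogenous estrogen use among genotyped women.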
Subjects
Catechol O-Methyltransferase/genetics , Estrogen Receptor alpha/genetics , Non-Hodgkin Lymphoma/genetics , Single Nucleotide Polymorphism/genetics , Prolactin/genetics , Steroid 17-alpha-Hydroxylase/genetics , Adult , Aged , Female , Population Genetics , Haplotypes , Humans , Middle Aged , Genetic Polymorphism , Risk Factors , San Francisco
ABSTRACT
The entire 2.9-billion-letter sequence (nucleotide base pairs) of the human genome is available as a resource for scientific discovery. Some of the findings from the completion of the human genome were expected, confirming knowledge anticipated by many years of research and analysis in both human and comparative genetics. Other findings were not expected. In either case, the availability of the human genome is likely to have significant implications for basic research, clinical investigation, and ultimately the practice of medicine.
Subjects
Biological Evolution , Biology , Gastrointestinal Diseases/genetics , Human Genome , Genomics , Medicine , Humans
ABSTRACT
Modern large-scale genetic association studies generate increasingly high-dimensional datasets. Therefore, some variable selection procedure should be performed before the application of traditional data analysis methods, for reasons of both computational efficiency and problems related to overfitting. We describe here a "wrapper" strategy (SIZEFIT) for variable selection that uses a Random Forests classifier, coupled with various local search/optimization algorithms. We apply it to a large dataset consisting of 2,425 African-American and non-Hispanic white individuals genotyped for 4,869 single-nucleotide polymorphisms (SNPs) in a coronary heart disease (CHD) case-cohort association study (Atherosclerosis Risk in Communities), using incident CHD and plasma low-density lipoprotein (LDL) cholesterol levels as the dependent variables. We show that most SNPs can be safely removed from the dataset without compromising the predictive (classification) accuracy, with only a small number of SNPs (sometimes fewer than 100) containing any predictive signal. A statistical (SUMSTAT) approach is also applied to the dataset for comparison purposes. We describe a novel method for refining the subset of signal-containing SNPs (FIXFIT), based on an Extremal Optimization algorithm. Finally, we compare the top SNP rankings obtained by different methods and devise practical guidelines for researchers trying to generate a compact subset of predictive SNPs from genome-wide association datasets. Interestingly, there is a significant amount of overlap between seemingly very heterogeneous rankings. We conclude by constructing compact optimal predictive SNP subsets for CHD (fewer than 150 SNPs) and LDL (fewer than 300 SNPs) phenotypes, and by comparing various rankings for two well-known positive control SNPs for LDL in the apolipoprotein E gene.
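The general idea of a wrapper strategy, in which candidate SNP subsets are scored by the accuracy of a classifier trained on them, can be sketched as follows. This is not SIZEFIT: to stay dependency-free it swaps in a nearest-centroid classifier for Random Forests and a greedy backward elimination for the local search algorithms named in the abstract. The function names and the toy genotype matrix are hypothetical.

```python
def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier on the chosen features."""
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in set(y):
            # Training rows for this class, excluding the held-out sample i.
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in features]
        point = [X[i][f] for f in features]
        predicted = min(
            centroids,
            key=lambda lab: sum((p - c) ** 2 for p, c in zip(point, centroids[lab])),
        )
        correct += predicted == y[i]
    return correct / len(X)


def backward_eliminate(X, y, min_features=1):
    """Greedily drop features whose removal does not lower the wrapper score."""
    selected = list(range(len(X[0])))
    best = loo_accuracy(X, y, selected)
    improved = True
    while improved and len(selected) > min_features:
        improved = False
        for f in list(selected):
            trial = [g for g in selected if g != f]
            accuracy = loo_accuracy(X, y, trial)
            if accuracy >= best:  # removing f costs nothing, so drop it
                selected, best, improved = trial, accuracy, True
                break
    return selected, best


# Hypothetical genotypes coded 0/1/2; only SNP 0 tracks the phenotype.
X = [[0, 1, 2], [0, 2, 0], [0, 0, 1], [2, 1, 0], [2, 0, 2], [2, 2, 1]]
y = [0, 0, 0, 1, 1, 1]
print(backward_eliminate(X, y))
```

In this toy case the two uninformative SNPs are removed without any loss of accuracy, which mirrors in miniature the abstract's finding that most SNPs can be dropped while the predictive signal is retained.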
Subjects
Algorithms , Genome-Wide Association Study/methods , Black or African American/genetics , Apolipoproteins E/genetics , Atherosclerosis/genetics , Genetic Databases , Genetic Predisposition to Disease , Humans , Genetic Models
ABSTRACT
Admixture mapping (also known as "mapping by admixture linkage disequilibrium," or MALD) provides a way of localizing genes that cause disease, in admixed ethnic groups such as African Americans, with approximately 100 times fewer markers than are required for whole-genome haplotype scans. However, it has not been possible to perform powerful scans with admixture mapping because the method requires a dense map of validated markers known to have large frequency differences between Europeans and Africans. To create such a map, we screened through databases containing approximately 450,000 single-nucleotide polymorphisms (SNPs) for which frequencies had been estimated in African and European population samples. We experimentally confirmed the frequencies of the most promising SNPs in a multiethnic panel of unrelated samples and identified 3,011 as a MALD map (1.2 cM average spacing). We estimate that this map is approximately 70% informative in differentiating African versus European origins of chromosomal segments. This map provides a practical and powerful tool, which is freely available without restriction, for screening for disease genes in African American patient cohorts. The map is especially appropriate for those diseases that differ in incidence between the parental African and European populations.
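The screening step described above, which looks for markers with large frequency differences between African and European samples, can be sketched as a simple filter on the absolute allele frequency difference. The SNP identifiers, the frequencies, and the 0.5 cutoff are all hypothetical. The abstract does not state the selection criterion actually used, which may rely on a different informativeness measure.

```python
def select_ancestry_markers(snps, min_delta=0.5):
    """Rank SNPs by absolute allele frequency difference between two populations.

    snps: dict mapping a SNP name to (freq_african, freq_european).
    min_delta: hypothetical cutoff; SNPs below it are discarded.
    """
    ranked = []
    for name, (p_afr, p_eur) in snps.items():
        delta = abs(p_afr - p_eur)
        if delta >= min_delta:
            ranked.append((name, delta))
    # Most informative markers (largest difference) first.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Hypothetical candidate SNPs and reference allele frequencies.
candidates = {
    "snp_a": (0.90, 0.10),
    "snp_b": (0.50, 0.45),
    "snp_c": (0.20, 0.85),
}
for name, delta in select_ancestry_markers(candidates):
    print(name, round(delta, 2))
```

A real map would also need the selected markers to be spaced along the genome and their frequencies confirmed experimentally, as the abstract describes.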