Abstract
Association mapping (i.e., linkage disequilibrium mapping) is a powerful tool for positional cloning of disease genes. We propose a kernel-based association test (KBAT), a composite function of the P-values of single-locus association tests and kernel weights related to intermarker distances and/or linkage disequilibria. The KBAT generalizes some existing test statistics. The method can be applied to candidate-gene studies and can scan each chromosome using a moving-average procedure. We evaluated the performance of the KBAT through simulation studies that considered evolutionary parameters, disease models, sample sizes, kernel functions, test statistics, window attributes, empirical P-value estimations, and genetic/physical maps. The results showed that the KBAT had a well-controlled false positive rate and high power compared to existing methods. The KBAT was also applied to a genomewide data set from the Collaborative Study on the Genetics of Alcoholism, and important genes associated with alcohol dependence were identified. In summary, the merits of the KBAT are multifold: the KBAT is robust against the inclusion of nuisance markers, is invariant to the map scale, and accommodates different types of genomic data, study designs, and study purposes. The proposed methods are packaged in user-friendly software, KBAT, available at http://www.stat.sinica.edu.tw/hsinchou/genetics/association/KBAT.htm.
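The idea of combining single-locus P-values with distance-based kernel weights in a moving window can be illustrated with a minimal sketch. This is not the published KBAT statistic: the Gaussian kernel, the use of -log10(P), the bandwidth, and all function names and data below are assumptions chosen only for illustration.

```python
import math


def gaussian_kernel(distance, bandwidth):
    """Weight that decays with intermarker distance (assumed kernel choice)."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def kernel_weighted_score(positions, p_values, center, bandwidth):
    """Kernel-weighted average of -log10(P) around `center`.

    Illustrative only; the abstract does not give the exact KBAT formula.
    """
    weights = [gaussian_kernel(pos - center, bandwidth) for pos in positions]
    weighted = sum(w * -math.log10(p) for w, p in zip(weights, p_values))
    return weighted / sum(weights)


def scan(positions, p_values, bandwidth):
    """Moving-average scan: one score centered on each marker."""
    return [
        kernel_weighted_score(positions, p_values, pos, bandwidth)
        for pos in positions
    ]


# Hypothetical marker positions (arbitrary map units) and single-locus P-values.
positions = [0.0, 1.0, 2.0, 3.0, 10.0]
p_values = [0.5, 0.01, 0.001, 0.02, 0.6]
scores = scan(positions, p_values, bandwidth=1.0)
print(scores)
```

In this toy scan the cluster of small P-values around position 2.0 reinforces itself, while the isolated marker at 10.0 is scored almost entirely on its own P-value. In practice significance would be assessed empirically, for example by permutation, as the abstract's mention of empirical P-value estimation suggests.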
Subjects
Chromosome Mapping/statistics & numerical data, Alcoholism/genetics, Biological Evolution, Biometry, Molecular Cloning, Computer Simulation, Genetic Predisposition to Disease, Humans, Linkage Disequilibrium, Genetic Models, Physical Chromosome Mapping/statistics & numerical data, Single Nucleotide Polymorphism, Sample Size, Software
Abstract
BACKGROUND: Understanding the mapping precision of genome-wide association studies (GWAS), that is, the physical distance between the top associated single-nucleotide polymorphisms (SNPs) and the causal variants, is essential for designing fine-mapping experiments for complex traits and diseases. RESULTS: Using simulations based on whole-genome sequencing (WGS) data from 3642 unrelated individuals of European descent, we show that the association signals at rare causal variants (minor allele frequency ≤ 0.01) are very unlikely to be mapped to common variants in GWAS using either WGS data or imputed data, and vice versa. We predict that at least 80% of the common variants identified from published GWAS using imputed data are within 33.5 Kbp of the causal variants, a resolution that is comparable with that using WGS data. Mapping precision at these loci will improve with increasing sample sizes of GWAS in the future. For rare variants, the mapping precision of GWAS using WGS data is extremely high, suggesting WGS is an efficient strategy to detect and fine-map rare variants simultaneously. We further assess the mapping precision by linkage disequilibrium between GWAS hits and causal variants and develop an online tool (gwasMP) to query our results with different thresholds of physical distance and/or linkage disequilibrium (http://cnsgenomics.com/shiny/gwasMP). CONCLUSIONS: Our findings provide a benchmark to inform future design and development of fine-mapping experiments and technologies to pinpoint the causal variants at GWAS loci.
Subjects
Human Genome, Genome-Wide Association Study/statistics & numerical data, Physical Chromosome Mapping/statistics & numerical data, Quantitative Trait Loci, Alleles, Gene Frequency, Genotype, High-Throughput Nucleotide Sequencing, Humans, Linkage Disequilibrium, Phenotype, Single Nucleotide Polymorphism, Sample Size, White People
Abstract
We describe a method to make physical maps of genomes using correlative hybridization patterns of probes to random pools of BACs. We thereby derive an estimated distance between probes and then use this estimated distance to order the probes. To test the method, we used BAC libraries from Schizosaccharomyces pombe. We compared our data to the known sequence assembly in order to assess accuracy. We demonstrate a small number of significant discrepancies between our method and the map derived by sequence assembly. Some of these discrepancies may arise because genome order within a population is not stable; imposing a linear order on a population may not be biologically meaningful.
Subjects
Oligonucleotide Array Sequence Analysis, Physical Chromosome Mapping, Schizosaccharomyces/genetics, DNA Sequence Analysis, Algorithms, Bacterial Artificial Chromosomes, Comparative Genomic Hybridization/statistics & numerical data, Genetic Models, Oligonucleotide Array Sequence Analysis/statistics & numerical data, Physical Chromosome Mapping/statistics & numerical data, DNA Sequence Analysis/statistics & numerical data
Abstract
Multiple sclerosis (MS) is a demyelinating disease of the central nervous system with a complex genetic background. In the present study, based on the Finnish population, we typed a large number of microsatellite markers in separately pooled DNA samples from 195 MS patients and 205 controls. A total of 108 markers showed evidence of association. Five genomic regions containing two or more of these markers within a 1-Mb interval were identified: 1q43, 2p16, 4p15, 4q34 and 6p21 (the MHC region). Substantial overlap with previously published linkage genome screens is also seen.
Subjects
Human Genome, Microsatellite Repeats, Multiple Sclerosis/genetics, Alleles, Case-Control Studies, Female, Finland/epidemiology, Gene Frequency, Genotype, Humans, Male, Multiple Sclerosis/epidemiology, Physical Chromosome Mapping/statistics & numerical data, Polymerase Chain Reaction/statistics & numerical data
Abstract
We have completed a second-generation linkage map that incorporates sequence-based positional information. This new map, the Rutgers Map v.2, includes 28,121 polymorphic markers with physical positions corroborated by recombination-based data. Sex-averaged and sex-specific linkage map distances, along with confidence intervals, have been estimated for all map intervals. In addition, a regression-based smoothed map is provided that facilitates interpolation of positions of unmapped markers on this map. With nearly twice as many markers as our first-generation map, the Rutgers Map continues to be a unique and comprehensive resource for obtaining genetic map information for large sets of polymorphic markers.
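Placing an unmapped marker on a genetic map from its physical position can be sketched as follows. The abstract describes a regression-based smoothed map; the sketch below substitutes plain piecewise-linear interpolation between flanking mapped markers, which is a simpler stand-in and not the Rutgers procedure. The anchor coordinates and the function name `interpolate_cm` are hypothetical.

```python
from bisect import bisect_right


def interpolate_cm(anchors, query_bp):
    """Estimate a genetic position (cM) for a physical position (bp).

    `anchors` is a list of (physical_bp, genetic_cM) pairs for mapped
    markers, sorted by strictly increasing physical position.
    """
    bps = [bp for bp, _ in anchors]
    if not bps[0] <= query_bp <= bps[-1]:
        raise ValueError("query position lies outside the mapped interval")
    i = bisect_right(bps, query_bp)
    if i == len(bps):
        # The query coincides with the last mapped marker.
        return anchors[-1][1]
    (bp0, cm0), (bp1, cm1) = anchors[i - 1], anchors[i]
    fraction = (query_bp - bp0) / (bp1 - bp0)
    return cm0 + fraction * (cm1 - cm0)


# Hypothetical mapped markers: (physical position in bp, genetic position in cM).
anchors = [(1_000_000, 0.0), (2_000_000, 1.5), (4_000_000, 2.5)]
print(interpolate_cm(anchors, 1_500_000))
print(interpolate_cm(anchors, 3_000_000))
```

Linear interpolation assumes a constant recombination rate between flanking markers; a smoothed regression fit, as used for the published map, is less sensitive to noise in individual interval estimates.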
Subjects
Genetic Linkage, Human Genome, Physical Chromosome Mapping, Confidence Intervals, Female, Genetic Markers, Genotype, Humans, Male, Physical Chromosome Mapping/methods, Physical Chromosome Mapping/statistics & numerical data, Single Nucleotide Polymorphism
Abstract
This unit provides concise overviews of the many physical mapping resources available and relates them to the genetic and transcript maps. Useful information on resolution of the maps, how to access them, and how to interpret them is compiled and presented in a clear fashion. Especially useful is a set of detailed protocols describing how to construct an STS marker and how to map it by means of available yeast artificial chromosomes (YACs). An additional protocol describes accessing EST marker maps.
Subjects
Genetic Databases, Physical Chromosome Mapping/statistics & numerical data, Yeast Artificial Chromosomes/genetics, Human Chromosomes/genetics, Expressed Sequence Tags, Genetic Markers, Human Genome, Humans, Radiation Hybrid Mapping, Sequence Tagged Sites
Abstract
Physical map assembly is the inference of genome structure from experimental data. Map assembly depends on the integration of diverse data including sequence tagged site (STS) marker content, clone sizing, and restriction digest fingerprints (RDF). As experimentally measured data, these are uncertain and error-prone. Physical map assembly from error-free data is straightforward and can be accomplished in linear time in the number of clones, but the assembly of an optimal map from error-prone data is an NP-hard problem. We present an alternative approach to physical map assembly that is based on a probabilistic view of the data and seeks to identify those features of the map that can be reliably inferred from the available data. With this approach, we achieve a number of goals. These include the use of multiple data sources, appropriate representation of uncertainties in the underlying data, the use of clone length information in fingerprint map assembly, and the use of higher order information in map assembly. By higher order information, we mean relationships that are not expressible in terms of neighbouring clone relationships. These include triplet and multiple clone overlaps, the uniqueness of STS position, and fingerprint marker locations. In a probabilistic view of physical mapping, we assert that all of the many possible map assemblies are equally likely a priori. Given experimental data, we can only state which assemblies are more likely than others. Parameters of interest are then derived as likelihood-weighted averages over map assemblies. Ideally these averages should be sums or integrals over all possible map assemblies, but computationally this is not feasible for real-world map assembly problems. Instead, sampling is used to asymptotically approach the desired parameters. Software implementing our probabilistic approach to mapping has been written. Assembly of mixed RDF and STS maps containing up to 60 clones can be accomplished on a desktop PC with run times under an hour. AVAILABILITY: http://stl.wustl.edu/software/gibbsmap/.
Subjects
Molecular Cloning, Statistical Models, Physical Chromosome Mapping/statistics & numerical data, Computational Biology, Genetic Techniques/statistics & numerical data, Restriction Mapping/statistics & numerical data, Sequence Tagged Sites
Abstract
In this study we quantify the features of meiotic recombination on the long arm of human chromosome 21. We constructed a 67.3-centimorgan (cM) high-resolution, comprehensive, and accurate genetic linkage map of chromosome 21q using 187 highly polymorphic markers covering almost the entire long arm; 46 loci, consisting of mutually recombining marker sets, were ordered with greater than 1000:1 odds and with average interlocus distance of 1.46 cM. These markers were used to accurately identify all exchanges in 186 female and 160 male meioses and to show (1) significant excess of recombination in female versus male meioses, (2) an overall decline in female:male recombination between the centromere and the telomere, (3) greater positive chiasma interference in male than in female meioses, and (4) lack of correlation between exchange frequency and parental age. By comparing the genetic map with the 21q sequence map, we show a general trend of increasing male, but near-constant female, recombination versus physical distance across 21q, explaining the gender-specific recombination effect. The recombination rate varies considerably between genders across 21q but is the greatest (eightfold) in the pericentromeric region, with a rate of approximately 250 kb/cM in females and approximately 2125 kb/cM in males. We used information on the locations of all exchanges to construct an empirical map function that confirms the statistical findings of positive interference. These analyses reveal that occurrence of recombination on 21q is not only gender-specific but also region-specific and that recombination suppression at the centromere is not universal. We also find evidence that male exchange location is highly correlated with gene density.
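The pericentromeric figures quoted in this abstract can be checked with a short calculation that converts kb/cM into the more common cM/Mb and takes the female:male ratio. Only the two rates come from the abstract; the function name `cm_per_mb` is a hypothetical helper.

```python
def cm_per_mb(kb_per_cm):
    """Convert physical distance per centimorgan (kb/cM) to a rate in cM/Mb."""
    return 1000.0 / kb_per_cm


female_rate = cm_per_mb(250.0)    # rate from the abstract: ~250 kb/cM in females
male_rate = cm_per_mb(2125.0)     # rate from the abstract: ~2125 kb/cM in males
ratio = female_rate / male_rate   # female:male recombination ratio

print(f"female: {female_rate:.2f} cM/Mb")
print(f"male:   {male_rate:.2f} cM/Mb")
print(f"female:male ratio: {ratio:.1f}")
```

The ratio works out to 8.5, consistent with the roughly eightfold difference reported for the pericentromeric region.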
Subjects
Human Chromosome Pair 21/genetics, Meiosis/genetics, Genetic Recombination/genetics, Adolescent, Adult, Age Factors, Aged, Genetic Crossing Over/genetics, Female, Genetic Linkage, Humans, Male, Middle Aged, Parents, Physical Chromosome Mapping/statistics & numerical data, Sex Factors
Abstract
Tandemly arrayed genes (TAGs) are an important genomic component. However, most previous studies have focused on individual TAG families, and a broader characterization of their genomic distribution is not yet available. In this study, we examined the distribution of TAGs in the Arabidopsis thaliana genome and examined TAG density with relation to recombination rates. Recombination rates along A. thaliana chromosomes were estimated by comparing a genetic map with the genome sequence. Average recombination rates in A. thaliana are high, and rates vary more than threefold among chromosomal regions. Comparisons between TAG density and recombination indicate a positive correlation on chromosomes 1, 2, and 3. Moreover, there is a consistent centromeric effect. Relative to single-copy genes, TAGs are proportionally less frequent in centromeres than on chromosomal arms. We also examined several factors that have been proposed to affect the sequence evolution of TAG members. Sequence divergence is related to the number of members in the TAG, but genomic location has no obvious effect on TAG sequence divergence, nor does the presence of unrelated genes within a TAG. Overall, the distribution of TAGs in the genome is not consistent with theoretical models predicting the accumulation of repeats in regions of low recombination but may be consistent with stabilizing selection models of TAG evolution.
Subjects
Arabidopsis/genetics, Molecular Evolution, Plant Genes/genetics, Plant Genome, Genetic Recombination/genetics, Chromosome Mapping, Plant Chromosomes/genetics, Computational Biology/statistics & numerical data, Physical Chromosome Mapping/methods, Physical Chromosome Mapping/statistics & numerical data
Abstract
Loss-of-heterozygosity (LOH) studies have implicated one or more chromosome 11 tumor-suppressor gene(s) in the development of cutaneous melanoma as well as a variety of other forms of human cancer. In the present study, we have identified multiple independent critical regions on this chromosome by use of homozygosity mapping of deletions (HOMOD) analysis. This method of analysis involved the use of highly polymorphic microsatellite markers and statistics to identify regions of hemizygous deletion in unmatched melanoma cell line DNAs. Regions of loss were defined by the presence of an extended region of homozygosity (ERH) at ≥5 adjacent markers and having a statistical probability of ≤0.001. Significant ERHs were similar in nature to deletions identified by LOH analyses performed on uncultured melanomas, although a higher frequency of loss (24 [60%] of 40 vs. 16 [34%] of 47) was observed in the cell lines. Overall, six small regions of overlapping deletions (SROs) were identified on chromosome 11 flanked by the markers D11S1338/D11S907 (11p13-15.5 [SRO1]), D11S1344/D11S11385 (11p11.2 [SRO2]), D11S917/D11S1886 (11q21-22.3 [SRO3]), D11S927/D11S4094 (11q23 [SRO4]), AFM210ve3/D11S990 (11q24 [SRO5]), and D11S1351/D11S4123 (11q24-25 [SRO6]). We propose that HOMOD analysis can be used as an adjunct to LOH analysis in the localization of tumor-suppressor genes.
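One way to see how a run of at least five homozygous markers can reach a probability of 0.001 or less is to multiply the chance that each marker is homozygous without any deletion. The sketch below assumes the markers are independent and that each marker's chance of homozygosity is one minus its population heterozygosity; the abstract does not state that the authors computed the probability this way, and the heterozygosity values and function names are hypothetical.

```python
from math import prod


def chance_homozygosity(heterozygosities):
    """Probability that every marker in a run is homozygous by chance alone.

    Assumes independent markers, each homozygous with probability 1 - H.
    """
    return prod(1.0 - h for h in heterozygosities)


def is_significant_erh(heterozygosities, min_markers=5, alpha=0.001):
    """Apply the two thresholds quoted in the abstract to a run of markers."""
    return (
        len(heterozygosities) >= min_markers
        and chance_homozygosity(heterozygosities) <= alpha
    )


# Hypothetical run of five highly polymorphic markers, each with H = 0.75.
run = [0.75] * 5
print(chance_homozygosity(run))
print(is_significant_erh(run))
```

Under these assumptions five markers with heterozygosity 0.75 give 0.25 to the fifth power, just under 0.001, which illustrates why highly polymorphic markers are needed for short runs to be informative.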
Subjects
Human Chromosome Pair 11/genetics, Tumor Suppressor Genes/genetics, Homozygote, Melanoma/genetics, Physical Chromosome Mapping/methods, Sequence Deletion/genetics, Neoplasm DNA/genetics, Genetic Linkage/genetics, Heterozygote, Humans, Loss of Heterozygosity/genetics, Melanoma/pathology, Microsatellite Repeats/genetics, Molecular Sequence Data, Physical Chromosome Mapping/statistics & numerical data, Genetic Polymorphism/genetics, Cultured Tumor Cells
Abstract
Polymorphism data from 20 partially resequenced copies of human chromosome 21 (more than 20,000 polymorphic sites) were analyzed. The allele-frequency distribution shows no deviation from the simplest population genetic model with a constant population size (although we show that our analysis has no power to detect population growth). The average rate of recombination per site is estimated to be roughly one-half of the rate of mutation per site, again in agreement with simple model predictions. However, sliding-window analyses of the amount of polymorphism and the extent of linkage disequilibrium (LD) show significant deviations from standard models. This could be due to the history of selection or demographic change, but it is impossible to draw strong conclusions without much better knowledge of variation in the relationship between genetic and physical distance along the chromosome.
Subjects
Chromosome Mapping/methods, Human Chromosome Pair 21/genetics, Genetic Polymorphism/genetics, Alleles, Chromosome Mapping/statistics & numerical data, Computational Biology/methods, Computational Biology/statistics & numerical data, Gene Frequency/genetics, Genetic Variation/genetics, Haplotypes/genetics, Humans, Linkage Disequilibrium/genetics, Genetic Models, Physical Chromosome Mapping/methods, Physical Chromosome Mapping/statistics & numerical data, Genetic Recombination/genetics
Abstract
Genetic maps are used routinely in family-based linkage studies to identify the rough location of genes that influence human traits and diseases. Unlike physical maps, genetic maps are based on the amount of recombination occurring between adjacent loci rather than the actual number of bases separating them. Genetic maps are constructed by statistically characterizing the number of crossovers observed in parental meioses leading to the transmission of alleles to their offspring. Considerations such as the number of meioses observed, the heterozygosity and physical distance between the loci studied, and the statistical methods used can impact the construction and reliability of a genetic map. As is well known, poorly constructed genetic maps can have adverse effects on linkage mapping studies. With the availability of sequence-based maps, as well as genetic maps generated by different researchers (such as those generated by the Marshfield and deCODE groups), one can investigate the compatibility and properties of different maps. We have integrated information from the most current human genome sequence data (UCSC genome assembly Human July 2003) as well as 8399 microsatellite markers used in the Marshfield and deCODE maps to reconcile these maps. Our efforts resulted in updated sex-specific genetic maps.
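Because genetic distance is defined through recombination rather than base pairs, an observed recombination fraction between two loci is usually converted to map distance with a map function. The sketch below shows the two standard choices, Haldane (no crossover interference) and Kosambi (positive interference). The abstract does not say which map function was used for the reconciled maps, so this is background illustration only.

```python
import math


def haldane_cm(r):
    """Map distance in cM from recombination fraction r (Haldane, 0 <= r < 0.5)."""
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Map distance in cM from recombination fraction r (Kosambi, 0 <= r < 0.5)."""
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def haldane_r(d_cm):
    """Inverse Haldane function: recombination fraction from distance in cM."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))


def kosambi_r(d_cm):
    """Inverse Kosambi function: recombination fraction from distance in cM."""
    return 0.5 * math.tanh(2.0 * d_cm / 100.0)


for r in (0.01, 0.10, 0.30):
    print(f"r = {r:.2f}: Haldane {haldane_cm(r):.2f} cM, Kosambi {kosambi_cm(r):.2f} cM")
```

For small recombination fractions both functions give roughly 100 times r in cM; they diverge as r grows, which is one reason maps built with different statistical methods can disagree over longer intervals.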