Results 1 - 17 of 17
1.
BMC Cancer ; 24(1): 840, 2024 Jul 15.
Article in English | MEDLINE | ID: mdl-39009999

ABSTRACT

BACKGROUND: Detection of cancer and identification of tumor origin at an early stage improve the survival and prognosis of patients. Herein, we proposed a plasma cfDNA-based approach called TOTEM to detect and trace the cancer signal origin (CSO) through methylation markers. METHODS: We performed enzymatic conversion-based targeted methylation sequencing on plasma cfDNA samples collected from a clinical cohort of 500 healthy controls and 733 cancer patients with seven types of cancer (breast, colorectum, esophagus, stomach, liver, lung, and pancreas) and randomly divided these samples into a training cohort and a testing cohort. An independent validation cohort of 143 healthy controls, 79 liver cancer patients and 100 stomach cancer patients was recruited to validate the generalizability of our approach. RESULTS: A total of 57 multi-cancer diagnostic markers and 873 CSO markers were selected for model development. The binary diagnostic model achieved an area under the curve (AUC) of 0.907, 0.908 and 0.868 in the training, testing and independent validation cohorts, respectively. With a training specificity of 98%, the specificities in the testing and independent validation cohorts were 100% and 98.6%, respectively. Overall sensitivity across all cancer stages was 65.5%, 67.3% and 55.9% in the training, testing and independent validation cohorts, respectively. Early-stage (I and II) sensitivity was 50.3% and 45.7% in the training and testing cohorts, respectively. For cancer patients correctly identified by the binary classifier, the top 1 and top 2 CSO accuracies were 77.7% and 86.5% in the testing cohort (n = 148) and 76.0% and 84.0% in the independent validation cohort (n = 100). Notably, performance was maintained with only 21 diagnostic and 214 CSO markers, achieving a training AUC of 0.865, a testing AUC of 0.866, and an integrated top 2 accuracy of 83.1% in the testing cohort.
CONCLUSIONS: TOTEM demonstrates promising potential for accurate multi-cancer detection and localization by profiling plasma methylation markers. The real-world clinical performance of our approach needs to be investigated in a much larger prospective cohort.


Subject(s)
Biomarkers, Tumor , Circulating Tumor DNA , DNA Methylation , Neoplasms , Humans , Biomarkers, Tumor/blood , Biomarkers, Tumor/genetics , Neoplasms/genetics , Neoplasms/blood , Neoplasms/diagnosis , Female , Male , Circulating Tumor DNA/blood , Circulating Tumor DNA/genetics , Middle Aged , Aged , Early Detection of Cancer/methods , Case-Control Studies , Sensitivity and Specificity , Adult , Prognosis
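The fixed-specificity reporting in the abstract above (a cutoff corresponding to 98% specificity in training, then carried over to the other cohorts) can be illustrated with a minimal sketch. The function names and the toy scores are hypothetical and are not taken from the paper; the abstract does not say how the cutoff was actually chosen, so the rule below (the smallest cutoff at which the required fraction of controls is called negative) is an assumption.

```python
import math


def cutoff_at_specificity(control_scores, target_specificity):
    """Smallest cutoff at which at least `target_specificity` of the
    controls score at or below it (and are therefore called negative)."""
    if not 0.0 < target_specificity <= 1.0:
        raise ValueError("target_specificity must be in (0, 1]")
    ordered = sorted(control_scores)
    # Number of controls that must be called negative; the small epsilon
    # guards against floating-point round-up in the product.
    needed = max(1, math.ceil(target_specificity * len(ordered) - 1e-9))
    return ordered[needed - 1]


def specificity(control_scores, cutoff):
    """Fraction of controls called negative (score <= cutoff)."""
    return sum(1 for s in control_scores if s <= cutoff) / len(control_scores)


def sensitivity(case_scores, cutoff):
    """Fraction of cases called positive (score > cutoff)."""
    return sum(1 for s in case_scores if s > cutoff) / len(case_scores)


# Hypothetical toy data: the cutoff is fixed on training controls only,
# then reused unchanged on any other cohort.
train_controls = list(range(1, 11))
cutoff = cutoff_at_specificity(train_controls, 0.9)
print(cutoff, specificity(train_controls, cutoff), sensitivity([5, 9, 10, 12], cutoff))
```

Because the cutoff is frozen after training, specificity in a later cohort can land above or below the training target, which is consistent with the 100% and 98.6% figures the abstract reports for its testing and validation cohorts.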
2.
JCO Precis Oncol ; 8: e2400111, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38976830

ABSTRACT

PURPOSE: Simultaneous profiling of cell-free DNA (cfDNA) methylation and fragmentation features to improve the performance of cfDNA-based cancer detection is technically challenging. We developed a method to comprehensively analyze multimodal cfDNA genomic features for more sensitive esophageal squamous cell carcinoma (ESCC) detection. MATERIALS AND METHODS: Enzymatic conversion-mediated whole-methylome sequencing was applied to plasma cfDNA samples extracted from 168 patients with ESCC and 251 noncancer controls. ESCC characteristic cfDNA methylation, fragmentation, and copy number signatures were analyzed both across the genome and at accessible cis-regulatory DNA elements. To distinguish ESCC from noncancer samples, a first-layer classifier was developed for each feature type, the prediction results of which were incorporated to construct the second-layer ensemble model. RESULTS: ESCC plasma genome displayed global hypomethylation, altered fragmentation size, and chromosomal copy number alteration. Methylation and fragmentation changes at cancer tissue-specific accessible cis-regulatory DNA elements were also observed in ESCC plasma. By integrating multimodal genomic features for ESCC detection, the ensemble model showed improved performance over individual modalities. In the training cohort with a specificity of 99.2%, the detection sensitivity was 81.0% for all stages and 70.0% for stage 0-II. Consistent performance was observed in the test cohort with a specificity of 98.4%, an all-stage sensitivity of 79.8%, and a stage 0-II sensitivity of 69.0%. The performance of the classifier was associated with the disease stage, irrespective of clinical covariates. CONCLUSION: This study comprehensively profiles the epigenomic landscape of ESCC plasma and provides a novel noninvasive and sensitive ESCC detection approach with genome-scale multimodal analysis.


Subject(s)
Cell-Free Nucleic Acids , DNA Methylation , Esophageal Neoplasms , Esophageal Squamous Cell Carcinoma , Humans , Esophageal Neoplasms/genetics , Esophageal Neoplasms/blood , Esophageal Neoplasms/diagnosis , Male , Female , Middle Aged , Cell-Free Nucleic Acids/blood , Cell-Free Nucleic Acids/genetics , Esophageal Squamous Cell Carcinoma/genetics , Aged , Epigenome
4.
Nat Commun ; 14(1): 6042, 2023 Sep 27.
Article in English | MEDLINE | ID: mdl-37758728

ABSTRACT

Multimodal epigenetic characterization of cell-free DNA (cfDNA) could improve the performance of blood-based early cancer detection. However, integrative profiling of cfDNA methylome and fragmentome has been technologically challenging. Here, we adapt an enzyme-mediated methylation sequencing method for comprehensive analysis of genome-wide cfDNA methylation, fragmentation, and copy number alteration (CNA) characteristics for enhanced cancer detection. We apply this method to plasma samples of 497 healthy controls and 780 patients with seven cancer types and develop an ensemble classifier by incorporating methylation, fragmentation, and CNA features. In the test cohort, our approach achieves an area under the curve value of 0.966 for overall cancer detection. Detection sensitivity for early-stage patients reaches 73% at 99% specificity. Finally, we demonstrate the feasibility of accurately localizing the origin of cancer signals with combined methylation and fragmentation profiling of tissue-specific accessible chromatin regions. Overall, this proof-of-concept study provides a technical platform to utilize multimodal cfDNA features for improved cancer detection.


Subject(s)
Cell-Free Nucleic Acids , Neoplasms , Humans , Cell-Free Nucleic Acids/genetics , Epigenome , Neoplasms/diagnosis , Neoplasms/genetics , Epigenomics/methods , DNA Methylation/genetics , Biomarkers, Tumor/genetics
5.
Comput Biol Med ; 151(Pt B): 106323, 2022 Dec.
Article in English | MEDLINE | ID: mdl-36436482

ABSTRACT

Deep learning-based virtual screening methods have been shown to significantly improve the accuracy of traditional docking-based virtual screening methods. In this paper, we developed Deffini, a structure-based virtual screening neural network model. During training, Deffini learns protein-ligand docking poses to distinguish actives and decoys and then to predict whether a new ligand will bind to the protein target. Deffini outperformed Smina with an average AUC ROC of 0.92 and AUC PRC of 0.44 in 3-fold cross-validation on the benchmark dataset DUD-E. However, when tested on the maximum unbiased validation (MUV) dataset, Deffini achieved poor results with an average AUC ROC of 0.517. We used the family-specific training approach to train the model to improve the model performance and concluded that family-specific models performed better than the pan-family models. To explore the limits of the predictive power of the family-specific models, we constructed Kernie, a new protein kinase dataset consisting of 358 kinases. Deffini trained with the Kernie dataset outperformed all recent benchmarks on the MUV kinases, with an average AUC ROC of 0.745, which highlights the importance of quality datasets in improving the performance of deep neural network models and the importance of using family-specific models.


Subject(s)
Neural Networks, Computer , Proteins , Ligands , Proteins/metabolism
6.
BMC Bioinformatics ; 22(1): 23, 2021 Jan 15.
Article in English | MEDLINE | ID: mdl-33451280

ABSTRACT

BACKGROUND: Copy number alterations (CNAs), due to their large impact on the genome, have been an important contributing factor to oncogenesis and metastasis. Detecting genomic alterations from the shallow-sequencing data of a low-purity tumor sample remains a challenging task. RESULTS: We introduce Accucopy, a method to infer total copy numbers (TCNs) and allele-specific copy numbers (ASCNs) from challenging low-purity and low-coverage tumor samples. Accucopy adopts many robust statistical techniques such as kernel smoothing of coverage differentiation information to discern signals from noise and combines ideas from time-series analysis and the signal-processing field to derive a range of estimates for the period in a histogram of coverage differentiation information. Statistical learning models such as the tiered Gaussian mixture model, the expectation-maximization algorithm, and sparse Bayesian learning were customized and built into the model. Accucopy is implemented in C++/Rust, packaged in a docker image, and supports non-human samples; more at http://www.yfish.org/software/. CONCLUSIONS: We describe Accucopy, a method that can predict both TCNs and ASCNs from low-coverage low-purity tumor sequencing data. Through comparative analyses in both simulated and real-sequencing samples, we demonstrate that Accucopy is more accurate than Sclust, ABSOLUTE, and Sequenza.


Subject(s)
Algorithms , DNA Copy Number Variations , Neoplasms , Alleles , Bayes Theorem , High-Throughput Nucleotide Sequencing , Humans , Neoplasms/genetics , Software
7.
Bioinformatics ; 34(12): 2004-2011, 2018 Jun 15.
Article in English | MEDLINE | ID: mdl-29385401

ABSTRACT

Motivation: Tumor purity and ploidy have a substantial impact on next-gen sequence analyses of tumor samples and may alter the biological and clinical interpretation of results. Despite the existence of several computational methods that are dedicated to estimating tumor purity and/or ploidy from The Cancer Genome Atlas (TCGA) tumor-normal whole-genome-sequencing (WGS) data, an accurate, fast and fully-automated method that works in a wide range of sequencing coverage, level of tumor purity and level of intra-tumor heterogeneity is still missing. Results: We describe a computational method called Accurity that infers tumor purity, tumor cell ploidy and absolute allelic copy numbers for somatic copy number alterations (SCNAs) from tumor-normal WGS data by jointly modelling SCNAs and heterozygous germline single-nucleotide-variants (HGSNVs). Results from both in silico and real sequencing data demonstrated that Accurity is highly accurate and robust, even in low-purity, high-ploidy and low-coverage settings in which several existing methods perform poorly. Accounting for tumor purity and ploidy, Accurity significantly increased signal/noise gaps between different copy numbers. We are hopeful that Accurity will be of clinical use for identifying cancer diagnostic biomarkers. Availability and implementation: Accurity is implemented in C++/Rust, available at http://www.yfish.org/software/. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
DNA Copy Number Variations , Neoplasms/genetics , Ploidies , Software , Whole Genome Sequencing/methods , Algorithms , Computational Biology/methods , Computer Simulation , Germ-Line Mutation , High-Throughput Nucleotide Sequencing/methods , Humans
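Why purity and ploidy matter for separating copy-number states, as the abstract above notes, can be seen from the generic mixture relationship commonly used for tumor-normal read depth. The sketch below assumes a diploid normal contaminant and that standard textbook model; it is not Accurity's own algorithm, which the abstract does not spell out, and the function name is illustrative.

```python
def expected_depth_ratio(copy_number, purity, ploidy):
    """Expected tumor/normal read-depth ratio for a segment whose tumor
    cells carry `copy_number` copies, in a sample where a fraction
    `purity` of cells are tumor cells with average copy number `ploidy`
    and the remaining cells are diploid normal cells."""
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    # Average DNA content per cell at this segment and across the genome.
    segment_content = purity * copy_number + 2.0 * (1.0 - purity)
    genome_content = purity * ploidy + 2.0 * (1.0 - purity)
    return segment_content / genome_content


# The gap between adjacent copy numbers shrinks as purity falls.
for purity in (1.0, 0.5, 0.1):
    gap = (expected_depth_ratio(3, purity, 2.0)
           - expected_depth_ratio(2, purity, 2.0))
    print(purity, gap)
```

Under this model the spacing between neighbouring copy-number levels is purity divided by the genome-wide average content, so low-purity or high-ploidy samples compress the levels together. That compression is the regime the abstract describes as difficult for existing methods.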
8.
Nat Genet ; 49(12): 1714-1721, 2017 Dec.
Article in English | MEDLINE | ID: mdl-29083405

ABSTRACT

By analyzing multitissue gene expression and genome-wide genetic variation data in samples from a vervet monkey pedigree, we generated a transcriptome resource and produced the first catalog of expression quantitative trait loci (eQTLs) in a nonhuman primate model. This catalog contains more genome-wide significant eQTLs per sample than comparable human resources and identifies sex- and age-related expression patterns. Findings include a master regulatory locus that likely has a role in immune function and a locus regulating hippocampal long noncoding RNAs (lncRNAs), whose expression correlates with hippocampal volume. This resource will facilitate genetic investigation of quantitative traits, including brain and behavioral phenotypes relevant to neuropsychiatric disorders.


Subject(s)
Chlorocebus aethiops/genetics , Gene Expression Profiling , Genetic Variation , Quantitative Trait Loci/genetics , Animals , Brain/growth & development , Brain/metabolism , Chlorocebus aethiops/growth & development , Genome-Wide Association Study , Genotype , Humans , Phenotype , Polymorphism, Single Nucleotide
9.
BMC Biol ; 13: 41, 2015 Jun 20.
Article in English | MEDLINE | ID: mdl-26092298

ABSTRACT

BACKGROUND: We report here the first genome-wide high-resolution polymorphism resource for non-human primate (NHP) association and linkage studies, constructed for the Caribbean-origin vervet monkey, or African green monkey (Chlorocebus aethiops sabaeus), one of the most widely used NHPs in biomedical research. We generated this resource by whole genome sequencing (WGS) of monkeys from the Vervet Research Colony (VRC), an NIH-supported research resource for which extensive phenotypic data are available. RESULTS: We identified genome-wide single nucleotide polymorphisms (SNPs) by WGS of 721 members of an extended pedigree from the VRC. From high-depth WGS data we identified more than 4 million polymorphic unequivocal segregating sites; by pruning these SNPs based on heterozygosity, quality control filters, and the degree of linkage disequilibrium (LD) between SNPs, we constructed genome-wide panels suitable for genetic association (about 500,000 SNPs) and linkage analysis (about 150,000 SNPs). To further enhance the utility of these resources for linkage analysis, we used a further pruned subset of the linkage panel to generate multipoint identity by descent matrices. CONCLUSIONS: The genetic and phenotypic resources now available for the VRC and other Caribbean-origin vervets enable their use for genetic investigation of traits relevant to human diseases.


Subject(s)
Chlorocebus aethiops/genetics , Linkage Disequilibrium , Polymorphism, Single Nucleotide , Animals , Chromosome Mapping , Female , Genome-Wide Association Study , Genotype , Humans , Male , Microsatellite Repeats , Phenotype , Quantitative Trait Loci , Sequence Analysis
10.
PLoS Genet ; 8(9): e1002923, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22969436

ABSTRACT

Understanding the mechanism of cadmium (Cd) accumulation in plants is important to help reduce its potential toxicity to both plants and humans through dietary and environmental exposure. Here, we report on a study to uncover the genetic basis underlying natural variation in Cd accumulation in a world-wide collection of 349 wild collected Arabidopsis thaliana accessions. We identified a 4-fold variation (0.5-2 µg Cd g(-1) dry weight) in leaf Cd accumulation when these accessions were grown in a controlled common garden. By combining genome-wide association mapping, linkage mapping in an experimental F2 population, and transgenic complementation, we reveal that HMA3 is the sole major locus responsible for the variation in leaf Cd accumulation we observe in this diverse population of A. thaliana accessions. Analysis of the predicted amino acid sequence of HMA3 from 149 A. thaliana accessions reveals the existence of 10 major natural protein haplotypes. Association of these haplotypes with leaf Cd accumulation and genetics complementation experiments indicate that 5 of these haplotypes are active and 5 are inactive, and that elevated leaf Cd accumulation is associated with the reduced function of HMA3 caused by a nonsense mutation and polymorphisms that change two specific amino acids.


Subject(s)
Adenosine Triphosphatases/metabolism , Arabidopsis Proteins/metabolism , Arabidopsis/metabolism , Plant Leaves/metabolism , Adenosine Triphosphatases/genetics , Arabidopsis/genetics , Arabidopsis Proteins/genetics , Cadmium , Genome-Wide Association Study , Plant Roots/metabolism , Plant Shoots/metabolism , Polymorphism, Single Nucleotide , Quantitative Trait Loci
11.
Nat Genet ; 44(2): 212-6, 2012 Jan 08.
Article in English | MEDLINE | ID: mdl-22231484

ABSTRACT

Arabidopsis thaliana is native to Eurasia and is naturalized across the world. Its ability to be easily propagated and its high phenotypic variability make it an ideal model system for functional, ecological and evolutionary genetics. To date, analyses of the natural genetic variation of A. thaliana have involved small numbers of individual plants or genetic markers. Here we genotype 1,307 worldwide accessions, including several regional samples, using a 250K SNP chip. This allowed us to produce a high-resolution description of the global pattern of genetic variation. We applied three complementary selection tests and identified new targets of selection. Further, we characterized the pattern of historical recombination in A. thaliana and observed an enrichment of hotspots in its intergenic regions and repetitive DNA, which is consistent with the pattern that is observed for humans but which is strikingly different from that observed in other plant species. We have made the seeds we used to produce this Regional Mapping (RegMap) panel publicly available. This panel comprises one of the largest genomic mapping resources currently available for global natural isolates of a non-human species.


Subject(s)
Arabidopsis/genetics , Genetic Variation , Genome, Plant , Chromosome Mapping , Genotype , Geography , Polymorphism, Single Nucleotide , Recombination, Genetic , Selection, Genetic
12.
Plant Cell ; 24(12): 4793-805, 2012 Dec.
Article in English | MEDLINE | ID: mdl-23277364

ABSTRACT

Arabidopsis thaliana is an important model organism for understanding the genetics and molecular biology of plants. Its highly selfing nature, small size, short generation time, small genome size, and wide geographic distribution make it an ideal model organism for understanding natural variation. Genome-wide association studies (GWAS) have proven a useful technique for identifying genetic loci responsible for natural variation in A. thaliana. Previously genotyped accessions (natural inbred lines) can be grown in replicate under different conditions and phenotyped for different traits. These important features greatly simplify association mapping of traits and allow for systematic dissection of the genetics of natural variation by the entire A. thaliana community. To facilitate this, we present GWAPP, an interactive Web-based application for conducting GWAS in A. thaliana. Using an efficient implementation of a linear mixed model, traits measured for a subset of 1386 publicly available ecotypes can be uploaded and mapped with a mixed model and other methods in just a couple of minutes. GWAPP features an extensive, interactive, and user-friendly interface that includes interactive Manhattan plots and linkage disequilibrium plots. It also facilitates exploratory data analysis by implementing features such as the inclusion of candidate polymorphisms in the model as cofactors.


Subject(s)
Arabidopsis/genetics , Genome-Wide Association Study/methods , Internet , Linkage Disequilibrium/genetics , Software , User-Computer Interface
13.
Database (Oxford) ; 2011: bar014, 2011.
Article in English | MEDLINE | ID: mdl-21609965

ABSTRACT

With large-scale genomic data becoming the norm in biological studies, the storing, integrating, viewing and searching of such data have become a major challenge. In this article, we describe the development of an Arabidopsis thaliana database that hosts the geographic information and genetic polymorphism data for over 6000 accessions and genome-wide association study (GWAS) results for 107 phenotypes representing the largest collection of Arabidopsis polymorphism data and GWAS results to date. Taking advantage of a series of the latest web 2.0 technologies, such as Ajax (Asynchronous JavaScript and XML), GWT (Google-Web-Toolkit), MVC (Model-View-Controller) web framework and Object Relationship Mapper, we have created a web-based application (web app) for the database, that offers an integrated and dynamic view of geographic information, genetic polymorphism and GWAS results. Essential search functionalities are incorporated into the web app to aid reverse genetics research. The database and its web app have proven to be a valuable resource to the Arabidopsis community. The whole framework serves as an example of how biological data, especially GWAS, can be presented and accessed through the web. In the end, we illustrate the potential to gain new insights through the web app by two examples, showcasing how it can be used to facilitate forward and reverse genetics research. Database URL: http://arabidopsis.usc.edu/


Subject(s)
Arabidopsis/genetics , Computational Biology/methods , Genome, Plant/genetics , Genome-Wide Association Study/methods , Internet , Alleles , Databases, Genetic , Genotype , Geography , Phenotype , Polymorphism, Single Nucleotide/genetics , Principal Component Analysis
14.
PLoS Genet ; 6(11): e1001193, 2010 Nov 11.
Article in English | MEDLINE | ID: mdl-21085628

ABSTRACT

The genetic model plant Arabidopsis thaliana, like many plant species, experiences a range of edaphic conditions across its natural habitat. Such heterogeneity may drive local adaptation, though the molecular genetic basis remains elusive. Here, we describe a study in which we used genome-wide association mapping, genetic complementation, and gene expression studies to identify cis-regulatory expression level polymorphisms at the AtHKT1;1 locus, encoding a known sodium (Na(+)) transporter, as being a major factor controlling natural variation in leaf Na(+) accumulation capacity across the global A. thaliana population. A weak allele of AtHKT1;1 that drives elevated leaf Na(+) in this population has been previously linked to elevated salinity tolerance. Inspection of the geographical distribution of this allele revealed its significant enrichment in populations associated with the coast and saline soils in Europe. The fixation of this weak AtHKT1;1 allele in these populations is genetic evidence supporting local adaptation to these potentially saline impacted environments.


Subject(s)
Arabidopsis Proteins/genetics , Arabidopsis Proteins/metabolism , Arabidopsis/genetics , Arabidopsis/metabolism , Cation Transport Proteins/genetics , Cation Transport Proteins/metabolism , Ecosystem , Genetic Variation , Seawater , Sodium/metabolism , Symporters/genetics , Symporters/metabolism , Alleles , Arabidopsis/growth & development , Gene Expression Regulation, Plant , Genetic Complementation Test , Genome, Plant/genetics , Genome-Wide Association Study , Geography , Plant Leaves/genetics , Plant Leaves/metabolism
15.
Proc Natl Acad Sci U S A ; 107(22): 10302-7, 2010 Jun 01.
Article in English | MEDLINE | ID: mdl-20479233

ABSTRACT

The model plant Arabidopsis thaliana exhibits extensive natural variation in resistance to parasites. Immunity is often conferred by resistance (R) genes that permit recognition of specific races of a disease. The number of such R genes and their distribution are poorly understood. In this study, we investigated the basis for resistance to the downy mildew agent Hyaloperonospora arabidopsidis ex parasitica (Hpa) in a global sample of A. thaliana. We implemented a combined genome-wide mapping of resistance using populations of recombinant inbred lines and a collection of wild A. thaliana accessions. We tested the interaction between 96 host genotypes collected worldwide and five strains of Hpa. Then, a fraction of the species-wide resistance was genetically dissected using six recently constructed populations of recombinant inbred lines. We found that resistance is usually governed by single dominant R genes that are concentrated in four genomic regions only. We show that association genetics of resistance to diseases such as downy mildew enables increased mapping resolution from quantitative trait loci interval to candidate gene level. Association patterns in quantitative trait loci intervals indicate that the pool of A. thaliana resistance sources against the tested Hpa isolates may be predominantly confined to six RPP (Resistance to Hpa) loci isolated in previous studies. Our results suggest that combining association and linkage mapping could accelerate resistance gene discovery in plants.


Subject(s)
Arabidopsis/genetics , Arabidopsis/microbiology , Genome, Plant , Oomycetes/pathogenicity , Plant Diseases/genetics , Plant Diseases/microbiology , Chromosome Mapping , Genetic Variation , Genome-Wide Association Study , Quantitative Trait Loci
16.
Nature ; 465(7298): 627-31, 2010 Jun 03.
Article in English | MEDLINE | ID: mdl-20336072

ABSTRACT

Although pioneered by human geneticists as a potential solution to the challenging problem of finding the genetic basis of common human diseases, genome-wide association (GWA) studies have, owing to advances in genotyping and sequencing technology, become an obvious general approach for studying the genetics of natural variation and traits of agricultural importance. They are particularly useful when inbred lines are available, because once these lines have been genotyped they can be phenotyped multiple times, making it possible (as well as extremely cost effective) to study many different traits in many different environments, while replicating the phenotypic measurements to reduce environmental noise. Here we demonstrate the power of this approach by carrying out a GWA study of 107 phenotypes in Arabidopsis thaliana, a widely distributed, predominantly self-fertilizing model plant known to harbour considerable genetic variation for many adaptively important traits. Our results are dramatically different from those of human GWA studies, in that we identify many common alleles of major effect, but they are also, in many cases, harder to interpret because confounding by complex genetics and population structure make it difficult to distinguish true associations from false. However, a-priori candidates are significantly over-represented among these associations as well, making many of them excellent candidates for follow-up experiments. Our study demonstrates the feasibility of GWA studies in A. thaliana and suggests that the approach will be appropriate for many other organisms.


Subject(s)
Arabidopsis/classification , Arabidopsis/genetics , Genome, Plant/genetics , Genome-Wide Association Study , Phenotype , Alleles , Arabidopsis Proteins/genetics , Flowers/genetics , Genes, Plant/genetics , Genetic Loci/genetics , Genotype , Immunity, Innate/genetics , Inbreeding , Polymorphism, Single Nucleotide/genetics
17.
PLoS Genet ; 6(2): e1000843, 2010 Feb 12.
Article in English | MEDLINE | ID: mdl-20169178

ABSTRACT

The population structure of an organism reflects its evolutionary history and influences its evolutionary trajectory. It constrains the combination of genetic diversity and reveals patterns of past gene flow. Understanding it is a prerequisite for detecting genomic regions under selection, predicting the effect of population disturbances, or modeling gene flow. This paper examines the detailed global population structure of Arabidopsis thaliana. Using a set of 5,707 plants collected from around the globe and genotyped at 149 SNPs, we show that while A. thaliana as a species self-fertilizes 97% of the time, there is considerable variation among local groups. This level of outcrossing greatly limits observed heterozygosity but is sufficient to generate considerable local haplotypic diversity. We also find that in its native Eurasian range A. thaliana exhibits continuous isolation by distance at every geographic scale without natural breaks corresponding to classical notions of populations. By contrast, in North America, where it exists as an exotic species, A. thaliana exhibits little or no population structure at a continental scale but local isolation by distance that extends hundreds of km. This suggests a pattern for the development of isolation by distance that can establish itself shortly after an organism fills a new habitat range. It also raises questions about the general applicability of many standard population genetics models. Any model based on discrete clusters of interchangeable individuals will be an uneasy fit to organisms like A. thaliana which exhibit continuous isolation by distance on many scales.


Subject(s)
Arabidopsis/genetics , Alleles , Crosses, Genetic , Geography , Haplotypes/genetics , Heterozygote , Inbreeding , Population Dynamics
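The connection between the 97% selfing rate and the limited heterozygosity reported in the abstract above follows from a standard population-genetics result: under partial selfing at rate s, the equilibrium inbreeding coefficient is F = s / (2 - s), and expected heterozygosity falls to a fraction 1 - F of its random-mating value. The sketch below assumes that textbook equilibrium model; the paper's own estimation procedure is not described in the abstract, and the function names are illustrative.

```python
def equilibrium_inbreeding(selfing_rate):
    """Equilibrium inbreeding coefficient F = s / (2 - s) under a mixed
    mating model with selfing rate s."""
    if not 0.0 <= selfing_rate <= 1.0:
        raise ValueError("selfing_rate must be in [0, 1]")
    return selfing_rate / (2.0 - selfing_rate)


def heterozygosity_fraction(selfing_rate):
    """Expected heterozygosity relative to random mating, 1 - F."""
    return 1.0 - equilibrium_inbreeding(selfing_rate)


s = 0.97  # selfing rate quoted in the abstract
print(round(equilibrium_inbreeding(s), 3), round(heterozygosity_fraction(s), 3))
```

With s = 0.97 this gives F of roughly 0.94, so only about 6% of the heterozygosity expected under random mating remains, even though 3% of matings are outcrosses.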