Results 1 - 20 of 33
1.
Am J Hum Genet ; 108(4): 656-668, 2021 04 01.
Article in English | MEDLINE | ID: mdl-33770507

ABSTRACT

Genetic studies in underrepresented populations identify disproportionate numbers of novel associations. However, most genetic studies use genotyping arrays and sequenced reference panels that best capture variation most common in European ancestry populations. To compare data generation strategies best suited for underrepresented populations, we sequenced the whole genomes of 91 individuals to high coverage as part of the Neuropsychiatric Genetics of African Population-Psychosis (NeuroGAP-Psychosis) study with participants from Ethiopia, Kenya, South Africa, and Uganda. We used a downsampling approach to evaluate the quality of two cost-effective data generation strategies, GWAS arrays versus low-coverage sequencing, by calculating the concordance of imputed variants from these technologies with those from deep whole-genome sequencing data. We show that low-coverage sequencing at a depth of ≥4× captures variants of all frequencies more accurately than all commonly used GWAS arrays investigated and at a comparable cost. Lower depths of sequencing (0.5-1×) performed comparably to commonly used low-density GWAS arrays. Low-coverage sequencing is also sensitive to novel variation; 4× sequencing detects 45% of singletons and 95% of common variants identified in high-coverage African whole genomes. Low-coverage sequencing approaches surmount the problems induced by the ascertainment of common genotyping arrays, effectively identify novel variation particularly in underrepresented populations, and present opportunities to enhance variant discovery at a cost similar to traditional approaches.
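
The concordance comparison described in this abstract can be illustrated with a minimal sketch. The abstract does not state which concordance metric was used, so the non-reference concordance below is an assumption, and the function name and toy genotype vectors (coded as 0, 1, or 2 alternate alleles) are hypothetical.

```python
import math


def non_reference_concordance(truth, imputed):
    """Share of matching genotypes among sites where either call is non-reference.

    Genotypes are coded as alternate-allele counts (0, 1, 2). Sites where both
    calls are homozygous reference (0) are ignored so that they do not inflate
    the score. This metric is an assumed choice, not taken from the abstract.
    """
    if len(truth) != len(imputed):
        raise ValueError("genotype vectors must have the same length")
    informative = [(t, i) for t, i in zip(truth, imputed) if t != 0 or i != 0]
    if not informative:
        return math.nan
    matches = sum(1 for t, i in informative if t == i)
    return matches / len(informative)


# Hypothetical toy data: deep-WGS calls versus imputed calls at six sites.
deep_wgs = [0, 1, 2, 0, 1, 0]
imputed = [0, 1, 1, 0, 1, 1]
print(non_reference_concordance(deep_wgs, imputed))
```

In practice such a score would usually be reported separately per allele-frequency bin, since the abstract compares accuracy across variants of all frequencies.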


Subject(s)
DNA Mutational Analysis/economics , DNA Mutational Analysis/standards , Genetic Variation/genetics , Genetics, Population/economics , Africa , DNA Mutational Analysis/methods , Genetics, Population/methods , Genome, Human/genetics , Genome-Wide Association Study , Health Equity , Humans , Microbiota , Whole Genome Sequencing/economics , Whole Genome Sequencing/standards
2.
Mol Ecol ; 32(11): 2818-2834, 2023 06.
Article in English | MEDLINE | ID: mdl-36811385

ABSTRACT

The distribution of ecotypic variation in natural populations is influenced by neutral and adaptive evolutionary forces that are challenging to disentangle. This study provides a high-resolution portrait of genomic variation in Chinook salmon (Oncorhynchus tshawytscha) with emphasis on a region of major effect for ecotypic variation in migration timing. With a filtered data set of ~13 million single nucleotide polymorphisms (SNPs) from low-coverage whole genome resequencing of 53 populations (3566 barcoded individuals), we contrasted patterns of genomic structure within and among major lineages and examined the extent of a selective sweep at a major effect region underlying migration timing (GREB1L/ROCK1). Neutral variation provided support for fine-scale structure of populations, while allele frequency variation in GREB1L/ROCK1 was highly correlated with mean return timing for early and late migrating populations within each of the lineages (r² = .58-.95; p < .001). However, the extent of selection within the genomic region controlling migration timing was much narrower in one lineage (interior stream-type) compared to the other two major lineages, which corresponded to the breadth of phenotypic variation in migration timing observed among lineages. Evidence of a duplicated block within GREB1L/ROCK1 may be responsible for reduced recombination in this portion of the genome and contributes to phenotypic variation within and across lineages. Lastly, SNP positions across GREB1L/ROCK1 were assessed for their utility in discriminating migration timing among lineages, and we recommend multiple markers nearest the duplication to provide highest accuracy in conservation applications such as those that aim to protect early migrating Chinook salmon. These results highlight the need to investigate variation throughout the genome and the effects of structural variants on ecologically relevant phenotypic variation in natural species.


Subject(s)
Genetic Variation , Salmon , Humans , Animals , Genetic Variation/genetics , Alleles , Salmon/genetics , Gene Frequency/genetics , Genomics , rho-Associated Kinases/genetics
3.
J Dairy Sci ; 105(4): 3355-3366, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35151474

ABSTRACT

Low-coverage sequencing (LCS) followed by imputation has been proposed as a cost-effective genotyping approach for obtaining genotypes of whole-genome variants. Imputation performance is essential for the effectiveness of this approach. Several imputation methods have been proposed and successfully applied in genomic studies in human and other species. However, there are few reports on the performance of these methods in livestock. Here, we evaluated a variety of imputation methods, including Beagle v4.1, GeneImp v1.3, GLIMPSE v1.1.0, QUILT v1.0.0, Reveel, and STITCH v1.6.5, with varying sequencing depth, sample size, and reference panel size using LCS data of Holstein cattle. We found that all of these methods, except Reveel, performed well in most cases with an imputation accuracy over 0.9; on the whole, GLIMPSE, QUILT, and STITCH performed better than the other methods. For species with no reference panel available, STITCH followed by Beagle would be an optimal strategy, whereas for species with reference panel available, QUILT would be the method of choice. Overall, this study illustrated the promising potential of LCS for genomic analysis in livestock.


Subject(s)
High-Throughput Nucleotide Sequencing , Polymorphism, Single Nucleotide , Animals , Cattle/genetics , Genomics/methods , Genotype , High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/veterinary , Sequence Analysis, DNA/methods , Sequence Analysis, DNA/veterinary
4.
Fish Res ; 249: 106231, 2022 May.
Article in English | MEDLINE | ID: mdl-36798657

ABSTRACT

The Atlantic herring Clupea harengus L has a vast geographical distribution and a complex population structure with a few very large migratory units and many small local populations. Each population has its own spawning ground and/or time, thereby maintaining their genetic integrity. Several herring populations migrate between common feeding grounds and over-wintering areas resulting in frequent mixing of populations. Thus, many herring fisheries are based on mixed populations of different demographic status. In order to avoid over-exploitation of weak populations and to conserve biodiversity, understanding the population structure and population mixing is important for maintaining biologically sustainable herring fisheries. The aim of this study was to investigate the genetic population structure of herring in the Faroese and surrounding waters, and to develop genetic markers for distinguishing between four herring management units (often called stocks), namely the Norwegian spring-spawning herring (NSSH), Icelandic summer-spawning herring (ISSH), North Sea autumn-spawning herring (NSAH), and Faroese autumn-spawning herring (FASH). Herring from the four stocks were sequenced at low coverage, and single nucleotide polymorphisms (SNPs) were called and used for population structure analysis and individual assignment. An ancestry-informative SNP panel with 118 SNPs was developed and tested on 240 individuals. The results showed that all four stocks appeared to be genetically differentiated populations, but at lower levels of differentiation between FASH and ISSH than the other two populations. Overall assignment rate with the SNP panel was 80.7%, and agreement between the genetic and traditional visual assignment was 75.5%. The NSAH and NSSH samples had the highest assignment rate (100% and 98.3%, respectively) and highest agreement between traditional and genetic assignment methods (96.6% and 94.9%, respectively). 
The FASH and ISSH samples had substantially lower assignment rates (72.9% and 51.7%, respectively) and agreement between traditional and genetic methods (39.5% and 48.4%, respectively).

5.
Reprod Biol Endocrinol ; 19(1): 58, 2021 Apr 20.
Article in English | MEDLINE | ID: mdl-33879178

ABSTRACT

BACKGROUND: Preimplantation genetic testing for chromosomal structural rearrangements (PGT-SR) is widely applied in couples with a single reciprocal translocation to increase the chance of a healthy live birth. However, little is known about PGT-SR when both parents have a reciprocal translocation. Here, we present, for the first time, a rare instance of PGT-SR for a non-consanguineous couple in which both parents carried an independent balanced reciprocal translocation, and we show how relevant genetic counseling data can be generated. METHODS: The precise translocation breakpoints were identified by whole genome low-coverage sequencing (WGLCS) and Sanger sequencing. Next-generation sequencing (NGS) combined with breakpoint-specific polymerase chain reaction (PCR) was used to determine the 24-chromosome status and the carrier status of the euploid embryos. RESULTS: Surprisingly, 2 out of 3 day-5 blastocysts were found to be balanced for the maternal reciprocal translocation while being normal for the paternal translocation, and thus transferable. The transferable embryo rate was significantly higher than would be expected theoretically. Transfer of one balanced embryo resulted in the birth of a healthy boy. CONCLUSION(S): Our PGT-SR data, together with a systematic review of the literature, should help in providing couples carrying two different reciprocal translocations and undergoing PGT-SR with more appropriate genetic counseling.


Subject(s)
Infertility/therapy , Preimplantation Diagnosis , Translocation, Genetic , Adult , Embryo Transfer , Family Characteristics , Female , Fertilization in Vitro , Genetic Testing , High-Throughput Nucleotide Sequencing , Humans , Infant, Newborn , Infertility/diagnosis , Infertility/genetics , Live Birth , Male , Parturition , Pedigree , Pregnancy , Treatment Outcome
6.
Genet Med ; 21(6): 1390-1399, 2019 06.
Article in English | MEDLINE | ID: mdl-30449887

ABSTRACT

PURPOSE: To develop an economical, user-friendly, and accurate all-in-one next-generation sequencing (NGS)-based workflow for single-cell gene variant detection combined with comprehensive chromosome screening in a 24-hour workflow protocol. METHODS: We subjected single lymphoblast cells or blastomere/blastocyst biopsies from four different families to low coverage (0.3×-1.4×) genome sequencing. We combined copy-number variant (CNV) detection and whole-genome haplotype phase prediction via Haploseek, a novel, user-friendly analysis pipeline. We validated haplotype predictions for each sample by comparing with clinical preimplantation genetic diagnosis (PGD) case results or by single-nucleotide polymorphism (SNP) microarray analysis of bulk DNA from each respective lymphoblast culture donor. CNV predictions were validated by established commercial kits for single-cell CNV prediction. RESULTS: Haplotype phasing of the single lymphoblast/embryo biopsy sequencing data was highly concordant with relevant ground truth haplotypes in all samples/biopsies from all four families. In addition, whole-genome copy-number assessments were concordant with the results of a commercial kit. CONCLUSION: Our results demonstrate the establishment of a reliable method for all-in-one molecular and chromosomal diagnosis of single cells. Important features of the Haploseek pipeline include rapid sample processing, rapid sequencing, streamlined analysis, and user-friendly reporting, so as to expedite clinical PGD implementation.


Subject(s)
Genetic Testing/methods , Haplotypes/genetics , Preimplantation Diagnosis/methods , Aneuploidy , Biopsy , Blastocyst , Chromosomes , DNA Copy Number Variations/genetics , Female , Fertilization in Vitro , Genetic Diseases, Inborn/diagnosis , Genetic Diseases, Inborn/genetics , High-Throughput Nucleotide Sequencing/methods , Humans , Pregnancy
7.
Br J Biomed Sci ; 75(3): 133-138, 2018 Jul.
Article in English | MEDLINE | ID: mdl-29968522

ABSTRACT

BACKGROUND: Non-invasive prenatal screening (NIPS) using cell-free foetal DNA (cfDNA) has been widely used for identifying common foetal aneuploidies (e.g. trisomy 21 (T21), trisomy 18 (T18) and trisomy 13 (T13)) in clinical practice. The sensitivity and specificity of NIPS exceed 99%, but the positive predictive value (PPV) is approximately 70% (combined T21, T18 and T13). Thus, some 30% of pregnant women who have positive NIPS results are eventually identified as normal by amniocentesis. These women therefore must undertake needless invasive tests and risk miscarrying healthy babies because of false positive NIPS results. METHODS: In order to achieve higher accuracy, we amended the standard NIPS (s-NIPS) protocol with an additional cfDNA size selection step in agarose electrophoresis. The advantage of the new method (named e-NIPS) was validated by comparing the results of e-NIPS and s-NIPS using 114 retrospective cases selected from 15,930 cases. RESULTS: Our results showed that the foetal cfDNA fraction can be enriched significantly by a size selection step. With this modification, all 98 negative cases and 9 of 11 false positive cases of s-NIPS were correctly identified by e-NIPS, resulting in an increased PPV from 71% to 77%. Additionally, a simulation test showed that e-NIPS is more reliable than s-NIPS, especially when the foetal cfDNA concentration and sequencing coverage are low. CONCLUSION: cfDNA size selection is an important step in improving the accuracy of non-invasive prenatal screening for chromosomal abnormalities.


Subject(s)
Cell-Free Nucleic Acids/genetics , Down Syndrome/genetics , Prenatal Diagnosis/methods , Trisomy/genetics , Adult , Aneuploidy , Cell-Free Nucleic Acids/isolation & purification , Down Syndrome/blood , Down Syndrome/pathology , Female , Fetal Development/genetics , Fetus/pathology , Humans , Pregnancy , Retrospective Studies
8.
Mol Ecol ; 26(20): 5369-5406, 2017 Oct.
Article in English | MEDLINE | ID: mdl-28746784

ABSTRACT

Whole-genome resequencing (WGR) is a powerful method for addressing fundamental evolutionary biology questions that have not been fully resolved using traditional methods. WGR includes four approaches: the sequencing of individuals to a high depth of coverage with either unresolved or resolved haplotypes, the sequencing of population genomes to a high depth by mixing equimolar amounts of unlabelled-individual DNA (Pool-seq) and the sequencing of multiple individuals from a population to a low depth (lcWGR). These techniques require the availability of a reference genome. This, along with the still high cost of shotgun sequencing and the large demand for computing resources and storage, has limited their implementation in nonmodel species with scarce genomic resources and in fields such as conservation biology. Our goal here is to describe the various WGR methods, their pros and cons and potential applications in conservation biology. WGR offers an unprecedented marker density and surveys a wide diversity of genetic variations not limited to single nucleotide polymorphisms (e.g., structural variants and mutations in regulatory elements), increasing their power for the detection of signatures of selection and local adaptation as well as for the identification of the genetic basis of phenotypic traits and diseases. Currently, though, no single WGR approach fulfils all requirements of conservation genetics, and each method has its own limitations and sources of potential bias. We discuss proposed ways to minimize such biases. We envision a not distant future where the analysis of whole genomes becomes a routine task in many nonmodel species and fields including conservation biology.


Subject(s)
Conservation of Natural Resources/methods , Genetics, Population , Genomics/methods , Biological Evolution , Gene Frequency , Genomic Library , Genotype , Haplotypes , High-Throughput Nucleotide Sequencing , Phenotype , Polymorphism, Single Nucleotide , Population Density , Sequence Analysis, DNA/methods
9.
Am J Med Genet B Neuropsychiatr Genet ; 174(4): 435-450, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28436151

ABSTRACT

EEG alpha activity is the dominant oscillation in most adult humans, is highly heritable, and has been associated with a number of cognitive functions. Two EEG phenotypes, low- and high-voltage alpha (LVA & HVA), have been demonstrated to have high heritabilities. Their prevalence differs depending on a population's ancestral origins. In the present study we assessed the influence of ancestry admixture on EEG alpha power, and conducted a whole genome sequencing association analysis and an ancestry-informed polygenic study on those phenotypes in a Native American (NA) population that has a high prevalence of LVA. Seven common variants, in LD with each other upstream of the gene ASIC2, reached genome-wide significance (p = 2 × 10⁻⁸), having a positive association with alpha voltage. They had lower minor allele frequencies in the NAs than in a global population sample. Overall correlations between lower degrees of NA (higher degrees of European) ancestry and HVA, and between higher degrees of NA ancestry and LVA, were also found. Additionally, a rare-variant gene-based study identified the gene TIA1 as being negatively associated with LVA. Approximately 3% of SNPs exhibited a 15-fold enrichment that explained nearly half of the total SNP-heritability for EEG alpha. These regions showed the most significant anti-correlations between NA ancestry and alpha voltage, and were enriched for genes and pathways mediating cognitive functions. Our findings suggested that these regions likely harbor causal variants for HVA, and that the lack of such variants could explain the high prevalence of LVA in this NA population, possibly illuminating the ancestral origin and genetic basis for EEG alpha.


Subject(s)
Alpha Rhythm/genetics , Biomarkers/analysis , Electroencephalography , Genome-Wide Association Study , Indians, North American/genetics , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide , Acid Sensing Ion Channels/genetics , Adolescent , Adult , Aged , Aged, 80 and over , Female , Gene Frequency , Genetics, Population , Humans , Male , Middle Aged , Phenotype , Prognosis , T-Cell Intracellular Antigen-1/genetics , Young Adult
10.
BMC Med ; 14(1): 126, 2016 08 24.
Article in English | MEDLINE | ID: mdl-27558279

ABSTRACT

BACKGROUND: Non-invasive prenatal testing (NIPT) identifies fetal aneuploidy by sequencing cell-free DNA in the maternal plasma. Pre-symptomatic maternal malignancies have been incidentally detected during NIPT based on abnormal genomic profiles. This low coverage sequencing approach could have potential for ovarian cancer screening in the non-pregnant population. Our objective was to investigate whether plasma DNA sequencing with a clinical whole genome NIPT platform can detect early- and late-stage high-grade serous ovarian carcinomas (HGSOC). METHODS: This is a case control study of prospectively-collected biobank samples comprising preoperative plasma from 32 women with HGSOC (16 'early cancer' (FIGO I-II) and 16 'advanced cancer' (FIGO III-IV)) and 32 benign controls. Plasma DNA from cases and controls were sequenced using a commercial NIPT platform and chromosome dosage measured. Sequencing data were blindly analyzed with two methods: (1) Subchromosomal changes were called using an open source algorithm WISECONDOR (WIthin-SamplE COpy Number aberration DetectOR). Genomic gains or losses ≥ 15 Mb were prespecified as "screen positive" calls, and mapped to recurrent copy number variations reported in an ovarian cancer genome atlas. (2) Selected whole chromosome gains or losses were reported using the routine NIPT pipeline for fetal aneuploidy. RESULTS: We detected 13/32 cancer cases using the subchromosomal analysis (sensitivity 40.6 %, 95 % CI, 23.7-59.4 %), including 6/16 early and 7/16 advanced HGSOC cases. Two of 32 benign controls had subchromosomal gains ≥ 15 Mb (specificity 93.8 %, 95 % CI, 79.2-99.2 %). Twelve of the 13 true positive cancer cases exhibited specific recurrent changes reported in HGSOC tumors. The NIPT pipeline resulted in one "monosomy 18" call from the cancer group, and two "monosomy X" calls in the controls. 
CONCLUSIONS: Low coverage plasma DNA sequencing used for prenatal testing detected 40.6 % of all HGSOC, including 38 % of early stage cases. Our findings demonstrate the potential of a high throughput sequencing platform to screen for early HGSOC in plasma based on characteristic multiple segmental chromosome gains and losses. The performance of this approach may be further improved by refining bioinformatics algorithms and targeting selected cancer copy number variations.
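
The headline rates in this abstract follow directly from the reported counts (13 of 32 cancers detected, 2 of 32 benign controls called positive, 6 of 16 early-stage cases detected). The short sketch below recomputes the point estimates only; the variable names are hypothetical, and the confidence intervals quoted in the abstract are not reproduced here.

```python
# Counts as reported in the abstract.
cancer_cases = 32
detected_cancers = 13          # screen-positive cancer cases (true positives)
benign_controls = 32
false_positive_controls = 2    # controls with subchromosomal gains >= 15 Mb
early_cases = 16
detected_early = 6

# Sensitivity: detected cancers / all cancers.
sensitivity = detected_cancers / cancer_cases
# Specificity: correctly negative controls / all controls.
specificity = (benign_controls - false_positive_controls) / benign_controls
# Detection rate restricted to early-stage (FIGO I-II) cases.
early_detection_rate = detected_early / early_cases

print(f"sensitivity          = {sensitivity:.4f}")
print(f"specificity          = {specificity:.4f}")
print(f"early detection rate = {early_detection_rate:.4f}")
```

The three proportions are 13/32, 30/32 and 6/16, which agree with the reported 40.6 %, 93.8 % and 38 % after rounding.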


Subject(s)
Early Detection of Cancer/methods , High-Throughput Nucleotide Sequencing/methods , Ovarian Neoplasms , Adult , Aged , Case-Control Studies , Chromosome Aberrations , Cytogenetic Analysis/methods , DNA/blood , DNA Copy Number Variations , Female , Humans , Middle Aged , Neoplasm Staging , Ovarian Neoplasms/diagnosis , Ovarian Neoplasms/genetics , Ovarian Neoplasms/pathology , Pregnancy , Prenatal Diagnosis/methods , Reproducibility of Results , Sensitivity and Specificity
11.
BMC Med Genet ; 17(1): 49, 2016 Jul 22.
Article in English | MEDLINE | ID: mdl-27448395

ABSTRACT

BACKGROUND: Ring chromosome 18 [r(18)] arises from partial deletions of 18p and 18q that produce a ring chromosome. Loss of critical genes on each arm of chromosome 18 may contribute to the specific phenotype, and the variability of the clinical spectrum may depend heavily on the extent of the genomic deletion. The aim of this study is to identify the precise breakpoint locations and the genes deleted as a result of r(18). CASE PRESENTATION: Here we describe a detailed diagnosis of a seven-year-old Chinese girl with a ring chromosome 18 mutation by a high-throughput whole-genome low-coverage sequencing approach, without karyotyping or other cytogenetic analysis. This method revealed heterozygous deletions of two fragments, on 18p and 18q, and further localized the precise breakpoint sites and fusion, as well as the deleted genes. CONCLUSIONS: To our knowledge, this is the first report of a ring chromosome 18 patient in China analyzed by a whole-genome low-coverage sequencing approach. Identifying the precise breakpoint locations and the deleted genes helps to estimate the future risk of disease. The data and analysis here demonstrate the feasibility of next-generation sequencing technologies for detecting chromosomal structural variation, including ring chromosomes, in an efficient and cost-effective way.


Subject(s)
Gene Deletion , Ring Chromosomes , Child , China , Chromosomes, Human, Pair 18 , Cytogenetic Analysis , Female , High-Throughput Nucleotide Sequencing , Humans , Karyotype , Magnetic Resonance Imaging , Phenotype , Pituitary Gland/diagnostic imaging , Sequence Analysis, DNA
12.
Genomics ; 104(3): 170-6, 2014 Sep.
Article in English | MEDLINE | ID: mdl-25086333

ABSTRACT

Blepharophimosis-ptosis-epicanthus inversus syndrome (BPES) is a rare autosomal dominant disorder that affects craniofacial development and ovarian function. FOXL2 is the only gene known to be responsible for BPES. The majority of BPES patients show intragenic mutations of FOXL2. Recently, a 7.4 kb sequence disruption, located 283 kb upstream of FOXL2, was identified as independently contributing to the BPES phenotype. Several breakpoints near FOXL2 (0 Mb to 1.2 Mb, several of which were distant from the 7.4 kb sequence disruption) have been mapped or deduced through a traditional method in BPES patients with chromosome reciprocal translocation. In this study, two BPES families with chromosome reciprocal translocation were investigated. Intragenic mutations of FOXL2 and pathogenic copy number variations were excluded for the two BPES families. All four breakpoints were identified at base-pair resolution using Giemsa banding and whole genome low-coverage sequencing (WGLCS). In family 01, the breakpoints were found at chr1:95,609,998 and chr3:138,879,114 (213,132 bp upstream of FOXL2). In family 02, the breakpoints were located at chr3:138,665,431 (intragenic disruption of FOXL2) and chr20:56,924,609. Results indicate that intragenic and extragenic interruptions of FOXL2 can be accurately and rapidly detected using WGLCS. In addition, both the 213 kb upstream and the intragenic interruptions of FOXL2 can cause the BPES phenotype.


Subject(s)
Blepharophimosis/genetics , Chromosome Breakpoints , Duane Retraction Syndrome/genetics , Forkhead Transcription Factors/genetics , Genome, Human , Translocation, Genetic , Base Sequence , Blepharophimosis/diagnosis , Child, Preschool , Duane Retraction Syndrome/diagnosis , Female , Forkhead Box Protein L2 , Humans , Male , Molecular Sequence Data , Pedigree , Twins, Monozygotic
13.
Am J Med Genet B Neuropsychiatr Genet ; 165B(8): 673-83, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25270064

ABSTRACT

Higher rates of alcohol use and other drug-dependence have been observed in some Native American (NA) populations relative to other ethnic groups in the US. Previous studies have shown that alcohol dehydrogenase (ADH) genes and aldehyde dehydrogenase (ALDH) genes may affect the risk of development of alcohol dependence, and that polymorphisms within these genes may differentially affect risk for the disorder depending on the ethnic group evaluated. We evaluated variations in the ADH and ALDH genes in a large study investigating risk factors for substance use in a NA population. We assessed ancestry admixture and tested for associations between alcohol-related phenotypes in the genomic regions around the ADH1-7 and ALDH2 and ALDH1A1 genes. Seventy-two ADH variants showed significant evidence of association with a severity level of alcohol drinking-related dependence symptoms phenotype. These significant variants spanned across the entire 7 ADH gene cluster regions. Two significant associations, one in ADH and one in ALDH2, were observed with alcohol dependence diagnosis. Seventeen variants showed significant association with the largest number of alcohol drinks ingested during any 24-hour period. Variants in or near ADH7 were significantly negatively associated with alcohol-related phenotypes, suggesting a potential protective effect of this gene. In addition, our results suggested that a higher degree of NA ancestry is associated with higher frequencies of potential risk variants and lower frequencies of potential protective variants for alcohol dependence phenotypes.


Subject(s)
Alcohol Dehydrogenase/genetics , Alcoholism/genetics , Aldehyde Dehydrogenase/genetics , Genetic Variation/genetics , Indians, North American/genetics , Polymorphism, Genetic/genetics , Adolescent , Adult , Aged , Aged, 80 and over , Female , Humans , Male , Middle Aged , Phenotype , Sequence Analysis, DNA , Young Adult
14.
Genome Biol ; 25(1): 171, 2024 07 01.
Article in English | MEDLINE | ID: mdl-38951917

ABSTRACT

BACKGROUND: The massive structural variations and frequent introgression highly contribute to the genetic diversity of wheat, while the huge and complex genome of polyploid wheat hinders efficient genotyping of abundant varieties towards accurate identification, management, and exploitation of germplasm resources. RESULTS: We develop a novel workflow that identifies 1240 high-quality large copy number variation blocks (CNVb) in wheat at the pan-genome level, demonstrating that CNVb can serve as an ideal DNA fingerprinting marker for discriminating massive varieties, with the accuracy validated by PCR assay. We then construct a digitalized genotyping CNVb map across 1599 global wheat accessions. Key CNVb markers are linked with trait-associated introgressions, such as the 1RS·1BL translocation and 2NvS translocation, and the beneficial alleles, such as the end-use quality allele Glu-D1d (Dx5 + Dy10) and the semi-dwarf r-e-z allele. Furthermore, we demonstrate that these tagged CNVb markers promote a stable and cost-effective strategy for evaluating wheat germplasm resources with ultra-low-coverage sequencing data, competing with SNP array for applications such as evaluating new varieties, efficient management of collections in gene banks, and describing wheat germplasm resources in a digitalized manner. We also develop a user-friendly interactive platform, WheatCNVb ( http://wheat.cau.edu.cn/WheatCNVb/ ), for exploring the CNVb profiles over ever-increasing wheat accessions, and also propose a QR-code-like representation of individual digital CNVb fingerprint. This platform also allows uploading new CNVb profiles for comparison with stored varieties. CONCLUSIONS: The CNVb-based approach provides a low-cost and high-throughput genotyping strategy for enabling digitalized wheat germplasm management and modern breeding with precise and practical decision-making.


Subject(s)
DNA Copy Number Variations , Triticum , Triticum/genetics , Genome, Plant , High-Throughput Nucleotide Sequencing , Genetic Markers , Alleles
15.
Genes (Basel) ; 15(2)2024 01 27.
Article in English | MEDLINE | ID: mdl-38397160

ABSTRACT

The European sardine (Sardina pilchardus, Walbaum 1792) is indisputably a commercially important species. Previous studies using uneven sampling or a limited number of markers have presented sometimes conflicting evidence of the genetic structure of S. pilchardus populations. Here, we show that whole genome data from 108 individuals from 16 sampling areas across 5000 km of the species' distribution range (from the Eastern Mediterranean to the archipelago of Azores) support at least three genetic clusters. One includes individuals from Azores and Madeira, with evidence of substructure separating these two archipelagos in the Atlantic. Another cluster broadly corresponds to the center of the distribution, including the sampling sites around Iberia, separated by the Almeria-Oran front from the third cluster that includes all of the Mediterranean samples, except those from the Alboran Sea. Individuals from the Canary Islands appear to belong to the Mediterranean cluster. This suggests at least two important geographical barriers to gene flow, even though these do not seem complete, with many individuals from around Iberia and the Mediterranean showing some patterns compatible with admixture with other genetic clusters. Genomic regions corresponding to the top outliers of genetic differentiation are located in areas of low recombination, indicating that genetic architecture also has a role in shaping population structure. These regions include genes related to otolith formation, a calcium carbonate structure in the inner ear previously used to distinguish S. pilchardus populations. Our results provide a baseline for further characterization of physical and genetic barriers that divide European sardine populations, and information for transnational stock management of this highly exploited species towards sustainable fisheries.


Subject(s)
Fishes , Metagenomics , Humans , Animals , Fishes/genetics , Portugal , Genome/genetics , Spain
16.
Genome Biol Evol ; 15(12)2023 Dec 01.
Article in English | MEDLINE | ID: mdl-38085033

ABSTRACT

Low-coverage whole-genome sequencing (also known as "genome skimming") is becoming an increasingly affordable approach to large-scale phylogenetic analyses. While already routinely used to recover organellar genomes, genome skimming is rather rarely utilized for recovering single-copy nuclear markers. One reason might be that only few tools exist to work with this data type within a phylogenomic context, especially to deal with fragmented genome assemblies. We here present a new software tool called Patchwork for mining phylogenetic markers from highly fragmented short-read assemblies as well as directly from sequence reads. Patchwork is an alignment-based tool that utilizes the sequence aligner DIAMOND and is written in the programming language Julia. Homologous regions are obtained via a sequence similarity search, followed by a "hit stitching" phase, in which adjacent or overlapping regions are merged into a single unit. The novel sliding window algorithm trims away any noncoding regions from the resulting sequence. We demonstrate the utility of Patchwork by recovering near-universal single-copy orthologs within a benchmarking study, and we additionally assess the performance of Patchwork in comparison with other programs. We find that Patchwork allows for accurate retrieval of (putatively) single-copy genes from genome skimming data sets at different sequencing depths with high computational speed, outperforming existing software targeting similar tasks. Patchwork is released under the GNU General Public License version 3. Installation instructions, additional documentation, and the source code itself are all available via GitHub at https://github.com/fethalen/Patchwork.
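
The "hit stitching" idea described in this abstract, in which adjacent or overlapping regions are merged into a single unit, resembles generic interval merging. The sketch below illustrates that general technique only. It is not Patchwork's implementation (which is written in Julia and operates on DIAMOND hits); the function name, the `max_gap` parameter, and the closed 1-based coordinate convention are assumptions made for illustration.

```python
def stitch_hits(hits, max_gap=0):
    """Merge overlapping or adjacent (start, end) intervals.

    Coordinates are assumed to be closed and 1-based, so (1, 10) and (11, 20)
    are adjacent. Intervals separated by at most `max_gap` uncovered positions
    are also merged. Hypothetical helper, not part of any published tool.
    """
    merged = []
    for start, end in sorted(hits):
        if merged and start <= merged[-1][1] + 1 + max_gap:
            # Extend the previous unit; keep the furthest end seen so far.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical hit coordinates along one contig.
example_hits = [(30, 40), (1, 10), (35, 50), (11, 20)]
print(stitch_hits(example_hits))
```

Sorting first means each interval only needs to be compared with the most recently merged unit, which keeps the procedure at O(n log n) for n hits.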


Subject(s)
Genome , Genomics , Phylogeny , Sequence Analysis, DNA/methods , Genomics/methods , Software , High-Throughput Nucleotide Sequencing/methods
17.
Gene ; 851: 146956, 2023 Jan 30.
Article in English | MEDLINE | ID: mdl-36341727

ABSTRACT

MOTIVATION: Next-generation sequencing (NGS) technologies are decisive for discovering disease-causing variants, although their cost limits their utility in a clinical setting. A cost-mitigating alternative is extremely low-coverage whole-genome sequencing (XLC-WGS). We investigated its use to identify causal variants within a multi-generational pedigree of individuals with retinitis pigmentosa (RP). RP is a group of genetically heterogeneous eye disorders that cause progressive vision loss, with approximately 60 known causal genes. RESULTS: We performed XLC-WGS in seventeen members of this pedigree, including three individuals with a confirmed diagnosis of RP. Sequencing data were processed using Illumina's DRAGEN pipeline and filtered using Illumina's genotype quality score metric (GQX). The resulting variants were analyzed using Expert Variant Interpreter (eVai) from enGenome as a prioritization tool. A known nonsense mutation (c.1625C > G; p.Ser542*) in exon 4 of the RP1 gene emerged as the most likely causal variant. We identified two homozygous carriers of this variant among the three sequenced RP cases and three heterozygous individuals with sufficient coverage of the RP1 locus. Our data show the utility of combining pedigree information with XLC-WGS as a cost-effective approach to identify disease-causing variants.


Subject(s)
Eye Proteins , Retinitis Pigmentosa , Humans , Codon, Nonsense , DNA Mutational Analysis , Eye Proteins/genetics , Microtubule-Associated Proteins/genetics , Mutation , Pedigree , Retinitis Pigmentosa/genetics , Retinitis Pigmentosa/diagnosis , Whole Genome Sequencing
18.
Poult Sci ; 102(5): 102203, 2023 May.
Article in English | MEDLINE | ID: mdl-36907123

ABSTRACT

Genetic dissection of highly polygenic traits is a challenge, in part due to the power necessary to confidently identify loci with minor effects. Experimental crosses are valuable resources for mapping such traits. Traditionally, genome-wide analyses of experimental crosses have targeted major loci using data from a single generation (often the F2) with individuals from later generations being generated for replication and fine-mapping. Here, we aim to confidently identify minor-effect loci contributing to the highly polygenic basis of the long-term, bi-directional selection responses for 56-d body weight in the Virginia body weight chicken lines. To achieve this, a strategy was developed to make use of data from all generations (F2-F18) of the advanced intercross line, developed by crossing the low and high selected lines after 40 generations of selection. A cost-efficient low-coverage sequencing based approach was used to obtain high-confidence genotypes in 1Mb bins across 99.3% of the chicken genome for >3,300 intercross individuals. In total, 12 genome-wide significant, and 30 additional suggestive QTL reaching a 10% FDR threshold, were mapped for 56-d body weight. Only 2 of these QTL reached genome-wide significance in earlier analyses of the F2 generation. The minor-effect QTL mapped here were generally due to an overall increase in power by integrating data across generations, with contributions from increased genome-coverage and improved marker information content. The 12 significant QTL explain >37% of the difference between the parental lines, three times more than the 2 previously reported significant QTL. The 42 significant and suggestive QTL together explain >80%. Making integrated use of all available samples from multiple generations in experimental crosses is economically feasible using the low-cost, sequencing-based genotyping strategies outlined here. Our empirical results illustrate the value of this strategy for mapping novel minor-effect loci contributing to complex traits to provide a more confident, comprehensive view of the individual loci that form the genetic basis of the highly polygenic, long-term selection responses for 56-d body weight in the Virginia body weight chicken lines.


Subject(s)
Multifactorial Inheritance , Quantitative Trait Loci , Animals , Chromosome Mapping/veterinary , Genome-Wide Association Study/veterinary , Virginia , Crosses, Genetic , Chickens/genetics , Phenotype , Body Weight/genetics
19.
Genome Biol ; 24(1): 144, 2023 06 20.
Article in English | MEDLINE | ID: mdl-37340508

ABSTRACT

Phylogenetic trees based on copy number profiles from multiple samples of a patient are helpful to understand cancer evolution. Here, we develop a new maximum likelihood method, CNETML, to infer phylogenies from such data. CNETML is the first program to jointly infer the tree topology, node ages, and mutation rates from total copy numbers of longitudinal samples. Our extensive simulations suggest CNETML performs well on copy numbers relative to ploidy and under slight violation of model assumptions. The application of CNETML to real data generates results consistent with previous discoveries and provides novel early copy number events for further investigation.


Subject(s)
DNA Copy Number Variations , Neoplasms , Humans , Phylogeny , Mutation Rate
20.
bioRxiv ; 2023 Nov 29.
Article in English | MEDLINE | ID: mdl-38076923

ABSTRACT

Genome-wide association studies typically evaluate the autosomes and sometimes the X chromosome, but seldom consider the Y or mitochondrial chromosomes. We genotyped the Y and mitochondrial chromosomes in heterogeneous stock rats (Rattus norvegicus), which were created in 1984 by intercrossing eight inbred strains and have subsequently been maintained as an outbred population for 100 generations. As the Y and mitochondrial chromosomes do not recombine, we determined which founder had contributed these chromosomes for each rat, and then performed association analysis for all complex traits (n=12,055; intersection of 12,116 phenotyped and 15,042 haplotyped rats). We found that the eight founders had 8 distinct Y and 4 distinct mitochondrial chromosomes; however, only two of each were observed in our modern heterogeneous stock rat population (Generations 81-97). Despite the unusually large sample size, the p-value distribution did not deviate from expectations; there were no significant associations for behavioral, physiological, metabolome, or microbiome traits after correcting for multiple comparisons. However, both Y and mitochondrial chromosomes were strongly associated with expression of a few genes located on those chromosomes, which provided a positive control. Our results suggest that within modern heterogeneous stock rats there are no Y or mitochondrial chromosome differences that strongly influence behavioral or physiological traits. These results do not address other ancestral Y and mitochondrial chromosomes that do not appear in modern heterogeneous stock rats, nor do they address effects that may exist in other rat populations, or in other species.
