1.
BMC Genom Data ; 25(1): 39, 2024 May 01.
Article En | MEDLINE | ID: mdl-38693490

BACKGROUND: Sunflower (Helianthus annuus) is one of the most important economic crops in oilseed production worldwide. The different cultivars exhibit variability in their resistance genes. The NAC transcription factor (TF) family plays diverse roles in plant development and stress responses. With the completion of the H. annuus genome sequence, the entire complement of genes coding for NACs has been identified. However, the reference genome of a single individual cannot cover all the genetic information of the species. RESULTS: Considering only a single reference genome to study gene families will miss many meaningful genes. A pangenome-wide survey and characterization of the NAC genes in sunflower species were conducted. In total, 139 HaNAC genes are identified, of which 114 are core and 25 are variable. Phylogenetic analysis of sunflower NAC proteins categorizes these proteins into 16 subgroups. 138 HaNACs are randomly distributed on 17 chromosomes. SNP-based haplotype analysis shows haplotype diversity of the HaNAC genes in wild accessions is richer than in landraces and modern cultivars. Ten HaNAC genes in the basal stalk rot (BSR) resistance quantitative trait loci (QTL) are found. A total of 26 HaNAC genes are differentially expressed in response to Sclerotinia head rot (SHR). A total of 137 HaNAC genes are annotated in Gene Ontology (GO) and are classified into 24 functional groups. GO functional enrichment analysis reveals that HaNAC genes are involved in various functions of the biological process. CONCLUSIONS: We identified NAC genes in H. annuus (HaNAC) on a pangenome-wide scale and analyzed S. sclerotiorum resistance-related NACs. This study provided a theoretical basis for further genomic improvement targeting resistance-related NAC genes in sunflowers.


Ascomycota , Disease Resistance , Helianthus , Phylogeny , Plant Diseases , Helianthus/genetics , Helianthus/microbiology , Ascomycota/genetics , Disease Resistance/genetics , Plant Diseases/microbiology , Plant Diseases/genetics , Plant Diseases/immunology , Plant Proteins/genetics , Transcription Factors/genetics , Genome, Plant , Multigene Family/genetics , Genes, Plant/genetics , Polymorphism, Single Nucleotide/genetics , Haplotypes/genetics
2.
Mol Genet Genomics ; 299(1): 49, 2024 May 04.
Article En | MEDLINE | ID: mdl-38704518

The main objective of this study was to determine whether the common Y-haplogroups were associated with the risk of developing severe COVID-19 in Spanish males. We studied 479 patients who required hospitalization due to COVID-19 and 285 population controls from the region of Asturias (northern Spain). They were genotyped for several polymorphisms that define the common European Y-haplogroups. We compared the frequencies between patients and controls aged ≤65 and >65 years. Haplogroup frequencies did not differ between the two age groups of controls. Haplogroup R1b was less common in patients aged ≤65 years. Haplogroup I was more common in the two patient groups compared to controls (p = 0.02). Haplogroup R1b was significantly more frequent among hypertensive patients, without difference between the hypertensive and normotensive controls. This suggested that R1b could increase the risk for severe COVID-19 among males with pre-existing hypertension. In conclusion, we described the Y-haplogroup structure among Asturians. We found an increased risk of severe COVID-19 among haplogroup I carriers, and a significantly higher frequency of R1b among hypertensive patients. These results indicate that Y-chromosome variants could serve as markers to define the risk of developing a severe form of COVID-19.


COVID-19 , Chromosomes, Human, Y , Haplotypes , Hypertension , SARS-CoV-2 , Humans , Male , COVID-19/genetics , COVID-19/epidemiology , Spain/epidemiology , Haplotypes/genetics , Aged , Middle Aged , SARS-CoV-2/genetics , Chromosomes, Human, Y/genetics , Hypertension/genetics , Genetic Predisposition to Disease , Case-Control Studies , Polymorphism, Single Nucleotide , Adult , Female
3.
Mol Biol Rep ; 51(1): 612, 2024 May 05.
Article En | MEDLINE | ID: mdl-38704770

BACKGROUND: The α-Major Regulatory Element (α-MRE), also known as HS-40, is located upstream of the α-globin gene cluster and has a crucial role in the long-range regulation of α-globin gene expression. This enhancer is polymorphic and several haplotypes have been identified in different populations, with haplotype D almost exclusively found in African populations. The purpose of this research was to identify the HS-40 haplotype associated with the 3.7 kb α-thalassemia deletion (-α3.7del) in the Portuguese population, and determine its ancestry and influence on patients' hematological phenotype. METHODS AND RESULTS: We selected 111 Portuguese individuals previously analyzed by Gap-PCR to detect the presence of the -α3.7del: 50 without the -α3.7del, 34 heterozygous and 27 homozygous for the -α3.7del. The HS-40 region was amplified by PCR followed by Sanger sequencing. Four HS-40 haplotypes were found (A to D). The distribution of HS-40 haplotypes and genotypes is significantly different between individuals with and without the -α3.7del, with haplotype D and genotype AD being the most prevalent in patients homozygous for this deletion. Furthermore, multiple correspondence analysis revealed that individuals without the -α3.7del are grouped with other European populations, while samples with the -α3.7del are separated from these and found more closely related to the African population. CONCLUSION: This study revealed for the first time an association of the HS-40 haplotype D with the -α3.7del in the Portuguese population, and its likely African ancestry. These results may have clinical importance as in vitro analysis of haplotype D showed a decrease in its enhancer activity on the α-globin gene.


Haplotypes , Sequence Deletion , alpha-Globins , alpha-Thalassemia , Female , Humans , Male , alpha-Globins/genetics , alpha-Thalassemia/genetics , Black People/genetics , Gene Frequency/genetics , Genotype , Haplotypes/genetics , Portugal , Regulatory Sequences, Nucleic Acid/genetics , Sequence Deletion/genetics
4.
Proc Natl Acad Sci U S A ; 121(23): e2403750121, 2024 Jun 04.
Article En | MEDLINE | ID: mdl-38805269

Haplotype-resolved genome assemblies were produced for Chasselas and Ugni Blanc, two heterozygous Vitis vinifera cultivars by combining high-fidelity long-read sequencing and high-throughput chromosome conformation capture (Hi-C). The telomere-to-telomere full coverage of the chromosomes allowed us to assemble separately the two haplo-genomes of both cultivars and revealed structural variations between the two haplotypes of a given cultivar. The deletions/insertions, inversions, translocations, and duplications provide insight into the evolutionary history and parental relationship among grape varieties. Integration of de novo single long-read sequencing of full-length transcript isoforms (Iso-Seq) yielded a highly improved genome annotation. Given its higher contiguity, and the robustness of the IsoSeq-based annotation, the Chasselas assembly meets the standard to become the annotated reference genome for V. vinifera. Building on these resources, we developed VitExpress, an open interactive transcriptomic platform, that provides a genome browser and integrated web tools for expression profiling, and a set of statistical tools (StatTools) for the identification of highly correlated genes. Implementation of the correlation finder tool for MybA1, a major regulator of the anthocyanin pathway, identified candidate genes associated with anthocyanin metabolism, whose expression patterns were experimentally validated as discriminating between black and white grapes. These resources and innovative tools for mining genome-related data are anticipated to foster advances in several areas of grapevine research.


Genome, Plant , Haplotypes , Transcriptome , Vitis , Vitis/genetics , Haplotypes/genetics , Transcriptome/genetics , Molecular Sequence Annotation/methods , Gene Expression Profiling/methods , Software
5.
Mol Neurodegener ; 19(1): 43, 2024 May 29.
Article En | MEDLINE | ID: mdl-38812061

A ~1 Mb inversion polymorphism exists within the 17q21.31 locus of the human genome as direct (H1) and inverted (H2) haplotype clades. This inversion region demonstrates high linkage disequilibrium, but the frequency of each haplotype differs across ancestries. While the H1 haplotype exists in all populations and shows a normal pattern of genetic variability and recombination, the H2 haplotype is enriched in European ancestry populations, is less frequent in African ancestry populations, and is nearly absent in East Asian ancestry populations. H1 is a known risk factor for several neurodegenerative diseases, and has been associated with many other traits, suggesting its importance in cellular phenotypes of the brain and entire body. Conversely, H2 is protective for these diseases, but is associated with predisposition to recurrent microdeletion syndromes and neurodevelopmental disorders such as autism. Many single nucleotide variants and copy number variants define H1/H2 haplotypes and sub-haplotypes, but identifying the causal variant(s) for specific diseases and phenotypes is complex due to the extended linkage disequilibrium. In this review, we assess the current knowledge of this inversion region regarding genomic structure, gene expression, cellular phenotypes, and disease association. We discuss recent discoveries and challenges, evaluate gaps in knowledge, and highlight the importance of understanding the effect of the 17q21.31 haplotypes to promote advances in precision medicine and drug discovery for several diseases.


Haplotypes , Neurodegenerative Diseases , tau Proteins , Humans , Haplotypes/genetics , Neurodegenerative Diseases/genetics , tau Proteins/genetics , Genetic Predisposition to Disease/genetics , Linkage Disequilibrium/genetics , Polymorphism, Single Nucleotide/genetics
6.
Bull Exp Biol Med ; 176(5): 599-602, 2024 Mar.
Article En | MEDLINE | ID: mdl-38724812

We studied the relationship between the HSPA5 gene polymorphisms and the risk of type 2 diabetes mellitus. Genotyping of three SNPs of the HSPA5 gene was performed in 1579 patients with type 2 diabetes mellitus and 1650 healthy individuals. It was found that the genotypes rs55736103-T/T, rs12009-G/G, and rs391957-T/C-T/T are associated with increased risk of type 2 diabetes in females. A rare haplotype, rs55736103C-rs12009A-rs391957T HSPA5, associated with a reduced risk of type 2 diabetes in females was found. Associations between polymorphisms of the HSPA5 gene encoding heat shock protein and the risk of type 2 diabetes mellitus were established for the first time.


Diabetes Mellitus, Type 2 , Endoplasmic Reticulum Chaperone BiP , Genetic Predisposition to Disease , Heat-Shock Proteins , Polymorphism, Single Nucleotide , Humans , Diabetes Mellitus, Type 2/genetics , Female , Polymorphism, Single Nucleotide/genetics , Male , Middle Aged , Genetic Predisposition to Disease/genetics , Heat-Shock Proteins/genetics , Case-Control Studies , Haplotypes/genetics , Gene Frequency/genetics , Aged , Genotype , Risk Factors , Adult
7.
Plant Cell Rep ; 43(6): 156, 2024 May 31.
Article En | MEDLINE | ID: mdl-38819495

KEY MESSAGE: In the current study, candidate gene-based association mapping (261 genes) was performed on 144 pigeonpea accessions for flowering time and related traits, and 29 MTAs producing eight superior haplotypes were identified. In the current study, we have conducted an association analysis for flowering-associated traits in a diverse pigeonpea mini-core collection comprising 144 accessions using the SNP data of 261 flowering-related genes. In total, 13,449 SNPs were detected in the current study, which ranged from 743 (ICP10228) to 1469 (ICP6668) among the individuals. The nucleotide diversity (0.28) and Watterson estimates (0.34) reflected substantial diversity, while Tajima's D (-0.70) indicated the abundance of rare alleles in the collection. A total of 29 marker trait associations (MTAs) were identified using 3 years of phenotypic data, among which 19 were unique to days to first flowering (DOF) and/or days to fifty percent flowering (DFF), 9 to plant height (PH), and 1 to determinate (Det) growth habit. Among these MTAs, six were common to DOF and/or DFF, and four were common to DOF/DFF along with the PH, reflecting their pleiotropic action. These 29 MTAs spanned 25 genes, among which 10 genes clustered in the protein-protein network analysis, indicating their concerted involvement in floral induction. Furthermore, using the MTAs, we identified eight haplotypes, four of which regulate late flowering, while the remaining four regulate early flowering. Interestingly, accessions carrying haplotypes conferring late flowering (H001, H002, and H008) were found to be taller, while those carrying the haplotype involved in early flowering (H003) were shorter in height. The expression pattern of these genes, as inferred from the transcriptome data, also underpinned their involvement in floral induction. The haplotypes identified will be highly useful to the pigeonpea breeding community for haplotype-based breeding.
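The three summary statistics quoted in this abstract (nucleotide diversity, the Watterson estimate, and Tajima's D) can be illustrated with a minimal sketch. It assumes an aligned set of at least four equal-length sequences without gaps or missing data; the function name `diversity_stats` and the toy alignments are hypothetical and are not taken from the study. Values are returned per locus; dividing by the alignment length would give per-site values.

```python
from itertools import combinations
from math import sqrt


def diversity_stats(seqs):
    """Return (pi, theta_w, tajimas_d) for an alignment of >= 4 sequences.

    pi       : mean number of pairwise differences
    theta_w  : Watterson estimate, S / a1
    tajimas_d: Tajima's D (None when there are no segregating sites)
    """
    n = len(seqs)
    length = len(seqs[0])

    # S: number of segregating (polymorphic) sites
    seg = sum(1 for i in range(length) if len({s[i] for s in seqs}) > 1)

    # pi: average number of differences over all sequence pairs
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    theta_w = seg / a1
    if seg == 0:
        return pi, theta_w, None

    # Variance constants from Tajima (1989)
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    d = (pi - theta_w) / sqrt(e1 * seg + e2 * seg * (seg - 1))
    return pi, theta_w, d


if __name__ == "__main__":
    # Toy alignment in which every variant is a singleton (rare alleles only)
    print(diversity_stats(["AAAA", "TAAA", "ATAA", "AATA"]))
```

An excess of rare alleles lowers the pairwise diversity relative to the Watterson estimate, which is why a negative D such as the reported -0.70 is read as an abundance of rare alleles.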


Cajanus , Flowers , Haplotypes , Polymorphism, Single Nucleotide , Flowers/genetics , Flowers/physiology , Flowers/growth & development , Haplotypes/genetics , Cajanus/genetics , Cajanus/growth & development , Polymorphism, Single Nucleotide/genetics , Genes, Plant/genetics , Phenotype , Gene Expression Regulation, Plant , Genetic Association Studies , Quantitative Trait Loci/genetics
8.
Nat Comput Sci ; 4(5): 360-366, 2024 May.
Article En | MEDLINE | ID: mdl-38745108

For many genome-wide association studies, imputing genotypes from a haplotype reference panel is a necessary step. Over the past 15 years, reference panels have become larger and more diverse, leading to improvements in imputation accuracy. However, the latest generation of reference panels is subject to restrictions on data sharing due to concerns about privacy, limiting their usefulness for genotype imputation. In this context, here we propose RESHAPE, a method that employs a recombination Poisson process on a reference panel to simulate the genomes of hypothetical descendants after multiple generations. This data transformation helps to protect against re-identification threats and preserves data attributes, such as linkage disequilibrium patterns and, to some degree, identity-by-descent sharing, allowing for genotype imputation. Our experiments on gold-standard datasets show that simulated descendants up to eight generations can serve as reference panels without substantially reducing genotype imputation accuracy.
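The idea of applying a recombination Poisson process to a reference panel can be sketched as follows. This is a simplified illustration and not the RESHAPE implementation: it assumes haplotypes are lists of alleles, site positions are given in Morgans in ascending order, crossovers occur at rate 1 per Morgan, and each generation pairs haplotypes at random and exchanges segments between them. All function names are hypothetical.

```python
import random


def crossover_points(positions_morgan, rng):
    """Site indices at which a crossover takes effect (Poisson process, rate 1/Morgan)."""
    breaks = []
    pos = positions_morgan[0]
    end = positions_morgan[-1]
    i = 0
    while True:
        pos += rng.expovariate(1.0)  # exponential gaps give a Poisson process
        if pos >= end:
            break
        while positions_morgan[i] <= pos:
            i += 1
        breaks.append(i)  # first site located after the crossover
    return breaks


def recombine_pair(hap_a, hap_b, positions_morgan, rng):
    """Exchange segments between two haplotypes at the sampled crossovers."""
    breaks = crossover_points(positions_morgan, rng)
    child_a, child_b = list(hap_a), list(hap_b)
    swapped = False
    next_break = 0
    for site in range(len(hap_a)):
        while next_break < len(breaks) and breaks[next_break] == site:
            swapped = not swapped
            next_break += 1
        if swapped:
            child_a[site], child_b[site] = hap_b[site], hap_a[site]
    return child_a, child_b


def simulate_descendants(panel, positions_morgan, generations, seed=0):
    """Return a synthetic panel after the given number of generations."""
    rng = random.Random(seed)
    current = [list(h) for h in panel]
    for _ in range(generations):
        rng.shuffle(current)
        nxt = []
        for k in range(0, len(current) - 1, 2):
            nxt.extend(recombine_pair(current[k], current[k + 1], positions_morgan, rng))
        if len(current) % 2 == 1:
            nxt.append(current[-1])  # unpaired haplotype passes through unchanged
        current = nxt
    return current


if __name__ == "__main__":
    panel = [[0, 0, 1, 1, 0, 1], [1, 0, 0, 1, 1, 0], [0, 1, 1, 0, 0, 1], [1, 1, 0, 0, 1, 0]]
    positions = [0.0, 0.3, 0.7, 1.1, 1.6, 2.0]
    for hap in simulate_descendants(panel, positions, generations=8):
        print(hap)
```

Because segments are exchanged rather than resampled in this sketch, per-site allele counts are preserved exactly while the original haplotypes are broken into mosaics, which is the intuition behind keeping a panel useful for imputation while making individual haplotypes harder to re-identify.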


Genome-Wide Association Study , Genotype , Humans , Genome-Wide Association Study/methods , Linkage Disequilibrium , Haplotypes/genetics , Polymorphism, Single Nucleotide/genetics , Information Dissemination/methods , Computer Simulation , Models, Genetic , Algorithms , Genome, Human/genetics , Poisson Distribution
9.
J Transl Med ; 22(1): 451, 2024 May 13.
Article En | MEDLINE | ID: mdl-38741136

BACKGROUND: Facioscapulohumeral muscular dystrophy (FSHD) is a high-prevalence autosomal dominant neuromuscular disease characterized by significant clinical and genetic heterogeneity. Genetic diagnosis of FSHD remains a challenge because it cannot be detected by standard sequencing methods and requires a complex diagnosis workflow. METHODS: We developed a comprehensive genetic FSHD detection method based on Oxford Nanopore Technologies (ONT) whole-genome sequencing. Using a case-control design, we applied this procedure to 29 samples and compared the results with those from optical genome mapping (OGM), bisulfite sequencing (BSS), and whole-exome sequencing (WES). RESULTS: Using our ONT-based method, we identified 59 haplotypes (35 4qA and 24 4qB) among the 29 samples (including a mosaic sample), as well as the number of D4Z4 repeat units (RUs). The pathogenetic D4Z4 RU contraction identified by our ONT-based method showed 100% concordance with OGM results. The methylation levels of the most distal D4Z4 RU and the double homeobox 4 gene (DUX4) detected by ONT sequencing are highly consistent with the BSS results and showed excellent diagnostic efficiency. Additionally, our ONT-based method provided an independent methylation profile analysis of two permissive 4qA alleles, reflecting a more accurate scenario than traditional BSS. The ONT-based method detected 17 variations in three FSHD2-related genes from nine samples, showing 100% concordance with WES. CONCLUSIONS: Our ONT-based FSHD detection method is a comprehensive method for identifying pathogenetic D4Z4 RU contractions, methylation level alterations, allele-specific methylation of two 4qA haplotypes, and variations in FSHD2-related genes, which will all greatly improve genetic testing for FSHD.


DNA Methylation , Muscular Dystrophy, Facioscapulohumeral , Whole Genome Sequencing , Muscular Dystrophy, Facioscapulohumeral/genetics , Muscular Dystrophy, Facioscapulohumeral/diagnosis , Humans , DNA Methylation/genetics , Haplotypes/genetics , Male , Case-Control Studies , Homeodomain Proteins/genetics , Female , Nanopore Sequencing/methods , Adult
10.
Int J Mol Sci ; 25(9), 2024 May 03.
Article En | MEDLINE | ID: mdl-38732219

Epstein-Barr virus (EBV) is a ubiquitous gammaherpesvirus etiologically associated with benign and malignant diseases. Since the pathogenic mechanisms of EBV are not fully understood, understanding EBV genetic diversity is an ongoing goal. Therefore, the present work describes the genetic diversity of the lytic gene BZLF1 in a sampling of 70 EBV-positive cases from southeastern Brazil. Additionally, together with the genetic regions previously characterized, the aim of the present study was to determine the impact of viral genetic factors that may influence EBV genetic diversity. Accordingly, the phylogenetic analysis of the BZLF1 indicated two main clades with high support, BZ-A and BZ-B (PP > 0.85). Thus, the BZ-A clade was the most diverse clade associated with the main polymorphisms investigated, including the haplotype Type 1 + V3 (p < 0.001). Furthermore, the multigene phylogenetic analysis (MLA) between BZLF1 and the oncogene LMP1 showed specific clusters, revealing haplotypic segregation that previous single-gene phylogenies from both genes failed to demonstrate. Surprisingly, the LMP1 Raji-related variant clusters were shown to be more diverse, associated with BZ-A/B and the Type 2/1 + V3 haplotypes. Finally, due to the high haplotypic diversity of the Raji-related variants, the number of DNA recombination-inducing motifs (DRIMs) was evaluated within the different clusters defined by the MLA. Similarly, the haplotype BZ-A + Raji was shown to harbor a greater number of DRIMs (p < 0.001). These results call attention to the high haplotype diversity of EBV in southeast Brazil and strengthen the hypothesis of the recombinant potential of South American Raji-related variants via the LMP1 oncogene.


Epstein-Barr Virus Infections , Genetic Variation , Herpesvirus 4, Human , Phylogeny , Recombination, Genetic , Herpesvirus 4, Human/genetics , Humans , Brazil , Epstein-Barr Virus Infections/virology , Epstein-Barr Virus Infections/genetics , Trans-Activators/genetics , Male , Female , Haplotypes/genetics , Adult , Viral Matrix Proteins/genetics , Child , Middle Aged , Adolescent , Virus Latency/genetics , Child, Preschool , Young Adult
11.
Int J Med Sci ; 21(6): 1064-1071, 2024.
Article En | MEDLINE | ID: mdl-38774744

Hyperlipidemia is notorious for causing coronary artery disease (CAD). IL-18 is a proinflammatory cytokine that contributes to the pathogenesis of CAD. Previous reports have revealed that genetic polymorphism of IL-18 is associated with its expression level as well as the susceptibility to CAD. In the present study, we aim to investigate the relationship between IL-18 single nucleotide polymorphisms (SNPs) and hyperlipidemia in the Han Chinese population in Taiwan. A total of 580 participants older than 30 were recruited from the community. We collected the demographics, self-reported disease histories, and lifestyles. We also assessed the levels of lipid profiles including total cholesterol (CHOL), triglyceride, low-density lipoprotein cholesterol (LDL-C) and high-density lipoprotein cholesterol. Two SNPs of IL-18, rs3882891C/A (intron 5) and rs1946518A/C (promoter -607), were genotyped by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) methods. Our results revealed that rs3882891 AA was associated with lower risk of hypercholesterolemia, higher CHOL and LDL-C in subjects (p=0.003, p=0.000 and p=0.005, respectively), and rs1946518 CC was associated with hypercholesterolemia, higher CHOL and LDL-C as well (p=0.021, p=0.003 and p=0.001, respectively). Furthermore, both SNPs were associated with IL-18 expression level, which was examined by Genotype-Tissue Expression (GTEx) Portal (p=0.042 and 0.016, respectively). Finally, the haplotype of IL-18 was subsequently arranged in the order of rs3882891 and rs1946518. The result revealed that the AC haplotype of the 2 IL-18 SNPs was also associated with lower risk of hypercholesterolemia and lower levels of CHOL and LDL-C (p=0.01, p=0.001 and 0.003). The current study is the first to report the association between IL-18 SNPs and hyperlipidemia in the Chinese Han population.


Genetic Predisposition to Disease , Hyperlipidemias , Interleukin-18 , Polymorphism, Single Nucleotide , Humans , Interleukin-18/genetics , Male , Middle Aged , Female , Hyperlipidemias/genetics , Adult , Taiwan/epidemiology , Asian People/genetics , Aged , Haplotypes/genetics , Coronary Artery Disease/genetics , Coronary Artery Disease/blood , Coronary Artery Disease/epidemiology , Cholesterol, LDL/blood , Genetic Association Studies
12.
Mol Ecol Resour ; 24(5): e13969, 2024 Jul.
Article En | MEDLINE | ID: mdl-38747336

A major aim of evolutionary biology is to understand why patterns of genomic diversity vary within taxa and space. Large-scale genomic studies of widespread species are useful for studying how environment and demography shape patterns of genomic divergence. Here, we describe one of the most geographically comprehensive surveys of genomic variation in a wild vertebrate to date: the great tit (Parus major) HapMap project. We screened ca. 500,000 SNP markers across 647 individuals from 29 populations, spanning ~30 degrees of latitude and 40 degrees of longitude, almost the entire geographical range of the European subspecies. Genome-wide variation was consistent with a recent colonisation across Europe from a South-East European refugium, with bottlenecks and reduced genetic diversity in island populations. Differentiation across the genome was highly heterogeneous, with clear 'islands of differentiation', even among populations with very low levels of genome-wide differentiation. Low local recombination rates were a strong predictor of high local genomic differentiation (FST), especially in island and peripheral mainland populations, suggesting that the interplay between genetic drift and recombination causes highly heterogeneous differentiation landscapes. We also detected genomic outlier regions that were confined to one or more peripheral great tit populations, probably as a result of recent directional selection at the species' range edges. Haplotype-based measures of selection were related to recombination rate, albeit less strongly, and highlighted population-specific sweeps that likely resulted from positive selection. Our study highlights how comprehensive screens of genomic variation in wild organisms can provide unique insights into spatio-temporal evolutionary dynamics.


Genetic Variation , Polymorphism, Single Nucleotide , Songbirds , Animals , Songbirds/genetics , Songbirds/classification , Genetics, Population/methods , Europe , Passeriformes/genetics , Passeriformes/classification , Haplotypes/genetics , Recombination, Genetic , Selection, Genetic
13.
Hum Genomics ; 18(1): 53, 2024 May 27.
Article En | MEDLINE | ID: mdl-38802968

BACKGROUND: The human lineage has undergone a postcranial skeleton gracilization (i.e. lower bone mass and strength relative to body size) compared to other primates and archaic populations such as the Neanderthals. This gracilization has been traditionally explained by differences in the mechanical load that our ancestors exercised. However, there is growing evidence that gracilization could also be genetically influenced. RESULTS: We have analyzed the LRP5 gene, which is known to be associated with high bone mineral density conditions, from an evolutionary and functional point of view. Taking advantage of the published genomes of archaic Homo populations, our results suggest that this gene has a complex evolutionary history both between archaic and living humans and within living human populations. In particular, we identified the presence of different selective pressures in archaics and extant modern humans, as well as evidence of positive selection in the African and South East Asian populations from the 1000 Genomes Project. Furthermore, we observed very limited evidence of archaic introgression in this gene (only at three haplotypes of East Asian ancestry out of the 1000 Genomes), compatible with a general erasing of the fingerprint of archaic introgression due to functional differences in archaics compared to extant modern humans. In agreement with this hypothesis, we observed private mutations in the archaic genomes that we experimentally validated as putatively increasing bone mineral density. In particular, four of five archaic missense mutations affecting the first β-propeller of LRP5 displayed enhanced Wnt pathway activation, of which two also displayed reduced negative regulation. CONCLUSIONS: In summary, these data suggest a genetic component contributing to the understanding of skeletal differences between extant modern humans and archaic Homo populations.


Evolution, Molecular , Low Density Lipoprotein Receptor-Related Protein-5 , Neanderthals , Humans , Low Density Lipoprotein Receptor-Related Protein-5/genetics , Animals , Neanderthals/genetics , Selection, Genetic/genetics , Hominidae/genetics , Haplotypes/genetics , Bone Density/genetics , Genome, Human/genetics
15.
Mol Biol Rep ; 51(1): 486, 2024 Apr 05.
Article En | MEDLINE | ID: mdl-38578390

BACKGROUND: Colorectal cancer (CRC) is a type of neoplasm that develops in the colon or rectum. The exact etiology of CRC is not well known, but the roles of genetic, epigenetic, and environmental factors in its pathogenesis are established. Therefore, the aim of this research was to explore the effects of ANRIL polymorphisms on CRC and its clinical findings. METHODS AND RESULTS: Peripheral blood specimens were collected from 142 CRC patients and 225 controls referred to Milad Hospital, Tehran, Iran. The PCR-RFLP method was used to analyze the ANRIL rs1333040, rs10757274, rs4977574, and rs1333048 polymorphisms. The ANRIL rs1333040 polymorphism was related to a higher risk of CRC in the co-dominant, dominant, and log-additive models. The ANRIL rs10757274, rs4977574, and rs1333048 polymorphisms showed no effect on CRC susceptibility. The CGAA and TGGA haplotypes of the ANRIL rs1333040/rs10757274/rs4977574/rs1333048 polymorphisms were associated with higher and lower risk of CRC, respectively. The rs1333040 polymorphism was associated with higher TNM stages (III and IV). The frequency of the ANRIL rs10757274 polymorphism was lower in CRC patients over 50 years of age only in the dominant model. In addition, rs10757274 was associated with well-differentiated tumors in CRC patients. CONCLUSION: The ANRIL rs1333040 polymorphism was associated with a higher risk of CRC and higher TNM stages. The ANRIL rs10757274 polymorphism was associated with well-differentiated tumors in CRC.


Colorectal Neoplasms , RNA, Long Noncoding , Humans , Middle Aged , Case-Control Studies , Colorectal Neoplasms/genetics , Colorectal Neoplasms/pathology , Genetic Predisposition to Disease , Haplotypes/genetics , Iran , Polymorphism, Single Nucleotide/genetics , RNA, Long Noncoding/genetics
16.
Nat Commun ; 15(1): 3041, 2024 Apr 08.
Article En | MEDLINE | ID: mdl-38589412

Sugarcane is a vital crop with significant economic and industrial value. However, the cultivated sugarcane's ultra-complex genome still needs to be resolved due to its high ploidy and extensive recombination between the two subgenomes. Here, we generate a chromosomal-scale, haplotype-resolved genome assembly for a hybrid sugarcane cultivar ZZ1. This assembly contains 10.4 Gb genomic sequences and 68,509 annotated genes with defined alleles in two sub-genomes distributed in 99 original and 15 recombined chromosomes. RNA-seq data analysis shows that sugar accumulation-associated gene families have been primarily expanded from the ZZSO subgenome. However, genes responding to pokkah boeng disease susceptibility have been derived dominantly from the ZZSS subgenome. The region harboring the possible smut resistance genes has expanded significantly. Among them, the expansion of WAK and FLS2 families is proposed to have occurred during the breeding of ZZ1. Our findings provide insights into the complex genome of hybrid sugarcane cultivars and pave the way for future genomics and molecular breeding studies in sugarcane.


Saccharum , Saccharum/genetics , Plant Breeding , Genomics , Haplotypes/genetics , Chromosomes
17.
Nat Commun ; 15(1): 3126, 2024 Apr 11.
Article En | MEDLINE | ID: mdl-38605047

Long reads that cover more variants per read raise opportunities for accurate haplotype construction, whereas the genotype errors of single nucleotide polymorphisms pose great computational challenges for haplotyping tools. Here we introduce KSNP, an efficient haplotype construction tool based on the de Bruijn graph (DBG). KSNP leverages the ability of DBG in handling high-throughput erroneous reads to tackle the challenges. Compared to other notable tools in this field, KSNP achieves at least 5-fold speedup while producing comparable haplotype results. The time required for assembling human haplotypes is reduced to nearly the data-in time.
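A toy illustration of graph-based phasing over SNP alleles is given below. It is not KSNP's algorithm: it assumes a diploid sample, heterozygous biallelic SNPs coded 0/1, reads represented as dictionaries mapping SNP index to observed allele, and a positional graph of order k = 2 (nodes are alleles at a SNP, weighted edges are allele pairs seen at adjacent SNPs). The names `build_edges` and `phase` are hypothetical.

```python
from collections import Counter


def build_edges(reads):
    """Count allele pairs observed at adjacent SNPs: (pos, allele, next_allele) -> weight."""
    edges = Counter()
    for read in reads:
        for pos in sorted(read):
            if pos + 1 in read:
                edges[(pos, read[pos], read[pos + 1])] += 1
    return edges


def phase(reads, n_snps):
    """Walk the graph from SNP 0, following the better-supported transition.

    For heterozygous sites, an edge a->b on one haplotype implies the
    complementary edge (1-a)->(1-b) on the other, so both are pooled.
    """
    edges = build_edges(reads)
    hap = [0]
    for pos in range(n_snps - 1):
        a = hap[-1]
        keep = edges[(pos, a, a)] + edges[(pos, 1 - a, 1 - a)]
        flip = edges[(pos, a, 1 - a)] + edges[(pos, 1 - a, a)]
        hap.append(a if keep >= flip else 1 - a)
    return hap, [1 - x for x in hap]


if __name__ == "__main__":
    reads = [
        {0: 0, 1: 1, 2: 1},
        {1: 1, 2: 1, 3: 0},
        {0: 1, 1: 0},
        {2: 0, 3: 1},
        {1: 0, 2: 0, 3: 1},
        {0: 0, 1: 0},  # read carrying a genotype error
    ]
    print(phase(reads, 4))
```

Pooling edge weights lets correct reads outvote an occasional erroneous one, which is the general reason graph-based counting tolerates noisy reads; larger k and the pruning steps of a real tool are outside this sketch.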


Algorithms , Polymorphism, Single Nucleotide , Humans , Haplotypes/genetics , Sequence Analysis, DNA/methods , High-Throughput Nucleotide Sequencing/methods , Software
18.
Mol Ecol ; 33(9): e17337, 2024 May.
Article En | MEDLINE | ID: mdl-38558465

Phylogeography plays an important part in ecology and evolution. However, current phylogeographic studies are largely constrained by limited numbers of individual samples. Using an environmental DNA (eDNA) assay for phylogeographic analyses, this study provides detailed information regarding the history of Siberian stone loach Barbatula toni, a primary freshwater fish across the whole range of Hokkaido, Japan. Based on eDNA metabarcoding of 293 river water samples, we detected eDNA from B. toni in 189 rivers. A total of 51 samples, representing the entire island, were then selected from the B. toni eDNA-positive sample set for the subsequent analyses. To elucidate the phylogeographic structure of B. toni, newly developed eDNA metabarcoding primers (Barba-cytb-F/R) were applied to these samples, specifically targeting their haplotypic variation in cytochrome b. After bioinformatic processing to mitigate haplotypic false positives, a total of 50 eDNA haplotypes were identified. Two regionally restricted, genetically distinct lineages of the species were revealed as a result of phylogeographic analyses on the haplotypes and tissue-derived DNA from B. toni. According to a molecular clock analysis, they have been genetically isolated for at least 1.5 million years, suggesting their ancient origin and colonisation of Hokkaido, presumably in the glacial periods. These results demonstrate how freshwater fishes can alter their distributions over evolutionary timescales and how eDNA assays can deepen our understanding of phylogeography.


DNA Barcoding, Taxonomic , DNA, Environmental , Haplotypes , Phylogeography , Rivers , Animals , Haplotypes/genetics , Japan , DNA, Environmental/genetics , Cytochromes b/genetics , Fresh Water , Phylogeny , Cypriniformes/genetics , Cypriniformes/classification
19.
Sci Rep ; 14(1): 7892, 2024 04 03.
Article En | MEDLINE | ID: mdl-38570611

Haplotype-resolved genome assembly plays a crucial role in understanding allele-specific functions. However, obtaining haplotype-resolved assembly for auto-polyploid genomes remains challenging. Existing methods can be classified into reference-based phasing, assembly-based phasing, and gamete binning. Nevertheless, there is a lack of cost-effective and efficient methods for haplotyping auto-polyploid genomes. In this study, we propose a novel phasing algorithm called PolyGH, which combines Hi-C and gametic data. We conducted experiments on tetraploid potato cultivars and divided the method into three steps. Firstly, gametic data was utilized to bin non-collapsed contigs, followed by merging adjacent fragments of the same type within the same contig. Secondly, accurate Hi-C signals related to differential genomic regions were acquired using unique k-mers. Finally, collapsed fragments were assigned to haplotigs based on combined Hi-C and gametic signals. Comparing PolyGH with Hi-C-based and gametic data-based methods, we found that PolyGH exhibited superior performance in haplotyping auto-polyploid genomes when integrating both data types. This approach has the potential to enhance haplotype-resolved assembly for auto-polyploid genomes.


Germ Cells , Polyploidy , Humans , Sequence Analysis, DNA/methods , Haplotypes/genetics , Alleles
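
The second step of the abstract above relies on unique k-mers to tie signals to regions that differ between haplotypes. The following is a minimal Python sketch of that general idea only, not PolyGH's implementation: the function name `unique_kmers`, the toy contig names, and the choice of k are all assumptions made for illustration. It ignores reverse complements and treats a k-mer repeated within a single contig as still unique to that contig.

```python
from collections import defaultdict


def unique_kmers(contigs: dict[str, str], k: int) -> dict[str, set[str]]:
    """Return, for each contig, the k-mers that occur in no other contig.

    Illustrative assumption: `contigs` maps a contig name to its sequence.
    """
    # Record which contigs contain each k-mer.
    owners: dict[str, set[str]] = defaultdict(set)
    for name, seq in contigs.items():
        for i in range(len(seq) - k + 1):
            owners[seq[i:i + k]].add(name)

    # Keep only k-mers seen in exactly one contig.
    unique: dict[str, set[str]] = {name: set() for name in contigs}
    for kmer, names in owners.items():
        if len(names) == 1:
            unique[next(iter(names))].add(kmer)
    return unique


# Toy example: two hypothetical haplotigs differing at one base.
toy = {"hap1": "ACGTAC", "hap2": "ACGTTC"}
result = unique_kmers(toy, k=3)
```

In this toy case the shared k-mers (ACG, CGT) are discarded, and only the k-mers overlapping the differing base remain as haplotype-specific markers.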
20.
Methods Mol Biol ; 2744: 53-76, 2024.
Article En | MEDLINE | ID: mdl-38683311

DNA sequences are increasingly used for large-scale biodiversity inventories. Because these genetic data avoid the time-consuming initial sorting of specimens based on their phenotypic attributes, they have been recently incorporated into taxonomic workflows for overlooked and diverse taxa. Major statistical developments have accompanied this new practice, and several models have been proposed to delimit species with single-locus DNA sequences. However, proposed approaches to date make different assumptions regarding taxon lineage history, leading to strong discordance whenever comparisons are made among methods. Distance-based methods, such as Automatic Barcode Gap Discovery (ABGD) and Assemble Species by Automatic Partitioning (ASAP), rely on the detection of a barcode gap (i.e., the lack of overlap in the distributions of intraspecific and interspecific genetic distances) and the associated threshold in genetic distances. Network-based methods, as exemplified by the REfined Single Linkage (RESL) algorithm for the generation of Barcode Index Numbers (BINs), use connectivity statistics to hierarchically cluster related haplotypes into molecular operational taxonomic units (MOTUs) which serve as species proxies. Tree-based methods, including Poisson Tree Processes (PTP) and the General Mixed Yule Coalescent (GMYC), fit statistical models to phylogenetic trees by maximum likelihood or Bayesian frameworks. Multiple webservers and stand-alone versions of these methods are now available, complicating decision-making regarding the most appropriate approach to use for a given taxon of interest. For instance, tree-based methods require an initial phylogenetic reconstruction, and multiple options are now available for this purpose such as RAxML and BEAST. Across all examined species delimitation methods, judicious parameter setting is paramount, as different model parameterizations can lead to differing conclusions.
The objective of this chapter is to guide users step-by-step through all the procedures involved for each of these methods, while aggregating all necessary information required to conduct these analyses. The "Materials" section details how to prepare and format input files, including options to align sequences and conduct tree reconstruction with Maximum Likelihood and Bayesian inference. The Methods section presents the procedure and options available to conduct species delimitation analyses, including distance-, network-, and tree-based models. Finally, limits and future developments are discussed in the Notes section. Most importantly, species delimitation methods discussed herein are categorized based on five indicators: reliability, availability, scalability, understandability, and usability, all of which are fundamental properties needed for any approach to gain unanimous adoption within the DNA barcoding community moving forward.


Algorithms , DNA Barcoding, Taxonomic , Phylogeny , DNA Barcoding, Taxonomic/methods , Software , Biodiversity , Sequence Analysis, DNA/methods , Haplotypes/genetics
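
The barcode gap defined in the abstract above (no overlap between intraspecific and interspecific distances) can be illustrated with a small Python sketch. This is not ABGD or ASAP, which infer partitions without prior species labels; here the species labels are assumed to be known, distances are simple uncorrected p-distances, and the names `p_distance`, `barcode_gap`, and the toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_gap(samples: dict[str, tuple[str, str]]) -> tuple[float, float]:
    """Return (largest intraspecific, smallest interspecific) distance.

    Illustrative assumption: `samples` maps a sample id to
    (species label, aligned sequence), with at least one pair of
    each kind present.
    """
    intra, inter = [], []
    for (_, (sp1, s1)), (_, (sp2, s2)) in combinations(samples.items(), 2):
        d = p_distance(s1, s2)
        (intra if sp1 == sp2 else inter).append(d)
    return max(intra), min(inter)


# Toy alignment: two hypothetical species with two samples each.
toy = {
    "a1": ("A", "ACGTACGTAC"),
    "a2": ("A", "ACGTACGTAT"),
    "b1": ("B", "TCGAACCTAC"),
    "b2": ("B", "TCGAACCTAA"),
}
max_intra, min_inter = barcode_gap(toy)
has_gap = max_intra < min_inter
```

A gap exists when the largest within-species distance is smaller than the smallest between-species distance; any threshold placed between the two values would separate the two toy species.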
...