1.
Imeta ; 3(1): e154, 2024 Feb.
Article En | MEDLINE | ID: mdl-38868520

Structural variations (SVs) are a major source of domestication and improvement traits. We present the first duck pan-genome constructed using five genome assemblies capturing ∼40.98 Mb new sequences. This pan-genome together with high-depth sequencing data (∼46.5×) identified 101,041 SVs, of which substantial proportions were derived from transposable element (TE) activity. Many TE-derived SVs anchoring in a gene body or regulatory region are linked to duck's domestication and improvement. By combining quantitative genetics with molecular experiments, we, for the first time, unraveled a 6945 bp Gypsy insertion as a functional mutation of the major gene IGF2BP1 associated with duck bodyweight. This Gypsy insertion, to our knowledge, explains the largest effect on bodyweight among avian species (27.61% of phenotypic variation). In addition, we also examined another 6634 bp Gypsy insertion in MITF intron, which triggers a novel transcript of MITF, thereby contributing to the development of white plumage. Our findings highlight the importance of using a pan-genome as a reference in genomics studies and illuminate the impact of transposons in trait formation and livestock breeding.

2.
bioRxiv ; 2024 Jun 10.
Article En | MEDLINE | ID: mdl-38895432

Understanding the function and fitness effects of diverse plant genomes requires transferable models. Language models (LMs) pre-trained on large-scale biological sequences can learn evolutionary conservation and are thus expected to offer better cross-species prediction through fine-tuning on limited labeled data compared to supervised deep learning models. We introduce PlantCaduceus, a plant DNA LM based on the Caduceus and Mamba architectures, pre-trained on a carefully curated dataset consisting of 16 diverse Angiosperm genomes. Fine-tuning PlantCaduceus on limited labeled Arabidopsis data for four tasks involving transcription and translation modeling demonstrated high transferability to maize that diverged 160 million years ago, outperforming the best baseline model by 1.45-fold to 7.23-fold. PlantCaduceus also enables genome-wide deleterious mutation identification without multiple sequence alignment (MSA). PlantCaduceus demonstrated a threefold enrichment of rare alleles in prioritized deleterious mutations compared to MSA-based methods and matched state-of-the-art protein LMs. PlantCaduceus is a versatile pre-trained DNA LM expected to accelerate plant genomics and crop breeding applications.

3.
Plant Biotechnol J ; 22(3): 544-554, 2024 Mar.
Article En | MEDLINE | ID: mdl-37961986

Inversions, a type of chromosomal structural variation, significantly influence plant adaptation and gene functions by impacting gene expression and recombination rates. However, compared with other structural variations, their roles in functional biology and crop improvement remain largely unexplored. In this review, we highlight technological and methodological advancements that have allowed a comprehensive understanding of inversion variants through the pangenome framework and machine learning algorithms. Genome editing is an efficient method for inducing or reversing inversion mutations in plants, providing an effective mechanism to modify local recombination rates. Given the potential of inversions in crop breeding, we anticipate increasing attention on inversions from the scientific community in future research and breeding applications.


Gene Editing , Plant Breeding , Plant Breeding/methods , Gene Editing/methods , Plants/genetics , Chromosome Inversion/genetics , Genome, Plant/genetics
4.
bioRxiv ; 2023 Nov 02.
Article En | MEDLINE | ID: mdl-37961642

AlphaMissense is a recently developed method that is designed to classify missense variants into pathogenic, benign, or ambiguous categories across the entire human proteome. Asparagine Synthetase Deficiency (ASNSD) is a developmental disorder associated with severe symptoms, including congenital microcephaly, seizures, and premature death. Diagnosing ASNSD relies on identifying mutations in the asparagine synthetase (ASNS) gene through DNA sequencing and determining whether these variants are pathogenic or benign. Pathogenic ASNS variants are predicted to disrupt the protein's structure and/or function, leading to asparagine depletion within cells and inhibition of cell growth. AlphaMissense offers a promising solution for the rapid classification of ASNS variants established by DNA sequencing and provides a community resource of pathogenicity scores and classifications for newly diagnosed ASNSD patients. Here, we assessed AlphaMissense's utility in ASNSD by benchmarking it against known critical residues in ASNS and evaluating its performance against a list of previously reported ASNSD-associated variants. We also present a pipeline to calculate AlphaMissense scores for any protein in the UniProt database. AlphaMissense accurately attributed a high average pathogenicity score to known critical residues within the two ASNS active sites and the connecting intramolecular tunnel. The program successfully categorized 78.9% of known ASNSD-associated missense variants as pathogenic. The remaining variants were primarily labeled as ambiguous, with a smaller proportion classified as benign. This study underscores the potential role of AlphaMissense in classifying ASNS variants in suspected cases of ASNSD, potentially providing clarity to patients and their families grappling with ongoing diagnostic uncertainty.

5.
Genome Biol Evol ; 15(9)2023 09 04.
Article En | MEDLINE | ID: mdl-37728212

Bats are exceptional among mammals for their powered flight, extended lifespans, and robust immune systems and therefore have been of particular interest in comparative genomics. Using the Oxford Nanopore Technologies long-read platform, we sequenced the genomes of two bat species with key phylogenetic positions, the Jamaican fruit bat (Artibeus jamaicensis) and the Mesoamerican mustached bat (Pteronotus mesoamericanus), and carried out a comprehensive comparative genomic analysis with a diverse collection of bats and other mammals. The high-quality, long-read genome assemblies revealed a contraction of interferon (IFN)-α at the immunity-related type I IFN locus in bats, resulting in a shift in relative IFN-ω and IFN-α copy numbers. Contradicting previous hypotheses of constitutive expression of IFN-α being a feature of the bat immune system, three bat species lost all IFN-α genes. This shift to IFN-ω could contribute to the increased viral tolerance that has made bats a common reservoir for viruses that can be transmitted to humans. Antiviral genes stimulated by type I IFNs also showed evidence of rapid evolution, including a lineage-specific duplication of IFN-induced transmembrane genes and positive selection in IFIT2. In addition, 33 tumor suppressors and 6 DNA-repair genes showed signs of positive selection, perhaps contributing to increased longevity and reduced cancer rates in bats. The robust immune systems of bats rely on both bat-wide and lineage-specific evolution in the immune gene repertoire, suggesting diverse immune strategies. Our study provides new genomic resources for bats and sheds new light on the extraordinary molecular evolution in this critically important group of mammals.


Chiroptera , Neoplasms , Humans , Animals , Chiroptera/genetics , Phylogeny , Evolution, Molecular , Genomics , Longevity , Neoplasms/genetics , Neoplasms/veterinary
6.
bioRxiv ; 2023 Nov 10.
Article En | MEDLINE | ID: mdl-37503269

Meiotic drivers subvert Mendelian expectations by manipulating reproductive development to bias their own transmission. Chromosomal drive typically functions in asymmetric female meiosis, while gene drive is normally postmeiotic and typically found in males. Using single molecule and single-pollen genome sequencing, we describe Teosinte Pollen Drive, an instance of gene drive in hybrids between maize (Zea mays ssp. mays) and teosinte mexicana (Zea mays ssp. mexicana), that depends on RNA interference (RNAi). 22nt small RNAs from a non-coding RNA hairpin in mexicana depend on Dicer-Like 2 (Dcl2) and target Teosinte Drive Responder 1 (Tdr1), which encodes a lipase required for pollen viability. Dcl2, Tdr1, and the hairpin are in tight pseudolinkage on chromosome 5, but only when transmitted through the male. Introgression of mexicana into early cultivated maize is thought to have been critical to its geographical dispersal throughout the Americas, and a tightly linked inversion in mexicana spans a major domestication sweep in modern maize. A survey of maize landraces and sympatric populations of teosinte mexicana reveals correlated patterns of admixture among unlinked genes required for RNAi on at least 4 chromosomes that are also subject to gene drive in pollen from synthetic hybrids. Teosinte Pollen Drive likely played a major role in maize domestication and diversification, and offers an explanation for the widespread abundance of "self" small RNAs in the germlines of plants and animals.

7.
Ecol Evol ; 12(8): e9179, 2022 Aug.
Article En | MEDLINE | ID: mdl-36016815

Many plants exchanged in the global redistribution of species in the last 200 years, particularly between South Africa and Australia, have become threatening invasive species in their introduced range. Refining our understanding of the genetic diversity and population structure of native and alien populations, introduction pathways, propagule pressure, naturalization, and initial spread, can transform the effectiveness of management and prevention of further introductions. We used 20,221 single nucleotide polymorphisms to reconstruct the invasion of a coastal shrub, Chrysanthemoides monilifera ssp. rotundata (bitou bush) from South Africa, into eastern Australia (EAU), and Western Australia (WAU). We determined genetic diversity and population structure across the native and introduced ranges and compared hypothesized invasion scenarios using Bayesian modeling. We detected considerable genetic structure in the native range, as well as differentiation between populations in the native and introduced range. Phylogenetic analysis showed the introduced samples to be most closely related to the southern-most native populations, although Bayesian analysis inferred introduction from a ghost population. We detected strong genetic bottlenecks during the founding of both the EAU and WAU populations. It is likely that the WAU population was introduced from EAU, possibly involving an unsampled ghost population. The number of private alleles and polymorphic SNPs successively decreased from South Africa to EAU to WAU, although heterozygosity remained high. That bitou bush remains an invasion threat in EAU, despite reduced genetic diversity, provides a cautionary biosecurity message regarding the risk of introduction of potentially invasive species via shipping routes.

8.
Plant Genome ; 15(2): e20204, 2022 06.
Article En | MEDLINE | ID: mdl-35416423

Alignments of multiple genomes are a cornerstone of comparative genomics, but generating these alignments remains technically challenging and often impractical. We developed the msa_pipeline workflow (https://bitbucket.org/bucklerlab/msa_pipeline) to allow practical and sensitive multiple alignment of diverged plant genomes and calculation of conservation scores with minimal user inputs. As high repeat content and genomic divergence are substantial challenges in plant genome alignment, we also explored the effect of different masking approaches and parameters of the LAST aligner using genome assemblies of 33 grass species. Compared with conventional masking with RepeatMasker, a masking approach based on k-mers (nucleotide sequences of k length) increased the alignment rate of coding sequence and noncoding functional regions by 25 and 14%, respectively. We further found that default alignment parameters generally perform well, but parameter tuning can increase the alignment rate for noncoding functional regions by over 52% compared with default LAST settings. Finally, by increasing alignment sensitivity from the default baseline, parameter tuning can increase the number of noncoding sites that can be scored for conservation by over 76%. Overall, tuning of masking and alignment parameters can generate optimized multiple alignments to drive biological discovery in plants.


Genome, Plant , Genomics , Base Sequence , Workflow
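
The k-mer-based masking mentioned in the abstract above can be illustrated with a minimal sketch. This is not the msa_pipeline implementation: the function name `kmer_soft_mask`, the parameters `k` and `min_count`, and the toy sequence are all hypothetical, and real genome masking would use far longer k-mers and a dedicated counter. The sketch only shows the idea of soft-masking (lowercasing) bases covered by k-mers that occur many times.

```python
from collections import Counter


def kmer_soft_mask(seq: str, k: int = 4, min_count: int = 3) -> str:
    """Lowercase every base covered by a k-mer seen at least min_count times.

    Hypothetical illustration only; k and min_count are toy values.
    """
    seq = seq.upper()
    n_kmers = len(seq) - k + 1
    # Count how often each k-mer occurs in the sequence.
    counts = Counter(seq[i:i + k] for i in range(n_kmers))
    masked = [False] * len(seq)
    for i in range(n_kmers):
        if counts[seq[i:i + k]] >= min_count:
            # Mark all k positions spanned by this high-frequency k-mer.
            for j in range(i, i + k):
                masked[j] = True
    return "".join(b.lower() if m else b for b, m in zip(seq, masked))


if __name__ == "__main__":
    print(kmer_soft_mask("ACGTACGTACGTTTGCA"))
```

Soft-masking keeps the bases in the sequence, so an aligner can still extend through a masked region while avoiding seeding alignments in it.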
9.
Mol Plant Pathol ; 23(5): 733-748, 2022 05.
Article En | MEDLINE | ID: mdl-35239989

Brassica napus (oilseed rape, canola) seedling resistance to Leptosphaeria maculans, the causal agent of blackleg (stem canker) disease, follows a gene-for-gene relationship. The avirulence genes AvrLmS and AvrLep2 were described to be perceived by the resistance genes RlmS and LepR2, respectively, present in B. napus 'Surpass 400'. Here we report cloning of AvrLmS and AvrLep2 using two independent methods. AvrLmS was cloned using combined in vitro crossing between avirulent and virulent isolates with sequencing of DNA bulks from avirulent or virulent progeny (bulked segregant sequencing). AvrLep2 was cloned using a biparental cross of avirulent and virulent L. maculans isolates and a classical map-based cloning approach. Taking these two approaches independently, we found that AvrLmS and AvrLep2 are the same gene. Complementation of virulent isolates with this gene confirmed its role in inducing resistance on Surpass 400, Topas-LepR2, and an RlmS-line. The gene, renamed AvrLmS-Lep2, encodes a small cysteine-rich protein of unknown function with an N-terminal secretory signal peptide, which is a common feature of the majority of effectors from extracellular fungal plant pathogens. The AvrLmS-Lep2/LepR2 interaction phenotype was found to vary from a typical hypersensitive response through intermediate resistance sometimes towards susceptibility, depending on the inoculation conditions. AvrLmS-Lep2 was nevertheless sufficient to significantly slow the systemic growth of the pathogen and reduce the stem lesion size on plant genotypes with LepR2, indicating the potential efficiency of this resistance to control the disease in the field.


Ascomycota , Brassica napus , Ascomycota/genetics , Brassica napus/genetics , Brassica napus/microbiology , Cloning, Molecular , Leptosphaeria , Plant Diseases/microbiology
12.
Plant Cell ; 33(11): 3454-3469, 2021 11 04.
Article En | MEDLINE | ID: mdl-34375428

In nature, single-strand breaks (SSBs) in DNA occur more frequently (by orders of magnitude) than double-strand breaks (DSBs). SSBs induced by the CRISPR/Cas9 nickase at a distance of 50-100 bp on opposite strands are highly mutagenic, leading to insertions/deletions (InDels), with insertions mainly occurring as direct tandem duplications. As short tandem repeats are overrepresented in plant genomes, this mechanism seems to be important for genome evolution. We investigated the distance at which paired 5'-overhanging SSBs are mutagenic and which DNA repair pathways are essential for insertion formation in Arabidopsis thaliana. We were able to detect InDel formation up to a distance of 250 bp, although with much reduced efficiency. Surprisingly, the loss of the classical nonhomologous end joining (NHEJ) pathway factors KU70 or DNA ligase 4 completely abolished tandem repeat formation. The microhomology-mediated NHEJ factor POLQ was required only for patch-like insertions, which are well-known from DSB repair as templated insertions from ectopic sites. As SSBs can also be repaired using homology, we furthermore asked whether the classical homologous recombination (HR) pathway is involved in this process in plants. The fact that RAD54 is not required for homology-mediated SSB repair demonstrates that the mechanisms for DSB- and SSB-induced HR differ in plants.


Arabidopsis/genetics , DNA Breaks, Single-Stranded , DNA Repair , DNA, Plant/genetics , Genome, Plant , DNA, Plant/chemistry
13.
Plant Biotechnol J ; 19(12): 2488-2500, 2021 12.
Article En | MEDLINE | ID: mdl-34310022

Plant genomes demonstrate significant presence/absence variation (PAV) within a species; however, the factors that lead to this variation have not been studied systematically in Brassica across diploids and polyploids. Here, we developed pangenomes of polyploid Brassica napus and its two diploid progenitor genomes B. rapa and B. oleracea to infer how PAV may differ between diploids and polyploids. Modelling of gene loss suggests that loss propensity is primarily associated with transposable elements in the diploids while in B. napus, gene loss propensity is associated with homoeologous recombination. We use these results to gain insights into the different causes of gene loss, both in diploids and following polyploidization, and pave the way for the application of machine learning methods to understanding the underlying biological and physical causes of gene presence/absence.


Brassica napus , Brassica , Brassica/genetics , Brassica napus/genetics , Diploidy , Genome, Plant/genetics , Polyploidy
14.
Mol Biol Evol ; 38(11): 5066-5081, 2021 10 27.
Article En | MEDLINE | ID: mdl-34329477

Domestication and breeding have reshaped the genomic architecture of chicken, but the retention and loss of genomic elements during these evolutionary processes remain unclear. We present the first chicken pan-genome constructed using 664 individuals, which identified approximately 66.5 Mb of additional sequence absent from the reference genome (GRCg6a). The constructed pan-genome encoded 20,491 predicted protein-coding genes, with higher expression levels observed in conserved genes than in dispensable genes. Presence/absence variation (PAV) analyses demonstrated that gene PAV in chicken was shaped by selection, genetic drift, and hybridization. PAV-based genome-wide association studies identified numerous candidate mutations related to growth, carcass composition, meat quality, or physiological traits. Among them, a deletion in the promoter region of IGF2BP1 affecting chicken body size is reported, which is supported by functional studies and extra samples. This is the first report of the causal variant for the repeatedly reported chicken body size quantitative trait locus on chromosome 27. Therefore, the chicken pan-genome is a useful resource for biological discovery and breeding. It improves our understanding of chicken genome diversity and provides materials to unveil the evolutionary history of chicken domestication.


Chickens , Genome-Wide Association Study , Animals , Body Size/genetics , Chickens/genetics , Polymorphism, Single Nucleotide , Promoter Regions, Genetic , Quantitative Trait Loci
15.
Mol Ecol ; 30(15): 3730-3746, 2021 08.
Article En | MEDLINE | ID: mdl-34018645

Climate change is increasingly impacting ecosystems globally. Understanding adaptive genetic diversity and whether it will keep pace with projected climatic change is necessary to assess species' vulnerability and design efficient mitigation strategies such as assisted adaptation. Kelp forests are the foundations of temperate reefs globally but are declining in many regions due to climate stress. A lack of knowledge of kelp's adaptive genetic diversity hinders assessment of vulnerability under extant and future climates. Using 4245 single nucleotide polymorphisms (SNPs), we characterized patterns of neutral and putative adaptive genetic diversity for the dominant kelp in the southern hemisphere (Ecklonia radiata) from ~1000 km of coastline off Western Australia. Strong population structure and isolation-by-distance was underpinned by significant signatures of selection related to temperature and light. Gradient forest analysis of temperature-linked SNPs under selection revealed a strong association with mean annual temperature range, suggesting adaptation to local thermal environments. Critically, modelling revealed that predicted climate-mediated temperature changes will probably result in high genomic vulnerability via a mismatch between current and future predicted genotype-environment relationships such that kelp forests off Western Australia will need to significantly adapt to keep pace with projected climate change. Proactive management techniques such as assisted adaptation to boost resilience may be required to secure the future of these kelp forests and the immense ecological and economic values they support.


Kelp , Climate Change , Ecosystem , Forests , Genotype , Kelp/genetics
16.
medRxiv ; 2021 Mar 31.
Article En | MEDLINE | ID: mdl-33821285

The innate and adaptive immune responses are regulated by biological clocks, and circulating lymphocytes are lowest at sunrise. Accordingly, severity of disease in mouse models is highly dependent on the time of day of viral infection. Here, we explore whether circadian immunity contributes significantly to seasonality of respiratory viruses, including influenza and SARS-CoV-2. Susceptibility-Infection-Recovery-Susceptibility (SIRS) models of influenza and SIRS-derived models of COVID-19 suggest that local sunrise time is a better predictor of the basic reproductive number (R0) than climate, even when day length is taken into account. Moreover, these models predict a window of susceptibility when local sunrise time corresponds to the morning commute and contact rate is expected to be high. Counterfactual modeling suggests that retaining daylight saving time in the fall would reduce the length of this window, and substantially reduce seasonal waves of respiratory infections.
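
For readers unfamiliar with the SIRS framework named in the abstract above, a generic textbook version can be sketched as follows. This is not the authors' model: it contains no sunrise-time or climate term, and the function names, parameter names (`beta`, `gamma`, `xi`), and values are assumptions chosen for illustration. It integrates the standard equations with a simple Euler step and uses the usual relation R0 = beta / gamma.

```python
def simulate_sirs(beta, gamma, xi, s0=0.99, i0=0.01, days=365, dt=0.1):
    """Euler integration of a basic SIRS model on population fractions.

    beta:  transmission rate per day (assumed value supplied by the caller)
    gamma: recovery rate per day
    xi:    rate at which immunity wanes (R returns to S) per day
    """
    s, i, r = s0, i0, 1.0 - s0 - i0
    trajectory = [(s, i, r)]
    for _ in range(int(days / dt)):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        new_susceptibles = xi * r * dt
        s += new_susceptibles - new_infections
        i += new_infections - new_recoveries
        r += new_recoveries - new_susceptibles
        trajectory.append((s, i, r))
    return trajectory


def basic_reproductive_number(beta, gamma):
    """R0 for the basic SIRS model: expected secondary cases per infection."""
    return beta / gamma


if __name__ == "__main__":
    final_s, final_i, final_r = simulate_sirs(0.4, 0.2, 0.01)[-1]
    print(f"R0 = {basic_reproductive_number(0.4, 0.2):.2f}, final I = {final_i:.4f}")
```

In a model of this kind, a predictor such as local sunrise time would enter by making `beta` vary over the year; that coupling is left out of the sketch.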

17.
BMC Plant Biol ; 20(1): 546, 2020 Dec 07.
Article En | MEDLINE | ID: mdl-33287721

BACKGROUND: Brassica napus is an important oilseed crop cultivated worldwide. During domestication and breeding of B. napus, flowering time has been a target of selection because of its substantial impact on yield. Here we use double digest restriction-site associated DNA sequencing (ddRAD) to investigate the genetic basis of flowering in B. napus. An F2 mapping population was derived from a cross between an early-flowering spring type and a late-flowering winter type. RESULTS: Flowering time in the mapping population differed by up to 25 days between individuals. High genotype error rates persisted after initial quality controls, as suggested by a genotype discordance of ~12% between biological sequencing replicates. After genotype error correction, a linkage map spanning 3981.31 cM and comprising 14,630 single nucleotide polymorphisms (SNPs) was constructed. A quantitative trait locus (QTL) on chromosome C2 was detected, covering eight flowering time genes including FLC. CONCLUSIONS: These findings demonstrate the effectiveness of the ddRAD approach to sample the B. napus genome. Our results also suggest that ddRAD genotype error rates can be higher than expected in F2 populations. Quality filtering and genotype correction and imputation can substantially reduce these error rates and allow effective linkage mapping and QTL analysis.


Brassica napus/genetics , Chromosome Mapping/methods , Flowers/genetics , Quantitative Trait Loci/genetics , Sequence Analysis, DNA/methods , Alleles , Binding Sites/genetics , Brassica napus/growth & development , Chromosomes, Plant/genetics , DNA Restriction Enzymes/metabolism , Flowers/growth & development , Genes, Plant/genetics , Genome, Plant/genetics , Genotype , Phenotype , Polymorphism, Single Nucleotide , Time Factors
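
The genotype discordance between sequencing replicates reported in the abstract above can be computed along the following lines. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function name, the genotype encoding ("AA", "AB", "BB"), and the use of `None` for missing calls are all hypothetical, and the example data are invented.

```python
def genotype_discordance(rep_a, rep_b, missing=None):
    """Fraction of sites with differing calls between two replicates.

    Sites where either replicate has a missing call are excluded from
    the comparison. Genotype encoding is an assumption of this sketch.
    """
    compared = [
        (a, b) for a, b in zip(rep_a, rep_b)
        if a != missing and b != missing
    ]
    if not compared:
        raise ValueError("no sites with calls in both replicates")
    mismatches = sum(1 for a, b in compared if a != b)
    return mismatches / len(compared)


if __name__ == "__main__":
    replicate_1 = ["AA", "AB", "BB", None, "AB"]
    replicate_2 = ["AA", "BB", "BB", "AA", "AB"]
    print(genotype_discordance(replicate_1, replicate_2))
```

Excluding sites with a missing call keeps missing data from being counted as either agreement or disagreement.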
18.
Nat Plants ; 6(11): 1389, 2020 Nov.
Article En | MEDLINE | ID: mdl-33139862

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

19.
Sci Rep ; 10(1): 14123, 2020 08 24.
Article En | MEDLINE | ID: mdl-32839508

Understanding the biogeographical and diversification processes explaining current diversity patterns of subcosmopolitan-distributed groups is challenging. We aimed at disentangling the historical biogeography of the subcosmopolitan liverwort genus Lejeunea with estimation of ancestral areas of origin and testing if sexual system and palaeotemperature variations can be factors of diversification. We assembled a dense taxon sampling for 120 species sampled throughout the geographical distribution of the genus. Lejeunea diverged from its sister group after the Paleocene-Eocene boundary (52.2 Ma, 95% credibility intervals 50.1-54.2 Ma), and the initial diversification of the crown group occurred in the early to middle Eocene (44.5 Ma, 95% credibility intervals 38.5-50.8 Ma). The DEC model indicated that (1) Lejeunea likely originated in an area composed of the Neotropics and the Nearctic, (2) dispersals through terrestrial land bridges in the late Oligocene and Miocene allowed Lejeunea to colonize the Old World, (3) the Boreotropical forest covering the northern regions until the late Eocene did not facilitate Lejeunea dispersals, and (4) a single long-distance dispersal event was inferred between the Neotropics and Africa. Biogeographical and diversification analyses show the Miocene was an important period when Lejeunea diversified globally. We found slight support for higher diversification rates of species with both male and female reproductive organs on the same individual (monoicy), and a moderate positive influence of palaeotemperatures on diversification. Our study shows that an ancient origin associated with a dispersal history facilitated by terrestrial land bridges, rather than long-distance dispersals, is likely to explain the subcosmopolitan distribution of Lejeunea. By enhancing the diversification rates, monoicy likely favoured the colonisations of new areas, especially in the Miocene, which was a key epoch shaping the worldwide distribution.


Hepatophyta/classification , Hepatophyta/growth & development , Phylogeny , Phylogeography , Biodiversity , Forests , Genetic Speciation , Hepatophyta/genetics , Tropical Climate
20.
Nat Plants ; 6(8): 914-920, 2020 08.
Article En | MEDLINE | ID: mdl-32690893

Recent years have seen a surge in plant genome sequencing projects and the comparison of multiple related individuals. The high degree of genomic variation observed led to the realization that single reference genomes do not represent the diversity within a species, and led to the expansion of the pan-genome concept. Pan-genomes represent the genomic diversity of a species and include core genes, found in all individuals, as well as variable genes, which are absent in some individuals. Variable gene annotations often show similarities across plant species, with genes for biotic and abiotic stress commonly enriched within variable gene groups. Here we review the growth of pan-genomics in plants, explore the origins of gene presence and absence variation, and show how pan-genomes can support plant breeding and evolution studies.


Genome, Plant , Plants/genetics , Genes, Plant/genetics , Genetic Variation/genetics , Genome, Plant/genetics , Reference Values
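
The core/variable distinction defined in the abstract above (core genes are found in all individuals, variable genes are absent in some) maps directly onto a presence/absence table. The sketch below assumes a hypothetical input layout, a dictionary from gene name to per-individual presence flags, with invented gene and individual names; it is an illustration of the definition, not a method taken from the review.

```python
def classify_genes(pav: dict[str, dict[str, bool]]) -> dict[str, str]:
    """Label each gene 'core' if present in every individual, else 'variable'.

    pav maps gene name -> {individual name -> present?}; this layout is
    an assumption made for the example.
    """
    return {
        gene: "core" if all(presence.values()) else "variable"
        for gene, presence in pav.items()
    }


if __name__ == "__main__":
    example = {
        "geneA": {"ind1": True, "ind2": True, "ind3": True},
        "geneB": {"ind1": True, "ind2": False, "ind3": True},
    }
    print(classify_genes(example))
```

Published pan-genome studies sometimes relax the "all individuals" criterion to tolerate assembly or genotyping error; this sketch uses the strict definition.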
...