ABSTRACT
Structural variations (SVs) and gene copy number variations (gCNVs) have contributed to crop evolution, domestication, and improvement. Here, we assembled 31 high-quality genomes of genetically diverse rice accessions. Combining these with two existing assemblies, we developed pan-genome-scale genomic resources including a graph-based genome, providing access to rice genomic variations. Specifically, we discovered 171,072 SVs and 25,549 gCNVs and used an Oryza glaberrima assembly to infer the derived states of SVs in the Oryza sativa population. Our analyses of SV formation mechanisms, impacts on gene expression, and distributions among subpopulations illustrate the utility of these resources for understanding how SVs and gCNVs shaped rice environmental adaptation and domestication. Our graph-based genome enabled genome-wide association study (GWAS)-based identification of phenotype-associated genetic variations undetectable when using only SNPs and a single reference assembly. Our work provides rich population-scale resources paired with easy-to-access tools to facilitate rice breeding as well as plant functional genomics and evolutionary biology research.
Subjects
Ecotype , Genetic Variation , Plant Genome , Oryza/genetics , Physiological Adaptation/genetics , Agriculture , Domestication , Gene Expression Profiling , Plant Gene Expression Regulation , Plant Genes , Genomic Structural Variation , Molecular Sequence Annotation , Phenotype
ABSTRACT
Drug resistance to chemotherapeutic agents remains a formidable challenge in cancer treatment, significantly impacting treatment efficacy. Extensive research has exposed the intimate involvement of noncoding RNAs (ncRNAs) in conferring resistance to cancer drugs. Understanding the intricate associations between ncRNAs and drug resistance is of pivotal importance in advancing clinical interventions and expediting drug development. However, traditional biological experimental methods are hampered by limitations, such as labor intensiveness, time consumption, and constraints in scalability. Addressing these challenges necessitates the development of efficient computational methods for the accurate prediction of potential ncRNA-drug resistance associations (NDRA). However, most existing predictive models primarily focus on known ncRNA-drug resistance associations, often neglecting the critical aspect of similarity information between ncRNAs and drug resistance. This oversight may hinder the accuracy of characterizing these associations. To overcome the limitations of existing computational models, we proposed B-NDRA, a computational framework designed for the discovery of drug resistance-related ncRNA. Initially, we constructed a heterogeneous graph that integrates ncRNA-drug resistance pairs, leveraging both known associations and similarity fusion information between ncRNAs and drug resistance. Subsequently, we employed an attention mechanism to aggregate local features of graph nodes following a dimensionality reduction of node features. Further, a graph neural network (GNN) facilitated the learning of global node embeddings. Notably, the integration of dual adaptive deep adjustment architectures, encompassing intrablock and interblock methodologies, enabled efficient extraction of global features while balancing local and global features. Finally, B-NDRA employed a multilayer perceptron to predict associations between ncRNAs and drug resistance. 
Through rigorous 5-fold cross-validation, B-NDRA achieved average AUC, AUPR, Accuracy, Precision, Recall, and F1-score values of 92.2%, 91.9%, 84.88%, 86.9%, 82.37%, and 84.44%, respectively. Furthermore, B-NDRA was compared against the established models GAEMDA, GRPAMDA, and LRGCPND. The results, obtained through three distinct 5-fold cross-validation strategies, demonstrated a notable performance improvement for B-NDRA across almost all metrics. Specific case studies targeting Doxorubicin and Imatinib further validated the practicality of B-NDRA in discovering potential NDRA. These results confirm the potential of B-NDRA as a valuable tool in advancing cancer research and therapeutic development. The source code and data set of B-NDRA can be found at https://github.com/XuanLi1145/B-NDRA.
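To illustrate how the threshold-based metrics reported above relate to one another, the following minimal sketch computes accuracy, precision, recall, and F1-score from the confusion-matrix counts of a single fold. The function name and the counts are hypothetical and are not taken from the B-NDRA study or its repository.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts.

    tp/fp/tn/fn are true/false positives/negatives for one evaluation fold.
    Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for one fold (illustration only, not study data).
metrics = classification_metrics(tp=80, fp=20, tn=90, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

When such metrics are averaged over folds, the mean F1-score need not equal the F1-score computed from the mean precision and mean recall, which is why averaged values can differ slightly from that formula.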
Subjects
Computer Neural Networks , Untranslated RNA , Untranslated RNA/genetics , Humans , Neoplasm Drug Resistance , Computational Biology/methods , Deep Learning
ABSTRACT
Biological introductions provide a natural ecological experiment unfolding in a recent historical timeframe to elucidate how evolutionary processes (such as founder effects, genetic diversity and adaptation) shape the genomic landscape of populations postintroduction. The Asian icefish, Protosalanx chinensis, is an economically important fishery resource that has been deliberately introduced into dozens of provinces across China for decades. However, while invading and disturbing the local ecosystem, many introduced populations declined, disappearing mysteriously in a very short time. The way in which various evolutionary forces integrate to result in invasion failure of an introduced population remains unknown. Here, we performed whole-genome sequencing of 10 species from the Salangidae family and 70 Asian icefish (P. chinensis) individuals from 7 geographic populations in China, aiming to characterize the evolutionary fate of introduced populations. Our results show that compared to other Salangidae species, P. chinensis has low genetic diversity, potentially due to the long-lasting decline in population size. In a recently introduced population, Lugu Lake, severe sampling effects and a strong bottleneck further deteriorated the genomic landscape. Although the introduced population showed signs of reduced genetic load, the efficiency of purging selection was low. Our selective sweep analysis revealed site frequency changes in candidate genes, including gata1a and hoxd4b, which could be associated with a decrease in dissolved oxygen in the deep-water plateau lake. These findings caution against the widespread introduction of P. chinensis in China and lay the groundwork for future use of this economically important species.
Subjects
Ecosystem , Genomics , Humans , Genome , Biological Evolution , Physiological Adaptation
ABSTRACT
Common buckwheat (Fagopyrum esculentum) and Tartary buckwheat (Fagopyrum tataricum), the two most widely cultivated buckwheat species, differ greatly in flavonoid content and reproductive mode. Here, we report the first high-quality, chromosome-level genome assembly of common buckwheat, spanning 1.2 Gb. Comparative genomic analysis revealed that common buckwheat underwent a burst of long terminal repeat retrotransposon insertions accompanied by numerous large chromosome rearrangements after divergence from Tartary buckwheat. Moreover, multiple gene families involved in stress tolerance and flavonoid biosynthesis such as multidrug and toxic compound extrusion (MATE) and chalcone synthase (CHS) underwent significant expansion in buckwheat, especially in common buckwheat. Integrated multi-omics analysis identified high expression of catechin biosynthesis-related genes in flower and seed in common buckwheat and high expression of rutin biosynthesis-related genes in seed in Tartary buckwheat as being important for the differences in flavonoid type and content between these buckwheat species. We also identified a candidate key rutin-degrading enzyme gene (Ft8.2377) that was highly expressed in Tartary buckwheat seed. In addition, we identified a haplotype-resolved candidate locus containing many genes reportedly associated with the development of flower and pollen, which was potentially related to self-incompatibility in common buckwheat. Our study provides important resources facilitating future functional genomics-related research of flavonoid biosynthesis and self-incompatibility in buckwheat.
Subjects
Fagopyrum , Flavonoids , Flavonoids/metabolism , Fagopyrum/genetics , Fagopyrum/metabolism , Rutin/analysis , Rutin/metabolism , Plant Genes , Seeds/genetics
ABSTRACT
Rheum officinale, a member of the Polygonaceae family, is an important medicinal plant that is widely used in traditional Chinese medicine. Here, we report a 7.68-Gb chromosome-scale assembly of R. officinale with a contig N50 of 3.47 Mb, which was clustered into 44 chromosomes across four homologous groups. Comparative genomics analysis revealed that transposable elements have made a significant contribution to its genome evolution, gene copy number variation, and gene regulation and expression, particularly of genes involved in metabolite biosynthesis, stress resistance, and root development. We placed the recent autotetraploidization of R. officinale at ~0.58 mya and analyzed the genomic features of its homologous chromosomes. Although no dominant monoploid genomes were observed at the overall expression level, numerous allele-differentially-expressed genes were identified, mainly with different transposable element insertions in their regulatory regions, suggesting that they functionally diverged after polyploidization. Combining genomics, transcriptomics, and metabolomics, we explored the contributions of gene family amplification and tetraploidization to the abundant anthraquinone production of R. officinale, as well as gene expression patterns and differences in anthraquinone content among tissues. Our report offers unprecedented genomic resources for fundamental research on the autopolyploid herb R. officinale and guidance for polyploid breeding of herbs.
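For readers unfamiliar with the contig N50 statistic quoted above, the following minimal sketch shows the standard definition: the length of the contig at which the cumulative length, counted from the longest contig downward, first reaches half of the total assembly length. The function name and the toy contig lengths are hypothetical and unrelated to the R. officinale assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest; N50 is the length of the
    contig at which the running total first reaches half the assembly size.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths in kb (illustration only, not study data).
toy_contigs = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_contigs)
print(f"N50 = {toy_n50} kb")
```

In this toy case the total is 290 kb, and the two longest contigs (80 + 70 = 150 kb) already cover half of it, so the N50 is 70 kb.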