1 - 20 of 23,274
1.
BMC Med Res Methodol ; 24(1): 105, 2024 May 03.
Article En | MEDLINE | ID: mdl-38702624

BACKGROUND: Survival prediction using high-dimensional molecular data is a hot topic in the field of genomics and precision medicine, especially for cancer studies. Considering that carcinogenesis has a pathway-based pathogenesis, developing models using such group structures is a closer mimic of disease progression and prognosis. Many approaches can be used to integrate group information; however, most of them are single-model methods, which may lead to unstable predictions. METHODS: We introduced a novel survival stacking method that models group structure information to improve the robustness of cancer survival prediction in the context of high-dimensional omics data. With a super learner, survival stacking combines the predictions from multiple sub-models that are independently trained using the features in pre-grouped biological pathways. In addition to a non-negative linear combination of sub-models, we extended the super learner to a non-negative Bayesian hierarchical generalized linear model and an artificial neural network. We compared the proposed modeling strategy with the widely used survival penalized method Lasso Cox and several group penalized methods, e.g., group Lasso Cox, via a simulation study and a real-world data application. RESULTS: The proposed survival stacking method showed superior and robust performance in terms of discrimination compared with single-model methods in the case of high-noise simulated data and real-world data. The non-negative Bayesian stacking method can identify important biological signal pathways and genes that are associated with the prognosis of cancer. CONCLUSIONS: This study proposed a novel survival stacking strategy incorporating biological group information into cancer prognosis models. Additionally, this study extended the super learner to a non-negative Bayesian model and an ANN, enriching the combination of sub-models.
The proposed Bayesian stacking strategy exhibited favorable properties in the prediction and interpretation of complex survival data, which may aid in discovering cancer targets.


Bayes Theorem , Genomics , Neoplasms , Humans , Neoplasms/genetics , Neoplasms/mortality , Genomics/methods , Prognosis , Algorithms , Proportional Hazards Models , Neural Networks, Computer , Survival Analysis , Computational Biology/methods
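
The non-negative linear combination at the core of the stacking step in the abstract above can be illustrated with a small sketch. It assumes a squared-error loss as a stand-in, since the abstract does not state the loss used for survival outcomes. The function names, learning rate and iteration count are hypothetical, and this is not the authors' implementation.

```python
def fit_nonnegative_weights(sub_preds, target, lr=0.01, n_iter=5000):
    """Estimate non-negative stacking weights by projected gradient descent.

    sub_preds: list of rows; row i holds the held-out prediction of every
               sub-model (for example, one model per pathway) for sample i.
    target:    list with the value the stacked prediction should match.
    Squared error is an illustrative stand-in for a survival loss.
    """
    n_samples = len(sub_preds)
    n_models = len(sub_preds[0])
    weights = [1.0 / n_models] * n_models  # start from an equal-weight ensemble
    for _ in range(n_iter):
        residuals = [
            sum(w * p for w, p in zip(weights, row)) - y
            for row, y in zip(sub_preds, target)
        ]
        gradient = [
            2.0 / n_samples * sum(r * row[j] for r, row in zip(residuals, sub_preds))
            for j in range(n_models)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        weights = [max(0.0, w - lr * g) for w, g in zip(weights, gradient)]
    return weights


def stacked_prediction(weights, row):
    """Combine one sample's sub-model predictions with the fitted weights."""
    return sum(w * p for w, p in zip(weights, row))
```

A fixed step size such as `lr=0.01` is only safe when the sub-model predictions are on a modest scale, so standardising them first is a reasonable precaution. A sub-model whose weight is driven to zero simply drops out of the ensemble.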
2.
Genes Chromosomes Cancer ; 63(5): e23238, 2024 May.
Article En | MEDLINE | ID: mdl-38722224

Pleomorphic rhabdomyosarcoma (PRMS) is a rare and highly aggressive sarcoma, occurring mostly in the deep soft tissues of middle-aged adults and showing a variable degree of skeletal muscle differentiation. The diagnosis is challenging as pathologic features overlap with embryonal rhabdomyosarcoma (ERMS), malignant Triton tumor, and other pleomorphic sarcomas. As recurrent genetic alterations underlying PRMS have not been described to date, ancillary molecular diagnostic testing is not useful in subclassification. Herein, we perform genomic profiling of a well-characterized cohort of 14 PRMS, compared to a control group of 23 ERMS and other pleomorphic sarcomas (undifferentiated pleomorphic sarcoma and pleomorphic liposarcoma) using clinically validated DNA-targeted Next generation sequencing (NGS) panels (MSK-IMPACT). The PRMS cohort included eight males and six females, with a median age of 53 years (range 31-76 years). Despite similar tumor mutation burdens, the genomic landscape of PRMS, with a high frequency of TP53 (79%) and RB1 (43%) alterations, stood in stark contrast to ERMS, with 4% and 0%, respectively. CDKN2A deletions were more common in PRMS (43%), compared to ERMS (13%). In contrast, ERMS harbored somatic driver mutations in the RAS pathway and loss of function mutations in BCOR, which were absent in PRMS. Copy number variations in PRMS showed multiple chromosomal arm-level changes, most commonly gains of chr17p and chr22q and loss of chr6q. Notably, gain of chr8, commonly seen in ERMS (61%) was conspicuously absent in PRMS. The genomic profiles of other pleomorphic sarcomas were overall analogous to PRMS, showing shared alterations in TP53, RB1, and CDKN2A. Overall survival and progression-free survival of PRMS were significantly worse (p < 0.0005) than that of ERMS. Our findings revealed that the molecular landscape of PRMS aligns with other adult pleomorphic sarcomas and is distinct from that of ERMS. 
Thus, NGS assays may be applied in select challenging cases toward a refined classification. Finally, our data corroborate the inclusion of PRMS in the therapeutic bracket of pleomorphic sarcomas, given that their clinical outcomes are comparable.


Rhabdomyosarcoma, Embryonal , Humans , Male , Female , Adult , Middle Aged , Aged , Rhabdomyosarcoma, Embryonal/genetics , Rhabdomyosarcoma, Embryonal/pathology , Rhabdomyosarcoma/genetics , Rhabdomyosarcoma/pathology , Rhabdomyosarcoma/classification , Mutation , High-Throughput Nucleotide Sequencing/methods , Genomics/methods , Biomarkers, Tumor/genetics , Retinoblastoma Binding Proteins/genetics , Ubiquitin-Protein Ligases
3.
PLoS One ; 19(5): e0302646, 2024.
Article En | MEDLINE | ID: mdl-38709766

The analysis of the DNA entrapped in ancient shells of molluscs has the potential to shed light on the evolution and ecology of this very diverse phylum. Ancient genomics could help reconstruct the responses of molluscs to past climate change, pollution, and human subsistence practices at unprecedented temporal resolutions. Applications are however still in their infancy, partly due to our limited knowledge of DNA preservation in calcium carbonate shells and the need for optimized methods for responsible genomic data generation. To improve ancient shell genomic analyses, we applied high-throughput DNA sequencing to 27 Mytilus mussel shells dated to ~111-6500 years Before Present, and investigated the impact, on DNA recovery, of shell imaging, DNA extraction protocols and shell sub-sampling strategies. First, we detected no quantitative or qualitative deleterious effect of micro-computed tomography for recording shell 3D morphological information prior to sub-sampling. Then, we showed that double-digestion and bleach treatment of shell powder prior to silica-based DNA extraction improves shell DNA recovery, also suggesting that DNA is protected in preservation niches within ancient shells. Finally, all layers that compose Mytilus shells, i.e., the nacreous (aragonite) and prismatic (calcite) carbonate layers, with or without the outer organic layer (periostracum) proved to be valuable DNA reservoirs, with aragonite appearing as the best substrate for genomic analyses. Our work contributes to the understanding of long-term molecular preservation in biominerals and we anticipate that resulting recommendations will be helpful for future efficient and responsible genomic analyses of ancient mollusc shells.


Animal Shells , Genomics , Mollusca , Animals , Genomics/methods , Mollusca/genetics , X-Ray Microtomography , Calcium Carbonate , High-Throughput Nucleotide Sequencing , Fossils
4.
Curr Protoc ; 4(5): e1046, 2024 May.
Article En | MEDLINE | ID: mdl-38717471

Whole-genome sequencing is widely used to investigate population genomic variation in organisms of interest. Assorted tools have been independently developed to call variants from short-read sequencing data aligned to a reference genome, including single nucleotide polymorphisms (SNPs) and structural variations (SVs). We developed SNP-SVant, an integrated, flexible, and computationally efficient bioinformatic workflow that predicts high-confidence SNPs and SVs in organisms without benchmarked variants, which are traditionally used for distinguishing sequencing errors from real variants. In the absence of these benchmarked datasets, we leverage multiple rounds of statistical recalibration to increase the precision of variant prediction. The SNP-SVant workflow is flexible, with user options to tradeoff accuracy for sensitivity. The workflow predicts SNPs and small insertions and deletions using the Genome Analysis ToolKit (GATK) and predicts SVs using the Genome Rearrangement IDentification Software Suite (GRIDSS), and it culminates in variant annotation using custom scripts. A key utility of SNP-SVant is its scalability. Variant calling is a computationally expensive procedure, and thus, SNP-SVant uses a workflow management system with intermediary checkpoint steps to ensure efficient use of resources by minimizing redundant computations and omitting steps where dependent files are available. SNP-SVant also provides metrics to assess the quality of called variants and converts between VCF and aligned FASTA format outputs to ensure compatibility with downstream tools to calculate selection statistics, which are commonplace in population genomics studies. By accounting for both small and large structural variants, users of this workflow can obtain a wide-ranging view of genomic alterations in an organism of interest. 
Overall, this workflow advances our capabilities in assessing the functional consequences of different types of genomic alterations, ultimately improving our ability to associate genotypes with phenotypes. © 2024 The Authors. Current Protocols published by Wiley Periodicals LLC. Basic Protocol: Predicting single nucleotide polymorphisms and structural variations. Support Protocol 1: Downloading publicly available sequencing data. Support Protocol 2: Visualizing variant loci using Integrated Genome Viewer. Support Protocol 3: Converting between VCF and aligned FASTA formats.


Polymorphism, Single Nucleotide , Software , Workflow , Polymorphism, Single Nucleotide/genetics , Computational Biology/methods , Genomics/methods , Molecular Sequence Annotation/methods , Whole Genome Sequencing/methods
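
The checkpointing idea mentioned in the abstract above (omitting steps whose output files already exist) can be sketched in a few lines. The abstract does not name the workflow management system, so `run_step`, `demo` and the file names here are hypothetical illustrations rather than SNP-SVant code.

```python
import tempfile
from pathlib import Path


def run_step(name, output, action, log):
    """Run `action(output)` unless the output file already exists.

    Returns True if the step ran and False if it was skipped.
    """
    output = Path(output)
    if output.exists():
        log.append((name, "skipped"))  # checkpoint hit: reuse the earlier result
        return False
    action(output)
    log.append((name, "ran"))
    return True


def demo():
    """Run a two-step toy pipeline twice in a scratch directory."""
    log = []
    with tempfile.TemporaryDirectory() as workdir:
        aligned = Path(workdir) / "sample.aligned.txt"
        variants = Path(workdir) / "sample.variants.txt"
        steps = [
            ("align", aligned, lambda out: out.write_text("aligned reads\n")),
            ("call", variants, lambda out: out.write_text(aligned.read_text().upper())),
        ]
        for _ in range(2):  # the second pass should skip both steps
            for name, output, action in steps:
                run_step(name, output, action, log)
    return log
```

This sketch only checks whether an output exists. Production workflow managers typically also compare timestamps or input checksums so that a changed input forces a rerun.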
5.
BMC Genomics ; 25(1): 455, 2024 May 08.
Article En | MEDLINE | ID: mdl-38720252

BACKGROUND: Standard ChIP-seq and RNA-seq processing pipelines typically disregard sequencing reads whose origin is ambiguous ("multimappers"). This usual practice has potentially important consequences for the functional interpretation of the data: genomic elements belonging to clusters composed of highly similar members are left unexplored. RESULTS: In particular, disregarding multimappers leads to the underrepresentation in epigenetic studies of recently active transposable elements, such as AluYa5, L1HS and SVAs. Furthermore, this common strategy also has implications for transcriptomic analysis: members of repetitive gene families, such as the ones including major histocompatibility complex (MHC) class I and II genes, are under-quantified. CONCLUSION: Revealing inherent biases that permeate routine tasks such as functional enrichment analysis, our results underscore the urgency of broadly adopting multimapper-aware bioinformatic pipelines (currently restricted to specific contexts or communities) to ensure the reliability of genomic and transcriptomic studies.


High-Throughput Nucleotide Sequencing , Humans , DNA Transposable Elements/genetics , Computational Biology/methods , Gene Expression Profiling/methods , Genomics/methods , Sequence Analysis, RNA/methods
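
The under-quantification described in the abstract above can be made concrete with a toy counter. It contrasts discarding multimappers with splitting each multimapped read equally among its candidate features. Equal splitting is only one of several possible strategies and is an assumption here, not the method evaluated in the article. The read and gene names are made up for illustration.

```python
from collections import defaultdict


def count_reads(alignments, mode="unique"):
    """Count reads per feature.

    alignments: dict mapping a read ID to the list of features it aligns to.
    mode:       "unique" discards reads with more than one candidate feature;
                "fractional" gives each candidate an equal share of the read.
    """
    if mode not in ("unique", "fractional"):
        raise ValueError("mode must be 'unique' or 'fractional'")
    counts = defaultdict(float)
    for features in alignments.values():
        if not features:
            continue  # unaligned read
        if len(features) == 1:
            counts[features[0]] += 1.0
        elif mode == "fractional":
            share = 1.0 / len(features)
            for feature in features:
                counts[feature] += share
        # in "unique" mode a multimapped read contributes nothing
    return dict(counts)


# Hypothetical toy input: two reads are ambiguous between two similar genes.
toy_alignments = {
    "read1": ["GAPDH"],
    "read2": ["HLA-A", "HLA-B"],
    "read3": ["HLA-A", "HLA-B"],
    "read4": ["HLA-A"],
}
```

With this input, the unique-only mode leaves one of the two similar genes with no counts at all, which is the kind of blind spot the abstract describes.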
6.
Sci Rep ; 14(1): 10109, 2024 May 02.
Article En | MEDLINE | ID: mdl-38698002

Phocaeicola dorei and Phocaeicola vulgatus are very common and abundant members of the human gut microbiome and play an important role in the infant gut microbiome. These species are closely related and often confused for one another; yet, their genome comparison, interspecific diversity, and evolutionary relationships have not been studied in detail so far. Here, we perform phylogenetic analysis and comparative genomic analyses of these two Phocaeicola species. We report that P. dorei has a larger genome yet a smaller pan-genome than P. vulgatus. We found that this is likely because P. vulgatus is more plastic than P. dorei, with a larger repertoire of genetic mobile elements and fewer anti-phage defense systems. We also found that P. dorei directly descends from a clade of P. vulgatus, and experienced genome expansion through genetic drift and horizontal gene transfer. Overall, P. dorei and P. vulgatus have very different functional and carbohydrate utilisation profiles, hinting at different ecological strategies, yet they present similar antimicrobial resistance profiles.


Genome, Bacterial , Phylogeny , Humans , Gastrointestinal Microbiome/genetics , Gene Transfer, Horizontal , Evolution, Molecular , Genomics/methods , Bacteroidetes/genetics
7.
PLoS One ; 19(5): e0299588, 2024.
Article En | MEDLINE | ID: mdl-38718091

Corynebacterium glutamicum is a non-pathogenic species of the Corynebacteriaceae family. It has been broadly used in industrial biotechnology for the production of valuable products. Though it is widely accepted at the industrial level, knowledge about the genomic diversity of the strains is limited. Here, we investigated the comparative genomic features of the strains and pan-genomic characteristics. We also observed phylogenetic relationships among the strains based on average nucleotide identity (ANI). We found diversity between strains at the genomic and pan-genomic levels. Less than one-third of the C. glutamicum pan-genome consists of core genes and soft-core genes, whereas a large number of strain-specific genes cover about half of the total pan-genome. In addition, the C. glutamicum pan-genome is open and expanding, which indicates the possible addition of new gene families to the pan-genome. We also investigated the distribution of biosynthetic gene clusters (BGCs) among the strains. We discovered slight variations of BGCs at the strain level. Several BGCs with the potential to express novel bioactive secondary metabolites have been identified. Therefore, by utilizing the characteristic advantages of C. glutamicum, different strains can be potential candidates for natural drug discovery.


Corynebacterium glutamicum , Genetic Variation , Genome, Bacterial , Phylogeny , Corynebacterium glutamicum/genetics , Corynebacterium glutamicum/metabolism , Multigene Family , Genomics/methods
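
The core, soft-core and strain-specific categories used in the abstract above can be derived from a gene presence/absence table, as in the sketch below. The abstract does not give the cut-offs used, so the 95% soft-core threshold and the "accessory" label for everything in between are assumptions, and the function name is hypothetical.

```python
def classify_pangenome(presence, soft_core_fraction=0.95):
    """Assign each gene family to a pan-genome category.

    presence: dict mapping a gene family to the set of strains that carry it.
    Categories (thresholds are illustrative assumptions):
      core            present in every strain
      soft-core       present in at least `soft_core_fraction` of strains
      strain-specific present in exactly one strain
      accessory       everything else
    """
    strains = set().union(*presence.values())
    n_strains = len(strains)
    categories = {}
    for gene, carriers in presence.items():
        n_carriers = len(carriers)
        if n_carriers == n_strains:
            categories[gene] = "core"
        elif n_carriers / n_strains >= soft_core_fraction:
            categories[gene] = "soft-core"
        elif n_carriers == 1:
            categories[gene] = "strain-specific"
        else:
            categories[gene] = "accessory"
    return categories


def category_shares(categories):
    """Fraction of the pan-genome that falls into each category."""
    total = len(categories)
    shares = {}
    for label in categories.values():
        shares[label] = shares.get(label, 0) + 1
    return {label: count / total for label, count in shares.items()}
```

Summing the shares for `core` and `soft-core` and comparing them with the `strain-specific` share reproduces the kind of proportions the abstract reports.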
8.
Sci Rep ; 14(1): 12249, 2024 May 28.
Article En | MEDLINE | ID: mdl-38806503

Members of the family Trichomeriaceae, belonging to the Chaetothyriales order and the Ascomycota phylum, are known for their capability to inhabit hostile environments characterized by extreme temperatures, oligotrophic conditions, drought, or presence of toxic compounds. The genus Knufia encompasses many polyextremophilic species. In this report, the genomic and morphological features of the strain FJI-L2-BK-P2 are presented, which was isolated from the Mars 2020 mission spacecraft assembly facility located at the Jet Propulsion Laboratory in Pasadena, California. The identification is based on sequence alignment for marker genes, multi-locus sequence analysis, and whole genome sequence phylogeny. The morphological features were studied using a diverse range of microscopic techniques (bright field, phase contrast, differential interference contrast and scanning electron microscopy). The phylogenetic marker genes of the strain FJI-L2-BK-P2 exhibited highest similarities with the type strain of Knufia obscura (CBS 148926T) that was isolated from the gas tank of a car in Italy. To validate the species identity, whole genomes of both strains (FJI-L2-BK-P2 and CBS 148926T) were sequenced and annotated, and strain FJI-L2-BK-P2 was confirmed as K. obscura. The morphological analysis and description of the genomic characteristics of K. obscura FJI-L2-BK-P2 may contribute to refining the taxonomy of Knufia species. Key morphological features are reported in this K. obscura strain, resembling microsclerotia and chlamydospore-like propagules. These features are known to be characteristic of black fungi and could potentially facilitate their adaptation to harsh environments.


Ascomycota , Mars , Phylogeny , Spacecraft , Ascomycota/genetics , Ascomycota/classification , Ascomycota/isolation & purification , Genome, Fungal/genetics , Genomics/methods
9.
Sci Adv ; 10(21): eadj6823, 2024 May 24.
Article En | MEDLINE | ID: mdl-38781323

We present a draft genome of the little bush moa (Anomalopteryx didiformis)-one of approximately nine species of extinct flightless birds from Aotearoa, New Zealand-using ancient DNA recovered from a fossil bone from the South Island. We recover a complete mitochondrial genome at 249.9× depth of coverage and almost 900 megabases of a male moa nuclear genome at ~4 to 5× coverage, with sequence contiguity sufficient to identify more than 85% of avian universal single-copy orthologs. We describe a diverse landscape of transposable elements and satellite repeats, estimate a long-term effective population size of ~240,000, identify a diverse suite of olfactory receptor genes and an opsin repertoire with sensitivity in the ultraviolet range, show that the wingless moa phenotype is likely not attributable to gene loss or pseudogenization, and identify potential function-altering coding sequence variants in moa that could be synthesized for future functional assays. This genomic resource should support further studies of avian evolution and morphological divergence.


Birds , Extinction, Biological , Genome , Animals , Birds/genetics , Cell Nucleus/genetics , Phylogeny , Fossils , Genome, Mitochondrial , Flight, Animal , New Zealand , Male , DNA Transposable Elements/genetics , Genomics/methods
10.
Biomolecules ; 14(5), 2024 May 10.
Article En | MEDLINE | ID: mdl-38785975

The understanding of the human genome has been greatly improved by the advent of next-generation sequencing technologies (NGS). Despite the undeniable advantages responsible for their widespread diffusion, these methods have some constraints, mainly related to short read length and the need for PCR amplification. As a consequence, long-read sequencing technologies, called third-generation sequencing (TGS), have been developed, promising to overcome the limitations of NGS. Since the first prototype, TGS chemistries have progressively improved in both read length and base-calling accuracy, while the cost per base has fallen. Based on these premises, TGS is showing its potential in many fields, including the analysis of difficult-to-sequence genomic regions, structural variation detection, RNA expression profiling, DNA methylation study, and metagenomic analyses. Protocol standardization and the development of easy-to-use pipelines for data analysis will enhance TGS use, also opening the way for its routine application in diagnostic contexts.


High-Throughput Nucleotide Sequencing , Humans , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Genome, Human , Metagenomics/methods , DNA Methylation/genetics , Genomics/methods
11.
Yi Chuan ; 46(5): 421-430, 2024 May 20.
Article En | MEDLINE | ID: mdl-38763776

The Inner Mongolia cashmere goat is an excellent livestock breed formed through long-term natural selection and artificial breeding, and is currently a world-class dual-purpose breed producing cashmere and meat. The multi-trait animal model is considered to significantly improve the accuracy of genetic evaluation in livestock and poultry, enabling indirect selection between traits. In this study, the pedigree, genotype, environment, and phenotypic records of early growth traits of Inner Mongolia cashmere goats were used to build a multi-trait animal model. Then three methods, ABLUP, GBLUP, and ssGBLUP, were used to estimate the genetic parameters and genomic breeding values of early growth traits (birth weight, weaning weight, average daily weight gain before weaning, and yearling weight). The accuracy and reliability of the genomic estimated breeding values were further evaluated using five-fold cross-validation. The results showed that the heritability estimated by the three methods was 0.13-0.15 for birth weight, 0.13-0.20 for weaning weight, 0.11-0.14 for daily weight gain before weaning, and 0.09-0.14 for yearling weight, all in the low-to-moderate range. There was a strong positive genetic correlation between weaning weight and daily weight gain before weaning, and between daily weight gain before weaning and yearling weight, with correlation coefficients of 0.77-0.79 and 0.56-0.67, respectively. The same pattern was found in the phenotypic correlations among traits. The accuracy of the breeding values estimated by the ABLUP, GBLUP, and ssGBLUP methods was 0.5047, 0.6694, and 0.7156, respectively, for birth weight; 0.6207, 0.6456, and 0.7254 for weaning weight; 0.6110, 0.6855, and 0.7357 for daily weight gain before weaning; and 0.6209, 0.7155, and 0.7756 for yearling weight.
In summary, the early growth traits of Inner Mongolia cashmere goats have low to moderate heritability, and the speed of genetic improvement is relatively slow. The genetic improvement of other growth traits can be achieved through selection on weaning weight. The ssGBLUP method has the highest accuracy and reliability in estimating genomic breeding values of early growth traits in Inner Mongolia cashmere goats, significantly higher than those of the ABLUP method, indicating that it is the best method for genomic breeding for early growth weight in Inner Mongolia cashmere goats.


Breeding , Goats , Animals , Goats/genetics , Goats/growth & development , Phenotype , Genomics/methods , Female , Male , Birth Weight/genetics , Models, Genetic
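
The five-fold cross-validation used in the abstract above to score breeding-value predictions can be outlined as follows. The abstract does not define "accuracy", so the sketch assumes it is the Pearson correlation between predicted values and held-out phenotypes. The `predict` callback stands in for a BLUP-type model, and all names are hypothetical.

```python
import random
from math import sqrt


def k_fold_indices(n_items, k=5, seed=0):
    """Shuffle item indices and deal them into k disjoint validation folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def cross_validated_accuracy(phenotypes, predict, k=5, seed=0):
    """Average over folds of the prediction/phenotype correlation.

    `predict(train_idx, test_idx)` is a caller-supplied function that returns
    one predicted value per index in `test_idx`, using only `train_idx` records.
    """
    scores = []
    for test_idx in k_fold_indices(len(phenotypes), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(phenotypes)) if i not in held_out]
        predicted = predict(train_idx, test_idx)
        observed = [phenotypes[i] for i in test_idx]
        scores.append(pearson(predicted, observed))
    return sum(scores) / len(scores)
```

Masking the phenotypes of one fold at a time keeps every animal in exactly one validation set, so methods such as ABLUP, GBLUP and ssGBLUP can be compared on the same partitions by reusing the same seed.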
12.
Cell Rep Med ; 5(5): 101565, 2024 May 21.
Article En | MEDLINE | ID: mdl-38776875

CML is readily treatable with tyrosine kinase inhibitors (TKIs); however, resistance occurs, with the disease curable in only ∼15%-20% of patients. Using integrated functional genomics, Adnan Awad et al.1 identify agents effective against CML stem cells and describe mechanisms underlying TKI resistance.


Drug Resistance, Neoplasm , Genomics , Leukemia, Myelogenous, Chronic, BCR-ABL Positive , Protein Kinase Inhibitors , Humans , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/genetics , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/drug therapy , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/pathology , Protein Kinase Inhibitors/therapeutic use , Protein Kinase Inhibitors/pharmacology , Drug Resistance, Neoplasm/genetics , Drug Resistance, Neoplasm/drug effects , Genomics/methods
13.
Physiol Plant ; 176(3): e14349, 2024.
Article En | MEDLINE | ID: mdl-38783512

Millets, comprising a diverse group of small-seeded grains, have emerged as vital crops with immense nutritional, environmental, and economic significance. The comprehension of complex traits in millets, influenced by multifaceted genetic determinants, presents a compelling challenge and opportunity in agricultural research. This review delves into the transformative roles of phenomics and genomics in deciphering these intricate genetic architectures. On the phenomics front, high-throughput platforms generate rich datasets on plant morphology, physiology, and performance in diverse environments. This data, coupled with field trials and controlled conditions, helps to interpret how the environment interacts with genetics. Genomics provides the underlying blueprint for these complex traits. Genome sequencing and genotyping technologies have illuminated the millet genome landscape, revealing diverse gene pools and evolutionary relationships. Additionally, different omics approaches unveil the intricate information of gene expression, protein function, and metabolite accumulation driving phenotypic expression. This multi-omics approach is crucial for identifying candidate genes and unfolding the intricate pathways governing complex traits. The review highlights the synergy between phenomics and genomics. Genomically informed phenotyping targets specific traits, reducing the breeding size and cost. Conversely, phenomics identifies promising germplasm for genomic analysis, prioritizing variants with superior performance. This dynamic interplay accelerates breeding programs and facilitates the development of climate-smart, nutrient-rich millet varieties and hybrids. In conclusion, this review emphasizes the crucial roles of phenomics and genomics in unlocking the genetic enigma of millets.


Genomics , Millets , Phenomics , Genomics/methods , Millets/genetics , Phenotype , Genome, Plant/genetics , Plant Breeding/methods , Crops, Agricultural/genetics
14.
Microb Genom ; 10(5), 2024 May.
Article En | MEDLINE | ID: mdl-38785221

Wastewater-based surveillance (WBS) is an important epidemiological and public health tool for tracking pathogens across the scale of a building, neighbourhood, city, or region. WBS gained widespread adoption globally during the SARS-CoV-2 pandemic for estimating community infection levels by qPCR. Sequencing pathogen genes or genomes from wastewater adds information about pathogen genetic diversity, which can be used to identify viral lineages (including variants of concern) that are circulating in a local population. Capturing the genetic diversity by WBS sequencing is not trivial, as wastewater samples often contain a diverse mixture of viral lineages with real mutations and sequencing errors, which must be deconvoluted computationally from short sequencing reads. In this study we assess nine different computational tools that have recently been developed to address this challenge. We simulated 100 wastewater sequence samples consisting of SARS-CoV-2 BA.1, BA.2, and Delta lineages, in various mixtures, as well as a Delta-Omicron recombinant and a synthetic 'novel' lineage. Most tools performed well in identifying the true lineages present and estimating their relative abundances and were generally robust to variation in sequencing depth and read length. While many tools identified lineages present down to 1 % frequency, results were more reliable above a 5 % threshold. The presence of an unknown synthetic lineage, which represents an unclassified SARS-CoV-2 lineage, increases the error in relative abundance estimates of other lineages, but the magnitude of this effect was small for most tools. The tools also varied in how they labelled novel synthetic lineages and recombinants. While our simulated dataset represents just one of many possible use cases for these methods, we hope it helps users understand potential sources of error or bias in wastewater sequencing analysis and to appreciate the commonalities and differences across methods.


COVID-19 , Genome, Viral , SARS-CoV-2 , Wastewater , Wastewater/virology , SARS-CoV-2/genetics , SARS-CoV-2/classification , COVID-19/virology , COVID-19/epidemiology , Humans , Computational Biology/methods , Genomics/methods , Wastewater-Based Epidemiological Monitoring , Phylogeny
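
The deconvolution problem in the abstract above, estimating lineage proportions from reads that may be compatible with several lineages, can be illustrated with a generic expectation-maximisation sketch. The abstract does not describe how the nine assessed tools work, so this is not a model of any of them. It ignores sequencing error, and the function names and the 5% reporting cut-off (borrowed from the reliability threshold the abstract mentions) are illustrative assumptions.

```python
def estimate_abundances(read_sets, lineages, n_iter=200):
    """Estimate lineage proportions with expectation-maximisation.

    read_sets: list of sets; each set names the lineages one read is
               compatible with. Reads compatible with no candidate are ignored.
    lineages:  list of candidate lineage names.
    """
    candidates = set(lineages)
    reads = [s & candidates for s in read_sets]
    reads = [s for s in reads if s]
    theta = {name: 1.0 / len(lineages) for name in lineages}
    if not reads:
        return theta  # nothing to learn from; keep the uniform prior
    for _ in range(n_iter):
        totals = {name: 0.0 for name in lineages}
        for compatible in reads:
            norm = sum(theta[name] for name in compatible)
            if norm == 0.0:
                continue
            # E-step: split the read among lineages in proportion to theta.
            for name in compatible:
                totals[name] += theta[name] / norm
        # M-step: new proportions are the normalised expected read counts.
        assigned = sum(totals.values())
        theta = {name: totals[name] / assigned for name in lineages}
    return theta


def report_lineages(theta, min_fraction=0.05):
    """Keep lineages whose estimated proportion reaches the reporting cut-off."""
    return sorted(name for name, value in theta.items() if value >= min_fraction)
```

Ambiguous reads are shared out according to the current estimates, so a lineage supported only by ambiguous evidence does not inflate at the expense of one supported by lineage-specific reads.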
15.
Arch Dermatol Res ; 316(6): 217, 2024 May 24.
Article En | MEDLINE | ID: mdl-38787526

We aim to systematically review the genomics, transcriptomics, epigenetics, proteomics, metabonomics and microbiota of psoriatic arthritis and psoriasis, illustrating the differences between these two diseases, broadening our understanding of their pathogenesis and providing important clues toward valuable biomarkers for earlier diagnosis and treatment. To our knowledge, this is the first study that combines all omics studies from genomics to microbiota, and it may serve as a reference for future studies to identify the key underlying pathways in psoriatic arthritis.


Arthritis, Psoriatic , Genomics , Metabolomics , Proteomics , Psoriasis , Humans , Arthritis, Psoriatic/diagnosis , Arthritis, Psoriatic/immunology , Psoriasis/diagnosis , Psoriasis/immunology , Genomics/methods , Microbiota/immunology , Biomarkers/metabolism , Epigenesis, Genetic , Transcriptome , Multiomics
16.
Nat Commun ; 15(1): 4312, 2024 May 21.
Article En | MEDLINE | ID: mdl-38773118

Genomics-guided methodologies have revolutionized the discovery of natural products. However, a major challenge in the field of genome mining is determining how to selectively extract biosynthetic gene clusters (BGCs) for untapped natural products from numerous available genome sequences. In this study, we developed a fungal genome mining tool that extracts BGCs encoding enzymes that lack a detectable protein domain (i.e., domainless enzymes) and are not recognized as biosynthetic proteins by existing bioinformatic tools. We searched for BGCs encoding a homologue of Pyr4-family terpene cyclases, which are representative examples of apparently domainless enzymes, in approximately 2000 fungal genomes and discovered several BGCs with unique features. The subsequent characterization of selected BGCs led to the discovery of fungal onoceroid triterpenoids and unprecedented onoceroid synthases. Furthermore, in addition to the onoceroids, a previously unreported sesquiterpene hydroquinone, of which the biosynthesis involves a Pyr4-family terpene cyclase, was obtained. Our genome mining tool has broad applicability in fungal genome mining and can serve as a beneficial platform for accessing diverse, unexploited natural products.


Genome, Fungal , Multigene Family , Triterpenes , Triterpenes/metabolism , Triterpenes/chemistry , Fungal Proteins/genetics , Fungal Proteins/metabolism , Genomics/methods , Computational Biology/methods , Phylogeny , Biological Products/metabolism , Biological Products/chemistry , Biosynthetic Pathways/genetics , Data Mining
17.
BMC Genomics ; 25(1): 502, 2024 May 21.
Article En | MEDLINE | ID: mdl-38773367

BACKGROUND: Fusarium zanthoxyli is a destructive pathogen causing stem canker in prickly ash, an ecologically and economically important forest tree. However, the lack of a genome for F. zanthoxyli has hindered research on its interaction with prickly ash and the development of precise control strategies for stem canker. RESULTS: In this study, we sequenced and annotated a relatively high-quality genome of F. zanthoxyli with a size of 43.39 Mb, encoding 11,316 putative genes. Pathogenicity-related factors were predicted, comprising 495 CAZymes, 217 effectors, 156 CYP450s, and 202 enzymes associated with secondary metabolism. In addition, a comparative genomics analysis revealed that Fusarium and Colletotrichum diverged from a shared ancestor approximately 141.1 ~ 88.4 million years ago (MYA). Additionally, a phylogenomic investigation of 12 different phytopathogens within Fusarium indicated that F. zanthoxyli originated approximately 34.6 ~ 26.9 MYA, and events of gene expansion and contraction within them were also unveiled. Finally, conserved domain prediction revealed that among the 59 unique genes, the most enriched domains were PnbA and ULP1. Among the 783 expanded genes, the most enriched domains were PKc_like kinases and those belonging to the APH_ChoK_Like family. CONCLUSION: This study sheds light on the genetic basis of F. zanthoxyli's pathogenicity and evolution, which provides valuable information for future research on its molecular interactions with prickly ash and the development of effective strategies to combat stem canker.


Evolution, Molecular , Fusarium , Genome, Fungal , Genomics , Phylogeny , Plant Diseases , Fusarium/genetics , Fusarium/pathogenicity , Genomics/methods , Plant Diseases/microbiology , Virulence/genetics
18.
Life Sci Alliance ; 7(8), 2024 Aug.
Article En | MEDLINE | ID: mdl-38777370

The B-cell acute lymphoblastic leukemia (ALL) cell line REH, with the t(12;21) ETV6::RUNX1 translocation, is known to have a complex karyotype defined by a series of large-scale chromosomal rearrangements. Taken from a 15-yr-old at relapse, the cell line offers a practical model for the study of pediatric B-ALL. In recent years, short- and long-read DNA and RNA sequencing have emerged as a complement to karyotyping techniques in the resolution of structural variants in an oncological context. Here, we explore the integration of long-read PacBio and Oxford Nanopore whole-genome sequencing, IsoSeq RNA sequencing, and short-read Illumina sequencing to create a detailed genomic and transcriptomic characterization of the REH cell line. Whole-genome sequencing clarified the molecular traits of disrupted ALL-associated genes including CDKN2A, PAX5, BTG1, VPREB1, and TBL1XR1, as well as the glucocorticoid receptor NR3C1. Meanwhile, transcriptome sequencing identified seven fusion genes within the genomic breakpoints. Together, our extensive whole-genome investigation makes high-quality open-source data available to the leukemia genomics community.


Whole Genome Sequencing , Humans , Cell Line, Tumor , Whole Genome Sequencing/methods , High-Throughput Nucleotide Sequencing/methods , Translocation, Genetic/genetics , Oncogene Proteins, Fusion/genetics , Genomics/methods , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics , Transcriptome/genetics , Gene Expression Profiling/methods , Core Binding Factor Alpha 2 Subunit/genetics , Karyotyping/methods , Sequence Analysis, RNA/methods
19.
Sci Rep ; 14(1): 11660, 2024 05 22.
Article En | MEDLINE | ID: mdl-38777847

The presence of Salmonella in dry fermented sausages is a source of recalls and outbreaks. The genomic diversity of 173 Salmonella isolates from the dry fermented sausage production chains (pig carcasses, pork, and sausages) of France and Spain was investigated through their core phylogenomic relationships and accessory genome profiles. Ten different serovars and thirteen sequence type profiles were identified. The most frequent serovar in sausages was the monophasic variant of S. Typhimurium (1,4,[5],12:i:-, 72%), while S. Derby was the most frequent in pig carcasses (51%). Phylogenomic clusters found in the S. 1,4,[5],12:i:-, S. Derby, S. Rissen and S. Typhimurium serovars identified closely related isolates, differing by fewer than 10 alleles and 20 SNPs, indicating Salmonella persistence along the pork production chain. Most of the S. 1,4,[5],12:i:- isolates contained the Salmonella genomic island-4 (SGI-4), Tn21 and an IncFIB plasmid. More than half of the S. Derby strains contained SGI-1 and Tn7. S. 1,4,[5],12:i:- genomes carried the most multidrug resistance genes (91% of the strains), whereas extended-spectrum β-lactamase genes were found in the Typhimurium and Derby serovars. Salmonella monitoring and characterization in pork production chains, especially of the S. 1,4,[5],12:i:- serovar, is of special importance due to its multidrug resistance capacity and persistence in dry fermented sausages.


Food Microbiology , Meat Products , Phylogeny , Salmonella , Meat Products/microbiology , Spain , France , Animals , Salmonella/genetics , Salmonella/isolation & purification , Salmonella/classification , Swine , Fermentation , Genome, Bacterial , Serogroup , Genomics/methods , Genomic Islands/genetics
20.
BMC Genomics ; 25(1): 496, 2024 May 23.
Article En | MEDLINE | ID: mdl-38778305

BACKGROUND: Conducting genome-wide association studies (GWAS) for reproductive traits in Hanwoo cattle, including age at first calving (AFC), calving interval (CI), gestation length (GL), and number of artificial inseminations per conception (NAIPC), is of paramount significance. These analyses provided a thorough exploration of the genetic basis of these traits, facilitating the identification of key markers for targeted trait improvement. Breeders can optimize their selection strategies, leading to more efficient and sustainable breeding programs, by incorporating genetic insights. This impact extends beyond individual traits and contributes to the overall productivity and profitability of the Hanwoo beef cattle industry. Ultimately, GWAS is essential in ensuring the long-term genetic resilience and adaptability of Hanwoo cattle populations. The primary goal of this study was to identify significant single nucleotide polymorphisms (SNPs) or quantitative trait loci (QTLs) associated with the studied reproductive traits and subsequently map the underlying genes that hold promise for trait improvement. RESULTS: A genome-wide association study of reproductive traits identified 68 significant single nucleotide polymorphisms (SNPs) distributed across 29 Bos taurus autosomes (BTA). Among them, BTA14 exhibited the highest number of identified SNPs (25), whereas BTA6, BTA7, BTA8, BTA10, BTA13, BTA17, and BTA20 exhibited 8, 5, 5, 3, 8, 2, and 12 significant SNPs, respectively. Annotation of candidate genes within a 500 kb region surrounding the significant SNPs led to the identification of ten candidate genes relevant to age at first calving. These genes were: FANCG, UNC13B, TESK1, TLN1, and CREB3 on BTA8; FAM110B, UBXN2B, SDCBP, and TOX on BTA14; and MAP3K1 on BTA20. Additionally, APBA3, TCF12, and ZFR2, located on BTA7 and BTA10, were associated with the calving interval; PAX1, SGCD, and HAND1, located on BTA7 and BTA13, were linked to gestation length; and RBM47, UBE2K, and GPX8, located on BTA6 and BTA20, were linked to the number of artificial inseminations per conception in Hanwoo cows. CONCLUSIONS: The findings of this study enhance our knowledge of the genetic factors that influence reproductive traits in Hanwoo cattle populations and provide a foundation for future breeding strategies focused on improving desirable traits in beef cattle. This research offers new evidence and insights into the genetic variants and genome regions associated with reproductive traits and contributes valuable information to guide future efforts in cattle breeding.


Genome-Wide Association Study , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Reproduction , Animals , Cattle/genetics , Cattle/physiology , Reproduction/genetics , Female , Phenotype , Genomics/methods
...