ABSTRACT
Deoxyribonucleic acid (DNA) methylation plays a key role in gene regulation and is critical for development and human disease. Techniques such as whole-genome bisulfite sequencing (WGBS) and reduced representation bisulfite sequencing (RRBS) allow DNA methylation analysis at the genome scale, with Illumina NovaSeq 6000 and MGI Tech DNBSEQ-T7 being popular due to their efficiency and affordability. However, detailed comparative studies of their performance are not available. In this study, we constructed 60 WGBS and RRBS libraries for two platforms using different types of clinical samples and generated approximately 2.8 terabases of sequencing data. We systematically compared quality control metrics, genomic coverage, CpG methylation levels, intra- and interplatform correlations, and performance in detecting differentially methylated positions. Our results revealed that the DNBSEQ platform exhibited better raw read quality, although base quality recalibration indicated potential overestimation of base quality. The DNBSEQ platform also showed lower sequencing depth and less coverage uniformity in GC-rich regions than did the NovaSeq platform and tended to enrich methylated regions. Overall, both platforms demonstrated robust intra- and interplatform reproducibility for RRBS and WGBS, with NovaSeq performing better for WGBS, highlighting the importance of considering these factors when selecting a platform for bisulfite sequencing.
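As an illustration of what a per-site "CpG methylation level" means in bisulfite sequencing, the following is a minimal sketch and not the study's pipeline. It assumes that each CpG site comes with counts of reads supporting a methylated call and an unmethylated call; the function name, the minimum depth of 5, and the example counts are hypothetical.

```python
from typing import Optional


def methylation_level(methylated: int, unmethylated: int,
                      min_depth: int = 5) -> Optional[float]:
    """Return the fraction of methylated reads at one CpG site.

    Sites covered by fewer than `min_depth` reads (an assumed threshold)
    return None, because the estimate would be too noisy to compare.
    """
    depth = methylated + unmethylated
    if depth < min_depth:
        return None
    return methylated / depth


# Hypothetical per-site counts: (methylated reads, unmethylated reads).
sites = {"site_a": (18, 2), "site_b": (3, 1)}
levels = {name: methylation_level(m, u) for name, (m, u) in sites.items()}
```

Filtering on depth before computing the ratio matters for platform comparisons, since a platform with lower depth in GC-rich regions will contribute fewer usable sites there.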
Subjects
CpG Islands , DNA Methylation , DNA Sequence Analysis , Humans , DNA Sequence Analysis/methods , Human Genome , High-Throughput Nucleotide Sequencing/methods , Sulfites/chemistry , Base Pairing , Whole Genome Sequencing/methods , Reproducibility of Results
ABSTRACT
Implementing a specific cloud resource to analyze extensive genomic data on severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) poses a challenge when resources are limited. To overcome this, we repurposed a cloud platform initially designed for use in research on cancer genomics (https://cgc.sbgenomics.com) to enable its use in research on SARS-CoV-2 to build Cloud Workflow for Viral and Variant Identification (COWID). COWID is a workflow based on the Common Workflow Language that realizes the full potential of sequencing technology for use in reliable SARS-CoV-2 identification and leverages cloud computing to achieve efficient parallelization. COWID outperformed other contemporary methods for identification by offering scalable identification and reliable variant findings with no false-positive results. COWID typically processed each sample of raw sequencing data within 5 min at a cost of only US$0.01. The COWID source code is publicly available (https://github.com/hendrick0403/COWID) and can be accessed on any computer with Internet access. COWID is designed to be user-friendly; it can be implemented without prior programming knowledge. Therefore, COWID is a time-efficient tool that can be used during a pandemic.
Subjects
COVID-19 , Humans , COVID-19/diagnosis , Cloud Computing , SARS-CoV-2/genetics , Workflow , Genomics
ABSTRACT
BACKGROUND: The development of sequence-specific precision treatments, such as CRISPR gene editing therapies for Duchenne muscular dystrophy (DMD), requires sequence-humanized animal models to enable the direct clinical translation of tested strategies. The currently available integrated transgenic mouse model containing the full-length human DMD gene, Tg(DMD)72Thoen/J (hDMDTg), has been found to have two copies of the transgene per locus in a tail-to-tail orientation, which does not accurately simulate the true (single) copy number of the DMD gene. This duplication also complicates analysis when testing CRISPR therapy editing outcomes, as large genetic alterations and rearrangements can occur between the cut sites on the two transgenes. RESULTS: To address this, we performed long-read nanopore sequencing on hDMDTg mice to better understand the structure of the duplicated transgenes. We then performed a megabase-scale deletion of one of the transgenes by CRISPR zygotic microinjection to generate a single-copy, full-length, humanized DMD transgenic mouse model (hDMDTgSc). Functional, molecular, and histological characterisation shows that the single remaining human transgene retains its function and rescues the dystrophic phenotype caused by endogenous murine Dmd knockout. CONCLUSIONS: Our unique hDMDTgSc mouse model simulates the true copy number of the DMD gene and can potentially be used to generate further DMD disease models that would be better suited for the pre-clinical assessment and development of sequence-specific CRISPR therapies.
Subjects
CRISPR-Cas Systems , Animal Disease Models , Transgenic Mice , Duchenne Muscular Dystrophy , Transgenes , Animals , Duchenne Muscular Dystrophy/genetics , Duchenne Muscular Dystrophy/therapy , Mice , Humans , Gene Editing/methods , Dystrophin/genetics , Gene Duplication , Clustered Regularly Interspaced Short Palindromic Repeats/genetics
ABSTRACT
Lyme neuroborreliosis (LNB) is a complex neuroinflammatory disorder caused by Borrelia burgdorferi, which is transmitted through tick bites. Epigenetic alterations, specifically DNA methylation (DNAm), could play a role in the host immune response during infection. In this study, we present the first genome-wide analysis of DNAm in peripheral blood mononuclear cells from patients with and without LNB. Using a network-based approach, we highlighted HLA genes at the core of these DNAm changes, which were found to be enriched in immune-related pathways. These findings shed light on the role of epigenetic modifications in LNB pathogenesis and should be confirmed and expanded upon in future studies.
Subjects
Borrelia burgdorferi , Lyme Neuroborreliosis , Humans , Lyme Neuroborreliosis/genetics , DNA Methylation , Mononuclear Leukocytes , Borrelia burgdorferi/genetics
ABSTRACT
BACKGROUND: DNA metabarcoding applies high-throughput sequencing approaches to generate numerous DNA barcodes from mixed sample pools for mass species identification and community characterisation. To date, however, most metabarcoding studies employ second-generation sequencing platforms like Illumina, which are limited by short read lengths and longer turnaround times. While third-generation platforms such as the MinION (Oxford Nanopore Technologies) can sequence longer reads, even in real time, application of these platforms for metabarcoding has remained limited, possibly due to the relatively high read error rates as well as the paucity of specialised software for processing such reads. RESULTS: We show that this is no longer the case by performing nanopore-based, cytochrome c oxidase subunit I (COI) metabarcoding on 34 zooplankton bulk samples, and benchmarking the results against conventional Illumina MiSeq sequencing. Nanopore R10.3 sequencing chemistry and the super accurate (SUP) basecalling model reduced raw read error rates to ~4%, and consensus calling with amplicon_sorter (without further error correction) generated metabarcodes that were ≤1% erroneous. Although Illumina recovered a higher number of molecular operational taxonomic units (MOTUs) than nanopore sequencing (589 vs. 471), we found no significant differences in the zooplankton communities inferred between the sequencing platforms. Importantly, 406 of 444 (91.4%) shared MOTUs between Illumina and nanopore were also found to be free of indel errors, and 85% of the zooplankton richness could be recovered after just 12-15 h of sequencing. CONCLUSION: Our results demonstrate that nanopore sequencing can generate metabarcodes with Illumina-like accuracy, and this is the first study to show that nanopore metabarcodes are almost always indel-free.
We also show that nanopore metabarcoding is viable for characterising species-rich communities rapidly, and that the same ecological conclusions can be obtained regardless of the sequencing platform used. Collectively, our study inspires confidence in nanopore sequencing and paves the way for greater utilisation of nanopore technology in various metabarcoding applications.
Subjects
Taxonomic DNA Barcoding , High-Throughput Nucleotide Sequencing , Nanopores , Taxonomic DNA Barcoding/methods , Animals , High-Throughput Nucleotide Sequencing/methods , INDEL Mutation , Nanopore Sequencing/methods , Electron Transport Complex IV/genetics , Zooplankton/genetics , Zooplankton/classification , DNA Sequence Analysis/methods
ABSTRACT
BACKGROUND: The order Lepidoptera has an abundance of species, including both agriculturally beneficial and detrimental insects. Molecular data have been used to investigate the phylogenetic relationships of major subdivisions in Lepidoptera, which has enhanced our understanding of the evolutionary relationships at the family and superfamily levels. However, the phylogenetic placement of many superfamilies and/or families in this order is still unknown. In this study, we determine the systematic status of the family Argyresthiidae within Lepidoptera and explore its phylogenetic affinities and implications for the evolution of the order. We describe the first mitochondrial (mt) genome from a member of Argyresthiidae, the apple fruit moth Argyresthia conjugella. The insect is an important pest of apples in Fennoscandia, as it switches hosts when the main host fails to produce crops. RESULTS: The mt genome of A. conjugella contains 16,044 bp and encodes all 37 genes commonly found in insect mt genomes, including 13 protein-coding genes (PCGs), two ribosomal RNAs, 22 transfer RNAs, and a large control region (1101 bp). The nucleotide composition was extremely AT-rich (82%). All detected PCGs (13) began with an ATN codon and terminated with a TAA stop codon, except the start codon in cox1 is ATT. All 22 tRNAs had cloverleaf secondary structures, except trnS1, where one of the dihydrouridine (DHU) arms is missing, reflecting potential differences in gene expression. When compared to the mt genomes of 507 other Lepidoptera representing 18 superfamilies and 42 families, phylogenomic analyses found that A. conjugella had the closest relationship with the family Plutellidae (superfamily Yponomeutoidea). We also detected a sister relationship between Yponomeutoidea and the superfamily Tineidae.
CONCLUSIONS: Our results underline the potential importance of mt genomes in comparative genomic analyses of Lepidoptera species and provide valuable evolutionary insight across the tree of Lepidoptera species.
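The AT-richness figure reported above is a simple base-composition ratio. The sketch below shows one way such a value could be computed from a sequence string; it is an illustration under stated assumptions, not the authors' method. The function name is hypothetical, and ambiguous bases (anything other than A, C, G, T) are assumed to be excluded from the denominator.

```python
def at_content(sequence: str) -> float:
    """Return the fraction of A and T among the unambiguous bases."""
    # Keep only unambiguous bases; N and other IUPAC codes are ignored
    # (an assumption made for this sketch).
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    at_bases = sum(1 for base in counted if base in "AT")
    return at_bases / len(counted)


# Toy sequence, not taken from the A. conjugella genome.
example = at_content("AATTGC")
```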
Subjects
Mitochondrial Genome , Lepidoptera , Malus , Moths , Humans , Animals , Moths/genetics , Malus/genetics , Phylogeny , Fruit , Lepidoptera/genetics , Transfer RNA/genetics , Terminator Codon
ABSTRACT
BACKGROUND: Sequencing variable regions of the 16S rRNA gene (≃300 bp) with Illumina technology is commonly used to study the composition of human microbiota. Unfortunately, short reads are unable to differentiate between highly similar species. Considering that species from the same genus can be associated with health or disease, it is important to identify them at the lowest possible taxonomic rank. Third-generation sequencing platforms, such as PacBio SMRT, increase read lengths, allowing the whole gene to be sequenced with the maximum taxonomic resolution. Despite its potential, full-length 16S rRNA gene sequencing is not yet widely used. The aim of the current study was to compare the sequencing output and taxonomic annotation performance of the two approaches (Illumina short-read sequencing and PacBio long-read sequencing of the 16S rRNA gene) in different human microbiome samples. DNA from saliva, oral biofilms (subgingival plaque) and faeces of 9 volunteers was isolated. Regions V3-V4 and V1-V9 were amplified and sequenced by Illumina MiSeq and by PacBio Sequel II sequencers, respectively. RESULTS: With both platforms, a similar percentage of reads was assigned to the genus level (94.79% and 95.06% respectively) but with PacBio a higher proportion of reads were further assigned to the species level (55.23% vs 74.14%). Regarding overall bacterial composition, samples clustered by niche and not by sequencing platform. In addition, all genera with >0.1% abundance were detected in both platforms for all types of samples. Although some genera such as Streptococcus tended to be observed at higher frequency in PacBio than in Illumina (20.14% vs 14.12% in saliva, 10.63% vs 6.59% in subgingival plaque biofilm samples), none of the differences were statistically significant when correcting for multiple testing. CONCLUSIONS: The results presented in the current manuscript suggest that samples sequenced using Illumina and PacBio are mostly comparable.
Considering that PacBio reads were assigned at the species level with higher accuracy than Illumina, our data support the use of PacBio technology for future microbiome studies, although a higher cost is currently required to obtain an equivalent number of reads per sample.
Subjects
Microbiota , Humans , 16S Ribosomal RNA/genetics , rRNA Genes , Phylogeny , DNA Sequence Analysis/methods , Microbiota/genetics , High-Throughput Nucleotide Sequencing/methods
ABSTRACT
BACKGROUND: Patagonian toothfish (Dissostichus eleginoides) is an economically and ecologically important fish species in the family Nototheniidae. Juveniles occupy progressively deeper waters as they mature and grow, and adults have been caught as deep as 2500 m, living on or just above the southern shelves and slopes around the sub-Antarctic islands of the Southern Ocean. As apex predators, they are a key part of the food web, feeding on a variety of prey, including krill, squid, and other fish. Despite the species' importance, genomic sequence data, which could be used to date the divergence between Patagonian and Antarctic toothfish more accurately, or to establish whether it shares adaptations to temperature with fish living in more polar or equatorial climes, have so far been limited. RESULTS: A high-quality D. eleginoides genome was generated using a combination of Illumina, PacBio and Omni-C sequencing technologies. To aid the genome annotation, the transcriptome derived from a variety of toothfish tissues was also generated using both short- and long-read sequencing methods. The final genome assembly was 797.8 Mb with an N50 scaffold length of 3.5 Mb. Approximately 31.7% of the genome consisted of repetitive elements. A total of 35,543 putative protein-coding regions were identified, of which 50% have been functionally annotated. Transcriptomics analysis showed that approximately 64% of the predicted genes (22,617 genes) were expressed in the tissues sampled. Comparative genomics analysis revealed that the anti-freeze glycoprotein (AFGP) locus of D. eleginoides does not contain any AFGP genes, in contrast to the same locus in the Antarctic toothfish (Dissostichus mawsoni). This is in agreement with previously published results looking at hybridization signals and confirms that Patagonian toothfish do not possess AFGP coding sequences in their genome.
CONCLUSIONS: We have assembled and annotated the Patagonian toothfish genome, which will provide a valuable genetic resource for ecological and evolutionary studies on this and other closely related species.
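The N50 scaffold length quoted above is a standard assembly contiguity statistic: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. The following is a minimal sketch of that definition; the function name and the toy lengths are hypothetical and unrelated to the toothfish assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no scaffold lengths given")
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for non-empty input; kept for type checkers


# Toy example: total length 20, so N50 is reached once 10 bases are covered.
toy_n50 = n50([2, 3, 4, 5, 6])
```

Because N50 depends only on the length distribution, it says nothing about correctness of the assembly, which is why it is usually reported alongside total size and annotation completeness, as in the abstract above.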
Subjects
Perciformes , Animals , Perciformes/genetics , Genomics , Antarctic Regions , Biological Evolution , Antifreeze Proteins
ABSTRACT
BACKGROUND: The Polygonaceae is a family well known for its weeds and for edible plants such as Fagopyrum (buckwheat) and Rheum (rhubarb), which are primarily herbaceous and temperate in distribution. Yet, the family also contains a number of lineages that are principally distributed in the tropics and subtropics. Notably, these lineages are woody, unlike their temperate relatives. To date, full-genome sequencing has focused on the temperate and herbaceous taxa. In an effort to increase the breadth of genetic knowledge of the Polygonaceae, we here present six fully assembled and annotated chloroplast genomes from six of the tropical, woody genera: Coccoloba rugosa (a narrow and endangered Puerto Rican endemic), Gymnopodium floribundum, Neomillspaughia emarginata, Podopterus mexicanus, Ruprechtia coriacea, and Triplaris cumingiana. RESULTS: These assemblies represent the first publicly available assembled and annotated plastomes for the genera Podopterus, Gymnopodium, and Neomillspaughia, and the first assembled and annotated plastomes for the species Coccoloba rugosa, Ruprechtia coriacea, and Triplaris cumingiana. We found the assembled chloroplast genomes to be above the median size of Polygonaceae plastomes, but otherwise to exhibit features typical of the family. The features of greatest sequence variation are found among the ndh genes and in the small single copy (SSC) region of the plastome. The inverted repeats show high GC content and little sequence variation across genera. When placed in a phylogenetic context, our sequences were resolved within the Eriogonoideae. CONCLUSIONS: These six plastomes from among the tropical woody Polygonaceae appear typical within the family. The plastome assembly of Ruprechtia coriacea presented here calls into question the sequence identity of a previously published plastome assembly of R. albida.
Subjects
Chloroplast Genome , Polygonaceae , Polygonaceae/genetics , Polygonaceae/classification , Phylogeny , Molecular Sequence Annotation
ABSTRACT
Whole-genome sequencing has become the method of choice for bacterial outbreak investigation, with most clinical and public health laboratories currently routinely using short-read Illumina sequencing. Recently, long-read Oxford Nanopore Technologies (ONT) sequencing has gained prominence and may offer advantages over short-read sequencing, particularly with the recent introduction of the R10 chemistry, which promises much lower error rates than the R9 chemistry. However, limited information is available on its performance for bacterial single-nucleotide polymorphism (SNP)-based outbreak investigation. We present an open-source workflow, Prokaryotic Awesome variant Calling Utility (PACU) (https://github.com/BioinformaticsPlatformWIV-ISP/PACU), for constructing SNP phylogenies using Illumina and/or ONT R9/R10 sequencing data. The workflow was evaluated using outbreak data sets of Shiga toxin-producing Escherichia coli and Listeria monocytogenes by comparing ONT R9 and R10 with Illumina data. The performance of each sequencing technology was evaluated not only separately but also by integrating samples sequenced by different technologies/chemistries into the same phylogenomic analysis. Additionally, the minimum sequencing time required to obtain accurate phylogenetic results using nanopore sequencing was evaluated. PACU allowed accurate identification of outbreak clusters for both species using all technologies/chemistries, but ONT R9 results deviated slightly more from the Illumina results. ONT R10 results showed trends very similar to Illumina, and we found that integrating data sets sequenced by either Illumina or ONT R10 for different isolates into the same analysis produced stable and highly accurate phylogenomic results. The resulting phylogenies for these two outbreaks stabilized after ~20 hours of sequencing for ONT R9 and ~8 hours for ONT R10. 
This study provides a proof of concept for using ONT R10, either in isolation or in combination with Illumina, for rapid and accurate bacterial SNP-based outbreak investigation.
Subjects
Disease Outbreaks , Single Nucleotide Polymorphism , Humans , Nanopore Sequencing/methods , High-Throughput Nucleotide Sequencing/methods , Phylogeny , Listeria monocytogenes/genetics , Listeria monocytogenes/classification , Listeria monocytogenes/isolation & purification , Whole Genome Sequencing/methods , Bacterial Genome/genetics , Listeriosis/epidemiology , Listeriosis/microbiology , DNA Sequence Analysis/methods , Nanopores , Bacteria/genetics , Bacteria/classification , Bacteria/isolation & purification
ABSTRACT
BACKGROUND: HPV status in a subset of HNSCC is linked with distinct treatment outcomes. The present investigation aims to elucidate the distinct clinicopathological features of HPV-positive and HPV-negative HNSCC and to investigate their association with HNSCC patient survival. MATERIALS AND METHODS: The total RNA of exosomes from HPV-positive (93VU147T) and HPV-negative (OCT-1) HNSCC cells was isolated, and the transcripts were estimated using Illumina HiSeq X. The expression of altered transcripts and their clinical relevance were further analyzed using publicly available cancer transcriptome data from The Cancer Genome Atlas (TCGA). RESULTS: Transcriptomic analyses identified 3785 differentially exported transcripts (DETs) in HPV-positive exosomes compared to HPV-negative exosomes. DETs that regulate the protein machinery, cellular redox potential, and various neurological disorder-related pathways were over-represented in HPV-positive exosomes. The TCGA database revealed the clinical relevance of altered transcripts. Among commonly exported abundant transcripts, SGK1 and MAD1L1 showed high expression, which has been correlated with poor survival in HNSCC patients. In the top 20 DETs of HPV-negative exosomes, high expression of FADS3, SGK3, and TESK2 correlated with poor survival of HNSCC patients in the TCGA database. CONCLUSION: Overall, our study demonstrates that exosomes from HPV-positive and HPV-negative cells carried differential transcript cargo that may be related to pathways associated with neurological disorders. Additionally, the altered transcripts identified have clinical relevance, correlating with patient survival in HNSCC, thereby highlighting their potential as biomarkers and as therapeutic targets.
Subjects
Exosomes , Head and Neck Neoplasms , Squamous Cell Carcinoma of Head and Neck , Humans , Exosomes/metabolism , Exosomes/genetics , Squamous Cell Carcinoma of Head and Neck/genetics , Squamous Cell Carcinoma of Head and Neck/virology , Squamous Cell Carcinoma of Head and Neck/mortality , Squamous Cell Carcinoma of Head and Neck/pathology , Squamous Cell Carcinoma of Head and Neck/metabolism , Head and Neck Neoplasms/genetics , Head and Neck Neoplasms/mortality , Head and Neck Neoplasms/pathology , Head and Neck Neoplasms/virology , Head and Neck Neoplasms/metabolism , Male , Female , Neoplastic Gene Expression Regulation , Gene Expression Profiling , Papillomavirus Infections/virology , Papillomavirus Infections/complications , Papillomavirus Infections/genetics , Tumor Biomarkers/genetics , Tumor Biomarkers/metabolism , Middle Aged , Tumor Cell Line , Transcriptome , Prognosis , Aged
ABSTRACT
BACKGROUND: Assisted reproductive technologies (ART) may perturb DNA methylation (DNAm) in early embryonic development. Although a handful of epigenome-wide association studies of ART have been published, none have investigated CpGs on the X chromosome. To bridge this knowledge gap, we leveraged one of the largest collections of mother-father-newborn trios of ART and non-ART (natural) conceptions to date to investigate sex-specific DNAm differences on the X chromosome. The discovery cohort consisted of 982 ART and 963 non-ART trios from the Norwegian Mother, Father, and Child Cohort Study (MoBa). To verify our results from the MoBa cohort, we used an external cohort of 149 ART and 58 non-ART neonates from the Australian 'Clinical review of the Health of adults conceived following Assisted Reproductive Technologies' (CHART) study. The Illumina EPIC array was used to measure DNAm in both datasets. In the MoBa cohort, we performed a set of X-chromosome-wide association studies ('XWASs' hereafter) to search for sex-specific DNAm differences between ART and non-ART newborns. We tested several models to investigate the influence of various confounders, including parental DNAm. We also searched for differentially methylated regions (DMRs) and regions of co-methylation flanking the most significant CpGs. Additionally, we ran an analogous model to our main model on the external CHART dataset. RESULTS: In the MoBa cohort, we found more differentially methylated CpGs and DMRs in girls than boys. Most of the associations persisted after controlling for parental DNAm and other confounders. Many of the significant CpGs and DMRs were in gene-promoter regions, and several of the genes linked to these CpGs are expressed in tissues relevant for both ART and sex (testis, placenta, and fallopian tube). We found no support for parental DNAm-dependent features as an explanation for the observed associations in the newborns. 
The most significant CpG in the boys-only analysis was in UBE2DNL, which is expressed in testes but with unknown function. The most significant CpGs in the girls-only analysis were in EIF2S3 and AMOT. These three loci also displayed differential DNAm in the CHART cohort. CONCLUSIONS: Genes that co-localized with the significant CpGs and DMRs associated with ART are implicated in several key biological processes (e.g., neurodevelopment) and disorders (e.g., intellectual disability and autism). These connections are particularly compelling in light of previous findings indicating that neurodevelopmental outcomes differ in ART-conceived children compared to those naturally conceived.
Subjects
DNA Methylation , Genetic Epigenesis , Male , Pregnancy , Adult , Child , Female , Humans , Newborn Infant , DNA Methylation/genetics , Cohort Studies , Genome-Wide Association Study , Australia
ABSTRACT
BACKGROUND: Deep learning methods are revolutionizing natural science. In this study, we aim to apply such techniques to develop blood type prediction models based on cheap-to-analyze and easily scalable screening array genotyping platforms. METHODS: Combining existing blood types from blood banks and imputed screening array genotypes for ~111,000 Danish and 1168 Finnish blood donors, we used deep learning techniques to train and validate blood type prediction models for 36 antigens in 15 blood group systems. To account for missing genotypes, a denoising autoencoder was used as an initial step, followed by a convolutional neural network blood type classifier. RESULTS: Two thirds of the trained blood type prediction models demonstrated an F1-accuracy above 99%. Models for antigens with low or high frequencies (e.g., Cw), small training cohorts (e.g., Cob), or very complicated genetic underpinnings (e.g., RhD) proved more challenging for high-accuracy (>99%) deep learning modeling. However, in the Danish cohort only 4 out of 36 models (Cob, Cw, D-weak, Kpa) failed to achieve a prediction F1-accuracy above 97%. This high predictive performance was replicated in the Finnish cohort. DISCUSSION: High accuracy in a variety of blood groups demonstrates the viability of deep learning-based blood type prediction using array chip genotypes, even in blood groups with nontrivial genetic underpinnings. These techniques are suitable for aiding in identifying blood donors with rare blood types by greatly narrowing down the potential pool of candidate donors before clinical-grade confirmation.
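The abstract reports performance as "F1-accuracy". Assuming this refers to the conventional F1 score (the harmonic mean of precision and recall), the sketch below shows how it is computed from confusion-matrix counts for one antigen. This is an illustration only; the function name and the example counts are hypothetical and do not come from the study.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Return the F1 score for a binary (antigen-positive/negative) call.

    Uses the identity F1 = 2*TP / (2*TP + FP + FN), which is equivalent to
    the harmonic mean of precision and recall.
    """
    denominator = 2 * true_pos + false_pos + false_neg
    if denominator == 0:
        # No positives predicted or present; 0.0 is a common convention.
        return 0.0
    return 2 * true_pos / denominator


# Hypothetical counts for a rare antigen: few positives, so a handful of
# errors moves the score noticeably.
rare_antigen_f1 = f1_score(true_pos=90, false_pos=10, false_neg=10)
```

Because true negatives do not enter the formula, F1 is more sensitive than plain accuracy for antigens with very low frequency, which is consistent with the abstract's remark that such antigens were harder to model.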
ABSTRACT
BACKGROUND: The filamentous fungus Penicillium rubens is widely recognized for producing the industrially important antibiotic penicillin at industrial scale. OBJECTIVE: To better understand the genetic blueprint of a wild-type P. rubens strain isolated from India and to identify the genetic/biosynthetic pathways for phenoxymethylpenicillin (penicillin V, PenV) and other secondary metabolites. METHOD: Genomic DNA (gDNA) was isolated, and a library was prepared for the Illumina platform. Whole genome sequencing (WGS) was performed on the Illumina NovaSeq platform. SOAPdenovo was used to assemble the short reads, with validation by the Bowtie-2 and SAMtools packages. Glimmer and GeneMark were used to predict the genes in the genome. Functional annotation of predicted proteins was performed against the NCBI non-redundant (NR), UniProt, Kyoto Encyclopedia of Genes and Genomes (KEGG), and Gene Ontology (GO) databases. Secretome analysis was performed with SignalP 4.1 and TargetP v1.1, and carbohydrate-active enzymes (CAZymes) and protease families were identified with the CAZy database. Comparative genome analysis was performed with Mauve 2.4.0 to assess the genomic correlation between P. rubens BIONCL P45 and Penicillium chrysogenum Wisconsin 54-1255; a phylogeny with known penicillin-producing strains was also constructed with the ParSNP tool. RESULTS: Penicillium rubens strain BIONCL P45 was isolated from India and produces excess PenV. The 31.09 Mb genome was assembled with 95.6% coverage of the reference genome P. chrysogenum Wis 54-1255 and contains 10687 protein-coding genes, of which 3502 had homologs in the NR, UniProt, KEGG, and GO databases. Additionally, 358 CAZymes and 911 transporter-coding genes were found in the genome. The genome contains complete pathways for penicillin, the homogentisate pathway of phenyl acetic acid (PAA) catabolism, Andrastin A, Sorbicillin, Roquefortine C, and Meleagrin.
Comparative genome analysis of BIONCL P45 and Wis 54-1255 revealed 99.89% coverage with 2952 common KEGG orthologous protein-coding genes. Phylogenetic analysis revealed that BIONCL P45 was clustered with Fleming's original isolate P. rubens IMI 15378. CONCLUSION: This genome can be a helpful resource for further research in developing fermentation processes and strain engineering approaches for high titer penicillin production.
Subjects
Fungal Genome , Penicillium , Biosynthetic Pathways , India , Penicillium/classification , Penicillium/genetics , Penicillium/isolation & purification , Penicillium/metabolism , Phylogeny , Whole Genome Sequencing
ABSTRACT
BACKGROUND: Ascomycetous budding yeasts are ubiquitous environmental microorganisms important in food production and medicine. Due to recent intensive genomic research, the taxonomy of yeast is becoming more organized based on the identification of monophyletic taxa. This includes genera important to humans, such as Kazachstania. Until now, Kazachstania humilis (previously Candida humilis) was regarded as a sourdough-specific yeast. In addition, no antibacterial activity has been associated with this species. RESULTS: Previously, we isolated a yeast strain that impaired bio-hydrogen production in a dark fermentation bioreactor and inhibited the growth of Gram-positive and Gram-negative bacteria. Here, using next-generation sequencing technologies, we sequenced the genome of this strain, named K. humilis MAW1. This is the first genome of a K. humilis isolate not originating from a fermented food. We used a novel phylogenetic approach employing the 18S-ITS-D1-D2 region to show the placement of K. humilis MAW1 among other members of the Kazachstania genus. This strain was examined by global phenotypic profiling, including carbon sources utilized and the influence of stress conditions on growth. Using the well-recognized bacterial model Escherichia coli AB1157, we show that K. humilis MAW1 cultivated in an acidic medium inhibits bacterial growth by disturbing cell division, manifested by filament formation. To gain a greater understanding of the inhibitory effect of K. humilis MAW1, we selected 23 yeast proteins with recognized toxic activity against bacteria and used them for BLAST searches of the K. humilis MAW1 genome assembly. The resulting panel of genes present in the K. humilis MAW1 genome included those encoding the 1,3-β-glucan glycosidase and the 1,3-β-glucan synthesis inhibitor that might disturb the bacterial cell envelope structures. CONCLUSIONS: We characterized a non-sourdough-derived strain of K. humilis, including its genome sequence and physiological aspects. MAW1, together with other K. humilis strains, shows a new organization of the mating-type locus. The pH-dependent ability to inhibit bacterial growth revealed here has not been previously recognized in this species. Our study contributes to the building of genome sequence-based classification systems, to a better understanding of K. humilis as a cell factory in fermentation processes, and to the exploration of bacteria-yeast interactions in microbial communities.
Subjects
Anti-Bacterial Agents , Saccharomycetales , Humans , Phylogeny , Anti-Bacterial Agents/metabolism , Gram-Negative Bacteria , Gram-Positive Bacteria , Saccharomycetales/genetics , Yeasts/metabolism , Fermentation
ABSTRACT
Commercial short tandem repeat (STR) kits exclusively contain human-specific primers; however, various non-human organisms with high homology to the STR kit's primer sequences can cause cross-reactivity. Owing to the proprietary nature of the primers in STR kits, the origins and sequences of most non-specific peaks (NSPs) remain unclear. Such NSPs can complicate data interpretation between the casework and reference samples; thus, we developed "NSPlex", an efficient method to discover the biological origins of NSPs. We used leftover STR kit amplicons after capillary electrophoresis and performed advanced bioinformatics analyses using next-generation sequencing followed by BLAST nucleotide searches. Using our method, we successfully identified the NSPs generated from PCR amplicons of a mixture of human DNA and DNA extracted from matcha powder (finely ground green tea leaves, previously known as a potential source of NSPs). Our results show that the method is efficient for NSP analysis and does not require the primer information of commercial STR kits.
Subjects
DNA Fingerprinting , DNA Primers , Electrophoresis, Capillary , High-Throughput Nucleotide Sequencing , Microsatellite Repeats , Polymerase Chain Reaction , Humans , DNA Fingerprinting/methods
ABSTRACT
The study of microalgal communities is critical for understanding aquatic ecosystems. These communities primarily comprise diatoms (Heterokontophyta), and two methods are commonly used to study them: microscopy and metabarcoding. However, these two methods often deliver different results; thus, their suitability for analyzing diatom communities is frequently debated and evaluated. This study used both methods to analyze the diatom communities in identical water samples and compared the results. The taxonomy of the species constituting the diatom communities was confirmed, and both methods showed that species belonging to the orders Bacillariales and Naviculales (class Bacillariophyceae) are the most diverse. At the lower taxonomic levels (family, genus, and species), microscopy tended to show a bias toward detecting diatom species (Nitzschia frustulum, Nitzschia inconspicua, Nitzschia intermedia, Navicula gregaria, Navicula perminuta, Navicula recens, Navicula sp.) belonging to the Bacillariaceae and Naviculaceae families. The results of the two methods differed in identifying diatom species in the communities and analyzing their structural characteristics. These results are consistent with the fact that diatoms belonging to the genera Nitzschia and Navicula are abundant in the communities; furthermore, only the Illumina MiSeq data showed the abundance of the Melosira and Entomoneis genera. The results obtained from microscopy were superior to those of Illumina MiSeq regarding species-level identification. The results obtained via microscopy and Illumina MiSeq show that neither method is perfect and that each has clear strengths and weaknesses. Therefore, to analyze diatom communities effectively and accurately, these two methods should be combined.
Subjects
DNA Barcoding, Taxonomic , Diatoms , Estuaries , Microscopy , Diatoms/classification , Diatoms/growth & development , Microscopy/methods , Republic of Korea , Biodiversity , Phylogeny , Ecosystem
ABSTRACT
PURPOSE: Recently, cases of serious illness in newborns infected with Echovirus 11 have been reported in Europe, including Italy. Here, we report the case of a newborn diagnosed with disseminated Echovirus 11 infection, which occurred in October 2023 in the Province of Bolzano, Italy. METHODS: Molecular screening by real-time RT-PCR was used to analyse cerebrospinal fluid, blood and stool samples, and nasal swabs. The entire viral genome was sequenced using both Illumina and Nanopore technologies. RESULTS: The patient was admitted to hospital due to fever. Molecular testing revealed the presence of enterovirus RNA. Typing confirmed the presence of Echovirus 11. The patient was initially treated with antibiotic therapy and, following the diagnosis of enterovirus infection, also with human immunoglobulins. Over the following days, the patient remained afebrile, with decreasing inflammation indices and in excellent general condition. Genomic and phylogenetic characterization suggested that the strain was similar to strains from severe cases reported in Europe. CONCLUSIONS: Despite the low overall risk for the neonatal population in Europe, recent cases of Echovirus 11 have highlighted the importance of surveillance, and complete genome sequencing is fundamental to understanding the phylogenetic relationships of Echovirus 11 variants.
ABSTRACT
BACKGROUND: Eleusine coracana (L.) Gaertn is a crucial C4 species renowned for its stress robustness and nutritional significance. Because of its adaptability traits, finger millet (ragi) is a storehouse of critical genomic resources for crop improvement. However, more knowledge of this crop's molecular responses to heat stress is needed. METHODS AND RESULTS: In the present study, a comparative RNA sequencing analysis was performed on leaf tissue of heat-sensitive (KJNS-46) and heat-tolerant (PES-110) cultivars of finger millet in response to high temperatures. On average, each sample generated about 24 million reads. Comparison of the transcriptomic profiles identified 684 transcripts that were significantly differentially expressed (differentially expressed genes, DEGs) between the heat-stressed samples of the two genotypes. The heat-induced change in the transcriptome was confirmed by qRT-PCR using a set of randomly selected genes. Pathway analysis and functional annotation analysis revealed the activation of various genes involved in the response to stress (specifically heat), the oxidation-reduction process, and water deprivation, as well as changes in heat shock proteins (HSPs), transcription factors, calcium signaling, and kinase signaling. Basal regulatory genes, such as bZIP, were involved in the response to heat stress, indicating that heat stress activates genes involved in housekeeping or related to basal regulatory processes. A substantial percentage of the DEGs encoded proteins of unknown function (PUFs), i.e., proteins not yet characterized. CONCLUSION: These findings highlight the importance of candidate genes, such as HSPs, and pathways that can confer tolerance towards heat stress in ragi. These results will provide valuable information for improving heat tolerance in heat-susceptible, agronomically important varieties of ragi and other crops.
Subjects
Eleusine , Thermotolerance , Genotype , Gene Expression Profiling , Heat-Shock Proteins
ABSTRACT
Wet meadows, a type of wetland, are vulnerable to climate change and human activity, which affect the soil properties and microorganisms that are crucial to their ecosystem processes. To decipher the ecological mechanisms and processes involved in wet meadows, it is necessary to examine the bacterial communities associated with plant roots. To gain insight into the microbial dynamics of alpine wet meadows, we used Illumina MiSeq sequencing to investigate how environmental factors shape the bacterial communities in the rhizosphere and rhizoplane of three plant species: Cremanthodium ellisii, Caltha scaposa, and Cremanthodium lineare. The most abundant bacterial phyla in the rhizosphere and rhizoplane were Proteobacteria > Firmicutes > Actinobacteria, while Macrococcus, Lactococcus, and Exiguobacterium were the most abundant bacterial genera in both compartments. The Mantel test, network analysis, and structural equation models revealed that the bacterial communities of the rhizosphere were shaped by total nitrogen (TN), soil water content (SWC), soil organic carbon (SOC), microbial biomass carbon (MBC), microbial biomass nitrogen (MBN), and pH; the rhizoplane bacterial communities, however, exhibited varying results. The bacterial communities exhibited significant heterogeneity, with stochastic processes predominating in both the rhizosphere and rhizoplane. PICRUSt2 and FAPROTAX analyses revealed substantial differences in key biogeochemical cycles and metabolic functional predictions. It was concluded that root compartments significantly influenced the bacterial communities, although plant species and elevation exerted varying effects. This study shows how physicochemical properties, plant species, and elevation can shift the overall structure and functional repertoire of bacterial communities in alpine wet meadows.
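The abstract does not state which software or settings were used for the Mantel test. As a minimal sketch only, assuming two symmetric distance matrices computed over the same samples (for example, community dissimilarity and differences in a soil variable), the idea behind the test can be illustrated as follows. The function names, the permutation count, the fixed seed, and the one-sided p-value are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square matrix into a list."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """Return (r, p): matrix correlation and a one-sided permutation p-value.

    Hypothetical helper for illustration; assumes both inputs are square,
    symmetric, and ordered by the same samples.
    """
    n = len(dist_a)
    flat_b = upper_triangle(dist_b)
    r_obs = pearson(upper_triangle(dist_a), flat_b)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        # Relabel samples: permute rows and columns of dist_a together.
        permuted = [[dist_a[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(upper_triangle(permuted), flat_b) >= r_obs:
            hits += 1
    # Add one to numerator and denominator so the observed value counts.
    return r_obs, (hits + 1) / (permutations + 1)
```

Calling `mantel(community_dist, env_dist)` with two such matrices returns the correlation and its permutation p-value. Permuting rows and columns together, rather than shuffling individual entries, keeps each permuted matrix a valid distance matrix over relabelled samples.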