Results 1 - 20 of 447
1.
Am J Hum Genet ; 110(5): 863-879, 2023 05 04.
Article in English | MEDLINE | ID: mdl-37146589

ABSTRACT

Deleterious mutations in the X-linked gene encoding ornithine transcarbamylase (OTC) cause the most common urea cycle disorder, OTC deficiency. This rare but highly actionable disease can present with severe neonatal onset in males or with later onset in either sex. Individuals with neonatal onset appear normal at birth but rapidly develop hyperammonemia, which can progress to cerebral edema, coma, and death, outcomes ameliorated by rapid diagnosis and treatment. Here, we develop a high-throughput functional assay for human OTC and individually measure the impact of 1,570 variants, 84% of all SNV-accessible missense mutations. Comparison to existing clinical significance calls demonstrated that our assay distinguishes known benign from pathogenic variants and variants with neonatal onset from those with late-onset disease presentation. This functional stratification allowed us to identify score ranges corresponding to clinically relevant levels of impairment of OTC activity. Examining the results of our assay in the context of protein structure further allowed us to identify a 13-amino-acid domain, the SMG loop, whose function appears to be required in human cells but not in yeast. Finally, inclusion of our data as PS3 evidence under the current ACMG guidelines, in a pilot reclassification of 34 variants with complete loss of activity, would change the classification of 22 from variants of unknown significance to clinically actionable likely pathogenic variants. These results illustrate how large-scale functional assays are especially powerful when applied to rare genetic diseases.


Subjects
Hyperammonemia, Ornithine Carbamoyltransferase Deficiency Disease, Ornithine Carbamoyltransferase, Humans, Amino Acid Substitution, Hyperammonemia/etiology, Hyperammonemia/genetics, Missense Mutation/genetics, Ornithine Carbamoyltransferase/genetics, Ornithine Carbamoyltransferase Deficiency Disease/genetics, Ornithine Carbamoyltransferase Deficiency Disease/diagnosis, Ornithine Carbamoyltransferase Deficiency Disease/therapy
2.
Brief Bioinform ; 25(6)2024 Sep 23.
Article in English | MEDLINE | ID: mdl-39331016

ABSTRACT

Nanopore sequencing technology offers longer read lengths and can potentially address the limitations of short-read sequencing, including long-range haplotype phasing and accurate variant calling. However, state-of-the-art approaches still leave room for improvement in single nucleotide variant (SNV) identification performance and computing resource usage. In this work, we introduce miniSNV, a lightweight SNV calling algorithm that simultaneously achieves high performance and yield. miniSNV utilizes known common variants in populations as variation backgrounds and leverages read pileup, read-based phasing, and consensus generation to identify and genotype SNVs for Oxford Nanopore Technologies (ONT) long reads. Benchmarks on real and simulated ONT data under various error profiles demonstrate that miniSNV has superior sensitivity and comparable accuracy on SNV detection and runs faster with outstanding scalability and lower memory than most state-of-the-art variant callers. miniSNV is available from https://github.com/CuiMiao-HIT/miniSNV.
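
The read-pileup idea mentioned in this abstract can be illustrated with a minimal sketch. This is not miniSNV's algorithm; it only shows the general notion of counting the bases observed at one reference position and calling a diploid genotype from the alternate-allele fraction. The function name `call_genotype` and all thresholds are illustrative assumptions.

```python
from collections import Counter


def call_genotype(pileup_bases, ref_base, min_depth=8, het_low=0.25, het_high=0.75):
    """Call a diploid genotype at one site from the bases seen in a read pileup.

    The depth and allele-fraction thresholds are illustrative assumptions,
    not values taken from miniSNV. Returns a tuple of two alleles, or None
    when the site has too little coverage to call.
    """
    counts = Counter(b for b in pileup_bases.upper() if b in "ACGT")
    depth = sum(counts.values())
    if depth < min_depth:
        return None
    # The most frequent non-reference base is the candidate alternate allele.
    alt_counts = {b: n for b, n in counts.items() if b != ref_base}
    if not alt_counts:
        return (ref_base, ref_base)
    alt_base = max(alt_counts, key=alt_counts.get)
    alt_fraction = alt_counts[alt_base] / depth
    if alt_fraction < het_low:
        return (ref_base, ref_base)   # homozygous reference
    if alt_fraction > het_high:
        return (alt_base, alt_base)   # homozygous alternate
    return (ref_base, alt_base)       # heterozygous
```

A real caller would additionally weight bases by quality and, as the abstract describes, use phasing and known population variants; the sketch omits all of that.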


Subjects
Algorithms, Nanopore Sequencing, Single Nucleotide Polymorphism, Nanopore Sequencing/methods, Software, Humans, High-Throughput Nucleotide Sequencing/methods, DNA Sequence Analysis/methods
3.
Brief Bioinform ; 25(5)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39177264

ABSTRACT

The recent nanopore sequencing system (R10.4) has enhanced base calling accuracy and is being increasingly utilized for detecting CpG methylation state. However, the robustness and universality of the methylation calling model in the officially supplied Dorado remain poorly tested. In this study, we obtained heterogeneous datasets from human and plant sources to carry out comprehensive evaluations, which showed that Dorado performed significantly differently across datasets. We therefore developed deep neural networks and implemented several optimizations in training a new model called DeepBAM. DeepBAM achieved superior and more stable performance compared with Dorado, including higher areas under the ROC curve (98.47% on average and up to 7.36% improvement) and F1 scores (94.97% on average and up to 16.24% improvement) across the datasets. DeepBAM-based whole genome methylation frequencies achieved >0.95 correlations with BS-seq on four of five datasets, outperforming Dorado in all instances. It enables unraveling allele-specific methylation patterns, including regions of transposable elements. The enhanced performance of DeepBAM paves the way for broader applications of nanopore sequencing in CpG methylation studies.
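
For readers unfamiliar with the F1 metric reported in this abstract, the following sketch computes it from binary methylation calls. It is a generic definition, not code from DeepBAM or Dorado; the function name `f1_score` and the list-of-booleans input format are assumptions made for illustration.

```python
def f1_score(predicted, truth):
    """Compute F1 (harmonic mean of precision and recall) for binary calls.

    `predicted` and `truth` are equal-length sequences of truthy/falsy values,
    for example per-read methylated / unmethylated calls (assumed format).
    """
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    tp = sum(1 for p, t in zip(predicted, truth) if p and t)
    fp = sum(1 for p, t in zip(predicted, truth) if p and not t)
    fn = sum(1 for p, t in zip(predicted, truth) if not p and t)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```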


Subjects
CpG Islands, DNA Methylation, Nanopore Sequencing, Nanopore Sequencing/methods, Humans, Software, DNA Sequence Analysis/methods, Computer Neural Networks
4.
J Biol Chem ; 300(6): 107317, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38677514

ABSTRACT

It has become increasingly evident that the structures RNAs adopt are conformationally dynamic; the various structured states that RNAs sample govern their interactions with other nucleic acids, proteins, and ligands to regulate a myriad of biological processes. Although several biophysical approaches have been developed and used to study the dynamic landscape of structured RNAs, technical constraints have limited their application to all classes of RNA due to variable size and flexibility. Recent advances combining chemical probing experiments with next-generation and direct sequencing have emerged as an alternative approach to exploring the conformational dynamics of RNA. In this review, we provide a methodological overview of the sequencing-based techniques used to study RNA conformational dynamics. We discuss how different techniques have enabled us to better understand the propensity of RNAs from a variety of different classes to sample multiple conformational states. Finally, we present examples of the ways these techniques have reshaped how we think about RNA structure.


Subjects
High-Throughput Nucleotide Sequencing, Nucleic Acid Conformation, RNA, RNA/chemistry, RNA/metabolism, High-Throughput Nucleotide Sequencing/methods, Nanopores, Humans, RNA Sequence Analysis/methods
5.
BMC Genomics ; 25(1): 785, 2024 Aug 13.
Article in English | MEDLINE | ID: mdl-39138417

ABSTRACT

To reduce the use of antibiotics and chemicals in aquaculture, an edible herb, Bidens pilosa, has been selected as a multifunctional feed additive. Although there has been considerable research into the effects of B. pilosa on poultry, the wider effects of B. pilosa, particularly on the growth and gut microbiota of fish, remain largely unexplored. We aimed to investigate the interactions between host growth and the gut microbiota in B. pilosa-fed tilapia using transcriptomics and gut microbiota profiling. In this study, we added 0.5% and 1% B. pilosa to the diet and observed that the growth performance of tilapia significantly increased over 8 weeks of feeding. Comparative transcriptome analysis was performed on RNA sequence profiles obtained from liver and muscle tissues. Functional enrichment analysis revealed that B. pilosa regulates several pathways and genes involved in amino acid metabolism, lipid metabolism, carbohydrate metabolism, endocrine system, signal transduction, and metabolism of other amino acids. The expression of the selected growth-associated genes was validated by qRT-PCR. The qRT-PCR results indicated that B. pilosa may enhance growth performance by activating the expression of the liver igf1 and muscle igf1rb genes and inhibiting the expression of the muscle negative regulator mstnb. Both the enhancement of liver endocrine IGF1/IGF1Rb signaling and the suppression of muscle autocrine/paracrine MSTN signaling induced the expression of myogenic regulatory factors (MRFs), myod1, myog and mrf4 in muscle to promote muscle growth in tilapia. The predicted function of the gut microbiota showed several significantly different pathways that overlapped with the KEGG enrichment results of differentially expressed genes in the liver transcriptomes. This finding suggested that the gut microbiota may influence liver metabolism through the gut-liver axis in B. pilosa-fed tilapia. In conclusion, dietary B. pilosa can regulate endocrine IGF1 signaling and autocrine/paracrine MSTN signaling to activate the expression of MRFs to promote muscle growth and alter the composition of gut bacteria, which can then affect liver amino acid metabolism, carbohydrate metabolism, endocrine system, lipid metabolism, metabolism of other amino acids, and signal transduction in the host, ultimately enhancing growth performance. Our results suggest that B. pilosa has the potential to be a functional additive that can be used as an alternative to reduce antibiotic use as a growth promoter in aquaculture.


Subjects
Animal Feed, Bidens, Gastrointestinal Microbiome, Tilapia, Animals, Gastrointestinal Microbiome/drug effects, Tilapia/growth & development, Tilapia/microbiology, Tilapia/genetics, Tilapia/metabolism, Bidens/metabolism, Bidens/growth & development, Gene Expression Profiling, Transcriptome, Liver/metabolism
6.
Electrophoresis ; 2024 May 25.
Article in English | MEDLINE | ID: mdl-38794987

ABSTRACT

In forensic science, the demand for precision, consistency, and cost-effectiveness has driven the exploration of next-generation sequencing technologies. This study investigates the potential of Oxford Nanopore Technologies (ONT) sequencing for analyzing the HIrisPlex-S panel, a set of 41 single nucleotide polymorphism (SNP) markers used to predict eye, hair, and skin color. We assessed the accuracy and reliability of ONT-generated data by comparing it with conventional capillary electrophoresis (CE) in 18 samples. Guppy v6.1 was used as the basecaller, and sample profiles were obtained using Burrows-Wheeler Aligner, Samtools, BCFtools, and Python. Comparing accuracy with CE, we found that 62% of SNPs in ONT-unligated samples were correctly genotyped, with 36% showing allele dropout and 2% being incorrectly genotyped. In the ONT-ligated samples, 85% of SNPs were correctly genotyped, with 10% showing allele dropout and 5% being incorrectly genotyped. Our findings indicate that ONT, particularly when combined with ligation, enhances genotyping accuracy and coverage, thereby reducing allele dropouts. However, challenges associated with the technology's error rates and the impact on genotyping accuracy are recognized. Phenotype predictions based on ONT data demonstrate varying degrees of success, with the technology showing high accuracy in several cases. Although ONT technology holds promise in forensic genetics, further optimization and quality control measures are essential to harness its full potential. This study contributes to the ongoing efforts to refine sequence read tuning and improve correction tools in the context of ONT technology's application in forensic genetics.
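
The three outcome categories reported in this abstract (correct, allele dropout, incorrect) can be sketched as a comparison of each ONT genotype with the CE reference genotype. This is an illustration under stated assumptions, not the study's script: genotypes are assumed to be two-letter strings such as "AG", and allele dropout is assumed to mean that CE is heterozygous while ONT reports only one of the two CE alleles. The function names are hypothetical.

```python
from collections import Counter


def classify_call(ont_genotype, ce_genotype):
    """Classify one ONT SNP genotype against the CE reference genotype."""
    ont = sorted(ont_genotype)
    ce = sorted(ce_genotype)
    if ont == ce:
        return "correct"
    # Assumed definition of dropout: CE heterozygous, ONT homozygous
    # for one of the two alleles seen by CE.
    if ce[0] != ce[1] and ont[0] == ont[1] and ont[0] in ce:
        return "allele_dropout"
    return "incorrect"


def summarize(pairs):
    """Return the percentage of SNP calls in each category.

    `pairs` is a non-empty list of (ont_genotype, ce_genotype) tuples.
    """
    counts = Counter(classify_call(ont, ce) for ont, ce in pairs)
    total = len(pairs)
    return {
        category: 100 * counts[category] / total
        for category in ("correct", "allele_dropout", "incorrect")
    }
```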

7.
Vox Sang ; 119(4): 377-382, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38226545

ABSTRACT

BACKGROUND AND OBJECTIVES: Mixed-field agglutination in ABO phenotyping (A3, B3) has been linked to genetically different blood cell populations such as in chimerism, or to rare variants in either ABO exon 7 or regulatory regions. Clarification of such cases is challenging and would greatly benefit from sequencing technologies that allow resolving full-gene haplotypes at high resolution. MATERIALS AND METHODS: We used long-read sequencing by Oxford Nanopore Technologies to sequence the entire ABO gene, amplified in two overlapping long-range PCR fragments, in a blood donor presenting with an A3B phenotype. Confirmation analyses were carried out by Sanger sequencing and included samples from other family members. RESULTS: Our data revealed a novel heterozygous g.10924C>A variant on the ABO*A allele located in the transcription factor binding site for RUNX1 in intron 1 (+5.8 kb site). Inheritance was shown by the results of the donor's mother, who shared the novel variant and the anti-A specific mixed-field agglutination. CONCLUSION: We discovered a regulatory variant in the 8-bp RUNX1 motif of ABO, which extends current knowledge of three other variants affecting the same motif and also leading to A3 or B3 phenotypes. Overall, long-range PCR combined with nanopore sequencing proved powerful and showed great potential as an emerging strategy for resolving cases with cryptic ABO phenotypes.


Subjects
ABO Blood-Group System, Core Binding Factor Alpha 2 Subunit, Humans, Introns/genetics, Core Binding Factor Alpha 2 Subunit/genetics, Phenotype, Alleles, Binding Sites, ABO Blood-Group System/genetics, Genotype
8.
Microb Cell Fact ; 23(1): 111, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38622625

ABSTRACT

BACKGROUND: Ascomycetous budding yeasts are ubiquitous environmental microorganisms important in food production and medicine. Due to recent intensive genomic research, the taxonomy of yeast is becoming more organized based on the identification of monophyletic taxa. This includes genera important to humans, such as Kazachstania. Until now, Kazachstania humilis (previously Candida humilis) was regarded as a sourdough-specific yeast. In addition, no antibacterial activity has been associated with this species. RESULTS: Previously, we isolated a yeast strain that impaired bio-hydrogen production in a dark fermentation bioreactor and inhibited the growth of Gram-positive and Gram-negative bacteria. Here, using next generation sequencing technologies, we sequenced the genome of this strain, named K. humilis MAW1. This is the first genome of a K. humilis isolate not originating from a fermented food. We used a novel phylogenetic approach employing the 18S-ITS-D1-D2 region to show the placement of K. humilis MAW1 among other members of the Kazachstania genus. This strain was examined by global phenotypic profiling, including carbon sources utilized and the influence of stress conditions on growth. Using the well-recognized bacterial model Escherichia coli AB1157, we show that K. humilis MAW1 cultivated in an acidic medium inhibits bacterial growth by disturbing cell division, manifested by filament formation. To gain a greater understanding of the inhibitory effect of K. humilis MAW1, we selected 23 yeast proteins with recognized toxic activity against bacteria and used them for BLAST searches of the K. humilis MAW1 genome assembly. The resulting panel of genes present in the K. humilis MAW1 genome included those encoding the 1,3-β-glucan glycosidase and the 1,3-β-glucan synthesis inhibitor that might disturb the bacterial cell envelope structures. CONCLUSIONS: We characterized a non-sourdough-derived strain of K. humilis, including its genome sequence and physiological aspects. MAW1, together with other K. humilis strains, shows a new organization of the mating-type locus. The pH-dependent ability to inhibit bacterial growth revealed here has not been previously recognized in this species. Our study contributes to the building of genome sequence-based classification systems, to a better understanding of K. humilis as a cell factory in fermentation processes, and to the exploration of bacteria-yeast interactions in microbial communities.


Subjects
Anti-Bacterial Agents, Saccharomycetales, Humans, Phylogeny, Anti-Bacterial Agents/metabolism, Gram-Negative Bacteria, Gram-Positive Bacteria, Saccharomycetales/genetics, Yeasts/metabolism, Fermentation
9.
RNA Biol ; 21(1): 1-15, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38758523

ABSTRACT

2′-O-methylation (Nm) is one of the most abundant modifications found in both mRNAs and noncoding RNAs. It contributes to many biological processes, such as the normal functioning of tRNA, the protection of mRNA against degradation by the decapping and exoribonuclease (DXO) protein, and the biogenesis and specificity of rRNA. Recent advancements in single-molecule sequencing techniques for long-read RNA sequencing data offered by Oxford Nanopore Technologies have enabled the direct detection of RNA modifications from sequencing data. In this study, we propose a bio-computational framework, Nm-Nano, for predicting the presence of Nm sites in direct RNA sequencing data generated from two human cell lines. The Nm-Nano framework integrates two supervised machine learning (ML) models for predicting Nm sites: Extreme Gradient Boosting (XGBoost) and Random Forest (RF) with K-mer embedding. Evaluation on benchmark datasets from direct RNA sequencing of HeLa and HEK293 cell lines demonstrates high accuracy (99% with XGBoost and 92% with RF) in identifying Nm sites. Deploying Nm-Nano on HeLa and HEK293 cell lines reveals genes that are frequently modified with Nm. In HeLa cell lines, 125 genes are identified as frequently Nm-modified, showing enrichment in 30 ontologies related to immune response and cellular processes. In HEK293 cell lines, 61 genes are identified as frequently Nm-modified, with enrichment in processes like glycolysis and protein localization. These findings underscore the diverse regulatory roles of Nm modifications in metabolic pathways, protein degradation, and cellular processes. The source code of Nm-Nano can be freely accessed at https://github.com/Janga-Lab/Nm-Nano.


Subjects
Machine Learning, RNA Sequence Analysis, Transcriptome, Humans, Methylation, RNA Sequence Analysis/methods, HeLa Cells, Nanopore Sequencing/methods, HEK293 Cells, Computational Biology/methods, Post-Transcriptional RNA Processing, Nanopores, Software, Messenger RNA/genetics, Messenger RNA/metabolism
10.
Brain ; 146(5): 1831-1843, 2023 05 02.
Article in English | MEDLINE | ID: mdl-36227727

ABSTRACT

Instability of simple DNA repeats has been known as a common cause of hereditary ataxias for over 20 years. Routine genetic diagnostics of these phenotypically similar diseases still rely on an iterative workflow for quantification of repeat units by PCR-based methods of limited precision. We established and validated clinical nanopore Cas9-targeted sequencing, an amplification-free method for simultaneous analysis of 10 repeat loci associated with clinically overlapping hereditary ataxias. The method combines target enrichment by CRISPR-Cas9, Oxford Nanopore long-read sequencing and a bioinformatics pipeline using the tools STRique and Megalodon for parallel detection of length, sequence, methylation and composition of the repeat loci. Clinical nanopore Cas9-targeted sequencing allowed for the precise and parallel analysis of 10 repeat loci associated with adult-onset ataxia and revealed additional parameters, such as FMR1 promoter methylation and repeat sequence, required for diagnosis at the same time. Using clinical nanopore Cas9-targeted sequencing we analysed 100 clinical samples of undiagnosed ataxia patients and identified causative repeat expansions in 28 patients. Parallel repeat analysis enabled a molecular diagnosis of ataxias independent of preconceptions on the basis of clinical presentation. Biallelic expansions within RFC1 were identified as the most frequent cause of ataxia. We characterized the RFC1 repeat composition of all patients and identified a novel repeat motif, AGGGG. Our results highlight the power of clinical nanopore Cas9-targeted sequencing as a readily expandable workflow for the in-depth analysis and diagnosis of phenotypically overlapping repeat expansion disorders.
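
The notion of quantifying repeat units in a read can be illustrated at the sequence level. The pipeline in this abstract relies on STRique and Megalodon, which are not reproduced here; the sketch below is a much simpler stand-in that counts the longest uninterrupted run of a motif in a basecalled sequence. The function name `longest_repeat_run` is hypothetical, and the approach tolerates no sequencing errors or interruptions.

```python
import re


def longest_repeat_run(sequence, motif):
    """Return the largest number of back-to-back copies of `motif` in `sequence`.

    Exact string matching only: a single basecalling error splits a run,
    so this is an illustration rather than a diagnostic-grade method.
    """
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    best = 0
    for match in pattern.finditer(sequence):
        best = max(best, len(match.group(0)) // len(motif))
    return best
```

For example, a read consisting of flanking sequence around three copies of the AGGGG motif named in the abstract would give a count of 3.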


Subjects
Cerebellar Ataxia, Spinocerebellar Degenerations, Adult, Humans, Ataxia/genetics, Cerebellar Ataxia/genetics, Computational Biology, High-Throughput Nucleotide Sequencing, Fragile X Mental Retardation Protein
11.
Curr Genomics ; 25(3): 212-225, 2024 May 31.
Article in English | MEDLINE | ID: mdl-39086998

ABSTRACT

Background: Chemically modified therapeutic mRNAs have gained momentum recently. In addition to commonly used modifications (e.g., pseudouridine), 5-methoxyuridine (5moU) is considered a promising substitution for uridine in therapeutic mRNAs. Accurate identification of 5moU would be crucial for the study and quality control of relevant in vitro-transcribed (IVT) mRNAs. However, current methods do not provide quantitative detection of this modification. Utilizing the capabilities of Oxford Nanopore direct RNA sequencing, in this study, we present NanoML-5moU, a machine-learning framework designed specifically for the read-level detection and quantification of 5moU modification for IVT data. Materials and Methods: Nanopore direct RNA sequencing data from both 5moU-modified and unmodified control samples were collected. Subsequently, a comprehensive analysis and modeling of signal event characteristics (mean, median current intensities, standard deviations, and dwell times) were performed. Furthermore, classical machine learning algorithms, notably the Support Vector Machine (SVM), Random Forest (RF), and XGBoost, were employed to discern 5moU modifications within NNUNN (where N represents A, C, U, or G) 5-mers. Results: Notably, the signal event attributes pertaining to each constituent base of the NNUNN 5-mers, in conjunction with the XGBoost algorithm, exhibited remarkable performance levels (with a maximum AUROC of 0.9567 in the "AGTTC" reference 5-mer dataset and a minimum AUROC of 0.8113 in the "TGTGC" reference 5-mer dataset). This markedly exceeded the efficacy of the prevailing background error comparison model (ELIGOS AUC 0.751 for site-level prediction). The model's performance was further validated through a series of curated datasets, which featured customized modification ratios designed to emulate broader data patterns, demonstrating its general applicability in quality control of IVT mRNA vaccines. The NanoML-5moU framework is publicly available on GitHub (https://github.com/JiayiLi21/NanoML-5moU). Conclusion: NanoML-5moU enables accurate read-level profiling of 5moU modification with nanopore direct RNA sequencing and is a powerful tool specialized in unveiling signal patterns in IVT mRNAs.
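
The NNUNN 5-mer space and the per-event signal features named in this abstract can be sketched as follows. This is not the NanoML-5moU code; the function names and the plain-list input format are assumptions. The sketch only shows that a fixed central U with four free positions gives 4^4 = 256 contexts, and how mean, median, standard deviation, and dwell time could be gathered into one feature vector for a classifier.

```python
from itertools import product
from statistics import mean, median, pstdev


def nnunn_kmers(center="U", alphabet="ACGU"):
    """Enumerate every 5-mer with `center` fixed at the middle position."""
    return [
        "".join(left) + center + "".join(right)
        for left in product(alphabet, repeat=2)
        for right in product(alphabet, repeat=2)
    ]


def event_features(currents, dwell_time):
    """Summarize one signal event as [mean, median, std, dwell time].

    `currents` is a non-empty list of current samples for the event and
    `dwell_time` its duration; both formats are assumed for illustration.
    """
    return [mean(currents), median(currents), pstdev(currents), dwell_time]
```

Note that the abstract reports reference 5-mers in the DNA alphabet (for example "AGTTC"); passing `center="T", alphabet="ACGT"` yields that notation.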

12.
BMC Biol ; 21(1): 286, 2023 12 08.
Article in English | MEDLINE | ID: mdl-38066581

ABSTRACT

BACKGROUND: Genomic prediction describes the use of SNP genotypes to predict complex traits and has been widely applied in humans and agricultural species. Genotyping-by-sequencing, a method which uses low-coverage sequence data paired with genotype imputation, is becoming an increasingly popular SNP genotyping method for genomic prediction. The development of Oxford Nanopore Technologies' (ONT) MinION sequencer has now made genotyping-by-sequencing portable and rapid. Here we evaluate the speed and accuracy of genomic predictions using low-coverage ONT sequence data in a population of cattle using four imputation approaches. We also investigate the effect of SNP reference panel size on imputation performance. RESULTS: SNP array genotypes and ONT sequence data for 62 beef heifers were used to calculate genomic estimated breeding values (GEBVs) from 641 k SNP for four traits. GEBV accuracy was much higher when genome-wide flanking SNP from sequence data were used to help impute the 641 k panel used for genomic predictions. Using the imputation package QUILT, correlations between ONT and low-density SNP array genomic breeding values were greater than 0.91 and up to 0.97 for sequencing coverages as low as 0.1× using a reference panel of 48 million SNP. Imputation time was significantly reduced by decreasing the number of flanking sequence SNP used in imputation for all methods. When compared to high-density SNP arrays, genotyping accuracy and genomic breeding value correlations at 0.5× coverage were also found to be higher than those imputed from low-density arrays. CONCLUSIONS: Here we demonstrated accurate genomic prediction is possible with ONT sequence data from sequencing coverages as low as 0.1×, and imputation time can be as short as 10 min per sample. We also demonstrate that in this population, genotyping-by-sequencing at 0.1× coverage can be more accurate than imputation from low-density SNP arrays.
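
The correlations between breeding values reported in this abstract are presumably Pearson correlations; the abstract does not say so explicitly, so that is an assumption here. The sketch below shows how such a correlation between two lists of GEBVs (one from ONT-derived genotypes, one from array genotypes) could be computed. The function name `pearson` and the input lists are illustrative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric lists.

    Raises ValueError for mismatched or too-short input; a constant input
    list (zero variance) would cause a ZeroDivisionError.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)
```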


Subjects
Nanopore Sequencing, Humans, Animals, Cattle/genetics, Female, Single Nucleotide Polymorphism, Genome, Genomics/methods, Genotype
13.
Genomics ; 115(5): 110697, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37567397

ABSTRACT

The Pacific oyster (Crassostrea gigas) is widely cultivated around the world, yet its transcriptome diversity remains largely unexplored due to the limitations of short reads. In this study, we used Oxford Nanopore sequencing to develop the full-length transcriptome database of C. gigas. We identified 77,920 full-length transcripts from 21,523 genes, and uncovered 9668 alternative splicing events and 87,468 alternative polyadenylation sites. Notably, a total of 16,721 novel transcripts were annotated in this work. Furthermore, integrative analysis of 25 publicly available RNA-seq datasets revealed the transcriptome diversity involved in post-transcriptional regulation in C. gigas. We further developed a Drupal-based webserver, Cgtdb, which can be used for transcriptome visualization, sequence alignment, and functional genome annotation analyses. This work provides valuable resources and a useful tool for integrative analysis of various transcriptome datasets in C. gigas, which will serve as an essential reference for functional annotation of the oyster genome.

14.
Genomics ; 115(6): 110709, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37739021

ABSTRACT

Recent studies on marine organisms have made use of third-generation sequencing technologies such as Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT). While these specialized bioinformatics tools have different algorithmic designs and performance capabilities, they offer scalability and can be applied to various datasets. We investigated the effectiveness of PacBio and ONT RNA sequencing methods in identifying the venom of the jellyfish species Nemopilema nomurai. We conducted a detailed analysis of the sequencing data from both methods, focusing on key characteristics such as CD, alternative splicing, long-chain noncoding RNA, simple sequence repeat, transcription factor, and functional transcript annotation. Our findings indicate that ONT generally produced higher raw data quality in the transcriptome analysis, while PacBio generated longer read lengths. PacBio was found to be superior in identifying CDs and long-chain noncoding RNA, whereas ONT was more cost-effective for predicting alternative splicing events, simple sequence repeats, and transcription factors. Based on these results, we conclude that PacBio is the most specific and sensitive method for identifying venom components, while ONT is the most cost-effective method for studying venogenesis, cnidocyst (venom gland) development, and transcription of virulence genes in jellyfish. Our study has implications for future sequencing technologies in marine jellyfish, and highlights the power of full-length transcriptome analysis in discovering potential therapeutic targets for jellyfish dermatitis.


Subjects
Cnidarian Venoms, Scyphozoa, Animals, RNA, RNA Sequence Analysis, Untranslated RNA, High-Throughput Nucleotide Sequencing/methods
15.
Int J Mol Sci ; 25(14)2024 Jul 11.
Article in English | MEDLINE | ID: mdl-39062841

ABSTRACT

Pre-treatment genotyping of four well-characterized toxicity risk-variants in the dihydropyrimidine dehydrogenase gene (DPYD) has been widely implemented in Europe to prevent serious adverse effects in cancer patients treated with fluoropyrimidines. Current genotyping practices are largely limited to selected commonly studied variants and are unable to determine phasing when more than one variant allele is detected. Recent evidence indicates that common DPYD variants modulate the functional impact of deleterious variants in a phase-dependent manner, where a cis- or a trans-configuration translates into different toxicity risks and dosing recommendations. DPYD is a large gene with 23 exons spanning nearly a megabase of DNA, making it a challenging candidate for full-gene sequencing in the diagnostic setting. Herein, we present a time- and cost-efficient long-read sequencing approach for capturing the complete coding region of DPYD. We demonstrate that this method can reliably produce phased genotypes, overcoming a major limitation of current methods. This method was validated using 21 subjects, including two cancer patients, each of whom carried multiple DPYD variants. Genotype assignments showed complete concordance with conventional approaches. Furthermore, we demonstrate that the method is robust to technical challenges inherent in long-range sequencing of PCR products, including reference alignment bias and PCR chimerism.
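
The cis/trans distinction described in this abstract can be illustrated with a read-level majority vote. This is a simplified sketch, not the authors' method: it assumes a sample heterozygous at two variant sites and a set of long reads that each span both sites, with each read reduced to a pair of booleans indicating whether it carries variant A and variant B. The function name `phase_two_variants` is hypothetical.

```python
from collections import Counter


def phase_two_variants(read_alleles):
    """Infer whether two heterozygous variants lie in cis or in trans.

    `read_alleles` is a list of (has_variant_a, has_variant_b) booleans,
    one tuple per read spanning both sites (assumed input format).
    Reads carrying both or neither variant support cis; reads carrying
    exactly one support trans. A minority of discordant reads, such as
    those produced by PCR chimerism, is outvoted by the majority.
    """
    counts = Counter(read_alleles)
    cis_support = counts[(True, True)] + counts[(False, False)]
    trans_support = counts[(True, False)] + counts[(False, True)]
    if cis_support == trans_support:
        return "undetermined"
    return "cis" if cis_support > trans_support else "trans"
```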


Subjects
Dihydrouracil Dehydrogenase (NADP), Genotype, Genotyping Techniques, Dihydrouracil Dehydrogenase (NADP)/genetics, Humans, Genotyping Techniques/methods, DNA Sequence Analysis/methods, Neoplasms/genetics, High-Throughput Nucleotide Sequencing/methods, Single Nucleotide Polymorphism, Alleles
16.
Plant J ; 110(2): 572-588, 2022 04.
Article in English | MEDLINE | ID: mdl-35106855

ABSTRACT

The assembly and scaffolding of plant crop genomes facilitate the characterization of genetically diverse cultivated and wild germplasm. The cultivated tomato (Solanum lycopersicum) has been improved through the introgression of genetic material from related wild species, including resistance to pandemic strains of tobacco mosaic virus (TMV) from Solanum peruvianum. Here we applied PacBio HiFi and ONT Nanopore sequencing to develop independent, highly contiguous and complementary assemblies of an inbred TMV-resistant tomato variety. We show specific examples of how HiFi and ONT datasets can complement one another to improve assembly contiguity. We merged the HiFi and ONT assemblies to generate a long-read-only assembly where all 12 chromosomes were represented as 12 contiguous sequences (N50 = 68.5 Mbp). This chromosome scale assembly did not require scaffolding using an orthogonal data type. The merged assembly was validated by chromosome conformation capture data and is highly consistent with previous tomato genome assemblies that made use of genetic maps and Hi-C for scaffolding. Our long-read-only assembly reveals that a complex series of structural variants linked to the TMV resistance gene likely contributed to linkage drag of a 64.1-Mbp region of the S. peruvianum genome during tomato breeding. Through marker studies and ONT-based comprehensive haplotyping we show that this minimal introgression region is present in six cultivated tomato hybrid varieties developed in three commercial breeding programs. Our results suggest that complementary long read technologies can facilitate the rapid generation of near-complete genome sequences.
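
The N50 statistic quoted in this abstract (N50 = 68.5 Mbp) is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. A minimal sketch of that standard definition follows; the function name `n50` is an illustrative choice and the code is not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
```

With twelve chromosome-length sequences, as in the assembly described above, the N50 is simply the length of the chromosome at which the cumulative sum first reaches half the genome size.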


Subjects
Nanopores, Solanum lycopersicum, Chromosomes, Plant Genome/genetics, Solanum lycopersicum/genetics, Plant Breeding, DNA Sequence Analysis
17.
Mol Plant Microbe Interact ; 36(1): 73-77, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36537805

ABSTRACT

The bacterial plant pathogen Xanthomonas oryzae pv. oryzae is responsible for the foliar rice bacterial blight disease. Genetically contrasted, continent-specific sublineages of this species can cause significant damage to rice production both in Asia and Africa. We report on the genome of the CIX2779 strain of this pathogen, previously named NAI1 and originating from Niger. Oxford Nanopore long-read assembly and Illumina short-read polishing produced a genome sequence composed of a 4,725,792-bp circular chromosome and a 39,798-bp-long circular plasmid designated pCIX2779_1. The chromosome structure and base-level sequence are highly related to reference strains of African X. oryzae pv. oryzae and encode identical transcription activator-like effectors for virulence. Importantly, our in silico analysis strongly indicates that pCIX2779_1 is a genuine conjugative plasmid, the first indigenous one sequenced from an African strain of the X. oryzae species. Copyright © 2022 The Author(s). This is an open access article distributed under the CC BY 4.0 International license.


Subjects
Oryza , Xanthomonas , Oryza/microbiology , Plasmids , Transcription Activator-Like Effectors/genetics , Xanthomonas/genetics , Plant Diseases/microbiology , Bacterial Proteins/genetics
18.
BMC Genomics ; 24(1): 572, 2023 Sep 26.
Article in English | MEDLINE | ID: mdl-37752451

ABSTRACT

BACKGROUND: Telomeres are the nucleoprotein complexes that physically cap the ends of eukaryotic chromosomes. Most plants possess Arabidopsis-type telomere sequences (TSs). In addition to terminal TSs, more diverse interstitial TSs exist in plants. Although telomeres have been studied extensively, the actual diversity of TSs in land plants is underestimated. RESULTS: We investigated genotypes from seven natural populations, from contrasting environments, of four Chenopodium species to reveal the variability in TSs by analyzing Oxford Nanopore reads. Fluorescent in situ hybridization was used to localize telomeric repeats on chromosomes. We identified a number of derivative monomers that arise in part of both terminal and interstitial telomeric arrays of a single genotype. The former presents a case of block-organized double-monomer telomeres, where blocks of Arabidopsis-type TTTAGGG motifs were interspersed with blocks of derivative TTTAAAA motifs. The latter is an integral part of the satellitome with transformations specific to the inactive genome fraction. CONCLUSIONS: We suggest two alternative models for the possible formation of derivative monomers from Arabidopsis-type telomeric heptamer motifs. We assume that derivatization of TSs is a ubiquitous process in the plant genome but that the occurrence and frequencies of derivatives may be genotype-specific. We also propose that the formation of non-canonical arrays of TSs, especially at chromosomal termini, may be a source of genomic variability in nature.


Subjects
Arabidopsis , Humans , Arabidopsis/genetics , In Situ Hybridization, Fluorescence , Telomere/genetics , Genotype , Eukaryota
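The block-organized arrays described in the abstract above (runs of TTTAGGG interspersed with runs of TTTAAAA) can be summarized from a read with a simple regular-expression scan. The following is only an illustrative sketch, not the authors' pipeline: the function name and the example read are hypothetical, only the forward-strand orientation is searched, and sequencing errors inside a motif would split a block.

```python
import re

CANONICAL = "TTTAGGG"   # Arabidopsis-type motif named in the abstract
DERIVATIVE = "TTTAAAA"  # derivative motif named in the abstract

# Match a maximal tandem run of one motif or the other.
BLOCK_PATTERN = re.compile(f"(?:{CANONICAL})+|(?:{DERIVATIVE})+")


def motif_blocks(read):
    """Return (motif, copy_number) tuples for consecutive motif runs."""
    blocks = []
    for match in BLOCK_PATTERN.finditer(read.upper()):
        run = match.group(0)
        motif = CANONICAL if run.startswith(CANONICAL) else DERIVATIVE
        blocks.append((motif, len(run) // len(motif)))
    return blocks


# Hypothetical read fragment, for illustration only.
example_read = CANONICAL * 3 + DERIVATIVE * 2 + CANONICAL
print(motif_blocks(example_read))
```

A real analysis would also need to scan the reverse-complement orientation and tolerate the error profile of nanopore reads.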
19.
BMC Genomics ; 24(1): 117, 2023 Mar 16.
Article in English | MEDLINE | ID: mdl-36927511

ABSTRACT

BACKGROUND: Generating the most contiguous, accurate genome assemblies given available sequencing technologies is a long-standing challenge in genome science. With the rise of long-read sequencing, assembly challenges have shifted from merely increasing contiguity to correctly assembling complex, repetitive regions of interest, ideally in a phased manner. At present, researchers largely choose between two types of long-read data: longer, but less accurate sequences, or highly accurate, but shorter reads (i.e., >Q20 or 99% accurate). To better understand how these types of long-read data as well as scale of data (i.e., mean length and sequencing depth) influence genome assembly outcomes, we compared genome assemblies for a caddisfly, Hesperophylax magnus, generated with longer, but less accurate, Oxford Nanopore (ONT) R9.4.1 and highly accurate PacBio HiFi (HiFi) data. Next, we expanded this comparison to consider the influence of highly accurate long-read sequence data on genome assemblies across 6750 plant and animal genomes. For this broader comparison, we used HiFi data as a surrogate for highly accurate long reads broadly as we could identify when they were used from GenBank metadata. RESULTS: HiFi reads outperformed ONT reads in all assembly metrics tested for the caddisfly data set and allowed for accurate assembly of the repetitive ~20 kb H-fibroin gene. Across plants and animals, genome assemblies that incorporated HiFi reads were also more contiguous. For plants, the average HiFi assembly was 501% more contiguous (mean contig N50 = 20.5 Mb) than those generated with any other long-read data (mean contig N50 = 4.1 Mb). For animals, HiFi assemblies were 226% more contiguous (mean contig N50 = 20.9 Mb) versus other long-read assemblies (mean contig N50 = 9.3 Mb). In plants, we also found limited evidence that HiFi may offer a unique solution for overcoming genomic complexity that scales with assembly size. CONCLUSIONS: Highly accurate long reads generated with HiFi or analogous technologies represent a key tool for maximizing genome assembly quality for a wide swath of plants and animals. This finding is particularly important when resources allow for only one type of sequencing data to be generated. Ultimately, to realize the promise of biodiversity genomics, we call for greater uptake of highly accurate long reads in future studies.


Subjects
Biodiversity , Genomics , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA , Genomics/methods , Genomics/standards , Genomics/trends , Insects/classification , Insects/genetics , Fibroins/genetics , Contig Mapping , Genome, Insect/genetics , Animals , Databases, Nucleic Acid , Reproducibility of Results , Meta-Analysis as Topic , Datasets as Topic , Sequence Analysis, DNA/methods , Sequence Analysis, DNA/standards , High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/standards , High-Throughput Nucleotide Sequencing/trends , Plants/genetics , Genome, Plant/genetics
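The abstract above equates ">Q20" with "99% accurate". That equivalence follows from the standard Phred scale, Q = -10 * log10(p), where p is the per-base error probability. The short sketch below converts in both directions; the function names are illustrative and not drawn from the paper.

```python
import math


def phred_to_accuracy(q):
    """Convert a Phred quality score to per-base accuracy (1 - error)."""
    return 1.0 - 10 ** (-q / 10)


def accuracy_to_phred(accuracy):
    """Convert per-base accuracy in [0, 1) to a Phred quality score."""
    if not 0 <= accuracy < 1:
        raise ValueError("accuracy must be in [0, 1)")
    return -10 * math.log10(1 - accuracy)


print(phred_to_accuracy(20))    # error probability 0.01
print(accuracy_to_phred(0.99))  # approximately Q20
```

Each additional 10 Phred units corresponds to a tenfold lower error probability, so Q30 corresponds to 99.9% accuracy.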
20.
BMC Genomics ; 24(1): 581, 2023 Oct 02.
Article in English | MEDLINE | ID: mdl-37784013

ABSTRACT

BACKGROUND: Rapid and accurate pathogen identification is required for disease management. Compared to sequencing entire genomes, targeted sequencing may be used to direct sequencing resources to genes of interest for microbe identification and mitigate the low resolution that single-locus molecular identification provides. This work describes a broad-spectrum fungal identification tool developed to focus high-throughput Nanopore sequencing on genes commonly employed for disease diagnostics and phylogenetic inference. RESULTS: Orthologs of targeted genes were extracted from 386 reference genomes of fungal species spanning six phyla to identify homologous regions that were used to design the baits used for enrichment. To reduce the cost of producing probes without diminishing the phylogenetic power, DNA sequences were first clustered, and then consensus sequences within each cluster were identified to produce 26,000 probes that targeted 114 genes. To test the efficacy of our probes, we applied the technique to three species representing Ascomycota and Basidiomycota fungi. The efficiency of enrichment, quantified as mean target coverage over the mean genome-wide coverage, ranged from 200 to 300. Furthermore, enrichment of long reads increased the depth of coverage across the targeted genes and into non-coding flanking sequence. The assemblies generated from enriched samples provided well-resolved phylogenetic trees for taxonomic assignment and molecular identification. CONCLUSIONS: Our work provides data to support the utility of targeted Nanopore sequencing for fungal identification and provides a platform that may be extended for use with other phytopathogens.


Subjects
Ascomycota , Nanopore Sequencing , Nanopores , Phylogeny , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods
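The abstract above quantifies enrichment efficiency as mean target coverage divided by mean genome-wide coverage. A minimal sketch of that ratio follows, assuming per-position depth values are already available as plain lists; the function name and the depth values are hypothetical and are not data from the study.

```python
def enrichment_efficiency(target_depths, genome_depths):
    """Return mean target coverage divided by mean genome-wide coverage."""
    if not target_depths or not genome_depths:
        raise ValueError("both depth lists must be non-empty")
    mean_target = sum(target_depths) / len(target_depths)
    mean_genome = sum(genome_depths) / len(genome_depths)
    if mean_genome == 0:
        raise ValueError("genome-wide mean coverage is zero")
    return mean_target / mean_genome


# Hypothetical depth values, for illustration only.
target = [240, 260, 250]
genome = [0.5, 1.5, 1.0, 1.0]
print(enrichment_efficiency(target, genome))
```

In practice the depth values would come from a coverage tool run over the target intervals and over the whole genome; how the authors computed them is not stated in the abstract.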