ABSTRACT
BACKGROUND: Third-generation sequencing (TGS) based on long-read technology is increasingly used to identify thalassemia and hemoglobin (Hb) variants. The aim of the present study was to explore the genotype spectrum of thalassemia and Hb variants in the Quanzhou region of Southeast China by TGS. METHODS: This study included 6,174 subjects with thalassemia traits from the Quanzhou region of Southeast China. All of them underwent common thalassemia gene testing using DNA reverse dot-blot hybridization. Subjects suspected of being rare thalassemia carriers were further subjected to TGS to identify rare or novel α- and β-globin gene variants, and the results were verified by Sanger sequencing and/or gap PCR. RESULTS: Of the 6,174 included subjects, 2,390 (38.71%) were identified as α- and β-globin gene mutation carriers, including 40 carrying rare or novel α- and β-thalassemia mutations. The αCD30(-GAG)α and Hb Lepore-Boston-Washington variants were reported for the first time in Fujian province, Southeast China. Moreover, βCD15(TGG>TAG), βIVS-II-761, β0-Filipino (~45 kb deletion), and Hb Lepore-Quanzhou were identified for the first time in the Chinese population. In addition, 35 cases of Hb variants were detected; the rare variants Hb Jilin and Hb Beijing were reported for the first time in Fujian province, China. Among them, one case with compound αααanti3.7 and Hb G-Honolulu variants was identified in this study. CONCLUSION: Our findings may provide valuable data for enriching the spectrum of thalassemia and highlight the clinical application value of TGS-based α- and β-globin genetic testing.
Subjects
alpha-Globins, beta-Globins, Humans, beta-Globins/genetics, alpha-Globins/genetics, China, Female, Male, Adult, High-Throughput Nucleotide Sequencing, Mutation, Adolescent, Child, Thalassemia/genetics, Young Adult, beta-Thalassemia/genetics, Genotype, Middle Aged, alpha-Thalassemia/genetics, East Asian People
ABSTRACT
Outbreaks of furunculosis cause significant losses in salmonid aquaculture worldwide. With a recent rise in antimicrobial resistance, regulatory measures to minimize the use of antibiotics in animal husbandry, including aquaculture, have increased scrutiny and availability of veterinary medical products to control this disease in production facilities. In such a regulatory environment, the utility of autogenous vaccines to assist with disease prevention and control as a veterinary-guided prophylactic measure is of high interest to producers and veterinary services alike. However, evolving concepts of epidemiological units and epidemiological links need to be considered during approval and acceptance procedures for the application of autogenous vaccines in multiple aquaculture facilities. Here, we present the results of solid-state nanopore sequencing (Oxford Nanopore Technologies, ONT) performed on 54 isolates of Aeromonas salmonicida ssp. salmonicida sampled during clinical outbreaks of furunculosis in different aquaculture facilities in Bavaria, Germany, from 2017 to 2020. All of the analyses performed (phylogeny, single nucleotide polymorphism analysis, and 3D protein modeling of major immunogenic proteins) support a high probability that all studied isolates belong to the same epidemiological unit. We also describe a cost-effective method of whole-genome analysis using ONT as a viable strategy for studying outbreaks of other pathogens in aquatic veterinary medicine, with the aim of developing the best autogenous vaccine candidates applicable to multiple aquaculture establishments.
ABSTRACT
OBJECTIVE: To describe a novel α-thalassemia deletion identified in a newborn by third-generation sequencing (TGS). CASE REPORT: The proband, a newborn who underwent neonatal capillary electrophoresis (CE) screening, exhibited suspected α0-thalassemia carrier status (Hb Bart's 3.0%). Notably, both parents had negative results on thalassemia screening during pregnancy. Multiplex ligation-dependent probe amplification (MLPA) showed a deletion between the 364 nt and 472 nt probes that extended from the HBZ gene to the region downstream of the RGS11 gene. Subsequently, TGS determined the approximate breakpoint positions of this deletion, indicating a length exceeding 145 kb (chr16:127,815-273,190 del 145,376 bp). Sanger sequencing validated the upstream and downstream breakpoints of this deletion. Only maternal data were available for pedigree analysis, as the father's sample was lacking. MLPA showed no deletion in the mother, suggesting possible paternal inheritance. The deletion was named the Guigang deletion (--Guigang) after the proband's city of origin, Guigang. CONCLUSIONS: We report a novel α-thalassemia deletion and provide insights into its hematological phenotype and molecular analysis. These findings have implications for genetic counseling and prenatal diagnosis.
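The reported deletion size can be checked from the reported coordinates. The following is a minimal sketch that assumes the interval chr16:127,815-273,190 is 1-based and inclusive at both ends (the abstract does not state the coordinate convention); the function name `inclusive_span_length` is hypothetical.

```python
def inclusive_span_length(start: int, end: int) -> int:
    """Length in bp of a 1-based interval that includes both endpoints.

    Assumption: both reported coordinates belong to the deleted segment.
    """
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates as reported for the --Guigang deletion.
guigang_length = inclusive_span_length(127_815, 273_190)
print(f"{guigang_length:,} bp")  # 145,376 bp
```

Under that assumption the result matches the 145,376 bp stated in the abstract; a half-open convention would give one base less.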
Subjects
Sequence Deletion, alpha-Globins, alpha-Thalassemia, Humans, Newborn Infant, alpha-Globins/genetics, Female, alpha-Thalassemia/genetics, alpha-Thalassemia/diagnosis, Multigene Family, Male, High-Throughput Nucleotide Sequencing, East Asian People
ABSTRACT
DNA-based technologies have been used in forensic practice since the mid-1980s. While PCR-based STR genotyping using capillary electrophoresis remains the gold standard for generating DNA profiles in routine casework worldwide, the research community is continually seeking alternative methods capable of providing additional information to enhance discrimination power or contribute new investigative leads. Oxford Nanopore Technologies (ONT) and PacBio third-generation sequencing have revolutionized the field, offering real-time capabilities, single-molecule resolution, and long-read sequencing (LRS). ONT, the pioneer of nanopore sequencing, uses biological nanopores to analyze nucleic acids in real time. Its devices may represent an interesting alternative for forensic research and routine casework, given that they offer unparalleled flexibility in a portable size: they enable sequencing approaches that range widely from PCR-amplified short target regions (e.g., CODIS STRs) to PCR-free whole-transcriptome or even ultra-long whole-genome sequencing. Despite its higher error rate compared to Illumina sequencing, nanopore sequencing can significantly improve accuracy in read alignment against a reference genome or in de novo genome assembly. This is achieved by generating long contiguous sequences that correctly assemble repetitive sections and regions with structural variation. Moreover, it allows real-time determination of DNA methylation status from native DNA without the need for bisulfite conversion. LRS enables the analysis of thousands of markers at once, providing phasing information and eliminating the need for multiple assays. This maximizes the information retrieved from a single invaluable sample. In this review, we explore the potential use of LRS in different forensic genetics approaches.
ABSTRACT
Spinal muscular atrophy (SMA) is the second most common fatal genetic disease in infancy. It is caused by deletion or intragenic pathogenic variants of the causative gene SMN1, which lead to degeneration of anterior horn motor neurons and to progressive myasthenia and muscle atrophy. Early treatment improves motor function and prognosis in patients with SMA, but drugs are expensive and do not cure the disease. Therefore, carrier screening seems to be the most effective way to prevent SMA birth defects. In this study, we genetically analyzed 1400 samples using multiplex ligation-dependent probe amplification (MLPA) and quantitative polymerase chain reaction (qPCR), and compared the consistency of the results. We randomly selected 44 samples with consistent MLPA and qPCR results for comprehensive SMA analysis (CASMA) using a long-read sequencing (LRS)-based approach. CASMA results showed 100% consistency and visually and intuitively explained the inconsistency between the exon 7 and exon 8 copy numbers detected by MLPA in 13 samples. A total of 16 samples showed inconsistent MLPA and qPCR results for SMN1 exon 7. CASMA was performed on all of these samples, and the results were consistent with those obtained after resampling for MLPA and qPCR detection. CASMA also detected an additional intragenic variant, c.-39A>G, in a sample with two copies of SMN1 (RT02). Finally, we detected 23 SMA carriers, with an estimated carrier rate of 1/61 in this cohort. In addition, CASMA identified the "2 + 0" carrier status of SMN1 and SMN2 in a family by analyzing the genotypes of only three samples (parents and one sibling). CASMA has great advantages over MLPA and qPCR assays and could become a powerful technical support for large-scale screening of SMA.
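The carrier rate quoted above follows directly from the reported counts. A minimal sketch, assuming the denominator is the full set of 1400 analyzed samples (the abstract does not state whether any samples were excluded); `carrier_rate_one_in` is a hypothetical helper name.

```python
def carrier_rate_one_in(carriers: int, screened: int) -> int:
    """Express a carrier frequency as the N in '1 in N', rounded to the nearest integer."""
    if carriers <= 0 or screened <= 0:
        raise ValueError("counts must be positive")
    return round(screened / carriers)


# Counts as reported: 23 carriers among 1400 samples.
one_in = carrier_rate_one_in(23, 1400)
print(f"1/{one_in}")  # 1/61
```

The unrounded ratio is about 60.9, which rounds to the 1/61 given in the abstract.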
Subjects
Exons, Spinal Muscular Atrophy, Survival of Motor Neuron 1 Protein, Humans, Spinal Muscular Atrophy/genetics, Spinal Muscular Atrophy/diagnosis, Survival of Motor Neuron 1 Protein/genetics, Female, Male, Exons/genetics, Genetic Carrier Screening/methods, Multiplex Polymerase Chain Reaction/methods, DNA Sequence Analysis/methods
ABSTRACT
Vairimorpha (Nosema) ceranae is a unicellular fungus that obligately infects the midgut epithelial cells of adult honeybees, causing bee microsporidiosis and jeopardizing bee health and production. This work aims to construct the full-length transcriptome of V. ceranae and conduct a relevant investigation using PacBio single-molecule real-time (SMRT) sequencing technology. Following PacBio SMRT sequencing, 41,950 circular consensus sequences (CCS) were generated, and 25,068 full-length non-chimeric (FLNC) reads were then detected. After polishing, 4387 high-quality, full-length transcripts were obtained. In total, 778, 2083, 1202, 1559, 1457, 1232, 1702, and 3896 full-length transcripts could be annotated to the COG, GO, KEGG, KOG, Pfam, Swiss-Prot, eggNOG, and Nr databases, respectively. Additionally, 11 alternative splicing (AS) events occurring in 6 genes were identified, including 1 alternative 5' splice-site event and 10 intron retention events. The structures of 225 annotated genes in the V. ceranae reference genome were optimized, of which 29 genes were extended at both the 5' UTR and the 3' UTR, while 90 and 106 genes were extended at the 5' UTR and the 3' UTR, respectively. Furthermore, a total of 29 high-confidence lncRNAs were obtained, including 12 sense-lncRNAs, 10 lincRNAs, and 7 antisense-lncRNAs. Taken together, the high-quality, full-length transcriptome of V. ceranae was constructed and annotated, the structures of annotated genes in the V. ceranae reference genome were improved, and abundant new genes, transcripts, and lncRNAs were discovered. Findings from this work offer a valuable resource and a crucial foundation for molecular and omics research on V. ceranae.
Subjects
Alternative Splicing, Nosema, Transcriptome, Virulence Factors, Transcriptome/genetics, Virulence Factors/genetics, Nosema/genetics, Nosema/pathogenicity, Animals, Bees/microbiology, Bees/genetics, Protein Isoforms/genetics, Fungal Proteins/genetics, Fungal Proteins/metabolism, Molecular Sequence Annotation
ABSTRACT
Background: The Neotropics harbors the largest species richness of the planet; however, even in well-studied groups, there are potentially hundreds of species that lack a formal description, and likewise, many already described taxa are difficult to identify using morphology. Specifically in small mammals, complex morphological diagnoses have been facilitated by the use of molecular data, particularly from mitochondrial sequences, to obtain accurate species identifications. Obtaining mitochondrial markers implies the use of PCR and specific primers, which are largely absent for non-model organisms. Oxford Nanopore Technologies (ONT) is a new alternative for sequencing the entire mitochondrial genome without the need for specific primers. Only a limited number of studies have employed exclusively ONT long-reads to assemble mitochondrial genomes, and few studies have yet evaluated the usefulness of such reads in multiple non-model organisms. Methods: We implemented fieldwork to collect small mammals, including rodents, bats, and marsupials, in five localities in the northern extreme of the Cordillera Central of Colombia. DNA samples were sequenced using the MinION device and Flongle flow cells. Shotgun-sequenced data was used to reconstruct the mitochondrial genome of all the samples. In parallel, using a customized computational pipeline, species-level identifications were obtained based on sequencing raw reads (Whole Genome Sequencing). ONT-based identifications were corroborated using traditional morphological characters and phylogenetic analyses. Results: A total of 24 individuals from 18 species were collected, morphologically identified, and deposited in the biological collection of Universidad EAFIT. Our different computational pipelines were able to reconstruct mitochondrial genomes from exclusively ONT reads. We obtained three new mitochondrial genomes and eight new molecular mitochondrial sequences for six species. 
Our species identification pipeline was able to obtain accurate species identifications for up to 75% of the individuals in as little as 5 s. Finally, our phylogenetic analyses corroborated the identifications from our automated species identification pipeline and revealed important contributions to the knowledge of the diversity of Neotropical small mammals. Discussion: This study evaluated different pipelines for reconstructing mitochondrial genomes from non-model organisms using exclusively ONT reads, benchmarking these protocols on a multi-species dataset. The proposed methodology can be applied by non-expert taxonomists and has the potential to be implemented in real time, under field conditions, and without the need to euthanize the organisms. Therefore, it stands as a relevant tool to help increase the available data for non-model organisms and the rate at which researchers can characterize life, especially in highly biodiverse places such as the Neotropics.
Subjects
Mitochondrial Genome, Mammals, DNA Sequence Analysis, Animals, Mammals/genetics, Mitochondrial Genome/genetics, DNA Sequence Analysis/methods, Nanopores, Colombia, Mitochondrial DNA/genetics, Phylogeny, Chiroptera/genetics, Nanopore Sequencing/methods
ABSTRACT
Hybrid Pennisetum, a top biomass energy source, faces usage limitations because of its scarce lactic acid bacteria and high fiber content. This study assessed the influence of rumen fluid pretreatment on hybrid Pennisetum's silage, with focus on silage duration and rumen fluid effects on quality and fiber decomposition. Advanced third-generation sequencing was used to track microbial diversity changes and revealed that rumen fluid considerably enhanced dry matter, crude protein, and water-soluble carbohydrates, thus improving fermentation quality to satisfactory pH levels (3.40-3.67). Ideal results, including the highest fiber breakdown and enzymatic efficiency (47.23 %), were obtained with 5 % rumen fluid in 60 days. The addition of rumen fluid changed the dominant species, including Paucilactobacillus vaccinostercus (0.00 % vs. 18.21 %) and Lactiplantibacillus plantarum (21.03 % vs. 47.02 %), and no Enterobacter was detected in the high-concentration treatments. Moreover, strong correlations were found between specific lactic acid bacteria and fermentation indicators, revealing the potential of achieving efficient and economically beneficial hybrid Pennisetum production.
Subjects
Fermentation, Pennisetum, Rumen, Silage, Silage/microbiology, Rumen/microbiology, Animals, Dietary Fiber/metabolism, Microbiota
ABSTRACT
BACKGROUND: Infectious diseases are still one of the greatest threats to human health, and the etiology of 20% of cases of clinical fever is unknown; therefore, rapid identification of pathogens is highly important. Traditional culture methods can detect only a limited number of pathogens and are time-consuming; serologic detection has window periods and false-positive and false-negative problems; and nucleic acid molecular detection methods can detect only a few known pathogens at a time. Third-generation nanopore sequencing technology provides new options for identifying pathogens. CASE SUMMARY: Case 1: The patient was admitted to the hospital with abdominal pain for three days and cessation of defecation for five days, accompanied by cough and sputum. Nanopore sequencing of the drainage fluid revealed the presence of oral-like bacteria, leading to a clinical diagnosis of bronchopleural fistula. Cefoperazone sodium sulbactam treatment was effective. Case 2: The patient was admitted to the hospital with fever and headache, and CT revealed lung inflammation. Antibiotic treatment for Streptococcus pneumoniae, identified through nanopore sequencing of cerebrospinal fluid, was effective. Case 3: The patient was admitted to our hospital with intermittent fever and an enlarged neck mass that had persisted for more than six months. Despite antibacterial treatment, her symptoms worsened. Nanopore sequencing identified Aspergillus brookii, for which voriconazole treatment was effective. The patient was diagnosed with mixed-cellularity classical Hodgkin's lymphoma with infection. CONCLUSION: Third-generation nanopore sequencing technology allows for rapid and accurate detection of pathogens in human infectious diseases.
ABSTRACT
The classification of Uranoscopidae species is controversial, and Ichthyscopus pollicaris, a member of Uranoscopidae, was first reported in 2019. In the present study, the whole genome sequence of I. pollicaris was generated using the PacBio and Illumina platforms for the first time. After de novo assembly and correction of the high-quality PacBio data, a 527.25 Mb I. pollicaris genome with an N50 length of 11.25 Mb was generated. In addition, 170.41 Mb of repetitive sequences, 21,263 genes, 784 miRNAs, 2,225 tRNAs, 3,004 rRNAs, and 1,422 snRNAs were annotated in the I. pollicaris genome. Furthermore, 3,168 single-copy orthologous genes were used to reconstruct the phylogenetic relationship between I. pollicaris and 11 other species. The draft genome sequences have been deposited in the NCBI database under accession number PRJNA1071810.
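For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of the usual definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The function name `n50` and the toy contig lengths are illustrative assumptions, not data from the study.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig lengths.

    Contigs are taken from longest to shortest until their cumulative
    length reaches at least half of the total assembly length; the length
    of the last contig added is the N50.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if 2 * running >= total:  # integer test avoids float rounding
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy example (hypothetical lengths): total = 30, half = 15.
# 10 covers 10 bases, 10 + 8 covers 18, so N50 is 8.
print(n50([2, 10, 4, 8, 6]))  # 8
```

An N50 of 11.25 Mb for a 527.25 Mb assembly therefore means that contigs of at least 11.25 Mb account for at least half of the assembled sequence.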
ABSTRACT
Molecular HLA typing techniques are currently undergoing rapid evolution. While real-time PCR is established as the standard method in tissue typing laboratories for the allocation of solid organs, next-generation sequencing (NGS) for high-resolution HLA typing is becoming indispensable but is not yet suitable for deceased donors. By contrast, high-resolution typing is essential for stem cell transplantation and is increasingly required for questions relating to various disease associations. In this multicentre clinical study, the TGS technique using nanopore sequencing was investigated with the NanoTYPE™ kit and NanoTYPER™ software (Omixon Biocomputing Ltd., Budapest, Hungary) with regard to the concordance of the results with NGS and its practicability in diagnostic laboratories. The results of 381 samples show a concordance of 99.58% for 11 HLA loci: HLA-A, -B, -C, -DRB1, -DRB3, -DRB4, -DRB5, -DQA1, -DQB1, -DPA1 and -DPB1. The quality control (QC) data show very high quality of the sequencing performed in each laboratory: 34,926 (97.15%) QC values were returned as 'passed', 862 (2.40%) as 'inspect' and 162 (0.45%) as 'failed'. We show that an 'inspect' or 'failed' QC warning does not automatically lead to incorrect HLA typing. The advantages of nanopore sequencing are speed, flexibility, reusability of the flow cells and easy implementation in the laboratory. There are challenges, such as exon coverage and the handling of large amounts of data. Finally, nanopore sequencing presents potential for applications in basic research within the fields of epigenetics and genomics and holds significance for clinical concerns.
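The QC percentages above can be reproduced from the three reported counts. This is a minimal sketch that assumes the three categories are exhaustive, so that the total number of QC values is their sum (35,950); the abstract does not state the total explicitly.

```python
# QC outcome counts as reported in the abstract.
qc_counts = {"passed": 34_926, "inspect": 862, "failed": 162}

# Assumption: the three categories cover every returned QC value.
total_qc_values = sum(qc_counts.values())

qc_percent = {
    status: round(100 * count / total_qc_values, 2)
    for status, count in qc_counts.items()
}

for status, pct in qc_percent.items():
    print(f"{status}: {pct:.2f}%")
# passed: 97.15%
# inspect: 2.40%
# failed: 0.45%
```

Under that assumption the computed shares agree with the reported 97.15%, 2.4% and 0.45%.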
Subjects
HLA Antigens, High-Throughput Nucleotide Sequencing, Histocompatibility Testing, Humans, Histocompatibility Testing/methods, High-Throughput Nucleotide Sequencing/methods, HLA Antigens/genetics, Software, Alleles, Genotype, Quality Control, Nanopore Sequencing/methods, Genotyping Techniques/methods
ABSTRACT
While poly(3-hydroxybutyrate) (PHB) holds promise as a bioplastic, its commercial utilization has been hampered by the high cost of raw materials. Glycerol, however, is emerging as a viable feedstock for PHB production, offering a sustainable production approach and substantial cost reduction potential. The identification and characterization of strains capable of converting glycerol into PHB represent a pivotal strategy in advancing PHB production research. In this study, we isolated a strain, Ralstonia sp. RRA (RRA), that exhibits remarkable proficiency in synthesizing PHB from glycerol. With glycerol as the carbon source, RRA achieved a specific growth rate of 0.19 h⁻¹, attaining a PHB content of approximately 50% within 30 h. Through third-generation genome and transcriptome sequencing, we elucidated the genome composition and identified a total of eight genes (glpR, glpD, glpS, glpT, glpP, glpQ, glpV, and glpK) involved in the glycerol metabolism pathway. Based on these findings, strain RRA shows significant promise for producing PHB from low-cost renewable carbon sources.
ABSTRACT
BACKGROUND: Structural variation (SV) detection methods using third-generation sequencing data are widely employed, yet accurately detecting SVs remains challenging. Different methods often yield inconsistent results for certain SV types, complicating tool selection and revealing biases in detection. RESULTS: This study comprehensively evaluates 53 SV detection pipelines using simulated and real data from PacBio (CLR: Continuous Long Read, CCS: Circular Consensus Sequencing) and Nanopore (ONT) platforms. We assess their performance in detecting various sizes and types of SVs, breakpoint biases, and genotyping accuracy with various sequencing depths. Notably, pipelines such as Minimap2-cuteSV2, NGMLR-SVIM, PBMM2-pbsv, Winnowmap-Sniffles2, and Winnowmap-SVision exhibit comparatively higher recall and precision. Our findings also show that combining multiple pipelines with the same aligner, like pbmm2 or winnowmap, can significantly enhance performance. The individual pipelines' detailed ranking and performance metrics can be viewed in a dynamic table: http://pmglab.top/SVPipelinesRanking. CONCLUSIONS: This study comprehensively characterizes the strengths and weaknesses of numerous pipelines, providing valuable insights that can improve SV detection in third-generation sequencing data and inform SV annotation and function prediction.
Subjects
High-Throughput Nucleotide Sequencing, Humans, High-Throughput Nucleotide Sequencing/methods, Genomic Structural Variation, Software, DNA Sequence Analysis/methods
ABSTRACT
α-thalassemia major (α-TM) often causes Hb Bart's (γ4) hydrops fetalis and severe obstetric complications in the mother. Step-wise screening of couples at risk of having offspring affected by α-TM is an efficient prevention method, but some rare thalassemia genotypes cannot be detected. A 32-year-old male with low HbA2 (2.4%) and mild anemia underwent real-time PCR-based multicolor melting curve analysis (MMCA) because his wife was a --SEA deletion carrier. The result of multiplex ligation-dependent probe amplification (MLPA) suggested the existence of the --SEA deletion in the proband. A novel deletion of the α-globin gene cluster was found using self-designed MLPA probes combined with longer PCR, and third-generation sequencing further characterized it as a 16.8 kb deletion (hg38, chr16:165,236-182,113). A fragment ranging from 153,226 to 154,538 (GRCh38/hg38) was identified, which suggested a homologous recombination event. Third-generation sequencing is accurate and efficient for characterizing complex structural variations.
Subjects
Multigene Family, Sequence Deletion, alpha-Globins, alpha-Thalassemia, Humans, Male, Adult, alpha-Globins/genetics, alpha-Thalassemia/genetics, alpha-Thalassemia/diagnosis, High-Throughput Nucleotide Sequencing, Female
ABSTRACT
OBJECTIVE: To explore the diagnostic value of third-generation nanopore sequencing technology in patients with diabetes mellitus suspected of having pulmonary tuberculosis. METHODS: Samples, including sputum and bronchoalveolar lavage fluid (BALF), were collected from patients with diabetes mellitus suspected of having pulmonary tuberculosis who were admitted from October 2021 to August 2023. Nanopore sequencing, acid-fast bacilli (AFB) smear, mycobacterial solid culture, Xpert MTB/RIF, and DNA detection were performed, and their diagnostic efficacy was compared. RESULTS: Third-generation nanopore sequencing technology exhibited high accuracy in diagnosing pulmonary tuberculosis in patients with diabetes mellitus. Compared to traditional methods, nanopore sequencing showed significantly improved sensitivity (76.80 %), negative predictive value (30.40 %), coincidence (77.92 %), and diagnostic accuracy (AUC = 0.822). Combined detection with Xpert achieved the highest diagnostic performance, with increased sensitivity (81.20 %), positive predictive value (98.20 %), negative predictive value (35.00 %), coincidence (81.82 %), and AUC (0.843). Although acid-fast staining had limitations, its combination with nanopore sequencing improved screening effectiveness. CONCLUSION: Compared to established diagnostic modalities such as acid-fast staining, mycobacterial solid culture, Xpert MTB/RIF, and DNA detection, third-generation nanopore sequencing technology demonstrates a significant improvement in sensitivity for detecting suspected pulmonary tuberculosis in diabetic patients. Notably, the combined application of nanopore sequencing with Xpert testing offers a further enhancement in diagnostic accuracy.
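The metrics reported above are all derived from a 2x2 table of test result against reference diagnosis. The sketch below shows the standard definitions; the counts are hypothetical and are not the study's data, and treating "coincidence" as overall agreement, (TP + TN) / N, is an assumption, since the abstract does not define the term.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Standard 2x2 diagnostic accuracy measures.

    tp/fn: diseased subjects testing positive/negative;
    fp/tn: non-diseased subjects testing positive/negative.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
        # Assumed meaning of "coincidence": overall agreement with the reference.
        "agreement": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical high-prevalence cohort: 100 diseased, 10 non-diseased.
metrics = diagnostic_metrics(tp=80, fp=2, fn=20, tn=8)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With these illustrative counts, sensitivity is 80% but the negative predictive value is below 30%, because the few true negatives are outnumbered by false negatives when most enrolled patients have the disease. This is one way a high sensitivity and a low negative predictive value can coexist, as in the figures reported above.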
Subjects
Diabetes Mellitus, Mycobacterium tuberculosis, Nanopore Sequencing, Sensitivity and Specificity, Sputum, Pulmonary Tuberculosis, Humans, Mycobacterium tuberculosis/genetics, Mycobacterium tuberculosis/isolation & purification, Pulmonary Tuberculosis/diagnosis, Pulmonary Tuberculosis/microbiology, Nanopore Sequencing/methods, Male, Female, Middle Aged, Sputum/microbiology, Adult, Bronchoalveolar Lavage Fluid/microbiology, Aged, Bacterial DNA/genetics, Molecular Diagnostic Techniques/methods
ABSTRACT
Deletion is a crucial type of genomic structural variation and is associated with numerous genetic diseases. The advent of third-generation sequencing technology has facilitated the analysis of complex genomic structures and the elucidation of the mechanisms underlying phenotypic changes and disease onset due to genomic variants. Importantly, it has introduced innovative perspectives for deletion variant calling. Here we propose a method named Dual Attention Structural Variation (DASV) to analyze deletion structural variations in sequencing data. DASV converts gene alignment information into images and integrates them with genomic sequencing data through a dual attention mechanism. Subsequently, it employs a multi-scale network to precisely identify deletion regions. In comparisons with four widely used genome structural variation calling tools (cuteSV, SVIM, Sniffles and PBSV), DASV consistently achieves a balance between precision and recall, enhancing the F1 score across various datasets. The source code is available at https://github.com/deconvolution-w/DASV.
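Since the comparison above is framed in terms of precision, recall and F1, a minimal sketch of those definitions may help. The counts are hypothetical and do not come from the DASV evaluation; how a called deletion is matched to a truth-set deletion (for example by breakpoint distance or reciprocal overlap) is left out and would have to be decided separately.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Precision, recall and F1 from matched-call counts.

    tp: called deletions that match a truth-set deletion
    fp: called deletions with no match in the truth set
    fn: truth-set deletions that were not called
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical caller output: 90 matched calls, 10 spurious, 30 missed.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
# precision=0.900 recall=0.750 F1=0.818
```

Because F1 is a harmonic mean, it is pulled toward the lower of the two values, which is why a caller that balances precision and recall can score a higher F1 than one that excels at only one of them.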
Subjects
High-Throughput Nucleotide Sequencing, Software, Humans, High-Throughput Nucleotide Sequencing/methods, Sequence Deletion, DNA Sequence Analysis/methods, Algorithms, Genomics/methods, Computational Biology/methods
ABSTRACT
Honeybees are indispensable pollinators in nature with pivotal ecological, economic, and scientific value. However, a full-length transcriptome for Apis mellifera, assembled with the advanced third-generation nanopore sequencing technology, has yet to be reported. Here, nanopore sequencing of the midgut tissues of uninoculated and Nosema ceranae-inoculated A. mellifera workers was conducted, and the full-length transcriptome was then constructed and annotated based on high-quality long reads. The sequences and annotations of the current A. mellifera reference genome were then improved. A total of 5,942,745 and 6,664,923 raw reads were produced from midguts of workers at 7 days post-inoculation (dpi) with N. ceranae and 10 dpi, while 7,100,161 and 6,506,665 raw reads were generated from the midguts of corresponding uninoculated workers. After strict quality control, 6,928,170, 6,353,066, 5,745,048, and 6,416,987 clean reads were obtained, with a length distribution ranging from 1 kb to 10 kb. Additionally, 16,824, 17,708, 15,744, and 18,246 full-length transcripts were respectively detected, including 28,019 nonredundant ones. Among these, 43,666, 30,945, 41,771, 26,442, and 24,532 full-length transcripts could be annotated to the Nr, KOG, eggNOG, GO, and KEGG databases, respectively. Additionally, 501 novel genes (20,326 novel transcripts) were identified for the first time, among which 401 (20,255), 193 (13,365), 414 (19,186), 228 (12,093), and 202 (11,703) were respectively annotated to each of the aforementioned five databases. The expression and sequences of three randomly selected novel transcripts were confirmed by RT-PCR and Sanger sequencing. The 5' UTR of 2082 genes, the 3' UTR of 2029 genes, and both the 5' and 3' UTRs of 730 genes were extended. Moreover, 17,345 SSRs, 14,789 complete ORFs, 1224 long non-coding RNAs (lncRNAs), and 650 transcription factors (TFs) from 37 families were detected. 
Findings from this work not only refine the annotation of the A. mellifera reference genome, but also provide a valuable resource and basis for relevant molecular and -omics studies.
Subjects
Molecular Sequence Annotation, Transcriptome, Bees/genetics, Animals, Transcriptome/genetics, Insect Genome, Nosema/genetics, Nanopore Sequencing/methods, Gene Expression Profiling/methods
ABSTRACT
Brain tumors and genomics have a long-standing history, given that glioblastoma was the first cancer studied by The Cancer Genome Atlas. Continuous advances in sequencing technologies over the decades have aided the molecular characterization of brain tumors for diagnosis, prognosis, and treatment. Since the implementation of molecular biomarkers in the WHO classification of CNS tumors in 2016, the genomics of brain tumors has been integrated into diagnostic criteria. Long-read sequencing, also known as third-generation sequencing, is an emerging technique that allows for the sequencing of longer DNA segments, leading to improved detection of structural variants and epigenetic modifications. These capabilities are opening the way to better characterization of brain tumors. Here, we present a comprehensive summary of the state of the art of third-generation sequencing as applied to brain tumor diagnosis, prognosis, and treatment. We discuss the advantages and potential new implementations of long-read sequencing in clinical paradigms for neuro-oncology patients.
ABSTRACT
HLA-DRB5*01:01:01:07 differs from HLA-DRB5*01:01:01:01 by two nucleotide changes in intron 1 and intron 2.
Subjects
HLA-DRB5 Chains, Histocompatibility Testing, Humans, Alleles, Base Sequence, China, East Asian People, Exons, Histocompatibility Testing/methods, HLA-DRB5 Chains/genetics, Introns, Sequence Alignment, DNA Sequence Analysis/methods
ABSTRACT
SVhawkeye is a novel visualization tool created to rapidly extract essential structural information from third-generation sequencing data, such as data generated by PacBio or Oxford Nanopore Technologies. Its primary focus is on visualizing the structural variations commonly encountered in whole-genome sequencing (WGS) experiments, including deletions, insertions, duplications, inversions, and translocations. Additionally, SVhawkeye can display isoform structures obtained from Iso-Seq data and provides interval depth visualization for deducing local copy number variation (CNV). One noteworthy feature of SVhawkeye is its capacity to genotype structural variations, which improves genotyping accuracy. SVhawkeye is open-source software developed in Python and R, and it is freely accessible on GitHub (https://github.com/yywan0913/SVhawkeye).