Results 1 - 20 of 14,257
1.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38752856

ABSTRACT

Enhancing the reproducibility and comprehension of adaptive immune receptor repertoire sequencing (AIRR-seq) data analysis is critical for scientific progress. This study presents guidelines for reproducible AIRR-seq data analysis, and a collection of ready-to-use pipelines with comprehensive documentation. To this end, ten common pipelines were implemented using ViaFoundry, a user-friendly interface for pipeline management and automation. This is accompanied by versioned containers, documentation and archiving capabilities. The automation of pre-processing analysis steps and the ability to modify pipeline parameters according to specific research needs are emphasized. AIRR-seq data analysis is highly sensitive to varying parameters and setups; using the guidelines presented here, the ability to reproduce previously published results is demonstrated. This work promotes transparency, reproducibility, and collaboration in AIRR-seq data analysis, serving as a model for handling and documenting bioinformatics pipelines in other research domains.


Subject(s)
Computational Biology , Software , Humans , Computational Biology/methods , Reproducibility of Results , Receptors, Immunologic/genetics , High-Throughput Nucleotide Sequencing/methods , Adaptive Immunity/genetics , Guidelines as Topic
3.
Front Cell Infect Microbiol ; 14: 1395239, 2024.
Article in English | MEDLINE | ID: mdl-38774626

ABSTRACT

Background: Traditional microbiological detection methods used to detect pulmonary infections in people living with HIV (PLHIV) are usually time-consuming and have low sensitivity, leading to delayed treatment. We aimed to evaluate the diagnostic value of metagenomic next-generation sequencing (mNGS) for microbial diagnosis of suspected pulmonary infections in PLHIV. Methods: We retrospectively analyzed PLHIV who were hospitalized due to suspected pulmonary infections at the Sixth People's Hospital of Zhengzhou from November 1, 2021 to June 30, 2022. Bronchoalveolar lavage fluid (BALF) samples of PLHIV were collected and subjected to routine microbiological examination and mNGS detection. The diagnostic performance of the two methods was compared to evaluate the diagnostic value of mNGS for unknown pathogens. Results: This study included a total of 36 PLHIV with suspected pulmonary infections, of which 31 were male. The reporting time of mNGS was significantly shorter than that of conventional microbiological tests (CMTs). The mNGS positive rate of BALF samples in PLHIV was 83.33%, which was significantly higher than that of smear and culture (44.4%, P<0.001). In addition, 11 patients showed consistent results between the two methods. Furthermore, mNGS showed excellent performance in identifying multi-infections in PLHIV, and 27 pathogens were detected in the BALF of 30 PLHIV by mNGS, among which 15 PLHIV were found to have multiple microbial infections (at least 3 pathogens). Pneumocystis jirovecii, human herpesvirus type 5, and human herpesvirus type 4 were the most common pathogen types. Conclusions: For PLHIV with suspected pulmonary infections, mNGS is capable of rapidly and accurately identifying the pathogen causing the pulmonary infection, which contributes to implementing timely and accurate anti-infective treatment.


Subject(s)
Bronchoalveolar Lavage Fluid , HIV Infections , High-Throughput Nucleotide Sequencing , Metagenomics , Humans , High-Throughput Nucleotide Sequencing/methods , Metagenomics/methods , Male , Female , HIV Infections/complications , HIV Infections/virology , Retrospective Studies , Bronchoalveolar Lavage Fluid/microbiology , Bronchoalveolar Lavage Fluid/virology , Adult , Middle Aged , China , Coinfection/diagnosis , Coinfection/microbiology , Coinfection/virology , Respiratory Tract Infections/diagnosis , Respiratory Tract Infections/virology , Respiratory Tract Infections/microbiology
4.
Curr Protoc ; 4(5): e1041, 2024 May.
Article in English | MEDLINE | ID: mdl-38774978

ABSTRACT

The detection, validation, and subsequent interpretation of potentially mosaic single-nucleotide variants (SNVs) within next-generation sequencing data remain a challenge in both research and clinical laboratory settings. The ability to identify mosaic variants in high genome coverage sequencing data at levels of ≤1% underscores the necessity for developing guidelines and best practices to verify these variants orthogonally. Droplet digital PCR (ddPCR) has proven to be a powerful and precise method that allows for the determination of low-level variant fractions within a given sample. Herein we describe two precise ddPCR methods using either a fluorescent TaqMan hydrolysis probe approach or an EvaGreen fluorescent dye protocol. The TaqMan approach relies on two different fluorescent probes (FAM and HEX/VIC), each designed to amplify selectively only in the presence of a single nucleotide change denoting the variant or reference position. The fractional abundance is then calculated to determine the relative quantities of both alleles in the final sample. The EvaGreen protocol relies on two independent reactions with oligonucleotide primers designed with the single nucleotide change denoting the variant at the penultimate position of the primer. The relative amplification efficiency of both primer sets (reference and variant) can be compared to determine the mosaic level of a given variant. As the cost of high-coverage sequencing continues to decrease, the identification of potentially mosaic variants will also increase. The approaches outlined will allow clinicians and researchers a more precise determination of the true mosaic level of a given variant, allowing them to better assess not only its potential pathogenicity but also its possible recurrence risk when offering genetic counseling to families. © 2024 Wiley Periodicals LLC. Basic Protocol: Droplet digital PCR (ddPCR) with TaqMan hydrolysis probes Alternate Protocol: EvaGreen oligonucleotide-specific ddPCR.
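
The fractional abundance mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard Poisson correction used in digital PCR (mean copies per droplet = -ln(1 - fraction of positive droplets)) and treats the variant and reference channels independently. The function names and droplet counts are hypothetical and are not taken from the protocol itself.

```python
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Poisson-corrected mean number of target copies per droplet.

    Assumes targets are distributed randomly across droplets, so the
    fraction of negative droplets equals exp(-lambda).
    """
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need total > 0 and 0 <= positive < total")
    return -math.log(1.0 - positive / total)


def fractional_abundance(variant_pos: int, reference_pos: int, total: int) -> float:
    """Variant fraction = variant copies / (variant + reference copies)."""
    lam_var = copies_per_droplet(variant_pos, total)
    lam_ref = copies_per_droplet(reference_pos, total)
    if lam_var + lam_ref == 0:
        raise ValueError("no positive droplets in either channel")
    return lam_var / (lam_var + lam_ref)


# Hypothetical counts: 10 variant-positive and 990 reference-positive
# droplets out of 20000 accepted droplets.
fa = fractional_abundance(10, 990, 20000)
print(f"fractional abundance: {fa:.4%}")
```

The Poisson step matters most at high target concentrations, where a droplet is more likely to contain several copies; at the low variant fractions discussed here it changes the variant count only slightly.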


Subject(s)
Polymerase Chain Reaction , Polymorphism, Single Nucleotide , Polymorphism, Single Nucleotide/genetics , Humans , Polymerase Chain Reaction/methods , Mosaicism , Fluorescent Dyes/chemistry , High-Throughput Nucleotide Sequencing/methods
5.
PLoS One ; 19(5): e0303171, 2024.
Article in English | MEDLINE | ID: mdl-38768113

ABSTRACT

The tumor microenvironment (TME) is a complex dynamic system with many tumor-interacting components including tumor-infiltrating leukocytes (TILs), cancer associated fibroblasts, blood vessels, and other stromal constituents. It intrinsically affects tumor development and the pharmacology of oncology therapeutics, particularly immune-oncology (IO) treatments. Accurate measurement of the TME is therefore of great importance for understanding tumor immunity, identifying IO treatment mechanisms, developing predictive biomarkers, and ultimately, improving the treatment of cancer. Here, we introduce a mouse-IO NGS-based (NGSmIO) assay for accurately detecting and quantifying the mRNA expression of 1080 TME related genes in mouse tumor models. The NGSmIO panel was shown to be superior to the commonly used microarray approach: it hosts 300 more relevant genes to better characterize various lineages of immune cells, exhibits improved mRNA and protein expression correlation to flow cytometry, shows stronger correlation with mRNA expression than RNAseq with 10x higher sequencing depth, and demonstrates higher sensitivity in measuring low-expressed genes. We describe two studies: first, detecting the pharmacodynamic change of interferon-γ expression levels upon anti-PD-1: anti-CD4 combination treatment in MC38 and Hepa 1-6 tumors; and second, benchmarking baseline TILs in 14 syngeneic tumors using transcript level expression of lineage specific genes. Together, these demonstrate effective and robust applications of the NGSmIO panel.


Subject(s)
High-Throughput Nucleotide Sequencing , Tumor Microenvironment , Animals , Mice , Tumor Microenvironment/immunology , High-Throughput Nucleotide Sequencing/methods , Interferon-gamma/genetics , Interferon-gamma/metabolism , Cell Line, Tumor , Gene Expression Regulation, Neoplastic , Disease Models, Animal , Mice, Inbred C57BL , RNA, Messenger/genetics , Programmed Cell Death 1 Receptor/genetics , Programmed Cell Death 1 Receptor/metabolism , Neoplasms/genetics , Neoplasms/immunology , Female , Lymphocytes, Tumor-Infiltrating/immunology , Lymphocytes, Tumor-Infiltrating/metabolism , Gene Expression Profiling/methods
6.
Genes Chromosomes Cancer ; 63(5): e23238, 2024 May.
Article in English | MEDLINE | ID: mdl-38722224

ABSTRACT

Pleomorphic rhabdomyosarcoma (PRMS) is a rare and highly aggressive sarcoma, occurring mostly in the deep soft tissues of middle-aged adults and showing a variable degree of skeletal muscle differentiation. The diagnosis is challenging as pathologic features overlap with embryonal rhabdomyosarcoma (ERMS), malignant Triton tumor, and other pleomorphic sarcomas. As recurrent genetic alterations underlying PRMS have not been described to date, ancillary molecular diagnostic testing is not useful in subclassification. Herein, we perform genomic profiling of a well-characterized cohort of 14 PRMS, compared to a control group of 23 ERMS and other pleomorphic sarcomas (undifferentiated pleomorphic sarcoma and pleomorphic liposarcoma) using clinically validated DNA-targeted next-generation sequencing (NGS) panels (MSK-IMPACT). The PRMS cohort included eight males and six females, with a median age of 53 years (range 31-76 years). Despite similar tumor mutation burdens, the genomic landscape of PRMS, with a high frequency of TP53 (79%) and RB1 (43%) alterations, stood in stark contrast to ERMS, with 4% and 0%, respectively. CDKN2A deletions were more common in PRMS (43%), compared to ERMS (13%). In contrast, ERMS harbored somatic driver mutations in the RAS pathway and loss of function mutations in BCOR, which were absent in PRMS. Copy number variations in PRMS showed multiple chromosomal arm-level changes, most commonly gains of chr17p and chr22q and loss of chr6q. Notably, gain of chr8, commonly seen in ERMS (61%), was conspicuously absent in PRMS. The genomic profiles of other pleomorphic sarcomas were overall analogous to PRMS, showing shared alterations in TP53, RB1, and CDKN2A. Overall survival and progression-free survival of PRMS were significantly worse (p < 0.0005) than that of ERMS. Our findings revealed that the molecular landscape of PRMS aligns with other adult pleomorphic sarcomas and is distinct from that of ERMS. Thus, NGS assays may be applied in select challenging cases toward a refined classification. Finally, our data corroborate the inclusion of PRMS in the therapeutic bracket of pleomorphic sarcomas, given that their clinical outcomes are comparable.


Subject(s)
Rhabdomyosarcoma, Embryonal , Humans , Male , Female , Adult , Middle Aged , Aged , Rhabdomyosarcoma, Embryonal/genetics , Rhabdomyosarcoma, Embryonal/pathology , Rhabdomyosarcoma/genetics , Rhabdomyosarcoma/pathology , Rhabdomyosarcoma/classification , Mutation , High-Throughput Nucleotide Sequencing/methods , Genomics/methods , Biomarkers, Tumor/genetics , Retinoblastoma Binding Proteins/genetics , Ubiquitin-Protein Ligases
7.
BMC Bioinformatics ; 25(1): 180, 2024 May 08.
Article in English | MEDLINE | ID: mdl-38720249

ABSTRACT

BACKGROUND: High-throughput sequencing (HTS) has become the gold standard approach for variant analysis in cancer research. However, somatic variants may occur at low fractions due to contamination from normal cells or tumor heterogeneity; this poses a significant challenge for standard HTS analysis pipelines. The problem is exacerbated in scenarios with minimal tumor DNA, such as circulating tumor DNA in plasma. Assessing sensitivity and detection of HTS approaches in such cases is paramount, but time-consuming and expensive: specialized experimental protocols and a sufficient quantity of samples are required for processing and analysis. To overcome these limitations, we propose a new computational approach specifically designed for the generation of artificial datasets suitable for this task, simulating ultra-deep targeted sequencing data with low-fraction variants and demonstrating their effectiveness in benchmarking low-fraction variant calling. RESULTS: Our approach enables the generation of artificial raw reads that mimic real data without relying on pre-existing data by using NEAT, a fine-grained read simulator that generates artificial datasets using models learned from multiple different datasets. Then, it incorporates low-fraction variants to simulate somatic mutations in samples with minimal tumor DNA content. To prove the suitability of the created artificial datasets for low-fraction variant calling benchmarking, we used them as ground truth to evaluate the performance of widely-used variant calling algorithms: they allowed us to define tuned parameter values of major variant callers, considerably improving their detection of very low-fraction variants. 
CONCLUSIONS: Our findings highlight both the pivotal role of our approach in creating adequate artificial datasets with low tumor fraction, facilitating rapid prototyping and benchmarking of algorithms for such dataset type, as well as the important need of advancing low-fraction variant calling techniques.
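
To make the idea of a low-fraction variant concrete, the following toy sketch draws the number of variant-supporting reads at one position as independent Bernoulli trials. It is not NEAT and not the authors' pipeline. The function names, depth, and variant allele fraction (VAF) are hypothetical, and sequencing errors are ignored.

```python
import random


def simulate_alt_reads(depth: int, vaf: float, seed: int = 0) -> int:
    """Count reads carrying the variant among `depth` reads.

    Each read independently carries the variant with probability `vaf`,
    which is a binomial model of sampling at a single position.
    """
    if depth < 0 or not 0.0 <= vaf <= 1.0:
        raise ValueError("depth must be >= 0 and vaf within [0, 1]")
    rng = random.Random(seed)  # seeded for reproducible examples
    return sum(1 for _ in range(depth) if rng.random() < vaf)


def observed_vaf(depth: int, vaf: float, seed: int = 0) -> float:
    """Observed variant allele fraction for one simulated position."""
    return simulate_alt_reads(depth, vaf, seed) / depth


# Hypothetical ultra-deep position: 10000x coverage, true VAF of 0.5%.
print(simulate_alt_reads(10_000, 0.005), "variant reads out of 10000")
```

At such depths only a few dozen reads are expected to support the variant, which is why caller thresholds tuned for higher fractions can miss it.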


Subject(s)
Benchmarking , High-Throughput Nucleotide Sequencing , Neoplasms , High-Throughput Nucleotide Sequencing/methods , Humans , Neoplasms/genetics , Mutation , Algorithms , DNA, Neoplasm/genetics , Sequence Analysis, DNA/methods , Computational Biology/methods
8.
Microbiome ; 12(1): 84, 2024 May 09.
Article in English | MEDLINE | ID: mdl-38725076

ABSTRACT

BACKGROUND: Emergence of antibiotic resistance in bacteria is an important threat to global health. Antibiotic resistance genes (ARGs) are some of the key components that define bacterial resistance and its spread in different environments. Identification of ARGs, particularly from high-throughput sequencing data of the specimens, is the state-of-the-art method for comprehensively monitoring their spread and evolution. Current computational methods to identify ARGs mainly rely on alignment-based sequence similarities with known ARGs. Such approaches are limited by the choice of reference databases and may miss novel ARGs. The similarity thresholds are usually simple and cannot accommodate variations across different gene families and regions. These approaches are also difficult to scale up as sequence data increase. RESULTS: In this study, we developed ARGNet, a deep neural network that incorporates an unsupervised learning autoencoder model to identify ARGs and a multiclass classification convolutional neural network to classify ARGs, neither of which depends on sequence alignment. This approach enables a more efficient discovery of both known and novel ARGs. ARGNet accepts both amino acid and nucleotide sequences of variable lengths, from partial (30-50 aa; 100-150 nt) sequences to full-length proteins or genes, allowing its application in both target sequencing and metagenomic sequencing. Our performance evaluation showed that ARGNet outperformed other deep learning models including DeepARG and HMD-ARG in most of the application scenarios, especially the quasi-negative test and the analysis of prediction consistency with the phylogenetic tree. ARGNet reduces inference runtime by up to 57% relative to DeepARG. CONCLUSIONS: ARGNet is flexible, efficient, and accurate at predicting a broad range of ARGs from the sequencing data. ARGNet is freely available at https://github.com/id-bioinfo/ARGNet , with an online service provided at https://ARGNet.hku.hk . Video Abstract.


Subject(s)
Bacteria , Neural Networks, Computer , Bacteria/genetics , Bacteria/drug effects , Bacteria/classification , Drug Resistance, Bacterial/genetics , Anti-Bacterial Agents/pharmacology , High-Throughput Nucleotide Sequencing/methods , Computational Biology/methods , Genes, Bacterial/genetics , Drug Resistance, Microbial/genetics , Humans , Deep Learning
9.
Front Cell Infect Microbiol ; 14: 1366908, 2024.
Article in English | MEDLINE | ID: mdl-38725449

ABSTRACT

Background: Metagenomic next-generation sequencing (mNGS) is a novel non-invasive and comprehensive technique for etiological diagnosis of infectious diseases. However, its practical significance has been seldom reported in the context of hematological patients with high-risk febrile neutropenia, a unique patient group characterized by neutropenia and compromised immune responses. Methods: This retrospective study evaluated the results of plasma cfDNA sequencing in 164 hematological patients with high-risk febrile neutropenia. We assessed the diagnostic efficacy and clinical impact of mNGS, comparing it with conventional microbiological tests. Results: mNGS identified 68 different pathogens in 111 patients, whereas conventional methods detected only 17 pathogen types in 36 patients. mNGS exhibited a significantly higher positive detection rate than conventional methods (67.7% vs. 22.0%, P < 0.001). This improvement was consistent across bacterial (30.5% vs. 9.1%), fungal (19.5% vs. 4.3%), and viral (37.2% vs. 9.1%) infections (P < 0.001 for all comparisons). The anti-infective treatment strategies were adjusted for 51.2% (84/164) of the patients based on the mNGS results. Conclusions: mNGS of plasma cfDNA offers substantial promise for the early detection of pathogens and the timely optimization of anti-infective therapies in hematological patients with high-risk febrile neutropenia.
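
The reported percentages follow directly from the patient counts stated in this abstract (111, 36, and 84 of 164 patients). A short worked check, using a hypothetical helper name:

```python
def rate_percent(count: int, total: int) -> float:
    """Percentage of `total` represented by `count`, to one decimal place."""
    if total <= 0 or not 0 <= count <= total:
        raise ValueError("need total > 0 and 0 <= count <= total")
    return round(100 * count / total, 1)


TOTAL_PATIENTS = 164  # cohort size stated in the abstract

print(rate_percent(111, TOTAL_PATIENTS))  # patients positive by mNGS
print(rate_percent(36, TOTAL_PATIENTS))   # patients positive by conventional methods
print(rate_percent(84, TOTAL_PATIENTS))   # patients whose treatment was adjusted
```

This checks only the arithmetic behind the rates; it does not reproduce the significance tests, which would need the paired per-patient results.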


Subject(s)
Febrile Neutropenia , High-Throughput Nucleotide Sequencing , Metagenomics , Humans , Metagenomics/methods , Male , Retrospective Studies , High-Throughput Nucleotide Sequencing/methods , Female , Middle Aged , Febrile Neutropenia/microbiology , Febrile Neutropenia/blood , Febrile Neutropenia/diagnosis , Adult , Aged , Young Adult , Adolescent , Aged, 80 and over , Bacterial Infections/diagnosis , Bacterial Infections/microbiology , Bacteria/genetics , Bacteria/isolation & purification , Bacteria/classification , Mycoses/diagnosis , Mycoses/microbiology , Virus Diseases/diagnosis , Virus Diseases/virology
14.
HLA ; 103(5): e15518, 2024 May.
Article in English | MEDLINE | ID: mdl-38733247

ABSTRACT

Donor-derived cell-free DNA (dd-cfDNA) has been widely studied as a biomarker for non-invasive allograft rejection monitoring. Earlier rejection detection enables more prompt diagnosis and intervention, ultimately improving patient treatment and outcomes. This multi-centre study aims to verify the analytical performance of a next-generation sequencing-based dd-cfDNA assay in end-user environments. Three independent laboratories received the same experimental design and 16 blinded samples to perform cfDNA extraction and the dd-cfDNA assay workflow. dd-cfDNA results were compared between sites and against manufacturer validation to evaluate concordance, reproducibility, and repeatability, and to verify analytical performance. A total of 247 sample libraries were generated across 18 runs, with a completion time of <24 h. A 96.0% first pass rate highlighted minimal failures. Overall, observed versus expected dd-cfDNA results demonstrated good concordance and a strong positive correlation with linear least squares regression (r² = 0.9989), and high repeatability and reproducibility within and between sites, respectively (p > 0.05). Manufacturer validation established a limit of blank of 0.18%, a limit of detection of 0.23% and a limit of quantification of 0.23%, and results from independent sites verified those limits. Parallel analyses illustrated no significant difference (p = 0.951) between dd-cfDNA results with or without recipient genotype. The dd-cfDNA assay evaluated here has been verified as a reliable method for efficient, reproducible dd-cfDNA quantification in plasma from solid organ transplant recipients without requiring genotyping. Implementation of onsite dd-cfDNA testing at clinical laboratories could facilitate earlier detection of allograft injury, bearing great potential for patient care.


Subject(s)
Cell-Free Nucleic Acids , Graft Rejection , High-Throughput Nucleotide Sequencing , Organ Transplantation , Tissue Donors , Transplant Recipients , Humans , Cell-Free Nucleic Acids/blood , High-Throughput Nucleotide Sequencing/methods , Reproducibility of Results , Graft Rejection/diagnosis , Graft Rejection/blood , Graft Rejection/genetics , Biomarkers/blood
15.
Int J Mol Sci ; 25(9)2024 May 02.
Article in English | MEDLINE | ID: mdl-38732187

ABSTRACT

Dynamic changes in genomic DNA methylation patterns govern the epigenetic developmental programs and accompany the organism's aging. Epigenetic clock (eAge) algorithms utilize DNA methylation to estimate the age and risk factors for diseases as well as analyze the impact of various interventions. High-throughput bisulfite sequencing methods, such as reduced-representation bisulfite sequencing (RRBS) or whole genome bisulfite sequencing (WGBS), provide an opportunity to identify the genomic regions of disordered or heterogeneous DNA methylation, which might be associated with cell-type heterogeneity, DNA methylation erosion, and allele-specific methylation. We systematically evaluated the applicability of five scores assessing the variability of methylation patterns by evaluating within-sample heterogeneity (WSH) to construct human blood epigenetic clock models using RRBS data. The best performance was demonstrated by the model based on a metric designed to assess DNA methylation erosion with an MAE of 3.686 years. We also trained a prediction model that uses the average methylation level over genomic regions. Although this region-based model was relatively more efficient than the WSH-based model, the latter required the analysis of just a few short genomic regions and, therefore, could be a useful tool to design a reduced epigenetic clock that is analyzed by targeted next-generation sequencing.
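
The MAE quoted in this abstract is the mean absolute error between predicted and chronological age. A minimal sketch of that metric follows. The ages below are hypothetical illustration values, not data from the study.

```python
def mean_absolute_error(true_ages: list[float], predicted_ages: list[float]) -> float:
    """Average absolute difference between paired true and predicted ages."""
    if not true_ages or len(true_ages) != len(predicted_ages):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)


# Hypothetical chronological ages and clock predictions (years).
chronological = [30.0, 40.0, 55.0, 62.0]
predicted = [32.0, 37.0, 58.5, 60.0]
print(mean_absolute_error(chronological, predicted))
```

An MAE of 3.686 years therefore means that, on average, the model's age estimate differed from the chronological age by about 3.7 years in either direction.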


Subject(s)
Aging , DNA Methylation , Epigenesis, Genetic , High-Throughput Nucleotide Sequencing , Humans , Aging/genetics , High-Throughput Nucleotide Sequencing/methods , Algorithms , CpG Islands , Female , Male , Epigenomics/methods , Aged , Adult , Middle Aged , Sequence Analysis, DNA/methods
17.
Nat Commun ; 15(1): 3972, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38730241

ABSTRACT

The advancement of Long-Read Sequencing (LRS) techniques has significantly increased the length of sequencing reads to several kilobases, thereby facilitating the identification of alternative splicing events and isoform expressions. Recently, numerous computational tools for isoform detection using long-read sequencing data have been developed. Nevertheless, there remains a deficiency in comparative studies that systematically evaluate the performance of these tools, which are implemented with different algorithms, under various simulations that encompass potential influencing factors. In this study, we conducted a benchmark analysis of thirteen methods implemented in nine tools capable of identifying isoform structures from long-read RNA-seq data. We evaluated their performances using simulated data, which represented diverse sequencing platforms generated by an in-house simulator, RNA sequins (sequencing spike-ins) data, as well as experimental data. Our findings demonstrate IsoQuant as a highly effective tool for isoform detection with LRS, with Bambu and StringTie2 also exhibiting strong performance. These results offer valuable guidance for future research on alternative splicing analysis and the ongoing improvement of tools for isoform detection using LRS data.


Subject(s)
Algorithms , Alternative Splicing , RNA, Messenger , Sequence Analysis, RNA , Humans , RNA, Messenger/genetics , RNA, Messenger/analysis , Sequence Analysis, RNA/methods , RNA Isoforms/genetics , Software , Computational Biology/methods , High-Throughput Nucleotide Sequencing/methods , Protein Isoforms/genetics
18.
BMC Bioinformatics ; 25(1): 186, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38730374

ABSTRACT

BACKGROUND: Commonly used next-generation sequencing machines typically produce large amounts of short reads of a few hundred base pairs in length. However, many downstream applications would generally benefit from longer reads. RESULTS: We present CAREx, an algorithm for the generation of pseudo-long reads from paired-end short-read Illumina data based on the concept of repeatedly computing multiple-sequence-alignments to extend a read until its partner is found. Our performance evaluation on both simulated data and real data shows that CAREx is able to connect significantly more read pairs (up to 99% for simulated data) and to produce more error-free pseudo-long reads than previous approaches. When used prior to assembly it can achieve superior de novo assembly results. Furthermore, the GPU-accelerated version of CAREx exhibits the fastest execution times among all tested tools. CONCLUSION: CAREx is a new MSA-based algorithm and software for producing pseudo-long reads from paired-end short-read data. It outperforms other state-of-the-art programs in terms of (i) percentage of connected read pairs, (ii) reduction of error rates of filled gaps, (iii) runtime, and (iv) downstream analysis using de novo assembly. CAREx is open-source software written in C++ (CPU version) and in CUDA/C++ (GPU version). It is licensed under GPLv3 and can be downloaded at https://github.com/fkallen/CAREx .


Subject(s)
Algorithms , High-Throughput Nucleotide Sequencing , Software , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Humans , Sequence Alignment/methods
19.
Hum Genomics ; 18(1): 46, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38730490

ABSTRACT

BACKGROUND: The current clinical diagnostic pathway for lysosomal storage disorders (LSDs) involves sequential biochemical enzymatic tests followed by DNA sequencing, which is iterative, has low diagnostic yield and is costly due to overlapping clinical presentations. Here, we describe a novel low-cost and high-throughput sequencing assay using single-molecule molecular inversion probes (smMIPs) to screen for causative single nucleotide variants (SNVs) and copy number variants (CNVs) in genes associated with 29 common LSDs in India. RESULTS: 903 smMIPs were designed to target exons and exon-intron boundaries of targeted genes (n = 23; 53.7 kb of the human genome) and were equimolarly pooled to create a sequencing library. After extensive validation in a cohort of 50 patients, we screened 300 patients with either biochemical diagnosis (n = 187) or clinical suspicion (n = 113) of LSDs. A diagnostic yield of 83.4% was observed in patients with prior biochemical diagnosis of LSD. Furthermore, a diagnostic yield of 73.9% (n = 54/73) was observed in patients with high clinical suspicion of LSD in contrast with 2.4% (n = 1/40) in patients with low clinical suspicion of LSD. In addition to detecting SNVs, the assay could detect single and multi-exon copy number variants with high confidence. Critically, Niemann-Pick disease type C and neuronal ceroid lipofuscinosis-6, diseases for which biochemical testing is unavailable, could be diagnosed using our assay. Lastly, we observed a non-inferior performance of the assay in DNA extracted from dried blood spots in comparison with whole blood. CONCLUSION: We developed a flexible and scalable assay to reliably detect genetic causes of 29 common LSDs in India. The assay consolidates the detection of multiple variant types in multiple sample types while having improved diagnostic yield at the same or lower cost compared with the current clinical paradigm.


Subject(s)
DNA Copy Number Variations , Genetic Testing , High-Throughput Nucleotide Sequencing , Lysosomal Storage Diseases , Humans , Lysosomal Storage Diseases/genetics , Lysosomal Storage Diseases/diagnosis , India , DNA Copy Number Variations/genetics , Genetic Testing/methods , High-Throughput Nucleotide Sequencing/methods , Polymorphism, Single Nucleotide/genetics , Female , Male , Molecular Probes/genetics
20.
Comput Biol Med ; 175: 108542, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38714048

ABSTRACT

The genomics landscape has undergone a revolutionary transformation with the emergence of third-generation sequencing technologies. Fueled by the exponential surge in sequencing data, there is an urgent demand for accurate and rapid algorithms to effectively handle this burgeoning influx. Under such circumstances, we developed a parallelized, yet accuracy-lossless algorithm for maximal exact match (MEM) retrieval to strategically address the computational bottleneck of uLTRA, a leading spliced alignment algorithm known for its precision in handling long RNA sequencing (RNA-seq) reads. The design of the algorithm incorporates a multi-threaded strategy, enabling the concurrent processing of multiple reads. Additionally, we implemented the serialization of the index required for MEM retrieval to facilitate its reuse, resulting in accelerated startup for practical tasks. Extensive experiments demonstrate that our parallel algorithm achieves significant improvements in runtime, speedup, throughput, and memory usage. When applied to the largest human dataset, the algorithm achieves a speedup of 10.78×, significantly improving throughput on a large scale. Moreover, the integration of the parallel MEM retrieval algorithm into the uLTRA pipeline introduces a dual-layered parallel capability, consistently yielding a speedup of 4.99× compared to the multi-process and single-threaded execution of uLTRA. The thorough analysis of experimental results underscores the adept utilization of parallel processing capabilities and its advantageous performance in handling large datasets. This study provides a showcase of parallelized strategies for MEM retrieval within the context of spliced alignment algorithms, effectively facilitating the process of RNA-seq data analysis. The code is available at https://github.com/RongxingWong/AcceleratingSplicedAlignment.
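
The two ideas named in this abstract, serializing an index so it can be reused and processing several reads concurrently, can be sketched in a few lines. This is only an illustration under stated assumptions. It uses fixed-length k-mer seed lookup as a simplified stand-in for true MEM retrieval, and none of the function names come from uLTRA or from the authors' repository.

```python
import pickle
from concurrent.futures import ThreadPoolExecutor


def build_kmer_index(reference: str, k: int) -> dict[str, list[int]]:
    """Map every k-mer of the reference to its start positions."""
    index: dict[str, list[int]] = {}
    for i in range(len(reference) - k + 1):
        index.setdefault(reference[i:i + k], []).append(i)
    return index


def seed_hits(read: str, index: dict[str, list[int]], k: int) -> list[tuple[int, int]]:
    """Return (read_pos, ref_pos) for each read k-mer found in the index."""
    hits = []
    for i in range(len(read) - k + 1):
        for ref_pos in index.get(read[i:i + k], []):
            hits.append((i, ref_pos))
    return hits


def process_reads(reads: list[str], index: dict[str, list[int]], k: int,
                  workers: int = 4) -> list[list[tuple[int, int]]]:
    """Look up several reads concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda read: seed_hits(read, index, k), reads))


# Build the index once, serialize it, and reload it for later runs.
K = 4  # hypothetical seed length
blob = pickle.dumps(build_kmer_index("ACGTACGT", K))
index = pickle.loads(blob)
print(process_reads(["ACGT", "TTTT"], index, K))
```

In CPython the global interpreter lock limits the gain from threads on pure-Python work, so this sketch shows the structure of the approach and not its performance. The speedups reported in the abstract come from the authors' own implementation.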


Subject(s)
Algorithms , Sequence Analysis, RNA , Humans , Sequence Analysis, RNA/methods , RNA Splicing , High-Throughput Nucleotide Sequencing/methods , Sequence Alignment/methods , Software