Results 1 - 20 of 98
1.
Am J Hum Genet ; 109(9): 1680-1691, 2022 09 01.
Article in English | MEDLINE | ID: mdl-36007525

ABSTRACT

Neisseria meningitidis protects itself from complement-mediated killing by binding complement factor H (FH). Previous studies associated susceptibility to meningococcal disease (MD) with variation in CFH, but the causal variants and underlying mechanism remained unknown. Here we attempted to define the association more accurately by sequencing the CFH-CFHR locus and imputing missing genotypes in previously obtained GWAS datasets of MD-affected individuals of European ancestry and matched controls. We identified a CFHR3 SNP that provides protection from MD (rs75703017, p value = 1.1 × 10⁻¹⁶) by decreasing the concentration of FH in the blood (p value = 1.4 × 10⁻¹¹). We subsequently used dual-luciferase studies and CRISPR gene editing to establish that deletion of rs75703017 increased FH expression in hepatocytes by preventing promoter inhibition. Our data suggest that reduced concentrations of FH in the blood confer protection from MD; with reduced access to FH, N. meningitidis is less able to shield itself from complement-mediated killing.


Subject(s)
Complement Factor H , Meningococcal Infections , Blood Proteins/genetics , Complement Factor H/genetics , Complement System Proteins/genetics , Genetic Predisposition to Disease , Genotype , Humans , Meningococcal Infections/genetics
2.
Brief Bioinform ; 24(6)2023 09 22.
Article in English | MEDLINE | ID: mdl-37874948

ABSTRACT

Proteases contribute to a broad spectrum of cellular functions. Given a relatively limited amount of experimental data, developing accurate sequence-based predictors of substrate cleavage sites facilitates a better understanding of protease functions and substrate specificity. While many protease-specific predictors of substrate cleavage sites were developed, these efforts are outpaced by the growth of the protease substrate cleavage data. In particular, since data for 100+ protease types are available and this number continues to grow, it becomes impractical to publish predictors for new protease types, and instead it might be better to provide a computational platform that helps users to quickly and efficiently build predictors that address their specific needs. To this end, we conceptualized, developed, tested and released a versatile bioinformatics platform, ProsperousPlus, that empowers users, even those with no programming or little bioinformatics background, to build fast and accurate predictors of substrate cleavage sites. ProsperousPlus facilitates the use of the rapidly accumulating substrate cleavage data to train, empirically assess and deploy predictive models for user-selected substrate types. Benchmarking tests on test datasets show that our platform produces predictors that on average exceed the predictive performance of current state-of-the-art approaches. ProsperousPlus is available as a webserver and a stand-alone software package at http://prosperousplus.unimelb-biotools.cloud.edu.au/.


Subject(s)
Machine Learning , Peptide Hydrolases , Peptide Hydrolases/metabolism , Substrate Specificity , Algorithms
3.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-37150785

ABSTRACT

A-to-I editing is the most prevalent RNA editing event, which refers to the change of adenosine (A) bases to inosine (I) bases in double-stranded RNAs. Several studies have revealed that A-to-I editing can regulate cellular processes and is associated with various human diseases. Therefore, accurate identification of A-to-I editing sites is crucial for understanding RNA-level (i.e. transcriptional) modifications and their potential roles in molecular functions. To date, various computational approaches for A-to-I editing site identification have been developed; however, their performance is still unsatisfactory and needs further improvement. In this study, we developed a novel stacked-ensemble learning model, ATTIC (A-To-I ediTing predICtor), to accurately identify A-to-I editing sites across three species, including Homo sapiens, Mus musculus and Drosophila melanogaster. We first comprehensively evaluated 37 RNA sequence-derived features combined with 14 popular machine learning algorithms. Then, we selected the optimal base models to build a series of stacked ensemble models. The final ATTIC framework was developed based on the optimal models improved by the feature selection strategy for specific species. Extensive cross-validation and independent tests illustrate that ATTIC outperforms state-of-the-art tools for predicting A-to-I editing sites. We also developed a web server for ATTIC, which is publicly available at http://web.unimelb-bioinfortools.cloud.edu.au/ATTIC/. We anticipate that ATTIC can be utilized as a useful tool to accelerate the identification of A-to-I RNA editing events and help characterize their roles in post-transcriptional regulation.


Subject(s)
Drosophila melanogaster , RNA Editing , Animals , Mice , Humans , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , RNA/genetics , Adenosine/genetics , Adenosine/metabolism , Inosine/genetics , Inosine/metabolism
4.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37291763

ABSTRACT

BACKGROUND: Promoters are DNA regions that initiate the transcription of specific genes near the transcription start sites. In bacteria, promoters are recognized by RNA polymerases and associated sigma factors. Effective promoter recognition is essential for synthesizing the gene-encoded products by bacteria to grow and adapt to different environmental conditions. A variety of machine learning-based predictors for bacterial promoters have been developed; however, most of them were designed specifically for a particular species. To date, only a few predictors are available for identifying general bacterial promoters with limited predictive performance. RESULTS: In this study, we developed TIMER, a Siamese neural network-based approach for identifying both general and species-specific bacterial promoters. Specifically, TIMER uses DNA sequences as the input and employs three Siamese neural networks with the attention layers to train and optimize the models for a total of 13 species-specific and general bacterial promoters. Extensive 10-fold cross-validation and independent tests demonstrated that TIMER achieves a competitive performance and outperforms several existing methods on both general and species-specific promoter prediction. As an implementation of the proposed method, the web server of TIMER is publicly accessible at http://web.unimelb-bioinfortools.cloud.edu.au/TIMER/.


Subject(s)
Bacteria , Neural Networks, Computer , Bacteria/genetics , Bacteria/metabolism , DNA-Directed RNA Polymerases/genetics , DNA-Directed RNA Polymerases/metabolism , Base Sequence , Promoter Regions, Genetic
5.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35021193

ABSTRACT

Promoters are crucial regulatory DNA regions for gene transcriptional activation. Rapid advances in next-generation sequencing technologies have accelerated the accumulation of genome sequences, providing increased training data to inform computational approaches for both prokaryotic and eukaryotic promoter prediction. However, it remains a significant challenge to accurately identify species-specific promoter sequences using computational approaches. To advance computational support for promoter prediction, in this study, we curated 58 comprehensive, up-to-date, benchmark datasets for 7 different species (i.e. Escherichia coli, Bacillus subtilis, Homo sapiens, Mus musculus, Arabidopsis thaliana, Zea mays and Drosophila melanogaster) to assist the research community to assess the relative functionality of alternative approaches and support future research on both prokaryotic and eukaryotic promoters. We revisited 106 predictors published since 2000 for promoter identification (40 for prokaryotic promoter, 61 for eukaryotic promoter, and 5 for both). We systematically evaluated their training datasets, computational methodologies, calculated features, performance and software usability. On the basis of these benchmark datasets, we benchmarked 19 predictors with functioning webservers/local tools and assessed their prediction performance. We found that deep learning and traditional machine learning-based approaches generally outperformed scoring function-based approaches. Taken together, the curated benchmark dataset repository and the benchmarking analysis in this study serve to inform the design and implementation of computational approaches for promoter prediction and facilitate more rigorous comparison of new techniques in the future.


Subject(s)
Drosophila melanogaster , Eukaryota , Animals , Computational Biology/methods , Drosophila melanogaster/genetics , Eukaryotic Cells , Mice , Prokaryotic Cells , Promoter Regions, Genetic
6.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34729589

ABSTRACT

Conventional supervised binary classification algorithms have been widely applied to address significant research questions using biological and biomedical data. This classification scheme requires two fully labeled classes of data (e.g. positive and negative samples) to train a classification model. However, in many bioinformatics applications, labeling data is laborious, and the negative samples might be potentially mislabeled due to the limited sensitivity of the experimental equipment. The positive unlabeled (PU) learning scheme was therefore proposed to enable the classifier to learn directly from limited positive samples and a large number of unlabeled samples (i.e. a mixture of positive or negative samples). To date, several PU learning algorithms have been developed to address various biological questions, such as sequence identification, functional site characterization and interaction prediction. In this paper, we revisit a collection of 29 state-of-the-art PU learning bioinformatic applications to address various biological questions. Various important aspects are extensively discussed, including PU learning methodology, biological application, classifier design and evaluation strategy. We also comment on the existing issues of PU learning and offer our perspectives for the future development of PU learning applications. We anticipate that our work serves as an instrumental guideline for a better understanding of the PU learning framework in bioinformatics and further developing next-generation PU learning frameworks for critical biological applications.


Subject(s)
Algorithms , Computational Biology , Computational Biology/methods , Supervised Machine Learning
7.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34226915

ABSTRACT

Pseudouridine is a ubiquitous RNA modification type present in eukaryotes and prokaryotes, which plays a vital role in various biological processes. Almost all kinds of RNAs are subject to this modification. However, it remains a great challenge to identify pseudouridine sites via experimental approaches, requiring expensive and time-consuming experimental research. Therefore, computational approaches that can be used to perform accurate in silico identification of pseudouridine sites from the large amount of RNA sequence data are highly desirable and can aid in the functional elucidation of this critical modification. Here, we propose a new computational approach, termed Porpoise, to accurately identify pseudouridine sites from RNA sequence data. Porpoise builds upon a comprehensive evaluation of 18 frequently used feature encoding schemes based on the selection of four types of features, including binary features, pseudo k-tuple composition, nucleotide chemical property and position-specific trinucleotide propensity based on single-strand (PSTNPss). The selected features are fed into the stacked ensemble learning framework to enable the construction of an effective stacked model. Both cross-validation tests on the benchmark dataset and independent tests show that Porpoise achieves better predictive performance than several state-of-the-art approaches. The application of model interpretation tools demonstrates the importance of PSTNPss for the performance of the trained models. This new method is anticipated to facilitate community-wide efforts to identify putative pseudouridine sites and formulate novel testable biological hypotheses.


Subject(s)
Computational Biology/methods , Pseudouridine/chemistry , RNA/chemistry , RNA/genetics , Algorithms , Machine Learning , Pseudouridine/genetics , Reproducibility of Results , Sequence Analysis, RNA/methods
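
Two of the feature types named in the Porpoise abstract, binary features and nucleotide chemical property (NCP), can be illustrated with a minimal sketch. The lookup tables below follow the conventional definitions used in the RNA-modification prediction literature (NCP ordered as ring structure, functional group, hydrogen bonding); they are an assumption for illustration and are not taken from the Porpoise source code. The function name `encode` is hypothetical.

```python
# Minimal sketch of two sequence encodings (assumed conventional
# definitions, not Porpoise's actual implementation).

# Binary (one-hot) encoding: one position per nucleotide.
ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "U": (0, 0, 0, 1),
}

# Nucleotide chemical property: (purine ring, amino group, weak H-bond).
NCP = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "U": (0, 0, 1),
}


def encode(seq, table):
    """Concatenate per-nucleotide vectors into one flat feature list.

    DNA-style input is accepted by mapping T to U; any other
    unrecognised symbol (for example N) becomes an all-zero vector.
    """
    width = len(next(iter(table.values())))
    features = []
    for base in seq.upper().replace("T", "U"):
        features.extend(table.get(base, (0,) * width))
    return features
```

A window of 21 nucleotides therefore yields 84 binary features or 63 NCP features; either list can be passed to any downstream classifier.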
8.
Bioinformatics ; 38(17): 4053-4061, 2022 09 02.
Article in English | MEDLINE | ID: mdl-35799358

ABSTRACT

MOTIVATION: Accurate annotation of different genomic signals and regions (GSRs) from DNA sequences is fundamentally important for understanding gene structure, regulation and function. Numerous efforts have been made to develop machine learning-based predictors for in silico identification of GSRs. However, it remains a great challenge to identify GSRs as the performance of most existing approaches is unsatisfactory. As such, it is highly desirable to develop more accurate computational methods for GSRs prediction. RESULTS: In this study, we propose a general deep learning framework termed DeepGenGrep, a general predictor for the systematic identification of multiple different GSRs from genomic DNA sequences. DeepGenGrep leverages the power of hybrid neural networks comprising a three-layer convolutional neural network and a two-layer long short-term memory to effectively learn useful feature representations from sequences. Benchmarking experiments demonstrate that DeepGenGrep outperforms several state-of-the-art approaches on identifying polyadenylation signals, translation initiation sites and splice sites across four eukaryotic species including Homo sapiens, Mus musculus, Bos taurus and Drosophila melanogaster. Overall, DeepGenGrep represents a useful tool for the high-throughput and cost-effective identification of potential GSRs in eukaryotic genomes. AVAILABILITY AND IMPLEMENTATION: The webserver and source code are freely available at http://bigdata.biocie.cn/deepgengrep/home and Github (https://github.com/wx-cie/DeepGenGrep/). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Deep Learning , Mice , Cattle , Animals , Drosophila melanogaster/genetics , Genomics/methods , Genome , Software
9.
BMC Cancer ; 22(1): 85, 2022 Jan 20.
Article in English | MEDLINE | ID: mdl-35057759

ABSTRACT

BACKGROUND: Circulating cell-free DNA (cfDNA) in the plasma of cancer patients contains cell-free tumour DNA (ctDNA) derived from tumour cells and it has been widely recognized as a non-invasive source of tumour DNA for diagnosis and prognosis of cancer. Molecular profiling of ctDNA is often performed using targeted sequencing or low-coverage whole genome sequencing (WGS) to identify tumour-specific somatic mutations or somatic copy number aberrations (sCNAs). However, these approaches cannot efficiently detect all tumour-derived genomic changes in ctDNA. METHODS: We performed WGS analysis of cfDNA from 4 breast cancer patients and 2 patients with benign tumours. We sequenced matched germline DNA for all 6 patients and tumour samples from the breast cancer patients. All samples were sequenced on the Illumina HiSeq X Ten sequencing platform and achieved approximately 30x, 60x and 100x coverage on germline, tumour and plasma DNA samples, respectively. RESULTS: The mutational burden of the plasma samples (1.44 somatic mutations/Mb of genome) was higher than that of the matched tumour samples. However, 90% of high-confidence somatic cfDNA variants were not detected in matched tumour samples and were found to comprise two background plasma mutational signatures. In contrast, cfDNA from the di-nucleosome fraction (300 bp-350 bp) had a much higher proportion (30%) of variants shared with the tumour. Despite high-coverage sequencing, we were unable to detect sCNAs in plasma samples. CONCLUSIONS: Deep sequencing analysis of plasma samples revealed a higher fraction of unique somatic mutations in plasma samples, which were not detected in matched tumour samples. Sequencing of di-nucleosome-bound cfDNA fragments may increase recovery of tumour mutations from plasma.


Subject(s)
Breast Neoplasms/genetics , Circulating Tumor DNA/blood , DNA Mutational Analysis/methods , High-Throughput Nucleotide Sequencing/methods , Whole Genome Sequencing/methods , Adult , Biomarkers, Tumor/genetics , Breast Neoplasms/blood , Female , Humans , Mutation , Prognosis
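
The coverage figures (30x, 60x, 100x) and the mutational burden (1.44 somatic mutations/Mb) quoted in the abstract above follow from simple ratios. The sketch below shows that arithmetic only; the function names are hypothetical, and the 3.0 Gb genome size and 150 bp read length in the usage note are round-number assumptions that the abstract does not state.

```python
import math


def mean_coverage(n_reads, read_length_bp, genome_size_bp):
    """Average depth = total sequenced bases / genome size."""
    return n_reads * read_length_bp / genome_size_bp


def reads_for_coverage(target_coverage, read_length_bp, genome_size_bp):
    """Smallest number of reads that reaches the target average depth."""
    return math.ceil(target_coverage * genome_size_bp / read_length_bp)


def mutational_burden_per_mb(n_somatic_mutations, callable_bp):
    """Somatic mutations per megabase of callable sequence."""
    return n_somatic_mutations / (callable_bp / 1_000_000)
```

Under the assumed 3.0 Gb genome and 150 bp reads, `reads_for_coverage(30, 150, 3_000_000_000)` gives 600000000 reads, and a burden of 1.44 mutations/Mb corresponds to 4320 somatic mutations across 3,000 Mb.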
10.
PLoS Comput Biol ; 17(1): e1008586, 2021 01.
Article in English | MEDLINE | ID: mdl-33471816

ABSTRACT

A streaming assembly pipeline utilising real-time Oxford Nanopore Technology (ONT) sequencing data is important for saving sequencing resources and reducing time-to-result. A previous approach implemented in npScarf provided an efficient streaming algorithm for hybrid assembly but was relatively prone to mis-assemblies compared to other graph-based methods. Here we present npGraph, a streaming hybrid assembly tool using the assembly graph instead of the separated pre-assembly contigs. It is able to produce a more complete genome assembly by resolving the path-finding problem on the assembly graph using long reads as the traversing guide. Application to synthetic and real data from bacterial isolate genomes shows improved accuracy while still maintaining a low computational cost. npGraph also provides a graphical user interface (GUI) which provides a real-time visualisation of the progress of assembly. The tool and source code are available at https://github.com/hsnguyen/assembly.


Subject(s)
Nanopores , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Algorithms , Computational Biology , DNA, Bacterial/analysis , DNA, Bacterial/genetics , Genome, Bacterial/genetics , Nanotechnology , Software , User-Computer Interface
11.
Theor Appl Genet ; 133(1): 23-36, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31595335

ABSTRACT

KEY MESSAGE: β-Carotene content in sweetpotato is associated with the Orange and phytoene synthase genes; due to physical linkage of phytoene synthase with sucrose synthase, β-carotene and starch content are negatively correlated. In populations depending on sweetpotato for food security, starch is an important source of calories, while β-carotene is an important source of provitamin A. The negative association between the two traits contributes to the low nutritional quality of sweetpotato consumed, especially in sub-Saharan Africa. Using a biparental mapping population of 315 F1 progeny generated from a cross between an orange-fleshed and a non-orange-fleshed sweetpotato variety, we identified two major quantitative trait loci (QTL) on linkage group (LG) three (LG3) and twelve (LG12) affecting starch, β-carotene, and their correlated traits, dry matter and flesh color. Analysis of parental haplotypes indicated that these two regions acted pleiotropically to reduce starch content and increase β-carotene in genotypes carrying the orange-fleshed parental haplotype at the LG3 locus. Phytoene synthase and sucrose synthase, the rate-limiting and linked genes located within the QTL on LG3 involved in the carotenoid and starch biosynthesis, respectively, were differentially expressed in Beauregard versus Tanzania storage roots. The Orange gene, the molecular switch for chromoplast biogenesis, located within the QTL on LG12, while not differentially expressed, was expressed in developing roots of the parental genotypes. We conclude that these two QTL regions act together in a cis and trans manner to inhibit starch biosynthesis in amyloplasts and enhance chromoplast biogenesis, carotenoid biosynthesis, and accumulation in orange-fleshed sweetpotato. Understanding the genetic basis of this negative association between starch and β-carotene will inform future sweetpotato breeding strategies targeting sweetpotato for food and nutritional security.


Subject(s)
Gene Expression Regulation, Plant , Ipomoea batatas/genetics , Polyploidy , Quantitative Trait Loci/genetics , Starch/metabolism , beta Carotene/metabolism , Alleles , Environment , Genetic Association Studies , Phenotype , Plant Roots/genetics , Plant Roots/growth & development , Quantitative Trait, Heritable
12.
Emerg Infect Dis ; 25(3): 406-415, 2019 03.
Article in English | MEDLINE | ID: mdl-30789135

ABSTRACT

In this retrospective study, we used whole-genome sequencing (WGS) to delineate transmission dynamics, characterize drug-resistance markers, and identify risk factors of transmission among Papua New Guinea residents of the Torres Strait Protected Zone (TSPZ) who had tuberculosis diagnoses during 2010-2015. Of 117 isolates collected, we could acquire WGS data for 100; 79 were Beijing sublineage 2.2.1.1, which was associated with active transmission (odds ratio 6.190, 95% CI 2.221-18.077). Strains were distributed widely throughout the TSPZ. Clustering occurred more often within than between villages (p = 0.0013). Including 4 multidrug-resistant tuberculosis isolates from Australian citizens epidemiologically linked to the TSPZ in the transmission network analysis revealed 2 probable cross-border transmission events. All multidrug-resistant isolates (33/104) belonged to Beijing sublineage 2.2.1.1 and had high-level isoniazid and ethionamide co-resistance; 2 isolates were extensively drug resistant. Including WGS in regional surveillance could improve tuberculosis transmission tracking and control strategies within the TSPZ.


Subject(s)
Emigration and Immigration , Mycobacterium tuberculosis/drug effects , Tuberculosis, Multidrug-Resistant/epidemiology , Tuberculosis, Multidrug-Resistant/microbiology , Antitubercular Agents/pharmacology , Australia/epidemiology , Bacterial Typing Techniques , Evolution, Molecular , Genotype , Geography , History, 21st Century , Humans , Microbial Sensitivity Tests , Mycobacterium tuberculosis/genetics , Mycobacterium tuberculosis/isolation & purification , Papua New Guinea/epidemiology , Tuberculosis, Multidrug-Resistant/diagnosis , Tuberculosis, Multidrug-Resistant/history , Whole Genome Sequencing
13.
J Antimicrob Chemother ; 74(3): 582-593, 2019 03 01.
Article in English | MEDLINE | ID: mdl-30445429

ABSTRACT

BACKGROUND: Polymyxin B and E (colistin) have been pivotal in the treatment of XDR Gram-negative bacterial infections; however, resistance has emerged. A structurally related lipopeptide, octapeptin C4, has shown significant potency against XDR bacteria, including polymyxin-resistant strains, but its mode of action remains undefined. OBJECTIVES: We sought to compare and contrast the acquisition of resistance in an XDR Klebsiella pneumoniae (ST258) clinical isolate in vitro with all three lipopeptides to potentially unveil variations in their mode of action. METHODS: The isolate was exposed to increasing concentrations of polymyxins and octapeptin C4 over 20 days. Day 20 strains underwent WGS, complementation assays, antimicrobial susceptibility testing and lipid A analysis. RESULTS: Twenty days of exposure to the polymyxins resulted in a 1000-fold increase in the MIC, whereas for octapeptin C4 a 4-fold increase was observed. There was no cross-resistance observed between the polymyxin- and octapeptin-resistant strains. Sequencing of polymyxin-resistant isolates revealed mutations in previously known resistance-associated genes, including crrB, mgrB, pmrB, phoPQ and yciM, along with novel mutations in qseC. Octapeptin C4-resistant isolates had mutations in mlaDF and pqiB, genes related to phospholipid transport. These genetic variations were reflected in distinct phenotypic changes to lipid A. Polymyxin-resistant isolates increased 4-amino-4-deoxyarabinose fortification of lipid A phosphate groups, whereas the lipid A of octapeptin C4-resistant strains harboured a higher abundance of hydroxymyristate and palmitoylate. CONCLUSIONS: Octapeptin C4 has a distinct mode of action compared with the polymyxins, highlighting its potential as a future therapeutic agent to combat the increasing threat of XDR bacteria.


Subject(s)
Anti-Bacterial Agents/pharmacology , Colistin/pharmacology , Drug Resistance, Multiple, Bacterial , Klebsiella pneumoniae/drug effects , Lipopeptides/pharmacology , Peptides, Cyclic/pharmacology , Polymyxin B/pharmacology , Humans , Klebsiella Infections/microbiology , Klebsiella pneumoniae/isolation & purification , Microbial Sensitivity Tests , Mutation , Whole Genome Sequencing
14.
Bioinformatics ; 34(5): 873-874, 2018 03 01.
Article in English | MEDLINE | ID: mdl-29092025

ABSTRACT

Motivation: Targeted sequencing using capture probes has become increasingly popular in clinical applications due to its scalability and cost-effectiveness. The approach also allows for higher sequencing coverage of the targeted regions, resulting in better statistical power in the analysis. However, because of the dynamics of the hybridization process, it is difficult to evaluate the efficiency of the probe design prior to the experiments, which are time-consuming and costly. Results: We developed CapSim, a software package for simulation of targeted sequencing. Given a genome sequence and a set of probes, CapSim simulates the fragmentation, the dynamics of probe hybridization and the sequencing of the captured fragments on Illumina and PacBio sequencing platforms. The simulated data can be used for evaluating the performance of the analysis pipeline, as well as the efficiency of the probe design. Parameters of the various stages in the sequencing process can also be evaluated in order to optimize the experiments. Availability and implementation: CapSim is publicly available under BSD license at https://github.com/Devika1/capsim. Contact: l.coin@imb.uq.edu.au. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Genomics/methods , Software
15.
BMC Infect Dis ; 19(1): 660, 2019 Jul 24.
Article in English | MEDLINE | ID: mdl-31340776

ABSTRACT

BACKGROUND: Rapid diagnosis and appropriate treatment are imperative in bacterial sepsis due to the increasing risk of mortality with every hour without appropriate antibiotic therapy. Atypical infections with fastidious organisms may take more than 4 days to diagnose, leading to calls for improved methods for rapidly diagnosing sepsis. Capnocytophaga canimorsus is a slow-growing, fastidious gram-negative bacillus which is a common commensal within the mouths of dogs, but rarely causes infections in humans. C. canimorsus sepsis risk factors include immunosuppression, alcoholism and advanced age. Here we report on the application of emerging nanopore sequencing methods to rapidly diagnose an atypical case of C. canimorsus septic shock. CASE PRESENTATION: A 62-year-old female patient was admitted to an intensive care unit with septic shock and multi-organ failure six days after a reported dog bite. Blood cultures were unable to detect a pathogen after 3 days despite observed intracellular bacilli on blood smears. Real-time nanopore sequencing was subsequently employed on whole blood to detect Capnocytophaga canimorsus in 19 h. The patient was not immunocompromised and did not have any other known risk factors. Whole-genome sequencing of the clinical sample and of the offending dog's oral swabs showed near-identical C. canimorsus genomes. The patient responded to antibiotic treatment and was discharged from hospital 31 days after admission. CONCLUSIONS: Use of real-time nanopore sequencing reduced the time-to-diagnosis of Capnocytophaga canimorsus in this case from 6.25 days to 19 h. Capnocytophaga canimorsus should be considered in cases of suspected sepsis involving cat or dog contact, irrespective of the patient's known risk factors.


Subject(s)
Bites and Stings/complications , Capnocytophaga/isolation & purification , Shock, Septic/diagnosis , Animals , Anti-Bacterial Agents/therapeutic use , Capnocytophaga/drug effects , Capnocytophaga/genetics , Cats , Dogs , Female , Gram-Negative Bacterial Infections/diagnosis , Gram-Negative Bacterial Infections/immunology , Gram-Negative Bacterial Infections/microbiology , Humans , Immunocompromised Host , Middle Aged , Nanopores , Sequence Analysis, DNA , Shock, Septic/immunology , Shock, Septic/microbiology
16.
Nucleic Acids Res ; 45(5): e34, 2017 03 17.
Article in English | MEDLINE | ID: mdl-27903916

ABSTRACT

Accurate identification of copy number alterations is an essential step in understanding the events driving tumor progression. While a variety of algorithms have been developed to use high-throughput sequencing data to profile copy number changes, no tool is able to reliably characterize ploidy and genotype absolute copy number from tumor samples that contain less than 40% tumor cells. To increase our power to resolve the copy number profile from low-cellularity tumor samples, we developed a novel approach that pre-phases heterozygous germline single nucleotide polymorphisms (SNPs) in order to replace the commonly used 'B-allele frequency' with a more powerful 'parental-haplotype frequency'. We apply our tool, sCNAphase, to characterize the copy number and loss-of-heterozygosity profiles of four publicly available breast cancer cell-lines. Comparisons to previous spectral karyotyping and microarray studies revealed that sCNAphase reliably identified overall ploidy as well as the individual copy number mutations from each cell-line. Analysis of artificial cell-line mixtures demonstrated the capacity of this method to determine the level of tumor cellularity, consistently identify sCNAs and characterize ploidy in samples with as little as 10% tumor cells. This novel methodology has the potential to bring sCNA profiling to low-cellularity tumors, a form of cancer unable to be accurately studied by current methods.


Subject(s)
Aneuploidy , DNA Copy Number Variations , Haplotypes , Software , Algorithms , Cell Count , Cell Line, Tumor , Gene Dosage , Heterozygote , High-Throughput Nucleotide Sequencing , Humans , Sequence Analysis, DNA
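
The difficulty with low-cellularity samples described in the sCNAphase abstract can be made concrete with the standard mixture model for the B-allele frequency (BAF) at a heterozygous germline SNP. The sketch below implements that textbook model only; it is not the parental-haplotype frequency method of sCNAphase, and the function name `expected_baf` is hypothetical.

```python
def expected_baf(tumor_fraction, tumor_b_copies, tumor_total_copies):
    """Expected B-allele frequency at a heterozygous germline SNP.

    The sample is modelled as a mix of tumor cells (fraction p) and
    normal diploid cells (fraction 1 - p, one B allele out of two copies):

        BAF = (p * n_B + (1 - p) * 1) / (p * n_T + (1 - p) * 2)
    """
    p = tumor_fraction
    b_alleles = p * tumor_b_copies + (1 - p) * 1
    all_alleles = p * tumor_total_copies + (1 - p) * 2
    return b_alleles / all_alleles
```

For a hemizygous deletion that removes the B allele (`tumor_b_copies=0`, `tumor_total_copies=1`), a pure tumor gives a BAF of 0, whereas a sample with 10% tumor cells gives 0.9 / 1.9, roughly 0.474. That shift of about 0.026 from the diploid value of 0.5 is small relative to sampling noise at a single SNP, which is why pooling evidence along phased haplotypes can add power.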
18.
BMC Bioinformatics ; 19(1): 261, 2018 07 13.
Article in English | MEDLINE | ID: mdl-30001702

ABSTRACT

BACKGROUND: Detection of genomic inversions remains challenging. Many existing methods primarily target inversions with a non-repetitive breakpoint, leaving inverted repeat (IR) mediated non-allelic homologous recombination (NAHR) inversions largely unexplored. RESULT: We present npInv, a novel tool specifically for detecting and genotyping NAHR inversions using long read sub-alignment of long read sequencing data. We benchmark npInv with other tools in both simulation and real data. We use npInv to generate a whole-genome inversion map for NA12878 consisting of 30 NAHR inversions (of which 15 are novel), including all previously known NAHR-mediated inversions in NA12878 with flanking IR less than 7 kb. Our genotyping accuracy on this dataset was 94%. We used PCR to confirm the presence of two of these novel inversions. We show that there is a near-linear relationship between the length of flanking IR and the minimum inversion size, without inverted repeats. CONCLUSION: The application of npInv shows high accuracy in both simulation and real data. The results give deeper insight into understanding inversion.


Subject(s)
Chromosome Inversion/genetics , Genotype , Humans
19.
Bioinformatics ; 33(24): 3988-3990, 2017 Dec 15.
Article in English | MEDLINE | ID: mdl-28961965

ABSTRACT

MOTIVATION: The recent introduction of a barcoding protocol for Oxford Nanopore sequencing has increased the versatility of the technology. Several bioinformatics tools have been developed to demultiplex barcoded reads, but none of them supports streaming analysis. This limits the use of multiplexed sequencing in real-time applications, which is one of the main advantages of the technology. RESULTS: We introduce npBarcode, an open source and cross-platform tool for barcode demultiplexing in a streaming fashion that can be used to pipe data to further real-time analyses. The tool also provides a friendly graphical user interface by integrating the module into npReader, making it possible to monitor progress while sequencing is still under way. We show that our algorithm achieves accuracies at least as good as competing tools. AVAILABILITY AND IMPLEMENTATION: npBarcode is bundled in Japsa, a Java tools kit for genome analysis, and is freely available at https://github.com/mdcao/japsa. CONTACT: s.nguyen@uq.edu.au or l.coin@imb.uq.edu.au. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Electronic Data Processing , Nanopores , Sequence Analysis, DNA/methods , Software , Algorithms , Reproducibility of Results
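
The streaming demultiplexing described in the abstract above can be sketched roughly in Python as a generator that assigns each read to a barcode as soon as it arrives, without waiting for the run to finish. This is not npBarcode's algorithm and does not use the Japsa API. The function names, the barcode-at-read-start assumption, and the Hamming-distance matching are simplifications chosen for illustration; real nanopore reads contain insertions and deletions and would more plausibly need an alignment-based score.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def demultiplex_stream(
    reads: Iterable[Tuple[str, str]],
    barcodes: Dict[str, str],
    max_mismatches: int = 1,
) -> Iterator[Tuple[str, Optional[str]]]:
    """Yield (read_id, barcode_name) one read at a time.

    Hypothetical sketch: the barcode is assumed to sit at the very start
    of the read, and a read is left unassigned (None) when no barcode
    matches within max_mismatches. Ties go to the first barcode listed.
    """
    for read_id, sequence in reads:
        best_name: Optional[str] = None
        best_distance = max_mismatches + 1
        for name, barcode in barcodes.items():
            prefix = sequence[: len(barcode)]
            if len(prefix) < len(barcode):
                continue  # read is shorter than this barcode
            distance = hamming(prefix, barcode)
            if distance < best_distance:
                best_name, best_distance = name, distance
        # Emit immediately so downstream analyses can consume results live.
        yield read_id, best_name


if __name__ == "__main__":
    example_barcodes = {"BC01": "ACGTAC", "BC02": "TTGCAA"}  # made-up barcodes
    incoming = iter([("r1", "ACGTACGGGG"), ("r2", "TTGCATCCCC"), ("r3", "GGGGGGGGGG")])
    for read_id, assigned in demultiplex_stream(incoming, example_barcodes):
        print(read_id, assigned)
```

Because the function is a generator, it can wrap any iterable of reads, including one that blocks while waiting for the sequencer, and pass each assignment downstream as soon as it is made.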
20.
N Engl J Med ; 370(18): 1712-1723, 2014 May 01.
Article in English | MEDLINE | ID: mdl-24785206

ABSTRACT

BACKGROUND: Improved diagnostic tests for tuberculosis in children are needed. We hypothesized that transcriptional signatures of host blood could be used to distinguish tuberculosis from other diseases in African children who either were or were not infected with the human immunodeficiency virus (HIV). METHODS: The study population comprised prospective cohorts of children who were undergoing evaluation for suspected tuberculosis in South Africa (655 children), Malawi (701 children), and Kenya (1599 children). Patients were assigned to groups according to whether the diagnosis was culture-confirmed tuberculosis, culture-negative tuberculosis, diseases other than tuberculosis, or latent tuberculosis infection. Diagnostic signatures distinguishing tuberculosis from other diseases and from latent tuberculosis infection were identified from genomewide analysis of RNA expression in host blood. RESULTS: We identified a 51-transcript signature distinguishing tuberculosis from other diseases in the South African and Malawian children (the discovery cohort). In the Kenyan children (the validation cohort), a risk score based on the signature for tuberculosis and for diseases other than tuberculosis showed a sensitivity of 82.9% (95% confidence interval [CI], 68.6 to 94.3) and a specificity of 83.6% (95% CI, 74.6 to 92.7) for the diagnosis of culture-confirmed tuberculosis. Among patients with cultures negative for Mycobacterium tuberculosis who were treated for tuberculosis (those with highly probable, probable, or possible cases of tuberculosis), the estimated sensitivity was 62.5 to 82.3%, 42.1 to 80.8%, and 35.3 to 79.6%, respectively, for different estimates of actual tuberculosis in the groups. In comparison, the sensitivity of the Xpert MTB/RIF assay for molecular detection of M. tuberculosis DNA in cases of culture-confirmed tuberculosis was 54.3% (95% CI, 37.1 to 68.6), and the sensitivity in highly probable, probable, or possible cases was an estimated 25.0 to 35.7%, 5.3 to 13.3%, and 0%, respectively; the specificity of the assay was 100%. CONCLUSIONS: RNA expression signatures provided data that helped distinguish tuberculosis from other diseases in African children with and those without HIV infection. (Funded by the European Union Action for Diseases of Poverty Program and others).


Subject(s)
Mycobacterium tuberculosis/genetics , RNA, Bacterial/blood , Transcriptome , Tuberculosis/diagnosis , Africa , Algorithms , Bacteriological Techniques , Child , Child, Preschool , Diagnosis, Differential , HIV Infections/complications , Humans , Infant , Latent Tuberculosis/diagnosis , Male , Mycobacterium tuberculosis/isolation & purification , Oligonucleotide Array Sequence Analysis , Risk , Sensitivity and Specificity , Tuberculosis/complications , Tuberculosis/genetics