Results 1 - 20 of 98
1.
Am J Hum Genet ; 109(9): 1680-1691, 2022 09 01.
Article in English | MEDLINE | ID: mdl-36007525

ABSTRACT

Neisseria meningitidis protects itself from complement-mediated killing by binding complement factor H (FH). Previous studies associated susceptibility to meningococcal disease (MD) with variation in CFH, but the causal variants and underlying mechanism remained unknown. Here we attempted to define the association more accurately by sequencing the CFH-CFHR locus and imputing missing genotypes in previously obtained GWAS datasets of MD-affected individuals of European ancestry and matched controls. We identified a CFHR3 SNP that provides protection from MD (rs75703017, p value = 1.1 × 10⁻¹⁶) by decreasing the concentration of FH in the blood (p value = 1.4 × 10⁻¹¹). We subsequently used dual-luciferase studies and CRISPR gene editing to establish that deletion of rs75703017 increased FH expression in hepatocytes by preventing promoter inhibition. Our data suggest that reduced concentrations of FH in the blood confer protection from MD; with reduced access to FH, N. meningitidis is less able to shield itself from complement-mediated killing.


Subjects
Complement Factor H; Meningococcal Infections; Blood Proteins/genetics; Complement Factor H/genetics; Complement System Proteins/genetics; Genetic Predisposition to Disease; Genotype; Humans; Meningococcal Infections/genetics
2.
Brief Bioinform ; 24(6)2023 09 22.
Article in English | MEDLINE | ID: mdl-37874948

ABSTRACT

Proteases contribute to a broad spectrum of cellular functions. Given a relatively limited amount of experimental data, developing accurate sequence-based predictors of substrate cleavage sites facilitates a better understanding of protease functions and substrate specificity. While many protease-specific predictors of substrate cleavage sites were developed, these efforts are outpaced by the growth of the protease substrate cleavage data. In particular, since data for 100+ protease types are available and this number continues to grow, it becomes impractical to publish predictors for new protease types, and instead it might be better to provide a computational platform that helps users to quickly and efficiently build predictors that address their specific needs. To this end, we conceptualized, developed, tested and released a versatile bioinformatics platform, ProsperousPlus, that empowers users, even those with no programming or little bioinformatics background, to build fast and accurate predictors of substrate cleavage sites. ProsperousPlus facilitates the use of the rapidly accumulating substrate cleavage data to train, empirically assess and deploy predictive models for user-selected substrate types. Benchmarking tests on test datasets show that our platform produces predictors that on average exceed the predictive performance of current state-of-the-art approaches. ProsperousPlus is available as a webserver and a stand-alone software package at http://prosperousplus.unimelb-biotools.cloud.edu.au/.


Subjects
Machine Learning; Peptide Hydrolases; Peptide Hydrolases/metabolism; Substrate Specificity; Algorithms
3.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-37150785

ABSTRACT

A-to-I editing is the most prevalent RNA editing event, which refers to the change of adenosine (A) bases to inosine (I) bases in double-stranded RNAs. Several studies have revealed that A-to-I editing can regulate cellular processes and is associated with various human diseases. Therefore, accurate identification of A-to-I editing sites is crucial for understanding RNA-level (i.e. transcriptional) modifications and their potential roles in molecular functions. To date, various computational approaches for A-to-I editing site identification have been developed; however, their performance is still unsatisfactory and needs further improvement. In this study, we developed a novel stacked-ensemble learning model, ATTIC (A-To-I ediTing predICtor), to accurately identify A-to-I editing sites across three species, including Homo sapiens, Mus musculus and Drosophila melanogaster. We first comprehensively evaluated 37 RNA sequence-derived features combined with 14 popular machine learning algorithms. Then, we selected the optimal base models to build a series of stacked ensemble models. The final ATTIC framework was developed based on the optimal models improved by the feature selection strategy for specific species. Extensive cross-validation and independent tests illustrate that ATTIC outperforms state-of-the-art tools for predicting A-to-I editing sites. We also developed a web server for ATTIC, which is publicly available at http://web.unimelb-bioinfortools.cloud.edu.au/ATTIC/. We anticipate that ATTIC can be utilized as a useful tool to accelerate the identification of A-to-I RNA editing events and help characterize their roles in post-transcriptional regulation.


Subjects
Drosophila melanogaster; RNA Editing; Animals; Mice; Humans; Drosophila melanogaster/genetics; Drosophila melanogaster/metabolism; RNA/genetics; Adenosine/genetics; Adenosine/metabolism; Inosine/genetics; Inosine/metabolism
4.
Brief Bioinform ; 24(4)2023 07 20.
Article in English | MEDLINE | ID: mdl-37291763

ABSTRACT

BACKGROUND: Promoters are DNA regions that initiate the transcription of specific genes near the transcription start sites. In bacteria, promoters are recognized by RNA polymerases and associated sigma factors. Effective promoter recognition is essential for synthesizing the gene-encoded products by bacteria to grow and adapt to different environmental conditions. A variety of machine learning-based predictors for bacterial promoters have been developed; however, most of them were designed specifically for a particular species. To date, only a few predictors are available for identifying general bacterial promoters, and their predictive performance is limited. RESULTS: In this study, we developed TIMER, a Siamese neural network-based approach for identifying both general and species-specific bacterial promoters. Specifically, TIMER uses DNA sequences as the input and employs three Siamese neural networks with attention layers to train and optimize the models for a total of 13 species-specific and general bacterial promoters. Extensive 10-fold cross-validation and independent tests demonstrated that TIMER achieves a competitive performance and outperforms several existing methods on both general and species-specific promoter prediction. As an implementation of the proposed method, the web server of TIMER is publicly accessible at http://web.unimelb-bioinfortools.cloud.edu.au/TIMER/.


Subjects
Bacteria; Neural Networks, Computer; Bacteria/genetics; Bacteria/metabolism; DNA-Directed RNA Polymerases/genetics; DNA-Directed RNA Polymerases/metabolism; Base Sequence; Promoter Regions, Genetic
5.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35021193

ABSTRACT

Promoters are crucial regulatory DNA regions for gene transcriptional activation. Rapid advances in next-generation sequencing technologies have accelerated the accumulation of genome sequences, providing increased training data to inform computational approaches for both prokaryotic and eukaryotic promoter prediction. However, it remains a significant challenge to accurately identify species-specific promoter sequences using computational approaches. To advance computational support for promoter prediction, in this study, we curated 58 comprehensive, up-to-date, benchmark datasets for 7 different species (i.e. Escherichia coli, Bacillus subtilis, Homo sapiens, Mus musculus, Arabidopsis thaliana, Zea mays and Drosophila melanogaster) to assist the research community to assess the relative functionality of alternative approaches and support future research on both prokaryotic and eukaryotic promoters. We revisited 106 predictors published since 2000 for promoter identification (40 for prokaryotic promoter, 61 for eukaryotic promoter, and 5 for both). We systematically evaluated their training datasets, computational methodologies, calculated features, performance and software usability. On the basis of these benchmark datasets, we benchmarked 19 predictors with functioning webservers/local tools and assessed their prediction performance. We found that deep learning and traditional machine learning-based approaches generally outperformed scoring function-based approaches. Taken together, the curated benchmark dataset repository and the benchmarking analysis in this study serve to inform the design and implementation of computational approaches for promoter prediction and facilitate more rigorous comparison of new techniques in the future.


Subjects
Drosophila melanogaster; Eukaryota; Animals; Computational Biology/methods; Drosophila melanogaster/genetics; Eukaryotic Cells; Mice; Prokaryotic Cells; Promoter Regions, Genetic
6.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34729589

ABSTRACT

Conventional supervised binary classification algorithms have been widely applied to address significant research questions using biological and biomedical data. This classification scheme requires two fully labeled classes of data (e.g. positive and negative samples) to train a classification model. However, in many bioinformatics applications, labeling data is laborious, and the negative samples might be mislabeled due to the limited sensitivity of the experimental equipment. The positive unlabeled (PU) learning scheme was therefore proposed to enable the classifier to learn directly from limited positive samples and a large number of unlabeled samples (i.e. a mixture of positive and negative samples). To date, several PU learning algorithms have been developed to address various biological questions, such as sequence identification, functional site characterization and interaction prediction. In this paper, we revisit a collection of 29 state-of-the-art PU learning bioinformatic applications to address various biological questions. Various important aspects are extensively discussed, including PU learning methodology, biological application, classifier design and evaluation strategy. We also comment on the existing issues of PU learning and offer our perspectives for the future development of PU learning applications. We anticipate that our work serves as an instrumental guideline for a better understanding of the PU learning framework in bioinformatics and further developing next-generation PU learning frameworks for critical biological applications.


Subjects
Algorithms; Computational Biology; Computational Biology/methods; Supervised Machine Learning
7.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34226915

ABSTRACT

Pseudouridine is a ubiquitous RNA modification type present in eukaryotes and prokaryotes, which plays a vital role in various biological processes. Almost all kinds of RNAs are subject to this modification. However, it remains a great challenge to identify pseudouridine sites via experimental approaches, which require expensive and time-consuming experimental research. Therefore, computational approaches that can be used to perform accurate in silico identification of pseudouridine sites from the large amount of RNA sequence data are highly desirable and can aid in the functional elucidation of this critical modification. Here, we propose a new computational approach, termed Porpoise, to accurately identify pseudouridine sites from RNA sequence data. Porpoise builds upon a comprehensive evaluation of 18 frequently used feature encoding schemes based on the selection of four types of features, including binary features, pseudo k-tuple composition, nucleotide chemical property and position-specific trinucleotide propensity based on single-strand (PSTNPss). The selected features are fed into the stacked ensemble learning framework to enable the construction of an effective stacked model. Both cross-validation tests on the benchmark dataset and independent tests show that Porpoise achieves better predictive performance than several state-of-the-art approaches. The application of model interpretation tools demonstrates the importance of PSTNPss for the performance of the trained models. This new method is anticipated to facilitate community-wide efforts to identify putative pseudouridine sites and formulate novel testable biological hypotheses.


Subjects
Computational Biology/methods; Pseudouridine/chemistry; RNA/chemistry; RNA/genetics; Algorithms; Machine Learning; Pseudouridine/genetics; Reproducibility of Results; Sequence Analysis, RNA/methods
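
Illustrative note on record 7: the abstract lists "binary features" among the encodings Porpoise evaluates. The sketch below shows what a binary (one-hot) encoding of an RNA sequence window commonly looks like. It is a minimal illustration under assumptions, not the Porpoise implementation: the function name `binary_encode`, the A/C/G/U column order, and the all-zero handling of unknown bases are choices made here for the example.

```python
from typing import List

# One-hot ("binary") code per RNA base. The column order A, C, G, U is an
# assumption of this sketch; any fixed order works as long as it is consistent.
ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "U": (0, 0, 0, 1),
}


def binary_encode(window: str) -> List[int]:
    """Flatten a sequence window into a binary vector of length 4 * len(window).

    DNA-style 'T' is read as 'U'; any other symbol (for example 'N')
    becomes four zeros.
    """
    vector: List[int] = []
    for base in window.upper().replace("T", "U"):
        vector.extend(ONE_HOT.get(base, (0, 0, 0, 0)))
    return vector


if __name__ == "__main__":
    # A 5-nt window centred on a candidate uridine.
    print(binary_encode("ACUGA"))
```

A vector like this would typically be concatenated with the other feature types before being passed to a classifier; how Porpoise combines them is described in the paper itself, not here.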
8.
Bioinformatics ; 38(17): 4053-4061, 2022 09 02.
Article in English | MEDLINE | ID: mdl-35799358

ABSTRACT

MOTIVATION: Accurate annotation of different genomic signals and regions (GSRs) from DNA sequences is fundamentally important for understanding gene structure, regulation and function. Numerous efforts have been made to develop machine learning-based predictors for in silico identification of GSRs. However, it remains a great challenge to identify GSRs as the performance of most existing approaches is unsatisfactory. As such, it is highly desirable to develop more accurate computational methods for GSRs prediction. RESULTS: In this study, we propose a general deep learning framework termed DeepGenGrep, a general predictor for the systematic identification of multiple different GSRs from genomic DNA sequences. DeepGenGrep leverages the power of hybrid neural networks comprising a three-layer convolutional neural network and a two-layer long short-term memory to effectively learn useful feature representations from sequences. Benchmarking experiments demonstrate that DeepGenGrep outperforms several state-of-the-art approaches on identifying polyadenylation signals, translation initiation sites and splice sites across four eukaryotic species including Homo sapiens, Mus musculus, Bos taurus and Drosophila melanogaster. Overall, DeepGenGrep represents a useful tool for the high-throughput and cost-effective identification of potential GSRs in eukaryotic genomes. AVAILABILITY AND IMPLEMENTATION: The webserver and source code are freely available at http://bigdata.biocie.cn/deepgengrep/home and Github (https://github.com/wx-cie/DeepGenGrep/). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Deep Learning; Mice; Cattle; Animals; Drosophila melanogaster/genetics; Genomics/methods; Genome; Software
9.
BMC Cancer ; 22(1): 85, 2022 Jan 20.
Article in English | MEDLINE | ID: mdl-35057759

ABSTRACT

BACKGROUND: Circulating cell-free DNA (cfDNA) in the plasma of cancer patients contains cell-free tumour DNA (ctDNA) derived from tumour cells, and it has been widely recognized as a non-invasive source of tumour DNA for diagnosis and prognosis of cancer. Molecular profiling of ctDNA is often performed using targeted sequencing or low-coverage whole genome sequencing (WGS) to identify tumour-specific somatic mutations or somatic copy number aberrations (sCNAs). However, these approaches cannot efficiently detect all tumour-derived genomic changes in ctDNA. METHODS: We performed WGS analysis of cfDNA from 4 breast cancer patients and 2 patients with benign tumours. We sequenced matched germline DNA for all 6 patients and tumour samples from the breast cancer patients. All samples were sequenced on the Illumina HiSeqXTen sequencing platform and achieved approximately 30x, 60x and 100x coverage on germline, tumour and plasma DNA samples, respectively. RESULTS: The mutational burden of the plasma samples (1.44 somatic mutations/Mb of genome) was higher than that of the matched tumour samples. However, 90% of high-confidence somatic cfDNA variants were not detected in matched tumour samples and were found to comprise two background plasma mutational signatures. In contrast, cfDNA from the di-nucleosome fraction (300 bp-350 bp) had a much higher proportion (30%) of variants shared with tumour. Despite high-coverage sequencing, we were unable to detect sCNAs in plasma samples. CONCLUSIONS: Deep sequencing analysis of plasma samples revealed a higher fraction of unique somatic mutations in plasma samples, which were not detected in matched tumour samples. Sequencing of di-nucleosome-bound cfDNA fragments may increase recovery of tumour mutations from plasma.


Subjects
Breast Neoplasms/genetics; Circulating Tumor DNA/blood; DNA Mutational Analysis/methods; High-Throughput Nucleotide Sequencing/methods; Whole Genome Sequencing/methods; Adult; Biomarkers, Tumor/genetics; Breast Neoplasms/blood; Female; Humans; Mutation; Prognosis
10.
PLoS Comput Biol ; 17(1): e1008586, 2021 01.
Article in English | MEDLINE | ID: mdl-33471816

ABSTRACT

A streaming assembly pipeline utilising real-time Oxford Nanopore Technology (ONT) sequencing data is important for saving sequencing resources and reducing time-to-result. A previous approach implemented in npScarf provided an efficient streaming algorithm for hybrid assembly but was relatively prone to mis-assemblies compared to other graph-based methods. Here we present npGraph, a streaming hybrid assembly tool using the assembly graph instead of the separated pre-assembly contigs. It is able to produce a more complete genome assembly by resolving the path finding problem on the assembly graph using long reads as the traversing guide. Application to synthetic and real data from bacterial isolate genomes shows improved accuracy while still maintaining a low computational cost. npGraph also provides a graphical user interface (GUI) that gives a real-time visualisation of the progress of assembly. The tool and source code are available at https://github.com/hsnguyen/assembly.


Subjects
Nanopores; Sequence Alignment/methods; Sequence Analysis, DNA/methods; Algorithms; Computational Biology; DNA, Bacterial/analysis; DNA, Bacterial/genetics; Genome, Bacterial/genetics; Nanotechnology; Software; User-Computer Interface
11.
Theor Appl Genet ; 133(1): 23-36, 2020 Jan.
Article in English | MEDLINE | ID: mdl-31595335

ABSTRACT

KEY MESSAGE: β-Carotene content in sweetpotato is associated with the Orange and phytoene synthase genes; due to physical linkage of phytoene synthase with sucrose synthase, β-carotene and starch content are negatively correlated. In populations depending on sweetpotato for food security, starch is an important source of calories, while β-carotene is an important source of provitamin A. The negative association between the two traits contributes to the low nutritional quality of sweetpotato consumed, especially in sub-Saharan Africa. Using a biparental mapping population of 315 F1 progeny generated from a cross between an orange-fleshed and a non-orange-fleshed sweetpotato variety, we identified two major quantitative trait loci (QTL) on linkage group (LG) three (LG3) and twelve (LG12) affecting starch, β-carotene, and their correlated traits, dry matter and flesh color. Analysis of parental haplotypes indicated that these two regions acted pleiotropically to reduce starch content and increase β-carotene in genotypes carrying the orange-fleshed parental haplotype at the LG3 locus. Phytoene synthase and sucrose synthase, the rate-limiting and linked genes located within the QTL on LG3 involved in the carotenoid and starch biosynthesis, respectively, were differentially expressed in Beauregard versus Tanzania storage roots. The Orange gene, the molecular switch for chromoplast biogenesis, located within the QTL on LG12 while not differentially expressed was expressed in developing roots of the parental genotypes. We conclude that these two QTL regions act together in a cis and trans manner to inhibit starch biosynthesis in amyloplasts and enhance chromoplast biogenesis, carotenoid biosynthesis, and accumulation in orange-fleshed sweetpotato. Understanding the genetic basis of this negative association between starch and β-carotene will inform future sweetpotato breeding strategies targeting sweetpotato for food and nutritional security.


Subjects
Gene Expression Regulation, Plant; Ipomoea batatas/genetics; Polyploidy; Quantitative Trait Loci/genetics; Starch/metabolism; beta Carotene/metabolism; Alleles; Environment; Genetic Association Studies; Phenotype; Plant Roots/genetics; Plant Roots/growth & development; Quantitative Trait, Heritable
12.
Emerg Infect Dis ; 25(3): 406-415, 2019 03.
Article in English | MEDLINE | ID: mdl-30789135

ABSTRACT

In this retrospective study, we used whole-genome sequencing (WGS) to delineate transmission dynamics, characterize drug-resistance markers, and identify risk factors of transmission among Papua New Guinea residents of the Torres Strait Protected Zone (TSPZ) who had tuberculosis diagnoses during 2010-2015. Of 117 isolates collected, we could acquire WGS data for 100; 79 were Beijing sublineage 2.2.1.1, which was associated with active transmission (odds ratio 6.190, 95% CI 2.221-18.077). Strains were distributed widely throughout the TSPZ. Clustering occurred more often within than between villages (p = 0.0013). Including 4 multidrug-resistant tuberculosis isolates from Australian citizens epidemiologically linked to the TSPZ in the transmission network analysis revealed 2 probable cross-border transmission events. All multidrug-resistant isolates (33/104) belonged to Beijing sublineage 2.2.1.1 and had high-level isoniazid and ethionamide co-resistance; 2 isolates were extensively drug resistant. Including WGS in regional surveillance could improve tuberculosis transmission tracking and control strategies within the TSPZ.


Subjects
Emigration and Immigration; Mycobacterium tuberculosis/drug effects; Tuberculosis, Multidrug-Resistant/epidemiology; Tuberculosis, Multidrug-Resistant/microbiology; Antitubercular Agents/pharmacology; Australia/epidemiology; Bacterial Typing Techniques; Evolution, Molecular; Genotype; Geography; History, 21st Century; Humans; Microbial Sensitivity Tests; Mycobacterium tuberculosis/genetics; Mycobacterium tuberculosis/isolation & purification; Papua New Guinea/epidemiology; Tuberculosis, Multidrug-Resistant/diagnosis; Tuberculosis, Multidrug-Resistant/history; Whole Genome Sequencing
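
Illustrative note on record 12: the abstract reports an odds ratio with a 95% confidence interval. For readers unfamiliar with the statistic, the sketch below computes an odds ratio and a Wald-type interval from a 2×2 table. The abstract does not give the underlying counts or say which interval method the authors used, so the counts in the example are invented placeholders and the Wald method is an assumption of this sketch; the output will not reproduce the published figures.

```python
import math
from typing import Tuple


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int,
                       z: float = 1.96) -> Tuple[float, float, float]:
    """Return (odds ratio, lower bound, upper bound) for a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z = 1.96 gives an approximate 95% Wald interval on the log-odds scale.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this estimator")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Placeholder counts, NOT taken from the study.
    estimate, low, high = odds_ratio_wald_ci(20, 5, 10, 10)
    print(f"OR = {estimate:.3f}, 95% CI {low:.3f}-{high:.3f}")
```

With small cell counts an exact or score-based interval is often preferred over the Wald approximation.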
13.
J Antimicrob Chemother ; 74(3): 582-593, 2019 03 01.
Article in English | MEDLINE | ID: mdl-30445429

ABSTRACT

BACKGROUND: Polymyxin B and E (colistin) have been pivotal in the treatment of XDR Gram-negative bacterial infections; however, resistance has emerged. A structurally related lipopeptide, octapeptin C4, has shown significant potency against XDR bacteria, including polymyxin-resistant strains, but its mode of action remains undefined. OBJECTIVES: We sought to compare and contrast the acquisition of resistance in an XDR Klebsiella pneumoniae (ST258) clinical isolate in vitro with all three lipopeptides to potentially unveil variations in their mode of action. METHODS: The isolate was exposed to increasing concentrations of polymyxins and octapeptin C4 over 20 days. Day 20 strains underwent WGS, complementation assays, antimicrobial susceptibility testing and lipid A analysis. RESULTS: Twenty days of exposure to the polymyxins resulted in a 1000-fold increase in the MIC, whereas for octapeptin C4 a 4-fold increase was observed. There was no cross-resistance observed between the polymyxin- and octapeptin-resistant strains. Sequencing of polymyxin-resistant isolates revealed mutations in previously known resistance-associated genes, including crrB, mgrB, pmrB, phoPQ and yciM, along with novel mutations in qseC. Octapeptin C4-resistant isolates had mutations in mlaDF and pqiB, genes related to phospholipid transport. These genetic variations were reflected in distinct phenotypic changes to lipid A. Polymyxin-resistant isolates increased 4-amino-4-deoxyarabinose fortification of lipid A phosphate groups, whereas the lipid A of octapeptin C4-resistant strains harboured a higher abundance of hydroxymyristate and palmitoylate. CONCLUSIONS: Octapeptin C4 has a distinct mode of action compared with the polymyxins, highlighting its potential as a future therapeutic agent to combat the increasing threat of XDR bacteria.


Subjects
Anti-Bacterial Agents/pharmacology; Colistin/pharmacology; Drug Resistance, Multiple, Bacterial; Klebsiella pneumoniae/drug effects; Lipopeptides/pharmacology; Peptides, Cyclic/pharmacology; Polymyxin B/pharmacology; Humans; Klebsiella Infections/microbiology; Klebsiella pneumoniae/isolation & purification; Microbial Sensitivity Tests; Mutation; Whole Genome Sequencing
14.
Bioinformatics ; 34(5): 873-874, 2018 03 01.
Article in English | MEDLINE | ID: mdl-29092025

ABSTRACT

Motivation: Targeted sequencing using capture probes has become increasingly popular in clinical applications due to its scalability and cost-effectiveness. The approach also allows for higher sequencing coverage of the targeted regions, resulting in greater statistical power for analysis. However, because of the dynamics of the hybridization process, it is difficult to evaluate the efficiency of the probe design prior to the experiments, which are time-consuming and costly. Results: We developed CapSim, a software package for simulation of targeted sequencing. Given a genome sequence and a set of probes, CapSim simulates the fragmentation, the dynamics of probe hybridization and the sequencing of the captured fragments on Illumina and PacBio sequencing platforms. The simulated data can be used for evaluating the performance of the analysis pipeline, as well as the efficiency of the probe design. Parameters of the various stages in the sequencing process can also be evaluated in order to optimize the experiments. Availability and implementation: CapSim is publicly available under BSD license at https://github.com/Devika1/capsim. Contact: l.coin@imb.uq.edu.au. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Genomics/methods; Software
15.
BMC Infect Dis ; 19(1): 660, 2019 Jul 24.
Article in English | MEDLINE | ID: mdl-31340776

ABSTRACT

BACKGROUND: Rapid diagnosis and appropriate treatment are imperative in bacterial sepsis due to the increasing risk of mortality with every hour without appropriate antibiotic therapy. Atypical infections with fastidious organisms may take more than 4 days to diagnose, leading to calls for improved methods for rapidly diagnosing sepsis. Capnocytophaga canimorsus is a slow-growing, fastidious gram-negative bacillus which is a common commensal within the mouths of dogs, but rarely causes infections in humans. C. canimorsus sepsis risk factors include immunosuppression, alcoholism and advanced age. Here we report on the application of emerging nanopore sequencing methods to rapidly diagnose an atypical case of C. canimorsus septic shock. CASE PRESENTATION: A 62-year-old female patient was admitted to an intensive care unit with septic shock and multi-organ failure six days after a reported dog bite. Blood cultures were unable to detect a pathogen after 3 days despite observed intracellular bacilli on blood smears. Real-time nanopore sequencing was subsequently employed on whole blood to detect Capnocytophaga canimorsus in 19 h. The patient was not immunocompromised and did not have any other known risk factors. Whole-genome sequencing of the clinical sample and of the offending dog's oral swabs showed near-identical C. canimorsus genomes. The patient responded to antibiotic treatment and was discharged from hospital 31 days after admission. CONCLUSIONS: Use of real-time nanopore sequencing reduced the time-to-diagnosis of Capnocytophaga canimorsus in this case from 6.25 days to 19 h. Capnocytophaga canimorsus should be considered in cases of suspected sepsis involving cat or dog contact, irrespective of the patient's known risk factors.


Subjects
Bites and Stings/complications; Capnocytophaga/isolation & purification; Shock, Septic/diagnosis; Animals; Anti-Bacterial Agents/therapeutic use; Capnocytophaga/drug effects; Capnocytophaga/genetics; Cats; Dogs; Female; Gram-Negative Bacterial Infections/diagnosis; Gram-Negative Bacterial Infections/immunology; Gram-Negative Bacterial Infections/microbiology; Humans; Immunocompromised Host; Middle Aged; Nanopores; Sequence Analysis, DNA; Shock, Septic/immunology; Shock, Septic/microbiology
16.
Nucleic Acids Res ; 45(5): e34, 2017 03 17.
Article in English | MEDLINE | ID: mdl-27903916

ABSTRACT

Accurate identification of copy number alterations is an essential step in understanding the events driving tumor progression. While a variety of algorithms have been developed to use high-throughput sequencing data to profile copy number changes, no tool is able to reliably characterize ploidy and genotype absolute copy number from tumor samples that contain less than 40% tumor cells. To increase our power to resolve the copy number profile from low-cellularity tumor samples, we developed a novel approach that pre-phases heterozygote germline single nucleotide polymorphisms (SNPs) in order to replace the commonly used 'B-allele frequency' with a more powerful 'parental-haplotype frequency'. We apply our tool, sCNAphase, to characterize the copy number and loss-of-heterozygosity profiles of four publicly available breast cancer cell-lines. Comparisons to previous spectral karyotyping and microarray studies revealed that sCNAphase reliably identified overall ploidy as well as the individual copy number mutations from each cell-line. Analysis of artificial cell-line mixtures demonstrated the capacity of this method to determine the level of tumor cellularity, consistently identify sCNAs and characterize ploidy in samples with as little as 10% tumor cells. This novel methodology has the potential to bring sCNA profiling to low-cellularity tumors, a form of cancer unable to be accurately studied by current methods.


Subjects
Aneuploidy; DNA Copy Number Variations; Haplotypes; Software; Algorithms; Cell Count; Cell Line, Tumor; Gene Dosage; Heterozygote; High-Throughput Nucleotide Sequencing; Humans; Sequence Analysis, DNA
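
Illustrative note on record 16: the abstract contrasts the per-SNP "B-allele frequency" with a "parental-haplotype frequency" obtained after phasing heterozygous germline SNPs. The sketch below illustrates the general idea only: once each SNP's alleles are assigned to one of two parental haplotypes, read counts can be pooled across many SNPs in a segment, giving a less noisy estimate than any single SNP. It is not the sCNAphase model; the `PhasedSnp` record and both function names are hypothetical.

```python
from typing import Iterable, NamedTuple


class PhasedSnp(NamedTuple):
    """Read counts at one heterozygous germline SNP (hypothetical record)."""
    ref_reads: int
    alt_reads: int
    alt_on_hap_a: bool  # True if the alternate allele sits on haplotype A


def b_allele_frequency(snp: PhasedSnp) -> float:
    """Per-SNP fraction of reads carrying the alternate (B) allele."""
    return snp.alt_reads / (snp.ref_reads + snp.alt_reads)


def parental_haplotype_frequency(snps: Iterable[PhasedSnp]) -> float:
    """Fraction of reads supporting haplotype A, pooled over all SNPs."""
    hap_a_reads = 0
    total_reads = 0
    for snp in snps:
        # Phase decides which allele's reads count toward haplotype A.
        hap_a_reads += snp.alt_reads if snp.alt_on_hap_a else snp.ref_reads
        total_reads += snp.ref_reads + snp.alt_reads
    if total_reads == 0:
        raise ValueError("no reads cover the supplied SNPs")
    return hap_a_reads / total_reads


if __name__ == "__main__":
    segment = [PhasedSnp(4, 6, True), PhasedSnp(7, 3, False)]
    print([b_allele_frequency(s) for s in segment])
    print(parental_haplotype_frequency(segment))
```

In the toy segment the two per-SNP values fall on opposite sides of 0.5 because the alternate allele lies on different haplotypes, while the pooled haplotype frequency points consistently at haplotype A.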
18.
BMC Bioinformatics ; 19(1): 261, 2018 07 13.
Article in English | MEDLINE | ID: mdl-30001702

ABSTRACT

BACKGROUND: Detection of genomic inversions remains challenging. Many existing methods primarily target inversions with a non-repetitive breakpoint, leaving inverted repeat (IR) mediated non-allelic homologous recombination (NAHR) inversions largely unexplored. RESULT: We present npInv, a novel tool specifically for detecting and genotyping NAHR inversions using long read sub-alignment of long read sequencing data. We benchmark npInv with other tools in both simulation and real data. We use npInv to generate a whole-genome inversion map for NA12878 consisting of 30 NAHR inversions (of which 15 are novel), including all previously known NAHR-mediated inversions in NA12878 with flanking IR less than 7 kb. Our genotyping accuracy on this dataset was 94%. We used PCR to confirm the presence of two of these novel inversions. We show that there is a near linear relationship between the length of flanking IR and the minimum inversion size, without inverted repeats. CONCLUSION: The application of npInv shows high accuracy in both simulation and real data. The results give deeper insight into understanding inversion.


Subjects
Chromosome Inversion/genetics , Genotype , Humans
19.
Bioinformatics ; 33(24): 3988-3990, 2017 Dec 15.
Article in English | MEDLINE | ID: mdl-28961965

ABSTRACT

MOTIVATION: The recent introduction of a barcoding protocol for Oxford Nanopore sequencing has increased the versatility of the technology. Several bioinformatics tools have been developed to demultiplex barcoded reads, but none of them supports streaming analysis. This limits the use of multiplexed sequencing in real-time applications, which is one of the main advantages of the technology. RESULTS: We introduced npBarcode, an open source and cross-platform tool for barcode demultiplexing in a streaming fashion that can be used to pipe data to further real-time analyses. The tool also provides a friendly graphical user interface by integrating the module into npReader, making it possible to monitor progress while sequencing is still under way. We show that our algorithm achieves accuracies at least as good as competing tools. AVAILABILITY AND IMPLEMENTATION: npBarcode is bundled in Japsa, a Java tools kit for genome analysis, and is freely available at https://github.com/mdcao/japsa. CONTACT: s.nguyen@uq.edu.au or l.coin@imb.uq.edu.au. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
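
Streaming demultiplexing, as opposed to batch processing, can be sketched as a generator that labels each read as soon as it arrives. This is not the npBarcode algorithm or its Java API: the function names, the prefix-mismatch scoring and the `max_mismatches` threshold are assumptions made for illustration only, and a real nanopore demultiplexer would more plausibly use alignment to tolerate insertions and deletions.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple


def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def demultiplex_stream(
    reads: Iterable[Tuple[str, str]],
    barcodes: Dict[str, str],
    max_mismatches: int = 1,
) -> Iterator[Tuple[str, Optional[str]]]:
    """Yield (read_name, barcode_label) one read at a time.

    The label is None when no barcode matches the read prefix within
    max_mismatches. Because this is a generator, downstream steps can
    consume results while new reads are still being produced.
    """
    for name, seq in reads:
        best_label: Optional[str] = None
        best_score = max_mismatches + 1
        for label, barcode in barcodes.items():
            prefix = seq[:len(barcode)]
            if len(prefix) < len(barcode):
                continue  # read shorter than the barcode
            score = mismatches(prefix, barcode)
            if score < best_score:
                best_label, best_score = label, score
        yield name, best_label


# Hypothetical barcodes and reads.
barcodes = {"bc1": "ACGT", "bc2": "TTGA"}
reads = [("r1", "ACGTAAAA"), ("r2", "TTGCAAAA"), ("r3", "GGGGAAAA")]
for name, label in demultiplex_stream(reads, barcodes):
    print(name, label)
```

Any iterable can serve as `reads`, including one that blocks while waiting for the sequencer to emit the next record, which is what allows the output to be piped into further real-time analysis.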


Subjects
Electronic Data Processing , Nanopores , Sequence Analysis, DNA/methods , Software , Algorithms , Reproducibility of Results
20.
N Engl J Med ; 370(18): 1712-1723, 2014 May 01.
Article in English | MEDLINE | ID: mdl-24785206

ABSTRACT

BACKGROUND: Improved diagnostic tests for tuberculosis in children are needed. We hypothesized that transcriptional signatures of host blood could be used to distinguish tuberculosis from other diseases in African children who either were or were not infected with the human immunodeficiency virus (HIV). METHODS: The study population comprised prospective cohorts of children who were undergoing evaluation for suspected tuberculosis in South Africa (655 children), Malawi (701 children), and Kenya (1599 children). Patients were assigned to groups according to whether the diagnosis was culture-confirmed tuberculosis, culture-negative tuberculosis, diseases other than tuberculosis, or latent tuberculosis infection. Diagnostic signatures distinguishing tuberculosis from other diseases and from latent tuberculosis infection were identified from genomewide analysis of RNA expression in host blood. RESULTS: We identified a 51-transcript signature distinguishing tuberculosis from other diseases in the South African and Malawian children (the discovery cohort). In the Kenyan children (the validation cohort), a risk score based on the signature for tuberculosis and for diseases other than tuberculosis showed a sensitivity of 82.9% (95% confidence interval [CI], 68.6 to 94.3) and a specificity of 83.6% (95% CI, 74.6 to 92.7) for the diagnosis of culture-confirmed tuberculosis. Among patients with cultures negative for Mycobacterium tuberculosis who were treated for tuberculosis (those with highly probable, probable, or possible cases of tuberculosis), the estimated sensitivity was 62.5 to 82.3%, 42.1 to 80.8%, and 35.3 to 79.6%, respectively, for different estimates of actual tuberculosis in the groups. In comparison, the sensitivity of the Xpert MTB/RIF assay for molecular detection of M. tuberculosis DNA in cases of culture-confirmed tuberculosis was 54.3% (95% CI, 37.1 to 68.6), and the sensitivity in highly probable, probable, or possible cases was an estimated 25.0 to 35.7%, 5.3 to 13.3%, and 0%, respectively; the specificity of the assay was 100%. CONCLUSIONS: RNA expression signatures provided data that helped distinguish tuberculosis from other diseases in African children with and those without HIV infection. (Funded by the European Union Action for Diseases of Poverty Program and others).
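
For readers less familiar with the metrics reported above, sensitivity and specificity follow directly from a two-by-two table of test result against reference diagnosis. The sketch below uses hypothetical counts that are not taken from the study, and the function names are illustrative.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of reference-positive cases that the test calls positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of reference-negative cases that the test calls negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts, not study data: 50 culture-confirmed cases and
# 50 children with other diseases.
tp, fn = 40, 10
tn, fp = 45, 5

print(f"sensitivity = {sensitivity(tp, fn):.1%}")
print(f"specificity = {specificity(tn, fp):.1%}")
```

A test with 100% specificity, as reported for the Xpert MTB/RIF assay, corresponds to zero false positives in this table, whatever its sensitivity.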


Subjects
Mycobacterium tuberculosis/genetics , RNA, Bacterial/blood , Transcriptome , Tuberculosis/diagnosis , Africa , Algorithms , Bacteriological Techniques , Child , Child, Preschool , Diagnosis, Differential , HIV Infections/complications , Humans , Infant , Latent Tuberculosis/diagnosis , Male , Mycobacterium tuberculosis/isolation & purification , Oligonucleotide Array Sequence Analysis , Risk , Sensitivity and Specificity , Tuberculosis/complications , Tuberculosis/genetics