1.
BMC Bioinformatics ; 17: 387, 2016 Sep 20.
Article in English | MEDLINE | ID: mdl-27650316

ABSTRACT

BACKGROUND: Burkholderia mallei and B. pseudomallei are the causative agents of glanders and melioidosis, respectively, diseases with high morbidity and mortality rates. B. mallei and B. pseudomallei are closely related genetically; B. mallei evolved from an ancestral strain of B. pseudomallei by genome reduction and adaptation to an obligate intracellular lifestyle. Although these two bacteria cause different diseases, they share multiple virulence factors, including bacterial secretion systems, which represent key components of bacterial pathogenicity. Despite recent progress, the secretion system proteins for B. mallei and B. pseudomallei, their pathogenic mechanisms of action, and host factors are not well characterized. RESULTS: We previously developed a manually curated database, DBSecSys, of bacterial secretion system proteins for B. mallei. Here, we report an expansion of the database with corresponding information about B. pseudomallei. DBSecSys 2.0 contains comprehensive literature-based and computationally derived information about B. mallei ATCC 23344 and literature-based and computationally derived information about B. pseudomallei K96243. The database contains updated information for 163 B. mallei proteins from the previous database and 61 additional B. mallei proteins, and new information for 281 B. pseudomallei proteins associated with 5 secretion systems, their 1,633 human- and murine-interacting targets, and 2,400 host-B. mallei interactions and 2,286 host-B. pseudomallei interactions. The database also includes information about 13 pathogenic mechanisms of action for B. mallei and B. pseudomallei secretion system proteins inferred from the available literature or computationally. Additionally, DBSecSys 2.0 provides details about 82 virulence attenuation experiments for 52 B. mallei secretion system proteins and 98 virulence attenuation experiments for 61 B. pseudomallei secretion system proteins. 
We updated the Web interface and data access layer to speed up users' searches for detailed information on orthologous proteins related to secretion systems of the two pathogens. CONCLUSIONS: The updates of DBSecSys 2.0 provide unique capabilities to access comprehensive information about secretion systems of B. mallei and B. pseudomallei. They enable studies and comparisons of corresponding proteins of these two closely related pathogens and their host-interacting partners. The database is available at http://dbsecsys.bhsai.org.


Subjects
Bacterial Proteins/metabolism; Bacterial Secretion Systems/metabolism; Burkholderia mallei/pathogenicity; Burkholderia pseudomallei/pathogenicity; Databases, Protein; Animals; Bacterial Proteins/genetics; Bacterial Secretion Systems/genetics; Burkholderia mallei/genetics; Burkholderia mallei/metabolism; Burkholderia pseudomallei/genetics; Burkholderia pseudomallei/metabolism; Humans; Mice; Virulence Factors/genetics; Virulence Factors/metabolism
2.
PLoS Comput Biol ; 11(3): e1004088, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25738731

ABSTRACT

Burkholderia pathogenicity relies on protein virulence factors to control and promote bacterial internalization, survival, and replication within eukaryotic host cells. We recently used yeast two-hybrid (Y2H) screening to identify a small set of novel Burkholderia proteins that were shown to attenuate disease progression in an aerosol infection animal model using the virulent Burkholderia mallei ATCC 23344 strain. Here, we performed an extended analysis of primarily nine B. mallei virulence factors and their interactions with human proteins to map out how the bacteria can influence and alter host processes and pathways. Specifically, we employed topological analyses to assess the connectivity patterns of targeted host proteins, identify modules of pathogen-interacting host proteins linked to processes promoting infectivity, and evaluate the effect of crosstalk among the identified host protein modules. Overall, our analysis showed that the targeted host proteins generally had a large number of interacting partners and interacted with other host proteins that were also targeted by B. mallei proteins. We also introduced a novel Host-Pathogen Interaction Alignment (HPIA) algorithm and used it to explore similarities between host-pathogen interactions of B. mallei, Yersinia pestis, and Salmonella enterica. We inferred putative roles of B. mallei proteins based on the roles of their aligned Y. pestis and S. enterica partners and showed that up to 73% of the predicted roles matched existing annotations. A key insight into Burkholderia pathogenicity derived from these analyses of Y2H host-pathogen interactions is the identification of eukaryotic-specific targeted cellular mechanisms, including the ubiquitination degradation system and the use of the focal adhesion pathway as a fulcrum for transmitting mechanical forces and regulatory signals. 
This provides the mechanisms to modulate and adapt the host-cell environment for the successful establishment of host infections and intracellular spread.


Subjects
Burkholderia mallei/physiology; Burkholderia mallei/pathogenicity; Host-Pathogen Interactions/physiology; Algorithms; Animals; Bacterial Proteins/physiology; Cluster Analysis; Computational Biology; Focal Adhesions; Glanders/microbiology; Glanders/physiopathology; Humans; Mice; Protein Interaction Maps/physiology; Signal Transduction/physiology; Virulence Factors/metabolism
3.
BMC Genomics ; 16: 1106, 2015 Dec 29.
Article in English | MEDLINE | ID: mdl-26714771

ABSTRACT

BACKGROUND: Francisella tularensis is a select bio-threat agent and one of the most virulent intracellular pathogens known, requiring just a few organisms to establish an infection. Although several virulence factors are known, we lack an understanding of virulence factors that act through host-pathogen protein interactions to promote infection. To address these issues in the highly infectious F. tularensis subsp. tularensis Schu S4 strain, we deployed a combined in silico, in vitro, and in vivo analysis to identify virulence factors and their interactions with host proteins to characterize bacterial infection mechanisms. RESULTS: We initially used comparative genomics and literature to identify and select a set of 49 putative and known virulence factors for analysis. Each protein was then subjected to proteome-scale yeast two-hybrid (Y2H) screens with human and murine cDNA libraries to identify potential host-pathogen protein-protein interactions. Based on the bacterial protein interaction profile with both hosts, we selected seven novel putative virulence factors for mutant construction and animal validation experiments. We were able to create five transposon insertion mutants and used them in an intranasal BALB/c mouse challenge model to establish 50% lethal dose estimates. Three of these, ΔFTT0482c, ΔFTT1538c, and ΔFTT1597, showed attenuation in lethality and can thus be considered novel F. tularensis virulence factors. The analysis of the accompanying Y2H data identified intracellular protein trafficking from the early endosome to the late endosome as an important component in virulence attenuation for these virulence factors. Furthermore, we also used the Y2H data to investigate host protein binding of two known virulence factors, showing that direct protein binding was a component in the modulation of the inflammatory response via activation of mitogen-activated protein kinases and in the oxidative stress response. 
CONCLUSIONS: Direct interactions with specific host proteins and the ability to influence interactions among host proteins are important components for F. tularensis to avoid host-cell defense mechanisms and successfully establish an infection. Although direct host-pathogen protein-protein binding is only one aspect of Francisella virulence, it is a critical component in directly manipulating and interfering with cellular processes in the host cell.


Subjects
Francisella tularensis/pathogenicity; Host-Pathogen Interactions/genetics; Virulence Factors/metabolism; Animals; Female; Francisella tularensis/genetics; Mice; Mice, Inbred BALB C; Protein Binding/genetics; Protein Binding/physiology; Virulence/genetics; Virulence Factors/genetics
4.
Mol Cell Proteomics ; 12(11): 3036-51, 2013 Nov.
Article in English | MEDLINE | ID: mdl-23800426

ABSTRACT

Burkholderia mallei is an infectious intracellular pathogen whose virulence and resistance to antibiotics make it a potential bioterrorism agent. Given its genetic origin as a commensal soil organism, it is equipped with an extensive and varied set of adapted mechanisms to cope with and modulate host-cell environments. One essential virulence mechanism comprises the specialized secretion systems that are designed to penetrate host-cell membranes and insert pathogen proteins directly into the host cell's cytosol. However, the secretion systems' proteins and, in particular, their host targets are largely uncharacterized. Here, we used a combined in silico, in vitro, and in vivo approach to identify B. mallei proteins required for pathogenicity. We used bioinformatics tools, including orthology detection and ab initio predictions of secretion system proteins, as well as published experimental Burkholderia data to initially select a small number of proteins as putative virulence factors. We then used yeast two-hybrid assays against normalized whole human and whole murine proteome libraries to detect and identify interactions between each of these bacterial proteins and host proteins. Analysis of such interactions provided both verification of known virulence factors and identification of three new putative virulence proteins. We successfully created insertion mutants for each of these three proteins using the virulent B. mallei ATCC 23344 strain. We exposed BALB/c mice to mutant strains and the wild-type strain in an aerosol challenge model using lethal B. mallei doses. In each set of experiments, mice exposed to mutant strains survived for the 21-day duration of the experiment, whereas mice exposed to the wild-type strain rapidly died. 
Given their in vivo role in pathogenicity, and based on the yeast two-hybrid interaction data, these results point to the importance of these pathogen proteins in modulating host ubiquitination pathways, phagosomal escape, and actin-cytoskeleton rearrangement processes.


Subjects
Burkholderia mallei/metabolism; Burkholderia mallei/pathogenicity; Host-Pathogen Interactions/physiology; Virulence Factors/metabolism; Animals; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Burkholderia mallei/genetics; Female; Host-Pathogen Interactions/genetics; Humans; Mice; Mice, Inbred BALB C; Mutagenesis, Insertional; Protein Interaction Mapping; Proteomics; Two-Hybrid System Techniques; Virulence/genetics; Virulence/physiology; Virulence Factors/genetics
5.
BMC Bioinformatics ; 15: 244, 2014 Jul 16.
Article in English | MEDLINE | ID: mdl-25030112

ABSTRACT

BACKGROUND: Bacterial pathogenicity represents a major public health concern worldwide. Secretion systems are a key component of bacterial pathogenicity, as they provide the means for bacterial proteins to penetrate host-cell membranes and insert themselves directly into the host cells' cytosol. Burkholderia mallei is a Gram-negative bacterium that uses multiple secretion systems during its host infection life cycle. To date, the identities of secretion system proteins for B. mallei are not well known, and their pathogenic mechanisms of action and host factors are largely uncharacterized. DESCRIPTION: We present the Database of Burkholderia mallei Secretion Systems (DBSecSys), a compilation of manually curated and computationally predicted bacterial secretion system proteins and their host factors. Currently, DBSecSys contains comprehensive experimentally and computationally derived information about B. mallei strain ATCC 23344. The database includes 143 B. mallei proteins associated with five secretion systems, their 1,635 human and murine interacting targets, and the corresponding 2,400 host-B. mallei interactions. The database also includes information about 10 pathogenic mechanisms of action for B. mallei secretion system proteins inferred from the available literature. Additionally, DBSecSys provides details about 42 virulence attenuation experiments for 27 B. mallei secretion system proteins. Users interact with DBSecSys through a Web interface that allows for data browsing, querying, visualizing, and downloading. CONCLUSIONS: DBSecSys provides a comprehensive, systematically organized resource of experimental and computational data associated with B. mallei secretion systems. It provides the unique ability to study secretion systems not only through characterization of their corresponding pathogen proteins, but also through characterization of their host-interacting partners. The database is available at https://applications.bhsai.org/dbsecsys.


Subjects
Bacterial Proteins/metabolism; Bacterial Secretion Systems; Burkholderia mallei/physiology; Databases, Protein; Animals; Burkholderia mallei/metabolism; Burkholderia mallei/pathogenicity; Host-Pathogen Interactions; Humans; Mice; Virulence Factors/metabolism
6.
Nucleic Acids Res ; 40(16): e127, 2012 Sep.
Article in English | MEDLINE | ID: mdl-22584625

ABSTRACT

Accurate estimation of expression levels from RNA-Seq data entails precise mapping of the sequence reads to a reference genome. Because the standard reference genome contains only one allele at any given locus, reads overlapping polymorphic loci that carry a non-reference allele are at least one mismatch away from the reference and, hence, are less likely to be mapped. This bias in read mapping leads to inaccurate estimates of allele-specific expression (ASE). To address this read-mapping bias, we propose the construction of an enhanced reference genome that includes the alternative alleles at known polymorphic loci. We show that mapping to this enhanced reference reduced the read-mapping biases, leading to more reliable estimates of ASE. Experiments on simulated data show that the proposed strategy reduced the number of loci with mapping bias by ≥ 63% when compared with a previous approach that relies on masking the polymorphic loci and by ≥ 18% when compared with the standard approach that uses an unaltered reference. When we applied our strategy to actual RNA-Seq data, we found that it mapped up to 15% more reads than the previous approaches and identified many seemingly incorrect inferences made by them.


Subjects
Alleles; Chromosome Mapping/methods; Gene Expression Profiling; Sequence Analysis, RNA/methods; Chromosome Mapping/standards; Genetic Loci; Genome, Human; High-Throughput Nucleotide Sequencing; Humans; Polymorphism, Single Nucleotide; Reference Standards
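Note on entry 6: the enhanced-reference idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation; the function names, the fixed-width flanking window, and the brute-force mismatch search are all assumptions made for illustration. The sketch adds one short sequence per known SNP that carries the alternative allele, so a read with that allele can match without a mismatch.

```python
def build_enhanced_reference(reference, snps, flank):
    """Return the reference plus one short sequence per SNP with the alternative allele.

    snps is a list of (0-based position, alternative base) pairs (hypothetical format).
    """
    sequences = {"ref": reference}
    for pos, alt in snps:
        if reference[pos] == alt:
            continue  # not a real alternative allele at this position
        start = max(0, pos - flank)
        end = min(len(reference), pos + flank + 1)
        # Copy the flanking reference bases and substitute the alternative allele.
        sequences[f"alt_{pos}_{alt}"] = reference[start:pos] + alt + reference[pos + 1:end]
    return sequences


def best_mismatches(read, sequences):
    """Smallest number of mismatches of the read over all ungapped placements."""
    best = len(read)
    for seq in sequences.values():
        for offset in range(len(seq) - len(read) + 1):
            window = seq[offset:offset + len(read)]
            best = min(best, sum(a != b for a, b in zip(read, window)))
    return best


reference = "ACGTACGTAC"
enhanced = build_enhanced_reference(reference, snps=[(4, "G")], flank=3)
read_with_alt_allele = "TGCG"  # covers position 4 and carries G instead of A

print(best_mismatches(read_with_alt_allele, {"ref": reference}))  # standard reference
print(best_mismatches(read_with_alt_allele, enhanced))            # enhanced reference
```

Against the unaltered reference the read is one mismatch away from its true location; against the enhanced set it matches exactly, which is the bias reduction the abstract describes. A real aligner would also need to map hits on the added sequences back to reference coordinates.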
7.
Nucleic Acids Res ; 39(13): e88, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21572104

ABSTRACT

The unparalleled growth in the availability of genomic data offers both a challenge to develop orthology detection methods that are simultaneously accurate and high throughput and an opportunity to improve orthology detection by leveraging evolutionary evidence in the accumulated sequenced genomes. Here, we report a novel orthology detection method, termed QuartetS, that exploits evolutionary evidence in a computationally efficient manner. Based on the well-established evolutionary concept that gene duplication events can be used to discriminate homologous genes, QuartetS uses an approximate phylogenetic analysis of quartet gene trees to infer the occurrence of duplication events and discriminate paralogous from orthologous genes. We used function- and phylogeny-based metrics to perform a large-scale, systematic comparison of the orthology predictions of QuartetS with those of four other methods [bi-directional best hit (BBH), outgroup, OMA and QuartetS-C (QuartetS followed by clustering)], involving 624 bacterial genomes and >2 million genes. We found that QuartetS slightly, but consistently, outperformed the highly specific OMA method and that, while consuming only 0.5% additional computational time, QuartetS predicted 50% more orthologs with a 50% lower false positive rate than the widely used BBH method. We conclude that, for large-scale phylogenetic and functional analysis, QuartetS and QuartetS-C should be preferred, respectively, in applications where high accuracy and high throughput are required.


Subjects
Algorithms; Genes; Phylogeny; Gene Duplication; Genome, Bacterial; Genomics/methods; Sequence Alignment
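Note on entry 7: the bi-directional best hit (BBH) baseline that QuartetS is compared against can be sketched in a few lines. This shows only the baseline, not QuartetS itself; the nested-dictionary score format and the gene names are hypothetical. Two genes are called orthologs when each is the other's highest-scoring hit across the two genomes.

```python
def bidirectional_best_hits(scores_ab, scores_ba):
    """Return gene pairs (a, b) that are each other's best hit.

    scores_ab[a][b] is the similarity score of gene a (genome A) searched
    against gene b (genome B); scores_ba holds the reverse searches.
    """
    best_ab = {a: max(hits, key=hits.get) for a, hits in scores_ab.items() if hits}
    best_ba = {b: max(hits, key=hits.get) for b, hits in scores_ba.items() if hits}
    # Keep a pair only when the best hit is reciprocal.
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy scores (hypothetical values): a1 and a2 both hit b1 best,
# but b1's best hit is a1, so only (a1, b1) is reciprocal.
scores_ab = {"a1": {"b1": 90, "b2": 40}, "a2": {"b1": 70, "b2": 30}}
scores_ba = {"b1": {"a1": 88, "a2": 65}, "b2": {"a1": 35, "a2": 45}}

pairs = bidirectional_best_hits(scores_ab, scores_ba)
print(pairs)
```

In this toy case a2 is left without an ortholog even though b2's best hit is a2, which shows how strict the reciprocity requirement is.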
8.
BMC Bioinformatics ; 11: 340, 2010 Jun 23.
Article in English | MEDLINE | ID: mdl-20573238

ABSTRACT

BACKGROUND: Pathogen diagnostic assays based on polymerase chain reaction (PCR) technology provide high sensitivity and specificity. However, the design of these diagnostic assays is computationally intensive, requiring high-throughput methods to identify unique PCR signatures in the presence of an ever-increasing availability of sequenced genomes. RESULTS: We present the Tool for PCR Signature Identification (TOPSI), a high-performance computing pipeline for the design of PCR-based pathogen diagnostic assays. The TOPSI pipeline efficiently designs PCR signatures common to multiple bacterial genomes by obtaining the shared regions through pairwise alignments between the input genomes. TOPSI successfully designed PCR signatures common to 18 Staphylococcus aureus genomes in less than 14 hours using 98 cores on a high-performance computing system. CONCLUSIONS: TOPSI is a computationally efficient, fully integrated tool for high-throughput design of PCR signatures common to multiple bacterial genomes. TOPSI is freely available for download at http://www.bhsai.org/downloads/topsi.tar.gz.


Subjects
Computational Biology/methods; Genome, Bacterial; Polymerase Chain Reaction/methods; Staphylococcus aureus/genetics; Base Sequence; Burkholderia mallei/genetics; Burkholderia pseudomallei/genetics; Chromosome Mapping; Sensitivity and Specificity; Staphylococcus aureus/classification
9.
Proteins ; 74(2): 449-60, 2009 Feb 01.
Article in English | MEDLINE | ID: mdl-18636476

ABSTRACT

In this article, we present a new method termed CatFam (Catalytic Families) to automatically infer the functions of catalytic proteins, which account for 20-40% of all proteins in living organisms and play a critical role in a variety of biological processes. CatFam is a sequence-based method that generates sequence profiles to represent and infer protein catalytic functions. CatFam generates profiles through a stepwise procedure that carefully controls profile quality and employs nonenzymes as negative samples to establish profile-specific thresholds associated with a predefined nominal false-positive rate (FPR) of predictions. The adjustable FPR allows for fine precision control of each profile and enables the generation of profile databases that meet different needs: function annotation with high precision and hypothesis generation with moderate precision but better recall. Multiple tests of CatFam databases (generated with distinct nominal FPRs) against enzyme and nonenzyme datasets show that the method's predictions have consistently high precision and recall. For example, a 1% FPR database predicts protein catalytic functions for a dataset of enzymes and nonenzymes with 98.6% precision and 95.0% recall. Comparisons of CatFam databases against other established profile-based methods for the functional annotation of 13 bacterial genomes indicate that CatFam consistently achieves higher precision and (in most cases) higher recall, and that (on average) CatFam provides 21.9% additional catalytic functions not inferred by the other similarly reliable methods. These results strongly suggest that the proposed method provides a valuable contribution to the automated prediction of protein catalytic functions. The CatFam databases and the database search program are freely available at http://www.bhsai.org/downloads/catfam.tar.gz.


Subjects
Algorithms; Databases, Protein; Sequence Analysis, Protein/methods; Animals; Catalysis; Cluster Analysis; Enzymes/genetics; Enzymes/metabolism; Genome; Humans; Metabolic Networks and Pathways; Protein Structure, Tertiary; Proteins/genetics; Proteins/metabolism; Reproducibility of Results; Structure-Activity Relationship
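Note on entry 9: the abstract describes profile-specific thresholds set from nonenzyme (negative) samples at a nominal false-positive rate. A minimal sketch of that general idea follows. It is an assumption about how such a threshold could be chosen, not CatFam's actual procedure; the function names and scores are invented for illustration.

```python
def threshold_for_fpr(negative_scores, nominal_fpr):
    """Pick a score threshold so that at most nominal_fpr of negatives score above it."""
    ranked = sorted(negative_scores, reverse=True)
    allowed = int(nominal_fpr * len(ranked))  # negatives permitted above the threshold
    if allowed >= len(ranked):
        return float("-inf")
    # A sequence is predicted positive only when its score is strictly greater.
    return ranked[allowed]


def precision_recall(positive_scores, negative_scores, threshold):
    """Precision and recall of the rule 'score > threshold'."""
    tp = sum(s > threshold for s in positive_scores)
    fp = sum(s > threshold for s in negative_scores)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / len(positive_scores) if positive_scores else 0.0
    return precision, recall


# Hypothetical profile scores for nonenzymes (negatives) and enzymes (positives).
negatives = [1, 2, 3, 4, 5, 6, 7, 8]
positives = [5, 7, 9, 10]

threshold = threshold_for_fpr(negatives, nominal_fpr=0.25)
precision, recall = precision_recall(positives, negatives, threshold)
print(threshold, precision, recall)
```

Lowering the nominal rate raises the threshold, trading recall for precision, which matches the abstract's distinction between high-precision annotation and higher-recall hypothesis generation.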
10.
BMC Bioinformatics ; 9: 185, 2008 Apr 10.
Article in English | MEDLINE | ID: mdl-18402679

ABSTRACT

BACKGROUND: We present a methodology for high-throughput design of oligonucleotide fingerprints for microarray-based pathogen diagnostic assays. The oligonucleotide fingerprints, or DNA microarray probes, are designed for identifying target organisms in environmental or clinical samples. The design process is implemented in a high-performance computing software pipeline that incorporates major algorithmic improvements over a previous version to both reduce computation time and improve specificity assessment. RESULTS: The algorithmic improvements result in significant reduction in runtimes, with the updated pipeline being up to nearly five times faster than the previous version. The improvements in specificity assessment, based on multiple specificity criteria, result in robust and consistent evaluation of cross-hybridization with nontarget sequences. In addition, the multiple criteria provide finer control over the number of resulting fingerprints, which helps in obtaining a larger number of fingerprints with high specificity. Simulation tests for Francisella tularensis and Yersinia pestis, using a well-established hybridization model to estimate cross-hybridization with nontarget sequences, show that the improved specificity criteria yield a larger number of fingerprints as compared to using a single specificity criterion. CONCLUSION: The faster runtimes, achieved as the result of algorithmic improvements, are critical for extending the pipeline to process multiple target genomes. The larger numbers of identified fingerprints, obtained by considering broader specificity criteria, are essential for designing probes for hard-to-distinguish target sequences.


Subjects
Bacteria/genetics; Bacteria/isolation & purification; DNA Fingerprinting/methods; DNA Probes/genetics; DNA, Bacterial/genetics; Genomic Islands/genetics; Oligonucleotide Array Sequence Analysis/methods; Algorithms; Chromosome Mapping/methods
11.
BMC Bioinformatics ; 9: 52, 2008 Jan 25.
Article in English | MEDLINE | ID: mdl-18221520

ABSTRACT

BACKGROUND: Automated protein function prediction methods are needed to keep pace with high-throughput sequencing. With the existence of many programs and databases for inferring different protein functions, a pipeline that properly integrates these resources will benefit from the advantages of each method. However, integrated systems usually do not provide mechanisms to generate customized databases to predict particular protein functions. Here, we describe a tool termed PIPA (Pipeline for Protein Annotation) that has these capabilities. RESULTS: PIPA annotates protein functions by combining the results of multiple programs and databases, such as InterPro and the Conserved Domains Database, into common Gene Ontology (GO) terms. The major algorithms implemented in PIPA are: (1) a profile database generation algorithm, which generates customized profile databases to predict particular protein functions, (2) an automated ontology mapping generation algorithm, which maps various classification schemes into GO, and (3) a consensus algorithm to reconcile annotations from the integrated programs and databases. PIPA's profile generation algorithm is employed to construct the enzyme profile database CatFam, which predicts catalytic functions described by Enzyme Commission (EC) numbers. Validation tests show that CatFam yields average recall and precision larger than 95.0%. CatFam is integrated with PIPA. We use an association rule mining algorithm to automatically generate mappings between terms of two ontologies from annotated sample proteins. Incorporating the ontologies' hierarchical topology into the algorithm increases the number of generated mappings. In particular, it generates 40.0% additional mappings from the Clusters of Orthologous Groups (COG) to EC numbers and a six-fold increase in mappings from COG to GO terms. 
The mappings to EC numbers show a very high precision (99.8%) and recall (96.6%), while the mappings to GO terms show moderate precision (80.0%) and low recall (33.0%). Our consensus algorithm for GO annotation is based on the computation and propagation of likelihood scores associated with GO terms. The test results suggest that, for a given recall, the application of the consensus algorithm yields higher precision than when consensus is not used. CONCLUSION: The algorithms implemented in PIPA provide automated genome-wide protein function annotation based on reconciled predictions from multiple resources.


Subjects
Algorithms; Computational Biology/methods; Databases, Protein; Pattern Recognition, Automated/methods; Proteins/genetics; Proteins/physiology; Proteomics/methods; Amino Acid Sequence; Structure-Activity Relationship
12.
BMC Genomics ; 9: 496, 2008 Oct 21.
Article in English | MEDLINE | ID: mdl-18940003

ABSTRACT

BACKGROUND: With multiple strains of various pathogens being sequenced, it is necessary to develop high-throughput methods that can simultaneously process multiple bacterial or viral genomes to find common fingerprints as well as fingerprints that are unique to each individual genome. We present algorithmic enhancements to an existing single-genome pipeline that allow for efficient design of microarray probes common to groups of target genomes. The enhanced pipeline takes advantage of the similarities in the input genomes to narrow the search to short, nonredundant regions of the target genomes and, thereby, significantly reduces the computation time. The pipeline also computes a three-state hybridization matrix, which gives the expected hybridization of each probe with each target. RESULTS: Design of microarray probes for eight pathogenic Burkholderia genomes shows that the multiple-genome pipeline is nearly four times faster than the single-genome pipeline for this application. The probes designed for these eight genomes were experimentally tested with one non-target and three target genomes. Hybridization experiments show that less than 10% of the designed probes cross-hybridize with non-targets. Also, more than 65% of the probes designed to identify all Burkholderia mallei and B. pseudomallei strains successfully hybridize with a B. pseudomallei strain not used for probe design. CONCLUSION: The savings in runtime suggest that the enhanced pipeline can be used to design fingerprints for tens or even hundreds of related genomes in a single run. Hybridization results with an unsequenced B. pseudomallei strain indicate that the designed probes might be useful in identifying unsequenced strains of B. mallei and B. pseudomallei.


Subjects
Burkholderia/genetics; DNA Fingerprinting/methods; DNA Probes; Genome, Bacterial; Oligonucleotide Array Sequence Analysis/methods; Algorithms; Bacterial Typing Techniques; Burkholderia/classification; Computational Biology; DNA, Bacterial/genetics; Sensitivity and Specificity; Sequence Analysis, DNA
13.
Bioinformatics ; 23(1): 5-13, 2007 Jan 01.
Article in English | MEDLINE | ID: mdl-17068088

ABSTRACT

MOTIVATION: Advances in DNA microarray technology and computational methods have unlocked new opportunities to identify 'DNA fingerprints', i.e. oligonucleotide sequences that uniquely identify a specific genome. We present an integrated approach for the computational identification of DNA fingerprints for design of microarray-based pathogen diagnostic assays. We provide a quantifiable definition of a DNA fingerprint stated both from a computational as well as an experimental point of view, and the analytical proof that all in silico fingerprints satisfying the stated definition are found using our approach. RESULTS: The presented computational approach is implemented in an integrated high-performance computing (HPC) software tool for oligonucleotide fingerprint identification termed TOFI. We employed TOFI to identify in silico DNA fingerprints for several bacteria and plasmid sequences, which were then experimentally evaluated as potential probes for microarray-based diagnostic assays. Results and analysis of approximately 150 in silico DNA fingerprints for Yersinia pestis and 250 fingerprints for Francisella tularensis are presented. AVAILABILITY: The implemented algorithm is available upon request.


Subjects
DNA Fingerprinting/methods; DNA, Bacterial/analysis; DNA, Bacterial/classification; Oligonucleotide Array Sequence Analysis/methods; Software; Algorithms; Francisella tularensis/classification; Francisella tularensis/genetics; Plasmids/genetics; Software Design; Yersinia pestis/classification; Yersinia pestis/genetics
14.
PLoS One ; 12(11): e0188071, 2017.
Article in English | MEDLINE | ID: mdl-29176882

ABSTRACT

Coxiella burnetii is an obligate Gram-negative intracellular pathogen and the etiological agent of Q fever. Successful infection requires a functional Type IV secretion system, which translocates more than 100 effector proteins into the host cytosol to establish the infection, restructure the intracellular host environment, and create a parasitophorous vacuole where the replicating bacteria reside. We used yeast two-hybrid (Y2H) screening of 33 selected C. burnetii effectors against whole-genome human and murine proteome libraries to generate a map of potential host-pathogen protein-protein interactions (PPIs). We detected 273 unique interactions between 20 pathogen and 247 human proteins, and 157 between 17 pathogen and 137 murine proteins. We used orthology to combine the data and create a single host-pathogen interaction network containing 415 unique interactions between 25 C. burnetii and 363 human proteins. We further performed complementary pairwise Y2H testing of 43 out of 91 C. burnetii-human interactions involving five pathogen proteins. We used the combined data to 1) perform enrichment analyses of target host cellular processes and pathways, 2) examine effectors with known infection phenotypes, and 3) infer potential mechanisms of action for four effectors with uncharacterized functions. The host-pathogen interaction profiles supported known Coxiella phenotypes, such as adapting cell morphology through cytoskeletal re-arrangements, protein processing and trafficking, organelle generation, cholesterol processing, innate immune modulation, and interactions with the ubiquitin and proteasome pathways. The generated dataset of PPIs, the largest collection of unbiased Coxiella host-pathogen interactions to date, represents a rich source of information with respect to secreted pathogen effector proteins and their interactions with human host proteins.


Subjects
Bacterial Proteins/metabolism; Coxiella burnetii/metabolism; Host-Pathogen Interactions; Animals; Conserved Sequence; Gene Ontology; Humans; Mice; Protein Binding; Protein Domains; Protein Interaction Mapping; Two-Hybrid System Techniques
15.
PLoS One ; 12(12): e0188461, 2017.
Article in English | MEDLINE | ID: mdl-29216202

ABSTRACT

Certain occupational and geographical exposures have been associated with an increased risk of lung disease. As a baseline for future studies, we sought to characterize the upper respiratory microbiomes of healthy military personnel in a garrison environment. Nasal, oropharyngeal, and nasopharyngeal swabs were collected from 50 healthy active duty volunteers eight times over the course of one year (1107 swabs, completion rate = 92.25%) and subjected to pyrosequencing of the V1-V3 region of 16S rDNA. Respiratory bacterial taxa were characterized at the genus level, using QIIME 1.8 and the Ribosomal Database Project classifier. High levels of Staphylococcus, Corynebacterium, and Propionibacterium were observed among both nasal and nasopharyngeal microbiota, comprising more than 75% of all operational taxonomic units (OTUs). In contrast, Streptococcus was the sole dominant bacterial genus (approximately 50% of all OTUs) in the oropharynx. The average bacterial diversity was greater in the oropharynx than in the nasal or nasopharyngeal region at all time points. Diversity analysis indicated a significant overlap between nasal and nasopharyngeal samples, whereas oropharyngeal samples formed a cluster distinct from these two regions. The study produced a large set of pyrosequencing data on the V1-V3 region of bacterial 16S rDNA for the respiratory microbiomes of healthy active duty Service Members. Pre-processing of sequencing reads showed good data quality. The derived microbiome profiles were consistent both internally and with previous reports, suggesting their utility for further analyses and association studies based on sequence and demographic data.
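The reported completion rate is consistent with a simple count, assuming each volunteer was due one swab per site per visit (50 volunteers × 3 sites × 8 visits = 1,200 expected swabs). That denominator is an inference from the figures in the abstract, not something the abstract states. A short check, with a hypothetical helper name:

```python
def completion_rate(collected: int, volunteers: int, sites: int, visits: int) -> float:
    """Percentage of expected swabs that were actually collected."""
    expected = volunteers * sites * visits  # assumed: one swab per site per visit
    return 100.0 * collected / expected


rate = completion_rate(collected=1107, volunteers=50, sites=3, visits=8)
print(f"{rate:.2f}%")
```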


Subjects
Microbiota , Military Personnel , Respiratory System/microbiology , Corynebacterium/genetics , Corynebacterium/isolation & purification , Ribosomal DNA/genetics , Female , Humans , Male , Nasal Cavity/microbiology , Nasopharynx/microbiology , Propionibacterium/genetics , Propionibacterium/isolation & purification , 16S Ribosomal RNA/genetics , Staphylococcus/genetics , Staphylococcus/isolation & purification
16.
Article in English | MEDLINE | ID: mdl-26955620

ABSTRACT

Burkholderia mallei (Bm) is a highly infectious intracellular pathogen classified as a category B biological agent by the Centers for Disease Control and Prevention. After respiratory exposure, Bm establishes itself within host macrophages before spreading into major organ systems, which can lead to chronic infection, sepsis, and death. Previously, we combined computational prediction of host-pathogen interactions with yeast two-hybrid experiments and identified novel virulence factor genes in Bm, including BMAA0553, BMAA0728 (tssN), and BMAA1865. In the present study, we used recombinant allelic exchange to construct deletion mutants of BMAA0553 and tssN (ΔBMAA0553 and ΔTssN, respectively) and showed that both deletions completely abrogated virulence at doses of >100 times the LD50 of the wild-type Bm strain. Analysis of ΔBMAA0553- and ΔTssN-infected mice showed starkly reduced bacterial dissemination relative to wild-type Bm, and subsequent in vitro experiments characterized pathogenic phenotypes with respect to intracellular growth, macrophage uptake and phagosomal escape, actin-based motility, and multinucleated giant cell formation. Based on observed in vitro and in vivo phenotypes, we explored the use of ΔTssN as a candidate live-attenuated vaccine. Mice immunized with aerosolized ΔTssN showed a 21-day survival rate of 67% after a high-dose aerosol challenge with the wild-type Bm ATCC 23344 strain, compared to a 0% survival rate for unvaccinated mice. However, analysis of histopathology and bacterial burden showed that while the surviving vaccinated mice were protected from acute infection, Bm was still able to establish a chronic infection. Vaccinated mice showed a modest IgG response, suggesting a limited potential of ΔTssN as a vaccine candidate, but also showed prolonged elevation of pro-inflammatory cytokines, underscoring the role of cellular and innate immunity in mitigating acute infection in inhalational glanders.


Subjects
Bacterial Antibodies/immunology , Bacterial Vaccines/immunology , Burkholderia mallei/immunology , Burkholderia mallei/pathogenicity , Glanders/immunology , Immunoglobulin G/immunology , Inhalation Administration , Aerosols , Animals , Burkholderia mallei/genetics , Cytokines/metabolism , Female , Gene Deletion , Glanders/microbiology , Host-Pathogen Interactions , Macrophages/immunology , Mice , Inbred BALB C Mice , Vaccination , Attenuated Vaccines/immunology , Virulence/genetics
17.
Microbiome ; 2: 31, 2014.
Article in English | MEDLINE | ID: mdl-25228989

ABSTRACT

BACKGROUND: Sample storage conditions, extraction methods, PCR primers, and parameters are major factors that affect metagenomics analysis based on microbial 16S rRNA gene sequencing. Most published studies were limited to the comparison of only one or two types of these factors. Systematic multi-factor explorations are needed to evaluate the conditions that may impact the validity of a microbiome analysis. This study aimed to improve methodological options and facilitate the best technical approaches in the design of a microbiome study. Three readily available mock bacterial community materials and two commercial extraction techniques, Qiagen DNeasy and MO BIO PowerSoil DNA purification methods, were used to assess procedures for 16S ribosomal DNA amplification and pyrosequencing-based analysis. Primers were chosen for 16S rDNA quantitative PCR and amplification of region V3 to V1. Swabs spiked with mock bacterial community cells and clinical oropharyngeal swabs were incubated at respective temperatures of -80°C, -20°C, 4°C, and 37°C for 4 weeks, then extracted with the two methods, and subjected to pyrosequencing and taxonomic and statistical analyses to investigate microbiome profile stability. RESULTS: The bacterial compositions for the mock community DNA samples determined in this study were consistent with the projected levels and agreed with the literature. The quantitation accuracy of abundances for several genera was improved with changes made to the standard Human Microbiome Project (HMP) procedure. The data for the samples purified with DNeasy and PowerSoil methods were statistically distinct; however, both results were reproducible and in good agreement with each other. The temperature effect on storage stability was investigated by using mock community cells and showed that the microbial community profiles were altered with the increase in incubation temperature. However, this phenomenon was not detected when clinical oropharyngeal swabs were used in the experiment. CONCLUSIONS: Mock community materials originating from the HMP study are valuable controls in developing 16S metagenomics analysis procedures. Long-term exposure to a high temperature may introduce variation into the analysis of oropharyngeal swabs, suggesting that samples should be stored at 4°C or lower. The observed variations due to sample storage temperature are in a similar range as the intrapersonal variability among different clinical oropharyngeal swab samples.

18.
Source Code Biol Med ; 6: 14, 2011 Sep 08.
Article in English | MEDLINE | ID: mdl-21902825

ABSTRACT

With ever-increasing numbers of microbial genomes being sequenced, efficient tools are needed to perform strain-level identification of any newly sequenced genome. Here, we present the SNP identification for strain typing (SNIT) pipeline, a fast and accurate software system that compares a newly sequenced bacterial genome with other genomes of the same species to identify single nucleotide polymorphisms (SNPs) and small insertions/deletions (indels). Based on this information, the pipeline analyzes the polymorphic loci present in all input genomes to identify the genome that has the fewest differences with the newly sequenced genome. Similarly, for each of the other genomes, SNIT identifies the input genome with the fewest differences. Results from five bacterial species show that the SNIT pipeline identifies the correct closest neighbor with 75% to 100% accuracy. The SNIT pipeline is available for download at http://www.bhsai.org/snit.html.
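The closest-neighbor step described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the SNIT implementation: variants are assumed to be already called and stored as position-to-allele maps, the names `count_differences` and `closest_genome` and all strain labels are hypothetical, and ties are broken alphabetically.

```python
from typing import Dict, Iterable

Alleles = Dict[int, str]  # genomic position -> observed base or indel string


def count_differences(a: Alleles, b: Alleles, loci: Iterable[int]) -> int:
    """Count loci at which two genomes carry different alleles."""
    return sum(1 for pos in loci if a[pos] != b[pos])


def closest_genome(query: Alleles, references: Dict[str, Alleles]) -> str:
    """Return the reference genome with the fewest differences from the query."""
    if not references:
        raise ValueError("at least one reference genome is required")
    # Compare only loci present in every input genome.
    shared = set(query)
    for alleles in references.values():
        shared &= set(alleles)
    # sorted() makes tie-breaking deterministic (alphabetical; an assumption).
    return min(
        sorted(references),
        key=lambda name: count_differences(query, references[name], shared),
    )


# Toy data with made-up positions and strain names.
query = {101: "A", 250: "G", 512: "T", 900: "C"}
refs = {
    "strain_X": {101: "A", 250: "G", 512: "C", 900: "C"},
    "strain_Y": {101: "T", 250: "A", 512: "T", 900: "C"},
    "strain_Z": {101: "T", 250: "G", 512: "C"},  # position 900 not covered
}

print(closest_genome(query, refs))
```

Restricting the comparison to loci covered in all genomes keeps a genome with missing data from looking artificially close; loci that are identical everywhere contribute zero differences, so they do not affect the ranking.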

19.
PLoS One ; 6(3): e17469, 2011 Mar 07.
Article in English | MEDLINE | ID: mdl-21408217

ABSTRACT

BACKGROUND: The annotation of genomes from next-generation sequencing platforms needs to be rapid, high-throughput, and fully integrated and automated. Although a few Web-based annotation services have recently become available, they may not be the best solution for researchers that need to annotate a large number of genomes, possibly including proprietary data, and store them locally for further analysis. To address this need, we developed a standalone software application, the Annotation of microbial Genome Sequences (AGeS) system, which incorporates publicly available and in-house-developed bioinformatics tools and databases, many of which are parallelized for high-throughput performance. METHODOLOGY: The AGeS system supports three main capabilities. The first is the storage of input contig sequences and the resulting annotation data in a central, customized database. The second is the annotation of microbial genomes using an integrated software pipeline, which first analyzes contigs from high-throughput sequencing by locating genomic regions that code for proteins, RNA, and other genomic elements through the Do-It-Yourself Annotation (DIYA) framework. The identified protein-coding regions are then functionally annotated using the in-house-developed Pipeline for Protein Annotation (PIPA). The third capability is the visualization of annotated sequences using GBrowse. To date, we have implemented these capabilities for bacterial genomes. AGeS was evaluated by comparing its genome annotations with those provided by three other methods. Our results indicate that the software tools integrated into AGeS provide annotations that are in general agreement with those provided by the compared methods. This is demonstrated by a >94% overlap in the number of identified genes, a significant number of identical annotated features, and a >90% agreement in enzyme function predictions.


Subjects
Bacterial Genome/genetics , Molecular Sequence Annotation/methods , Software , Base Sequence , Bacterial Genes/genetics , Reproducibility of Results
20.
PLoS One ; 4(7): e6254, 2009 Jul 16.
Article in English | MEDLINE | ID: mdl-19606223

ABSTRACT

BACKGROUND: Protein structures are critical for understanding the mechanisms of biological systems and, subsequently, for drug and vaccine design. Unfortunately, protein sequence data exceed structural data by a factor of more than 200 to 1. This gap can be partially filled by using computational protein structure prediction. While structure prediction Web servers are a notable option, they often restrict the number of sequence queries and/or provide a limited set of prediction methodologies. Therefore, we present a standalone protein structure prediction software package suitable for high-throughput structural genomic applications that performs all three classes of prediction methodologies: comparative modeling, fold recognition, and ab initio. This software can be deployed on a user's own high-performance computing cluster. METHODOLOGY/PRINCIPAL FINDINGS: The pipeline consists of a Perl core that integrates more than 20 individual software packages and databases, most of which are freely available from other research laboratories. The query protein sequences are first divided into domains either by domain boundary recognition or Bayesian statistics. The structures of the individual domains are then predicted using template-based modeling or ab initio modeling. The predicted models are scored with a statistical potential and an all-atom force field. The top-scoring ab initio models are annotated by structural comparison against the Structural Classification of Proteins (SCOP) fold database. Furthermore, secondary structure, solvent accessibility, transmembrane helices, and structural disorder are predicted. The results are generated in text, tab-delimited, and hypertext markup language (HTML) formats. So far, the pipeline has been used to study viral and bacterial proteomes. CONCLUSIONS: The standalone pipeline that we introduce here, unlike protein structure prediction Web servers, allows users to devote their own computing assets to process a potentially unlimited number of queries as well as perform resource-intensive ab initio structure prediction.


Subjects
Proteins/chemistry , Amino Acid Sequence , Cluster Analysis , Protein Databases , Molecular Sequence Data , Programming Languages , Protein Conformation , Amino Acid Sequence Homology , User-Computer Interface