1.
Drug Discov Today ; 29(3): 103884, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38219969

ABSTRACT

The volume of nucleic acid sequence data has exploded recently, amplifying the challenge of transforming data into meaningful information. Processing data can require an increasingly complex ecosystem of customized tools, which makes it harder to communicate analyses in a way that is understandable yet detailed enough to enable informed decisions or repeat analyses. This can be of particular interest to institutions and companies communicating computations in a regulatory environment. BioCompute Objects (BCOs; an instance of pipeline documentation that conforms to the IEEE 2791-2020 standard) were developed as a standardized mechanism for analysis reporting. A suite of BCOs is presented, representing interconnected elements of a computation modeled after those that might be found in a regulatory submission but shared publicly; in this case, a pipeline designed to identify viral contaminants in biological manufacturing, such as for vaccines.


Subjects
Computational Biology , Vaccines , High-Throughput Nucleotide Sequencing , Workflow
2.
PLoS Pathog ; 19(7): e1011527, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37523399

ABSTRACT

Members of the spotted fever group rickettsia express four large, surface-exposed autotransporters, at least one of which is a known virulence determinant. Autotransporter translocation to the bacterial outer surface, also known as type V secretion, involves formation of a β-barrel autotransporter domain in the periplasm that inserts into the outer membrane to form a pore through which the N-terminal passenger domain is passed and exposed on the outer surface. Two major surface antigens of Rickettsia rickettsii are known to be surface exposed, with the passenger domain cleaved from the autotransporter domain. A highly passaged strain of R. rickettsii, Iowa, fails to cleave these autotransporters and is avirulent. We have identified a putative peptidase, truncated in the Iowa strain, that when reconstituted into Iowa restores appropriate processing of the autotransporters and a modest degree of virulence.


Subjects
Rickettsia rickettsii , Type V Secretion Systems , Rickettsia rickettsii/genetics , Type V Secretion Systems/genetics , Peptide Hydrolases , Bacterial Outer Membrane Proteins , Virulence Factors
3.
mSphere ; 5(5), 2020 Oct 14.
Article in English | MEDLINE | ID: mdl-33055255

ABSTRACT

High-throughput sequencing (HTS) has been widely used to characterize HIV-1 genome sequences. Currently, no algorithm can directly determine genotype and quasispecies population from short HTS reads generated from long genome sequences without additional software. To establish a robust subpopulation, subtype, and recombination analysis workflow, we amplified the HIV-1 3'-half genome from plasma samples of 65 HIV-1-infected individuals and sequenced the entire amplicon (∼4,500 bp) by HTS. With direct analysis of raw reads using HIVE-hexahedron, we showed that 48% of samples harbored 2 to 13 subpopulations. We identified various subtypes (17 A1s, 4 Bs, 27 Cs, 6 CRF02_AGs, and 11 unique recombinant forms) and defined recombinant breakpoints of 10 recombinants. These results were validated with viral genome sequences generated by single genome sequencing (SGS) or by analysis of the consensus sequence of the HTS reads. The HIVE-hexahedron workflow is more sensitive and accurate than evaluating the consensus sequence alone and is also more cost-effective than SGS.

IMPORTANCE: The highly recombinogenic nature of human immunodeficiency virus type 1 (HIV-1) leads to recombination and the emergence of quasispecies. It is important to reliably identify subpopulations to understand the complexity of a viral population for drug resistance surveillance and vaccine development. High-throughput sequencing (HTS) provides improved resolution over Sanger sequencing for the analysis of heterogeneous viral subpopulations. However, current methods for analyzing HTS reads cannot fully address accurate population reconstruction. Hence, there is a pressing need for a more sensitive, accurate, user-friendly, and cost-effective method to analyze viral quasispecies. For this purpose, we have improved the HIVE-hexahedron algorithm, which we previously developed with in silico short sequences, to analyze raw HTS short reads. The significance of this study is that our standalone algorithm enables a streamlined analysis of quasispecies, subtype, and recombination patterns from long HIV-1 genome regions without the need for additional sequence analysis tools. Distinct viral populations and recombination patterns identified by HIVE-hexahedron are further validated by comparison with sequences obtained by SGS.


Subjects
Algorithms , Viral Genome , HIV-1/classification , HIV-1/genetics , Genetic Recombination , Cohort Studies , Computer Simulation , Genetic Variation , Genotype , HIV Infections/blood , HIV Infections/virology , High-Throughput Nucleotide Sequencing , Humans , Phylogeny , Quasispecies/genetics
4.
PLoS One ; 14(9): e0206484, 2019.
Article in English | MEDLINE | ID: mdl-31509535

ABSTRACT

Comprehensive knowledge of the types and ratios of microbes that inhabit the healthy human gut is necessary before any pre-clinical or clinical study can attempt to alter the microbiome to treat a condition or improve therapy outcome. To address this need, we present an innovative, scalable, comprehensive analysis workflow, a healthy human reference microbiome list and abundance profile (GutFeelingKB), and a novel Fecal Biome Population Report (FecalBiome) with clinical applicability. GutFeelingKB provides a list of 157 organisms (8 phyla, 18 classes, 23 orders, 38 families, 59 genera and 109 species) that forms the baseline biome and can therefore be used as healthy controls for studies related to dysbiosis. This list can be expanded to 863 organisms if closely related proteomes are considered. The incorporation of microbiome science into routine clinical practice necessitates a standard report for comparing an individual's microbiome to the growing knowledgebase of "normal" microbiome data. The FecalBiome and the underlying technology of GutFeelingKB address this need. The knowledgebase can be useful to regulatory agencies for the assessment of fecal transplant and other microbiome products, as it contains a list of organisms from healthy individuals. In addition to the list of organisms and their abundances, this study also generated a collection of assembled contiguous sequences (contigs) of metagenomic dark matter. In this study, metagenomic dark matter represents sequences that cannot be mapped to any known sequence but can be assembled into contigs of 10,000 nucleotides or longer. These sequences can be used to create primers to study potential novel organisms. All data are freely available from https://hive.biochemistry.gwu.edu/gfkb and NCBI's Short Read Archive.


Subjects
Gastrointestinal Microbiome , Metagenome , Metagenomics , Feces/microbiology , Humans , Metagenomics/methods
5.
Methods Mol Biol ; 1558: 159-190, 2017.
Article in English | MEDLINE | ID: mdl-28150238

ABSTRACT

Post-translational modifications (PTMs) are covalent modifications that proteins might undergo following, or sometimes during, the process of translation. Together with gene diversity, PTMs contribute to the overall variety of possible protein functions for a given organism. Single-nucleotide polymorphisms (SNPs) are the most common form of variation found in the human genome and have been found to be associated with diseases such as Alzheimer's disease (AD) and Parkinson's disease (PD), among many others. Studies have also shown that non-synonymous single-nucleotide variation (nsSNV) at a PTM site, which alters the corresponding encoded amino acid in the translated protein sequence, can lead to abnormal activity of a protein and can contribute to a disease phenotype. Significant advances in next-generation sequencing (NGS) technologies and high-throughput proteomics have generated a huge amount of data for both SNPs and PTMs. However, these data are unsystematically distributed across a number of diverse databases. Thus, there is a need for efforts toward data standardization and validation of bioinformatics algorithms that can fully leverage SNP and PTM information for biomedical research. In this book chapter, we present some of the commonly used databases for both SNVs and PTMs and describe a broad approach that can be applied to many scenarios for studying the impact of nsSNVs on PTM sites of human proteins.


Subjects
Amino Acids , Computational Biology/methods , Genetic Databases , Single Nucleotide Polymorphism , Post-Translational Protein Processing , Proteins , Proteomics/methods , Amino Acids/chemistry , Amino Acids/metabolism , Genetic Variation , Humans , Molecular Sequence Annotation , Mutation , Proteins/chemistry , Proteins/genetics , Proteins/metabolism , Quality Control , Software , Structure-Activity Relationship , Web Browser
6.
Article in English | MEDLINE | ID: mdl-26989153

ABSTRACT

The High-performance Integrated Virtual Environment (HIVE) is a distributed storage and compute environment designed primarily to handle next-generation sequencing (NGS) data. This multicomponent cloud infrastructure provides secure web access for authorized users to deposit, retrieve, annotate and compute on NGS data, and to analyse the outcomes using web interface visual environments appropriately built in collaboration with research and regulatory scientists and other end users. Unlike many massively parallel computing environments, HIVE uses a cloud control server which virtualizes services, not processes. It is both very robust and flexible due to the abstraction layer introduced between computational requests and operating system processes. The novel paradigm of moving computations to the data, instead of moving data to computational nodes, has proven to be significantly less taxing for both hardware and network infrastructure. The honeycomb data model developed for HIVE integrates metadata into an object-oriented model. Its distinction from other object-oriented databases is in the additional implementation of a unified application program interface to search, view and manipulate data of all types. This model simplifies the introduction of new data types, thereby minimizing the need for database restructuring and streamlining the development of new integrated information systems. The honeycomb model employs a highly secure hierarchical access control and permission system, allowing determination of data access privileges in a finely granular manner without flooding the security subsystem with a multiplicity of rules. HIVE infrastructure will allow engineers and scientists to perform NGS analysis in a manner that is both efficient and secure. HIVE is actively supported in public and private domains, and project collaborations are welcomed. Database URL: https://hive.biochemistry.gwu.edu.


Subjects
High-Throughput Nucleotide Sequencing/methods , User-Computer Interface , Computational Biology , Mutation/genetics , Poliovirus/genetics , Poliovirus Vaccines/immunology , Proteomics , Genetic Recombination , Sequence Alignment , Statistics as Topic
7.
Article in English | MEDLINE | ID: mdl-25819073

ABSTRACT

BioXpress is a gene expression and cancer association database in which the expression levels are mapped to genes using RNA-seq data obtained from The Cancer Genome Atlas, International Cancer Genome Consortium, Expression Atlas and publications. The BioXpress database includes expression data from 64 cancer types, 6,361 patients and 17,469 genes, with 9,513 of the genes displaying differential expression between tumor and normal samples. In addition to data directly retrieved from RNA-seq data repositories, manual biocuration of publications supplements the available cancer association annotations in the database. All cancer types are mapped to Disease Ontology terms to facilitate a uniform pan-cancer analysis. The BioXpress database can be searched by HUGO Gene Nomenclature Committee gene symbol or UniProtKB/RefSeq accession or, alternatively, queried by cancer type with specified significance filters. This interface, along with the availability of pre-computed downloadable files containing differentially expressed genes in multiple cancers, enables straightforward retrieval and display of a broad set of cancer-related genes.


Subjects
Genetic Databases , Neoplastic Gene Expression Regulation , Neoplasms , Neoplasm RNA , Humans , Neoplasms/genetics , Neoplasms/metabolism , Neoplasm RNA/biosynthesis , Neoplasm RNA/genetics