1.
BMC Bioinformatics ; 17: 43, 2016 Jan 20.
Article in English | MEDLINE | ID: mdl-26792120

ABSTRACT

BACKGROUND: Here we introduce the Protein Sequence Annotation Tool (PSAT), a web-based, sequence annotation meta-server for performing integrated, high-throughput, genome-wide sequence analyses. Our goals in building PSAT were to (1) create an extensible platform for integration of multiple sequence-based bioinformatics tools, (2) enable functional annotations and enzyme predictions over large input protein FASTA data sets, and (3) provide a web interface for convenient execution of the tools. RESULTS: In this paper, we demonstrate the utility of PSAT by annotating the predicted peptide gene products of Herbaspirillum sp. strain RV1423, importing the results of PSAT into EC2KEGG, and using the resulting functional comparisons to identify a putative catabolic pathway, thereby distinguishing RV1423 from a well-annotated Herbaspirillum species. This analysis demonstrates that high-throughput enzyme predictions, provided by PSAT processing, can be used to identify metabolic potential in an otherwise poorly annotated genome. CONCLUSIONS: PSAT is a meta-server that combines the results from several sequence-based annotation and function prediction codes, and is available at http://psat.llnl.gov/psat/. PSAT stands apart from other sequence-based genome annotation systems in providing a high-throughput platform for rapid de novo enzyme predictions and sequence annotations over large input protein sequence data sets in FASTA format. PSAT is most appropriately applied in annotation of large protein FASTA sets that may or may not be associated with a single genome.
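As a rough illustration of the kind of input the abstract describes (a protein FASTA set), the following minimal Python sketch reads FASTA-formatted lines into a dictionary keyed by sequence identifier. It is not PSAT code; the function name `read_fasta` and the choice to use the first whitespace-delimited token of each header as the identifier are assumptions made for this example.

```python
from typing import Dict, Iterable, List, Optional


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into {identifier: sequence}.

    Hypothetical helper for illustration only; the identifier is assumed
    to be the first whitespace-delimited token after '>'.
    """
    records: Dict[str, str] = {}
    header: Optional[str] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records[header] = "".join(chunks)
            parts = line[1:].split()
            header = parts[0] if parts else ""
            chunks = []
        elif header is not None:
            # Sequence lines are concatenated until the next header.
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records
```

A call such as `read_fasta(open("proteins.fasta"))` (the file name is a placeholder) would yield one entry per protein, which could then be passed in batches to whatever annotation tools are in use.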


Subjects
Bacterial Genome, Herbaspirillum/genetics, High-Throughput Nucleotide Sequencing/methods, Internet, Molecular Sequence Annotation/methods, Software, Computational Biology/methods, Computers, Water Microbiology
2.
J Autoimmun ; 50: 77-82, 2014 May.
Article in English | MEDLINE | ID: mdl-24387802

ABSTRACT

Previous cross-sectional analyses demonstrated that CD8(+) and CD4(+) T-cell reactivity to islet-specific antigens was more prevalent in T1D subjects than in healthy donors (HD). Here, we examined T1D-associated epitope-specific CD4(+) T-cell cytokine production and autoreactive CD8(+) T-cell frequency on a monthly basis for one year in 10 HD, 33 subjects with T1D, and 15 subjects with T2D. Autoreactive CD4(+) T-cells from both T1D and T2D subjects produced more IFN-γ when stimulated than cells from HD. In contrast, higher frequencies of islet antigen-specific CD8(+) T-cells were detected only in T1D. These observations support the hypothesis that general beta-cell stress drives autoreactive CD4(+) T-cell activity while islet over-expression of MHC class I commonly seen in T1D mediates amplification of CD8(+) T-cells and more rapid beta-cell loss. In conclusion, CD4(+) T-cell autoreactivity appears to be present in both T1D and T2D while autoreactive CD8(+) T-cells are unique to T1D. Thus, autoreactive CD8(+) cells may serve as a more T1D-specific biomarker.


Subjects
Autoantigens/immunology, CD4-Positive T-Lymphocytes/immunology, CD8-Positive T-Lymphocytes/immunology, Type 1 Diabetes Mellitus/immunology, Type 2 Diabetes Mellitus/immunology, Islets of Langerhans/immunology, Adult, Aged, CD4-Positive T-Lymphocytes/pathology, CD8-Positive T-Lymphocytes/pathology, Case-Control Studies, Immunologic Cytotoxicity, Type 1 Diabetes Mellitus/pathology, Type 2 Diabetes Mellitus/pathology, ELISPOT, Female, Humans, Interferon-gamma/biosynthesis, Islets of Langerhans/pathology, Longitudinal Studies, Male, Middle Aged
3.
BMC Bioinformatics ; 13: 321, 2012 Dec 02.
Article in English | MEDLINE | ID: mdl-23198735

ABSTRACT

BACKGROUND: Methods of weakening and attenuating pathogens' abilities to infect and propagate in a host, thus allowing the natural immune system to more easily decimate invaders, have gained attention as alternatives to broad-spectrum targeting approaches. The following work describes a technique for identifying proteins involved in virulence by relying on latent information computationally gathered across biological repositories, applicable to both generic and specific virulence categories. RESULTS: A lightweight method for data integration is used, which links information regarding a protein via a path-based query graph. A method of weighting is then applied to query graphs that can serve as input to various statistical classification methods for discrimination, and the combined usage of data integration and learning methods is tested against the problem of both generalized and specific virulence function prediction. CONCLUSIONS: This approach improves coverage of functional data over a protein. Moreover, while depending largely on noisy and potentially non-curated data from public sources, we find it outperforms other techniques for identification of general virulence factors and baseline remote homology detection methods for specific virulence categories.


Subjects
Proteins/classification, Protein Sequence Analysis/methods, Protein Sequence Analysis/statistics & numerical data, Virulence Factors/classification, Statistical Data Interpretation, Protein Databases, Proteins/chemistry, Virulence, Virulence Factors/chemistry
4.
J Biomed Inform ; 43(6): 873-82, 2010 Dec.
Article in English | MEDLINE | ID: mdl-20643225

ABSTRACT

Though there have been many advances in providing access to linked and integrated biomedical data across repositories, developing methods which allow users to specify ambiguous and exploratory queries over disparate sources remains a challenge to extracting well-curated or diversely-supported biological information. In the following work, we discuss the concepts of data coverage and evidence in the context of integrated sources. We address diverse information retrieval via a simple framework for representing coverage and evidence that operates in parallel with an arbitrary schema, and a language upon which queries on the schema and framework may be executed. We show that this approach is capable of answering questions that require ranged levels of evidence or triangulation, and demonstrate that appropriately-formed queries can significantly improve the level of precision when retrieving well-supported biomedical data.


Subjects
Factual Databases, Information Storage and Retrieval/methods, Biomedical Research, Internet, Semantics
5.
J Biomed Inform ; 43(3): 407-18, 2010 Jun.
Article in English | MEDLINE | ID: mdl-20015478

ABSTRACT

Genome-wide association studies (GWAS) are an important approach to understanding the genetic mechanisms behind human diseases. Single nucleotide polymorphisms (SNPs) are the predominant markers used in genome-wide association studies, and the ability to predict which SNPs are likely to be functional is important for both a priori and a posteriori analyses of GWA studies. This article describes the design, implementation and evaluation of a family of systems for the purpose of identifying SNPs that may cause a change in phenotypic outcomes. The methods described in this article characterize the feasibility of combinations of logical and probabilistic inference with federated data integration for both point and regional SNP annotation and analysis. Evaluations of the methods demonstrate the overall strong predictive value of logical, and logical with probabilistic, inference applied to the domain of SNP annotation.


Subjects
Statistical Models, Single Nucleotide Polymorphism, Genetic Databases, Genome-Wide Association Study/methods, Logic
6.
BMC Res Notes ; 5: 96, 2012 Feb 14.
Article in English | MEDLINE | ID: mdl-22333139

ABSTRACT

BACKGROUND: Genes conferring antibiotic resistance to groups of bacterial pathogens are cause for considerable concern, as many once-reliable antibiotics continue to see a reduction in efficacy. The recent discovery of the metallo β-lactamase blaNDM-1 gene, which appears to grant antibiotic resistance to a variety of Enterobacteriaceae via a mobile plasmid, is one example of this distressing trend. The following work describes a computational analysis of pathogen-borne MBLs that focuses on the structural aspects of characterized proteins. RESULTS: Using both sequence and structural analyses, we examine residues and structural features specific to various pathogen-borne MBL types. This analysis identifies a linker region within MBL-like folds that may act as a discriminating structural feature between these proteins, and specifically resistance-associated acquirable MBLs. Recently released crystal structures of the newly emerged NDM-1 protein were aligned against related MBL structures using a variety of global and local structural alignment methods, and the overall fold conformation is examined for structural conservation. Conservation appears to be present in most areas of the protein, yet is strikingly absent within a linker region, making NDM-1 unique with respect to a linker-based classification scheme. Variability analysis of the NDM-1 crystal structure highlights unique residues in key regions as well as identifying several characteristics shared with other transferable MBLs. CONCLUSIONS: A discriminating linker region identified in MBL proteins is highlighted and examined in the context of NDM-1 and primarily three other MBL types: IMP-1, VIM-2 and ccrA. The presence of an unusual linker region variant and uncommon amino acid composition at specific structurally important sites may help to explain the unusually broad kinetic profile of NDM-1 and may aid in directing research attention to areas of this protein, and possibly other MBLs, that may be targeted for inactivation or attenuation of enzymatic activity.

7.
J Biomed Semantics ; 2 Suppl 3: S2, 2011.
Article in English | MEDLINE | ID: mdl-21992591

ABSTRACT

BACKGROUND: Extracting medication information from clinical records has many potential applications, and recently published research, systems, and competitions reflect an interest therein. Much of the early extraction work involved rules and lexicons, but more recently machine learning has been applied to the task. METHODS: We present a hybrid system consisting of two parts. The first part, field detection, uses a cascade of statistical classifiers to identify medication-related named entities. The second part uses simple heuristics to link those entities into medication events. RESULTS: The system achieved performance that is comparable to other approaches to the same task. This performance is further improved by adding features that reference external medication name lists. CONCLUSIONS: This study demonstrates that our hybrid approach outperforms purely statistical or rule-based systems. The study also shows that a cascade of classifiers works better than a single classifier in extracting medication information. The system is available as is upon request from the first author.

8.
J Am Med Inform Assoc ; 17(5): 514-8, 2010.
Article in English | MEDLINE | ID: mdl-20819854

ABSTRACT

The Third i2b2 Workshop on Natural Language Processing Challenges for Clinical Records focused on the identification of medications, their dosages, modes (routes) of administration, frequencies, durations, and reasons for administration in discharge summaries. This challenge is referred to as the medication challenge. For the medication challenge, i2b2 released detailed annotation guidelines along with a set of annotated discharge summaries. Twenty teams representing 23 organizations and nine countries participated in the medication challenge. The teams produced rule-based, machine learning, and hybrid systems targeted to the task. Although rule-based systems dominated the top 10, the best performing system was a hybrid. Of all medication-related fields, durations and reasons were the most difficult for all systems to detect. While medications themselves were identified with better than 0.75 F-measure by all of the top 10 systems, the best F-measures for durations and reasons were 0.525 and 0.459, respectively. State-of-the-art natural language processing systems go a long way toward extracting medication names, dosages, modes, and frequencies. However, they are limited in recognizing duration and reason fields and would benefit from future research.
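For readers unfamiliar with the F-measure quoted above, the following Python sketch shows the standard definition: the weighted harmonic mean of precision and recall, computed here from counts of true positives, false positives, and false negatives. The function names are chosen for this example, and the sketch does not reproduce the challenge's own scoring rules (for instance, how partially matching spans were counted), which the abstract does not state.

```python
from typing import Tuple


def precision_recall(true_positives: int,
                     false_positives: int,
                     false_negatives: int) -> Tuple[float, float]:
    """Return (precision, recall) from raw counts; 0.0 when undefined."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

For example, a system that finds 3 correct fields, reports 1 spurious field, and misses 3 has precision 0.75 and recall 0.5, so `f_measure(0.75, 0.5)` gives 0.6.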


Subjects
Electronic Health Records, Information Storage and Retrieval/methods, Natural Language Processing, Pharmaceutical Preparations, Hybrid Computers, Humans, Patient Dropouts
9.
J Am Med Inform Assoc ; 17(5): 519-23, 2010.
Article in English | MEDLINE | ID: mdl-20819855

ABSTRACT

OBJECTIVE: Within the context of the Third i2b2 Workshop on Natural Language Processing Challenges for Clinical Records, the authors (also referred to as 'the i2b2 medication challenge team' or 'the i2b2 team' for short) organized a community annotation experiment. DESIGN: For this experiment, the authors released annotation guidelines and a small set of annotated discharge summaries. They asked the participants of the Third i2b2 Workshop to annotate 10 discharge summaries per person; each discharge summary was annotated by two annotators from two different teams, and a third annotator from a third team resolved disagreements. MEASUREMENTS: In order to evaluate the reliability of the annotations thus produced, the authors measured community inter-annotator agreement and compared it with the inter-annotator agreement of expert annotators when both the community and the expert annotators generated ground truth based on pooled system outputs. For this purpose, the pool consisted of the three most densely populated automatic annotations of each record. The authors also compared the community inter-annotator agreement with expert inter-annotator agreement when the experts annotated raw records without using the pool. Finally, they measured the quality of the community ground truth by comparing it with the expert ground truth. RESULTS AND CONCLUSIONS: The authors found that the community annotators achieved comparable inter-annotator agreement to expert annotators, regardless of whether the experts annotated from the pool. Furthermore, the ground truth generated by the community obtained F-measures above 0.90 against the ground truth of the experts, indicating the value of the community as a source of high-quality ground truth even on intricate and domain-specific annotation tasks.
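One simple way to compare two sets of annotations, such as a community ground truth against an expert ground truth, is an exact-match F-measure over the annotated spans. The Python sketch below illustrates that idea only. The tuple layout `(record id, start offset, end offset, field)` and the function name `agreement_f1` are hypothetical, and the abstract does not specify the matching criteria the authors actually used.

```python
from typing import Set, Tuple

# Hypothetical annotation layout: (record id, start offset, end offset, field)
Annotation = Tuple[str, int, int, str]


def agreement_f1(a: Set[Annotation], b: Set[Annotation]) -> float:
    """Exact-match F1 between two annotation sets.

    With exact matching, F1 is symmetric in a and b:
    2 * |a & b| / (|a| + |b|). Two empty sets are treated as
    full agreement (an assumption of this sketch).
    """
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))
```

Under this definition, two annotators who each mark three spans and share two of them agree at 2 * 2 / (3 + 3), roughly 0.67.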


Subjects
Electronic Health Records, Information Storage and Retrieval/methods, Natural Language Processing, Pharmaceutical Preparations, Humans, Patient Discharge
10.
AMIA Annu Symp Proc ; : 889, 2008 Nov 06.
Article in English | MEDLINE | ID: mdl-18999001

ABSTRACT

In the following work, we test a generalized approach to integrating, transforming, and learning from data drawn from disparate sources for the classification of bacterial proteins involved in pathogenesis. We rely on the implicit inter-linkages between biological databases to draw relevant records, and leverage statistical learning methods to infer classification based on abundant, albeit noisy, data. Results suggest that types of public biological information have varying degrees of effectiveness in predictive data mining.


Subjects
Artificial Intelligence, Bacterial Proteins/classification, Bacterial Toxins/classification, Protein Databases, Automated Pattern Recognition/methods, Terminology as Topic, Virulence Factors/classification, Algorithms, Information Storage and Retrieval/methods, Natural Language Processing