Results 1 - 7 of 7
1.
Cell Rep Med ; 5(7): 101625, 2024 Jul 16.
Article in English | MEDLINE | ID: mdl-38944038

ABSTRACT

Infrared spectroscopy is a powerful technique for probing the molecular profiles of complex biofluids, offering a promising avenue for high-throughput in vitro diagnostics. While several studies showcased its potential in detecting health conditions, a large-scale analysis of a naturally heterogeneous potential patient population has not been attempted. Using a population-based cohort, here we analyze 5,184 blood plasma samples from 3,169 individuals using Fourier transform infrared (FTIR) spectroscopy. Applying a multi-task classification to distinguish between dyslipidemia, hypertension, prediabetes, type 2 diabetes, and healthy states, we find that the approach can accurately single out healthy individuals and characterize chronic multimorbid states. We further identify the capacity to forecast the development of metabolic syndrome years in advance of onset. Dataset-independent testing confirms the robustness of infrared signatures against variations in sample handling, storage time, and measurement regimes. This study provides the framework that establishes infrared molecular fingerprinting as an efficient modality for populational health diagnostics.


Subjects
Diabetes Mellitus, Type 2; Machine Learning; Phenotype; Humans; Spectroscopy, Fourier Transform Infrared/methods; Female; Male; Diabetes Mellitus, Type 2/diagnosis; Diabetes Mellitus, Type 2/blood; Middle Aged; Adult; Aged; Prediabetic State/diagnosis; Prediabetic State/blood; Metabolic Syndrome/diagnosis; Metabolic Syndrome/blood; Hypertension/diagnosis; Hypertension/blood; Dyslipidemias/diagnosis; Dyslipidemias/blood
2.
Nucleic Acids Res ; 42(Web Server issue): W337-43, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24799431

ABSTRACT

PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein-protein binding sites (ISIS2), protein-polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org.


Subjects
Protein Conformation; Software; Amino Acid Substitution; Binding Sites; Gene Ontology; Internet; Intrinsically Disordered Proteins/chemistry; Membrane Proteins/chemistry; Mutation; Protein Interaction Mapping; Proteins/analysis; Proteins/genetics; Proteins/metabolism; Sequence Alignment; Sequence Analysis, Protein; Sequence Homology, Amino Acid
3.
J Cheminform ; 3: 22, 2011 Jun 27.
Article in English | MEDLINE | ID: mdl-21708012

ABSTRACT

BACKGROUND: We present a machine learning approach to the problem of protein ligand interaction prediction. We focus on a set of binding data obtained from 113 different protein kinases and 20 inhibitors. It was attained through ATP site-dependent binding competition assays and constitutes the first available dataset of this kind. We extract information about the investigated molecules from various data sources to obtain an informative set of features. RESULTS: A Support Vector Machine (SVM) as well as a decision tree algorithm (C5/See5) is used to learn models based on the available features which in turn can be used for the classification of new kinase-inhibitor pair test instances. We evaluate our approach using different feature sets and parameter settings for the employed classifiers. Moreover, the paper introduces a new way of evaluating predictions in such a setting, where different amounts of information about the binding partners can be assumed to be available for training. Results on an external test set are also provided. CONCLUSIONS: In most of the cases, the presented approach clearly outperforms the baseline methods used for comparison. Experimental results indicate that the applied machine learning methods are able to detect a signal in the data and predict binding affinity to some extent. For SVMs, the binding prediction can be improved significantly by using features that describe the active site of a kinase. For C5, besides diversity in the feature set, alignment scores of conserved regions turned out to be very useful.
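The abstract above mentions evaluating predictions under different amounts of available information about the binding partners, but it does not spell out the scheme. One plausible reading, sketched here purely as an assumption, is to group each test kinase-inhibitor pair by whether its kinase, its inhibitor, both, or neither already occur in the training data. The function name, the setting labels, and the toy identifiers are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def split_by_known_partners(train_pairs, test_pairs):
    """Group test (kinase, inhibitor) pairs by which partners occur in training.

    Hypothetical helper: the four setting names are illustrative only.
    """
    seen_kinases = {kinase for kinase, _ in train_pairs}
    seen_inhibitors = {inhibitor for _, inhibitor in train_pairs}

    groups = defaultdict(list)
    for kinase, inhibitor in test_pairs:
        kinase_known = kinase in seen_kinases
        inhibitor_known = inhibitor in seen_inhibitors
        if kinase_known and inhibitor_known:
            setting = "both_known"
        elif kinase_known:
            setting = "new_inhibitor"
        elif inhibitor_known:
            setting = "new_kinase"
        else:
            setting = "both_new"
        groups[setting].append((kinase, inhibitor))
    return dict(groups)


# Toy identifiers, not real kinases or inhibitors.
train = [("K1", "I1"), ("K1", "I2"), ("K2", "I1")]
test = [("K2", "I2"), ("K1", "I3"), ("K3", "I1"), ("K3", "I3")]
groups = split_by_known_partners(train, test)
```

Reporting a classifier's accuracy separately for each group would show how much performance depends on having seen a binding partner before.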

4.
Nucleic Acids Res ; 35(20): e135, 2007.
Article in English | MEDLINE | ID: mdl-17942428

ABSTRACT

Tubulins are still considered typical proteins of eukaryotes. More recently, however, they have been found in the unusual bacteria Prosthecobacter (btubAB). In this study, the genomic organization of the btub-genes and their genomic environment were characterized using the newly developed Two-Step Gene Walking method. In all investigated Prosthecobacters, btubAB are organized in a typical bacterial operon. Strikingly, all btub-operons comprise a third gene with similarities to kinesin light chain sequences. The genomic environments of the characterized btub-operons are always different. This supports the hypothesis that this group of genes represents an independent functional unit, which was acquired by Prosthecobacter via horizontal gene transfer. The newly developed Two-Step Gene Walking method is based on randomly primed polymerase chain reaction (PCR). It presents a simple workflow, which comprises only two major steps: a Walking-PCR with a single specific outward-pointing primer (step 1) and the direct sequencing of its product using a nested specific primer (step 2). Two-Step Gene Walking proved to be highly efficient and was successfully used to characterize over 20 kb of sequence not only in pure culture but even in complex non-pure culture samples.


Subjects
Bacteria/genetics; Chromosome Mapping/methods; Genes, Bacterial; Operon; Tubulin/genetics; Kinesins; Polymerase Chain Reaction
5.
Pac Symp Biocomput ; : 596-607, 2006.
Article in English | MEDLINE | ID: mdl-17094272

ABSTRACT

We address the problem of learning a predictive model for growth inhibition from the NCI DTP human tumor cell line screening data. Extending the classical Quantitative Structure Activity Relationship paradigm, we investigate whether including gene expression data leads to a statistically significant improvement of prediction quality. Our analysis shows that the straightforward approach of including individual gene expression as features does not necessarily improve, but on the contrary, may degrade performance significantly. When gene expression information is aggregated, for instance by features representing the correlation with reference cell lines, performance can be improved significantly. Further improvements may be expected if the learning task is structured by grouping features and instances.


Subjects
Drug Screening Assays, Antitumor/statistics & numerical data; Models, Biological; Pharmacogenetics/statistics & numerical data; Cell Line, Tumor; Computational Biology; Databases, Genetic; Gene Expression; Humans
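The abstract for this record contrasts using individual gene expression values as features with aggregating them, for instance as correlations with reference cell lines. A minimal sketch of that aggregation follows, assuming Pearson correlation as the measure (the abstract does not name one). The function names and the toy profiles are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / (norm_x * norm_y)


def correlation_features(profile, references):
    """Replace a long expression profile by one feature per reference cell line."""
    return [pearson(profile, ref) for ref in references]


# Toy data: two reference cell lines measured on four genes.
references = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]]
profile = [2.0, 4.0, 6.0, 8.0]
features = correlation_features(profile, references)
```

The feature vector has one entry per reference cell line, regardless of how many genes were measured, which keeps the dimensionality of the learning problem small.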
6.
Bioinformatics ; 21 Suppl 2: ii123-9, 2005 Sep 01.
Article in English | MEDLINE | ID: mdl-16204090

ABSTRACT

MOTIVATION: We tackle the problem of finding regularities in microarray data. Various data mining tools, such as clustering, classification, Bayesian networks and association rules, have been applied so far to gain insight into gene-expression data. Association rule mining techniques used so far work on discretizations of the data and cannot account for cumulative effects. In this paper, we investigate the use of quantitative association rules that can operate directly on numeric data and represent cumulative effects of variables. Technically speaking, this type of quantitative association rules based on half-spaces can find non-axis-parallel regularities. RESULTS: We performed a variety of experiments testing the utility of quantitative association rules for microarray data. First of all, the results should be statistically significant and robust against fluctuations in the data. Next, the approach should be scalable in the number of variables, which is important for such high-dimensional data. Finally, the rules should make sense biologically and be sufficiently different from rules found in regular association rule mining working with discretizations. In all of these dimensions, the proposed approach performed satisfactorily. Therefore, quantitative association rules based on half-spaces should be considered as a tool for the analysis of microarray gene-expression data. AVAILABILITY: The code is available from the authors on request.


Subjects
Algorithms; Databases, Genetic; Gene Expression Profiling/methods; Oligonucleotide Array Sequence Analysis/methods; Statistics as Topic
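To make the idea of half-space rules in this record's abstract concrete, the sketch below scores a single rule on numeric data. It assumes, as an illustration only, that the rule body and the rule head are each a linear inequality (weighted sum of variables at or above a threshold); the abstract does not give the exact rule form or the mining algorithm. All names and values are hypothetical.

```python
def in_half_space(sample, weights, threshold):
    """True if the weighted sum of the sample's values reaches the threshold."""
    return sum(w * v for w, v in zip(weights, sample)) >= threshold


def rule_stats(samples, body, head):
    """Support and confidence of the rule 'body half-space => head half-space'."""
    body_weights, body_threshold = body
    head_weights, head_threshold = head
    covered = [s for s in samples
               if in_half_space(s, body_weights, body_threshold)]
    both = [s for s in covered
            if in_half_space(s, head_weights, head_threshold)]
    support = len(both) / len(samples)
    confidence = len(both) / len(covered) if covered else 0.0
    return support, confidence


# Toy expression values for three genes (g1, g2, g3) in four samples.
samples = [
    [1.0, 1.0, 2.0],
    [2.0, 0.5, 3.0],
    [0.2, 0.1, 0.5],
    [1.5, 1.5, 0.4],
]
body = ([1.0, 1.0, 0.0], 2.0)  # g1 + g2 >= 2.0, a cumulative effect of two genes
head = ([0.0, 0.0, 1.0], 1.0)  # g3 >= 1.0
support, confidence = rule_stats(samples, body, head)
```

Because the body weights two genes at once, the boundary it draws is not parallel to either gene's axis, and no discretization of the individual values is required.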
7.
Nucleic Acids Res ; 32(4): 1363-71, 2004.
Article in English | MEDLINE | ID: mdl-14985472

ABSTRACT

The ARB (from Latin arbor, tree) project was initiated almost 10 years ago. The ARB program package comprises a variety of directly interacting software tools for sequence database maintenance and analysis which are controlled by a common graphical user interface. Although it was initially designed for ribosomal RNA data, it can be used for any nucleic and amino acid sequence data as well. A central database contains processed (aligned) primary structure data. Any additional descriptive data can be stored in database fields assigned to the individual sequences or linked via local or worldwide networks. A phylogenetic tree visualized in the main window can be used for data access and visualization. The package comprises additional tools for data import and export, sequence alignment, primary and secondary structure editing, profile and filter calculation, phylogenetic analyses, specific hybridization probe design and evaluation and other components for data analysis. Currently, the package is used by numerous working groups worldwide.


Subjects
Sequence Analysis, DNA; Sequence Analysis, Protein; Sequence Analysis, RNA; Software; Data Display; Databases, Genetic; Internet; Phylogeny; Sequence Alignment; Time Factors