Results 1 - 10 of 10
1.
BMC Bioinformatics ; 12: 309, 2011 Jul 28.
Article in English | MEDLINE | ID: mdl-21798039

ABSTRACT

BACKGROUND: Several data mining methods require data that are discrete, and other methods often perform better with discrete data. We introduce an efficient Bayesian discretization (EBD) method for optimal discretization of variables that runs efficiently on high-dimensional biomedical datasets. The EBD method consists of two components, namely, a Bayesian score to evaluate discretizations and a dynamic programming search procedure to efficiently search the space of possible discretizations. We compared the performance of EBD to Fayyad and Irani's (FI) discretization method, which is commonly used for discretization. RESULTS: On 24 biomedical datasets obtained from high-throughput transcriptomic and proteomic studies, the classification performances of the C4.5 classifier and the naïve Bayes classifier were statistically significantly better when the predictor variables were discretized using EBD over FI. EBD was statistically significantly more stable to the variability of the datasets than FI. However, EBD was less robust, though not statistically significantly so, than FI and produced slightly more complex discretizations than FI. CONCLUSIONS: On a range of biomedical datasets, a Bayesian discretization method (EBD) yielded better classification performance and stability but was less robust than the widely used FI discretization method. The EBD discretization method is easy to implement, permits the incorporation of prior knowledge and belief, and is sufficiently fast for application to high-dimensional data.


Subject(s)
Bayes Theorem , Gene Expression Profiling/methods , Proteomics/methods , Algorithms , Data Mining
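The two components named in the abstract above, a Bayesian score for candidate discretizations and a dynamic programming search over them, can be illustrated with a minimal sketch. The abstract does not give the EBD score or its priors, so the code below makes its own assumptions: each interval is scored by the marginal likelihood of its class counts under a uniform Dirichlet prior, and each cut point pays a fixed log-prior penalty (`cut_log_prior`, a hypothetical parameter). It should be read as an example of the general technique, not as the published method.

```python
from math import lgamma, log

def interval_log_score(counts):
    """Log marginal likelihood of the class counts in one interval under a
    uniform Dirichlet prior (an assumed stand-in for the EBD score)."""
    k, n = len(counts), sum(counts)
    return lgamma(k) - lgamma(k + n) + sum(lgamma(c + 1) for c in counts)

def discretize(values, labels, cut_log_prior=log(0.5)):
    """Return (best_log_score, cut_points) for one continuous variable.

    Dynamic programming over the sorted values: best[j] is the best score
    of any discretization of the first j sorted points.
    """
    classes = sorted(set(labels))
    idx = {c: i for i, c in enumerate(classes)}
    pairs = sorted(zip(values, labels))
    n = len(pairs)

    # prefix[i][c] = number of class-c points among the first i sorted points
    prefix = [[0] * len(classes)]
    for _, lab in pairs:
        row = prefix[-1][:]
        row[idx[lab]] += 1
        prefix.append(row)

    def seg(i, j):
        # score of the interval holding sorted points i .. j-1
        return interval_log_score([b - a for a, b in zip(prefix[i], prefix[j])])

    neg_inf = float("-inf")
    best = [neg_inf] * (n + 1)
    back = [0] * (n + 1)
    best[0] = 0.0
    for j in range(1, n + 1):
        # an interval may only end between two distinct values (or at the end)
        if j < n and pairs[j - 1][0] == pairs[j][0]:
            continue
        for i in range(j):
            if best[i] == neg_inf:
                continue
            penalty = cut_log_prior if i > 0 else 0.0  # one penalty per cut
            score = best[i] + seg(i, j) + penalty
            if score > best[j]:
                best[j], back[j] = score, i

    # walk the back-pointers to recover cut points as midpoints
    cuts, j = [], n
    while back[j] > 0:
        i = back[j]
        cuts.append((pairs[i - 1][0] + pairs[i][0]) / 2)
        j = i
    return best[n], sorted(cuts)

# Toy data: a variable that separates the two classes gets one cut point.
_, cuts = discretize([1, 2, 3, 4, 10, 11, 12, 13], [0, 0, 0, 0, 1, 1, 1, 1])
print(cuts)  # [7.0]
```

With this scoring, a variable whose values carry no class information (for example, labels alternating along the sorted values) receives no cut points at all, so discretization can also act as a filter on uninformative variables.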
2.
Bioinformatics ; 26(5): 668-75, 2010 Mar 01.
Article in English | MEDLINE | ID: mdl-20080512

ABSTRACT

MOTIVATION: Disease state prediction from biomarker profiling studies is an important problem because more accurate classification models will potentially lead to the discovery of better, more discriminative markers. Data mining methods are routinely applied to such analyses of biomedical datasets generated from high-throughput 'omic' technologies applied to clinical samples from tissues or bodily fluids. Past work has demonstrated that rule models can be successfully applied to this problem, since they can produce understandable models that facilitate review of discriminative biomarkers by biomedical scientists. While many rule-based methods produce rules that make predictions under uncertainty, they typically do not quantify the uncertainty in the validity of the rule itself. This article describes an approach that uses a Bayesian score to evaluate rule models. RESULTS: We have combined the expressiveness of rules with the mathematical rigor of Bayesian networks (BNs) to develop and evaluate a Bayesian rule learning (BRL) system. This system utilizes a novel variant of the K2 algorithm for building BNs from the training data to provide probabilistic scores for IF-antecedent-THEN-consequent rules using heuristic best-first search. We then apply rule-based inference to evaluate the learned models during 10-fold cross-validation performed two times. The BRL system is evaluated on 24 published 'omic' datasets, and on average it performs on par with or better than other readily available rule learning methods. Moreover, BRL produces models that contain on average 70% fewer variables, which means that the biomarker panels for disease prediction contain fewer markers for further verification and validation by bench scientists.


Subject(s)
Bayes Theorem , Data Mining/methods , Proteomics/methods , Biomarkers/analysis , Proteome/metabolism
3.
JAMIA Open ; 3(2): 306-317, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32734172

ABSTRACT

OBJECTIVES: This manuscript reviews the current state of veterinary medical electronic health records and the ability to aggregate and analyze large datasets from multiple organizations and clinics. We also review analytical techniques as well as research efforts into veterinary informatics with a focus on applications relevant to human and animal medicine. Our goal is to provide references and context for these resources so that researchers can identify resources of interest and translational opportunities to advance the field. METHODS AND RESULTS: This review covers various methods of veterinary informatics including natural language processing and machine learning techniques in brief and various ongoing and future projects. After detailing techniques and sources of data, we describe some of the challenges and opportunities within veterinary informatics as well as providing reviews of common One Health techniques and specific applications that affect both humans and animals. DISCUSSION: Current limitations in the field of veterinary informatics include limited sources of training data for developing machine learning and artificial intelligence algorithms, siloed data between academic institutions, corporate institutions, and many small private practices, and inconsistent data formats that make many integration problems difficult. Despite those limitations, there have been significant advancements in the field in the last few years and continued development of a few, key, large data resources that are available for interested clinicians and researchers. These real-world use cases and applications show current and significant future potential as veterinary informatics grows in importance. Veterinary informatics can forge new possibilities within veterinary medicine and between veterinary medicine, human medicine, and One Health initiatives.

4.
BMC Bioinformatics ; 10 Suppl 9: S16, 2009 Sep 17.
Article in English | MEDLINE | ID: mdl-19761570

ABSTRACT

BACKGROUND: The incorporation of biological knowledge can enhance the analysis of biomedical data. We present a novel method that uses a proteomic knowledge base to enhance the performance of a rule-learning algorithm in identifying putative biomarkers of disease from high-dimensional proteomic mass spectral data. In particular, we use the Empirical Proteomics Ontology Knowledge Base (EPO-KB) that contains previously identified and validated proteomic biomarkers to select m/zs in a proteomic dataset prior to analysis to increase performance. RESULTS: We show that using EPO-KB as a pre-processing method, specifically selecting all biomarkers found only in the biofluid of the proteomic dataset, reduces the dimensionality by 95% and provides a statistically significantly greater increase in performance over no variable selection and random variable selection. CONCLUSION: Knowledge-based variable selection even with a sparsely-populated resource such as the EPO-KB increases overall performance of rule-learning for disease classification from high-dimensional proteomic mass spectra.


Subject(s)
Algorithms , Computational Biology/methods , Proteome/analysis , Proteomics/methods , Pattern Recognition, Automated/methods
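The pre-processing step described in the abstract above, keeping only the m/z features of a dataset that match previously reported biomarkers, can be sketched as a simple tolerance match. The abstract does not say how matches against EPO-KB are decided, so the relative tolerance `rel_tol`, the function name, and the numbers in the usage lines are all hypothetical and purely illustrative.

```python
def select_known_mz(dataset_mz, known_mz, rel_tol=0.005):
    """Return the indices of dataset m/z features that lie within a relative
    tolerance of at least one knowledge-base m/z value.

    rel_tol is an assumed matching rule, not one taken from the abstract.
    """
    keep = []
    for i, mz in enumerate(dataset_mz):
        if any(abs(mz - ref) <= rel_tol * ref for ref in known_mz):
            keep.append(i)
    return keep

# Synthetic example values, chosen only to exercise the function.
dataset_mz = [1000.0, 2000.0, 3005.0, 4000.0]
known_mz = [2001.0, 3000.0]
selected = select_known_mz(dataset_mz, known_mz)
print(selected)  # [1, 2]
```

The returned indices would then be used to subset the columns of the spectral data before rule learning, which is where the reduction in dimensionality comes from.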
5.
Bioinformatics ; 24(11): 1418-9, 2008 Jun 01.
Article in English | MEDLINE | ID: mdl-18400772

ABSTRACT

The knowledge base EPO-KB (Empirical Proteomic Ontology Knowledge Base) is based on an OWL ontology that represents current knowledge linking mass-to-charge (m/z) ratios to proteins on multiple platforms, including Matrix-Assisted Laser Desorption/Ionization (MALDI) and Surface-Enhanced Laser Desorption/Ionization (SELDI) Time-of-Flight (TOF). At present, it contains information on m/z ratio to protein links that were extracted from 120 published research papers. It has a web interface that allows researchers to query and retrieve putative proteins that correspond to a user-specified m/z ratio. EPO-KB also allows automated entry of additional m/z ratio to protein links and can be extended with gene to protein and protein to disease links. AVAILABILITY: http://www.dbmi.pitt.edu/EPO-KB


Subject(s)
Algorithms , Artificial Intelligence , Database Management Systems , Databases, Protein , Information Storage and Retrieval/methods , Pattern Recognition, Automated/methods , Proteins/chemistry , Biomarkers , Software , User-Computer Interface
6.
PLoS One ; 6(11): e26542, 2011.
Article in English | MEDLINE | ID: mdl-22132074

ABSTRACT

Aberrant interactions between the host and the intestinal bacteria are thought to contribute to the pathogenesis of many digestive diseases. However, studying the complex ecosystem at the human mucosal-luminal interface (MLI) is challenging and requires an integrative systems biology approach. Therefore, we developed a novel method integrating lavage sampling of the human mucosal surface, high-throughput proteomics, and a unique suite of bioinformatic and statistical analyses. Shotgun proteomic analysis of secreted proteins recovered from the MLI confirmed the presence of both human and bacterial components. To profile the MLI metaproteome, we collected 205 mucosal lavage samples from 38 healthy subjects, and subjected them to high-throughput proteomics. The spectral data were subjected to a rigorous data processing pipeline to optimize suitability for quantitation and analysis, and then were evaluated using a set of biostatistical tools. Compared to the mucosal transcriptome, the MLI metaproteome was enriched for extracellular proteins involved in response to stimulus and immune system processes. Analysis of the metaproteome revealed significant individual-related as well as anatomic region-related (biogeographic) features. Quantitative shotgun proteomics established the identity and confirmed the biogeographic association of 49 proteins (including 3 functional protein networks) demarcating the proximal and distal colon. This robust and integrated proteomic approach is thus effective for identifying functional features of the human mucosal ecosystem, and offers a fresh understanding of the basic biology and disease processes at the MLI.


Subject(s)
Ecosystem , Intestinal Mucosa/microbiology , Proteomics/methods , Biopsy , Female , Health , Humans , Intestinal Mucosa/pathology , Male , Middle Aged , Molecular Sequence Annotation , Phylogeny , Proteome/genetics , Proteome/metabolism , Reproducibility of Results , Specimen Handling , Transcriptome/genetics
7.
AMIA Annu Symp Proc ; 2009: 406-10, 2009 Nov 14.
Article in English | MEDLINE | ID: mdl-20351889

ABSTRACT

An important step in the analysis of high-dimensional biomedical data is feature selection. Typically, a feature subset selected by a feature selection method is evaluated for relevance towards a task such as prediction or classification. Another important property of a feature selection method is stability, which refers to the robustness of the selected features to perturbations in the data. In biomarker discovery, for example, domain experts prefer a parsimonious subset of features that are relatively robust to slight changes in the data. We present a stability measure called the adjusted stability measure that computes the robustness of a feature selection method with respect to random feature selection. This measure is useful for comparing the robustness of feature selection methods and is superior to similar measures that do not account for random feature selection. We demonstrate the application of this measure on a biomedical dataset.


Subject(s)
Classification/methods , Computational Biology , Databases, Factual , Pattern Recognition, Automated , Bayes Theorem , Humans , Logistic Models , Mathematical Concepts , Neoplasms , Proteomics
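The idea in the abstract above, measuring the overlap between selected feature subsets relative to what random selection would produce, can be illustrated with a chance-corrected overlap score. The abstract does not state the formula of the adjusted stability measure, so the sketch below substitutes Kuncheva's consistency index, a related measure defined for subsets of equal size; it is not claimed to reproduce the paper's definition. Function names are illustrative.

```python
from itertools import combinations

def kuncheva_index(a, b, n_features):
    """Chance-corrected overlap of two equal-size feature subsets.

    Equals 1 for identical subsets and about 0 when the overlap is what
    random selection of k features out of n_features would give on average.
    """
    a, b = set(a), set(b)
    k = len(a)
    if len(b) != k or not 0 < k < n_features:
        raise ValueError("subsets must share a size k with 0 < k < n_features")
    observed = len(a & b)
    expected = k * k / n_features  # mean overlap under random selection
    return (observed - expected) / (k - expected)

def stability(subsets, n_features):
    """Average the pairwise index over all pairs of selected subsets,
    for example the subsets chosen on different resamples of the data."""
    pairs = list(combinations(subsets, 2))
    return sum(kuncheva_index(a, b, n_features) for a, b in pairs) / len(pairs)

# Three runs of a hypothetical selector picking 2 of 10 features.
runs = [{0, 1}, {0, 1}, {0, 2}]
print(round(stability(runs, n_features=10), 4))
```

Subtracting the expected overlap is what separates a measure of this kind from a raw overlap count: without the correction, any method that selects large subsets would look stable simply because large random subsets overlap heavily.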
8.
Arch Dermatol ; 145(11): 1262-6, 2009 Nov.
Article in English | MEDLINE | ID: mdl-19917955

ABSTRACT

OBJECTIVE: To assess the efficacy of alefacept for the treatment of severe alopecia areata (AA). DESIGN: Multicenter, double-blind, randomized, placebo-controlled clinical trial. SETTING: Academic departments of dermatology in the United States. PARTICIPANTS: Forty-five individuals with chronic and severe AA affecting 50% to 95% of the scalp hair and resistant to previous therapies. INTERVENTION: Alefacept, a US Food and Drug Administration-approved T-cell biologic inhibitor for the treatment of moderate to severe plaque psoriasis. MAIN OUTCOME MEASURE: Improved Severity of Alopecia Tool (SALT) score over 24 weeks. RESULTS: Participants receiving alefacept for 12 consecutive weeks demonstrated no statistically significant improvement in AA when compared with a well-matched placebo-receiving group (P = .70). CONCLUSION: Alefacept is ineffective for the treatment of severe AA.


Subject(s)
Alopecia Areata/diagnosis , Alopecia Areata/drug therapy , Dermatologic Agents/administration & dosage , Recombinant Fusion Proteins/administration & dosage , Academic Medical Centers , Adolescent , Adult , Aged , Alefacept , Dose-Response Relationship, Drug , Double-Blind Method , Drug Administration Schedule , Female , Follow-Up Studies , Humans , Injections, Intramuscular , Male , Middle Aged , New York City , Probability , Risk Assessment , Severity of Illness Index , Statistics, Nonparametric , Treatment Outcome , Young Adult
9.
AMIA Annu Symp Proc ; : 445-9, 2008 Nov 06.
Article in English | MEDLINE | ID: mdl-18999186

ABSTRACT

Discretization acts as a variable selection method in addition to transforming the continuous values of a variable into discrete ones. Machine learning algorithms such as Support Vector Machines and Random Forests have been used for classification in high-dimensional genomic and proteomic data due to their robustness to the dimensionality of the data. We show that discretization can significantly improve the classification performance of these algorithms, as well as that of algorithms such as Naïve Bayes that are sensitive to the dimensionality of the data.


Subject(s)
Algorithms , Artificial Intelligence , Database Management Systems , Databases, Factual , Decision Support Techniques , Pattern Recognition, Automated/methods
10.
AMIA Annu Symp Proc ; : 1033, 2008 Nov 06.
Article in English | MEDLINE | ID: mdl-18999243

ABSTRACT

The Empirical Proteomic Ontology Knowledge Base (EPO-KB) is an online database that represents current knowledge of biomarkers and contains associations between mass-to-charge (m/z) ratios of mass-spectrometry peaks and proteins. Such a database is a useful tool for identifying putative proteins associated with an m/z ratio. At present, EPO-KB contains data that have been extracted from 120 published research papers. It has been used in the successful identification of a protein associated with a biomarker.


Subject(s)
Biomarkers/chemistry , Databases, Protein , Information Storage and Retrieval/methods , Natural Language Processing , Peptide Mapping/methods , Proteome/chemistry , Proteome/classification , User-Computer Interface