1.
Adv Data Anal Classif ; 16(3): 691-723, 2022.
Article in English | MEDLINE | ID: mdl-36043219

ABSTRACT

A probabilistic model for random hypergraphs is introduced to represent unary, binary and higher order interactions among objects in real-world problems. This model is an extension of the latent class analysis model that introduces two clustering structures for hyperedges and captures variation in the size of hyperedges. An expectation maximization algorithm with minorization maximization steps is developed to perform parameter estimation. Model selection using Bayesian Information Criterion is proposed. The model is applied to simulated data and two real-world data sets where interesting results are obtained.
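The abstract above proposes model selection by the Bayesian Information Criterion. As a minimal sketch of that step, assuming the common convention BIC = k ln(n) - 2 ln(L) in which smaller is better (some software reports the negated form and maximizes it), the following compares candidate models. The function names, the candidate labels, and the log-likelihood values are hypothetical placeholders, not results from the paper.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """BIC under the 'smaller is better' convention: k*ln(n) - 2*ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def select_by_bic(candidates, n_obs):
    """Return the name of the lowest-BIC model and all BIC scores.

    candidates maps a model name to (maximized log-likelihood, parameter count).
    """
    scored = {name: bic(ll, k, n_obs) for name, (ll, k) in candidates.items()}
    best = min(scored, key=scored.get)
    return best, scored


# Hypothetical fitted models (illustrative numbers only).
candidates = {
    "G=1": (-520.0, 2),
    "G=2": (-480.0, 5),
    "G=3": (-478.0, 8),
}
best, scores = select_by_bic(candidates, n_obs=200)
```

In this illustration the three-cluster model has the highest log-likelihood, but the penalty on its extra parameters outweighs the gain, so the two-cluster model is preferred.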

2.
Adv Data Anal Classif ; 16(1): 55-92, 2022.
Article in English | MEDLINE | ID: mdl-35308632

ABSTRACT

In supervised classification problems, the test set may contain data points belonging to classes not observed in the learning phase. Moreover, the same units in the test data may be measured on a set of additional variables recorded at a subsequent stage with respect to when the learning sample was collected. In this situation, the classifier built in the learning phase needs to adapt to handle potential unknown classes and the extra dimensions. We introduce a model-based discriminant approach, Dimension-Adaptive Mixture Discriminant Analysis (D-AMDA), which can detect unobserved classes and adapt to the increasing dimensionality. Model estimation is carried out via a full inductive approach based on an EM algorithm. The method is then embedded in a more general framework for adaptive variable selection and classification suitable for data of large dimensions. A simulation study and an artificial experiment related to classification of adulterated honey samples are used to validate the ability of the proposed framework to deal with complex situations.
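Several abstracts in this list, including the one above, estimate mixture models with an EM algorithm. The sketch below is not D-AMDA; it is a plain EM loop for a one-dimensional Gaussian mixture, included only to show the alternation between the E-step (responsibilities) and the M-step (weighted parameter updates). The function names, starting values, and toy data are assumptions made for illustration.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, means, variances, weights, n_iter=50):
    """Run EM for a 1-D Gaussian mixture; return parameters and log-likelihood trace."""
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    history = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        loglik = 0.0
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            loglik += math.log(total)
            resp.append([p / total for p in joint])
        history.append(loglik)  # log-likelihood at the current parameters
        # M-step: responsibility-weighted updates of weights, means, variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
    return means, variances, weights, history


# Toy data with two well-separated groups (illustrative only).
toy = [0.1, -0.2, 0.3, 0.0, -0.1, 5.1, 4.8, 5.2, 5.0, 4.9]
fit_means, fit_vars, fit_weights, trace = em_gmm_1d(
    toy, means=[0.0, 4.0], variances=[1.0, 1.0], weights=[0.5, 0.5]
)
```

The log-likelihood trace should never decrease from one iteration to the next, which is a useful sanity check on any EM implementation. This sketch has no safeguard against a component variance collapsing to zero, so it is only suitable for well-behaved data.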

3.
Stat Methods Appt ; 30(5): 1365-1398, 2021.
Article in English | MEDLINE | ID: mdl-34840548

ABSTRACT

We propose a weighted stochastic block model (WSBM) which extends the stochastic block model to the important case in which edges are weighted. We address the parameter estimation of the WSBM by use of maximum likelihood and variational approaches, and establish the consistency of these estimators. The problem of choosing the number of classes in a WSBM is addressed. The proposed model is applied to simulated data and an illustrative data set.
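To make the idea of a weighted stochastic block model concrete, here is a small generative sketch. It assumes an undirected graph without self-loops in which an edge between blocks a and b appears with probability edge_prob[a][b] and, when present, carries a Gaussian weight with mean weight_mean[a][b]. The abstract does not state the weight distribution, so the Gaussian choice and every name and parameter below are assumptions for illustration, not the authors' specification.

```python
import random


def simulate_wsbm(block_sizes, edge_prob, weight_mean, weight_sd=1.0, seed=0):
    """Simulate block labels and a symmetric weight matrix (0.0 means no edge)."""
    rng = random.Random(seed)
    # Node labels: the first block_sizes[0] nodes are in block 0, and so on.
    labels = [b for b, size in enumerate(block_sizes) for _ in range(size)]
    n = len(labels)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a, b = labels[i], labels[j]
            # Edge presence depends only on the blocks of the two endpoints.
            if rng.random() < edge_prob[a][b]:
                w = rng.gauss(weight_mean[a][b], weight_sd)
                weights[i][j] = w
                weights[j][i] = w
    return labels, weights


# Two blocks: dense, heavily weighted within blocks; sparse and light between.
sim_labels, sim_weights = simulate_wsbm(
    block_sizes=[3, 2],
    edge_prob=[[0.9, 0.1], [0.1, 0.9]],
    weight_mean=[[5.0, 1.0], [1.0, 5.0]],
)
```

Estimation runs in the opposite direction: given only the weight matrix, the block labels are latent and must be inferred, which is where the maximum likelihood and variational approaches mentioned in the abstract come in.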

4.
Anal Chim Acta ; 1153: 338245, 2021 Apr 08.
Article in English | MEDLINE | ID: mdl-33714445

ABSTRACT

Classification of high-dimensional spectroscopic data is a common task in analytical chemistry. Well-established procedures like support vector machines (SVMs) and partial least squares discriminant analysis (PLS-DA) are the most common methods for tackling this supervised learning problem. Nonetheless, interpretation of these models sometimes remains difficult, and solutions based on feature selection are often adopted as they lead to the automatic identification of the most informative wavelengths. Unfortunately, for some delicate applications like food authenticity, mislabeled and adulterated spectra occur in the calibration and/or validation sets, with dramatic effects on the model development, its prediction accuracy and robustness. Motivated by these issues, the present paper proposes a robust model-based method that simultaneously performs variable selection, outlier detection and label noise detection. We demonstrate the effectiveness of our proposal in dealing with three agri-food spectroscopic studies, where several forms of perturbations are considered. Our approach succeeds in diminishing problem complexity, identifying anomalous spectra and attaining competitive predictive accuracy considering a very low number of selected wavelengths.

5.
Sci Rep ; 11(1): 2525, 2021 01 28.
Article in English | MEDLINE | ID: mdl-33510263

ABSTRACT

Improved prostate cancer detection methods would avoid over-diagnosis of clinically indolent disease, informing appropriate treatment decisions. The aim of this study was to investigate the role of a panel of inflammation biomarkers in informing the need for a biopsy to diagnose prostate cancer. Peripheral blood serum obtained from 436 men undergoing transrectal ultrasound guided biopsy was assessed for a panel of 18 inflammatory serum biomarkers in addition to Total and Free Prostate Specific Antigen (PSA). This panel was integrated into a previously developed Irish clinical risk calculator (IPRC) for the detection of prostate cancer and high-grade prostate cancer (Gleason Score ≥ 7). Using logistic regression and multinomial regression methods, two models (Logst-RC and Multi-RC) were developed considering linear and nonlinear effects of the panel in conjunction with clinical and demographic parameters for determination of the two endpoints. Both models significantly improved the predictive ability of the clinical model for detection of prostate cancer (from 0.656 to 0.731 for Logst-RC and 0.713 for Multi-RC) and high-grade prostate cancer (from 0.716 to 0.785 for Logst-RC and 0.767 for Multi-RC) and demonstrated higher clinical net benefit. This improved discriminatory power and clinical utility may allow for individualised risk stratification, improving clinical decision making.


Subject(s)
Biomarkers/blood; Inflammation Mediators/blood; Prostatic Neoplasms/blood; Prostatic Neoplasms/diagnosis; Adult; Aged; Aged, 80 and over; Biopsy; Early Detection of Cancer; Humans; Liquid Biopsy; Male; Middle Aged; Neoplasm Grading; Neoplasm Staging; Prostatic Neoplasms/epidemiology; ROC Curve; Risk Assessment; Risk Factors
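Risk calculators such as the one in this record are commonly compared by discrimination, and the subject terms list the ROC curve. As a minimal sketch, assuming binary outcomes coded 0/1 and a continuous predicted risk, the area under the ROC curve can be computed as the proportion of (case, non-case) pairs in which the case receives the higher score, with ties counted as one half. The function name and the toy inputs are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # case ranked above non-case
            elif p == q:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy predicted risks and outcomes (illustrative only).
example_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

The pairwise form is quadratic in the sample size, which is fine for a few hundred patients; a rank-based implementation would scale better for large cohorts.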
6.
J Comput Graph Stat ; 28(1): 185-196, 2019.
Article in English | MEDLINE | ID: mdl-31447541

ABSTRACT

Many existing statistical and machine learning tools for social network analysis focus on a single level of analysis. Methods designed for clustering optimize a global partition of the graph, whereas projection-based approaches (e.g., the latent space model in the statistics literature) represent in rich detail the roles of individuals. Many pertinent questions in sociology and economics, however, span multiple scales of analysis. Further, many questions involve comparisons across disconnected graphs that will inevitably be of different sizes, either due to missing data or the inherent heterogeneity in real-world networks. We propose a class of network models that represent network structure on multiple scales and facilitate comparison across graphs with different numbers of individuals. These models differentially invest modeling effort within subgraphs of high density, often termed communities, while maintaining a parsimonious structure between said subgraphs. We show that our model class is projective, highlighting an ongoing discussion in the social network modeling literature on the dependence of inference paradigms on the size of the observed graph. We illustrate the utility of our method using data on household relations from Karnataka, India. Supplementary material for this article is available online.

7.
Prostate ; 78(10): 724-730, 2018 07.
Article in English | MEDLINE | ID: mdl-29608018

ABSTRACT

BACKGROUND: Up to a third of prostate cancer patients fail curative treatment strategies such as surgery and radiation therapy in the form of biochemical recurrence (BCR), which can be predictive of poor outcome. Recent clinical trials have shown that men experiencing BCR might benefit from earlier intervention post-radical prostatectomy (RP). Therefore, there is an urgent need to identify earlier prognostic biomarkers which will guide clinicians in making accurate diagnoses and timely decisions on the next appropriate treatment. The objective of this study was to evaluate Serum Response Factor (SRF) protein expression following RP and to investigate its association with BCR. MATERIALS AND METHODS: SRF nuclear expression was evaluated by immunohistochemistry (IHC) in TMAs across three international radical prostatectomy cohorts for a total of 615 patients. Log-rank test and Kaplan-Meier analyses were used for BCR comparisons. Stepwise backwards elimination proportional hazard regression analysis was used to explore the significance of SRF in predicting BCR in the context of other clinical pathological variables. Area under the curve (AUC) values were generated by simulating repeated random sub-samples. RESULTS: Analysis of the immunohistochemical staining of benign versus cancer cores showed higher nuclear SRF protein expression in cancer cores compared with benign cores for all three TMAs analysed (P < 0.001, n = 615). Kaplan-Meier curves of the three TMAs combined showed that patients with higher SRF nuclear expression had a shorter time to BCR compared with patients with lower SRF expression (P < 0.001, n = 215). Together with pathological T stage T3, SRF was identified as a predictor of BCR using stepwise backwards elimination proportional hazard regression analysis (P = 0.0521). Moreover, ROC curves and AUC values showed that SRF was better than T stage in predicting BCR at years 3 and 5 following radical prostatectomy; the combination of SRF and T stage had a higher AUC value than the two taken separately. CONCLUSIONS: SRF assessment by IHC following RP could be useful in guiding clinicians to better identify patients for appropriate follow-up and timely treatment.


Subject(s)
Neoplasm Recurrence, Local/metabolism; Prostate/metabolism; Prostatic Neoplasms/metabolism; Prostatic Neoplasms/surgery; Serum Response Factor/biosynthesis; Aged; Humans; Immunochemistry; Male; Middle Aged; Neoplasm Recurrence, Local/blood; Neoplasm Recurrence, Local/pathology; Prostate/surgery; Prostatic Neoplasms/blood; Prostatic Neoplasms/pathology; Serum Response Factor/blood; Survival Analysis
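This record relies on Kaplan-Meier curves for time to biochemical recurrence. As a minimal sketch of the product-limit estimator, assuming each patient contributes a follow-up time and an event indicator (1 for recurrence observed, 0 for censored), the survival estimate is multiplied by (1 - d/n) at each time with d events among n patients still at risk. The function name and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival estimate)] at each time where an event occurs."""
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Events and total departures (events plus censorings) at time t.
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Everyone observed at t leaves the risk set afterwards.
        at_risk -= leaving
    return curve


# Toy follow-up in years; the second patient at t=2 is censored.
km_curve = kaplan_meier([1, 2, 2, 3], [1, 1, 0, 1])
```

Comparing two such curves, for example high versus low SRF expression, would additionally need a test such as the log-rank test, which this sketch does not implement.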
8.
Wellcome Open Res ; 3: 5, 2018.
Article in English | MEDLINE | ID: mdl-29503875

ABSTRACT

Tracing the fate of stable isotopically-enriched nutrients is a sophisticated method of describing and quantifying the activity of metabolic pathways. Nuclear Magnetic Resonance (NMR) offers high resolution data, yet is under-utilised due to the length of time required to collect the data, the need for multiple samples for quantification, and complicated analysis. Here we present two techniques, quantitative spectral filters and enhancement of the splitting due to J-coupling in 1H, 13C-HSQC NMR spectra, which allow the rapid collection of NMR data in a quantitative manner on a single sample. The reduced duration of HSQC spectra data acquisition opens up the possibility of real-time tracing of metabolism including the study of metabolic pathways in vivo. We show how these novel techniques can be used to trace the fate of labelled nutrients in a whole organ model of kidney preservation prior to transplantation using a porcine kidney as a model organ, and also show how the use of multiple nutrients, differentially labelled with 13C and 15N, can be used to provide additional information with which to profile metabolic pathways.

9.
J Comput Graph Stat ; 24(2): 520-538, 2015 Jun 01.
Article in English | MEDLINE | ID: mdl-26101465

ABSTRACT

A novel and flexible framework for investigating the roles of actors within a network is introduced. Particular interest is in roles as defined by local network connectivity patterns, identified using the ego-networks extracted from the network. A mixture of Exponential-family Random Graph Models is developed for these ego-networks in order to cluster the nodes into roles. We refer to this model as the ego-ERGM. An Expectation-Maximization algorithm is developed to infer the unobserved cluster assignments and to estimate the mixture model parameters using a maximum pseudo-likelihood approximation. The flexibility and utility of the method are demonstrated on examples of simulated and real networks.

10.
BMC Med Inform Decis Mak ; 13: 126, 2013 Nov 15.
Article in English | MEDLINE | ID: mdl-24238348

ABSTRACT

BACKGROUND: There are dilemmas associated with the diagnosis and prognosis of prostate cancer which have led to over-diagnosis and over-treatment. Prediction tools have been developed to assist the treatment of the disease. METHODS: A retrospective review was performed of the Irish Prostate Cancer Research Consortium database and 603 patients were used in the study. Statistical models based on routinely used clinical variables were built using logistic regression, random forests and k nearest neighbours to predict prostate cancer stage. The predictive ability of the models was examined using discrimination metrics, calibration curves and clinical relevance, explored using decision curve analysis. The 2007 Partin table was then applied to the N = 603 patients to compare the predictions from the current gold standard in staging prediction to the models developed in this study. RESULTS: 30% of the study cohort had non-organ-confined disease. The model built using logistic regression illustrated the highest discrimination metrics (AUC = 0.622, Sens = 0.647, Spec = 0.601), best calibration and the most clinical relevance based on decision curve analysis. This model also achieved higher discrimination than the 2007 Partin table (ECE AUC = 0.572 & 0.509 for T1c and T2a respectively). However, even the best statistical model does not accurately predict prostate cancer stage. CONCLUSIONS: This study has illustrated the inability of the current clinical variables and the 2007 Partin table to accurately predict prostate cancer stage. New biomarker features are urgently required to address the problem clinicians face in identifying the most appropriate treatment for their patients. This paper also demonstrated a concise methodological approach to evaluate novel features or prediction models.


Subject(s)
Models, Statistical; Neoplasm Staging/standards; Prognosis; Prostatic Neoplasms; Adult; Aged; Calibration/standards; Databases, Factual/statistics & numerical data; Humans; Ireland; Male; Middle Aged; Predictive Value of Tests; Prostatic Neoplasms/diagnosis; Prostatic Neoplasms/pathology; Reproducibility of Results; Retrospective Studies; Sensitivity and Specificity
13.
J Med Internet Res ; 13(1): e14, 2011 Jan 31.
Article in English | MEDLINE | ID: mdl-21282098

ABSTRACT

The Internet has become an important health information resource for patients and the general public. Wikipedia, a collaboratively written Web-based encyclopedia, has become the dominant online reference work. It is usually among the top results of search engine queries, including when medical information is sought. Since April 2004, editors have formed a group called WikiProject Medicine to coordinate and discuss the English-language Wikipedia's medical content. This paper, written by members of the WikiProject Medicine, discusses the intricacies, strengths, and weaknesses of Wikipedia as a source of health information and compares it with other medical wikis. Medical professionals, their societies, patient groups, and institutions can help improve Wikipedia's health-related entries. Several examples of partnerships already show that there is enthusiasm to strengthen Wikipedia's biomedical content. Given its unique global reach, we believe its possibilities for use as a tool for worldwide health promotion are underestimated. We invite the medical community to join in editing Wikipedia, with the goal of providing people with free access to reliable, understandable, and up-to-date health information.


Subject(s)
Consumer Health Information; Encyclopedias as Topic; Global Health; Health Promotion/methods; Internet; Public Health; Humans; Information Dissemination; Information Services; Patient Education as Topic
14.
Stroke ; 42(3): 681-6, 2011 Mar.
Article in English | MEDLINE | ID: mdl-21233462

ABSTRACT

BACKGROUND AND PURPOSE: Additional exercise therapy has been shown to have a positive impact on function after acute stroke and research is now focusing on methods to increase the amount of therapy that is delivered. This randomized controlled trial examined the impact of additional family-mediated exercise (FAME) therapy on outcome after acute stroke. METHODS: Forty participants with acute stroke were randomly assigned to either a control group who received routine therapy with no formal input from their family members or a FAME group, who received routine therapy and additional lower limb FAME therapy for 8 weeks. The primary outcome measure used was the lower limb section of the Fugl-Meyer Assessment modified by Lindmark. Other measures of impairment, activity, and participation were completed at baseline, postintervention, and at a 3-month follow-up. RESULTS: Statistically significant differences in favor of the FAME group were noted on all measures of impairment and activity postintervention (P<0.05). These improvements persisted at the 3-month follow-up but only walking was statistically significant (P<0.05). Participants in the FAME group were also significantly more integrated into their community at follow-up (P<0.05). Family members in the FAME group reported a significant decrease in their levels of caregiver strain at the follow-up when compared with those in the control group (P<0.01). CONCLUSIONS: This evidence-based FAME intervention can serve to optimize patient recovery and family involvement after acute stroke at the same time as being mindful of available resources.


Subject(s)
Caregivers/standards; Exercise Therapy/methods; Exercise Therapy/standards; Recovery of Function/physiology; Stroke Rehabilitation; Stroke/physiopathology; Aged; Aged, 80 and over; Cohort Studies; Female; Follow-Up Studies; Humans; Male; Middle Aged; Time Factors; Treatment Outcome
15.
J Proteome Res ; 10(3): 1361-73, 2011 Mar 04.
Article in English | MEDLINE | ID: mdl-21166384

ABSTRACT

In recent years, Prostate Specific Antigen (PSA) testing has become widespread and has been associated with decreased mortality rates; however, this testing has raised concerns of overdiagnosis and overtreatment. It is clear that additional biomarkers are required. To identify these biomarkers, we have undertaken proteomics and metabolomics expression profiles of serum samples from BPH, Gleason score 5 and 7 using two-dimensional difference in gel electrophoresis (2D-DIGE) and nuclear magnetic resonance spectroscopy (NMR). Panels of serum protein biomarkers were identified by applying Random Forests to the 2D-DIGE data. The evaluation of selected biomarker panels has shown that they can provide higher prediction accuracy than the current diagnostic standard. With careful validation, these serum biomarker panels may potentially help to reduce unnecessary invasive diagnostic procedures and more accurately direct the urologist to curative surgery.


Subject(s)
Biomarkers, Tumor/analysis; Biomarkers, Tumor/blood; Prostatic Neoplasms/blood; Prostatic Neoplasms/diagnosis; Prostatic Neoplasms/pathology; Two-Dimensional Difference Gel Electrophoresis/methods; Area Under Curve; Cluster Analysis; Humans; Male; Mass Spectrometry/methods; Neoplasm Staging; Reproducibility of Results
17.
Ann Appl Stat ; 4(1): 396-421, 2010 Mar 01.
Article in English | MEDLINE | ID: mdl-20936055

ABSTRACT

Food authenticity studies are concerned with determining if food samples have been correctly labelled or not. Discriminant analysis methods are an integral part of the methodology for food authentication. Motivated by food authenticity applications, a model-based discriminant analysis method that includes variable selection is presented. The discriminant analysis model is fitted in a semi-supervised manner using both labeled and unlabeled data. The method is shown to give excellent classification performance on several high-dimensional multiclass food authenticity datasets with more variables than observations. The variables selected by the proposed method provide information about which variables are meaningful for classification purposes. A headlong search strategy for variable selection is shown to be efficient in terms of computation and achieves excellent classification performance. In applications to several food authenticity datasets, our proposed method outperformed default implementations of Random Forests, AdaBoost, transductive SVMs and Bayesian Multinomial Regression by substantial margins.

18.
Bioinformatics ; 26(21): 2705-12, 2010 Nov 01.
Article in English | MEDLINE | ID: mdl-20802251

ABSTRACT

MOTIVATION: In recent years, work has been carried out on clustering gene expression microarray data. Some approaches are developed from an algorithmic viewpoint whereas others are developed via the application of mixture models. In this article, a family of eight mixture models which utilizes the factor analysis covariance structure is extended to 12 models and applied to gene expression microarray data. This modelling approach builds on previous work by introducing a modified factor analysis covariance structure, leading to a family of 12 mixture models, including parsimonious models. This family of models allows for the modelling of the correlation between gene expression levels even when the number of samples is small. Parameter estimation is carried out using a variant of the expectation-maximization algorithm and model selection is achieved using the Bayesian information criterion. This expanded family of Gaussian mixture models, known as the expanded parsimonious Gaussian mixture model (EPGMM) family, is then applied to two well-known gene expression data sets. RESULTS: The performance of the EPGMM family of models is quantified using the adjusted Rand index. This family of models gives very good performance, relative to existing popular clustering techniques, when applied to real gene expression microarray data. AVAILABILITY: The reduced, preprocessed data that were analysed are available at www.paulmcnicholas.info


Subject(s)
Gene Expression Profiling/methods; Oligonucleotide Array Sequence Analysis/methods; Algorithms; Cluster Analysis; Computer Simulation; Normal Distribution; Pattern Recognition, Automated/methods
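The record above quantifies clustering performance with the adjusted Rand index. As a minimal sketch of that measure, assuming two label vectors over the same observations, the index counts pairs of observations grouped together in both partitions and corrects that count for chance agreement. The function name is hypothetical, and returning 1.0 in the degenerate case where the correction is undefined is a convention adopted here, not something stated in the abstract.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same observations."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label vectors must have the same length")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two observations")
    # Pairs co-clustered in both partitions (contingency-table cells).
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs co-clustered within each partition separately (table margins).
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:
        # Degenerate case, e.g. both partitions put everything in one cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Toy comparison: the second partition splits one true cluster in two.
ari_value = adjusted_rand_index([0, 0, 1, 1], [0, 0, 1, 2])
```

The index equals 1 for identical partitions regardless of how the clusters are numbered, and is close to 0 when agreement is no better than chance.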