1.
AMIA Jt Summits Transl Sci Proc ; 2024: 221-229, 2024.
Article in English | MEDLINE | ID: mdl-38827091

ABSTRACT

We recently demonstrated that electronically constructed family pedigrees (e-pedigrees) have great value in epidemiologic research using electronic health record (EHR) data. Prior to this work, it has been well accepted that family health history is a major predictor for a wide spectrum of diseases, reflecting shared effects of genetics, environment, and lifestyle. With the widespread digitalization of patient data via EHRs, there is an unprecedented opportunity to use machine learning algorithms to better predict disease risk. Although predictive models have previously been constructed for a few important diseases, we currently know very little about how accurately the risk for most diseases can be predicted. It is further unknown whether the incorporation of e-pedigrees in machine learning can improve the value of these models. In this study, we devised a family pedigree-driven high-throughput machine learning pipeline to simultaneously predict risks for thousands of diagnosis codes using thousands of input features. Models were built to predict future disease risk for three time windows using both Logistic Regression and XGBoost. For example, using XGBoost without e-pedigrees, we achieved average areas under the receiver operating characteristic curve (AUCs) of 0.82, 0.77, and 0.71 for the 1-, 6-, and 24-month windows, respectively. When adding e-pedigree features to the XGBoost pipeline, AUCs increased to 0.83, 0.79, and 0.74 for the same three time windows, respectively. E-pedigrees similarly improved the predictions when using Logistic Regression. These results emphasize the potential value of incorporating family health history via e-pedigrees into machine learning without additional human effort.
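
The AUC comparison reported in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `roc_auc`, the toy labels, and both score lists are hypothetical. They only show how an AUC is computed for one diagnosis code and how scores from a model with and without added features would be compared.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = patient later received the diagnosis code.
labels = [1, 0, 1, 0, 1, 0]
baseline_scores = [0.9, 0.4, 0.3, 0.5, 0.8, 0.2]      # model without family features
with_family_scores = [0.9, 0.4, 0.6, 0.5, 0.8, 0.2]   # model with family features

auc_baseline = roc_auc(labels, baseline_scores)
auc_with_family = roc_auc(labels, with_family_scores)
```

In a high-throughput setting, the same function would be applied once per diagnosis code and time window, and the per-code AUCs averaged.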

2.
bioRxiv ; 2023 Aug 26.
Article in English | MEDLINE | ID: mdl-37662370

ABSTRACT

Spatial barcoding-based transcriptomic (ST) data require cell type deconvolution for cellular-level downstream analysis. Here we present SDePER, a hybrid machine learning and regression method, to deconvolve ST data using reference single-cell RNA sequencing (scRNA-seq) data. SDePER uses a machine learning approach to remove the systematic difference between ST and scRNA-seq data (platform effects) explicitly and efficiently to ensure the linear relationship between ST data and cell type-specific expression profile. It also considers sparsity of cell types per capture spot and across-spots spatial correlation in cell type compositions. Based on the estimated cell type proportions, SDePER imputes cell type compositions and gene expression at unmeasured locations in a tissue map with enhanced resolution. Applications to coarse-grained simulated data and four real datasets showed that SDePER achieved more accurate and robust results than existing methods, suggesting the importance of considering platform effects, sparsity and spatial correlation in cell type deconvolution.
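
The linear relationship this abstract relies on (a spot's expression as a non-negative mixture of cell type-specific profiles) can be sketched as follows. This is not SDePER: the platform-effect removal, sparsity, and spatial correlation terms are all omitted, and the function name `deconvolve_spot` and the toy profiles are hypothetical. The sketch fits non-negative weights by projected gradient descent and normalizes them to proportions.

```python
def deconvolve_spot(spot, profiles, steps=2000):
    """Estimate non-negative cell type proportions for one capture spot.

    spot:     list of G gene expression values
    profiles: list of K reference profiles, each a list of G values
    """
    k = len(profiles)
    # A step of 1 / (sum of squared profile entries) is never larger than
    # 1 / L, where L is the largest eigenvalue of the profiles' Gram matrix,
    # so the gradient descent below is stable.
    lr = 1.0 / sum(v * v for p in profiles for v in p)
    beta = [1.0 / k] * k
    for _ in range(steps):
        # Residual of the linear model: sum_k beta_k * profile_k - spot
        resid = [sum(b * p[g] for b, p in zip(beta, profiles)) - y
                 for g, y in enumerate(spot)]
        # Gradient step, then projection onto the constraint beta >= 0.
        beta = [max(0.0, b - lr * sum(r * p[g] for g, r in enumerate(resid)))
                for b, p in zip(beta, profiles)]
    total = sum(beta)
    return [b / total for b in beta] if total > 0 else beta


# Hypothetical toy example: two cell types, three genes, a 70/30 mixture.
profiles = [[1.0, 0.1, 0.0], [0.0, 0.2, 0.8]]
spot = [0.7 * a + 0.3 * b for a, b in zip(*profiles)]
proportions = deconvolve_spot(spot, profiles)
```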

3.
IEEE Trans Cybern ; 52(6): 4415-4429, 2022 Jun.
Article in English | MEDLINE | ID: mdl-33095737

ABSTRACT

Polarimetric synthetic aperture radar (PolSAR) data are sequentially acquired and have multiple views obtained from different feature extractors or multiple frequency bands. The fast and accurate classification of PolSAR data in dynamically changing environments is a critical and challenging task. Online learning can handle this task by learning a classifier incrementally from a stream of samples. In this article, we propose an online semisupervised active learning framework for multiview PolSAR data classification, called OSAM. First, a novel online active learning strategy is designed based on the relationships among multiple views and a randomized rule, which allows the labels of only some informative incoming samples to be queried. Then, in order to use both the incoming labeled and unlabeled samples to update the classifiers, a novel online semisupervised learning model is proposed based on co-regularized multiview learning and graph regularization. In addition, the proposed method can deal with dynamic large-scale multifeature or multifrequency PolSAR data where not only the amount of data but also the number of classes gradually increases during the learning process. Moreover, the mistake bound of the proposed method is derived rigorously. Extensive experiments are conducted on real PolSAR data to evaluate the performance of our algorithm, and the results demonstrate the effectiveness of the proposed method.
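
The idea of querying labels for only some incoming samples can be sketched with a generic two-view online learner. This is not OSAM: the query rule (query on view disagreement, otherwise with probability b / (b + |margin|)), the perceptron updates, and all names are assumptions for illustration, and the semisupervised and graph-regularization parts are left out.

```python
import random


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def online_two_view_active(stream, dim1, dim2, b=1.0, seed=0):
    """Process (x_view1, x_view2, label) triples with labels in {-1, +1}.

    The label is only used when the query rule fires.
    Returns the two weight vectors and the number of labels queried.
    """
    rng = random.Random(seed)
    w1, w2 = [0.0] * dim1, [0.0] * dim2
    queries = 0
    for x1, x2, y in stream:
        m1, m2 = dot(w1, x1), dot(w2, x2)
        margin = 0.5 * (m1 + m2)
        disagree = (m1 >= 0) != (m2 >= 0)
        # Query when the views disagree; otherwise query with a probability
        # that shrinks as the combined prediction becomes more confident.
        if disagree or rng.random() < b / (b + abs(margin)):
            queries += 1
            if y * margin <= 0:  # perceptron update on a queried mistake
                w1 = [w + y * x for w, x in zip(w1, x1)]
                w2 = [w + y * x for w, x in zip(w2, x2)]
    return w1, w2, queries


# Hypothetical toy stream: two alternating, linearly separable samples.
stream = [([1.0, 0.0], [1.0], 1), ([-1.0, 0.0], [-1.0], -1)] * 20
w1, w2, queries = online_two_view_active(stream, dim1=2, dim2=1)
```

The parameter `b` trades labeling cost against accuracy: a larger `b` queries more often.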

4.
IEEE Trans Image Process ; 30: 8607-8618, 2021.
Article in English | MEDLINE | ID: mdl-34648443

ABSTRACT

Features are a crucial element of polarimetric synthetic aperture radar (PolSAR) image classification. Multiple types of features are used, such as polarimetric features (PF) generated from the PolSAR data and various polarimetric target decompositions, and texture features (TF) of the Pauli color-coded PolSAR images. The resulting PF and TF often form high-dimensional data, which leads to high computational complexity. Moreover, some features are irrelevant and do nothing to improve classification performance. Therefore, it is necessary to select a subset of useful features for PolSAR image classification. This paper proposes a multi-view feature selection method for PolSAR image classification. First, two types of features, PF and TF, are generated separately. Then an optimization model is built to find the feature selection matrices. Specifically, to maintain consistency across the different types of features, we search for a common representation of the multiple feature types in the optimization problem. An l2,1-norm sparsity regularization is imposed on the feature selection matrices to achieve feature selection. In addition, manifold regularization on the common representation is used to preserve the structural information of the data. The effectiveness of the proposed method is evaluated on three real PolSAR data sets. Experimental results demonstrate the superiority of the proposed method.
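
The l2,1 norm mentioned in this abstract is the sum of the Euclidean norms of a matrix's rows. Penalizing it drives entire rows toward zero, and under the usual convention that each row of a feature selection matrix corresponds to one input feature, the surviving rows identify the selected features. The sketch below shows only that norm and the row ranking. The function names and the toy matrix are hypothetical, and the paper's optimization model is not reproduced.

```python
import math


def l21_norm(W):
    """l2,1 norm: sum over rows of each row's Euclidean norm."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in W)


def select_features(W, k):
    """Indices of the k rows (features) with the largest Euclidean norm."""
    norms = [math.sqrt(sum(v * v for v in row)) for row in W]
    return sorted(range(len(W)), key=lambda i: norms[i], reverse=True)[:k]


# Hypothetical 3-feature selection matrix; the second row has been zeroed
# out, so the corresponding feature is discarded.
W = [[3.0, 4.0], [0.0, 0.0], [0.6, 0.8]]
norm_value = l21_norm(W)
kept = select_features(W, 2)
```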

5.
Bioinformatics ; 37(21): 3966-3968, 2021 11 05.
Article in English | MEDLINE | ID: mdl-34086863

ABSTRACT

MOTIVATION: The use and functionality of Electronic Health Records (EHR) have increased rapidly in the past few decades. EHRs are becoming an important depository of patient health information and can capture family data. Pedigree analysis is a longstanding and powerful approach that can provide insight into the underlying genetic and environmental factors in human health, but traditional approaches to identifying and recruiting families are low-throughput and labor-intensive. Therefore, high-throughput methods to automatically construct family pedigrees are needed. RESULTS: We developed a stand-alone application, Electronic Pedigrees, or E-Pedigrees, which combines two validated family prediction algorithms into a single software package for high-throughput pedigree construction. The platform uses patients' basic demographic information and/or emergency contact data to infer high-accuracy parent-child relationships. Importantly, E-Pedigrees allows users to layer in additional pedigree data when available and provides options for applying different logical rules to improve the accuracy of inferred family relationships. This software is fast and easy to use, is compatible with different EHR data sources, and its output is a standard PED file appropriate for multiple downstream analyses. AVAILABILITY AND IMPLEMENTATION: The E-Pedigrees application (Python 3.3+) is freely available at: https://github.com/xiayuan-huang/E-pedigrees.


Subject(s)
Algorithms , Software , Humans , Pedigree , Electronic Health Records
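
The PED output mentioned in the abstract can be illustrated with a short sketch. It is not code from E-Pedigrees and does not perform the inference step: it assumes parent-child pairs have already been inferred and only groups individuals into families and writes the six standard PED columns (family ID, individual ID, father ID, mother ID, sex, phenotype). The function name, the choice of family ID (one individual's ID per family), and the use of -9 for a missing phenotype (the PLINK convention) are assumptions.

```python
def to_ped_lines(parent_child_pairs, sex):
    """Build PED lines from inferred (parent_id, child_id) pairs.

    sex maps each individual ID to 1 (male) or 2 (female).
    Missing parents are written as 0 and the phenotype as -9 (missing).
    """
    parent = {}  # union-find forest over individual IDs

    def find(i):
        parent.setdefault(i, i)
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    father, mother = {}, {}
    for p, c in parent_child_pairs:
        parent[find(p)] = find(c)  # merge parent and child into one family
        (father if sex[p] == 1 else mother)[c] = p

    lines = []
    for i in sorted(parent):
        fields = (find(i), i, father.get(i, 0), mother.get(i, 0), sex[i], -9)
        lines.append(" ".join(str(v) for v in fields))
    return lines


# Hypothetical toy input: one two-parent family and one single-parent family.
pairs = [("F1", "C1"), ("M1", "C1"), ("M2", "C2")]
sex = {"F1": 1, "M1": 2, "C1": 1, "M2": 2, "C2": 2}
ped_lines = to_ped_lines(pairs, sex)
```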
6.
Article in English | MEDLINE | ID: mdl-29993868

ABSTRACT

Feature extraction is a very important step for polarimetric synthetic aperture radar (PolSAR) image classification. Many dimensionality reduction (DR) methods have been employed to extract features for supervised PolSAR image classification. However, these DR-based feature extraction methods consider each pixel independently and thus fail to take into account the spatial relationship of neighboring pixels, so their performance may not be satisfactory. To address this issue, we introduce a novel tensor local discriminant embedding (TLDE) method for feature extraction for supervised PolSAR image classification. The proposed method combines the spatial and polarimetric information of each pixel by characterizing the pixel with the patch centered at this pixel. Each pixel is then represented as a third-order tensor, of which the first two modes indicate the spatial information of the patch (i.e., the row and the column of the patch) and the third mode denotes the polarimetric information of the patch. Based on the label information of samples and the redundancy of the spatial and polarimetric information, a supervised tensor-based dimensionality reduction technique, called TLDE, is introduced to find three projections which project each pixel (that is, its third-order tensor) into a low-dimensional feature. Finally, classification is completed based on the extracted features using the nearest neighbor (NN) classifier and the support vector machine (SVM) classifier. The proposed method is evaluated on two real PolSAR data sets and on simulated PolSAR data sets with varying numbers of looks. The experimental results demonstrate that the proposed method not only greatly improves the classification accuracy, but also alleviates the influence of speckle noise on classification.
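
The third-order tensor representation described above (patch rows, patch columns, polarimetric channels) can be sketched as below. Only the patch extraction is shown, not the TLDE projections. The function name, the nested-list image layout, and the handling of border pixels by edge replication are assumptions.

```python
def patch_tensor(image, row, col, size=3):
    """Return the size x size x channels patch centered at (row, col).

    image is a nested list indexed as image[r][c][channel]. Indices that
    fall outside the image are clamped to the nearest edge pixel.
    """
    h, w = len(image), len(image[0])
    half = size // 2
    return [[list(image[min(max(r, 0), h - 1)][min(max(c, 0), w - 1)])
             for c in range(col - half, col + half + 1)]
            for r in range(row - half, row + half + 1)]


# Hypothetical 4 x 4 image with 2 channels; each pixel stores [row, col].
image = [[[r, c] for c in range(4)] for r in range(4)]
inner = patch_tensor(image, 2, 2)   # patch fully inside the image
corner = patch_tensor(image, 0, 0)  # patch that needs edge replication
```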

7.
Bioinformatics ; 34(4): 635-642, 2018 02 15.
Article in English | MEDLINE | ID: mdl-28968884

ABSTRACT

Motivation: Pedigree analysis is a longstanding and powerful approach to gain insight into the underlying genetic factors in human health, but identifying, recruiting and genotyping families can be difficult, time consuming and costly. The development of high-throughput methods to identify families and foster downstream analyses is necessary. Results: This paper describes simple methods that allowed us to identify 173 368 family pedigrees with high probability using basic demographic data available in most electronic health records (EHRs). We further developed and validated a novel statistical method that uses EHR data to identify families more likely to have a major genetic component to their disease risk. Lastly, we showed that incorporating EHR-linked family data into genetic association testing may provide added power for genetic mapping without additional recruitment or genotyping. The totality of these results suggests that EHR-linked families can enable classical genetic analyses in a high-throughput manner. Availability and implementation: Pseudocode is provided as supplementary information. Contact: HEBBRING.SCOTT@marshfieldresearch.org. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Electronic Health Records , Genetic Research , Human Genome , Pedigree , Algorithms , Chromosome Mapping , Factual Databases , Female , Genetic Association Studies , Inborn Genetic Diseases , Humans , Male , Middle Aged
8.
IEEE Trans Image Process ; 25(6): 2620-34, 2016 05.
Article in English | MEDLINE | ID: mdl-27071175

ABSTRACT

In this paper, we propose a nonlocal total variation (NLTV)-based variational model for polarimetric synthetic aperture radar (PolSAR) data speckle reduction. This model, named WisNLTV, is obtained based on the Wishart fidelity term and the NLTV regularization defined for the complex-valued fourth-order tensor data. Since the proposed model is non-convex, an equivalent bi-convex model is obtained using the property of conjugate functions. Then, an efficient iteration algorithm is developed to solve the equivalent bi-convex model, based on the alternating minimization and the forward-backward operator splitting technique. The proposed iteration algorithm is proved to be convergent under certain conditions theoretically and numerically. Experimental results on both synthetic and real PolSAR data demonstrate that the proposed method can effectively reduce speckle noise and, meanwhile, better preserve the details and the repetitive structures such as textures and edges, and the polarimetric scattering characteristics, compared with the other methods.
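
The forward-backward operator splitting named in this abstract alternates a gradient step on the smooth term with a proximal step on the non-smooth term. The sketch below shows that iteration on a deliberately simple stand-in problem, a squared-error fidelity term plus an l1 penalty, whose proximal operator is soft-thresholding. It is not WisNLTV: the Wishart fidelity term, the NLTV regularizer, and the bi-convex alternation are not reproduced, and all names and values are hypothetical.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def forward_backward(y, lam, step=0.5, iters=100):
    """Minimize 0.5 * ||x - y||^2 + lam * ||x||_1 by forward-backward splitting."""
    x = [0.0] * len(y)
    for _ in range(iters):
        # Forward (gradient) step on the smooth term 0.5 * ||x - y||^2.
        z = [xi - step * (xi - yi) for xi, yi in zip(x, y)]
        # Backward (proximal) step on the non-smooth term lam * ||x||_1.
        x = [soft_threshold(zi, step * lam) for zi in z]
    return x


# Hypothetical toy input; the minimizer is soft_threshold(y_i, lam) per entry.
solution = forward_backward([3.0, -0.5, 1.0], lam=1.0)
```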
