Results 1 - 20 of 105
1.
Proc Natl Acad Sci U S A ; 119(51): e2206580119, 2022 Dec 20.
Article in English | MEDLINE | ID: mdl-36525536

ABSTRACT

While the gig economy provides flexible jobs for millions of workers globally, a lack of organization identity and coworker bonds contributes to their low engagement and high attrition rates. To test the impact of virtual teams on worker productivity and retention, we conduct a field experiment with 27,790 drivers on a ride-sharing platform. We organize drivers into teams that are randomly assigned to receiving their team ranking, or individual ranking within their team, or individual performance information (control). We find that treated drivers work longer hours and generate significantly higher revenue. Furthermore, drivers in the team-ranking treatment continue to be more engaged 3 mo after the end of the experiment. A machine-learning analysis of 149 team contests in 86 cities suggests that social comparison, driver experience, and within-team similarity are the key predictors of the virtual team efficacy.

2.
Alzheimer Dis Assoc Disord ; 32(1): 18-27, 2018.
Article in English | MEDLINE | ID: mdl-29227306

ABSTRACT

BACKGROUND: Clinical trials increasingly aim to retard disease progression during presymptomatic phases of Mild Cognitive Impairment (MCI), and thus recruiting study participants at high risk for developing MCI is critical for cost-effective prevention trials. However, accurately identifying those who are destined to develop MCI is difficult. Collecting biomarkers is often expensive. METHODS: We used only noninvasive clinical variables collected in the National Alzheimer's Coordinating Center (NACC) Uniform Data Sets version 2.0 and applied machine learning techniques to build a low-cost and accurate MCI conversion prediction calculator. Cross-validation and bootstrap were used to select as few variables as possible that accurately predict MCI conversion within 4 years. RESULTS: A total of 31,872 unique subjects, 748 clinical variables, and an additional 128 derived variables in NACC data sets were used. About 15 noninvasive clinical variables were identified for predicting MCI, aMCI, and naMCI converters, respectively. A Receiver Operating Characteristic Area Under the Curve (ROC AUC) of over 75% was achieved. By bootstrap we created a simple spreadsheet calculator which estimates the probability of developing MCI within 4 years with a 95% confidence interval. CONCLUSIONS: We achieved reasonably high prediction accuracy using only clinical variables. The approach used here could be useful for study enrichment in preclinical trials where enrolling participants at risk of cognitive decline is critical for proving study efficacy, and also for developing a shorter assessment battery.
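
The bootstrap confidence interval mentioned in this abstract can be illustrated with a minimal percentile-bootstrap sketch in Python. This is an illustration under stated assumptions, not the authors' calculator: the function name `bootstrap_ci`, its parameters, and the idea of resampling a list of binary outcomes are hypothetical, and the abstract does not say which bootstrap variant was used.

```python
import random


def bootstrap_ci(values, statistic, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for statistic(values).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not values:
        raise ValueError("values must not be empty")
    rng = random.Random(seed)  # fixed seed so repeated calls agree
    n = len(values)
    # Resample with replacement n_resamples times and sort the statistics.
    stats = sorted(
        statistic([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper
```

With a list of 0/1 conversion outcomes and `statistic=lambda v: sum(v) / len(v)`, the returned pair brackets the observed conversion rate at roughly the 95% level when `alpha=0.05`.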


Subjects
Big Data , Cognitive Dysfunction/diagnosis , Datasets as Topic , Models, Statistical , Aged , Aged, 80 and over , Brain/pathology , Female , Humans , Machine Learning , Male , Sensitivity and Specificity
3.
BMC Genomics ; 17: 669, 2016 08 23.
Article in English | MEDLINE | ID: mdl-27549765

ABSTRACT

BACKGROUND: Major depressive disorder (MDD) is a heterogeneous disease at the level of clinical symptoms, and this heterogeneity is likely reflected at the level of biology. Two clinical subtypes within MDD that have garnered interest are "melancholic depression" and "anxious depression". Metabolomics enables us to characterize hundreds of small molecules that comprise the metabolome, and recent work suggests the blood metabolome may be able to inform treatment decisions for MDD; however, work is at an early stage. Here we examine a metabolomics data set to (1) test whether clinically homogeneous MDD subtypes are also more biologically homogeneous, and hence more predictable, (2) devise a robust machine learning framework that preserves biological meaning, and (3) describe the metabolomic biosignature for melancholic depression. RESULTS: With the proposed computational system we achieve around 80% classification accuracy, sensitivity and specificity for melancholic depression, but only ~72% for anxious depression or MDD, suggesting the blood metabolome contains more information about melancholic depression. We develop an ensemble feature selection framework (EFSF) in which features are first clustered, and learning then takes place on the cluster centroids, retaining information about correlated features during the feature selection process rather than discarding them as most machine learning methods will do. Analysis of the most discriminative feature clusters revealed differences in metabolic classes such as amino acids and lipids as well as pathways studied extensively in MDD such as the activation of cortisol in chronic stress. CONCLUSIONS: We find that greater clinical homogeneity does indeed lead to better prediction based on biological measurements in the case of melancholic depression. Melancholic depression is shown to be associated with changes in amino acids, catecholamines, lipids, stress hormones, and immune-related metabolites. The proposed computational framework can be adapted to analyze data from many other biomedical applications where the data has similar characteristics.
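
The idea of clustering features first and then learning on cluster centroids can be sketched as follows. This is a simplified illustration, not the EFSF described in the paper: the greedy correlation-threshold grouping, the function names, and the `threshold` parameter are all assumptions chosen for brevity, since the abstract does not specify the clustering method.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def cluster_features(columns, threshold=0.9):
    """Greedy grouping: a feature joins the first cluster whose seed
    feature it correlates with at or above the threshold (assumed rule)."""
    clusters = []
    for j, column in enumerate(columns):
        for cluster in clusters:
            if pearson(column, columns[cluster[0]]) >= threshold:
                cluster.append(j)
                break
        else:
            clusters.append([j])
    return clusters


def centroid_features(columns, clusters):
    """Replace each cluster by the per-sample mean of its member features."""
    n_samples = len(columns[0])
    return [
        [sum(columns[j][i] for j in cluster) / len(cluster) for i in range(n_samples)]
        for cluster in clusters
    ]
```

A downstream classifier would then be trained on the centroid columns, and a selected centroid can be traced back to every correlated feature in its cluster instead of a single arbitrary representative.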


Subjects
Biomarkers/blood , Blood Chemical Analysis/methods , Depressive Disorder, Major/psychology , Metabolomics/methods , Adolescent , Adult , Aged , Depressive Disorder, Major/metabolism , Female , Humans , Machine Learning , Male , Middle Aged , Young Adult
4.
BMC Bioinformatics ; 16: 147, 2015 May 07.
Article in English | MEDLINE | ID: mdl-25948335

ABSTRACT

BACKGROUND: Profiling gene expression in brain structures at various spatial and temporal scales is essential to understanding how genes regulate the development of brain structures. The Allen Developing Mouse Brain Atlas provides high-resolution 3-D in situ hybridization (ISH) gene expression patterns in multiple developing stages of the mouse brain. Currently, the ISH images are annotated with anatomical terms manually. In this paper, we propose a computational approach to annotate gene expression pattern images in the mouse brain at various structural levels over the course of development. RESULTS: We applied a deep convolutional neural network that was trained on a large set of natural images to extract features from the ISH images of the developing mouse brain. As a baseline representation, we applied invariant image feature descriptors to capture local statistics from ISH images and used the bag-of-words approach to build image-level representations. Both types of features from multiple ISH image sections of the entire brain were then combined to build 3-D, brain-wide gene expression representations. We employed regularized learning methods for discriminating gene expression patterns in different brain structures. Results show that our approach of using a convolutional model as a feature extractor achieved superior performance in annotating gene expression patterns at multiple levels of brain structures throughout four developing ages. Overall, we achieved an average AUC of 0.894 ± 0.014, as compared with 0.820 ± 0.046 yielded by the bag-of-words approach. CONCLUSIONS: A deep convolutional neural network model trained on natural image sets and applied to gene expression pattern annotation tasks yielded superior performance, demonstrating that its transfer learning property is applicable to such biological image sets.


Subjects
Brain/metabolism , Gene Expression Regulation, Developmental , Molecular Sequence Annotation , Neural Networks, Computer , Pattern Recognition, Automated , Animals , Brain/growth & development , Gene Expression Profiling/methods , Image Processing, Computer-Assisted , In Situ Hybridization , Mice
5.
Bioinformatics ; 30(2): 266-73, 2014 Jan 15.
Article in English | MEDLINE | ID: mdl-24300439

ABSTRACT

MOTIVATION: Drosophila melanogaster is a major model organism for investigating the function and interconnection of animal genes in the earliest stages of embryogenesis. Today, images capturing Drosophila gene expression patterns are being produced at a higher throughput than ever before. The analysis of spatial patterns of gene expression is most biologically meaningful when images from a similar time point during development are compared. Thus, the critical first step is to determine the developmental stage of an embryo. This information is also needed to observe and analyze expression changes over developmental time. Currently, developmental stages (time) of embryos in images capturing spatial expression patterns are annotated manually, which is time- and labor-intensive. Embryos are often designated into stage ranges, making the information on developmental time coarse. This makes downstream analyses inefficient and biological interpretations of similarities and differences in spatial expression patterns challenging, particularly when using automated tools for analyzing expression patterns of large numbers of images. RESULTS: Here, we present a new computational approach to annotate developmental stage for Drosophila embryos in the gene expression images. In an analysis of 3724 images, the new approach shows high accuracy in predicting the developmental stage correctly (79%). In addition, it provides a stage score that enables one to more finely annotate each embryo so that they are divided into early and late periods of development within standard stage demarcations. Stage scores for all images containing expression patterns of the same gene enable a direct way to view expression changes over developmental time for any gene. We show that the genomewide-expression-maps generated using images from embryos in refined stages illuminate global gene activities and changes much better, and more refined stage annotations improve our ability to interpret results when expression pattern matches are discovered between genes. AVAILABILITY AND IMPLEMENTATION: The software package is available for download at: http://www.public.asu.edu/*jye02/Software/Fly-Project/.


Subjects
Computational Biology , Drosophila Proteins/genetics , Drosophila melanogaster/genetics , Embryo, Nonmammalian/cytology , Gene Expression Profiling , Gene Expression Regulation, Developmental , Image Processing, Computer-Assisted , Algorithms , Animals , Drosophila melanogaster/embryology , Embryo, Nonmammalian/metabolism , Embryonic Development/genetics , Pattern Recognition, Automated
6.
Bioinformatics ; 30(9): 1319-21, 2014 May 01.
Article in English | MEDLINE | ID: mdl-24413523

ABSTRACT

Spatial patterns of gene expression are of key importance in understanding developmental networks. Using in situ hybridization, many laboratories are generating images to describe these spatial patterns and to test biological hypotheses. To facilitate such analyses, we have developed biologist-centric software (myFX) that contains computational methods to automatically process and analyze images depicting embryonic gene expression in the fruit fly Drosophila melanogaster. It facilitates creating digital descriptions of spatial patterns in images and enables measurements of pattern similarity and visualization of expression across genes and developmental stages. myFX interacts directly with the online FlyExpress database, which allows users to search thousands of existing patterns to find co-expressed genes by image comparison.


Subjects
Drosophila melanogaster/genetics , Gene Expression Profiling/methods , Gene Expression Regulation, Developmental , Animals , Drosophila melanogaster/embryology , Gene Expression , Software
7.
Neuroimage ; 102 Pt 1: 192-206, 2014 Nov 15.
Article in English | MEDLINE | ID: mdl-23988272

ABSTRACT

Bio-imaging technologies allow scientists to collect large amounts of high-dimensional data from multiple heterogeneous sources for many biomedical applications. In the study of Alzheimer's Disease (AD), neuroimaging data, gene/protein expression data, etc., are often analyzed together to improve predictive power. Joint learning from multiple complementary data sources is advantageous, but feature-pruning and data source selection are critical to learn interpretable models from high-dimensional data. Often, the data collected has block-wise missing entries. In the Alzheimer's Disease Neuroimaging Initiative (ADNI), most subjects have MRI and genetic information, but only half have cerebrospinal fluid (CSF) measures, a different half has FDG-PET; only some have proteomic data. Here we propose how to effectively integrate information from multiple heterogeneous data sources when data is block-wise missing. We present a unified "bi-level" learning model for complete multi-source data, and extend it to incomplete data. Our major contributions are: (1) our proposed models unify feature-level and source-level analysis, including several existing feature learning approaches as special cases; (2) the model for incomplete data avoids imputing missing data and offers superior performance; it generalizes to other applications with block-wise missing data sources; (3) we present efficient optimization algorithms for modeling complete and incomplete data. We comprehensively evaluate the proposed models including all ADNI subjects with at least one of four data types at baseline: MRI, FDG-PET, CSF and proteomics. Our proposed models compare favorably with existing approaches.


Subjects
Alzheimer Disease/diagnosis , Data Mining , Neuroimaging/statistics & numerical data , Algorithms , Alzheimer Disease/cerebrospinal fluid , Humans , Magnetic Resonance Imaging , Positron-Emission Tomography , Proteomics
8.
Neuroimage ; 87: 220-41, 2014 Feb 15.
Article in English | MEDLINE | ID: mdl-24176869

ABSTRACT

Many neuroimaging applications deal with imbalanced imaging data. For example, in the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, the mild cognitive impairment (MCI) cases eligible for the study are nearly two times the Alzheimer's disease (AD) patients for the structural magnetic resonance imaging (MRI) modality and six times the control cases for the proteomics modality. Constructing an accurate classifier from imbalanced data is a challenging task. Traditional classifiers that aim to maximize the overall prediction accuracy tend to classify all data into the majority class. In this paper, we study an ensemble system of feature selection and data sampling for the class imbalance problem. We systematically analyze various sampling techniques by examining the efficacy of different rates and types of undersampling, oversampling, and a combination of over- and undersampling approaches. We thoroughly examine six widely used feature selection algorithms to identify significant biomarkers and thereby reduce the complexity of the data. The efficacy of the ensemble techniques is evaluated using two different classifiers, Random Forest and Support Vector Machines, based on classification accuracy, area under the receiver operating characteristic curve (AUC), sensitivity, and specificity measures. Our extensive experimental results show that for various problem settings in ADNI, (1) a balanced training set obtained with undersampling based on the K-Medoids technique gives the best overall performance among the different data sampling techniques and the no-sampling approach; and (2) sparse logistic regression with stability selection achieves competitive performance among various feature selection algorithms. Comprehensive experiments with various settings show that our proposed ensemble model of multiple undersampled datasets yields stable and promising results.
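
An ensemble over multiple undersampled datasets, combined by majority vote, can be sketched in a few lines of Python. The sketch makes two substitutions that should be read as assumptions: it uses plain random undersampling instead of the K-Medoids-based undersampling named in the abstract, and a toy nearest-centroid classifier instead of Random Forest or Support Vector Machines. All function names are hypothetical.

```python
import random
from collections import Counter


def balanced_subsets(samples, labels, n_subsets, seed=0):
    """Yield balanced (samples, labels) pairs by randomly undersampling
    every class down to the size of the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    n_min = min(len(members) for members in by_class.values())
    for _ in range(n_subsets):
        xs, ys = [], []
        for y, members in by_class.items():
            for x in rng.sample(members, n_min):
                xs.append(x)
                ys.append(y)
        yield xs, ys


def fit_nearest_centroid(samples, labels):
    """Toy stand-in classifier: predict the class with the closest mean vector."""
    counts = Counter(labels)
    sums = {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def predict(x):
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
        )

    return predict


def ensemble_predict(models, x):
    """Majority vote over the models trained on the undersampled subsets."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]
```

Each model sees a balanced training set, so no single model is pushed toward the majority class, while the vote across subsets makes use of more of the majority-class samples than any one subset contains.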


Subjects
Alzheimer Disease , Biomarkers/analysis , Research Design , Support Vector Machine , Aged , Alzheimer Disease/metabolism , Alzheimer Disease/pathology , Area Under Curve , Brain/pathology , Cognitive Dysfunction/metabolism , Cognitive Dysfunction/pathology , Female , Humans , Male , ROC Curve
9.
IEEE Trans Image Process ; 33: 1683-1698, 2024.
Article in English | MEDLINE | ID: mdl-38416621

ABSTRACT

Image restoration under adverse weather conditions (e.g., rain, snow, and haze) is a fundamental computer vision problem that has important implications for various downstream applications. Distinct from early methods that are specially designed for specific types of weather, recent works tend to simultaneously remove various adverse weather effects based on either spatial feature representation learning or semantic information embedding. Inspired by various successful applications incorporating large-scale pre-trained models (e.g., CLIP), in this paper we explore the potential benefits of leveraging such models in this task from both the spatial feature representation learning and the semantic information embedding aspects: 1) for spatial feature representation learning, we design a Spatially Adaptive Residual (SAR) encoder to adaptively extract degraded areas. To facilitate training of this model, we propose a Soft Residual Distillation (CLIP-SRD) strategy to transfer spatial knowledge from CLIP between clean and adverse weather images; 2) for semantic information embedding, we propose a CLIP Weather Prior (CWP) embedding module to enable the network to adaptively respond to different weather conditions. This module integrates the sample-specific weather priors extracted by the CLIP image encoder with the distribution-specific information (as learned by a set of parameters) and embeds these elements using a cross-attention mechanism. Extensive experiments demonstrate that our proposed method can achieve state-of-the-art performance under various and severe adverse weather conditions. The code will be made available.

10.
BMC Bioinformatics ; 14: 350, 2013 Dec 03.
Article in English | MEDLINE | ID: mdl-24299119

ABSTRACT

BACKGROUND: Drosophila melanogaster has been established as a model organism for investigating developmental gene interactions. The spatio-temporal gene expression patterns of Drosophila melanogaster can be visualized by in situ hybridization and documented as digital images. Automated and efficient tools for analyzing these expression images will provide biological insights into the gene functions, interactions, and networks. To facilitate pattern recognition and comparison, many web-based resources have been created to conduct comparative analysis based on the body part keywords and the associated images. With the fast accumulation of images from high-throughput techniques, manual inspection of images will impose a serious impediment on the pace of biological discovery. It is thus imperative to design an automated system for efficient image annotation and comparison. RESULTS: We present a computational framework to perform anatomical keyword annotation for Drosophila gene expression images. The spatial sparse coding approach is used to represent local patches of images in comparison with the well-known bag-of-words (BoW) method. Three pooling functions including max pooling, average pooling and Sqrt (square root of mean squared statistics) pooling are employed to transform the sparse codes to image features. Based on the constructed features, we develop both an image-level scheme and a group-level scheme to tackle the key challenges in annotating Drosophila gene expression pattern images automatically. To deal with the imbalanced data distribution inherent in image annotation tasks, the undersampling method is applied together with majority vote. Results on Drosophila embryonic expression pattern images verify the efficacy of our approach. CONCLUSION: In our experiment, the three pooling functions perform comparably well in feature dimension reduction. The undersampling with majority vote is shown to be effective in tackling the problem of imbalanced data. Moreover, combining sparse coding and the image-level scheme leads to consistent performance improvement in keyword annotation.
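
The three pooling functions named in this abstract can be written out as a short Python sketch. It assumes each image patch has already been encoded as a sparse code vector of equal length; the function name `pool_codes` is hypothetical, and taking absolute values in max and average pooling is a common convention assumed here rather than something the abstract states.

```python
import math


def pool_codes(codes, method):
    """Pool per-patch sparse codes (equal-length lists) into one image-level vector."""
    if not codes:
        raise ValueError("need at least one patch code")
    columns = list(zip(*codes))  # one tuple of coefficients per dictionary atom
    if method == "max":
        # Largest absolute response of each atom over all patches.
        return [max(abs(v) for v in column) for column in columns]
    if method == "average":
        # Mean absolute response of each atom over all patches.
        return [sum(abs(v) for v in column) / len(column) for column in columns]
    if method == "sqrt":
        # Square root of the mean squared response of each atom.
        return [math.sqrt(sum(v * v for v in column) / len(column)) for column in columns]
    raise ValueError(f"unknown pooling method: {method}")
```

Whatever the number of patches, the pooled vector has one entry per dictionary atom, which is what lets images with different patch counts share a fixed-length feature representation.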


Subjects
Drosophila melanogaster/cytology , Drosophila melanogaster/genetics , Gene Expression Regulation, Developmental , Genome, Insect/genetics , Models, Genetic , Molecular Sequence Annotation/methods , Animals , Cell Differentiation/genetics , Cell Division/genetics , Computational Biology/classification , Computational Biology/methods , Drosophila melanogaster/embryology , Gene Expression Profiling/classification , Gene Expression Profiling/methods , High-Throughput Screening Assays , Molecular Sequence Annotation/classification , Predictive Value of Tests , Support Vector Machine
11.
Neuroimage ; 78: 233-48, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23583359

ABSTRACT

Alzheimer's disease (AD), the most common type of dementia, is a severe neurodegenerative disorder. Identifying biomarkers that can track the progress of the disease has recently received increasing attention in AD research. An accurate prediction of disease progression would facilitate optimal decision-making for clinicians and patients. A definitive diagnosis of AD requires autopsy confirmation, thus many clinical/cognitive measures including Mini Mental State Examination (MMSE) and Alzheimer's Disease Assessment Scale cognitive subscale (ADAS-Cog) have been designed to evaluate the cognitive status of the patients and used as important criteria for clinical diagnosis of probable AD. In this paper, we consider the problem of predicting disease progression measured by the cognitive scores and selecting biomarkers predictive of the progression. Specifically, we formulate the prediction problem as a multi-task regression problem by considering the prediction at each time point as a task and propose two novel multi-task learning formulations. We have performed extensive experiments using data from the Alzheimer's Disease Neuroimaging Initiative (ADNI). Specifically, we use the baseline MRI features to predict MMSE/ADAS-Cog scores in the next 4 years. Results demonstrate the effectiveness of the proposed multi-task learning formulations for disease progression in comparison with single-task learning algorithms including ridge regression and Lasso. We also perform longitudinal stability selection to identify and analyze the temporal patterns of biomarkers in disease progression. We observe that cortical thickness average of left middle temporal, cortical thickness average of left and right entorhinal, and white matter volume of left hippocampus play significant roles in predicting ADAS-Cog at all time points. We also observe that several MRI biomarkers provide significant information for predicting MMSE scores for the first 2 years; however, very few are shown to be significant in predicting MMSE score at later stages. The lack of predictable MRI biomarkers in later stages may contribute to the lower prediction performance of MMSE than that of ADAS-Cog in our study and other related studies.


Subjects
Algorithms , Alzheimer Disease/pathology , Artificial Intelligence , Aged , Disease Progression , Female , Humans , Magnetic Resonance Imaging , Male , Regression Analysis
12.
Neuroimage ; 74: 209-30, 2013 Jul 01.
Article in English | MEDLINE | ID: mdl-23435208

ABSTRACT

Many methods have been proposed for computer-assisted diagnostic classification. Full tensor information and machine learning with 3D maps derived from brain images may help detect subtle differences or classify subjects into different groups. Here we develop a new approach to apply tensor-based morphometry to parametric surface models for diagnostic classification. We use this approach to identify cortical surface features for use in diagnostic classifiers. First, with holomorphic 1-forms, we compute an efficient and accurate conformal mapping from a multiply connected mesh to the so-called slit domain. Next, the surface parameterization approach provides a natural way to register anatomical surfaces across subjects using a constrained harmonic map. To analyze anatomical differences, we then analyze the full Riemannian surface metric tensors, which retain multivariate information on local surface geometry. As the number of voxels in a 3D image is large, sparse learning is a promising method to select a subset of imaging features and to improve classification accuracy. Focusing on vertices with greatest effect sizes, we train a diagnostic classifier using the surface features selected by an L1-norm based sparse learning method. Stability selection is applied to validate the selected feature sets. We tested the algorithm on MRI-derived cortical surfaces from 42 subjects with genetically confirmed Williams syndrome and 40 age-matched controls. Multivariate statistics on the local tensors gave greater effect sizes for detecting group differences relative to other TBM-based statistics including analysis of the Jacobian determinant and the largest eigenvalue of the surface metric. Our method also gave reasonable classification results relative to the Jacobian determinant, the pair of eigenvalues of the Jacobian matrix and volume features. This analysis pipeline may boost the power of morphometry studies, and may assist with image-based classification.


Subjects
Brain Mapping/methods , Brain/pathology , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Magnetic Resonance Imaging/methods , Humans , Williams Syndrome/pathology
13.
Bioinformatics ; 28(21): 2847-8, 2012 Nov 01.
Article in English | MEDLINE | ID: mdl-22923306

ABSTRACT

Mobile technologies provide unique opportunities for ubiquitous distribution of scientific information through user-friendly interfaces. Therefore, we have developed a new FlyExpress mobile application that makes available a growing collection (>100 000) of standardized in situ hybridization images containing spatial patterns of gene expression from Drosophila melanogaster (fruit fly) embryogenesis. Using this application, scientists can visualize and compare expression patterns of >4000 developmentally relevant genes. The FlyExpress app displays the expression patterns of the selected gene for different visual projections (e.g. lateral) and displays them according to their developmental stages, which shows a gene's progression of spatial expression over developmental time. Ultimately, we envision the use of the FlyExpress app in the laboratory where scientists may wish to immediately conduct a visual comparison of a known expression pattern with the one observed on the bench top or to display expression patterns of interest during scientific discussions at large. AVAILABILITY: Search "FlyExpress" on the Apple iTunes store.


Subjects
Cell Phone , Drosophila melanogaster/embryology , Drosophila melanogaster/genetics , Embryonic Development/genetics , Gene Expression Profiling/methods , Gene Expression Regulation, Developmental/genetics , Information Storage and Retrieval/methods , Algorithms , Animals , Body Patterning/genetics , Data Display , Multigene Family/genetics , User-Computer Interface
14.
IEEE Trans Pattern Anal Mach Intell ; 45(12): 15260-15274, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37725727

ABSTRACT

In reinforcement learning, a promising direction to avoid online trial-and-error costs is learning from an offline dataset. Current offline reinforcement learning methods commonly learn in the policy space constrained to in-support regions by the offline dataset, in order to ensure the robustness of the outcome policies. Such constraints, however, also limit the potential of the outcome policies. In this paper, to release the potential of offline policy learning, we investigate the decision-making problems in out-of-support regions directly and propose offline Model-based Adaptable Policy LEarning (MAPLE). By this approach, instead of learning in in-support regions, we learn an adaptable policy that can adapt its behavior in out-of-support regions when deployed. We give a practical implementation of MAPLE via meta-learning techniques and ensemble model learning techniques. We conduct experiments on MuJoCo locomotion tasks with offline datasets. The results show that the proposed method can make robust decisions in out-of-support regions and achieve better performance than SOTA algorithms.

15.
BMC Bioinformatics ; 13: 107, 2012 May 23.
Article in English | MEDLINE | ID: mdl-22621237

ABSTRACT

BACKGROUND: Fruit fly embryogenesis is one of the best understood animal development systems, and the spatiotemporal gene expression dynamics in this process are captured by digital images. Analysis of these high-throughput images will provide novel insights into the functions, interactions, and networks of animal genes governing development. To facilitate comparative analysis, web-based interfaces have been developed to conduct image retrieval based on body part keywords and images. Currently, the keyword annotation of spatiotemporal gene expression patterns is conducted manually. However, this manual practice does not scale with the continuously expanding collection of images. In addition, existing image retrieval systems based on the expression patterns may be made more accurate using keywords. RESULTS: In this article, we adapt advanced data mining and computer vision techniques to address the key challenges in annotating and retrieving fruit fly gene expression pattern images. To boost the performance of image annotation and retrieval, we propose representations integrating spatial information and sparse features, overcoming the limitations of prior schemes. CONCLUSIONS: We perform systematic experimental studies to evaluate the proposed schemes in comparison with current methods. Experimental results indicate that the integration of spatial information and sparse features leads to consistent performance improvement in image annotation, while for the task of retrieval, sparse features alone yield better results.


Subjects
Drosophila melanogaster/genetics , Gene Expression Profiling/methods , Molecular Sequence Annotation , Pattern Recognition, Automated , Software , Animals , Data Mining , Databases, Factual , Drosophila melanogaster/embryology , Internet , Support Vector Machine
16.
Neuroimage ; 61(3): 622-32, 2012 Jul 02.
Article in English | MEDLINE | ID: mdl-22498655

ABSTRACT

Analysis of incomplete data is a big challenge when integrating large-scale brain imaging datasets from different imaging modalities. In the Alzheimer's Disease Neuroimaging Initiative (ADNI), for example, over half of the subjects lack cerebrospinal fluid (CSF) measurements; an independent half of the subjects do not have fluorodeoxyglucose positron emission tomography (FDG-PET) scans; many lack proteomics measurements. Traditionally, subjects with missing measures are discarded, resulting in a severe loss of available information. In this paper, we address this problem by proposing an incomplete Multi-Source Feature (iMSF) learning method where all the samples (with at least one available data source) can be used. To illustrate the proposed approach, we classify patients from the ADNI study into groups with Alzheimer's disease (AD), mild cognitive impairment (MCI) and normal controls, based on the multi-modality data. At baseline, ADNI's 780 participants (172 AD, 397 MCI, 211 NC) have at least one of four data types: magnetic resonance imaging (MRI), FDG-PET, CSF and proteomics. These data are used to test our algorithm. Depending on the problem being solved, we divide our samples according to the availability of data sources, and we learn shared sets of features with state-of-the-art sparse learning methods. To build a practical and robust system, we construct a classifier ensemble by combining our method with four other methods for missing value estimation. Comprehensive experiments with various parameters show that our proposed iMSF method and the ensemble model yield stable and promising results.
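
The step of dividing samples according to the availability of data sources can be sketched as a simple grouping by availability pattern. This covers only the partitioning idea, not the iMSF learning method itself; the function name, the dictionary layout of `subjects`, and the use of `None` to mark a missing source are assumptions made for illustration.

```python
from collections import defaultdict


def group_by_availability(subjects, sources):
    """Group subject IDs by the tuple of data sources they actually have.

    `subjects` maps a subject ID to a dict of source name -> data,
    where a missing key or a None value marks a missing source (assumed layout).
    """
    groups = defaultdict(list)
    for subject_id, data in subjects.items():
        pattern = tuple(source for source in sources if data.get(source) is not None)
        if pattern:  # keep every subject with at least one available source
            groups[pattern].append(subject_id)
    return dict(groups)
```

Each resulting group is a block with no missing entries for its own set of sources, so models can be fit per block without imputing values or discarding subjects that lack some modality.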


Subjects
Artificial Intelligence , Image Processing, Computer-Assisted/methods , Neuroimaging/methods , Aged , Algorithms , Alzheimer Disease/cerebrospinal fluid , Alzheimer Disease/pathology , Cognitive Dysfunction/cerebrospinal fluid , Cognitive Dysfunction/pathology , Databases, Factual , Female , Fluorodeoxyglucose F18 , Humans , Image Processing, Computer-Assisted/statistics & numerical data , Magnetic Resonance Imaging , Male , Middle Aged , Neuroimaging/instrumentation , Positron-Emission Tomography , Proteomics , Radiopharmaceuticals
18.
Bioinformatics ; 27(23): 3319-20, 2011 Dec 01.
Article in English | MEDLINE | ID: mdl-21994220

ABSTRACT

SUMMARY: Images containing spatial expression patterns illuminate the roles of different genes during embryogenesis. In order to generate initial clues to regulatory interactions, biologists frequently need to know the set of genes expressed at the same time at specific locations in a developing embryo, as well as related research publications. However, text-based mining of image annotations and research articles cannot produce all relevant results, because the primary data are images that exist as graphical objects. We have developed a unique knowledge base (FlyExpress) to facilitate visual mining of images from Drosophila melanogaster embryogenesis. By clicking on specific locations in pictures of fly embryos from different stages of development and different visual projections, users can produce a list of genes and publications instantly. In FlyExpress, each queryable embryo picture is a heat-map that captures the expression patterns of more than 4500 genes and more than 2600 published articles. In addition, one can view spatial patterns for particular genes over time as well as find other genes with similar expression patterns at a given developmental stage. Therefore, FlyExpress is a unique tool for mining spatiotemporal expression patterns in a format readily accessible to the scientific community. AVAILABILITY: http://www.flyexpress.net CONTACT: s.kumar@asu.edu.


Subjects
Drosophila Proteins/genetics , Drosophila melanogaster/embryology , Drosophila melanogaster/genetics , Gene Expression Regulation, Developmental , Animals , Audiovisual Aids , Data Mining , Drosophila Proteins/metabolism , Drosophila melanogaster/metabolism , Embryonic Development , Gene Expression Profiling
19.
BMC Neurol ; 12: 46, 2012 Jun 25.
Article in English | MEDLINE | ID: mdl-22731740

ABSTRACT

BACKGROUND: Patients with Mild Cognitive Impairment (MCI) are at high risk of progression to Alzheimer's dementia. Identifying MCI individuals with a high likelihood of conversion to dementia, and the associated biosignatures, has recently received increasing attention in AD research. Different biosignatures for AD (neuroimaging, demographic, genetic and cognitive measures) may contain complementary information for diagnosis and prognosis of AD. METHODS: We have conducted a comprehensive study using a large number of samples from the Alzheimer's Disease Neuroimaging Initiative (ADNI) to test the power of integrating various baseline data for predicting conversion from MCI to probable AD, to identify a small subset of biosignatures for the prediction, and to assess the relative importance of different modalities in predicting MCI-to-AD conversion. We have employed sparse logistic regression with stability selection for the integration and selection of potential predictors. Our study differs from many others in three important respects: (1) we use a large cohort of MCI samples that are unbiased with respect to age or education status between cases and controls; (2) we integrate and test various types of baseline data available in ADNI, including MRI, demographic, genetic and cognitive measures; and (3) we apply sparse logistic regression with stability selection to ADNI data for robust feature selection. RESULTS: We have used 319 MCI subjects from ADNI that had MRI measurements at baseline and passed quality control, including 177 MCI Non-converters and 142 MCI Converters. Conversion was considered over the course of a 4-year follow-up period. A combination of 15 features (predictors), including those from MRI scans, APOE genotyping, and cognitive measures, achieves the best prediction with an AUC score of 0.8587. CONCLUSIONS: Our results demonstrate the power of integrating various baseline data for prediction of the conversion from MCI to probable AD. Our results also demonstrate the effectiveness of stability selection for feature selection in the context of sparse logistic regression.
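
The idea of stability selection around sparse (L1-penalized) logistic regression can be sketched as follows: refit the sparse model on many random half-samples and record how often each feature receives a nonzero weight. This is a minimal illustration under stated assumptions, not the authors' pipeline. The proximal gradient solver, the penalty `lam=0.2`, the number of rounds, the 0.8 cutoff, and the synthetic data are all choices made here for illustration; none of them come from the abstract.

```python
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink a weight toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(X, y, lam, step=0.5, iterations=100):
    """L1-penalized logistic regression fitted by proximal gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iterations):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            z = max(-30.0, min(30.0, z))  # keep exp() in a safe range
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= step * grad_b / n  # the intercept is not penalized
        w = [soft_threshold(w[j] - step * grad_w[j] / n, step * lam)
             for j in range(d)]
    return w


def stability_selection(X, y, lam, rounds=20, seed=0):
    """Fraction of half-sample refits in which each feature is nonzero."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    counts = [0] * d
    for _ in range(rounds):
        idx = rng.sample(range(n), n // 2)  # random half of the samples
        w = fit_l1_logistic([X[i] for i in idx], [y[i] for i in idx], lam)
        for j in range(d):
            if w[j] != 0.0:
                counts[j] += 1
    return [c / rounds for c in counts]


# Synthetic toy data (not ADNI): only feature 0 determines the label.
data_rng = random.Random(1)
X = [[data_rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(200)]
y = [1 if row[0] > 0 else 0 for row in X]

freqs = stability_selection(X, y, lam=0.2)
stable = [j for j, f in enumerate(freqs) if f >= 0.8]  # hypothetical cutoff
print(freqs, stable)
```

Features whose selection frequency stays high across subsamples are treated as stable predictors. In practice the penalty and the cutoff would be tuned, and a library solver would normally replace the hand-written one.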


Subjects
Alzheimer Disease/diagnosis , Alzheimer Disease/etiology , Cognitive Dysfunction/complications , Cognitive Dysfunction/diagnosis , Decision Support Systems, Clinical , Diagnosis, Computer-Assisted/methods , Aged , Algorithms , Artificial Intelligence , Female , Humans , Male , Proportional Hazards Models , Reproducibility of Results , Sensitivity and Specificity
20.
IIE Trans ; 44(11): 915-931, 2012 Nov 01.
Article in English | MEDLINE | ID: mdl-24526804

ABSTRACT

Network models have been widely used in many domains to characterize the interactions between physical entities. A typical problem is to identify the networks of multiple related tasks that share some similarities. In this case, a transfer learning approach that can leverage the knowledge gained while modeling one task to better model another is highly desirable. In this paper, we propose a transfer learning approach that adopts a Bayesian hierarchical model framework to characterize task relatedness and additionally uses L1-regularization to ensure robust learning of the networks with limited sample sizes. A method based on the Expectation-Maximization (EM) algorithm is further developed to learn the networks from data. Simulation studies demonstrate the superiority of the proposed transfer learning approach over single-task learning, which learns the network of each task in isolation. The proposed approach is also applied to the identification of brain connectivity networks of Alzheimer's disease (AD) from functional magnetic resonance imaging (fMRI) data. The findings are consistent with the AD literature.
