Results 1 - 2 of 2
1.
BMC Med Res Methodol ; 23(1): 144, 2023 06 19.
Article in English | MEDLINE | ID: mdl-37337173

ABSTRACT

BACKGROUND: Machine learning tools such as random forests provide important opportunities for modeling large, complex modern data generated in medicine. Unfortunately, when it comes to understanding why machine learning models are predictive, applied research continues to rely on 'out of bag' (OOB) variable importance metrics (VIMPs) that are known to have considerable shortcomings within the statistics community. After explaining the limitations of OOB VIMPs - including bias towards correlated features and limited interpretability - we describe a modern approach called 'knockoff VIMPs' and explain its advantages.
METHODS: We first evaluate current VIMP practices through an in-depth literature review of 50 recent random forest manuscripts. Next, we recommend organized and interpretable strategies for analysis with knockoff VIMPs, including computing them for groups of features and considering multiple model performance metrics. To demonstrate methods, we develop a random forest to predict 5-year incident stroke in the Sleep Heart Health Study and compare results based on OOB and knockoff VIMPs.
RESULTS: Nearly all papers in the literature review contained substantial limitations in their use of VIMPs. In our demonstration, using OOB VIMPs for individual variables suggested two highly correlated lung function variables (forced expiratory volume, forced vital capacity) as the best predictors of incident stroke, followed by age and height. Using an organized analytic approach that considered knockoff VIMPs of both groups of features and individual features, the largest contributions to model sensitivity were medications (especially cardiovascular) and measured medical risk factors, while the largest contributions to model specificity were age, diastolic blood pressure, self-reported medical risk factors, polysomnography features, and pack-years of smoking. Thus, we reach very different conclusions about stroke risk factors using OOB VIMPs versus knockoff VIMPs.
CONCLUSIONS: The near-ubiquitous reliance on OOB VIMPs may provide misleading results for researchers who use such methods to guide their research. Given the rapid pace of scientific inquiry using machine learning, it is essential to bring modern knockoff VIMPs that are interpretable and unbiased into widespread applied practice to steer researchers using random forest machine learning toward more meaningful results.


Subjects
Random Forest, Stroke, Humans, Benchmarking, Machine Learning, Stroke/diagnosis, Stroke/epidemiology, Sleep
2.
Elife ; 2: e00426, 2013 Apr 09.
Article in English | MEDLINE | ID: mdl-23580255

ABSTRACT

Genetic and molecular approaches have been critical for elucidating the mechanism of the mammalian circadian clock. Here, we demonstrate that the ClockΔ19 mutant behavioral phenotype is significantly modified by mouse strain genetic background. We map a suppressor of the ClockΔ19 mutation to a ∼900 kb interval on mouse chromosome 1 and identify the transcription factor, Usf1, as the responsible gene. A SNP in the promoter of Usf1 causes elevation of its transcript and protein in strains that suppress the Clock mutant phenotype. USF1 competes with the CLOCK:BMAL1 complex for binding to E-box sites in target genes. Saturation binding experiments demonstrate reduced affinity of the CLOCKΔ19:BMAL1 complex for E-box sites, thereby permitting increased USF1 occupancy on a genome-wide basis. We propose that USF1 is an important modulator of molecular and behavioral circadian rhythms in mammals. DOI: http://dx.doi.org/10.7554/eLife.00426.001.


Subjects
ARNTL Transcription Factors/metabolism, CLOCK Proteins/metabolism, Circadian Clocks, Circadian Rhythm, DNA/metabolism, Mutation, Upstream Stimulatory Factors/metabolism, ARNTL Transcription Factors/genetics, Animals, Binding Sites, Competitive Binding, CLOCK Proteins/genetics, Circadian Clocks/genetics, Circadian Rhythm/genetics, E-Box Elements, Gene Expression Regulation, Genotype, Mice, Inbred BALB C Mice, Inbred C57BL Mice, Transgenic Mice, Phenotype, Single Nucleotide Polymorphism, Genetic Promoter Regions, Protein Interaction Domains and Motifs, Messenger RNA/metabolism, Signal Transduction, Species Specificity, Time Factors, Genetic Transcription, Transcriptional Activation, Upstream Stimulatory Factors/genetics