Results 1 - 16 of 16
1.
Molecules ; 28(11), 2023 May 27.
Article in English | MEDLINE | ID: mdl-37298865

ABSTRACT

A short peptide, FHHF-11, was designed to change stiffness as a function of pH through the changing degree of protonation of its histidines. Across the physiologically relevant pH range, G' was measured at 0 Pa (pH 6) and 50,000 Pa (pH 8). This peptide-based hydrogel is antimicrobial and cytocompatible with skin cells (fibroblasts). Incorporating the unnatural tryptophan analog AzAla was shown to improve the antimicrobial properties of the hydrogel. The material may have practical applications and could change the approach to wound treatment, with the potential to improve healing outcomes for millions of patients each year.


Subjects
Hydrogels , Skin , Humans , Hydrogels/pharmacology , Hydrogels/chemistry , Peptides/pharmacology , Anti-Bacterial Agents/chemistry , Hydrogen-Ion Concentration
2.
Cancers (Basel) ; 15(3), 2023 Jan 26.
Article in English | MEDLINE | ID: mdl-36765713

ABSTRACT

BACKGROUND: Chemotherapy-induced peripheral neuropathy (CIPN) is a common therapeutic complication affecting cancer patients' quality of life. We evaluated clinical characteristics, demographics, and lifestyle factors in association with CIPN following taxane treatment. METHODS: Data were extracted from the electronic health records of 3387 patients diagnosed with a primary cancer and receiving a taxane (i.e., paclitaxel or docetaxel) at Vanderbilt University Medical Center. Neuropathy was assessed via a validated computer algorithm. Univariate and multivariate regression models were applied to evaluate odds ratios (ORs) and 95% confidence intervals (CIs) of CIPN-associated factors. RESULTS: Female sex (OR = 1.28, 95% CI = 1.01-1.62), high body-mass index (BMI) (OR = 1.31, 95% CI = 1.06-1.61 for overweight, and OR = 1.49, 95% CI = 1.21-1.83 for obesity), diabetes (OR = 1.66, 95% CI = 1.34-2.06), high mean taxane dose (OR = 1.05, 95% CI = 1.03-1.08 per 10 mg/m²), and more treatment cycles (OR = 1.12, 95% CI = 1.10-1.14) were positively associated with CIPN. Concurrent chemotherapy (OR = 0.74, 95% CI = 0.58-0.94) and concurrent radiotherapy (OR = 0.77, 95% CI = 0.59-1.00) were inversely associated with CIPN. Obesity and diabetes both had a stronger association with docetaxel CIPN than with paclitaxel CIPN, although the interaction was significant only for diabetes and taxane (p = 0.019). Increased BMI was associated with CIPN only among non-diabetic patients (OR = 1.34 for overweight and 1.68 for obesity), while diabetes increased CIPN risk across all BMI strata (ORs were 2.65, 2.41, and 2.15 for normal weight, overweight, and obese, respectively) compared to normal-weight non-diabetic patients (p for interaction = 0.039). CONCLUSIONS: Female sex, obesity, and diabetes are significantly associated with taxane-induced CIPN. Further research is needed to identify clinical and pharmacologic strategies to prevent and mitigate CIPN in at-risk patient populations.
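
As a reading aid for the ORs and 95% CIs reported above, the following is a minimal sketch of how an unadjusted odds ratio and its Wald confidence interval can be computed from a 2x2 table. It is an illustration only: the study itself used univariate and multivariate regression models, and the function name `odds_ratio_ci` and its parameters are assumptions introduced here, not taken from the paper.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    Hypothetical helper, not the study's adjusted regression model.
    a: exposed with outcome,   b: exposed without outcome,
    c: unexposed with outcome, d: unexposed without outcome.
    z: normal quantile (1.96 gives an approximate 95% interval).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

An interval that excludes 1 corresponds to an association that is significant at roughly the 5% level, which is how the intervals in the abstract are typically read.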

3.
Patterns (N Y) ; 3(8): 100570, 2022 Aug 12.
Article in English | MEDLINE | ID: mdl-36033590

ABSTRACT

The All of Us Research Program seeks to engage at least one million diverse participants to advance precision medicine and improve human health. We describe here the cloud-based Researcher Workbench that uses a data passport model to democratize access to analytical tools and participant information including survey, physical measurement, and electronic health record (EHR) data. We also present validation study findings for several common complex diseases to demonstrate use of this novel platform in 315,000 participants, 78% of whom are from groups historically underrepresented in biomedical research, including 49% self-reporting non-White races. Replication findings include medication usage pattern differences by race in depression and type 2 diabetes, validation of known cancer associations with smoking, and calculation of cardiovascular risk scores by reported race effects. The cloud-based Researcher Workbench represents an important advance in enabling secure access for a broad range of researchers to this large resource and analytical tools.

4.
J Am Med Inform Assoc ; 29(7): 1131-1141, 2022 06 14.
Article in English | MEDLINE | ID: mdl-35396991

ABSTRACT

OBJECTIVE: A participant's medical history is important in clinical research and can be captured from electronic health records (EHRs) and self-reported surveys. Both can be incomplete: EHRs due to documentation gaps or lack of interoperability, and surveys due to recall bias or limited health literacy. This analysis compares medical history collected in the All of Us Research Program through both surveys and EHRs. MATERIALS AND METHODS: The All of Us medical history survey includes a self-report questionnaire that asks about diagnoses of over 150 medical conditions organized into 12 disease categories. In each category, we identified the 3 most and least frequent self-reported diagnoses and retrieved their analogues from EHRs. We calculated agreement scores and extracted participant demographic characteristics for each comparison set. RESULTS: The 4th All of Us dataset release includes data from 314,994 participants, 28.3% of whom completed medical history surveys and 65.5% of whom had EHR data. The hearing and vision category within the survey had the highest number of responses, but the second lowest positive agreement with the EHR (0.21). The infectious disease category had the lowest positive agreement (0.12). Cancer conditions had the highest positive agreement (0.45) between the 2 data sources. DISCUSSION AND CONCLUSION: Our study quantified the agreement of medical history between 2 sources: EHRs and self-reported surveys. Conditions that are usually undocumented in EHRs had low agreement scores, demonstrating that survey data can supplement EHR data. Disagreement between EHR and survey can help identify possible missing records and guide researchers to adjust for biases.


Subjects
Electronic Health Records , Population Health , Documentation , Humans , Information Storage and Retrieval , Surveys and Questionnaires
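
The abstract above reports positive agreement scores without giving a formula. The sketch below assumes one common definition, 2a / (2a + b + c), where a is the number of participants positive in both sources and b and c are the numbers positive in only one; whether the study used exactly this definition is an assumption, and the function name `positive_agreement` is hypothetical.

```python
def positive_agreement(survey_positive, ehr_positive):
    """Positive agreement between two sources, assumed as 2a / (2a + b + c).

    survey_positive, ehr_positive: iterables of participant identifiers
    flagged as having the condition in each source. Both are expected to be
    restricted to participants who have data in both sources.
    """
    survey_positive = set(survey_positive)
    ehr_positive = set(ehr_positive)
    both = len(survey_positive & ehr_positive)            # a
    total = len(survey_positive) + len(ehr_positive)      # 2a + b + c
    if total == 0:
        return float("nan")  # undefined when neither source reports the condition
    return 2 * both / total
```

Under this definition a score of 1 means every positive report appears in both sources, and a score near 0 means the two sources rarely flag the same participants.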
5.
Cancer Med ; 10(19): 6767-6776, 2021 10.
Article in English | MEDLINE | ID: mdl-34547180

ABSTRACT

BACKGROUND: Large interindividual variations have been reported in chemotherapy-induced toxicities. Little is known about whether racial disparities exist in neutropenia associated with taxanes. METHODS: Patients with a diagnosis of primary cancer who underwent chemotherapy with taxanes were identified from Vanderbilt University Medical Center's Synthetic Derivative. Multinomial regression models were applied to evaluate odds ratios (ORs) and 95% confidence intervals (CIs) of neutropenia associated with race, with adjustments for demographic variables, baseline neutrophil count, chemotherapy-related information, prior treatments, and cancer site. RESULTS: A total of 3492 patients were included in the study. Compared with White patients, grade 2 or higher neutropenia was more frequently recorded among Black patients who received taxanes overall (42.2% vs. 32.7%, p < 0.001) or paclitaxel (43.0% vs. 36.7%, p < 0.001) but not among those who received docetaxel (32.0% vs. 30.2%, p = 0.821). After adjustments for multiple covariates, Black patients who received chemotherapy with any taxane had a significantly higher risk of grade 2 (OR = 1.53; 95% CI = 1.09-2.14) and grade 3 (OR = 1.91; 95% CI = 1.36-2.67) neutropenia but a comparable risk of grade 4 neutropenia (OR = 1.19; 95% CI = 0.79-1.79). Similar association patterns were observed for Black patients who specifically received paclitaxel, but a null association was found for those treated with docetaxel. CONCLUSION: Black cancer patients treated with taxanes for any cancer had a higher risk of neutropenia compared with their White counterparts, especially those who received paclitaxel. More research is needed to understand the mechanism(s) underlying this racial disparity in order to enhance the delivery of patient-centered oncology.


Subjects
Healthcare Disparities/trends , Neoplasms/blood , Neutropenia/chemically induced , Taxoids/adverse effects , Aged , Female , Humans , Male , Middle Aged , Neoplasms/drug therapy , Neoplasms/pathology , Race Factors
6.
Kidney Int ; 99(4): 926-939, 2021 04.
Article in English | MEDLINE | ID: mdl-33137338

ABSTRACT

Rapid decline of glomerular filtration rate estimated from creatinine (eGFRcrea) is associated with severe clinical endpoints. In contrast to cross-sectionally assessed eGFRcrea, the genetic basis for rapid eGFRcrea decline is largely unknown. To help define this, we meta-analyzed 42 genome-wide association studies from the Chronic Kidney Diseases Genetics Consortium and United Kingdom Biobank to identify genetic loci for rapid eGFRcrea decline. Two definitions of eGFRcrea decline were used: 3 mL/min/1.73 m²/year or more ("Rapid3"; encompassing 34,874 cases, 107,090 controls) and eGFRcrea decline 25% or more and eGFRcrea under 60 mL/min/1.73 m² at follow-up among those with eGFRcrea 60 mL/min/1.73 m² or more at baseline ("CKDi25"; encompassing 19,901 cases, 175,244 controls). Seven independent variants were identified across six loci for Rapid3 and/or CKDi25: consisting of five variants at four loci with genome-wide significance (near UMOD-PDILT (2), PRKAG2, WDR72, OR2S2) and two variants among 265 known eGFRcrea variants (near GATM, LARP4B). All these loci were novel for Rapid3 and/or CKDi25 and our bioinformatic follow-up prioritized variants and genes underneath these loci. The OR2S2 locus is novel for any eGFRcrea trait including interesting candidates. For the five genome-wide significant lead variants, we found supporting effects for annual change in blood urea nitrogen or cystatin-based eGFR, but not for GATM or LARP4B. Individuals at high compared to those at low genetic risk (8-14 vs. 0-5 adverse alleles) had a 1.20-fold increased risk of acute kidney injury (95% confidence interval 1.08-1.33). Thus, our identified loci for rapid kidney function decline may help prioritize therapeutic targets and identify mechanisms and individuals at risk for sustained deterioration of kidney function.


Subjects
Genome-Wide Association Study , Kidney , AMP-Activated Protein Kinases , Creatinine , Glomerular Filtration Rate/genetics , Humans , Protein Disulfide-Isomerases , United Kingdom
7.
World J Surg ; 44(1): 84-94, 2020 01.
Article in English | MEDLINE | ID: mdl-31605180

ABSTRACT

BACKGROUND: The extent to which obesity and genetics determine postoperative complications is incompletely understood. METHODS: We performed a retrospective study using two population cohorts with electronic health record (EHR) data. The first included 736,726 adults with body mass index (BMI) recorded between 1990 and 2017 at Vanderbilt University Medical Center. The second cohort consisted of 65,174 individuals from 12 institutions contributing EHR and genome-wide genotyping data to the Electronic Medical Records and Genomics (eMERGE) Network. Pairwise logistic regression analyses were used to measure the association of BMI categories with postoperative complications derived from International Classification of Disease-9 codes, including postoperative infection, incisional hernia, and intestinal obstruction. A genetic risk score was constructed from 97 obesity-risk single-nucleotide polymorphisms for a Mendelian randomization study to determine the association of genetic risk of obesity with postoperative complications. Logistic regression analyses were adjusted for sex, age, site, and race/principal components. RESULTS: Individuals with overweight or obese BMI (≥25 kg/m²) had increased risk of incisional hernia (odds ratio [OR] 1.7-5.5, p < 3.1 × 10⁻²⁰), and people with obesity (BMI ≥ 30 kg/m²) had increased risk of postoperative infection (OR 1.2-2.3, p < 2.5 × 10⁻⁵). In the eMERGE cohort, genetically predicted BMI was associated with incisional hernia (OR 2.1 [95% CI 1.8-2.5], p = 1.4 × 10⁻⁶) and postoperative infection (OR 1.6 [95% CI 1.4-1.9], p = 3.1 × 10⁻⁶). Association findings were similar after limitation of the cohorts to those who underwent abdominal procedures. CONCLUSIONS: Clinical and Mendelian randomization studies suggest that obesity, as measured by BMI, is associated with the development of postoperative incisional hernia and infection.


Subjects
Mendelian Randomization Analysis/methods , Obesity/complications , Postoperative Complications/genetics , Adult , Body Mass Index , Female , Humans , Logistic Models , Male , Middle Aged , Polymorphism, Single Nucleotide , Postoperative Complications/etiology , Retrospective Studies , Risk Factors
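
The abstract above mentions a genetic risk score built from 97 obesity-risk single-nucleotide polymorphisms but does not say how it was weighted. A minimal sketch of the general idea follows, assuming the score is a sum of risk-allele counts, optionally weighted by a per-allele effect size. The function name, the SNP identifiers used with it, and the weighting scheme are all hypothetical and are not taken from the study.

```python
def genetic_risk_score(dosages, weights=None):
    """Sum of risk-allele dosages, optionally weighted per SNP.

    dosages: dict mapping a SNP identifier to the number of risk alleles
             carried (0, 1, or 2; imputed fractional dosages also work).
    weights: optional dict mapping the same identifiers to a per-allele
             effect size. When omitted, every SNP counts equally.
    """
    if weights is None:
        return float(sum(dosages.values()))
    missing = set(dosages) - set(weights)
    if missing:
        raise KeyError(f"no weight supplied for: {sorted(missing)}")
    return float(sum(weights[snp] * dose for snp, dose in dosages.items()))
```

In a Mendelian randomization setting, a score like this serves as the instrument for the exposure (here BMI), which is then related to the outcome.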
8.
Nat Genet ; 51(10): 1459-1474, 2019 10.
Article in English | MEDLINE | ID: mdl-31578528

ABSTRACT

Elevated serum urate levels cause gout and correlate with cardiometabolic diseases via poorly understood mechanisms. We performed a trans-ancestry genome-wide association study of serum urate in 457,690 individuals, identifying 183 loci (147 previously unknown) that improve the prediction of gout in an independent cohort of 334,880 individuals. Serum urate showed significant genetic correlations with many cardiometabolic traits, with genetic causality analyses supporting a substantial role for pleiotropy. Enrichment analysis, fine-mapping of urate-associated loci and colocalization with gene expression in 47 tissues implicated the kidney and liver as the main target organs and prioritized potentially causal genes and variants, including the transcriptional master regulators in the liver and kidney, HNF1A and HNF4A. Experimental validation showed that HNF4A transactivated the promoter of ABCG2, encoding a major urate transporter, in kidney cells, and that HNF4A p.Thr139Ile is a functional variant. Transcriptional coregulation within and across organs may be a general mechanism underlying the observed pleiotropy between urate and cardiometabolic traits.


Subjects
Cardiovascular Diseases/blood , Genetic Markers , Gout/blood , Metabolic Diseases/blood , Polymorphism, Single Nucleotide , Signal Transduction , Uric Acid/blood , ATP Binding Cassette Transporter, Subfamily G, Member 2/genetics , Cardiovascular Diseases/epidemiology , Cardiovascular Diseases/genetics , Cohort Studies , Genetic Loci , Genetic Predisposition to Disease , Genome-Wide Association Study , Gout/epidemiology , Gout/genetics , Hepatocyte Nuclear Factor 1-alpha/genetics , Hepatocyte Nuclear Factor 4/genetics , Humans , Kidney/metabolism , Kidney/pathology , Liver/metabolism , Liver/pathology , Metabolic Diseases/epidemiology , Metabolic Diseases/genetics , Neoplasm Proteins/genetics , Organ Specificity
9.
Genes Immun ; 20(7): 555-565, 2019 09.
Article in English | MEDLINE | ID: mdl-30459343

ABSTRACT

Resting-state white blood cell (WBC) count is a marker of inflammation and immune system health. There is evidence that WBC count is not fixed over time and that there is heterogeneity in WBC trajectory that is associated with morbidity and mortality. Latent class mixed modeling (LCMM) is a method that can identify unobserved heterogeneity in longitudinal data and attempts to classify individuals into groups based on a linear model of repeated measurements. We applied LCMM to repeated WBC count measures derived from electronic medical records of participants of the National Human Genome Research Institute (NHGRI) electronic MEdical Record and GEnomics (eMERGE) network study, revealing two WBC count trajectory phenotypes. Advancing these phenotypes to GWAS, we found genetic associations between trajectory class membership and regions on chromosome 1p34.3 and chromosome 11q13.4. The chromosome 1 region contains CSF3R, which encodes the granulocyte colony-stimulating factor receptor. This protein is a major factor in neutrophil stimulation and proliferation. The associated region on chromosome 11 contains the genes RNF169 and XRRA1, both involved in the regulation of double-strand break DNA repair.


Subjects
Leukocyte Count/methods , Leukocytes/classification , Adult , Aged , Databases, Genetic , Electronic Health Records , Female , Genome-Wide Association Study , Humans , Latent Class Analysis , Male , Middle Aged , Phenotype , Polymorphism, Single Nucleotide/genetics , Proteins/genetics , Receptors, Colony-Stimulating Factor/genetics , Ubiquitin-Protein Ligases/genetics
10.
J Am Med Inform Assoc ; 25(11): 1540-1546, 2018 11 01.
Article in English | MEDLINE | ID: mdl-30124903

ABSTRACT

Electronic health record (EHR) algorithms for defining patient cohorts are commonly shared as free-text descriptions that require human intervention both to interpret and implement. We developed the Phenotype Execution and Modeling Architecture (PhEMA, http://projectphema.org) to author and execute standardized computable phenotype algorithms. With PhEMA, we converted an algorithm for benign prostatic hyperplasia, developed for the electronic Medical Records and Genomics network (eMERGE), into a standards-based computable format. Eight sites (7 within eMERGE) received the computable algorithm, and 6 successfully executed it against local data warehouses and/or i2b2 instances. Blinded random chart review of cases selected by the computable algorithm shows PPV ≥90%, and 3 out of 5 sites had >90% overlap of selected cases when comparing the computable algorithm to their original eMERGE implementation. This case study demonstrates potential use of PhEMA computable representations to automate phenotyping across different EHR systems, but also highlights some ongoing challenges.


Subjects
Algorithms , Electronic Health Records , Phenotype , Prostatic Hyperplasia/diagnosis , Data Warehousing , Databases, Factual , Genomics , Humans , Male , Organizational Case Studies , Prostatic Hyperplasia/genetics
11.
J Am Soc Nephrol ; 28(3): 981-994, 2017 Mar.
Article in English | MEDLINE | ID: mdl-27920155

ABSTRACT

Genome-wide association studies have identified >50 common variants associated with kidney function, but these variants do not fully explain the variation in eGFR. We performed a two-stage meta-analysis of associations between genotypes from the Illumina exome array and eGFR on the basis of serum creatinine (eGFRcrea) among participants of European ancestry from the CKDGen Consortium (n_Stage1: 111,666; n_Stage2: 48,343). In single-variant analyses, we identified single nucleotide polymorphisms at seven new loci associated with eGFRcrea (PPM1J, EDEM3, ACP1, SPEG, EYA4, CYP1A1, and ATXN2L; P_Stage1 < 3.7×10⁻⁷), of which most were common and annotated as nonsynonymous variants. Gene-based analysis identified associations of functional rare variants in three genes with eGFRcrea, including a novel association with the SOS Ras/Rho guanine nucleotide exchange factor 2 gene, SOS2 (P = 5.4×10⁻⁸ by sequence kernel association test). Experimental follow-up in zebrafish embryos revealed changes in glomerular gene expression and renal tubule morphology in the embryonic kidney of acp1- and sos2-knockdowns. These developmental abnormalities were associated with altered blood clearance rate and heightened prevalence of edema. This study expands the number of loci associated with kidney function and identifies novel genes with potential roles in kidney formation.


Subjects
Exome/genetics , Glomerular Filtration Rate/genetics , Kidney/embryology , Protein Tyrosine Phosphatases/genetics , Proto-Oncogene Proteins/genetics , Son of Sevenless Proteins/genetics , Animals , Genetic Loci , Genome-Wide Association Study , Humans , Zebrafish
12.
Arthritis Rheumatol ; 69(3): 680-681, 2017 03.
Article in English | MEDLINE | ID: mdl-27860419
13.
Arthritis Rheumatol ; 69(2): 291-300, 2017 Feb.
Article in English | MEDLINE | ID: mdl-27589350

ABSTRACT

OBJECTIVE: The differences between seronegative and seropositive rheumatoid arthritis (RA) have not been widely reported. We performed electronic health record (EHR)-based phenome-wide association studies (PheWAS) to identify disease associations in seropositive and seronegative RA. METHODS: A validated algorithm identified RA subjects from the de-identified version of the Vanderbilt University Medical Center EHR. Serotypes were determined by rheumatoid factor (RF) and anti-cyclic citrullinated peptide antibody (ACPA) values. We tested EHR-derived phenotypes using PheWAS comparing seropositive RA and seronegative RA, yielding disease associations. PheWAS was also performed in RF-positive versus RF-negative subjects and ACPA-positive versus ACPA-negative subjects. Following PheWAS, select phenotypes were then manually reviewed, and fibromyalgia was specifically evaluated using a validated algorithm. RESULTS: A total of 2,199 RA individuals with either RF or ACPA testing were identified. Of these, 1,382 patients (63%) were classified as seropositive. Seronegative RA was associated with myalgia and myositis (odds ratio [OR] 2.1, P = 3.7 × 10⁻¹⁰) and back pain. A manual review of the health record showed that among subjects coded for Myalgia and Myositis, ~80% had fibromyalgia. Follow-up with a specific EHR algorithm for fibromyalgia confirmed that seronegative RA was associated with fibromyalgia (OR 1.8, P = 4.0 × 10⁻⁶). Seropositive RA was associated with chronic airway obstruction (OR 2.2, P = 1.4 × 10⁻⁴) and tobacco use (OR 2.2, P = 7.0 × 10⁻⁴). CONCLUSION: This PheWAS of RA patients identifies a strong association between seronegativity and fibromyalgia. It also affirms relationships between seropositivity and chronic airway obstruction and between seropositivity and tobacco use. These findings demonstrate the utility of the PheWAS approach to discover novel phenotype associations within different subgroups of a disease.


Subjects
Arthritis, Rheumatoid/complications , Arthritis, Rheumatoid/genetics , Fibromyalgia/complications , Fibromyalgia/genetics , Arthritis, Rheumatoid/blood , Female , Genetic Association Studies , Humans , Male , Middle Aged , Phenotype , Serologic Tests
14.
Bioinformatics ; 30(16): 2375-6, 2014 Aug 15.
Article in English | MEDLINE | ID: mdl-24733291

ABSTRACT

Phenome-wide association studies (PheWAS) have been used to replicate known genetic associations and discover new phenotype associations for genetic variants. This PheWAS implementation allows users to translate ICD-9 codes to PheWAS case and control groups, perform analyses using these and/or other phenotypes with covariate adjustments, and plot the results. We demonstrate the methods by replicating a PheWAS on rs3135388 (near HLA-DRB, associated with multiple sclerosis) and performing a novel PheWAS using an individual's maximum white blood cell count (WBC) as a continuous measure. Our results for rs3135388 replicate known associations with more significant results than the original study on the same dataset. Our PheWAS of WBC found expected results, including associations with infections, myeloproliferative diseases and associated conditions, such as anemia. These results demonstrate the performance of the improved classification scheme and the flexibility of PheWAS encapsulated in this package. AVAILABILITY AND IMPLEMENTATION: This R package is freely available under the GNU General Public License (GPL-3) from http://phewascatalog.org. It is implemented in native R and is platform independent.


Subjects
Genetic Association Studies/methods , Genetic Variation , Phenotype , Software , Data Interpretation, Statistical , Humans , Multiple Sclerosis/genetics
15.
Nature ; 506(7488): 376-81, 2014 Feb 20.
Article in English | MEDLINE | ID: mdl-24390342

ABSTRACT

A major challenge in human genetics is to devise a systematic strategy to integrate disease-associated variants with diverse genomic and biological data sets to provide insight into disease pathogenesis and guide drug discovery for complex traits such as rheumatoid arthritis (RA). Here we performed a genome-wide association study meta-analysis in a total of >100,000 subjects of European and Asian ancestries (29,880 RA cases and 73,758 controls), by evaluating ~10 million single-nucleotide polymorphisms. We discovered 42 novel RA risk loci at a genome-wide level of significance, bringing the total to 101 (refs 2-4). We devised an in silico pipeline using established bioinformatics methods based on functional annotation, cis-acting expression quantitative trait loci and pathway analyses, as well as novel methods based on genetic overlap with human primary immunodeficiency, haematological cancer somatic mutations and knockout mouse phenotypes, to identify 98 biological candidate genes at these 101 risk loci. We demonstrate that these genes are the targets of approved therapies for RA, and further suggest that drugs approved for other indications may be repurposed for the treatment of RA. Together, this comprehensive genetic study sheds light on fundamental genes, pathways and cell types that contribute to RA pathogenesis, and provides empirical evidence that the genetics of RA can provide important information for drug discovery.


Subjects
Arthritis, Rheumatoid/drug therapy , Arthritis, Rheumatoid/genetics , Drug Discovery , Genetic Predisposition to Disease/genetics , Molecular Targeted Therapy , Alleles , Animals , Arthritis, Rheumatoid/metabolism , Arthritis, Rheumatoid/pathology , Asian People/genetics , Case-Control Studies , Computational Biology , Drug Repositioning , Female , Genome-Wide Association Study , Hematologic Neoplasms/genetics , Hematologic Neoplasms/metabolism , Humans , Male , Mice , Mice, Knockout , Polymorphism, Single Nucleotide/genetics , White People/genetics
16.
J Am Med Inform Assoc ; 20(e2): e253-9, 2013 Dec.
Article in English | MEDLINE | ID: mdl-23851443

ABSTRACT

OBJECTIVES: Generalizable, high-throughput phenotyping methods based on supervised machine learning (ML) algorithms could significantly accelerate the use of electronic health records data for clinical and translational research. However, they often require large numbers of annotated samples, which are costly and time-consuming to review. We investigated the use of active learning (AL) in ML-based phenotyping algorithms. METHODS: We integrated an uncertainty sampling AL approach with support vector machines-based phenotyping algorithms and evaluated its performance using three annotated disease cohorts including rheumatoid arthritis (RA), colorectal cancer (CRC), and venous thromboembolism (VTE). We investigated performance using two types of feature sets: unrefined features, which contained at least all clinical concepts extracted from notes and billing codes; and a smaller set of refined features selected by domain experts. The performance of the AL was compared with a passive learning (PL) approach based on random sampling. RESULTS: Our evaluation showed that AL outperformed PL on three phenotyping tasks. When unrefined features were used in the RA and CRC tasks, AL reduced the number of annotated samples required to achieve an area under the curve (AUC) score of 0.95 by 68% and 23%, respectively. AL also achieved a reduction of 68% for VTE with an optimal AUC of 0.70 using refined features. As expected, refined features improved the performance of phenotyping classifiers and required fewer annotated samples. CONCLUSIONS: This study demonstrated that AL can be useful in ML-based phenotyping methods. Moreover, AL and feature engineering based on domain knowledge could be combined to develop efficient and generalizable phenotyping methods.


Subjects
Algorithms , Artificial Intelligence , Electronic Health Records , Phenotype , Genetic Association Studies , Humans , Support Vector Machine
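
The uncertainty sampling step described in the abstract above can be sketched as follows. With a support vector machine, a common choice is to query the unlabeled samples whose decision values lie closest to the separating boundary. This is an illustration under that assumption: the function name `select_most_uncertain` is hypothetical, and the study's exact selection rule and batch size are not stated in the abstract.

```python
def select_most_uncertain(decision_values, k):
    """Pick the k unlabeled samples closest to the decision boundary.

    decision_values: signed distances from a trained classifier, one per
                     unlabeled sample (for an SVM, the decision function
                     output). Values near 0 mean the model is least certain.
    k: number of samples to send for manual annotation in this round.
    Returns the indices of the chosen samples, most uncertain first.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    ranked = sorted(range(len(decision_values)),
                    key=lambda i: abs(decision_values[i]))
    return ranked[:k]
```

In an active learning loop, the selected samples are annotated, added to the training set, and the classifier is retrained before the next round. Passive learning would draw the same number of samples at random instead.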