Search | VHL (BVS) Search Portal

1.

Real-World Effectiveness of BNT162b2 Against Infection and Severe Diseases in Children and Adolescents.

Wu, Qiong; Tong, Jiayi; Zhang, Bingyu; Zhang, Dazheng; Chen, Jiajie; Lei, Yuqing; Lu, Yiwen; Wang, Yudong; Li, Lu; Shen, Yishan; Xu, Jie; Bailey, L Charles; Bian, Jiang; Christakis, Dimitri A; Fitzgerald, Megan L; Hirabayashi, Kathryn; Jhaveri, Ravi; Khaitan, Alka; Lyu, Tianchen; Rao, Suchitra; Razzaghi, Hanieh; Schwenk, Hayden T; Wang, Fei; Gage Witvliet, Margot I; Tchetgen Tchetgen, Eric J; Morris, Jeffrey S; Forrest, Christopher B; Chen, Yong.

Ann Intern Med ; 177(2): 165-176, 2024 02.

Article in English | MEDLINE | ID: mdl-38190711

ABSTRACT

BACKGROUND: The efficacy of the BNT162b2 vaccine in pediatrics was assessed by randomized trials before the Omicron variant's emergence. The long-term durability of vaccine protection in this population during the Omicron period remains limited. OBJECTIVE: To assess the effectiveness of BNT162b2 in preventing infection and severe diseases with various strains of the SARS-CoV-2 virus in previously uninfected children and adolescents. DESIGN: Comparative effectiveness research accounting for underreported vaccination in 3 study cohorts: adolescents (12 to 20 years) during the Delta phase and children (5 to 11 years) and adolescents (12 to 20 years) during the Omicron phase. SETTING: A national collaboration of pediatric health systems (PEDSnet). PARTICIPANTS: 77 392 adolescents (45 007 vaccinated) during the Delta phase and 111 539 children (50 398 vaccinated) and 56 080 adolescents (21 180 vaccinated) during the Omicron phase. INTERVENTION: First dose of the BNT162b2 vaccine versus no receipt of COVID-19 vaccine. MEASUREMENTS: Outcomes of interest include documented infection, COVID-19 illness severity, admission to an intensive care unit (ICU), and cardiac complications. The effectiveness was reported as (1-relative risk)*100, with confounders balanced via propensity score stratification. RESULTS: During the Delta period, the estimated effectiveness of the BNT162b2 vaccine was 98.4% (95% CI, 98.1% to 98.7%) against documented infection among adolescents, with no statistically significant waning after receipt of the first dose. An analysis of cardiac complications did not suggest a statistically significant difference between vaccinated and unvaccinated groups. During the Omicron period, the effectiveness against documented infection among children was estimated to be 74.3% (CI, 72.2% to 76.2%). Higher levels of effectiveness were seen against moderate or severe COVID-19 (75.5% [CI, 69.0% to 81.0%]) and ICU admission with COVID-19 (84.9% [CI, 64.8% to 93.5%]). 
Among adolescents, the effectiveness against documented Omicron infection was 85.5% (CI, 83.8% to 87.1%), with 84.8% (CI, 77.3% to 89.9%) against moderate or severe COVID-19, and 91.5% (CI, 69.5% to 97.6%) against ICU admission with COVID-19. The effectiveness of the BNT162b2 vaccine against the Omicron variant declined 4 months after the first dose and then stabilized. The analysis showed a lower risk for cardiac complications in the vaccinated group during the Omicron variant period. LIMITATION: Observational study design and potentially undocumented infection. CONCLUSION: This study suggests that BNT162b2 was effective for various COVID-19-related outcomes in children and adolescents during the Delta and Omicron periods, and there is some evidence of waning effectiveness over time. PRIMARY FUNDING SOURCE: National Institutes of Health.
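The effectiveness measure defined in the abstract, (1 - relative risk) * 100, can be sketched as follows. This is a minimal illustration, not the study's analysis: the function names and the event counts are hypothetical, and the crude relative risk below ignores the propensity score stratification the authors used. Inverting the definition, the reported 74.3% effectiveness corresponds to a relative risk of about 0.257.

```python
def effectiveness_from_rr(relative_risk: float) -> float:
    """Effectiveness in percent, defined as (1 - relative risk) * 100."""
    return (1.0 - relative_risk) * 100.0


def rr_from_effectiveness(effectiveness_pct: float) -> float:
    """Invert the definition: relative risk implied by an effectiveness in percent."""
    return 1.0 - effectiveness_pct / 100.0


def crude_relative_risk(events_vax: int, n_vax: int,
                        events_unvax: int, n_unvax: int) -> float:
    """Risk among the vaccinated divided by risk among the unvaccinated (no adjustment)."""
    return (events_vax / n_vax) / (events_unvax / n_unvax)


# Hypothetical counts for illustration only; they are not taken from the study.
rr = crude_relative_risk(events_vax=30, n_vax=1000, events_unvax=120, n_unvax=1000)
print(round(effectiveness_from_rr(rr), 1))     # effectiveness for the toy counts
print(round(rr_from_effectiveness(74.3), 3))   # relative risk implied by 74.3%
```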

Subjects

BNT162 Vaccine , COVID-19 , United States , Humans , Adolescent , Child , COVID-19 Vaccines , COVID-19/prevention & control , Comparative Effectiveness Research , Hospitalization

2.

Early prediction of Alzheimer's disease and related dementias using real-world electronic health records.

Li, Qian; Yang, Xi; Xu, Jie; Guo, Yi; He, Xing; Hu, Hui; Lyu, Tianchen; Marra, David; Miller, Amber; Smith, Glenn; DeKosky, Steven; Boyce, Richard D; Schliep, Karen; Shenkman, Elizabeth; Maraganore, Demetrius; Wu, Yonghui; Bian, Jiang.

Alzheimers Dement ; 19(8): 3506-3518, 2023 08.

Article in English | MEDLINE | ID: mdl-36815661

ABSTRACT

INTRODUCTION: This study aims to explore machine learning (ML) methods for early prediction of Alzheimer's disease (AD) and related dementias (ADRD) using the real-world electronic health records (EHRs). METHODS: A total of 23,835 ADRD and 1,038,643 control patients were identified from the OneFlorida+ Research Consortium. Two ML methods were used to develop the prediction models. Both knowledge-driven and data-driven approaches were explored. Four computable phenotyping algorithms were tested. RESULTS: The gradient boosting tree (GBT) models trained with the data-driven approach achieved the best area under the curve (AUC) scores of 0.939, 0.906, 0.884, and 0.854 for early prediction of ADRD 0, 1, 3, or 5 years before diagnosis, respectively. A number of important clinical and sociodemographic factors were identified. DISCUSSION: We tested various settings and showed the predictive ability of using ML approaches for early prediction of ADRD with EHRs. The models can help identify high-risk individuals for early informed preventive or prognostic clinical decisions.
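The AUC scores reported above can be read as the probability that a randomly chosen ADRD case receives a higher predicted risk than a randomly chosen control. The sketch below computes that quantity directly from pairwise comparisons. It is an illustration of the metric only: the scores and labels are made up, and the study's gradient boosting models and evaluation code are not reproduced here.

```python
def auc(scores, labels):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical predicted risks (1 = case, 0 = control), for illustration only.
scores = [0.9, 0.8, 0.4, 0.35, 0.1]
labels = [1, 0, 1, 0, 0]
print(auc(scores, labels))
```

The pairwise loop is quadratic in the number of patients, so for cohorts of the size described in the abstract a rank-based implementation would be preferable.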

Subjects

Alzheimer Disease , Humans , Alzheimer Disease/diagnosis , Alzheimer Disease/epidemiology , Electronic Health Records , Prognosis , Machine Learning , Algorithms

3.

A study of deep learning methods for de-identification of clinical notes in cross-institute settings.

Yang, Xi; Lyu, Tianchen; Li, Qian; Lee, Chih-Yin; Bian, Jiang; Hogan, William R; Wu, Yonghui.

BMC Med Inform Decis Mak ; 19(Suppl 5): 232, 2019 12 05.

Article in English | MEDLINE | ID: mdl-31801524

ABSTRACT

BACKGROUND: De-identification is a critical technology to facilitate the use of unstructured clinical text while protecting patient privacy and confidentiality. The clinical natural language processing (NLP) community has invested great effort in developing methods and corpora for de-identification of clinical notes. These annotated corpora are valuable resources for developing automated systems to de-identify clinical text at local hospitals. However, existing studies often used training and test data collected from the same institution. Few studies have explored automated de-identification in cross-institute settings. The goal of this study is to examine deep learning-based de-identification methods in a cross-institute setting, identify the bottlenecks, and provide potential solutions. METHODS: We created a de-identification corpus using a total of 500 clinical notes from the University of Florida (UF) Health, developed deep learning-based de-identification models using the 2014 i2b2/UTHealth corpus, and evaluated the performance using the UF corpus. We compared five different word embeddings trained on general English text, clinical text, and biomedical literature, explored lexical and linguistic features, and compared two strategies to customize the deep learning models using UF notes and resources. RESULTS: Pre-trained word embeddings using a general English corpus achieved better performance than embeddings from de-identified clinical text and biomedical literature. The performance of deep learning models trained using only the i2b2 corpus dropped significantly (strict and relaxed F1 scores dropped from 0.9547 and 0.9646 to 0.8568 and 0.8958) when applied to another corpus annotated at UF Health. Linguistic features could further improve the performance of de-identification in cross-institute settings. After customizing the models using UF notes and resources, the best model achieved strict and relaxed F1 scores of 0.9288 and 0.9584, respectively. 
CONCLUSIONS: It is necessary to customize de-identification models using local clinical text and other resources when applied in cross-institute settings. Fine-tuning is a potential solution to re-use pre-trained parameters and reduce the training time to customize deep learning-based de-identification models trained using clinical corpus from a different institution.
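The difference between the strict and relaxed F1 scores reported above can be sketched with a small span-matching example. This is an assumption-laden illustration: here "strict" means identical boundaries and type, and "relaxed" means any character overlap with the same type, which may differ from the exact matching criteria used in the study. The spans and type names are hypothetical.

```python
def spans_match(gold, pred, relaxed):
    """Spans are (start, end, type) with end exclusive."""
    if gold[2] != pred[2]:
        return False
    if relaxed:
        # Assumed relaxed criterion: any character overlap.
        return gold[0] < pred[1] and pred[0] < gold[1]
    # Strict criterion: boundaries must be identical.
    return gold[0] == pred[0] and gold[1] == pred[1]


def span_f1(gold_spans, pred_spans, relaxed=False):
    """F1 from precision (predicted spans matched) and recall (gold spans matched)."""
    matched_pred = sum(any(spans_match(g, p, relaxed) for g in gold_spans)
                       for p in pred_spans)
    matched_gold = sum(any(spans_match(g, p, relaxed) for p in pred_spans)
                       for g in gold_spans)
    precision = matched_pred / len(pred_spans) if pred_spans else 0.0
    recall = matched_gold / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations: the DATE prediction has a boundary error,
# the PHONE entity is missed, and one NAME prediction is spurious.
gold = [(0, 10, "NAME"), (20, 30, "DATE"), (40, 45, "PHONE")]
pred = [(0, 10, "NAME"), (22, 30, "DATE"), (50, 55, "NAME")]
print(span_f1(gold, pred, relaxed=False))
print(span_f1(gold, pred, relaxed=True))
```

Under these assumptions the boundary error counts as a miss for the strict score but as a hit for the relaxed score, which is why the relaxed figure is never lower than the strict one.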

Subjects

Data Anonymization , Deep Learning , Confidentiality , Electronic Health Records , Humans , Linguistics , Natural Language Processing

4.

Developing a computable phenotype for glioblastoma.

Yan, Sandra; Melnick, Kaitlyn; He, Xing; Lyu, Tianchen; Moor, Rachel S F; Still, Megan E H; Mitchell, Duane A; Shenkman, Elizabeth A; Wang, Han; Guo, Yi; Bian, Jiang; Ghiaseddin, Ashley P.

Neuro Oncol ; 26(6): 1163-1170, 2024 Jun 03.

Article in English | MEDLINE | ID: mdl-38141226

ABSTRACT

BACKGROUND: Glioblastoma is the most common malignant brain tumor, and thus it is important to be able to identify patients with this diagnosis for population studies. However, this can be challenging as diagnostic codes are nonspecific. The aim of this study was to create a computable phenotype (CP) for glioblastoma multiforme (GBM) from structured and unstructured data to identify patients with this condition in a large electronic health record (EHR). METHODS: We used the University of Florida (UF) Health Integrated Data Repository, a centralized clinical data warehouse that stores clinical and research data from various sources within the UF Health system, including the EHR system. We performed multiple iterations to refine the GBM-relevant diagnosis codes, procedure codes, medication codes, and keywords through manual chart review of patient data. We then evaluated the performance of the candidate CPs constructed from the relevant codes and keywords. RESULTS: We conducted six rounds of manual chart review to refine the CP elements. The final CP algorithm for identifying GBM patients was selected based on the best F1-score. Overall, the CP rule "if the patient had at least 1 relevant diagnosis code and at least 1 relevant keyword" demonstrated the highest F1-score using both structured and unstructured data. Thus, it was selected as the best-performing CP rule. CONCLUSIONS: We developed and validated a CP algorithm for identifying patients with GBM using both structured and unstructured EHR data from a large tertiary care center. The final algorithm achieved an F1-score of 0.817, indicating high performance, which minimizes possible biases from misclassification errors.
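The selected rule, "at least 1 relevant diagnosis code and at least 1 relevant keyword", and its evaluation by F1-score against chart review can be sketched as below. Everything concrete in this example is hypothetical: the code identifiers, the keyword list, the record layout, and the four toy patients are placeholders, not the study's refined code and keyword sets.

```python
from dataclasses import dataclass

# Placeholder code and keyword lists (assumptions, not the study's lists).
RELEVANT_CODES = {"DX-A"}
RELEVANT_KEYWORDS = ("glioblastoma", "gbm")


@dataclass
class Patient:
    diagnosis_codes: set
    note_text: str
    chart_review_gbm: bool  # gold-standard label from manual chart review


def cp_rule(patient: Patient) -> bool:
    """True if the patient has >= 1 relevant code and >= 1 relevant keyword."""
    has_code = len(patient.diagnosis_codes & RELEVANT_CODES) >= 1
    text = patient.note_text.lower()
    has_keyword = any(keyword in text for keyword in RELEVANT_KEYWORDS)
    return has_code and has_keyword


def f1_score(patients) -> float:
    """F1 of the rule against the chart-review labels."""
    tp = sum(cp_rule(p) and p.chart_review_gbm for p in patients)
    fp = sum(cp_rule(p) and not p.chart_review_gbm for p in patients)
    fn = sum((not cp_rule(p)) and p.chart_review_gbm for p in patients)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy cohort for illustration only.
cohort = [
    Patient({"DX-A"}, "Pathology consistent with glioblastoma.", True),
    Patient({"DX-A"}, "Brain mass, biopsy pending.", False),
    Patient({"DX-B"}, "History of GBM, status post resection.", True),
    Patient({"DX-A"}, "Rule out glioblastoma; final diagnosis meningioma.", False),
]
print(f1_score(cohort))
```

The fourth toy patient shows why plain keyword matching can produce false positives: a negated or ruled-out mention still satisfies the rule, which is one reason iterative chart review is needed to refine the elements.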

Subjects

Brain Neoplasms , Electronic Health Records , Glioblastoma , Phenotype , Humans , Glioblastoma/pathology , Glioblastoma/diagnosis , Brain Neoplasms/pathology , Brain Neoplasms/diagnosis , Algorithms , Female

5.

Develop and validate a computable phenotype for the identification of Alzheimer's disease patients using electronic health record data.

He, Xing; Wei, Ruoqi; Huang, Yu; Chen, Zhaoyi; Lyu, Tianchen; Bost, Sarah; Tong, Jiayi; Li, Lu; Zhou, Yujia; Li, Zhao; Guo, Jingchuan; Tang, Huilin; Wang, Fei; DeKosky, Steven; Xu, Hua; Chen, Yong; Zhang, Rui; Xu, Jie; Guo, Yi; Wu, Yonghui; Bian, Jiang.

Alzheimers Dement (Amst) ; 16(3): e12613, 2024.

Article in English | MEDLINE | ID: mdl-38966622

ABSTRACT

INTRODUCTION: Alzheimer's disease (AD) is often misclassified in electronic health records (EHRs) when relying solely on diagnosis codes. This study aimed to develop a more accurate, computable phenotype (CP) for identifying AD patients using structured and unstructured EHR data. METHODS: We used EHRs from the University of Florida Health (UFHealth) system and created rule-based CPs iteratively through manual chart reviews. The CPs were then validated using data from the University of Texas Health Science Center at Houston (UTHealth) and the University of Minnesota (UMN). RESULTS: Our best-performing CP was "patient has at least 2 AD diagnoses and AD-related keywords in AD encounters," with an F1-score of 0.817 at UF, 0.961 at UTHealth, and 0.623 at UMN, respectively. DISCUSSION: We developed and validated rule-based CPs for AD identification with good performance, which will be crucial for studies that aim to use real-world data like EHRs. Highlights: Developed a computable phenotype (CP) to identify Alzheimer's disease (AD) patients using EHR data. Utilized both structured and unstructured EHR data to enhance CP accuracy. Achieved a high F1-score of 0.817 at UFHealth, and 0.961 and 0.623 at UTHealth and UMN. Validated the CP across different demographics, ensuring robustness and fairness.

6.

Develop and Validate a Computable Phenotype for the Identification of Alzheimer's Disease Patients Using Electronic Health Record Data.

He, Xing; Wei, Ruoqi; Huang, Yu; Chen, Zhaoyi; Lyu, Tianchen; Bost, Sarah; Tong, Jiayi; Li, Lu; Zhou, Yujia; Guo, Jingchuan; Tang, Huilin; Wang, Fei; DeKosky, Steven; Xu, Hua; Chen, Yong; Zhang, Rui; Xu, Jie; Guo, Yi; Wu, Yonghui; Bian, Jiang.

medRxiv ; 2024 Feb 06.

Article in English | MEDLINE | ID: mdl-38370766

ABSTRACT

INTRODUCTION: Alzheimer's disease (AD) is often misclassified in electronic health records (EHRs) when relying solely on diagnostic codes. This study aims to develop a more accurate, computable phenotype (CP) for identifying AD patients by using both structured and unstructured EHR data. METHODS: We used EHRs from the University of Florida Health (UF Health) system and created rule-based CPs iteratively through manual chart reviews. The CPs were then validated using data from the University of Texas Health Science Center at Houston (UT Health) and the University of Minnesota (UMN). RESULTS: Our best-performing CP is "patient has at least 2 AD diagnoses and AD-related keywords," with an F1-score of 0.817 at UF, and 0.961 and 0.623 at UT Health and UMN, respectively. DISCUSSION: We developed and validated rule-based CPs for AD identification with good performance, which is crucial for studies that aim to use real-world data like EHRs.

7.

The role of health system penetration rate in estimating the prevalence of type 1 diabetes in children and adolescents using electronic health records.

Li, Piaopiao; Lyu, Tianchen; Alkhuzam, Khalid; Spector, Eliot; Donahoo, William T; Bost, Sarah; Wu, Yonghui; Hogan, William R; Prosperi, Mattia; Schatz, Desmond A; Atkinson, Mark A; Haller, Michael J; Shenkman, Elizabeth A; Guo, Yi; Bian, Jiang; Shao, Hui.

J Am Med Inform Assoc ; 31(1): 165-173, 2023 12 22.

Article in English | MEDLINE | ID: mdl-37812771

ABSTRACT

OBJECTIVE: Having sufficient population coverage from electronic health record (EHR)-connected health systems is essential for building a comprehensive EHR-based diabetes surveillance system. This study aimed to establish an EHR-based type 1 diabetes (T1D) surveillance system for children and adolescents across racial and ethnic groups by identifying the minimum population coverage from EHR-connected health systems needed to accurately estimate T1D prevalence. MATERIALS AND METHODS: We conducted a retrospective, cross-sectional analysis involving children and adolescents <20 years old identified from the OneFlorida+ Clinical Research Network (2018-2020). T1D cases were identified using a previously validated computable phenotyping algorithm. The T1D prevalence for each ZIP Code Tabulation Area (ZCTA, 5 digits), defined as the number of T1D cases divided by the total number of residents in the corresponding ZCTA, was calculated. Population coverage for each ZCTA was measured using the observed health system penetration rate (HSPR), calculated as the ratio of residents in the corresponding ZCTA captured by OneFlorida+ to the overall population in the same ZCTA reported by the Census. We used a recursive partitioning algorithm to identify the minimum observed HSPR required to estimate T1D prevalence and compared our estimate with the reported T1D prevalence from the SEARCH study. RESULTS: Observed HSPRs of 55%, 55%, and 60% were identified as the minimum thresholds for the non-Hispanic White, non-Hispanic Black, and Hispanic populations, respectively. The estimated T1D prevalence for the non-Hispanic White and non-Hispanic Black populations was 2.87 and 2.29 per 1000 youth, respectively, which is comparable to the reference study's estimates. The estimated prevalence of T1D for Hispanics (2.76 per 1000 youth) was higher than the reference study's estimate (1.48-1.64 per 1000 youth). The standardized T1D prevalence in the overall Florida population was 2.81 per 1000 youth in 2019. 
CONCLUSION: Our study provides a method to estimate T1D prevalence in children and adolescents using EHRs and reports the estimated HSPRs and prevalence of T1D for different race and ethnicity groups to facilitate EHR-based diabetes surveillance.
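The two ratios defined in the methods, the observed HSPR and the per-ZCTA prevalence, can be sketched as follows. This is an illustration under stated assumptions: the ZCTA identifiers and counts are invented, the Census population is assumed as the prevalence denominator, the 0.55 threshold is simply taken from the reported results rather than derived by recursive partitioning, and pooling the eligible ZCTAs is a simplification of the study's estimation.

```python
def observed_hspr(captured_residents: int, census_population: int) -> float:
    """Residents captured by the network divided by the Census population of the ZCTA."""
    return captured_residents / census_population


def prevalence_per_1000(cases: int, residents: int) -> float:
    """Cases per 1000 residents."""
    return 1000 * cases / residents


# Hypothetical ZCTA-level counts, for illustration only.
zctas = [
    {"zcta": "A", "captured": 1200, "census": 2000, "t1d_cases": 6},
    {"zcta": "B", "captured": 300, "census": 1500, "t1d_cases": 1},
    {"zcta": "C", "captured": 1100, "census": 2000, "t1d_cases": 5},
]

MIN_HSPR = 0.55  # threshold reported for two of the three groups; used here as an input

# Keep only ZCTAs where the network covers enough of the population.
eligible = [z for z in zctas
            if observed_hspr(z["captured"], z["census"]) >= MIN_HSPR]

# Pooled prevalence across eligible ZCTAs (a simplification).
pooled = prevalence_per_1000(sum(z["t1d_cases"] for z in eligible),
                             sum(z["census"] for z in eligible))
print([z["zcta"] for z in eligible], pooled)
```

ZCTA "B" is excluded because only 20% of its residents are captured, so its observed case count would understate the true prevalence.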

Subjects

Diabetes Mellitus, Type 1 , Child , Humans , Adolescent , Young Adult , Adult , Diabetes Mellitus, Type 1/epidemiology , Prevalence , Electronic Health Records , Cross-Sectional Studies , Retrospective Studies

8.

Real-world Effectiveness of BNT162b2 Against Infection and Severe Diseases in Children and Adolescents.

Wu, Qiong; Tong, Jiayi; Zhang, Bingyu; Zhang, Dazheng; Chen, Jiajie; Lei, Yuqing; Lu, Yiwen; Wang, Yudong; Li, Lu; Shen, Yishan; Xu, Jie; Bailey, L Charles; Bian, Jiang; Christakis, Dimitri A; Fitzgerald, Megan L; Hirabayashi, Kathryn; Jhaveri, Ravi; Khaitan, Alka; Lyu, Tianchen; Rao, Suchitra; Razzaghi, Hanieh; Schwenk, Hayden T; Wang, Fei; Witvliet, Margot I; Tchetgen, Eric J Tchetgen; Morris, Jeffrey S; Forrest, Christopher B; Chen, Yong.

medRxiv ; 2023 Nov 13.

Article in English | MEDLINE | ID: mdl-38014095

ABSTRACT

Background: The efficacy of the BNT162b2 vaccine in pediatrics was assessed by randomized trials before the Omicron variant's emergence. The long-term durability of vaccine protection in this population during the Omicron period remains limited. Objective: To assess the effectiveness of BNT162b2 in preventing infection and severe diseases with various strains of the SARS-CoV-2 virus in previously uninfected children and adolescents. Design: Comparative effectiveness research accounting for underreported vaccination in three study cohorts: adolescents (12 to 20 years) during the Delta phase, children (5 to 11 years) and adolescents (12 to 20 years) during the Omicron phase. Setting: A national collaboration of pediatric health systems (PEDSnet). Participants: 77,392 adolescents (45,007 vaccinated) in the Delta phase, 111,539 children (50,398 vaccinated) and 56,080 adolescents (21,180 vaccinated) in the Omicron period. Exposures: First dose of the BNT162b2 vaccine vs. no receipt of COVID-19 vaccine. Measurements: Outcomes of interest include documented infection, COVID-19 illness severity, admission to an intensive care unit (ICU), and cardiac complications. The effectiveness was reported as (1-relative risk)*100% with confounders balanced via propensity score stratification. Results: During the Delta period, the estimated effectiveness of BNT162b2 vaccine was 98.4% (95% CI, 98.1 to 98.7) against documented infection among adolescents, with no significant waning after receipt of the first dose. An analysis of cardiac complications did not find an increased risk after vaccination. During the Omicron period, the effectiveness against documented infection among children was estimated to be 74.3% (95% CI, 72.2 to 76.2). Higher levels of effectiveness were observed against moderate or severe COVID-19 (75.5%, 95% CI, 69.0 to 81.0) and ICU admission with COVID-19 (84.9%, 95% CI, 64.8 to 93.5). 
Among adolescents, the effectiveness against documented Omicron infection was 85.5% (95% CI, 83.8 to 87.1), with 84.8% (95% CI, 77.3 to 89.9) against moderate or severe COVID-19, and 91.5% (95% CI, 69.5 to 97.6) against ICU admission with COVID-19. The effectiveness of the BNT162b2 vaccine against the Omicron variant declined after 4 months following the first dose and then stabilized. The analysis revealed a lower risk of cardiac complications in the vaccinated group during the Omicron variant period. Limitations: Observational study design and potentially undocumented infection. Conclusions: Our study suggests that BNT162b2 was effective for various COVID-19-related outcomes in children and adolescents during the Delta and Omicron periods, and there is some evidence of waning effectiveness over time. Primary Funding Source: National Institutes of Health.

9.

Environmental effects on acute exacerbations of respiratory diseases: A real-world big data study.

Fishe, Jennifer; Zheng, Yi; Lyu, Tianchen; Bian, Jiang; Hu, Hui.

Sci Total Environ ; 806(Pt 1): 150352, 2022 Feb 01.

Article in English | MEDLINE | ID: mdl-34555607

ABSTRACT

BACKGROUND: The effects of weather periods, race/ethnicity, and sex on environmental triggers for respiratory exacerbations are not well understood. This study linked the OneFlorida network (~15 million patients) with an external exposome database to analyze environmental triggers for asthma, bronchitis, and COPD exacerbations while accounting for seasonality, sex, and race/ethnicity. METHODS: This is a case-crossover study of the OneFlorida database from 2012 to 2017 examining associations of asthma, bronchitis, and COPD exacerbations with exposures to heat index, PM2.5, and O3. We spatiotemporally linked exposures using patients' residential addresses to generate average exposures during hazard and control periods, with each case serving as its own control. We considered age, sex, race/ethnicity, and neighborhood deprivation index as potential effect modifiers in conditional logistic regression models. RESULTS: A total of 1,148,506 exacerbations among 533,446 patients were included. Across all three conditions, hotter heat indices conferred increasing exacerbation odds, except during November to March, when the opposite was seen. There were significant differences when stratified by race/ethnicity (e.g., for asthma in April, May, and October, heat index quartile 4, odds were 1.49 (95% confidence interval (CI) 1.42-1.57) for Non-Hispanic Blacks and 2.04 (95% CI 1.92-2.17) for Hispanics compared to 1.27 (95% CI 1.19-1.36) for Non-Hispanic Whites). Pediatric patients' odds of asthma and bronchitis exacerbations were significantly lower than those of adults in certain circumstances (e.g., for asthma during June to September, pediatric odds 0.71 (95% CI 0.68-0.74) and adult odds 0.82 (95% CI 0.79-0.85) for the highest quartile of PM2.5). CONCLUSION: This study of acute exacerbations of asthma, bronchitis, and COPD found that exacerbation risk after exposure to heat index, PM2.5, and O3 varies by weather period, age, and race/ethnicity. 
Future work can build upon these results to alert vulnerable populations to exacerbation triggers.

Subjects

Asthma , Pulmonary Disease, Chronic Obstructive , Respiration Disorders , Adult , Asthma/epidemiology , Big Data , Child , Cross-Over Studies , Humans

10.

A scoping review of semantic integration of health data and information.

Zhang, Hansi; Lyu, Tianchen; Yin, Pengfei; Bost, Sarah; He, Xing; Guo, Yi; Prosperi, Mattia; Hogan, William R; Bian, Jiang.

Int J Med Inform ; 165: 104834, 2022 09.

Article in English | MEDLINE | ID: mdl-35863206

ABSTRACT

OBJECTIVE: We summarized a decade of new research focusing on semantic data integration (SDI) since 2009, and we aim to: (1) summarize the state-of-the-art approaches to integrating health data and information; and (2) identify the main gaps and challenges of integrating health data and information from multiple levels and domains. MATERIALS AND METHODS: We used PubMed, as our focus is applications of SDI in biomedical domains, and followed the Preferred Reporting Items for Systematic Review and Meta-Analyses (PRISMA) to search for and report relevant studies published between January 1, 2009 and December 31, 2021. We used Covidence, a systematic review management system, to carry out this scoping review. RESULTS: The initial search from PubMed resulted in 5,326 articles using the two sets of keywords. We then removed 44 duplicates, and 5,282 articles were retained for abstract screening. After abstract screening, we included 246 articles for full-text screening, among which 87 articles were deemed eligible for full-text extraction. We summarized the 87 articles from four aspects: (1) methods for the global schema; (2) data integration strategies (i.e., federated system vs. data warehousing); (3) the sources of the data; and (4) downstream applications. CONCLUSION: The SDI approach can effectively resolve semantic heterogeneities across different data sources. We identified two key gaps and challenges in existing SDI studies: (1) many of the existing SDI studies used data from only single-level data sources (e.g., integrating individual-level patient records from different hospital systems), and (2) documentation of the data integration processes is sparse, threatening the reproducibility of SDI studies.

Subjects

Information Storage and Retrieval , Semantics , Humans , Mass Screening , Reproducibility of Results

11.

A Preliminary Study of Extracting Pulmonary Nodules and Nodule Characteristics from Radiology Reports Using Natural Language Processing.

Yang, Shuang; Yang, Xi; Lyu, Tianchen; He, Xing; Braithwaite, Dejana; Mehta, Hiren J; Guo, Yi; Wu, Yonghui; Bian, Jiang.

IEEE Int Conf Healthc Inform ; 2022: 618-619, 2022 Jun.

Article in English | MEDLINE | ID: mdl-36168559

ABSTRACT

This study aims to develop a natural language processing (NLP) tool to extract the pulmonary nodules and nodule characteristics information from free-text clinical narratives. We identified a cohort of 3,080 patients who received low dose computed tomography (LDCT) at the University of Florida health system and collected their clinical narratives including radiology reports in their electronic health records (EHRs). Then, we manually annotated 394 reports as the gold-standard corpus and explored three state-of-the-art transformer-based NLP methods. The best model achieved an F1-score of 0.9279.

12.

Longitudinal association of maternal dietary patterns with antenatal depression: Evidence from the Chinese Pregnant Women Cohort Study.

Zhan, Yongle; Zhao, Yafen; Qu, Yimin; Yue, Hexin; Shi, Yingjie; Chen, Yunli; Liu, Xuan; Liu, Ruiyi; Lyu, Tianchen; Jing, Ao; Meng, Yaohan; Huang, Junfang; Jiang, Yu.

J Affect Disord ; 308: 587-595, 2022 07 01.

Article in English | MEDLINE | ID: mdl-35427717

ABSTRACT

BACKGROUND: Evidence from cohort studies on the longitudinal associations between maternal dietary patterns and antenatal depression (AD) across the entire gestation period is limited. METHODS: Data came from the Chinese Pregnant Women Cohort Study. The qualitative food frequency questionnaire (Q-FFQ) and Edinburgh Postnatal Depression Scale (EPDS) were used to collect diet and depression data. Dietary patterns were derived by using factor analysis. Generalized estimating equation models were used to analyze the association between diet and AD. RESULTS: A total of 4139 participants who completed all 3 waves of follow-up were included. Four constant dietary patterns were identified, namely plant-based, animal-protein, vitamin-rich, and oily-fatty patterns. The prevalence of depression was 23.89%, 21.12%, and 22.42% for the first, second, and third trimesters, respectively. There were inverse associations of the plant-based pattern (OR:0.85, 95%CI:0.75-0.97), animal-protein pattern (OR:0.85, 95%CI:0.74-0.99), and vitamin-rich pattern (OR:0.58, 95%CI:0.50-0.67) with AD, and a positive association between the oily-fatty pattern and AD (OR:1.47, 95%CI:1.29-1.68). Except for the plant-based pattern, the patterns had linear trend relationships with AD (Ptrend < 0.05). Moreover, a 1-SD increase in vitamin-rich pattern scores was associated with a 20% lower AD risk (OR:0.80, 95%CI:0.76-0.84), while a 1-SD increase in oily-fatty pattern scores was associated with a 19% higher risk (OR:1.19, 95%CI:1.13-1.24). Interactions between dietary patterns and lifestyle habits were observed. LIMITATIONS: The self-reported Q-FFQ and EPDS may cause recall bias. CONCLUSIONS: There are longitudinal associations between maternal dietary patterns and antenatal depression. Our findings are expected to provide evidence for a dietary therapy strategy to improve or prevent depression during pregnancy.
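The "20% lower" and "19% higher" figures follow directly from the reported odds ratios via (OR - 1) * 100, as the short sketch below shows. The function name is hypothetical. Note that this conversion describes a change in odds; reading it as a change in risk is an approximation that holds best for rare outcomes, and the prevalence of depression reported here is above 20%.

```python
def percent_change_in_odds(odds_ratio: float) -> float:
    """Percent change in odds implied by an odds ratio: (OR - 1) * 100."""
    return (odds_ratio - 1.0) * 100.0


# Odds ratios per 1-SD increase in pattern score, as reported in the abstract.
reported = {"vitamin-rich": 0.80, "oily-fatty": 1.19}

for pattern, odds_ratio in reported.items():
    change = round(percent_change_in_odds(odds_ratio))
    direction = "lower" if change < 0 else "higher"
    print(f"{pattern}: {abs(change)}% {direction} odds per 1-SD increase")
```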

Subjects

Depression , Pregnant Women , Animals , China/epidemiology , Cohort Studies , Depression/epidemiology , Diet , Female , Humans , Pregnancy , Vitamins

13.

The application of artificial intelligence and data integration in COVID-19 studies: a scoping review.

Guo, Yi; Zhang, Yahan; Lyu, Tianchen; Prosperi, Mattia; Wang, Fei; Xu, Hua; Bian, Jiang.

J Am Med Inform Assoc ; 28(9): 2050-2067, 2021 08 13.

Article in English | MEDLINE | ID: mdl-34151987

ABSTRACT

OBJECTIVE: To summarize how artificial intelligence (AI) is being applied in COVID-19 research and determine whether these AI applications integrated heterogenous data from different sources for modeling. MATERIALS AND METHODS: We searched 2 major COVID-19 literature databases, the National Institutes of Health's LitCovid and the World Health Organization's COVID-19 database on March 9, 2021. Following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guideline, 2 reviewers independently reviewed all the articles in 2 rounds of screening. RESULTS: In the 794 studies included in the final qualitative analysis, we identified 7 key COVID-19 research areas in which AI was applied, including disease forecasting, medical imaging-based diagnosis and prognosis, early detection and prognosis (non-imaging), drug repurposing and early drug discovery, social media data analysis, genomic, transcriptomic, and proteomic data analysis, and other COVID-19 research topics. We also found that there was a lack of heterogenous data integration in these AI applications. DISCUSSION: Risk factors relevant to COVID-19 outcomes exist in heterogeneous data sources, including electronic health records, surveillance systems, sociodemographic datasets, and many more. However, most AI applications in COVID-19 research adopted a single-sourced approach that could omit important risk factors and thus lead to biased algorithms. Integrating heterogeneous data for modeling will help realize the full potential of AI algorithms, improve precision, and reduce bias. CONCLUSION: There is a lack of data integration in the AI applications in COVID-19 research and a need for a multilevel AI framework that supports the analysis of heterogeneous data from different sources.

Subjects

Artificial Intelligence , Biomedical Research/trends , COVID-19 , Algorithms , Databases as Topic , Humans , National Institutes of Health (U.S.) , Proteomics , United States , World Health Organization

14.

Causal AI with Real World Data: Do Statins Protect from Alzheimer's Disease Onset?

Prosperi, Mattia; Salemi, Marco; Ghosh, Shantanu; Lyu, Tianchen; Bian, Jiang; Chen, Zhaoyi; Zhao, Jinying.

ICMHI 2021 (2021) ; 2021: 296-303, 2021 May.

Article in English | MEDLINE | ID: mdl-37954527

ABSTRACT

Causal artificial intelligence aims at developing bias-robust models that can be used to intervene on, rather than just be predictive of, risks or outcomes. However, learning interventional models from observational data, including electronic health records (EHR), is challenging due to inherent bias, e.g., protopathic, confounding, collider. When estimating the effects of treatment interventions, classical approaches like propensity score matching are often used, but they pose limitations with large feature sets, nonlinear/nonparallel treatment group assignments, and collider bias. In this work, we used data from a large EHR consortium, OneFlorida, and evaluated causal statistical/machine learning methods for determining the effect of statin treatment on the risk of Alzheimer's disease, a debated clinical research question. We introduced a combination of directed acyclic graph (DAG) learning and comparison with an expert's design, with calculation of the generalized adjustment criterion (GAC), to find an optimal set of covariates for estimation of treatment effects, ameliorating collider bias. The DAG/GAC approach was assessed together with traditional propensity score matching, inverse probability weighting, virtual-twin/counterfactual random forests, and deep counterfactual networks. We showed large heterogeneity in effect estimates across different model configurations. Our results did not exclude a protective effect of statins, where the DAG/GAC point estimate aligned with the maximum credibility estimate, although the 95% credibility interval included a null effect, warranting further studies and replication.

15.

Cohort profile: the Chinese Pregnant Women Cohort Study and Offspring Follow-up (CPWCSaOF).

Lyu, Tianchen; Chen, Yunli; Zhan, Yongle; Shi, Yingjie; Yue, Hexin; Liu, Xuan; Meng, Yaohan; Jing, Ao; Qu, Yimin; Ma, Haihui; Huang, Ping; Man, Dongmei; Li, Xiaoxiu; Wu, Hongguo; Zhao, Jian; Shan, Guangliang; Jiang, Yu.

BMJ Open ; 11(3): e044933, 2021 03 23.

Article in English | MEDLINE | ID: mdl-33757952

ABSTRACT

PURPOSE: A multicentre prospective cohort study, known as the Chinese Pregnant Women Cohort Study (CPWCS), was established in 2017 to collect exposure data during pregnancy (except environmental exposure) and analyse the relationship between lifestyle during pregnancy and obstetric outcomes. Data about mothers and their children's life and health as well as children's laboratory testing will be collected during the offspring follow-up of CPWCS, which will enable us to further investigate the longitudinal relationship between exposure in different periods (during pregnancy and childhood) and children's development. PARTICIPANTS: 9193 pregnant women in 24 hospitals in China who were in their first trimester (5-13 weeks gestational age) from 25 July 2017 to 26 November 2018 were included in CPWCS by convenience sampling. Five hospitals in China which participated in CPWCS with good cooperation will be selected as the sample source for the Chinese Pregnant Women Cohort Study (Offspring Follow-up) (CPWCS-OF). FINDINGS TO DATE: Some factors affecting pregnancy outcomes and health problems during pregnancy have been discovered through data analysis. The details are discussed in the 'Findings to date' section. FUTURE PLANS: Infants and children and their mothers who meet the criteria will be enrolled in the study and will be followed up every 2 years. The longitudinal relationship between exposure (questionnaire data, physical examination and biospecimens, medical records, and objective environmental data collected through geographical information system and remote sensing technology) in different periods (during pregnancy and childhood) and children's health (such as sleeping problems, oral health, bowel health and allergy-related health problems) will be analysed. TRIAL REGISTRATION NUMBER: CPWCS was registered with ClinicalTrials.gov on 18 January 2018: NCT03403543. CPWCS-OF was registered with ClinicalTrials.gov on 24 June 2020: NCT04444791.

Subjects

Pregnant Women , Child , China/epidemiology , Cohort Studies , Female , Follow-Up Studies , Humans , Infant , Pregnancy , Prospective Studies

16.

Examination of Early CNS Symptoms and Severe Coronavirus Disease 2019: A Multicenter Observational Case Series.

Marra, David E; Busl, Katharina M; Robinson, Christopher P; Bruzzone, Maria J; Miller, Amber H; Chen, Zhaoyi; Guo, Yi; Lyu, Tianchen; Bian, Jiang; Smith, Glenn E.

Crit Care Explor ; 3(6): e0456, 2021 Jun.

Article in English | MEDLINE | ID: mdl-34136827

ABSTRACT

OBJECTIVES: To determine if early CNS symptoms are associated with severe coronavirus disease 2019. DESIGN: A retrospective, observational case series study design. SETTING: Electronic health records were reviewed for patients from five healthcare systems across the state of Florida, United States. PATIENTS: A clinical sample (n = 36,615) of patients with confirmed diagnosis of coronavirus disease 2019 were included. Twelve percent (n = 4,417) of the sample developed severe coronavirus disease 2019, defined as requiring critical care, mechanical ventilation, or diagnosis of acute respiratory distress syndrome, sepsis, or severe inflammatory response syndrome. INTERVENTIONS: None. MEASUREMENT AND MAIN RESULTS: We reviewed the electronic health record for diagnosis of early CNS symptoms (encephalopathy, headache, ageusia, anosmia, dizziness, acute cerebrovascular disease) between 14 days before the diagnosis of coronavirus disease 2019 and 8 days after the diagnosis of coronavirus disease 2019, or before the date of severe coronavirus disease 2019 diagnosis, whichever came first. Hierarchical logistic regression models were used to examine the odds of developing severe coronavirus disease 2019 based on diagnosis of early CNS symptoms. Severe coronavirus disease 2019 patients were significantly more likely to have early CNS symptoms (32.8%) compared with nonsevere patients (6.11%; χ2[1] = 3,266.08, p < 0.0001, φ = 0.29). After adjusting for demographic variables and pertinent comorbidities, early CNS symptoms were significantly associated with severe coronavirus disease 2019 (odds ratio = 3.21). Diagnosis of encephalopathy (odds ratio = 14.38) was associated with greater odds of severe coronavirus disease 2019, whereas diagnoses of anosmia (odds ratio = 0.45), ageusia (odds ratio = 0.46), and headache (odds ratio = 0.63) were associated with reduced odds of severe coronavirus disease 2019. CONCLUSIONS: Early CNS symptoms, and specifically encephalopathy, are differentially associated with risk of severe coronavirus disease 2019 and may serve as an early marker for differences in clinical disease course. Therapies for early coronavirus disease 2019 are scarce, and further identification of subgroups at risk may help to advance understanding of the severity trajectories and enable focused treatment.

17.

Leverage Real-world Longitudinal Data in Large Clinical Research Networks for Alzheimer's Disease and Related Dementia (ADRD).

Duan, Rui; Chen, Zhaoyi; Tong, Jiayi; Luo, Chongliang; Lyu, Tianchen; Tao, Cui; Maraganore, Demetrius; Bian, Jiang; Chen, Yong.

AMIA Annu Symp Proc ; 2020: 393-401, 2020.

Article in English | MEDLINE | ID: mdl-33936412

ABSTRACT

With vast amounts of patients' medical information, electronic health records (EHRs) are becoming one of the most important data sources in biomedical and health care research. Effectively integrating data from multiple clinical sites can help provide more generalized real-world evidence that is clinically meaningful. To analyze the clinical data from multiple sites, distributed algorithms are developed to protect patient privacy without sharing individual-level medical information. In this paper, we applied the One-shot Distributed Algorithm for Cox proportional hazard model (ODAC) to the longitudinal data from the OneFlorida Clinical Research Consortium to demonstrate the feasibility of implementing the distributed algorithms in large research networks. We studied the associations between the clinical risk factors and Alzheimer's disease and related dementia (ADRD) onsets to advance our understanding of the complex risk factors of ADRD and ultimately improve the care of ADRD patients.

Subjects

Algorithms , Alzheimer Disease , Dementia , Electronic Health Records , Humans , Proportional Hazards Models , Risk Factors

18.

A Natural Language Processing Tool to Extract Quantitative Smoking Status from Clinical Narratives.

Yang, Xi; Yang, Hanyuan; Lyu, Tianchen; Yang, Shuang; Guo, Yi; Bian, Jiang; Xu, Hua; Wu, Yonghui.

IEEE Int Conf Healthc Inform ; 2020.

Article in English | MEDLINE | ID: mdl-33786419

ABSTRACT

This study presents a natural language processing (NLP) tool to extract quantitative smoking information (e.g., Pack-Year, Quit Year, Smoking Year, and Pack per Day) from clinical notes and standardize it into Pack-Year units. We annotated a corpus of 200 clinical notes from patients who had low-dose CT imaging procedures for lung cancer screening and developed an NLP system using a two-layer rule-engine structure. We divided the 200 notes into a training set and a test set and developed the NLP system only using the training set. The experimental results on the test set showed that our NLP system achieved the best F1 scores of 0.963 and 0.946 for lenient and strict evaluation, respectively.
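The standardization step mentioned in this abstract can be illustrated with the usual pack-year definition (packs smoked per day multiplied by years of smoking). The sketch below is not the tool described in the paper: the function names are hypothetical, and the 20-cigarettes-per-pack convention is an assumption of this example.

```python
CIGARETTES_PER_PACK = 20  # conventional pack size; an assumption of this sketch


def pack_years(packs_per_day: float, smoking_years: float) -> float:
    """Pack-years = packs smoked per day x number of years smoked."""
    if packs_per_day < 0 or smoking_years < 0:
        raise ValueError("inputs must be non-negative")
    return packs_per_day * smoking_years


def pack_years_from_cigarettes(cigarettes_per_day: float,
                               smoking_years: float) -> float:
    """Convert a cigarettes-per-day mention to packs per day first."""
    return pack_years(cigarettes_per_day / CIGARETTES_PER_PACK, smoking_years)


# Hypothetical extracted values: 1.5 packs per day for 20 years.
example = pack_years(1.5, 20)
```

Expressing every mention in one unit lets values extracted in different forms (packs per day plus duration, or cigarettes per day plus duration) be compared directly.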

19.

A Natural Language Processing Tool to Extract Quantitative Smoking Status from Clinical Narratives.

Yang, Xi; Yang, Hanyuan; Lyu, Tianchen; Yang, Shuang; Guo, Yi; Bian, Jiang; Xu, Hua; Wu, Yonghui.

medRxiv ; 2020 Nov 05.

Article in English | MEDLINE | ID: mdl-33173920

ABSTRACT

This study presents a natural language processing (NLP) tool to extract quantitative smoking information (e.g., Pack-Year, Quit Year, Smoking Year, and Pack per Day) from clinical notes and standardize it into Pack-Year units. We annotated a corpus of 200 clinical notes from patients who had low-dose CT imaging procedures for lung cancer screening and developed an NLP system using a two-layer rule-engine structure. We divided the 200 notes into a training set and a test set and developed the NLP system only using the training set. The experimental results on the test set showed that our NLP system achieved the best F1 scores of 0.963 and 0.946 for lenient and strict evaluation, respectively. NOTE: Accepted as a presentation at the 2020 IEEE International Conference on Healthcare Informatics (ICHI) Workshop on Health Natural Language Processing (HealthNLP 2020). https://ohnlp.github.io/HealthNLP2020/healthnlp2020# .

20.

Developing and Validating a Computable Phenotype for the Identification of Transgender and Gender Nonconforming Individuals and Subgroups.

Guo, Yi; He, Xing; Lyu, Tianchen; Zhang, Hansi; Wu, Yonghui; Yang, Xi; Chen, Zhaoyi; Markham, Merry Jennifer; Modave, François; Xie, Mengjun; Hogan, William; Harle, Christopher A; Shenkman, Elizabeth A; Bian, Jiang.

AMIA Annu Symp Proc ; 2020: 514-523, 2020.

Article in English | MEDLINE | ID: mdl-33936425

ABSTRACT

Transgender and gender nonconforming (TGNC) individuals face significant marginalization, stigma, and discrimination. Under-reporting of TGNC individuals is common since they are often unwilling to self-identify. Meanwhile, the rapid adoption of electronic health record (EHR) systems has made large-scale, longitudinal real-world clinical data available to research and provided a unique opportunity to identify TGNC individuals using their EHRs, contributing to a promising routine health surveillance approach. Built upon existing work, we developed and validated a computable phenotype (CP) algorithm for identifying TGNC individuals and their natal sex (i.e., male-to-female or female-to-male) using both structured EHR data and unstructured clinical notes. Our CP algorithm achieved a 0.955 F1-score on the training data and a perfect F1-score on the independent testing data. Consistent with the literature, we observed an increasing percentage of TGNC individuals and a disproportionate burden of adverse health outcomes, especially sexually transmitted infections and mental health distress, in this population.

Subjects

Algorithms , Decision Support Techniques , Electronic Health Records , Gender Identity , Sexual and Gender Minorities/psychology , Transgender Persons/psychology , Adolescent , Adult , Aged , Aged, 80 and over , Child , Child, Preschool , Female , Hormone Replacement Therapy/methods , Humans , Infant , Male , Middle Aged , Phenotype , Reproducibility of Results , Sex Reassignment Procedures , Young Adult
