Results 1 - 5 of 5
1.
J Biomed Inform ; 70: 35-51, 2017 06.
Article in English | MEDLINE | ID: mdl-28410982

ABSTRACT

In data-driven phenotyping, a core computational task is to identify medical concepts and their variations from sources of electronic health records (EHR) to stratify phenotypic cohorts. A conventional analytic framework for phenotyping largely uses a manual knowledge engineering approach or a supervised learning approach where clinical cases are represented by variables encompassing diagnoses, medicinal treatments and laboratory tests, among others. In such a framework, tasks associated with feature engineering and data annotation remain a tedious and expensive exercise, resulting in poor scalability. In addition, certain clinical conditions, such as those that are rare and acute in nature, may never accumulate sufficient data over time, which poses a challenge to establishing accurate and informative statistical models. In this paper, we use infectious diseases as the domain of study to demonstrate a hierarchical learning method based on ensemble learning that attempts to address these issues through feature abstraction. We use a sparse annotation set to train and evaluate many phenotypes at once, which we call bulk learning. In this batch-phenotyping framework, disease cohort definitions can be learned from within the abstract feature space established by using multiple diseases as a substrate and diagnostic codes as surrogates. In particular, using surrogate labels for model training renders possible its subsequent evaluation using only a sparse annotated sample. Moreover, statistical models can be trained and evaluated, using the same sparse annotation, from within the abstract feature space of low dimensionality that encapsulates the shared clinical traits of these target diseases, collectively referred to as the bulk learning set.


Subject(s)
Electronic Health Records , Supervised Machine Learning , Data Curation , Humans , Models, Statistical , Phenotype
2.
J Am Med Inform Assoc ; 30(2): 256-272, 2023 01 18.
Article in English | MEDLINE | ID: mdl-36255273

ABSTRACT

OBJECTIVE: To identify and characterize clinical subgroups of hospitalized Coronavirus Disease 2019 (COVID-19) patients. MATERIALS AND METHODS: Electronic health records of hospitalized COVID-19 patients at NewYork-Presbyterian/Columbia University Irving Medical Center were temporally sequenced and transformed into patient vector representations using Paragraph Vector models. K-means clustering was performed to identify subgroups. RESULTS: A diverse cohort of 11 313 patients with COVID-19 and hospitalizations between March 2, 2020 and December 1, 2021 were identified; median [IQR] age: 61.2 [40.3-74.3]; 51.5% female. Twenty subgroups of hospitalized COVID-19 patients, labeled by increasing severity, were characterized by their demographics, conditions, outcomes, and severity (mild-moderate/severe/critical). Subgroup temporal patterns were characterized by the durations in each subgroup, transitions between subgroups, and the complete paths throughout the course of hospitalization. DISCUSSION: Several subgroups had mild-moderate severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections but were hospitalized for underlying conditions (pregnancy, cardiovascular disease [CVD], etc.). Subgroup 7 included solid organ transplant recipients who mostly developed mild-moderate or severe disease. Subgroup 9 had a history of type-2 diabetes, kidney and CVD, and suffered the highest rates of heart failure (45.2%) and end-stage renal disease (80.6%). Subgroup 13 was the oldest (median: 82.7 years) and had mixed severity but high mortality (33.3%). Subgroup 17 had critical disease and the highest mortality (64.6%), with age (median: 68.1 years) being the only notable risk factor. Subgroups 18-20 had critical disease with high complication rates and long hospitalizations (median: 40+ days). All subgroups are detailed in the full text. 
A chord diagram depicts the most common transitions, and paths with the highest prevalence, longest hospitalizations, lowest and highest mortalities are presented. Understanding these subgroups and their pathways may aid clinicians in their decisions for better management and earlier intervention for patients.
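The clustering step named in the abstract (K-means over patient vectors) can be illustrated with a minimal sketch. It assumes the patient vectors already exist; the Paragraph Vector embedding is not shown, and the toy 2-D vectors, the function name `kmeans`, and its parameters are hypothetical. This is plain Lloyd's algorithm, not the authors' implementation.

```python
import random


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm: returns (assignments, centroids)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct input vectors.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assignments = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid
        # (squared Euclidean distance).
        assignments = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])),
            )
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


# Hypothetical toy "patient vectors": two well-separated groups.
toy_vectors = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(toy_vectors, k=2)
```

In practice the vectors would have many more dimensions and the number of clusters (twenty in the abstract) would be chosen by the analyst; the sketch only shows the assign-then-update loop.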


Subject(s)
COVID-19 , Cardiovascular Diseases , Humans , Female , Middle Aged , Aged , Male , SARS-CoV-2 , Electronic Health Records , Hospitalization
3.
Sci Rep ; 11(1): 11212, 2021 05 27.
Article in English | MEDLINE | ID: mdl-34045491

ABSTRACT

Prediabetes and diabetes mellitus (preDM/DM) have become alarmingly prevalent among youth in recent years. However, simple questionnaire-based screening tools to reliably assess diabetes risk are only available for adults, not youth. As a first step in developing such a tool, we used a large-scale dataset from the National Health and Nutrition Examination Survey (NHANES) to examine the performance of a published pediatric clinical screening guideline in identifying youth with preDM/DM based on American Diabetes Association diagnostic biomarkers. We assessed the agreement between the clinical guideline and biomarker criteria using established evaluation measures (sensitivity, specificity, positive/negative predictive value, F-measure for the positive/negative preDM/DM classes, and Kappa). We also compared the performance of the guideline to those of machine learning (ML) based preDM/DM classifiers derived from the NHANES dataset. Approximately 29% of the 2858 youth in our study population had preDM/DM based on biomarker criteria. The clinical guideline had a sensitivity of 43.1% and specificity of 67.6%, positive/negative predictive values of 35.2%/74.5%, positive/negative F-measures of 38.8%/70.9%, and Kappa of 0.1 (95% CI: 0.06-0.14). The performance of the guideline varied across demographic subgroups. Some ML-based classifiers performed comparably to or better than the screening guideline, especially in identifying preDM/DM youth (p = 5.23 × 10⁻⁵). We demonstrated that a recommended pediatric clinical screening guideline did not perform well in identifying preDM/DM status among youth. Additional work is needed to develop a simple yet accurate screener for youth diabetes risk, potentially by using advanced ML methods and a wider range of clinical and behavioral health data.
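The evaluation measures listed in the abstract all derive from a 2×2 table of screener result against reference standard. The sketch below shows the standard formulas; the function name `screening_metrics` and the counts are hypothetical and chosen for easy arithmetic, not taken from the study.

```python
def screening_metrics(tp, fp, fn, tn):
    """Agreement measures between a binary screener and a reference standard."""
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)  # true positives among reference positives
    specificity = tn / (tn + fp)  # true negatives among reference negatives
    ppv = tp / (tp + fp)          # positive predictive value
    npv = tn / (tn + fn)          # negative predictive value
    # F-measure per class: harmonic mean of that class's precision and recall.
    f_pos = 2 * ppv * sensitivity / (ppv + sensitivity)
    f_neg = 2 * npv * specificity / (npv + specificity)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f_pos": f_pos,
        "f_neg": f_neg,
        "kappa": kappa,
    }


# Illustrative counts only (hypothetical, not the NHANES data).
m = screening_metrics(tp=40, fp=30, fn=60, tn=170)
```

With these toy counts, sensitivity is 0.40 and specificity is 0.85, while Kappa is 10/37 (about 0.27), which shows how a screener can agree with the reference on most cases and still have modest chance-corrected agreement.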


Subject(s)
Diabetes Mellitus, Type 2/diagnosis , Machine Learning , Prediabetic State/diagnosis , Adolescent , Blood Glucose , Child , Female , Humans , Male , Mass Screening , Nutrition Surveys , Risk Assessment , Risk Factors , Sensitivity and Specificity , Young Adult
4.
Nanomaterials (Basel) ; 11(7)2021 Jul 20.
Article in English | MEDLINE | ID: mdl-34361253

ABSTRACT

Copper oxide particles of various sizes and constituent phases were used to form conductive circuits by means of photonic sintering. With the assistance of extremely low-energy-density xenon flash pulses (1.34 J/cm²), a mixture of nano/submicron copper oxide particles can be reduced in several seconds to form electrically conductive copper films or circuits with an average thickness of 6 µm without damaging the underlying polymeric substrate. This is unusual compared with commercial nano-CuO inks, whose sintered structure is usually 1 µm thick or less. Using a mixture of submicron/nano copper oxide particles with a weight ratio of 3:1 and increasing the fraction of Cu₂O in the copper oxide both decrease the electrical resistivity of the reduced copper. Adding copper formate further improved the continuity of interconnects and, thereby, the electrical conductance. Exposure to three-pulse low-energy-density flashes yields an electrical resistivity of 64.6 µΩ·cm. This study not only demonstrates the possibility of using heat-vulnerable polymers as substrate materials, thanks to extremely low-energy light sources, but also achieves photonic-sintered thick copper films through the adoption of submicron copper oxide particles.

5.
J Clin Invest ; 131(22)2021 11 15.
Article in English | MEDLINE | ID: mdl-34609967

ABSTRACT

Air pollution is a well-known contributor to asthma. Air toxics are hazardous air pollutants that cause or may cause serious health effects. Although individual air toxics have been associated with asthma, only a limited number of studies have specifically examined combinations of air toxics associated with the disease. We geocoded air toxic levels from the US National Air Toxics Assessment (NATA) to residential locations for participants of our AiRway in Asthma (ARIA) study. We then applied Data-driven ExposurE Profile extraction (DEEP), a machine learning-based method, to discover combinations of early-life air toxics associated with current use of daily asthma controller medication, lifetime emergency department visit for asthma, and lifetime overnight hospitalization for asthma. We discovered 20 multi-air toxic combinations and 18 single air toxics associated with at least 1 outcome. The multi-air toxic combinations included those containing acrylic acid, ethylidene dichloride, and hydroquinone, and they were significantly associated with asthma outcomes. Several air toxic members of the combinations would not have been identified by single air toxic analyses, supporting the use of machine learning-based methods designed to detect combinatorial effects. Our findings provide knowledge about air toxic combinations associated with childhood asthma.


Subject(s)
Air Pollutants/adverse effects , Asthma/etiology , Machine Learning , Acrylates/adverse effects , Adolescent , Air Pollutants/analysis , Child , Ethyl Chloride/adverse effects , Female , Humans , Hydroquinones/adverse effects , Male , Risk Factors