Results 1 - 20 of 145
1.
Am J Transplant ; 24(3): 458-467, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37468109

ABSTRACT

Primary graft dysfunction (PGD) is the leading cause of morbidity and mortality in the first 30 days after lung transplantation. Risk factors for the development of PGD include donor and recipient characteristics, but how multiple variables interact to impact the development of PGD and how clinicians should consider these in making decisions about donor acceptance remain unclear. This was a single-center retrospective cohort study to develop and evaluate machine learning pipelines to predict the development of PGD grade 3 within the first 72 hours of transplantation using donor and recipient variables that are known at the time of donor offer acceptance. Among 576 bilateral lung recipients, 173 (30%) developed PGD grade 3. The cohort underwent a 75% to 25% train-test split, and lasso regression was used to identify 11 variables for model development. A K-nearest neighbors model, which showed the best calibration and performance with relatively narrow confidence intervals, was selected as the final predictive model, with an area under the receiver operating characteristic curve of 0.65. Machine learning models can predict the risk for development of PGD grade 3 based on data available at the time of donor offer acceptance. This may improve donor-recipient matching and donor utilization in the future.


Subjects
Lung Transplantation , Primary Graft Dysfunction , Humans , Retrospective Studies , Primary Graft Dysfunction/diagnosis , Primary Graft Dysfunction/etiology , Lung Transplantation/adverse effects , Risk Factors , Lung
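A minimal sketch of the evaluation setup described in the record above, assuming a simple random 75%/25% split and computing the area under the ROC curve as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one. The function names, the seed, and the toy labels and scores are illustrative assumptions, not the study's pipeline or data.

```python
import random


def train_test_split(items, test_fraction=0.25, seed=0):
    """Shuffle items reproducibly and return (train, test) lists."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical patient indices for a cohort of 576 recipients.
train, test = train_test_split(range(576))
print(len(train), len(test))

# Toy labels (1 = PGD grade 3) and toy predicted risks.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which gives a scale for reading the reported value of 0.65.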
2.
Genet Med ; 26(3): 101035, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38059438

ABSTRACT

PURPOSE: Clinically ascertained variants are under-utilized in neurodevelopmental disorder research. We established the Brain Gene Registry (BGR) to coregister clinically identified variants in putative brain genes with participant phenotypes. Here, we report 179 genetic variants in the first 179 BGR registrants and analyze the proportion that were novel to ClinVar at the time of entry and those that were absent in other disease databases. METHODS: From 10 academically affiliated institutions, 179 individuals with 179 variants were enrolled into the BGR. Variants were cross-referenced for previous presence in ClinVar and for presence in 6 other genetic databases. RESULTS: Of 179 variants in 76 genes, 76 (42.5%) were novel to ClinVar, and 62 (34.6%) were absent from all databases analyzed. Of the 103 variants present in ClinVar, 37 (35.9%) were uncertain (ClinVar aggregate classification of variant of uncertain significance or conflicting classifications). For 5 variants, the aggregate ClinVar classification was inconsistent with the interpretation from the BGR site-provided classification. CONCLUSION: A significant proportion of clinical variants that are novel or uncertain are not shared, limiting the evidence base for new gene-disease relationships. Registration of paired clinical genetic test results with phenotype has the potential to advance knowledge of the relationships between genes and neurodevelopmental disorders.


Subjects
Databases, Genetic , Genetic Variation , Humans , Genetic Variation/genetics , Genetic Testing/methods , Phenotype , Brain
3.
BMC Bioinformatics ; 22(1): 100, 2021 Mar 01.
Article in English | MEDLINE | ID: mdl-33648439

ABSTRACT

BACKGROUND: There have been many recent breakthroughs in processing and analyzing large-scale data sets in biomedical informatics. For example, the CytoGPS algorithm has enabled the use of text-based karyotypes by transforming them into a binary model. However, such advances are accompanied by new problems of data sparsity, heterogeneity, and noisiness that are magnified by the large-scale multidimensional nature of the data. To address these problems, we developed the Mercator R package, which processes and visualizes binary biomedical data. We use Mercator to address biomedical questions of cytogenetic patterns relating to lymphoid hematologic malignancies, which include a broad set of leukemias and lymphomas. Karyotype data are one of the most common forms of genetic data collected on lymphoid malignancies, because karyotyping is part of the standard of care in these cancers. RESULTS: In this paper we combine the analytic power of CytoGPS and Mercator to perform a large-scale multidimensional pattern recognition study on 22,741 karyotype samples in 47 different hematologic malignancies obtained from the public Mitelman database. CONCLUSION: Our findings indicate that Mercator was able to identify both known and novel cytogenetic patterns across different lymphoid malignancies, furthering our understanding of the genetics of these diseases.


Subjects
Hematologic Diseases , Karyotyping , Neoplasms , Chromosome Aberrations , Humans , Karyotype
4.
Antimicrob Agents Chemother ; 65(7): e0006321, 2021 06 17.
Article in English | MEDLINE | ID: mdl-33972243

ABSTRACT

Infection caused by carbapenem-resistant (CR) organisms is a rising problem in the United States. While the risk factors for antibiotic resistance are well known, there remains a large need for the early identification of antibiotic-resistant infections. Using machine learning (ML), we sought to develop a prediction model for carbapenem resistance. All patients >18 years of age admitted to a tertiary-care academic medical center between 1 January 2012 and 10 October 2017 with ≥1 bacterial culture were eligible for inclusion. All demographic, medication, vital sign, procedure, laboratory, and culture/sensitivity data were extracted from the electronic health record. Organisms were considered CR if a single isolate was reported as intermediate or resistant. Patients with CR and non-CR organisms were temporally matched to maintain the positive/negative case ratio. Extreme gradient boosting was used for model development. In total, 68,472 patients met inclusion criteria, with 1,088 patients identified as having CR organisms. Sixty-seven features were used for predictive modeling. The most important features were number of prior antibiotic days, recent central venous catheter placement, and inpatient surgery. After model training, the area under the receiver operating characteristic curve was 0.846. The sensitivity of the model was 30%, with a positive predictive value (PPV) of 30% and a negative predictive value of 99%. Using readily available clinical data, we were able to create an ML model capable of predicting CR infections at the time of culture collection with a high PPV.


Subjects
Carbapenems , Machine Learning , Carbapenems/pharmacology , Humans , Predictive Value of Tests , Retrospective Studies , Risk Assessment
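The sensitivity, PPV, and negative predictive value quoted in the record above all derive from a 2x2 confusion matrix. The sketch below shows the arithmetic. The counts are hypothetical, chosen only so that a rare outcome (about 1.6% prevalence) reproduces the same pattern of rates; they are not the study's data, and the function name is an assumption.

```python
def classification_rates(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV, and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases that are flagged
        "specificity": tn / (tn + fp),  # share of non-cases that are not flagged
        "ppv": tp / (tp + fp),          # share of flagged patients who are true cases
        "npv": tn / (tn + fn),          # share of unflagged patients who are non-cases
    }


# Hypothetical cohort of 10,000 patients with 160 true cases.
rates = classification_rates(tp=48, fp=112, fn=112, tn=9728)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

With a rare outcome, most unflagged patients are true negatives, so the NPV stays close to 1 even when sensitivity is modest.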
5.
Crit Care Med ; 49(4): e433-e443, 2021 04 01.
Article in English | MEDLINE | ID: mdl-33591014

ABSTRACT

OBJECTIVES: Assess the impact of heterogeneity among established sepsis criteria (Sepsis-1, Sepsis-3, Centers for Disease Control and Prevention Adult Sepsis Event, and Centers for Medicare and Medicaid severe sepsis core measure 1) through the comparison of corresponding sepsis cohorts. DESIGN: Retrospective analysis of data extracted from the electronic health record. SETTING: Single, tertiary-care center in St. Louis, MO. PATIENTS: Adult, nonsurgical inpatients admitted between January 1, 2012, and January 6, 2018. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: In the electronic health record data, 286,759 encounters met inclusion criteria across the study period. Application of established sepsis criteria yielded cohorts varying in prevalence: Centers for Disease Control and Prevention Adult Sepsis Event (4.4%), Centers for Medicare and Medicaid severe sepsis core measure 1 (4.8%), International Classification of Diseases code (7.2%), Sepsis-3 (7.5%), and Sepsis-1 (11.3%). Between the two modern established criteria, Sepsis-3 (n = 21,550) and Centers for Disease Control and Prevention Adult Sepsis Event (n = 12,494), the size of the overlap was 7,763. The sepsis cohorts also varied in time from admission to sepsis onset (hr): Sepsis-1 (2.9), Sepsis-3 (4.1), Centers for Disease Control and Prevention Adult Sepsis Event (4.6), and Centers for Medicare and Medicaid severe sepsis core measure 1 (7.6); sepsis discharge International Classification of Diseases code rate: Sepsis-1 (37.4%), Sepsis-3 (40.1%), Centers for Medicare and Medicaid severe sepsis core measure 1 (48.5%), and Centers for Disease Control and Prevention Adult Sepsis Event (54.5%); and in-hospital mortality rate: Sepsis-1 (13.6%), Sepsis-3 (18.8%), International Classification of Diseases code (20.4%), Centers for Medicare and Medicaid severe sepsis core measure 1 (22.5%), and Centers for Disease Control and Prevention Adult Sepsis Event (24.1%). CONCLUSIONS: The application of commonly used sepsis definitions on a single population produced sepsis cohorts with low agreement, significantly different baseline demographics, and clinical outcomes.


Subjects
Databases, Factual/statistics & numerical data , Sepsis/classification , Sepsis/diagnosis , Severity of Illness Index , Humans , International Classification of Diseases , Outcome Assessment, Health Care , Retrospective Studies , Sepsis/epidemiology , Shock, Septic/classification , Shock, Septic/diagnosis , United States
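One way to put the reported overlap between the Sepsis-3 and Adult Sepsis Event cohorts on a common scale is the Jaccard index (intersection over union). The sketch below uses only the three counts given in the record above. The choice of the Jaccard index and the function name are assumptions for illustration, not the agreement measure used by the authors.

```python
def overlap_stats(n_a, n_b, n_both):
    """Summarize the overlap of two cohorts from their sizes and intersection size."""
    union = n_a + n_b - n_both
    return {
        "jaccard": n_both / union,  # shared encounters / encounters in either cohort
        "frac_of_a": n_both / n_a,  # share of cohort A also captured by B
        "frac_of_b": n_both / n_b,  # share of cohort B also captured by A
    }


# Counts from the abstract: Sepsis-3 (A), CDC Adult Sepsis Event (B), overlap.
stats = overlap_stats(n_a=21550, n_b=12494, n_both=7763)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Under these counts, fewer than a third of the encounters flagged by either definition are flagged by both.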
6.
BMC Med Inform Decis Mak ; 21(1): 15, 2021 01 07.
Article in English | MEDLINE | ID: mdl-33413329

ABSTRACT

BACKGROUND: The Coronavirus Disease 2019 (COVID-19) pandemic has infected over 10 million people globally with a relatively high mortality rate. There are many therapeutics undergoing clinical trials, but there is no effective vaccine or therapy for treatment thus far. After infection by the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), molecular signaling pathways of host cells play critical roles during the life cycle of SARS-CoV-2. Thus, it is important to identify the molecular signaling pathways involved within the host cells. Drugs targeting these molecular signaling pathways could be potentially effective for COVID-19 treatment. METHODS: In this study, we developed a novel integrative analysis approach to identify the related molecular signaling pathways within host cells, and repurposed drugs as potentially effective treatments for COVID-19, based on the transcriptional response of host cells. RESULTS: We identified activated signaling pathways associated with the infection caused by SARS-CoV-2 in human lung epithelial cells through integrative analysis. Then, the activated gene ontologies (GOs) and super GOs were identified. Signaling pathways and GOs such as MAPK, JNK, STAT, ERK, JAK-STAT, IRF7-NFkB signaling, and MYD88/CXCR6 immune signaling were particularly activated. Based on the identified signaling pathways and GOs, a set of potentially effective drugs was repurposed by integrating the drug-target and reverse gene expression data resources. In addition to many drugs being evaluated in clinical trials, dexamethasone was top-ranked in the prediction; it was the first drug reported to significantly reduce the death rate of COVID-19 patients receiving respiratory support. CONCLUSIONS: The integrative genomics data analysis and results can help to understand the associated molecular signaling pathways within host cells and facilitate the discovery of effective drugs for COVID-19 treatment.


Subjects
COVID-19 Drug Treatment , Drug Repositioning , Pharmaceutical Preparations , Signal Transduction , Transcription, Genetic , Cells, Cultured , Epithelial Cells/virology , Gene Ontology , Humans , SARS-CoV-2/drug effects
7.
Bioinformatics ; 35(24): 5365-5366, 2019 12 15.
Article in English | MEDLINE | ID: mdl-31263896

ABSTRACT

SUMMARY: Karyotype data are the most common form of genetic data that is regularly used clinically. They are collected as part of the standard of care in many diseases, particularly in pediatric and cancer medicine contexts. Karyotypes are represented in a unique text-based format, with a syntax defined by the International System for human Cytogenetic Nomenclature (ISCN). While human-readable, ISCN is not intrinsically machine-readable. This limitation has prevented the full use of complex karyotype data in discovery science use cases. To enhance the utility and value of karyotype data, we developed a tool named CytoGPS. CytoGPS first parses ISCN karyotypes into a machine-readable format. It then converts the ISCN karyotype into a binary Loss-Gain-Fusion (LGF) model, which represents all cytogenetic abnormalities as combinations of loss, gain, or fusion events, in a format that is analyzable using modern computational methods. Such data is then made available for comprehensive 'downstream' analyses that previously were not feasible. AVAILABILITY AND IMPLEMENTATION: Freely available at http://cytogps.org.


Subjects
Chromosome Aberrations , Karyotype , Humans , Karyotyping , Neoplasms , Software
8.
BMC Bioinformatics ; 20(Suppl 24): 679, 2019 Dec 20.
Article in English | MEDLINE | ID: mdl-31861985

ABSTRACT

BACKGROUND: RNA sequencing technologies have allowed researchers to gain a better understanding of how the transcriptome affects disease. However, sequencing technologies often unintentionally introduce experimental error into RNA sequencing data. To counteract this, normalization methods are routinely applied with the intent of reducing the non-biologically derived variability inherent in transcriptomic measurements. However, the comparative efficacy of the various normalization techniques has not been tested in a standardized manner. Here we propose tests that evaluate numerous normalization techniques and apply them to a large-scale standard data set. These tests comprise a protocol that allows researchers to measure the amount of non-biological variability that is present in any data set after normalization has been performed, a crucial step in assessing the biological validity of data following normalization. RESULTS: In this study we present two tests to assess the validity of normalization methods applied to a large-scale data set collected for systematic evaluation purposes. We tested various RNASeq normalization procedures and concluded that transcripts per million (TPM) was the best performing normalization method based on its preservation of biological signal as compared to the other methods tested. CONCLUSION: Normalization is of vital importance to accurately interpret the results of genomic and transcriptomic experiments. More work, however, needs to be performed to optimize normalization methods for RNASeq data. The present effort helps pave the way for more systematic evaluations of normalization methods across different platforms. With our proposed schema researchers can evaluate their own or future normalization methods to further improve the field of RNASeq normalization.


Subjects
RNA/genetics , Sequence Analysis, RNA/methods , Genome , Genomics , Humans , Transcriptome
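For readers unfamiliar with the transcripts per million (TPM) normalization named in the record above, the sketch below shows the standard calculation: divide each read count by transcript length in kilobases, then rescale so that the values in a sample sum to one million. The function name and the toy counts and lengths are assumptions for illustration. The sketch does not reproduce the validity tests proposed in the paper.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to transcripts per million for one sample."""
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    # Reads per kilobase: corrects for longer transcripts attracting more reads.
    rates = [count / (length / 1000.0) for count, length in zip(counts, lengths_bp)]
    total = sum(rates)
    if total == 0:
        raise ValueError("sample has no mapped reads")
    # Rescale so the sample sums to one million.
    return [rate / total * 1_000_000 for rate in rates]


# Toy sample: counts proportional to length give equal TPM values.
values = tpm(counts=[10, 20, 30], lengths_bp=[1000, 2000, 3000])
print([round(v, 1) for v in values])
```

Because every sample sums to the same total, TPM values can be compared across samples as relative proportions.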
9.
Ann Surg Oncol ; 24(2): 347-354, 2017 Feb.
Article in English | MEDLINE | ID: mdl-27469124

ABSTRACT

PURPOSE: Identification of indeterminate melanocytic skin lesions capable of neoplastic progression is suboptimal and may potentially result in unnecessary morbidity from surgery. MicroRNAs (miRs) may be useful in classifying indeterminate Spitz tumors as having high or low risk for malignant behavior. METHODS: RNA was extracted from paraffin-embedded tissues of benign nevi, benign Spitz tumors, indeterminate Spitz tumors, and Spitzoid melanomas in adults (n = 62) and children (n = 28). The expression profile of 12 miRs in adults (6 miRs in children) was analyzed by real-time polymerase chain reaction. RESULTS: Benign Spitz lesions were characterized by decreased expression of miR-125b and miR-211, and upregulation of miR-22, compared with benign nevi (p < 0.05). A comparison of Spitzoid melanomas to benign nevi revealed overexpression of miR-21, miR-150, and miR-155 in the malignant primaries (p < 0.05). In adults, Spitzoid melanomas exhibited upregulation of miR-21, miR-150, and miR-155 compared with indeterminate Spitz lesions. Indeterminate Spitz lesions with low-risk pathologic features had lower miR-21 and miR-155 expression compared with Spitzoid melanoma tumors in adults (p < 0.05), while pathologic high-risk indeterminate Spitz lesions had increased levels of miR-200c expression compared with low-risk indeterminate lesions (p < 0.05). Pediatric Spitzoid melanomas exhibited increased miR-21 expression compared with indeterminate Spitz lesions (p < 0.05). Moreover, miR-155 expression was increased in indeterminate lesions with mitotic counts >1 and depth of invasion >1 mm, suggesting miR-155 expression is associated with histological characteristics. CONCLUSIONS: miR expression profiles can be measured in indeterminate Spitz tumors and correlate with markers of malignant potential.


Subjects
Biomarkers, Tumor/genetics , Melanoma/classification , MicroRNAs/genetics , Nevus, Epithelioid and Spindle Cell/classification , Skin Neoplasms/classification , Adult , Child , Diagnosis, Differential , Female , Follow-Up Studies , Humans , Male , Melanoma/diagnosis , Melanoma/genetics , Nevus, Epithelioid and Spindle Cell/diagnosis , Nevus, Epithelioid and Spindle Cell/genetics , Prognosis , Skin Neoplasms/diagnosis , Skin Neoplasms/genetics
11.
J Surg Res ; 205(2): 350-358, 2016 10.
Article in English | MEDLINE | ID: mdl-27664883

ABSTRACT

BACKGROUND: Melanoma skin cancer remains the leading cause of skin cancer-related deaths. Spitz lesions represent a subset of melanocytic skin lesions characterized by epithelioid or spindled melanocytes organized in nests. These lesions occupy a spectrum ranging from benign Spitz and atypical Spitz lesions all the way to malignant Spitz tumors. Appropriate management is reliant on accurate diagnostic classification, yet this effort remains challenging using current light microscopic techniques. The discovery of novel biomarkers such as microRNAs (miR) may ultimately be a useful diagnostic adjunct for the evaluation of Spitz lesions. miR expression profiles have been suggested for non-Spitz melanomas but have yet to be ascribed to Spitz lesions. We hypothesized that distinct miR expression profiles would be associated with different lesions along the Spitz spectrum. MATERIALS AND METHODS: RNAs extracted from paraffin-embedded, formalin-fixed tissues of 11 resected skin lesions including benign nevi (n = 2), benign Spitz lesions (n = 3), atypical Spitz lesions (n = 3), and malignant Spitz tumors (n = 3) were analyzed by the NanoString platform for simultaneous evaluation of over 800 miRs in each patient sample. RESULTS: Benign Spitz lesions had increased expression of miR-21-5p and miR-363-3p compared with those of benign nevi. Malignant Spitz lesions exhibited overexpression of miR-21-5p, miR-155-5p, and miR-1283 relative to both benign nevi and benign Spitz tumors. Notably, atypical Spitz tumors had increased expression of miR-451a and decreased expression of miR-155-5p relative to malignant Spitz lesions. Conversely, atypical Spitz lesions had increased expression of miR-21-5p, miR-34a-5p, miR-451a, miR-1283, and miR-1260a relative to benign Spitz tumors. CONCLUSIONS: Overall, distinct miR profiles are suggested among Spitz lesions of varying malignant potential with some similarities to non-Spitz melanoma tumors. This work demonstrates the feasibility of this analytic method and forms the basis for further validation studies.


Subjects
Biomarkers, Tumor/genetics , Gene Expression Regulation, Neoplastic , MicroRNAs/metabolism , Nevus, Epithelioid and Spindle Cell/diagnosis , Skin Neoplasms/diagnosis , Transcriptome , Adolescent , Adult , Case-Control Studies , Diagnosis, Differential , Female , Follow-Up Studies , Gene Expression Profiling , Humans , Male , Nevus, Epithelioid and Spindle Cell/genetics , Skin Neoplasms/genetics , Young Adult
12.
J Biomed Inform ; 60: 95-103, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26828957

ABSTRACT

BACKGROUND: Community-level factors have been clearly linked to health outcomes, but are challenging to incorporate into medical practice. Increasing use of electronic health records (EHRs) makes patient-level data available for researchers in a systematic and accessible way, but these data remain siloed from community-level data relevant to health. PURPOSE: This study sought to link community and EHR data from an older female patient cohort participating in an ongoing intervention at the Ohio State University Wexner Medical Center to associate community-level data with patient-level cardiovascular health (CVH) as well as to assess the utility of this EHR integration methodology. MATERIALS AND METHODS: CVH was characterized among patients using available EHR data collected May through July of 2013. EHR data for 153 patients were linked to United States census-tract level data to explore feasibility and insights gained from combining these disparate data sources. Analyses were conducted in 2014. RESULTS: Using the linked data, weekly per capita expenditure on fruits and vegetables was found to be significantly associated with CVH at the p<0.05 level and three other community-level attributes (median income, average household size, and unemployment rate) were associated with CVH at the p<0.10 level. CONCLUSIONS: This work paves the way for future integration of community and EHR-based data into patient care as a novel methodology to gain insight into multi-level factors that affect CVH and other health outcomes. Further, our findings demonstrate the specific architectural and functional challenges associated with integrating decision support technologies and geographic information to support tailored and patient-centered decision making therein.


Subjects
Cardiovascular System , Delivery of Health Care , Electronic Health Records , Health Status , Information Storage and Retrieval , Aged , Cohort Studies , Female , Geographic Information Systems , Humans , Ohio , Residence Characteristics , Socioeconomic Factors
13.
BMC Med Inform Decis Mak ; 16: 40, 2016 Mar 29.
Article in English | MEDLINE | ID: mdl-27025583

ABSTRACT

Recent advances in the adoption and use of health information technology (HIT) have had a dramatic impact on the practice of medicine. In many environments, this has led to the ability to achieve new efficiencies and levels of safety. In others, the impact has been less positive, and is associated with both: 1) workflow and user experience dissatisfaction; and 2) perceptions of missed opportunities relative to the use of computational tools to enable data-driven and precise clinical decision making. Simultaneously, the "pipeline" through which new diagnostic tools and therapeutic agents are being developed and brought to the point-of-care or population health is challenged in terms of both cost and timeliness. Given the confluence of these trends, it can be argued that now is the time to consider new ways in which HIT can be used to deliver health and wellness interventions comparable to traditional approaches (e.g., drugs, devices, diagnostics, and behavioral modifications). Doing so could serve to fulfill the promise of what has been recently promoted as "precision medicine" in a rapid and cost-effective manner. However, it will also require the health and life sciences community to embrace new modes of using HIT, wherein the use of technology becomes a primary intervention as opposed to an enabler of more conventional approaches, a model that we refer to in this commentary as "interventional informatics". Such a paradigm requires attention to critical issues, including: 1) the nature of the relationships between HIT vendors and healthcare innovators; 2) the formation and function of multidisciplinary teams consisting of technologists, informaticians, and clinical or scientific subject matter experts; and 3) the optimal design and execution of clinical studies that focus on HIT as the intervention of interest. Ultimately, the goal of an "interventional informatics" approach can and should be to substantially improve human health and wellness through the use of data-driven interventions at the point of care or broader population levels. Achieving a vision of "interventional informatics" will require us to re-think how we study HIT tools in order to generate the necessary evidence base that can support and justify their use as a primary means of improving the human condition.


Subjects
Clinical Studies as Topic , Medical Informatics , Humans , Medical Informatics/trends
14.
BMC Med Inform Decis Mak ; 14: 36, 2014 May 08.
Article in English | MEDLINE | ID: mdl-24886134

ABSTRACT

BACKGROUND: Obesity and overweight are multifactorial diseases that affect two thirds of Americans, lead to numerous health conditions and deeply strain our healthcare system. With the increasing prevalence and dangers associated with higher body weight, there is great impetus to focus on public health strategies to prevent or curb the disease. Electronic health records (EHRs) are a powerful source for retrospective health data, but they lack important community-level information known to be associated with obesity. We explored linking EHR and community data to study factors associated with overweight and obesity in a systematic and rigorous way. METHODS: We augmented EHR-derived data on 62,701 patients with zip code-level socioeconomic and obesogenic data. Using a multinomial logistic regression model, we estimated odds ratios and 95% confidence intervals (OR, 95% CI) for community-level factors associated with overweight and obese body mass index (BMI), accounting for the clustering of patients within zip codes. RESULTS: 33, 31 and 35 percent of individuals had BMIs corresponding to normal, overweight and obese, respectively. Models adjusted for age, race and gender showed more farmers' markets/1,000 people (0.19, 0.10-0.36), more grocery stores/1,000 people (0.58, 0.36-0.93) and a 10% increase in percentage of college graduates (0.80, 0.77-0.84) were associated with lower odds of obesity. The same factors yielded odds ratios of smaller magnitudes for overweight. Our results also indicate that larger grocery stores may be inversely associated with obesity. CONCLUSIONS: Integrating community data into the EHR maximizes the potential of secondary use of EHR data to study and impact obesity prevention and other significant public health issues.


Subjects
Body Mass Index , Data Collection , Electronic Health Records , Obesity/epidemiology , Residence Characteristics , Social Determinants of Health , Adolescent , Adult , Aged , Data Collection/statistics & numerical data , Electronic Health Records/statistics & numerical data , Female , Humans , Logistic Models , Male , Medical Informatics/methods , Middle Aged , Obesity/prevention & control , Ohio/epidemiology , Overweight/epidemiology , Overweight/prevention & control , Residence Characteristics/statistics & numerical data , Retrospective Studies , Social Determinants of Health/statistics & numerical data , Young Adult
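The odds ratios and 95% confidence intervals in the record above are the usual exponentiated summaries of logistic regression coefficients. The sketch below shows that conversion under a Wald (normal approximation) assumption. The coefficient and standard error are back-calculated here from the reported grocery-store estimate (0.58, 0.36-0.93) purely for illustration. They are not values reported by the authors, whose standard errors also accounted for clustering within zip codes.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for a log-odds coefficient."""
    return (
        math.exp(beta),           # point estimate
        math.exp(beta - z * se),  # lower 95% bound
        math.exp(beta + z * se),  # upper 95% bound
    )


# Illustrative inputs: beta = ln(0.58); se is an assumed, back-calculated value.
estimate, lower, upper = odds_ratio_ci(beta=math.log(0.58), se=0.2421)
print(f"OR {estimate:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

An interval that lies entirely below 1, as here, indicates lower odds of the outcome per unit increase in the predictor at the 5% significance level.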
16.
bioRxiv ; 2024 Jan 15.
Article in English | MEDLINE | ID: mdl-38293243

ABSTRACT

Recently, large-scale scRNA-seq datasets have been generated to understand the complex and poorly understood signaling mechanisms within the microenvironment of Alzheimer's Disease (AD), which are critical for identifying novel therapeutic targets and precision medicine. Although a set of targets has been identified, it remains challenging to infer the core intra- and inter-multi-cell signaling communication networks using the scRNA-seq data, considering the complex and highly interactive background signaling network. Herein, we introduced a novel graph transformer model, PathFinder, to infer multi-cell intra- and inter-cellular signaling pathways and signaling communications among multi-cell types. Compared with existing models, the novel and unique design of PathFinder is based on the divide-and-conquer strategy, which divides the complex signaling networks into signaling paths, and then scores and ranks them using a novel graph transformer architecture to infer the intra- and inter-cell signaling communications. We evaluated PathFinder using scRNA-seq data of APOE4-genotype specific AD mouse models and identified novel APOE4 altered intra- and inter-cell interaction networks among neurons, astrocytes, and microglia. PathFinder is a general signaling network inference model and can be applied to other omics data-driven signaling network inference.

17.
Front Cell Neurosci ; 18: 1369242, 2024.
Article in English | MEDLINE | ID: mdl-38846640

ABSTRACT

Recently, large-scale scRNA-seq datasets have been generated to understand the complex signaling mechanisms within the microenvironment of Alzheimer's Disease (AD), which are critical for identifying novel therapeutic targets and precision medicine. However, the background signaling networks are highly complex and interactive. It remains challenging to infer the core intra- and inter-multi-cell signaling communication networks using scRNA-seq data. In this study, we introduced a novel graph transformer model, PathFinder, to infer multi-cell intra- and inter-cellular signaling pathways and communications among multi-cell types. Compared with existing models, the novel and unique design of PathFinder is based on the divide-and-conquer strategy. This model divides complex signaling networks into signaling paths, which are then scored and ranked using a novel graph transformer architecture to infer intra- and inter-cell signaling communications. We evaluated the performance of PathFinder using two scRNA-seq data cohorts. The first is an APOE4 genotype-specific AD cohort, and the second is a human cirrhosis cohort. The evaluation confirms the promising potential of using PathFinder as a general signaling network inference model.

18.
bioRxiv ; 2024 Apr 06.
Article in English | MEDLINE | ID: mdl-37808763

ABSTRACT

Objective: Accurately identifying clinical phenotypes from Electronic Health Records (EHRs) provides additional insights into patients' health, especially when such information is unavailable in structured data. This study evaluates the application of OpenAI's Generative Pre-trained Transformer (GPT)-4 model to identify clinical phenotypes from EHR text in non-small cell lung cancer (NSCLC) patients. The goal was to identify disease stages, treatments and progression utilizing GPT-4, and compare its performance against GPT-3.5-turbo, Flan-T5-xl, Flan-T5-xxl, and two rule-based and machine learning-based methods, namely, scispaCy and medspaCy. Materials and Methods: Phenotypes such as initial cancer stage, initial treatment, evidence of cancer recurrence, and affected organs during recurrence were identified from 13,646 records for 63 NSCLC patients from Washington University in St. Louis, Missouri. The performance of the GPT-4 model is evaluated against GPT-3.5-turbo, Flan-T5-xxl, Flan-T5-xl, medspaCy and scispaCy by comparing precision, recall, and micro-F1 scores. Results: GPT-4 achieved a higher F1 score, precision, and recall compared to the Flan-T5-xl, Flan-T5-xxl, medspaCy and scispaCy models. GPT-3.5-turbo performed similarly to GPT-4. GPT and Flan-T5 models were not constrained by explicit rule requirements for contextual pattern recognition. spaCy models relied on predefined patterns, leading to their suboptimal performance. Discussion and Conclusion: GPT-4 improves clinical phenotype identification due to its robust pre-training and remarkable pattern recognition capability on the embedded tokens. It demonstrates data-driven effectiveness even with limited context in the input. While rule-based models remain useful for some tasks, GPT models offer improved contextual understanding of the text, and robust clinical phenotype extraction.

19.
JAMIA Open ; 7(3): ooae060, 2024 Oct.
Article in English | MEDLINE | ID: mdl-38962662

ABSTRACT

Objective: Accurately identifying clinical phenotypes from Electronic Health Records (EHRs) provides additional insights into patients' health, especially when such information is unavailable in structured data. This study evaluates the application of OpenAI's Generative Pre-trained Transformer (GPT)-4 model to identify clinical phenotypes from EHR text in non-small cell lung cancer (NSCLC) patients. The goal was to identify disease stages, treatments, and progression using GPT-4, and to compare its performance against GPT-3.5-turbo, Flan-T5-xl, Flan-T5-xxl, Llama-3-8B, and 2 rule-based and machine learning-based methods, namely, scispaCy and medspaCy. Materials and Methods: Phenotypes such as initial cancer stage, initial treatment, evidence of cancer recurrence, and affected organs during recurrence were identified from 13 646 clinical notes for 63 NSCLC patients from Washington University in St. Louis, Missouri. The performance of the GPT-4 model was evaluated against GPT-3.5-turbo, Flan-T5-xxl, Flan-T5-xl, Llama-3-8B, medspaCy, and scispaCy by comparing precision, recall, and micro-F1 scores. Results: GPT-4 achieved a higher F1 score, precision, and recall than the Flan-T5-xl, Flan-T5-xxl, Llama-3-8B, medspaCy, and scispaCy models. GPT-3.5-turbo performed similarly to GPT-4. GPT, Flan-T5, and Llama models were not constrained by explicit rule requirements for contextual pattern recognition. spaCy models relied on predefined patterns, leading to their suboptimal performance. Discussion and Conclusion: GPT-4 improves clinical phenotype identification due to its robust pre-training and remarkable pattern recognition capability on the embedded tokens. It demonstrates data-driven effectiveness even with limited context in the input. While rule-based models remain useful for some tasks, GPT models offer improved contextual understanding of the text and robust clinical phenotype extraction.
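For readers unfamiliar with the micro-averaged metrics named in this abstract, the following is a minimal sketch of how precision, recall, and micro-F1 are commonly computed for multi-label extraction. The label names and the toy data are hypothetical and are not taken from the study; the study's own evaluation code is not shown in the record.

```python
def micro_prf(gold, pred):
    """Micro-averaged precision, recall, and F1.

    gold, pred: lists of label sets, one set per note, in the same order.
    Counts are pooled over all notes before the ratios are taken.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # labels predicted and present in gold
        fp += len(p - g)   # labels predicted but absent from gold
        fn += len(g - p)   # gold labels that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical annotations for three notes.
gold = [{"stage_III", "chemotherapy"}, {"recurrence"}, {"stage_I"}]
pred = [{"stage_III"}, {"recurrence", "chemotherapy"}, {"stage_II"}]

precision, recall, f1 = micro_prf(gold, pred)
print(precision, recall, f1)
```

Pooling counts before dividing means frequent labels weigh more than rare ones, which is the defining difference between micro- and macro-averaging.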

20.
J Neurodev Disord ; 16(1): 17, 2024 Apr 17.
Article in English | MEDLINE | ID: mdl-38632549

ABSTRACT

Monogenic disorders account for a large proportion of population-attributable risk for neurodevelopmental disabilities. However, the data necessary to infer a causal relationship between a given genetic variant and a particular neurodevelopmental disorder is often lacking. Recognizing this scientific roadblock, 13 Intellectual and Developmental Disabilities Research Centers (IDDRCs) formed a consortium to create the Brain Gene Registry (BGR), a repository pairing clinical genetic data with phenotypic data from participants with variants in putative brain genes. Phenotypic profiles are assembled from the electronic health record (EHR) and a battery of remotely administered standardized assessments collectively referred to as the Rapid Neurobehavioral Assessment Protocol (RNAP), which include cognitive, neurologic, and neuropsychiatric assessments, as well as assessments for attention deficit hyperactivity disorder (ADHD) and autism spectrum disorder (ASD). Co-enrollment of BGR participants in the Clinical Genome Resource's (ClinGen's) GenomeConnect enables display of variant information in ClinVar. The BGR currently contains data on 479 participants who are 55% male, 6% Asian, 6% Black or African American, 76% white, and 12% Hispanic/Latine. Over 200 genes are represented in the BGR, with 12 or more participants harboring variants in each of these genes: CACNA1A, DNMT3A, SLC6A1, SETD5, and MYT1L. More than 30% of variants are de novo and 43% are classified as variants of uncertain significance (VUSs). Mean standard scores on cognitive or developmental screens are below average for the BGR cohort. EHR data reveal developmental delay as the earliest and most common diagnosis in this sample, followed by speech and language disorders, ASD, and ADHD. BGR data has already been used to accelerate gene-disease validity curation of 36 genes evaluated by ClinGen's BGR Intellectual Disability (ID)-Autism (ASD) Gene Curation Expert Panel. 
In summary, the BGR is a resource for use by stakeholders interested in advancing translational research for brain genes and continues to recruit participants with clinically reported variants to establish a rich and well-characterized national resource to promote research on neurodevelopmental disorders.


Subjects
Autism Spectrum Disorder , Autistic Disorder , Intellectual Disability , Neurodevelopmental Disorders , Humans , Male , Female , Autism Spectrum Disorder/genetics , Brain , Registries , Methyltransferases