Pesquisa | Portal Regional da BVS

1.

Oncologists Must Consider Participant Data When Using Large-Scale Cancer Data Sets.

Avila, Santiago; Roberson, Mya L; Rajagopal, Padma Sheila.

JCO Clin Cancer Inform ; 8: e2300245, 2024 Jun.

Artigo em Inglês | MEDLINE | ID: mdl-38959448

RESUMO

Primer that helps clarify large-scale clinical data sets and participant demographics for oncologists.

Assuntos

Neoplasias , Oncologistas , Humanos , Neoplasias/epidemiologia , Oncologia/métodos , Conjuntos de Dados como Assunto , Bases de Dados Factuais

2.

Designing a core data set for benign hysterectomy registration system and its implementation in a referral teaching hospital in Northwest Iran.

Asl, Fatemeh Moghadami; Maserat, Elham; Vaezi, Maryam; Mohammadzadeh, Zeinab.

BMC Pregnancy Childbirth ; 24(1): 460, 2024 Jul 03.

Artigo em Inglês | MEDLINE | ID: mdl-38961444

RESUMO

BACKGROUND AND AIMS: Although minimally invasive hysterectomy offers advantages, abdominal hysterectomy remains the predominant surgical method. Creating a standardized dataset and establishing a hysterectomy registry system present opportunities for early interventions in reducing volume and selecting benign hysterectomy methods. This research aims to develop a dataset for designing benign hysterectomy registration system. METHODS: Between April and September 2020, a qualitative study was carried out to create a data set for enrolling patients who were candidate for hysterectomy. At this stage, the research team conducted an information needs assessment, relevant data element identification, registry software development, and field testing; Subsequently, a web-based application was designed. In June 2023the registry software was evaluated using data extracted from medical records of patients admitted at Al-Zahra Hospital in Tabriz, Iran. RESULTS: During two months, 40 patients with benign hysterectomy were successfully registered. The final dataset for the hysterectomy patient registry comprise 11 main groups, 27 subclasses, and a total of 91 Data elements. Mandatory data and essential reports were defined. Furthermore, a web-based registry system designed and evaluated based on data set and various scenarios. CONCLUSION: Creating a hysterectomy registration system is the initial stride toward identifying and registering hysterectomy candidate patients. this system capture information about the procedure techniques, and associated complications. In Iran, this registry can serve as a valuable resource for assessing the quality of care delivered and the distribution of clinical measures.

Assuntos

Hospitais de Ensino , Histerectomia , Sistema de Registros , Humanos , Feminino , Irã (Geográfico) , Histerectomia/métodos , Histerectomia/estatística & dados numéricos , Adulto , Pessoa de Meia-Idade , Encaminhamento e Consulta/estatística & dados numéricos , Pesquisa Qualitativa , Conjuntos de Dados como Assunto

3.

ChatGPT for science: how to talk to your data.

Nowogrodzki, Julian.

Nature ; 631(8022): 924-925, 2024 Jul.

Artigo em Inglês | MEDLINE | ID: mdl-39039191

Assuntos

Análise de Dados , Conjuntos de Dados como Assunto , Processamento de Linguagem Natural , Pesquisadores

4.

Genome-scale metabolic network of human carotid plaque reveals the pivotal role of glutamine/glutamate metabolism in macrophage modulating plaque inflammation and vulnerability.

Jin, Han; Zhang, Cheng; Nagenborg, Jan; Juhasz, Peter; Ruder, Adele V; Sikkink, Cornelis J J M; Mees, Barend M E; Waring, Olivia; Sluimer, Judith C; Neumann, Dietbert; Goossens, Pieter; Donners, Marjo M P C; Mardinoglu, Adil; Biessen, Erik A L.

Cardiovasc Diabetol ; 23(1): 240, 2024 Jul 08.

Artigo em Inglês | MEDLINE | ID: mdl-38978031

RESUMO

BACKGROUND: Metabolism is increasingly recognized as a key regulator of the function and phenotype of the primary cellular constituents of the atherosclerotic vascular wall, including endothelial cells, smooth muscle cells, and inflammatory cells. However, a comprehensive analysis of metabolic changes associated with the transition of plaque from a stable to a hemorrhaged phenotype is lacking. METHODS: In this study, we integrated two large mRNA expression and protein abundance datasets (BIKE, n = 126; MaasHPS, n = 43) from human atherosclerotic carotid artery plaque to reconstruct a genome-scale metabolic network (GEM). Next, the GEM findings were linked to metabolomics data from MaasHPS, providing a comprehensive overview of metabolic changes in human plaque. RESULTS: Our study identified significant changes in lipid, cholesterol, and inositol metabolism, along with altered lysosomal lytic activity and increased inflammatory activity, in unstable plaques with intraplaque hemorrhage (IPH+) compared to non-hemorrhaged (IPH-) plaques. Moreover, topological analysis of this network model revealed that the conversion of glutamine to glutamate and their flux between the cytoplasm and mitochondria were notably compromised in hemorrhaged plaques, with a significant reduction in overall glutamate levels in IPH+ plaques. Additionally, reduced glutamate availability was associated with an increased presence of macrophages and a pro-inflammatory phenotype in IPH+ plaques, suggesting an inflammation-prone microenvironment. CONCLUSIONS: This study is the first to establish a robust and comprehensive GEM for atherosclerotic plaque, providing a valuable resource for understanding plaque metabolism. The utility of this GEM was illustrated by its ability to reliably predict dysregulation in the cholesterol hydroxylation, inositol metabolism, and the glutamine/glutamate pathway in rupture-prone hemorrhaged plaques, a finding that may pave the way to new diagnostic or therapeutic measures.

Assuntos

Doenças das Artérias Carótidas , Ácido Glutâmico , Glutamina , Macrófagos , Redes e Vias Metabólicas , Fenótipo , Placa Aterosclerótica , Humanos , Glutamina/metabolismo , Ácido Glutâmico/metabolismo , Macrófagos/metabolismo , Macrófagos/patologia , Doenças das Artérias Carótidas/metabolismo , Doenças das Artérias Carótidas/patologia , Doenças das Artérias Carótidas/genética , Ruptura Espontânea , Artérias Carótidas/patologia , Artérias Carótidas/metabolismo , Metabolômica , Bases de Dados Genéticas , Inflamação/metabolismo , Inflamação/genética , Inflamação/patologia , Metabolismo Energético , Conjuntos de Dados como Assunto , Masculino

5.

New datasets from PsychENCODE.

Wiseman, Shari.

Nat Neurosci ; 27(7): 1214, 2024 Jul.

Artigo em Inglês | MEDLINE | ID: mdl-38982200

Assuntos

Encéfalo , Humanos , Encéfalo/fisiologia , Conjuntos de Dados como Assunto

6.

nnU-Net-based deep-learning for pulmonary embolism: detection, clot volume quantification, and severity correlation in the RSPECT dataset.

Lanza, Ezio; Ammirabile, Angela; Francone, Marco.

Eur J Radiol ; 177: 111592, 2024 Aug.

Artigo em Inglês | MEDLINE | ID: mdl-38968751

RESUMO

OBJECTIVES: CT pulmonary angiography is the gold standard for diagnosing pulmonary embolism, and DL algorithms are being developed to manage the increase in demand. The nnU-Net is a new auto-adaptive DL framework that minimizes manual tuning, making it easier to develop effective algorithms for medical imaging even without specific expertise. This study assesses the performance of a locally developed nnU-Net algorithm on the RSPECT dataset for PE detection, clot volume measurement, and correlation with right ventricle overload. MATERIALS & METHODS: User input was limited to segmentation using 3DSlicer. We worked with the RSPECT dataset and trained an algorithm from 205 PE and 340 negatives. The test dataset comprised 6573 exams. Performance was tested against PE characteristics, such as central, non-central, and RV overload. Blood clot volume (BCV) was extracted from each exam. We employed ROC curves and logistic regression for statistical validation. RESULTS: Negative studies had a median BCV of 1 µL, which increased to 345 µL in PE-positive cases and 7,378 µL in central PEs. Statistical analysis confirmed a significant BCV correlation with PE presence, central PE, and increased RV/LV ratio (p < 0.0001). The model's AUC for PE detection was 0.865, with an 83 % accuracy at a 55 µL threshold. Central PE detection AUC was 0.937 with 91 % accuracy at 850 µL. The RV overload AUC stood at 0.848 with 79 % accuracy. CONCLUSION: The nnU-Net algorithm demonstrated accurate PE detection, particularly for central PE. BCV is an accurate metric for automated severity stratification and case prioritization. CLINICAL RELEVANCE STATEMENT: The nnU-Net framework can be utilized to create a dependable DL for detecting PE. It offers a user-friendly approach to those lacking expertise in AI and rapidly extracts the Blood Clot Volume, a metric that can evaluate the PE's severity.

Assuntos

Angiografia por Tomografia Computadorizada , Aprendizado Profundo , Embolia Pulmonar , Embolia Pulmonar/diagnóstico por imagem , Humanos , Angiografia por Tomografia Computadorizada/métodos , Masculino , Algoritmos , Feminino , Índice de Gravidade de Doença , Pessoa de Meia-Idade , Conjuntos de Dados como Assunto , Idoso

7.

Measuring what counts in Aboriginal and Torres Strait Islander care: a review of general practice datasets available for assessing chronic disease care.

McBride Kelly, Liam; Wong, Deborah; Timothy, Andrea.

Aust J Prim Health ; 302024 Jul.

Artigo em Inglês | MEDLINE | ID: mdl-38981000

RESUMO

Background Large datasets exist in Australia that make de-identified primary healthcare data extracted from clinical information systems available for research use. This study reviews these datasets for their capacity to provide insight into chronic disease care for Aboriginal and Torres Strait Islander peoples, and the extent to which the principles of Indigenous Data Sovereignty are reflected in data collection and governance arrangements. Methods Datasets were included if they collect primary healthcare clinical information system data, collect data nationally, and capture Aboriginal and Torres Strait Islander peoples. We searched PubMed and the public Internet for data providers meeting the inclusion criteria. We developed a framework to assess data providers across domains, including representativeness, usability, data quality, adherence with Indigenous Data Sovereignty and their capacity to provide insights into chronic disease. Datasets were assessed against the framework based on email interviews and publicly available information. Results We identified seven datasets. Only two datasets reported on chronic disease, collected data nationally and captured a substantial number of Aboriginal and Torres Strait Islander patients. No dataset was identified that captured a significant number of both mainstream general practice clinics and Aboriginal Community Controlled Health Organisations. Conclusions It is critical that more accurate, comprehensive and culturally meaningful Aboriginal and Torres Strait Islander healthcare data are collected. These improvements must be guided by the principles of Indigenous Data Sovereignty and Governance. Validated and appropriate chronic disease indicators for Aboriginal and Torres Strait Islander peoples must be developed, including indicators of social and cultural determinants of health.

Assuntos

Medicina Geral , Serviços de Saúde do Indígena , Humanos , Austrália , Povos Aborígenes Australianos e Ilhéus do Estreito de Torres , Doença Crônica , Conjuntos de Dados como Assunto , Medicina Geral/estatística & dados numéricos , Medicina Geral/métodos , Serviços de Saúde do Indígena/estatística & dados numéricos , Atenção Primária à Saúde/estatística & dados numéricos

8.

Live chromosome identifying and tracking reveals size-based spatial pathway of meiotic errors in oocytes.

Takenouchi, Osamu; Sakakibara, Yogo; Kitajima, Tomoya S.

Science ; 385(6706): eadn5529, 2024 Jul 19.

Artigo em Inglês | MEDLINE | ID: mdl-39024439

RESUMO

Meiotic errors of relatively small chromosomes in oocytes result in egg aneuploidies that cause miscarriages and congenital diseases. Unlike somatic cells, which preferentially mis-segregate larger chromosomes, aged oocytes preferentially mis-segregate smaller chromosomes through unclear processes. Here, we provide a comprehensive three-dimensional chromosome identifying-and-tracking dataset throughout meiosis I in live mouse oocytes. This analysis reveals a prometaphase pathway that actively moves smaller chromosomes to the inner region of the metaphase plate. In the inner region, chromosomes are pulled by stronger bipolar microtubule forces, which facilitates premature chromosome separation, a major cause of segregation errors in aged oocytes. This study reveals a spatial pathway that facilitates aneuploidy of small chromosomes preferentially in aged eggs and implicates the role of the M phase in creating a chromosome size-based spatial arrangement.

Assuntos

Aneuploidia , Segregação de Cromossomos , Meiose , Microtúbulos , Oócitos , Animais , Feminino , Camundongos , Cromossomos de Mamíferos/genética , Metáfase , Microtúbulos/metabolismo , Oócitos/citologia , Oócitos/metabolismo , Conjuntos de Dados como Assunto

9.

Exploring CrossFit performance prediction and analysis via extensive data and machine learning.

Lim, Byunggul; Song, Wook.

J Sports Med Phys Fitness ; 64(7): 640-649, 2024 Jul.

Artigo em Inglês | MEDLINE | ID: mdl-38916087

RESUMO

BACKGROUND: The analysis of athletic performance has always aroused great interest from sport scientist. This study utilized machine learning methods to build predictive models using a comprehensive CrossFit (CF) dataset, aiming to reveal valuable insights into the factors influencing performance and emerging trends. METHODS: Random forest (RF) and multiple linear regression (MLR) were employed to predict performance in four key weightlifting exercises within CF: clean and jerk, snatch, back squat, and deadlift. Performance was evaluated using R-squared (R2) values and mean squared error (MSE). Feature importance analysis was conducted using RF, XGBoost, and AdaBoost models. RESULTS: The RF model excelled in deadlift performance prediction (R2=0.80), while the MLR model demonstrated remarkable accuracy in clean and jerk (R2=0.93). Across exercises, clean and jerk consistently emerged as a crucial predictor. The feature importance analysis revealed intricate relationships among exercises, with gender significantly impacting deadlift performance. CONCLUSIONS: This research advances our understanding of performance prediction in CF through machine learning techniques. It provides actionable insights for practitioners, optimize performance, and demonstrates the potential for future advancements in data-driven sports analytics.

Assuntos

Desempenho Atlético , Levantamento de Peso , Humanos , Aprendizado de Máquina , Conjuntos de Dados como Assunto , Algoritmo Florestas Aleatórias , Adulto , Análise de Dados , Masculino , Feminino

10.

Classification of autonomous vehicle crash severity: Solving the problems of imbalanced datasets and small sample size.

Kuo, Pei-Fen; Hsu, Wei-Ting; Lord, Dominique; Putra, I Gede Brawiswa.

Accid Anal Prev ; 205: 107666, 2024 Sep.

Artigo em Inglês | MEDLINE | ID: mdl-38901160

RESUMO

Only a few researchers have shown how environmental factors and road features relate to Autonomous Vehicle (AV) crash severity levels, and none have focused on the data limitation problems, such as small sample sizes, imbalanced datasets, and high dimensional features. To address these problems, we analyzed an AV crash dataset (2019 to 2021) from the California Department of Motor Vehicles (CA DMV), which included 266 collision reports (51 of those causing injuries). We included external environmental variables by collecting various points of interest (POIs) and roadway features from Open Street Map (OSM) and Data San Francisco (SF). Random Over-Sampling Examples (ROSE) and the Synthetic Minority Over-Sampling Technique (SMOTE) methods were used to balance the dataset and increase the sample size. These two balancing methods were used to expand the dataset and solve the small sample size problem simultaneously. Mutual information, random forest, and XGboost were utilized to address the high dimensional feature and the selection problem caused by including a variety of types of POIs as predictive variables. Because existing studies do not use consistent procedures, we compared the effectiveness of using the feature-selection preprocessing method as the first process to employing the data-balance technique as the first process. Our results showed that AV crash severity levels are related to vehicle manufacturers, vehicle damage level, collision type, vehicle movement, the parties involved in the crash, speed limit, and some types of POIs (areas near transportation, entertainment venues, public places, schools, and medical facilities). Both resampling methods and three data preprocessing methods improved model performance, and the model that used SMOTE and data-balancing first was the best. The results suggest that over-sampling and the feature selection method can improve model prediction performance and define new factors related to AV crash severity levels.

Assuntos

Acidentes de Trânsito , Acidentes de Trânsito/estatística & dados numéricos , Acidentes de Trânsito/classificação , Humanos , Tamanho da Amostra , California/epidemiologia , Automóveis/estatística & dados numéricos , Conjuntos de Dados como Assunto

11.

Connectomic reconstruction of a female Drosophila ventral nerve cord.

Azevedo, Anthony; Lesser, Ellen; Phelps, Jasper S; Mark, Brandon; Elabbady, Leila; Kuroda, Sumiya; Sustar, Anne; Moussa, Anthony; Khandelwal, Avinash; Dallmann, Chris J; Agrawal, Sweta; Lee, Su-Yee J; Pratt, Brandon; Cook, Andrew; Skutt-Kakaria, Kyobi; Gerhard, Stephan; Lu, Ran; Kemnitz, Nico; Lee, Kisuk; Halageri, Akhilesh; Castro, Manuel; Ih, Dodam; Gager, Jay; Tammam, Marwan; Dorkenwald, Sven; Collman, Forrest; Schneider-Mizell, Casey; Brittain, Derrick; Jordan, Chris S; Dickinson, Michael; Pacureanu, Alexandra; Seung, H Sebastian; Macrina, Thomas; Lee, Wei-Chung Allen; Tuthill, John C.

Nature ; 631(8020): 360-368, 2024 Jul.

Artigo em Inglês | MEDLINE | ID: mdl-38926570

RESUMO

A deep understanding of how the brain controls behaviour requires mapping neural circuits down to the muscles that they control. Here, we apply automated tools to segment neurons and identify synapses in an electron microscopy dataset of an adult female Drosophila melanogaster ventral nerve cord (VNC)1, which functions like the vertebrate spinal cord to sense and control the body. We find that the fly VNC contains roughly 45 million synapses and 14,600 neuronal cell bodies. To interpret the output of the connectome, we mapped the muscle targets of leg and wing motor neurons using genetic driver lines2 and X-ray holographic nanotomography3. With this motor neuron atlas, we identified neural circuits that coordinate leg and wing movements during take-off. We provide the reconstruction of VNC circuits, the motor neuron atlas and tools for programmatic and interactive access as resources to support experimental and theoretical studies of how the nervous system controls behaviour.

Assuntos

Conectoma , Drosophila melanogaster , Neurônios Motores , Tecido Nervoso , Vias Neurais , Sinapses , Animais , Feminino , Conjuntos de Dados como Assunto , Drosophila melanogaster/anatomia & histologia , Drosophila melanogaster/citologia , Drosophila melanogaster/fisiologia , Drosophila melanogaster/ultraestrutura , Extremidades/fisiologia , Extremidades/inervação , Holografia , Microscopia Eletrônica , Neurônios Motores/citologia , Neurônios Motores/fisiologia , Neurônios Motores/ultraestrutura , Movimento , Músculos/inervação , Músculos/fisiologia , Tecido Nervoso/anatomia & histologia , Tecido Nervoso/citologia , Tecido Nervoso/fisiologia , Tecido Nervoso/ultraestrutura , Vias Neurais/citologia , Vias Neurais/fisiologia , Vias Neurais/ultraestrutura , Sinapses/fisiologia , Sinapses/ultraestrutura , Tomografia por Raios X , Asas de Animais/inervação , Asas de Animais/fisiologia

12.

Standards of Basic Dataset of Chronic Disease Behavior Risk Factor Surveillance in Adults.

Biomed Environ Sci ; 37(5): 549-550, 2024 May 20.

Artigo em Inglês | MEDLINE | ID: mdl-38843930

Assuntos

Conjuntos de Dados como Assunto , Humanos , Doença Crônica/epidemiologia , Adulto , Fatores de Risco , Sistema de Vigilância de Fator de Risco Comportamental

13.

DREAMER: a computational framework to evaluate readiness of datasets for machine learning.

Ahangaran, Meysam; Zhu, Hanzhi; Li, Ruihui; Yin, Lingkai; Jang, Joseph; Chaudhry, Arnav P; Farrer, Lindsay A; Au, Rhoda; Kolachalama, Vijaya B.

BMC Med Inform Decis Mak ; 24(1): 152, 2024 Jun 04.

Artigo em Inglês | MEDLINE | ID: mdl-38831432

RESUMO

BACKGROUND: Machine learning (ML) has emerged as the predominant computational paradigm for analyzing large-scale datasets across diverse domains. The assessment of dataset quality stands as a pivotal precursor to the successful deployment of ML models. In this study, we introduce DREAMER (Data REAdiness for MachinE learning Research), an algorithmic framework leveraging supervised and unsupervised machine learning techniques to autonomously evaluate the suitability of tabular datasets for ML model development. DREAMER is openly accessible as a tool on GitHub and Docker, facilitating its adoption and further refinement within the research community.. RESULTS: The proposed model in this study was applied to three distinct tabular datasets, resulting in notable enhancements in their quality with respect to readiness for ML tasks, as assessed through established data quality metrics. Our findings demonstrate the efficacy of the framework in substantially augmenting the original dataset quality, achieved through the elimination of extraneous features and rows. This refinement yielded improved accuracy across both supervised and unsupervised learning methodologies. CONCLUSION: Our software presents an automated framework for data readiness, aimed at enhancing the integrity of raw datasets to facilitate robust utilization within ML pipelines. Through our proposed framework, we streamline the original dataset, resulting in enhanced accuracy and efficiency within the associated ML algorithms.

Assuntos

Aprendizado de Máquina , Humanos , Conjuntos de Dados como Assunto , Aprendizado de Máquina não Supervisionado , Algoritmos , Aprendizado de Máquina Supervisionado , Software

14.

Assessment of Embedding Schemes in a Hybrid Machine Learning/Classical Potentials (ML/MM) Approach.

Grassano, Juan S; Pickering, Ignacio; Roitberg, Adrian E; González Lebrero, Mariano C; Estrin, Dario A; Semelak, Jonathan A.

J Chem Inf Model ; 64(10): 4047-4058, 2024 May 27.

Artigo em Inglês | MEDLINE | ID: mdl-38710065

RESUMO

Machine learning (ML) methods have reached high accuracy levels for the prediction of in vacuo molecular properties. However, the simulation of large systems solely through ML methods (such as those based on neural network potentials) is still a challenge. In this context, one of the most promising frameworks for integrating ML schemes in the simulation of complex molecular systems are the so-called ML/MM methods. These multiscale approaches combine ML methods with classical force fields (MM), in the same spirit as the successful hybrid quantum mechanics-molecular mechanics methods (QM/MM). The key issue for such ML/MM methods is an adequate description of the coupling between the region of the system described by ML and the region described at the MM level. In the context of QM/MM schemes, the main ingredient of the interaction is electrostatic, and the state of the art is the so-called electrostatic-embedding. In this study, we analyze the quality of simpler mechanical embedding-based approaches, specifically focusing on their application within a ML/MM framework utilizing atomic partial charges derived in vacuo. Taking as reference electrostatic embedding calculations performed at a QM(DFT)/MM level, we explore different atomic charges schemes, as well as a polarization correction computed using atomic polarizabilites. Our benchmark data set comprises a set of about 80k small organic structures from the ANI-1x and ANI-2x databases, solvated in water. The results suggest that the minimal basis iterative stockholder (MBIS) atomic charges yield the best agreement with the reference coupling energy. Remarkable enhancements are achieved by including a simple polarization correction.

Assuntos

Aminoácidos/química , Bases de Dados Factuais , Modelos Moleculares , Modelos Químicos , Conjuntos de Dados como Assunto

15.

[Interoperability Working Group: core dataset and information systems for data integration and data exchange in the Medical Informatics Initiative]. / Arbeitsgruppe Interoperabilität: Kerndatensatz und Informationssysteme für Integration und Austausch von Daten in der Medizininformatik-Initiative.

Ammon, Danny; Kurscheidt, Maximilian; Buckow, Karoline; Kirsten, Toralf; Löbe, Matthias; Meineke, Frank; Prasser, Fabian; Saß, Julian; Sax, Ulrich; Stäubert, Sebastian; Thun, Sylvia; Wettstein, Reto; Wiedekopf, Joshua P; Wodke, Judith A H; Boeker, Martin; Ganslandt, Thomas.

Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz ; 67(6): 656-667, 2024 Jun.

Artigo em Alemão | MEDLINE | ID: mdl-38753022

RESUMO

The interoperability Working Group of the Medical Informatics Initiative (MII) is the platform for the coordination of overarching procedures, data structures, and interfaces between the data integration centers (DIC) of the university hospitals and national and international interoperability committees. The goal is the joint content-related and technical design of a distributed infrastructure for the secondary use of healthcare data that can be used via the Research Data Portal for Health. Important general conditions are data privacy and IT security for the use of health data in biomedical research. To this end, suitable methods are used in dedicated task forces to enable procedural, syntactic, and semantic interoperability for data use projects. The MII core dataset was developed as several modules with corresponding information models and implemented using the HL7® FHIR® standard to enable content-related and technical specifications for the interoperable provision of healthcare data through the DIC. International terminologies and consented metadata are used to describe these data in more detail. The overall architecture, including overarching interfaces, implements the methodological and legal requirements for a distributed data use infrastructure, for example, by providing pseudonymized data or by federated analyses. With these results of the Interoperability Working Group, the MII is presenting a future-oriented solution for the exchange and use of healthcare data, the applicability of which goes beyond the purpose of research and can play an essential role in the digital transformation of the healthcare system.

Assuntos

Interoperabilidade da Informação em Saúde , Humanos , Conjuntos de Dados como Assunto , Registros Eletrônicos de Saúde , Alemanha , Interoperabilidade da Informação em Saúde/normas , Informática Médica , Registro Médico Coordenado/métodos , Integração de Sistemas

16.

A qualitative interview study to determine barriers and facilitators of implementing automated decision support tools for genomic data access.

Rahimzadeh, Vasiliki; Baek, Jinyoung; Lawson, Jonathan; Dove, Edward S.

BMC Med Ethics ; 25(1): 51, 2024 May 05.

Artigo em Inglês | MEDLINE | ID: mdl-38706004

RESUMO

Data access committees (DAC) gatekeep access to secured genomic and related health datasets yet are challenged to keep pace with the rising volume and complexity of data generation. Automated decision support (ADS) systems have been shown to support consistency, compliance, and coordination of data access review decisions. However, we lack understanding of how DAC members perceive the value add of ADS, if any, on the quality and effectiveness of their reviews. In this qualitative study, we report findings from 13 semi-structured interviews with DAC members from around the world to identify relevant barriers and facilitators to implementing ADS for genomic data access management. Participants generally supported pilot studies that test ADS performance, for example in cataloging data types, verifying user credentials and tagging datasets for use terms. Concerns related to over-automation, lack of human oversight, low prioritization, and misalignment with institutional missions tempered enthusiasm for ADS among the DAC members we engaged. Tensions for change in institutional settings within which DACs operated was a powerful motivator for why DAC members considered the implementation of ADS into their access workflows, as well as perceptions of the relative advantage of ADS over the status quo. Future research is needed to build the evidence base around the comparative effectiveness and decisional outcomes of institutions that do/not use ADS into their workflows.

Assuntos

Conjuntos de Dados como Assunto , Técnicas de Apoio para a Decisão , Genômica , Software , Automação , Fluxo de Trabalho , Entrevistas como Assunto , Sistemas de Dados , Conjuntos de Dados como Assunto/legislação & jurisprudência , Humanos

17.

A whole-slide foundation model for digital pathology from real-world data.

Xu, Hanwen; Usuyama, Naoto; Bagga, Jaspreet; Zhang, Sheng; Rao, Rajesh; Naumann, Tristan; Wong, Cliff; Gero, Zelalem; González, Javier; Gu, Yu; Xu, Yanbo; Wei, Mu; Wang, Wenhui; Ma, Shuming; Wei, Furu; Yang, Jianwei; Li, Chunyuan; Gao, Jianfeng; Rosemon, Jaylen; Bower, Tucker; Lee, Soohee; Weerasinghe, Roshanthi; Wright, Bill J; Robicsek, Ari; Piening, Brian; Bifulco, Carlo; Wang, Sheng; Poon, Hoifung.

Nature ; 630(8015): 181-188, 2024 Jun.

Artigo em Inglês | MEDLINE | ID: mdl-38778098

RESUMO

Digital pathology poses unique computational challenges, as a standard gigapixel slide may comprise tens of thousands of image tiles1-3. Prior models have often resorted to subsampling a small portion of tiles for each slide, thus missing the important slide-level context4. Here we present Prov-GigaPath, a whole-slide pathology foundation model pretrained on 1.3 billion 256 × 256 pathology image tiles in 171,189 whole slides from Providence, a large US health network comprising 28 cancer centres. The slides originated from more than 30,000 patients covering 31 major tissue types. To pretrain Prov-GigaPath, we propose GigaPath, a novel vision transformer architecture for pretraining gigapixel pathology slides. To scale GigaPath for slide-level learning with tens of thousands of image tiles, GigaPath adapts the newly developed LongNet5 method to digital pathology. To evaluate Prov-GigaPath, we construct a digital pathology benchmark comprising 9 cancer subtyping tasks and 17 pathomics tasks, using both Providence and TCGA data6. With large-scale pretraining and ultra-large-context modelling, Prov-GigaPath attains state-of-the-art performance on 25 out of 26 tasks, with significant improvement over the second-best method on 18 tasks. We further demonstrate the potential of Prov-GigaPath on vision-language pretraining for pathology7,8 by incorporating the pathology reports. In sum, Prov-GigaPath is an open-weight foundation model that achieves state-of-the-art performance on various digital pathology tasks, demonstrating the importance of real-world data and whole-slide modelling.

Assuntos

Conjuntos de Dados como Assunto , Processamento de Imagem Assistida por Computador , Aprendizado de Máquina , Patologia Clínica , Humanos , Benchmarking , Processamento de Imagem Assistida por Computador/métodos , Neoplasias/classificação , Neoplasias/diagnóstico , Neoplasias/patologia , Patologia Clínica/métodos , Masculino , Feminino

18.

Pseudo-class part prototype networks for interpretable breast cancer classification.

Choukali, Mohammad Amin; Amirani, Mehdi Chehel; Valizadeh, Morteza; Abbasi, Ata; Komeili, Majid.

Sci Rep ; 14(1): 10341, 2024 05 06.

Artigo em Inglês | MEDLINE | ID: mdl-38710757

RESUMO

Interpretability in machine learning has become increasingly important as machine learning is being used in more and more applications, including those with high-stakes consequences such as healthcare where Interpretability has been regarded as a key to the successful adoption of machine learning models. However, using confounding/irrelevant information in making predictions by deep learning models, even the interpretable ones, poses critical challenges to their clinical acceptance. That has recently drawn researchers' attention to issues beyond the mere interpretation of deep learning models. In this paper, we first investigate application of an inherently interpretable prototype-based architecture, known as ProtoPNet, for breast cancer classification in digital pathology and highlight its shortcomings in this application. Then, we propose a new method that uses more medically relevant information and makes more accurate and interpretable predictions. Our method leverages the clustering concept and implicitly increases the number of classes in the training dataset. The proposed method learns more relevant prototypes without any pixel-level annotated data. To have a more holistic assessment, in addition to classification accuracy, we define a new metric for assessing the degree of interpretability based on the comments of a group of skilled pathologists. Experimental results on the BreakHis dataset show that the proposed method effectively improves the classification accuracy and interpretability by respectively 8 % and 18 % . Therefore, the proposed method can be seen as a step toward implementing interpretable deep learning models for the detection of breast cancer using histopathology images.

Assuntos

Neoplasias da Mama , Aprendizado Profundo , Redes Neurais de Computação , Patologia Clínica , Feminino , Humanos , Neoplasias da Mama/classificação , Neoplasias da Mama/diagnóstico , Neoplasias da Mama/patologia , Análise por Conglomerados , Curadoria de Dados , Conjuntos de Dados como Assunto , Aprendizado Profundo/normas , Patologia Clínica/métodos , Patologia Clínica/normas , Sensibilidade e Especificidade , Reprodutibilidade dos Testes

19.

Data normalization for addressing the challenges in the analysis of single-cell transcriptomic datasets.

Cuevas-Diaz Duran, Raquel; Wei, Haichao; Wu, Jiaqian.

BMC Genomics ; 25(1): 444, 2024 May 06.

Artigo em Inglês | MEDLINE | ID: mdl-38711017

RESUMO

BACKGROUND: Normalization is a critical step in the analysis of single-cell RNA-sequencing (scRNA-seq) datasets. Its main goal is to make gene counts comparable within and between cells. To do so, normalization methods must account for technical and biological variability. Numerous normalization methods have been developed addressing different sources of dispersion and making specific assumptions about the count data. MAIN BODY: The selection of a normalization method has a direct impact on downstream analysis, for example differential gene expression and cluster identification. Thus, the objective of this review is to guide the reader in making an informed decision on the most appropriate normalization method to use. To this aim, we first give an overview of the different single cell sequencing platforms and methods commonly used including isolation and library preparation protocols. Next, we discuss the inherent sources of variability of scRNA-seq datasets. We describe the categories of normalization methods and include examples of each. We also delineate imputation and batch-effect correction methods. Furthermore, we describe data-driven metrics commonly used to evaluate the performance of normalization methods. We also discuss common scRNA-seq methods and toolkits used for integrated data analysis. CONCLUSIONS: According to the correction performed, normalization methods can be broadly classified as within and between-sample algorithms. Moreover, with respect to the mathematical model used, normalization methods can further be classified into: global scaling methods, generalized linear models, mixed methods, and machine learning-based methods. Each of these methods depict pros and cons and make different statistical assumptions. However, there is no better performing normalization method. Instead, metrics such as silhouette width, K-nearest neighbor batch-effect test, or Highly Variable Genes are recommended to assess the performance of normalization methods.

Assuntos

Análise de Célula Única , Animais , Humanos , Algoritmos , Perfilação da Expressão Gênica/métodos , Perfilação da Expressão Gênica/normas , RNA-Seq/métodos , RNA-Seq/normas , Análise de Sequência de RNA/métodos , Análise de Célula Única/métodos , Transcriptoma , Conjuntos de Dados como Assunto

20.

A meta-analysis on global change drivers and the risk of infectious disease.

Mahon, Michael B; Sack, Alexandra; Aleuy, O Alejandro; Barbera, Carly; Brown, Ethan; Buelow, Heather; Civitello, David J; Cohen, Jeremy M; de Wit, Luz A; Forstchen, Meghan; Halliday, Fletcher W; Heffernan, Patrick; Knutie, Sarah A; Korotasz, Alexis; Larson, Joanna G; Rumschlag, Samantha L; Selland, Emily; Shepack, Alexander; Vincent, Nitin; Rohr, Jason R.

Nature ; 629(8013): 830-836, 2024 May.

Artigo em Inglês | MEDLINE | ID: mdl-38720068

RESUMO

Anthropogenic change is contributing to the rise in emerging infectious diseases, which are significantly correlated with socioeconomic, environmental and ecological factors1. Studies have shown that infectious disease risk is modified by changes to biodiversity2-6, climate change7-11, chemical pollution12-14, landscape transformations15-20 and species introductions21. However, it remains unclear which global change drivers most increase disease and under what contexts. Here we amassed a dataset from the literature that contains 2,938 observations of infectious disease responses to global change drivers across 1,497 host-parasite combinations, including plant, animal and human hosts. We found that biodiversity loss, chemical pollution, climate change and introduced species are associated with increases in disease-related end points or harm, whereas urbanization is associated with decreases in disease end points. Natural biodiversity gradients, deforestation and forest fragmentation are comparatively unimportant or idiosyncratic as drivers of disease. Overall, these results are consistent across human and non-human diseases. Nevertheless, context-dependent effects of the global change drivers on disease were found to be common. The findings uncovered by this meta-analysis should help target disease management and surveillance efforts towards global change drivers that increase disease. Specifically, reducing greenhouse gas emissions, managing ecosystem health, and preventing biological invasions and biodiversity loss could help to reduce the burden of plant, animal and human diseases, especially when coupled with improvements to social and economic determinants of health.

Assuntos

Biodiversidade , Mudança Climática , Doenças Transmissíveis , Poluição Ambiental , Espécies Introduzidas , Animais , Humanos , Efeitos Antropogênicos , Mudança Climática/estatística & dados numéricos , Doenças Transmissíveis/epidemiologia , Doenças Transmissíveis/etiologia , Conservação dos Recursos Naturais/tendências , Conjuntos de Dados como Assunto , Poluição Ambiental/efeitos adversos , Agricultura Florestal , Florestas , Espécies Introduzidas/estatística & dados numéricos , Doenças das Plantas/etiologia , Medição de Risco , Urbanização

RESUMO

Assuntos

RESUMO

Assuntos

Assuntos

RESUMO

Assuntos

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA