Results 1 - 20 of 7,530
1.
PLoS One ; 19(5): e0299583, 2024.
Article in English | MEDLINE | ID: mdl-38696410

ABSTRACT

The mapping of metabolite-specific data to pathways within cellular metabolism is a major data analysis step needed for biochemical interpretation. A variety of machine learning approaches, particularly deep learning approaches, have been used to predict these metabolite-to-pathway mappings, utilizing a training dataset of known metabolite-to-pathway mappings. A few such training datasets have been derived from the Kyoto Encyclopedia of Genes and Genomes (KEGG). However, several prior published machine learning approaches utilized an erroneous KEGG-derived training dataset that used SMILES molecular representation strings (KEGG-SMILES dataset) and contained a sizable proportion (~26%) of duplicate entries. The presence of so many duplicates taints the training and testing sets generated from k-fold cross-validation of the KEGG-SMILES dataset. Therefore, the k-fold cross-validation performance of the resulting machine learning models was grossly inflated by the erroneous presence of these duplicate entries. Here we describe and evaluate the KEGG-SMILES dataset so that others may avoid using it. We also identify the prior publications that utilized this erroneous KEGG-SMILES dataset so their machine learning results can be properly and critically evaluated. In addition, we demonstrate the reduction of model k-fold cross-validation (CV) performance after de-duplicating the KEGG-SMILES dataset. This is a cautionary tale about properly vetting prior published benchmark datasets before using them in machine learning approaches. We hope others will avoid similar mistakes.
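
The leakage mechanism described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the toy records, the function names, and the simple fold assignment are all assumptions made for illustration. It counts test items whose exact record also sits in the training folds, before and after de-duplication.

```python
import random

def k_fold_indices(n_items, k, seed=0):
    """Shuffle item indices and split them into k roughly equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def count_leaky_test_items(records, k, seed=0):
    """Count test items whose exact record also appears in the training folds."""
    folds = k_fold_indices(len(records), k, seed)
    leaks = 0
    for test_fold in folds:
        test_set = set(test_fold)
        train_records = {records[i] for i in range(len(records)) if i not in test_set}
        leaks += sum(1 for i in test_fold if records[i] in train_records)
    return leaks

def deduplicate(records):
    """Drop repeated records while keeping first-seen order."""
    return list(dict.fromkeys(records))

# Hypothetical toy data: (SMILES string, pathway label) pairs with duplicates.
toy = [("CCO", "A"), ("CCO", "A"), ("CC(=O)O", "B"), ("C1CCCCC1", "C"),
       ("CC(=O)O", "B"), ("CCN", "A"), ("O=C=O", "B"), ("CCO", "A")]

print(count_leaky_test_items(toy, k=4))               # duplicates straddle folds
print(count_leaky_test_items(deduplicate(toy), k=4))  # no exact-duplicate leakage
```

De-duplicating before the split is what removes the leakage; de-duplicating within each fold afterwards would not, because copies of a record could still land in different folds.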


Subjects
Metabolic Networks and Pathways , Supervised Machine Learning , Humans , Datasets as Topic
2.
JMIR Mhealth Uhealth ; 12: e54622, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38696234

ABSTRACT

BACKGROUND: Postpartum depression (PPD) poses a significant maternal health challenge. The current approach to detecting PPD relies on in-person postpartum visits, which contributes to underdiagnosis. Furthermore, recognizing PPD symptoms can be challenging. Therefore, we explored the potential of using digital biomarkers from consumer wearables for PPD recognition. OBJECTIVE: The main goal of this study was to showcase the viability of using machine learning (ML) and digital biomarkers related to heart rate, physical activity, and energy expenditure derived from consumer-grade wearables for the recognition of PPD. METHODS: Using the All of Us Research Program Registered Tier v6 data set, we performed computational phenotyping of women with and without PPD following childbirth. Intraindividual ML models were developed using digital biomarkers from Fitbit to discern between prepregnancy, pregnancy, postpartum without depression, and postpartum with depression (ie, PPD diagnosis) periods. Models were built using generalized linear models, random forest, support vector machine, and k-nearest neighbor algorithms and evaluated using the κ statistic and multiclass area under the receiver operating characteristic curve (mAUC) to determine the algorithm with the best performance. The specificity of our individualized ML approach was confirmed in a cohort of women who gave birth and did not experience PPD. Moreover, we assessed the impact of a previous history of depression on model performance. We determined the variable importance for predicting the PPD period using Shapley additive explanations and confirmed the results using a permutation approach. Finally, we compared our individualized ML methodology against a traditional cohort-based ML model for PPD recognition and compared model performance using sensitivity, specificity, precision, recall, and F1-score. 
RESULTS: Patient cohorts of women with valid Fitbit data who gave birth included <20 with PPD and 39 without PPD. Our results demonstrated that intraindividual models using digital biomarkers discerned among prepregnancy, pregnancy, postpartum without depression, and postpartum with depression (ie, PPD diagnosis) periods, with random forest (mAUC=0.85; κ=0.80) models outperforming generalized linear models (mAUC=0.82; κ=0.74), support vector machine (mAUC=0.75; κ=0.72), and k-nearest neighbor (mAUC=0.74; κ=0.62). Model performance decreased in women without PPD, illustrating the method's specificity. Previous depression history did not impact the efficacy of the model for PPD recognition. Moreover, we found that the most predictive biomarker of PPD was calories burned during the basal metabolic rate. Finally, individualized models surpassed the performance of a conventional cohort-based model for PPD detection. CONCLUSIONS: This research establishes consumer wearables as a promising tool for PPD identification and highlights personalized ML approaches, which could transform early disease detection strategies.
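
As a minimal sketch of the agreement measure reported above, and assuming the κ statistic here is Cohen's kappa (the abstract does not say so explicitly), the following computes it for a multiclass labeling. The period labels are hypothetical toy data, not study data.

```python
from collections import Counter

def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between two labelings, corrected for chance."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of marginal label frequencies, summed over classes.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # degenerate case: a single class everywhere
    return (observed - expected) / (1.0 - expected)

# Hypothetical period labels for one individual (toy data).
truth = ["pre", "pre", "preg", "preg", "post", "post", "ppd", "ppd"]
preds = ["pre", "pre", "preg", "preg", "post", "post", "ppd", "post"]
print(round(cohen_kappa(truth, preds), 3))
```

Unlike raw accuracy, kappa discounts the agreement that would be expected from the label frequencies alone, which matters when the four periods contribute unequal amounts of data.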


Subjects
Biomarkers , Postpartum Depression , Wearable Electronic Devices , Humans , Postpartum Depression/diagnosis , Postpartum Depression/psychology , Female , Adult , Biomarkers/analysis , Cross-Sectional Studies , Wearable Electronic Devices/statistics & numerical data , Wearable Electronic Devices/standards , Machine Learning/standards , Pregnancy , United States , Datasets as Topic , ROC Curve
3.
Nature ; 629(8011): 370-375, 2024 May.
Article in English | MEDLINE | ID: mdl-38600390

ABSTRACT

Roads are expanding at the fastest pace in human history. This is the case especially in biodiversity-rich tropical nations, where roads can result in forest loss and fragmentation, wildfires, illicit land invasions and negative societal effects1-5. Many roads are being constructed illegally or informally and do not appear on any existing road map6-10; the toll of such 'ghost roads' on ecosystems is poorly understood. Here we use around 7,000 h of effort by trained volunteers to map ghost roads across the tropical Asia-Pacific region, sampling 1.42 million plots, each 1 km² in area. Our intensive sampling revealed a total of 1.37 million km of roads in our plots, from 3.0 to 6.6 times more roads than were found in leading datasets of roads globally. Across our study area, road building almost always preceded local forest loss, and road density was by far the strongest correlate11 of deforestation out of 38 potential biophysical and socioeconomic covariates. The relationship between road density and forest loss was nonlinear, with deforestation peaking soon after roads penetrate a landscape and then declining as roads multiply and remaining accessible forests largely disappear. Notably, after controlling for lower road density inside protected areas, we found that protected areas had only modest additional effects on preventing forest loss, implying that their most vital conservation function is limiting roads and road-related environmental disruption. Collectively, our findings suggest that burgeoning, poorly studied ghost roads are among the gravest of all direct threats to tropical forests.


Subjects
Automobiles , Conservation of Natural Resources , Forestry , Forests , Trees , Tropical Climate , Asia , Conservation of Natural Resources/statistics & numerical data , Conservation of Natural Resources/trends , Trees/growth & development , Datasets as Topic , Forestry/methods , Forestry/statistics & numerical data , Forestry/trends
4.
J Neural Eng ; 21(2)2024 Apr 17.
Article in English | MEDLINE | ID: mdl-38588700

ABSTRACT

Objective. The instability of the EEG acquisition devices may lead to information loss in the channels or frequency bands of the collected EEG. This phenomenon may be ignored in available models, which leads to the overfitting and low generalization of the model. Approach. Multiple self-supervised learning tasks are introduced in the proposed model to enhance the generalization of EEG emotion recognition and reduce the overfitting problem to some extent. Firstly, channel masking and frequency masking are introduced to simulate the information loss in certain channels and frequency bands resulting from the instability of EEG, and two self-supervised learning-based feature reconstruction tasks combining masked graph autoencoders (GAE) are constructed to enhance the generalization of the shared encoder. Secondly, to take full advantage of the complementary information contained in these two self-supervised learning tasks to ensure the reliability of feature reconstruction, a weight sharing (WS) mechanism is introduced between the two graph decoders. Thirdly, an adaptive weight multi-task loss (AWML) strategy based on homoscedastic uncertainty is adopted to combine the supervised learning loss and the two self-supervised learning losses to enhance the performance further. Main results. Experimental results on SEED, SEED-V, and DEAP datasets demonstrate that: (i) Generally, the proposed model achieves higher averaged emotion classification accuracy than various baselines included in both subject-dependent and subject-independent scenarios. (ii) Each key module contributes to the performance enhancement of the proposed model. (iii) It achieves higher training efficiency, and significantly lower model size and computational complexity than the state-of-the-art (SOTA) multi-task-based model. (iv) The performances of the proposed model are less influenced by the key parameters. Significance. The introduction of the self-supervised learning task helps to enhance the generalization of the EEG emotion recognition model and eliminate overfitting to some extent, which can be modified to be applied in other EEG-based classification tasks.
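
The abstract does not give the exact form of the AWML loss. As a hedged sketch, one commonly used homoscedastic-uncertainty weighting combines task losses as the sum over tasks of exp(-s_i) * L_i + s_i, where s_i is a learnable log-variance per task. The function name and the example values below are assumptions for illustration only, and the paper's formulation may differ.

```python
import math

def uncertainty_weighted_loss(losses, log_vars):
    """Combine task losses as sum_i exp(-s_i) * L_i + s_i, with s_i = log(sigma_i^2).

    A large s_i down-weights task i, while the additive s_i term penalises
    inflating the uncertainty without limit.
    """
    if len(losses) != len(log_vars):
        raise ValueError("one log-variance is needed per task loss")
    return sum(math.exp(-s) * loss + s for loss, s in zip(losses, log_vars))

# Hypothetical values: one supervised loss and two reconstruction losses.
task_losses = [0.9, 0.4, 0.6]
print(uncertainty_weighted_loss(task_losses, [0.0, 0.0, 0.0]))  # plain sum when s_i = 0
print(uncertainty_weighted_loss(task_losses, [0.0, 1.0, 1.0]))  # reconstruction tasks down-weighted
```

In a training loop the log-variances would be trainable parameters updated together with the network weights, so the balance between the three losses is learned rather than hand-tuned.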


Subjects
Electroencephalography , Emotions , Supervised Machine Learning , Supervised Machine Learning/standards , Datasets as Topic , Humans
5.
Sci Data ; 11(1): 358, 2024 Apr 09.
Article in English | MEDLINE | ID: mdl-38594314

ABSTRACT

This paper presents a standardised dataset versioning framework for improved reusability, recognition and data version tracking, facilitating comparisons and informed decision-making for data usability and workflow integration. The framework adopts a software engineering-like data versioning nomenclature ("major.minor.patch") and incorporates data schema principles to promote reproducibility and collaboration. To quantify changes in statistical properties over time, the concept of data drift metrics (d) is introduced. Three metrics (d_P, d_{E,PCA}, and d_{E,AE}) based on unsupervised Machine Learning techniques (Principal Component Analysis and Autoencoders) are evaluated for dataset creation, update, and deletion. The optimal choice is the d_{E,PCA} metric, combining PCA models with splines. It exhibits efficient computational time, with values below 50 for new dataset batches and values consistent with seasonal or trend variations. Major updates (i.e., values of 100) occur when scaling transformations are applied to over 30% of variables while efficiently handling information loss, yielding values close to 0. This metric achieved a favourable trade-off between interpretability, robustness against information loss, and computation time.
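
To make the "major.minor.patch" idea concrete, here is a minimal sketch that bumps a dataset version from a drift value. The function name and both thresholds are hypothetical: the abstract reports values of 100 for major updates and values below 50 for new batches, but it does not state which drift range maps to which version part, so that mapping is an assumption.

```python
def bump_version(version, drift, major_at=100.0, minor_at=50.0):
    """Return the next 'major.minor.patch' string for a given drift value.

    major_at and minor_at are illustrative thresholds, not values prescribed
    by the framework described in the abstract.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if drift >= major_at:
        return f"{major + 1}.0.0"       # large drift: incompatible change
    if drift >= minor_at:
        return f"{major}.{minor + 1}.0"  # moderate drift: notable update
    return f"{major}.{minor}.{patch + 1}"  # small drift: routine batch

print(bump_version("1.2.3", drift=100.0))  # 2.0.0
print(bump_version("1.2.3", drift=60.0))   # 1.3.0
print(bump_version("1.2.3", drift=10.0))   # 1.2.4
```

Keeping the thresholds as parameters lets a data steward tune them per dataset instead of hard-coding one policy.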


Subjects
Datasets as Topic , Software , Principal Component Analysis , Reproducibility of Results , Workflow , Datasets as Topic/standards , Machine Learning
6.
Nature ; 629(8010): 105-113, 2024 May.
Article in English | MEDLINE | ID: mdl-38632407

ABSTRACT

Arctic and alpine tundra ecosystems are large reservoirs of organic carbon1,2. Climate warming may stimulate ecosystem respiration and release carbon into the atmosphere3,4. The magnitude and persistency of this stimulation and the environmental mechanisms that drive its variation remain uncertain5-7. This hampers the accuracy of global land carbon-climate feedback projections7,8. Here we synthesize 136 datasets from 56 open-top chamber in situ warming experiments located at 28 arctic and alpine tundra sites which have been running for less than 1 year up to 25 years. We show that a mean rise of 1.4 °C [confidence interval (CI) 0.9-2.0 °C] in air and 0.4 °C [CI 0.2-0.7 °C] in soil temperature results in an increase in growing season ecosystem respiration by 30% [CI 22-38%] (n = 136). Our findings indicate that the stimulation of ecosystem respiration was due to increases in both plant-related and microbial respiration (n = 9) and continued for at least 25 years (n = 136). The magnitude of the warming effects on respiration was driven by variation in warming-induced changes in local soil conditions, that is, changes in total nitrogen concentration and pH and by context-dependent spatial variation in these conditions, in particular total nitrogen concentration and the carbon:nitrogen ratio. Tundra sites with stronger nitrogen limitations and sites in which warming had stimulated plant and microbial nutrient turnover seemed particularly sensitive in their respiration response to warming. The results highlight the importance of local soil conditions and warming-induced changes therein for future climatic impacts on respiration.


Subjects
Cell Respiration , Ecosystem , Global Warming , Tundra , Arctic Regions , Carbon/metabolism , Carbon/analysis , Carbon Cycle , Datasets as Topic , Hydrogen-Ion Concentration , Nitrogen/metabolism , Nitrogen/analysis , Plants/metabolism , Seasons , Soil/chemistry , Soil Microbiology , Temperature , Time Factors
7.
Hum Brain Mapp ; 45(6): e26683, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38647035

ABSTRACT

Machine learning (ML) approaches are increasingly being applied to neuroimaging data. Studies in neuroscience typically have to rely on a limited set of training data which may impair the generalizability of ML models. However, it is still unclear which kind of training sample is best suited to optimize generalization performance. In the present study, we systematically investigated the generalization performance of sex classification models trained on the parcelwise connectivity profile of either single samples or compound samples of two different sizes. Generalization performance was quantified in terms of mean across-sample classification accuracy and spatial consistency of accurately classifying parcels. Our results indicate that the generalization performance of parcelwise classifiers (pwCs) trained on single dataset samples is dependent on the specific test samples. Certain datasets seem to "match" in the sense that classifiers trained on a sample from one dataset achieved a high accuracy when tested on the respective other one and vice versa. The pwCs trained on the compound samples demonstrated overall highest generalization performance for all test samples, including one derived from a dataset not included in building the training samples. Thus, our results indicate that both a large sample size and a heterogeneous data composition of a training sample have a central role in achieving generalizable results.


Subjects
Connectome , Machine Learning , Magnetic Resonance Imaging , Humans , Female , Male , Adult , Connectome/methods , Sex Characteristics , Datasets as Topic , Young Adult , Brain/diagnostic imaging , Brain/physiology
8.
Neuroimage ; 292: 120603, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38588833

ABSTRACT

Fetal brain development is a complex process involving different stages of growth and organization which are crucial for the development of brain circuits and neural connections. Fetal atlases and labeled datasets are promising tools to investigate prenatal brain development. They support the identification of atypical brain patterns, providing insights into potential early signs of clinical conditions. In a nutshell, prenatal brain imaging and post-processing via modern tools are a cutting-edge field that will significantly contribute to the advancement of our understanding of fetal development. In this work, we first provide terminological clarification for specific terms (i.e., "brain template" and "brain atlas"), highlighting potentially misleading interpretations related to inconsistent use of terms in the literature. We discuss the major structures and neurodevelopmental milestones characterizing fetal brain ontogenesis. Our main contribution is the systematic review of 18 prenatal brain atlases and 3 datasets. We also tangentially focus on clinical, research, and ethical implications of prenatal neuroimaging.


Subjects
Atlases as Topic , Brain , Magnetic Resonance Imaging , Neuroimaging , Female , Humans , Pregnancy , Brain/diagnostic imaging , Brain/embryology , Datasets as Topic , Fetal Development/physiology , Fetus/diagnostic imaging , Magnetic Resonance Imaging/methods , Neuroimaging/methods
9.
J Virol ; 98(4): e0185823, 2024 Apr 16.
Article in English | MEDLINE | ID: mdl-38445887

ABSTRACT

Most individuals are latently infected with herpes simplex virus type 1 (HSV-1), and it is well-established that HSV-1 establishes latency in sensory neurons of peripheral ganglia. However, it was recently proposed that latent HSV-1 is also present in immune cells recovered from the ganglia of experimentally infected mice. Here, we reanalyzed the single-cell RNA sequencing (scRNA-Seq) data that formed the basis for that conclusion. Unexpectedly, off-target priming in 3' scRNA-Seq experiments enabled the detection of non-polyadenylated HSV-1 latency-associated transcript (LAT) intronic RNAs. However, LAT reads were near-exclusively detected in mixed populations of cells undergoing cell death. Specific loss of HSV-1 LAT and neuronal transcripts during quality control filtering indicated widespread destruction of neurons, supporting the presence of contaminating cell-free RNA in other cells following tissue processing. In conclusion, the reported detection of latent HSV-1 in non-neuronal cells is best explained using compromised scRNA-Seq datasets. IMPORTANCE: Most people are infected with herpes simplex virus type 1 (HSV-1) during their life. Once infected, the virus generally remains in a latent (silent) state, hiding within the neurons of peripheral ganglia. Periodic reactivation (reawakening) of the virus may cause fresh diseases such as cold sores. A recent study using single-cell RNA sequencing (scRNA-Seq) proposed that HSV-1 can also establish latency in the immune cells of mice, challenging existing dogma. We reanalyzed the data from that study and identified several flaws in the methodologies and analyses performed that invalidate the published conclusions. Specifically, we showed that the methodologies used resulted in widespread destruction of neurons which resulted in the presence of contaminants that confound the data analysis. We thus conclude that there remains little to no evidence for HSV-1 latency in immune cells.


Subjects
Artifacts , Sensory Ganglia , Human Herpesvirus 1 , Sensory Receptor Cells , RNA Sequence Analysis , Single-Cell Gene Expression Analysis , Virus Latency , Animals , Mice , Cell Death , Datasets as Topic , Sensory Ganglia/immunology , Sensory Ganglia/pathology , Sensory Ganglia/virology , Herpes Simplex/immunology , Herpes Simplex/pathology , Herpes Simplex/virology , Human Herpesvirus 1/genetics , Human Herpesvirus 1/isolation & purification , MicroRNAs/analysis , MicroRNAs/genetics , Reproducibility of Results , Viral RNA/analysis , Viral RNA/genetics , Sensory Receptor Cells/pathology , Sensory Receptor Cells/virology
10.
Histopathology ; 84(7): 1111-1129, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38443320

ABSTRACT

AIMS: The International Collaboration on Cancer Reporting (ICCR), a global alliance of major (inter-)national pathology and cancer organisations, is an initiative aimed at providing a unified international approach to reporting cancer. ICCR recently published new data sets for the reporting of invasive breast carcinoma, surgically removed lymph nodes for breast tumours and ductal carcinoma in situ, variants of lobular carcinoma in situ and low-grade lesions. The data set in this paper addresses the neoadjuvant setting. The aim is to promote high-quality, standardised reporting of tumour response and residual disease after neoadjuvant treatment that can be used for subsequent management decisions for each patient. METHODS: The ICCR convened expert panels of breast pathologists with a representative surgeon and oncologist to critically review and discuss current evidence. Feedback from the international public consultation was critical in the development of this data set. RESULTS: The expert panel concluded that a dedicated data set was required for reporting of breast specimens post-neoadjuvant therapy with inclusion of data elements specific to the neoadjuvant setting as core or non-core elements. This data set proposes a practical approach for handling and reporting breast resection specimens following neoadjuvant therapy. The comments for each data element clarify terminology, discuss available evidence and highlight areas with limited evidence that need further study. This data set overlaps with, and should be used in conjunction with, the data sets for the reporting of invasive breast carcinoma and surgically removed lymph nodes from patients with breast tumours, as appropriate. Key issues specific to the neoadjuvant setting are included in this paper. The entire data set is freely available on the ICCR website. 
CONCLUSIONS: High-quality, standardised reporting of tumour response and residual disease after neoadjuvant treatment is critical for subsequent management decisions for each patient.


Subjects
Breast Neoplasms , Neoadjuvant Therapy , Humans , Breast Neoplasms/pathology , Breast Neoplasms/therapy , Female , Datasets as Topic
11.
Sci Data ; 11(1): 290, 2024 Mar 12.
Article in English | MEDLINE | ID: mdl-38472209

ABSTRACT

Fat infiltration in skeletal muscle is now recognized as a standard feature of aging and is directly related to the decline in muscle function. However, there is still limited systematic integration and exploration of the mechanisms underlying the occurrence of myosteatosis in aging across species. Here, we re-analyzed bulk RNA-seq datasets to investigate the association between fat infiltration in skeletal muscle and aging. Our integrated analysis of single-nucleus transcriptomics in aged humans and Laiwu pigs with high intramuscular fat content identified species-preference subclusters and revealed core gene programs associated with myosteatosis. Furthermore, we found that fibro/adipogenic progenitors (FAPs) had the potential capacity to differentiate into PDE4D+/PDE7B+ preadipocytes across species. Additionally, cell-cell communication analysis revealed that FAPs may be associated with other adipogenic potential clusters via the COL4A2 and COL6A3 pathways. Our study elucidates the correlation mechanism between aging and fat infiltration in skeletal muscle, and these consensus signatures in both humans and pigs may contribute to increasing reproducibility and reliability in future studies in the field of muscle research.


Subjects
Adipogenesis , Aging , Skeletal Muscle , Aged , Animals , Humans , Adipogenesis/physiology , Cell Differentiation , Skeletal Muscle/physiology , Swine , Datasets as Topic , RNA-Seq , Transcriptome , Adipocytes , Stem Cells
12.
Sci Data ; 11(1): 289, 2024 Mar 12.
Article in English | MEDLINE | ID: mdl-38472225

ABSTRACT

High heterogeneity and complex interactions of malignant cells in breast cancer have been recognized as a driver of cancer progression and therapeutic failure. However, a complete understanding of common cancer cell states and their underlying driver factors remains scarce and challenging. Here, we revealed seven consensus cancer cell states recurring across patients by integrative analysis of single-cell RNA sequencing data of breast cancer. The distinct biological functions, the subtype-specific distribution, the potential cells of origin and the interrelation of consensus cancer cell states were systematically elucidated and validated in multiple independent datasets. We further uncovered the internal regulons and external cell components in tumor microenvironments, which contribute to the consensus cancer cell states. Using the state-specific signature, we also inferred the abundance of cells with each consensus cancer cell state by deconvolution of large breast cancer RNA-seq cohorts, revealing the association of the immune-related state with better survival. Our study provides new insights into the cancer cell state composition and potential therapeutic strategies of breast cancer.


Subjects
Breast Neoplasms , Single-Cell Analysis , Female , Humans , Breast Neoplasms/diagnosis , Breast Neoplasms/genetics , Clinical Relevance , Tumor Microenvironment , Datasets as Topic , RNA Sequence Analysis
14.
Nature ; 627(8002): 108-115, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38448695

ABSTRACT

The sea level along the US coastlines is projected to rise by 0.25-0.3 m by 2050, increasing the probability of more destructive flooding and inundation in major cities1-3. However, these impacts may be exacerbated by coastal subsidence (the sinking of coastal land areas4), a factor that is often underrepresented in coastal-management policies and long-term urban planning2,5. In this study, we combine high-resolution vertical land motion (that is, raising or lowering of land) and elevation datasets with projections of sea-level rise to quantify the potential inundated areas in 32 major US coastal cities. Here we show that, even when considering the current coastal-defence structures, further land area of between 1,006 and 1,389 km² is threatened by relative sea-level rise by 2050, posing a threat to a population of 55,000-273,000 people and 31,000-171,000 properties. Our analysis shows that not accounting for spatially variable land subsidence within the cities may lead to inaccurate projections of expected exposure. These potential consequences show the scale of the adaptation challenge, which is not appreciated in most US coastal cities.
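
The role of spatially variable subsidence can be shown with a small arithmetic sketch. It is not the study's method: it assumes a simple "bathtub" comparison of elevation against relative sea-level rise, ignores coastal defences and hydrological connectivity, and uses made-up elevations and land-motion rates.

```python
def relative_sea_level_rise(slr_m, vlm_m_per_yr, years):
    """Relative rise = ocean rise minus vertical land motion.

    A negative vertical land motion (subsidence) increases the relative rise.
    """
    return slr_m - vlm_m_per_yr * years

def exposed_cells(elevations_m, vlm_m_per_yr, slr_m, years):
    """Return indices of cells whose elevation is at or below the relative rise."""
    return [i for i, (z, v) in enumerate(zip(elevations_m, vlm_m_per_yr))
            if z <= relative_sea_level_rise(slr_m, v, years)]

# Hypothetical cells: elevation in metres, land motion in metres per year.
elevations = [0.20, 0.35, 0.40]
land_motion = [0.0, -0.004, 0.001]  # the second cell is subsiding

print(exposed_cells(elevations, land_motion, slr_m=0.30, years=26))      # [0, 1]
print(exposed_cells(elevations, [0.0, 0.0, 0.0], slr_m=0.30, years=26))  # [0]
```

With the per-cell rates the subsiding second cell becomes exposed, while treating land motion as zero everywhere misses it, which is the kind of inaccuracy the abstract attributes to ignoring spatially variable subsidence.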


Subjects
Altitude , Cities , City Planning , Floods , Motion , Sea Level Rise , Cities/statistics & numerical data , City Planning/methods , City Planning/trends , Floods/prevention & control , Floods/statistics & numerical data , United States , Datasets as Topic , Sea Level Rise/statistics & numerical data , Acclimatization
15.
Nature ; 628(8009): 788-794, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38538788

ABSTRACT

Biodiversity faces unprecedented threats from rapid global change1. Signals of biodiversity change come from time-series abundance datasets for thousands of species over large geographic and temporal scales. Analyses of these biodiversity datasets have pointed to varied trends in abundance, including increases and decreases. However, these analyses have not fully accounted for spatial, temporal and phylogenetic structures in the data. Here, using a new statistical framework, we show across ten high-profile biodiversity datasets2-11 that increases and decreases under existing approaches vanish once spatial, temporal and phylogenetic structures are accounted for. This is a consequence of existing approaches severely underestimating trend uncertainty and sometimes misestimating the trend direction. Under our revised average abundance trends that appropriately recognize uncertainty, we failed to observe a single increasing or decreasing trend at 95% credible intervals in our ten datasets. This emphasizes how little is known about biodiversity change across vast spatial and taxonomic scales. Despite this uncertainty at vast scales, we reveal improved local-scale prediction accuracy by accounting for spatial, temporal and phylogenetic structures. Improved prediction offers hope of estimating biodiversity change at policy-relevant scales, guiding adaptive conservation responses.


Subjects
Biodiversity , Uncertainty , Animals , Conservation of Natural Resources/methods , Conservation of Natural Resources/trends , Datasets as Topic , Phylogeny , Spatio-Temporal Analysis , Time Factors
16.
Artif Intell Med ; 151: 102846, 2024 May.
Article in English | MEDLINE | ID: mdl-38547777

ABSTRACT

BACKGROUND AND OBJECTIVES: Generating coherent reports from medical images is an important task for reducing doctors' workload. Unlike traditional image captioning tasks, the task of medical image report generation faces more challenges. Current models for generating reports from medical images often fail to characterize some abnormal findings, and some models generate reports with low quality. In this study, we propose a model to generate high-quality reports from medical images. METHODS: In this paper, we propose a model called Hybrid Discriminator Generative Adversarial Network (HDGAN), which combines Generative Adversarial Network (GAN) with Reinforcement Learning (RL). The HDGAN model consists of a generator, a one-sentence discriminator, and a one-word discriminator. Specifically, the RL reward signals are judged on the one-sentence discriminator and one-word discriminator separately. The one-sentence discriminator can better learn sentence-level structural information, while the one-word discriminator can learn word diversity information effectively. RESULTS: Our approach performs better on the IU-X-ray and COV-CTR datasets than the baseline models. For the ROUGE metric, our method outperforms the state-of-the-art model by 0.36 on the IU-X-ray, 0.06 on the MIMIC-CXR and 0.156 on the COV-CTR. CONCLUSIONS: The compositional framework we proposed can generate more accurate medical image reports at different levels.


Subjects
Deep Learning , Diagnostic Imaging , Computer-Assisted Image Processing , Computer Neural Networks , Datasets as Topic , Diagnostic Imaging/methods , Computer-Assisted Image Processing/methods , Thoracic Radiography , Thorax/diagnostic imaging , Humans
17.
Nature ; 627(8003): 340-346, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38374255

ABSTRACT

Comprehensively mapping the genetic basis of human disease across diverse individuals is a long-standing goal for the field of human genetics1-4. The All of Us Research Program is a longitudinal cohort study aiming to enrol a diverse group of at least one million individuals across the USA to accelerate biomedical research and improve human health5,6. Here we describe the programme's genomics data release of 245,388 clinical-grade genome sequences. This resource is unique in its diversity as 77% of participants are from communities that are historically under-represented in biomedical research and 46% are individuals from under-represented racial and ethnic minorities. All of Us identified more than 1 billion genetic variants, including more than 275 million previously unreported genetic variants, more than 3.9 million of which had coding consequences. Leveraging linkage between genomic data and the longitudinal electronic health record, we evaluated 3,724 genetic variants associated with 117 diseases and found high replication rates across both participants of European ancestry and participants of African ancestry. Summary-level data are publicly available, and individual-level data can be accessed by researchers through the All of Us Researcher Workbench using a unique data passport model with a median time from initial researcher registration to data access of 29 hours. We anticipate that this diverse dataset will advance the promise of genomic medicine for all.


Subjects
Datasets as Topic , Genetics, Medical , Genetics, Population , Genome, Human , Genomics , Minority Groups , Racial Groups , Humans , Access to Information , Black People/genetics , Electronic Health Records , Ethnicity/genetics , European People/genetics , Genetic Predisposition to Disease/genetics , Genetic Variation/genetics , Genome, Human/genetics , Longitudinal Studies , Racial Groups/genetics , Reproducibility of Results , Research Personnel , Time Factors , Vulnerable Populations
18.
Nature ; 627(8003): 335-339, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38418873

ABSTRACT

The latitudinal diversity gradient (LDG) dominates global patterns of diversity [1,2], but the factors that underlie the LDG remain elusive. Here we use a unique global dataset [3] to show that vascular plants on oceanic islands exhibit a weakened LDG and explore potential mechanisms for this effect. Our results show that traditional physical drivers of island biogeography [4], namely area and isolation, contribute to the difference between island and mainland diversity at a given latitude (that is, the island species deficit), as smaller and more distant islands experience reduced colonization. However, plant species with mutualists are underrepresented on islands, and we find that this plant mutualism filter explains more variation in the island species deficit than abiotic factors. In particular, plant species that require animal pollinators or microbial mutualists such as arbuscular mycorrhizal fungi contribute disproportionately to the island species deficit near the Equator, with contributions decreasing with distance from the Equator. Plant mutualist filters on species richness are particularly strong at low absolute latitudes where mainland richness is highest, weakening the LDG of oceanic islands. These results provide empirical evidence that mutualisms, habitat heterogeneity and dispersal are key to the maintenance of high tropical plant diversity and mediate the biogeographic patterns of plant diversity on Earth.


Subjects
Biodiversity , Geographic Mapping , Islands , Plants , Symbiosis , Animals , Datasets as Topic , Mycorrhizae/physiology , Plants/microbiology , Pollination , Tropical Climate , Oceans and Seas , Phylogeography
19.
JAMA Netw Open ; 7(2): e2356619, 2024 Feb 05.
Article in English | MEDLINE | ID: mdl-38393731

ABSTRACT

Importance: Nonadherence to antihypertensive medications is associated with uncontrolled blood pressure, higher mortality rates, and increased health care costs, and food insecurity is one of the modifiable medication nonadherence risk factors. The Supplemental Nutrition Assistance Program (SNAP), a social intervention program for addressing food insecurity, may help improve adherence to antihypertensive medications. Objective: To evaluate whether receipt of SNAP benefits can modify the consequences of food insecurity on nonadherence to antihypertensive medications. Design, Setting, and Participants: A retrospective cohort study design was used to assemble a cohort of antihypertensive medication users from the linked Medical Expenditure Panel Survey (MEPS)-National Health Interview Survey (NHIS) dataset for 2016 to 2017. The MEPS is a national longitudinal survey on verified self-reported prescribed medication use and health care access measures, and the NHIS is an annual cross-sectional survey of US households that collects comprehensive health information, health behavior, and sociodemographic data, including receipt of SNAP benefits. Receipt of SNAP benefits in the past 12 months and food insecurity status in the past 30 days were assessed through standard questionnaires during the study period. Data analysis was performed from March to December 2021. Exposure: Status of SNAP benefit receipt. Main Outcomes and Measures: The main outcome, nonadherence to antihypertensive medications, was based on medication refill adherence (MRA), defined using the MEPS data as the total days' supply divided by 365 days for each antihypertensive medication class. Patients were considered nonadherent if their overall MRA was less than 80%. Food insecurity status in the 30 days prior to the survey was modeled as the effect modifier. Inverse probability of treatment (IPT) weighting was used to control for measured confounding effects of baseline covariates. A probit model was used, weighted by the product of the computed IPT weights and MEPS weights, to estimate the population average treatment effects (PATEs) of SNAP benefit receipt on nonadherence. A stratified analysis approach was used to assess for potential effect modification by food insecurity status. Results: This analysis involved 6692 antihypertensive medication users, of whom 1203 (12.8%) reported receiving SNAP benefits and 1338 (14.8%) were considered food insecure. The mean (SD) age was 63.0 (13.3) years; 3632 (51.3%) of the participants were women and 3060 (45.7%) were men. Although SNAP was not associated with nonadherence to antihypertensive medications in the overall population, it was associated with a 13.6-percentage point reduction in nonadherence (PATE, -13.6 [95% CI, -25.0 to -2.3]) among the food-insecure subgroup but not among their food-secure counterparts. Conclusions and Relevance: This analysis of a national observational dataset suggests that patients with hypertension who receive SNAP benefits may be less likely to become nonadherent to antihypertensive medication, especially if they are experiencing food insecurity. Further examination of the role of SNAP as a potential intervention for preventing nonadherence to antihypertensive medications through prospectively designed interventional studies or natural experiment study designs is needed.
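
The medication refill adherence (MRA) measure described in this abstract can be illustrated with a minimal sketch. The abstract states only that MRA is the total days' supply divided by 365 days for each medication class and that an overall MRA below 80% counts as nonadherent; how the per-class values are combined into the overall MRA is not stated, so the unweighted mean used here is an assumption, as are all function names and the example patient data.

```python
from typing import Dict, List

PERIOD_DAYS = 365            # observation window stated in the abstract
NONADHERENCE_CUTOFF = 0.80   # overall MRA below 80% counts as nonadherent


def class_mra(days_supply: List[int], period_days: int = PERIOD_DAYS) -> float:
    """MRA for one medication class: total days' supply / period length."""
    return sum(days_supply) / period_days


def overall_mra(fills_by_class: Dict[str, List[int]]) -> float:
    """Combine per-class MRA values into one overall value.

    Assumption: the unweighted mean across classes. The abstract does not
    say how the study aggregated classes.
    """
    if not fills_by_class:
        raise ValueError("at least one medication class is required")
    values = [class_mra(fills) for fills in fills_by_class.values()]
    return sum(values) / len(values)


def is_nonadherent(fills_by_class: Dict[str, List[int]]) -> bool:
    """Apply the 80% cutoff to the overall MRA."""
    return overall_mra(fills_by_class) < NONADHERENCE_CUTOFF


# Hypothetical patient: days' supply of each fill, grouped by medication class.
patient = {
    "ace_inhibitor": [30] * 12,  # twelve 30-day fills
    "diuretic": [30] * 6,        # six 30-day fills
}
print(round(overall_mra(patient), 3))
print(is_nonadherent(patient))
```

Under these assumptions, the hypothetical patient is well covered for one class but not the other, so the averaged MRA falls below the cutoff. A different aggregation rule, such as the minimum across classes, would change which patients are flagged.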


Subjects
Food Assistance , Female , Humans , Male , Middle Aged , Antihypertensive Agents/therapeutic use , Cross-Sectional Studies , Poverty , Retrospective Studies , Aged , Datasets as Topic
20.
Sci Data ; 11(1): 224, 2024 Feb 21.
Article in English | MEDLINE | ID: mdl-38383523

ABSTRACT

The cutaneous absorption parameters of xenobiotics are crucial for the development of drugs and cosmetics, as well as for assessing environmental and occupational chemical risks. Despite the great variability in the design of experimental conditions due to uncertain international guidelines, datasets like HuskinDB have been created to report skin absorption endpoints. This review updates available skin permeability data by rigorously compiling research published between 2012 and 2021. Inclusion and exclusion criteria have been selected to build the most harmonized and reusable dataset possible. The Generative Topographic Mapping method was applied to the present dataset and compared to HuskinDB to monitor the progress in skin permeability research and locate chemotypes of particular concern. The open-source dataset (SkinPiX) includes steady-state flux, maximum flux, lag time and permeability coefficient results for the substances tested, as well as relevant information on experimental parameters that can impact the data. It can be used to extract subsets of data for comparisons and to build predictive models.


Subjects
Skin Absorption , Skin , Xenobiotics , Permeability , Skin/metabolism , Xenobiotics/metabolism , Datasets as Topic , Humans