Results 1 - 20 of 5,053
1.
Annu Rev Neurosci ; 44: 129-151, 2021 07 08.
Article in English | MEDLINE | ID: mdl-33556250

ABSTRACT

Improvements in understanding the neurobiological basis of mental illness have unfortunately not translated into major advances in treatment. At this point, it is clear that psychiatric disorders are exceedingly complex and that, in order to account for and leverage this complexity, we need to collect longitudinal data sets from much larger and more diverse samples than is practical using traditional methods. We discuss how smartphone-based research methods have the potential to dramatically advance our understanding of the neuroscience of mental health. This, we expect, will take the form of complementing lab-based hard neuroscience research with dense sampling of cognitive tests, clinical questionnaires, passive data from smartphone sensors, and experience-sampling data as people go about their daily lives. Theory- and data-driven approaches can help make sense of these rich data sets, and the combination of computational tools and the big data that smartphones make possible has great potential value for researchers wishing to understand how aspects of brain function give rise to, or emerge from, states of mental health and illness.


Subjects
Mental Disorders, Neurosciences, Humans, Mental Health, Smartphone
2.
Trends Genet ; 2024 Aug 07.
Article in English | MEDLINE | ID: mdl-39117482

ABSTRACT

Harnessing cutting-edge technologies to enhance crop productivity is a pivotal goal in modern plant breeding. Artificial intelligence (AI) is renowned for its prowess in big data analysis and pattern recognition, and is revolutionizing numerous scientific domains including plant breeding. We explore the wider potential of AI tools in various facets of breeding, including data collection, unlocking genetic diversity within genebanks, and bridging the genotype-phenotype gap to facilitate crop breeding. This will enable the development of crop cultivars tailored to the projected future environments. Moreover, AI tools also hold promise for refining crop traits by improving the precision of gene-editing systems and predicting the potential effects of gene variants on plant phenotypes. Leveraging AI-enabled precision breeding can augment the efficiency of breeding programs and holds promise for optimizing cropping systems at the grassroots level. This entails identifying optimal inter-cropping and crop-rotation models to enhance agricultural sustainability and productivity in the field.

3.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-39007597

ABSTRACT

The incidence of thyroid cancer continues to increase even though many diagnostic tools have been developed in recent years. Because there is no single standard procedure for diagnosing thyroid cancer, clinicians must conduct a variety of tests. This workup yields multi-dimensional big data, and the lack of a common approach produces randomly distributed missing (sparse) data; both are formidable challenges for machine learning algorithms. This paper aims to develop an accurate and computationally efficient deep learning algorithm to diagnose thyroid cancer. To this end, the singularities that randomly distributed missing data introduce into the learning problem are treated, and dimensionality reduction with inner- and target-similarity approaches is developed to select the most informative input datasets. In addition, size reduction with a hierarchical clustering algorithm is performed to eliminate highly similar data samples. Four machine learning algorithms are trained and then tested on unseen data to validate their generalization and robustness. The results yield 100% training accuracy and 83% testing accuracy on the unseen data. The computational time efficiency of the algorithms is also examined under equal conditions.
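
As an aside for readers who want to see the shape of the size-reduction step described above, the following is a minimal, illustrative sketch (not the authors' code): agglomerative clustering groups highly similar samples and keeps one representative per cluster. The feature matrix, its dimensions, and the distance threshold are hypothetical placeholders.

    # Illustrative sketch: collapse near-duplicate samples via hierarchical clustering.
    # The feature matrix X and the distance threshold are hypothetical placeholders.
    import numpy as np
    from sklearn.cluster import AgglomerativeClustering

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 30))  # stand-in for multi-dimensional diagnostic test data

    clusterer = AgglomerativeClustering(
        n_clusters=None, distance_threshold=2.0, linkage="average"
    )
    labels = clusterer.fit_predict(X)

    # Keep one representative sample per cluster of highly similar samples.
    _, keep_idx = np.unique(labels, return_index=True)
    X_reduced = X[np.sort(keep_idx)]
    print(X.shape, "->", X_reduced.shape)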


Subjects
Algorithms, Deep Learning, Thyroid Neoplasms, Thyroid Neoplasms/diagnosis, Humans, Machine Learning, Cluster Analysis
4.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38701422

ABSTRACT

In this review article, we explore the transformative impact of deep learning (DL) on structural bioinformatics, emphasizing its pivotal role in a scientific revolution driven by extensive data, accessible toolkits and robust computing resources. As big data continue to advance, DL is poised to become an integral component of healthcare and biology, revolutionizing analytical processes. Our comprehensive review provides detailed insights into DL, featuring specific demonstrations of its notable applications in bioinformatics. We address challenges tailored for DL, spotlight recent successes in structural bioinformatics and present a clear exposition of DL, from basic shallow neural networks to advanced models such as convolutional, recurrent and transformer neural networks. This paper discusses the emerging use of DL for understanding biomolecular structures, anticipating ongoing developments and applications in the realm of structural bioinformatics.


Subjects
Computational Biology, Deep Learning, Computational Biology/methods, Neural Networks (Computer), Humans
5.
Proc Natl Acad Sci U S A ; 120(19): e2215829120, 2023 05 09.
Article in English | MEDLINE | ID: mdl-37126710

ABSTRACT

Technology startups play an essential role in the economy: seven of the ten largest companies are rooted in technology, and venture capital investments total approximately $300B annually. Yet, important startup outcomes (e.g., whether a startup raises venture capital or gets acquired) remain difficult to forecast, particularly during the early stages of venture formation. Here, we examine the impact of an essential, yet underexplored, factor that can be observed from the moment of startup creation: founder personality. We predict psychological traits from digital footprints to explore how founder personality is associated with critical startup milestones. Observing 10,541 founder-startup dyads, we provide large-scale, ecologically valid evidence that founder personality is associated with outcomes across all phases of a venture's life (i.e., from raising the earliest funding round to exiting via acquisition or initial public offering). We find that openness and agreeableness are positively related to the likelihood of raising an initial round of funding (but unrelated to all subsequent conditional outcomes). Neuroticism is negatively related to all outcomes, highlighting the importance of founders' resilience. Finally, conscientiousness is positively related to early-stage investment, but negatively related to exit conditional on funding. While prior work has painted conscientiousness as broadly beneficial to performance, our findings highlight a potential boundary condition: the fast-moving world of technology startups affords founders with lower or moderate levels of conscientiousness a competitive advantage when it comes to monetizing their business via acquisition or IPO.
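
To make the analysis concrete for readers, a minimal sketch of how trait-outcome associations of this kind are typically estimated is given below (a generic logistic regression, not the study's pipeline; the dataframe, column names, and data are hypothetical).

    # Illustrative sketch: relate Big Five trait scores to a binary startup milestone.
    # All data and column names are hypothetical placeholders.
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    traits = ["openness", "conscientiousness", "extraversion", "agreeableness", "neuroticism"]
    df = pd.DataFrame(rng.normal(size=(1000, 5)), columns=traits)
    df["raised_funding"] = (rng.random(1000) < 0.3).astype(int)

    X = sm.add_constant(df[traits])
    model = sm.Logit(df["raised_funding"], X).fit(disp=0)
    print(np.exp(model.params))  # odds ratios per one-unit increase in each trait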


Subjects
Commerce, Personality, Neuroticism, Entrepreneurship, Technology
6.
Gastroenterology ; 166(4): 680-689.e4, 2024 04.
Article in English | MEDLINE | ID: mdl-38123025

ABSTRACT

BACKGROUND & AIMS: Endoscopic submucosal dissection (ESD) is a well-established treatment modality for gastric neoplasms. We aimed to investigate the effect of procedural volume on the outcome of ESD for gastric cancer or adenoma. METHODS: In this population-based cohort study, patients who underwent ESD for gastric cancer or adenoma from November 2011 to December 2017 were identified using the Korean National Health Insurance Service database. Operational definitions to identify the target population and post-procedural complications were created using diagnosis and procedure codes and were validated using hospital medical record data. Outcomes included hemorrhage, perforation, pneumonia, 30-day mortality, a composite outcome comprising all of these adverse outcomes, and additional resection. Hospital volume was categorized into 3 groups based on the results of the threshold analysis: high-, medium-, low-volume centers (HVCs, MVCs, and LVCs, respectively). Inverse probability of treatment weighting analysis was applied to enhance comparability across the volume groups. RESULTS: There were 94,246 procedures performed in 88,687 patients during the study period. There were 5886 composite events including 4925 hemorrhage, 447 perforation, and 703 pneumonia cases. There were significant differences in ESD-related adverse outcomes among the 3 hospital volume categories, showing that HVCs and MVCs were associated with a lower risk of a composite outcome than LVCs (inverse probability of treatment-weighted odds ratio [OR], 0.651; 95% CI, 0.521-0.814; inverse probability of treatment-weighted OR, 0.641; 95% CI, 0.534-0.769). Similar tendencies were also shown for hemorrhage, perforation, and pneumonia; however, these were not evident for additional resection. CONCLUSIONS: Procedural volume was closely associated with clinical outcome in patients undergoing ESD for gastric cancer or adenoma.


Subjects
Adenoma, Endoscopic Mucosal Resection, Pneumonia, Stomach Neoplasms, Humans, Endoscopic Mucosal Resection/adverse effects, Endoscopic Mucosal Resection/methods, Stomach Neoplasms/surgery, Stomach Neoplasms/etiology, Cohort Studies, Hemorrhage, Adenoma/surgery, Adenoma/etiology, Treatment Outcome, Retrospective Studies, Gastric Mucosa/surgery
7.
Brief Bioinform ; 24(6)2023 09 22.
Article in English | MEDLINE | ID: mdl-37798252

ABSTRACT

The emergence of massive datasets spanning the multiple levels of molecular biology has made their analysis and knowledge transfer more complex. Flexible tools for managing big biological datasets could greatly help standardize the use of data visualization and integration methods. Business intelligence (BI) tools have been used as exploratory tools in many fields. They offer numerous connectors that link diverse data repositories to a unified graphical interface, providing an overview of the data and facilitating interpretation for decision makers. BI tools could therefore be a flexible and user-friendly way of handling molecular biological data with interactive visualizations. However, such tools are rarely used to explore massive and complex datasets in the biological fields. We believe two main obstacles are responsible. First, we posit that the mechanisms for importing data into BI tools are not compatible with biological databases. Second, BI tools may not be adapted to certain particularities of complex biological data, namely the size and variability of datasets and the availability of specialized visualizations. This paper highlights the use of five BI tools (Elastic Kibana, Siren Investigate, Microsoft Power BI, Salesforce Tableau and Apache Superset) that are compatible with the massive data management repository engine Elasticsearch. Four case studies are discussed in which these BI tools were applied to biological datasets with different characteristics. We conclude that the performance of the tools depends on the complexity of the biological questions and the size of the datasets.
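
As a minimal sketch of the plumbing such a setup requires (assuming a local Elasticsearch instance and the official Python client; the index name and document fields are hypothetical), tabular biological records can be bulk-indexed and then explored from any of the BI tools named above:

    # Minimal sketch: bulk-index tabular biological records into Elasticsearch so a
    # BI tool (e.g., Kibana) can explore them. Index name and fields are hypothetical.
    from elasticsearch import Elasticsearch, helpers

    es = Elasticsearch("http://localhost:9200")  # assumes a local instance, 8.x client

    records = [
        {"gene": "TP53", "log2_fold_change": -1.8, "padj": 0.001, "condition": "tumor"},
        {"gene": "BRCA1", "log2_fold_change": 0.4, "padj": 0.21, "condition": "tumor"},
    ]
    actions = ({"_index": "expression_results", "_source": r} for r in records)
    helpers.bulk(es, actions)

    # Quick sanity check: retrieve records passing a significance threshold.
    hits = es.search(index="expression_results",
                     query={"range": {"padj": {"lt": 0.05}}})
    print(hits["hits"]["total"])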


Subjects
Datasets as Topic, Software, Data Visualization
8.
Proc Natl Acad Sci U S A ; 119(10): e2120455119, 2022 03 08.
Article in English | MEDLINE | ID: mdl-35238633

ABSTRACT

Crowdsourced online genealogies have an unprecedented potential to shed light on long-run population dynamics, if analyzed properly. We investigate whether the historical mortality dynamics of males in familinx, a popular genealogical dataset, are representative of the general population, or whether they are closer to those of an elite subpopulation in two territories. The first territory is the German Empire, with a low level of genealogical coverage relative to the total population size, while the second territory is The Netherlands, with a higher level of genealogical coverage relative to the population. We find that, for the period around the turn of the 20th century (for which benchmark national life tables are available), mortality is consistently lower and more homogeneous in familinx than in the general population. For that time period, the mortality levels in familinx resemble those of elites in the German Empire, while they are closer to those in national life tables in The Netherlands. For the period before the 19th century, the mortality levels in familinx mirror those of the elites in both territories. We identify the low coverage of the total population and the oversampling of elites in online genealogies as potential explanations for these findings. Emerging digital data may revolutionize our knowledge of historical demographic dynamics, but only if we understand their potential uses and limitations.


Subjects
Demography, Life Expectancy, Adult, Germany, 17th Century History, 18th Century History, 19th Century History, 20th Century History, Humans, Male, Netherlands, Population Dynamics
9.
Proc Natl Acad Sci U S A ; 119(8)2022 02 22.
Article in English | MEDLINE | ID: mdl-35135891

ABSTRACT

With rapid urbanization and increasing climate risks, enhancing the resilience of urban systems has never been more important. Despite the availability of massive datasets of human behavior (e.g., mobile phone data, satellite imagery), studies on disaster resilience have been limited to using static measures as proxies for resilience. However, static metrics have significant drawbacks such as their inability to capture the effects of compounding and accumulating disaster shocks; dynamic interdependencies of social, economic, and infrastructure systems; and critical transitions and regime shifts, which are essential components of the complex disaster resilience process. In this article, we argue that the disaster resilience literature needs to seize the opportunities offered by big data and move in a different research direction: developing data-driven, dynamical complex systems models of disaster resilience. Data-driven complex systems modeling approaches could overcome the drawbacks of static measures and allow us to quantitatively model the dynamic recovery trajectories and intrinsic resilience characteristics of communities in a generic manner by leveraging large-scale and granular observations. This approach brings a paradigm shift in modeling the disaster resilience process and its linkage with the recovery process, paving the way to answering important questions for policy applications via counterfactual analysis and simulations.

10.
Proc Natl Acad Sci U S A ; 119(24): e2109665119, 2022 Jun 14.
Article in English | MEDLINE | ID: mdl-35679347

ABSTRACT

The information content of crystalline materials becomes astronomical when collective electronic behavior and its fluctuations are taken into account. In the past decade, improvements in source brightness and detector technology at modern X-ray facilities have allowed a dramatically increased fraction of this information to be captured. Now, the primary challenge is to understand and discover scientific principles from big datasets when a comprehensive analysis is beyond human reach. We report the development of an unsupervised machine learning approach, X-ray diffraction (XRD) temperature clustering (X-TEC), that can automatically extract charge density wave order parameters and detect intra-unit-cell ordering and its fluctuations from a series of high-volume X-ray diffraction measurements taken at multiple temperatures. We benchmark X-TEC with diffraction data on a quasi-skutterudite family of materials, (CaxSr1-x)3Rh4Sn13, where a quantum critical point is observed as a function of Ca concentration. We apply X-TEC to XRD data on the pyrochlore metal Cd2Re2O7 to investigate its two much-debated structural phase transitions and uncover the Goldstone mode accompanying them. We demonstrate how unprecedented atomic-scale knowledge can be gained when human researchers connect the X-TEC results to physical principles. Specifically, we extract from the X-TEC-revealed selection rules that the Cd and Re displacements are approximately equal in amplitude but out of phase. This discovery reveals a previously unknown involvement of [Formula: see text] Re, supporting the idea of an electronic origin of the structural order. Our approach can radically transform XRD experiments by allowing in operando data analysis and enabling researchers to refine experiments by discovering interesting regions of phase space on the fly.
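
The core idea of temperature clustering can be illustrated with a generic sketch (this is not the published X-TEC code; all data below are synthetic): normalize each reciprocal-space pixel's intensity-versus-temperature trajectory and cluster the trajectories, so that order-parameter-like pixels separate from featureless background.

    # Generic sketch of temperature clustering of diffraction intensities
    # (inspired by, but not reproducing, X-TEC). All data are synthetic.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(3)
    temperatures = np.linspace(10, 300, 30)  # K
    n_pixels = 2000

    # Synthetic trajectories: half the pixels are flat background, half "turn on"
    # below a transition temperature like an order parameter.
    background = 1.0 + 0.05 * rng.normal(size=(n_pixels // 2, temperatures.size))
    order = np.clip((150.0 - temperatures) / 150.0, 0, None) ** 0.5
    signal = order + 0.05 * rng.normal(size=(n_pixels // 2, temperatures.size))
    traj = np.vstack([background, signal])

    # Normalize each trajectory (log + z-score) so clustering sees shape, not scale.
    x = np.log(np.clip(traj, 1e-6, None))
    x = (x - x.mean(axis=1, keepdims=True)) / (x.std(axis=1, keepdims=True) + 1e-9)

    labels = GaussianMixture(n_components=2, random_state=0).fit_predict(x)
    print(np.bincount(labels))  # two clusters: background-like vs order-parameter-like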

11.
Proc Natl Acad Sci U S A ; 119(13): e2117203119, 2022 03 29.
Article in English | MEDLINE | ID: mdl-35312366

ABSTRACT

Significance: Public databases are an important resource for machine learning research, but their growing availability sometimes leads to "off-label" usage, where data published for one task are used for another. This work reveals that such off-label usage could lead to biased, overly optimistic results of machine-learning algorithms. The underlying cause is that public data are processed with hidden processing pipelines that alter the data features. Here we study three well-known algorithms developed for image reconstruction from magnetic resonance imaging measurements and show that they could produce biased results with up to 48% artificial improvement when applied to public databases. We refer to the publication of such results as implicit "data crimes" to raise community awareness of this growing big data problem.


Subjects
Algorithms, Machine Learning, Bias, Crime, Computer-Assisted Image Processing
12.
J Infect Dis ; 229(4): 1026-1034, 2024 Apr 12.
Article in English | MEDLINE | ID: mdl-38097377

ABSTRACT

BACKGROUND: Solid organ transplant recipients (SOTRs) are at higher risk for severe infection. However, the risk for severe COVID-19 and vaccine effectiveness among SOTRs remain unclear. METHODS: This retrospective study used a nationwide health care claims database and COVID-19 registry from the Republic of Korea (2020 to 2022). Adult SOTRs diagnosed with COVID-19 were matched with up to 4 non-SOTR COVID-19 patients by propensity score. Severe COVID-19 was defined as treatment with high-flow nasal cannulae, mechanical ventilation, or extracorporeal membrane oxygenation. RESULTS: Among 6783 SOTRs with COVID-19, severe COVID-19 was reported with the highest rate in lung transplant recipients (13.16%), followed by the heart (6.30%), kidney (3.90%), and liver (2.40%). SOTRs had a higher risk of severe COVID-19 compared to non-SOTRs, and lung transplant recipients showed the highest risk (adjusted odds ratio, 18.14; 95% confidence interval [CI], 8.53-38.58). Vaccine effectiveness against severe disease among SOTRs was 47% (95% CI, 18%-65%), 64% (95% CI, 49%-75%), and 64% (95% CI, 29%-81%) for 2, 3, and 4 doses, respectively. CONCLUSIONS: SOTRs are at significantly higher risk for severe COVID-19 compared to non-SOTRs. Vaccination is effective in preventing the progression to severe COVID-19. Efforts should be made to improve vaccine uptake among SOTRs, while additional protective measures should be developed.


Subjects
COVID-19, Organ Transplantation, Adult, Humans, COVID-19/epidemiology, COVID-19/prevention & control, Retrospective Studies, Transplant Recipients, Vaccination, Organ Transplantation/adverse effects
13.
Med Res Rev ; 44(3): 939-974, 2024 05.
Article in English | MEDLINE | ID: mdl-38129992

ABSTRACT

Virtual screening (VS) is an integral and ever-evolving part of the drug discovery framework. VS is traditionally classified into ligand-based (LB) and structure-based (SB) approaches. Machine intelligence, or artificial intelligence, has wide application in drug discovery, where it reduces time and resource consumption. Combined with machine intelligence algorithms, VS has become a rapidly progressing technology that learns robust decision rules for data curation and can screen hit molecules from large VS libraries in minutes or hours. The exponential growth of chemical and biological data, now available as "big data" in the public domain, demands modern, machine intelligence-driven VS approaches to screen hit molecules from ultra-large VS libraries. VS has also evolved from individual LB and SB approaches toward integrated LB and SB techniques that exploit both ligand and target protein information to improve the rate of appropriate hit prediction. Current trends demand advanced, intelligent solutions for handling the enormous data in the drug discovery domain while screening and optimizing hits or leads with few or no false positives. Prompted by this shift toward big data and the tremendous growth in computational architecture, this review categorizes and emphasizes individual VS techniques, surveys the literature on machine learning implementations and modern machine intelligence approaches, and discusses their limitations and future prospects.
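
As a toy illustration of the ligand-based branch described above (a generic fingerprint similarity search with RDKit, not a method from the review; the SMILES strings and similarity cutoff are arbitrary examples):

    # Toy ligand-based virtual screening: rank library molecules by Tanimoto
    # similarity to a known active. SMILES strings and cutoff are arbitrary examples.
    from rdkit import Chem, DataStructs
    from rdkit.Chem import AllChem

    query = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")  # aspirin as a stand-in "active"
    library = {
        "salicylic_acid": "O=C(O)c1ccccc1O",
        "caffeine": "Cn1cnc2c1c(=O)n(C)c(=O)n2C",
        "ibuprofen": "CC(C)Cc1ccc(cc1)C(C)C(=O)O",
    }

    query_fp = AllChem.GetMorganFingerprintAsBitVect(query, 2, nBits=2048)
    for name, smiles in library.items():
        fp = AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(smiles), 2, nBits=2048)
        sim = DataStructs.TanimotoSimilarity(query_fp, fp)
        print(f"{name}: Tanimoto = {sim:.2f}")  # keep as a hit if above a chosen cutoff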


Subjects
Artificial Intelligence, Drug Discovery, Humans, Ligands, Drug Discovery/methods, Algorithms
14.
Diabetologia ; 67(2): 236-245, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38041737

ABSTRACT

People living with diabetes have many medical devices available to assist with disease management. A critical aspect that must be considered is how systems for continuous glucose monitoring and insulin pumps communicate with each other and how the data generated by these devices can be downloaded, integrated, presented and used. Not only is interoperability associated with practical challenges, but also devices must adhere to all aspects of regulatory and legal frameworks. Key issues around interoperability in terms of data ownership, privacy and the limitations of interoperability include where the responsibility/liability for device and data interoperability lies and the need for standard data-sharing protocols to allow the seamless integration of data from different sources. There is a need for standardised protocols for the open and transparent handling of data and secure integration of data into electronic health records. Here, we discuss the current status of interoperability in medical devices and data used in diabetes therapy, as well as regulatory and legal issues surrounding both device and data interoperability, focusing on Europe (including the UK) and the USA. We also discuss a potential future landscape in which a clear and transparent framework for interoperability and data handling also fulfils the needs of people living with diabetes and healthcare professionals.


Subjects
Blood Glucose Self-Monitoring, Diabetes Mellitus, Humans, Blood Glucose, Diabetes Mellitus/drug therapy, Electronic Health Records, United Kingdom
15.
Diabetologia ; 67(2): 223-235, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37979006

ABSTRACT

The discourse amongst diabetes specialists and academics regarding technology and artificial intelligence (AI) typically centres around the 10% of people with diabetes who have type 1 diabetes, focusing on glucose sensors, insulin pumps and, increasingly, closed-loop systems. This focus is reflected in conference topics, strategy documents, technology appraisals and funding streams. What is often overlooked is the wider application of data and AI, as demonstrated through published literature and emerging marketplace products, that offers promising avenues for enhanced clinical care, health-service efficiency and cost-effectiveness. This review provides an overview of AI techniques and explores the use and potential of AI and data-driven systems in a broad context, covering all diabetes types, encompassing: (1) patient education and self-management; (2) clinical decision support systems and predictive analytics, including diagnostic support, treatment and screening advice, complications prediction; and (3) the use of multimodal data, such as imaging or genetic data. The review provides a perspective on how data- and AI-driven systems could transform diabetes care in the coming years and how they could be integrated into daily clinical practice. We discuss evidence for benefits and potential harms, and consider existing barriers to scalable adoption, including challenges related to data availability and exchange, health inequality, clinician hesitancy and regulation. Stakeholders, including clinicians, academics, commissioners, policymakers and those with lived experience, must proactively collaborate to realise the potential benefits that AI-supported diabetes care could bring, whilst mitigating risk and navigating the challenges along the way.


Subjects
Artificial Intelligence, Diabetes Mellitus Type 1, Humans, Health Status Disparities, Diabetes Mellitus Type 1/therapy
16.
Mol Biol Evol ; 40(3)2023 03 04.
Article in English | MEDLINE | ID: mdl-36790822

ABSTRACT

Genomic regions under positive selection harbor variation linked for example to adaptation. Most tools for detecting positively selected variants have computational resource requirements rendering them impractical on population genomic datasets with hundreds of thousands of individuals or more. We have developed and implemented an efficient haplotype-based approach able to scan large datasets and accurately detect positive selection. We achieve this by combining a pattern matching approach based on the positional Burrows-Wheeler transform with model-based inference which only requires the evaluation of closed-form expressions. We evaluate our approach with simulations, and find it to be both sensitive and specific. The computational resource requirements quantified using UK Biobank data indicate that our implementation is scalable to population genomic datasets with millions of individuals. Our approach may serve as an algorithmic blueprint for the era of "big data" genomics: a combinatorial core coupled with statistical inference in closed form.
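
A minimal sketch of the positional Burrows-Wheeler transform building block is shown below (written from Durbin's published prefix-array update, not from the authors' implementation; the haplotype matrix is synthetic).

    # Minimal positional Burrows-Wheeler transform (PBWT) prefix-array sweep.
    # At each site, haplotypes are reordered by their reversed prefixes, which makes
    # long shared haplotype stretches adjacent. The 0/1 haplotype matrix is synthetic.
    import numpy as np

    rng = np.random.default_rng(4)
    haps = rng.integers(0, 2, size=(8, 20))  # 8 haplotypes x 20 biallelic sites

    ppa = list(range(haps.shape[0]))  # positional prefix array
    for site in range(haps.shape[1]):
        zeros, ones = [], []
        for h in ppa:  # stable partition of haplotypes by the current allele
            (zeros if haps[h, site] == 0 else ones).append(h)
        ppa = zeros + ones
        # Haplotypes adjacent in `ppa` now share the longest reverse prefixes up to
        # this site, which is what match-finding and selection scans exploit.

    print("final ordering:", ppa)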


Subjects
Population Genetics, Metagenomics, Genomics, Genome, Haplotypes
17.
Am J Transplant ; 2024 Jul 06.
Article in English | MEDLINE | ID: mdl-38977243

ABSTRACT

Acute-on-chronic liver failure (ACLF) is a variably defined syndrome characterized by acute decompensation of cirrhosis with organ failures. At least 13 different definitions and diagnostic criteria for ACLF have been proposed, and there is increasing recognition that patients with ACLF may face disadvantages in the current United States liver allocation system. There is a need, therefore, for more standardized data collection and consensus to improve study design and outcome assessment in ACLF. In this article, we discuss the current landscape of transplantation for patients with ACLF, strategies to optimize organ utility, and data opportunities based on emerging technologies to facilitate improved data collection.

18.
Annu Rev Neurosci ; 39: 197-216, 2016 07 08.
Article in English | MEDLINE | ID: mdl-27442070

ABSTRACT

One goal of systems neuroscience is a structure-function model of nervous system organization that would allow mechanistic linking of mind, brain, and behavior. A necessary but not sufficient foundation is a connectome, a complete matrix of structural connections between the nodes of a nervous system. Connections between two nodes can be described at four nested levels of analysis: macroconnections between gray matter regions, mesoconnections between neuron types, microconnections between individual neurons, and nanoconnections at synapses. A long history of attempts to understand how the brain operates as a system began at the macrolevel in the fifth century, was revolutionized at the meso- and microlevels by Cajal and others in the late nineteenth century, and reached the nanolevel in the mid-twentieth century with the advent of electron microscopy. The greatest challenge today is extracting knowledge and understanding of nervous system structure-function architecture from vast amounts of data.


Subjects
Brain/physiology, Connectome, Neural Pathways/physiology, Neurons/physiology, Synapses/physiology, Animals, Connectome/methods, Humans, Neurological Models
19.
J Gene Med ; 26(1): e3629, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37940369

ABSTRACT

In recent years, the idea of "cancer big data" has emerged as a result of the significant expansion of fields such as clinical research, genomics, proteomics and public health records. Advances in omics technologies are making a significant contribution to cancer big data in biomedicine and disease diagnosis. The increasing availability of extensive cancer big data has set the stage for the development of multimodal artificial intelligence (AI) frameworks. These frameworks aim to analyze high-dimensional multi-omics data, extracting meaningful information that is challenging to obtain manually. Although interpretability and data quality remain critical challenges, these methods hold great promise for advancing our understanding of cancer biology and improving patient care and clinical outcomes. Here, we provide an overview of cancer big data and explore the applications of both traditional machine learning and deep learning approaches in cancer genomic and proteomic studies. We briefly discuss the challenges and potential of AI techniques in the integrated analysis of omics data, as well as future directions for personalized treatment options in cancer.


Subjects
Artificial Intelligence, Neoplasms, Humans, Proteomics/methods, Big Data, Genomics/methods, Machine Learning, Neoplasms/diagnosis, Neoplasms/genetics, Neoplasms/therapy
20.
Hum Brain Mapp ; 45(6): e26683, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38647035

ABSTRACT

Machine learning (ML) approaches are increasingly being applied to neuroimaging data. Studies in neuroscience typically have to rely on a limited set of training data, which may impair the generalizability of ML models. However, it is still unclear which kind of training sample is best suited to optimize generalization performance. In the present study, we systematically investigated the generalization performance of sex classification models trained on the parcelwise connectivity profile of either single samples or compound samples of two different sizes. Generalization performance was quantified in terms of mean across-sample classification accuracy and the spatial consistency of accurately classifying parcels. Our results indicate that the generalization performance of parcelwise classifiers (pwCs) trained on single dataset samples depends on the specific test sample. Certain datasets seem to "match" in the sense that classifiers trained on a sample from one dataset achieved high accuracy when tested on the respective other dataset and vice versa. The pwCs trained on the compound samples demonstrated the highest overall generalization performance for all test samples, including one derived from a dataset not included in building the training samples. Thus, our results indicate that both a large sample size and a heterogeneous composition of the training sample play a central role in achieving generalizable results.
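
The cross-sample evaluation logic can be sketched generically (this is not the study's pipeline; the datasets, features, and classifier are placeholders): train on one sample, test on every other sample, and compare accuracies.

    # Generic sketch of cross-dataset generalization testing: fit a classifier on one
    # sample and evaluate it on every other sample. All data are synthetic stand-ins.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score

    rng = np.random.default_rng(5)

    def make_sample(n, shift):
        """Synthetic 'connectivity features' with a class effect plus a site shift."""
        y = rng.integers(0, 2, n)
        X = rng.normal(size=(n, 50)) + 0.8 * y[:, None] + shift
        return X, y

    samples = {"A": make_sample(300, 0.0), "B": make_sample(300, 0.3), "C": make_sample(300, 0.6)}

    for train_name, (Xtr, ytr) in samples.items():
        clf = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
        for test_name, (Xte, yte) in samples.items():
            if test_name == train_name:
                continue
            acc = accuracy_score(yte, clf.predict(Xte))
            print(f"train {train_name} -> test {test_name}: accuracy {acc:.2f}")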


Subjects
Connectome, Machine Learning, Magnetic Resonance Imaging, Humans, Female, Male, Adult, Connectome/methods, Sex Characteristics, Datasets as Topic, Young Adult, Brain/diagnostic imaging, Brain/physiology