Pesquisa | Portal de Pesquisa da BVS Enfermagem

1.

scPLAN: a hierarchical computational framework for single transcriptomics data annotation, integration and cell-type label refinement.

Guo, Qirui; Yuan, Musu; Zhang, Lei; Deng, Minghua.

Brief Bioinform ; 25(4)2024 May 23.

Artigo em Inglês | MEDLINE | ID: mdl-38935069

RESUMO

MOTIVATION: In the past decade, single-cell RNA sequencing (scRNA-seq) has emerged as a pivotal method for transcriptomic profiling in biomedical research. Precise cell-type identification is crucial for subsequent analysis of single-cell data. And the integration and refinement of annotated data are essential for building comprehensive databases. However, prevailing annotation techniques often overlook the hierarchical organization of cell types, resulting in inconsistent annotations. Meanwhile, most existing integration approaches fail to integrate datasets with different annotation depths and none of them can enhance the labels of outdated data with lower annotation resolutions using more intricately annotated datasets or novel biological findings. RESULTS: Here, we introduce scPLAN, a hierarchical computational framework designed for scRNA-seq data analysis. scPLAN excels in annotating unlabeled scRNA-seq data using a reference dataset structured along a hierarchical cell-type tree. It identifies potential novel cell types in a systematic, layer-by-layer manner. Additionally, scPLAN effectively integrates annotated scRNA-seq datasets with varying levels of annotation depth, ensuring consistent refinement of cell-type labels across datasets with lower resolutions. Through extensive annotation and novel cell detection experiments, scPLAN has demonstrated its efficacy. Two case studies have been conducted to showcase how scPLAN integrates datasets with diverse cell-type label resolutions and refine their cell-type labels. AVAILABILITY: https://github.com/michaelGuo1204/scPLAN.

Assuntos

Biologia Computacional , Perfilação da Expressão Gênica , Análise de Célula Única , Análise de Célula Única/métodos , Perfilação da Expressão Gênica/métodos , Biologia Computacional/métodos , Humanos , Software , Transcriptoma , Análise de Sequência de RNA/métodos , RNA-Seq/métodos , Anotação de Sequência Molecular/métodos

2.

FastBiCmrMLM: a fast and powerful compressed variance component mixed logistic model for big genomic case-control genome-wide association study.

Wang, Jing-Tian; Chang, Xiao-Yu; Zhao, Qiong; Zhang, Yuan-Ming.

Brief Bioinform ; 25(4)2024 May 23.

Artigo em Inglês | MEDLINE | ID: mdl-38888457

RESUMO

Large sample datasets have been regarded as the primary basis for innovative discoveries and the solution to missing heritability in genome-wide association studies. However, their computational complexity cannot consider all comprehensive effects and all polygenic backgrounds, which reduces the effectiveness of large datasets. To address these challenges, we included all effects and polygenic backgrounds in a mixed logistic model for binary traits and compressed four variance components into two. The compressed model combined three computational algorithms to develop an innovative method, called FastBiCmrMLM, for large data analysis. These algorithms were tailored to sample size, computational speed, and reduced memory requirements. To mine additional genes, linkage disequilibrium markers were replaced by bin-based haplotypes, which are analyzed by FastBiCmrMLM, named FastBiCmrMLM-Hap. Simulation studies highlighted the superiority of FastBiCmrMLM over GMMAT, SAIGE and fastGWA-GLMM in identifying dominant, small α (allele substitution effect), and rare variants. In the UK Biobank-scale dataset, we demonstrated that FastBiCmrMLM could detect variants as small as 0.03% and with α ≈ 0. In re-analyses of seven diseases in the WTCCC datasets, 29 candidate genes, with both functional and TWAS evidence, around 36 variants identified only by the new methods, strongly validated the new methods. These methods offer a new way to decipher the genetic architecture of binary traits and address the challenges outlined above.

Assuntos

Algoritmos , Estudo de Associação Genômica Ampla , Estudo de Associação Genômica Ampla/métodos , Humanos , Modelos Logísticos , Estudos de Casos e Controles , Desequilíbrio de Ligação , Polimorfismo de Nucleotídeo Único , Genômica/métodos , Simulação por Computador , Haplótipos , Modelos Genéticos

3.

Enhancing antigenic peptide discovery: Improved MHC-I binding prediction and methodology.

Gizinski, Stanislaw; Preibisch, Grzegorz; Kucharski, Piotr; Tyrolski, Michal; Rembalski, Michal; Grzegorczyk, Piotr; Gambin, Anna.

Methods ; 224: 1-9, 2024 Apr.

Artigo em Inglês | MEDLINE | ID: mdl-38295891

RESUMO

The Major Histocompatibility Complex (MHC) is a critical element of the vertebrate cellular immune system, responsible for presenting peptides derived from intracellular proteins. MHC-I presentation is pivotal in the immune response and holds considerable potential in the realms of vaccine development and cancer immunotherapy. This study delves into the limitations of current methods and benchmarks for MHC-I presentation. We introduce a novel benchmark designed to assess generalization properties and the reliability of models on unseen MHC molecules and peptides, with a focus on the Human Leukocyte Antigen (HLA)-a specific subset of MHC genes present in humans. Finally, we introduce HLABERT, a pretrained language model that outperforms previous methods significantly on our benchmark and establishes a new state-of-the-art on existing benchmarks.

Assuntos

Peptídeos , Proteínas , Humanos , Reprodutibilidade dos Testes , Peptídeos/química , Proteínas/metabolismo , Complexo Principal de Histocompatibilidade/genética , Ligação Proteica

4.

Open-source large language models in action: A bioinformatics chatbot for PRIDE database.

Bai, Jingwen; Kamatchinathan, Selvakumar; Kundu, Deepti J; Bandla, Chakradhar; Vizcaíno, Juan Antonio; Perez-Riverol, Yasset.

Proteomics ; : e2400005, 2024 Mar 31.

Artigo em Inglês | MEDLINE | ID: mdl-38556628

RESUMO

We here present a chatbot assistant infrastructure (https://www.ebi.ac.uk/pride/chatbot/) that simplifies user interactions with the PRIDE database's documentation and dataset search functionality. The framework utilizes multiple Large Language Models (LLM): llama2, chatglm, mixtral (mistral), and openhermes. It also includes a web service API (Application Programming Interface), web interface, and components for indexing and managing vector databases. An Elo-ranking system-based benchmark component is included in the framework as well, which allows for evaluating the performance of each LLM and for improving PRIDE documentation. The chatbot not only allows users to interact with PRIDE documentation but can also be used to search and find PRIDE datasets using an LLM-based recommendation system, enabling dataset discoverability. Importantly, while our infrastructure is exemplified through its application in the PRIDE database context, the modular and adaptable nature of our approach positions it as a valuable tool for improving user experiences across a spectrum of bioinformatics and proteomics tools and resources, among other domains. The integration of advanced LLMs, innovative vector-based construction, the benchmarking framework, and optimized documentation collectively form a robust and transferable chatbot assistant infrastructure. The framework is open-source (https://github.com/PRIDE-Archive/pride-chatbot).

5.

Biclustering analysis on tree-shaped time-series single cell gene expression data of Caenorhabditis elegans.

Guan, Qi; Yan, Xianzhong; Wu, Yida; Zhou, Da; Hu, Jie.

BMC Bioinformatics ; 25(1): 183, 2024 May 09.

Artigo em Inglês | MEDLINE | ID: mdl-38724908

RESUMO

BACKGROUND: In recent years, gene clustering analysis has become a widely used tool for studying gene functions, efficiently categorizing genes with similar expression patterns to aid in identifying gene functions. Caenorhabditis elegans is commonly used in embryonic research due to its consistent cell lineage from fertilized egg to adulthood. Biologists use 4D confocal imaging to observe gene expression dynamics at the single-cell level. However, on one hand, the observed tree-shaped time-series datasets have characteristics such as non-pairwise data points between different individuals. On the other hand, the influence of cell type heterogeneity should also be considered during clustering, aiming to obtain more biologically significant clustering results. RESULTS: A biclustering model is proposed for tree-shaped single-cell gene expression data of Caenorhabditis elegans. Detailedly, a tree-shaped piecewise polynomial function is first employed to fit non-pairwise gene expression time series data. Then, four factors are considered in the objective function, including Pearson correlation coefficients capturing gene correlations, p-values from the Kolmogorov-Smirnov test measuring the similarity between cells, as well as gene expression size and bicluster overlapping size. After that, Genetic Algorithm is utilized to optimize the function. CONCLUSION: The results on the small-scale dataset analysis validate the feasibility and effectiveness of our model and are superior to existing classical biclustering models. Besides, gene enrichment analysis is employed to assess the results on the complete real dataset analysis, confirming that the discovered biclustering results hold significant biological relevance.

Assuntos

Caenorhabditis elegans , Análise de Célula Única , Caenorhabditis elegans/genética , Caenorhabditis elegans/metabolismo , Animais , Análise de Célula Única/métodos , Análise por Conglomerados , Perfilação da Expressão Gênica/métodos , Algoritmos

6.

Comparison of structural variant callers for massive whole-genome sequence data.

Joe, Soobok; Park, Jong-Lyul; Kim, Jun; Kim, Sangok; Park, Ji-Hwan; Yeo, Min-Kyung; Lee, Dongyoon; Yang, Jin Ok; Kim, Seon-Young.

BMC Genomics ; 25(1): 318, 2024 Mar 28.

Artigo em Inglês | MEDLINE | ID: mdl-38549092

RESUMO

BACKGROUND: Detecting structural variations (SVs) at the population level using next-generation sequencing (NGS) requires substantial computational resources and processing time. Here, we compared the performances of 11 SV callers: Delly, Manta, GridSS, Wham, Sniffles, Lumpy, SvABA, Canvas, CNVnator, MELT, and INSurVeyor. These SV callers have been recently published and have been widely employed for processing massive whole-genome sequencing datasets. We evaluated the accuracy, sequence depth, running time, and memory usage of the SV callers. RESULTS: Notably, several callers exhibited better calling performance for deletions than for duplications, inversions, and insertions. Among the SV callers, Manta identified deletion SVs with better performance and efficient computing resources, and both Manta and MELT demonstrated relatively good precision regarding calling insertions. We confirmed that the copy number variation callers, Canvas and CNVnator, exhibited better performance in identifying long duplications as they employ the read-depth approach. Finally, we also verified the genotypes inferred from each SV caller using a phased long-read assembly dataset, and Manta showed the highest concordance in terms of the deletions and insertions. CONCLUSIONS: Our findings provide a comprehensive understanding of the accuracy and computational efficiency of SV callers, thereby facilitating integrative analysis of SV profiles in diverse large-scale genomic datasets.

Assuntos

Variações do Número de Cópias de DNA , Genômica , Humanos , Sequenciamento Completo do Genoma , Sequenciamento de Nucleotídeos em Larga Escala , Análise de Sequência de DNA , Genoma Humano , Variação Estrutural do Genoma

7.

Fetal brain MRI atlases and datasets: A review.

Ciceri, Tommaso; Casartelli, Luca; Montano, Florian; Conte, Stefania; Squarcina, Letizia; Bertoldo, Alessandra; Agarwal, Nivedita; Brambilla, Paolo; Peruzzo, Denis.

Neuroimage ; 292: 120603, 2024 Apr 15.

Artigo em Inglês | MEDLINE | ID: mdl-38588833

RESUMO

Fetal brain development is a complex process involving different stages of growth and organization which are crucial for the development of brain circuits and neural connections. Fetal atlases and labeled datasets are promising tools to investigate prenatal brain development. They support the identification of atypical brain patterns, providing insights into potential early signs of clinical conditions. In a nutshell, prenatal brain imaging and post-processing via modern tools are a cutting-edge field that will significantly contribute to the advancement of our understanding of fetal development. In this work, we first provide terminological clarification for specific terms (i.e., "brain template" and "brain atlas"), highlighting potentially misleading interpretations related to inconsistent use of terms in the literature. We discuss the major structures and neurodevelopmental milestones characterizing fetal brain ontogenesis. Our main contribution is the systematic review of 18 prenatal brain atlases and 3 datasets. We also tangentially focus on clinical, research, and ethical implications of prenatal neuroimaging.

Assuntos

Atlas como Assunto , Encéfalo , Imageamento por Ressonância Magnética , Neuroimagem , Feminino , Humanos , Gravidez , Encéfalo/diagnóstico por imagem , Encéfalo/embriologia , Conjuntos de Dados como Assunto , Desenvolvimento Fetal/fisiologia , Feto/diagnóstico por imagem , Imageamento por Ressonância Magnética/métodos , Neuroimagem/métodos

8.

Comprehensive data analysis of white blood cells with classification and segmentation by using deep learning approaches.

Özcan, Seyma Nur; Uyar, Tansel; Karayegen, Gökay.

Cytometry A ; 105(7): 501-520, 2024 Jul.

Artigo em Inglês | MEDLINE | ID: mdl-38563259

RESUMO

Deep learning approaches have frequently been used in the classification and segmentation of human peripheral blood cells. The common feature of previous studies was that they used more than one dataset, but used them separately. No study has been found that combines more than two datasets to use together. In classification, five types of white blood cells were identified by using a mixture of four different datasets. In segmentation, four types of white blood cells were determined, and three different neural networks, including CNN (Convolutional Neural Network), UNet and SegNet, were applied. The classification results of the presented study were compared with those of related studies. The balanced accuracy was 98.03%, and the test accuracy of the train-independent dataset was determined to be 97.27%. For segmentation, accuracy rates of 98.9% for train-dependent dataset and 92.82% for train-independent dataset for the proposed CNN were obtained in both nucleus and cytoplasm detection. In the presented study, the proposed method showed that it could detect white blood cells from a train-independent dataset with high accuracy. Additionally, it is promising as a diagnostic tool that can be used in the clinical field, with successful results in classification and segmentation.

Assuntos

Aprendizado Profundo , Leucócitos , Redes Neurais de Computação , Humanos , Leucócitos/citologia , Leucócitos/classificação , Processamento de Imagem Assistida por Computador/métodos , Análise de Dados , Núcleo Celular , Citoplasma

9.

Dataset for reporting of the invasive carcinoma of the breast: recommendations from the International Collaboration on Cancer Reporting (ICCR).

Ellis, Ian; Webster, Fleur; Allison, Kimberly H; Dang, Chau; Gobbi, Helenice; Kulka, Janina; Lakhani, Sunil R; Moriya, Takuya; Quinn, Cecily M; Sapino, Anna; Schnitt, Stuart; Sibbering, D Mark; Slodkowska, Elzbieta; Yang, Wentao; Tan, Puay H.

Histopathology ; 85(3): 418-436, 2024 Sep.

Artigo em Inglês | MEDLINE | ID: mdl-38719547

RESUMO

BACKGROUND AND OBJECTIVES: Current national or regional guidelines for the pathology reporting on invasive breast cancer differ in certain aspects, resulting in divergent reporting practice and a lack of comparability of data. Here we report on a new international dataset for the pathology reporting of resection specimens with invasive cancer of the breast. The dataset was produced under the auspices of the International Collaboration on Cancer Reporting (ICCR), a global alliance of major (inter-)national pathology and cancer organizations. METHODS AND RESULTS: The established ICCR process for dataset development was followed. An international expert panel consisting of breast pathologists, a surgeon, and an oncologist prepared a draft set of core and noncore data items based on a critical review and discussion of current evidence. Commentary was provided for each data item to explain the rationale for selecting it as a core or noncore element, its clinical relevance, and to highlight potential areas of disagreement or lack of evidence, in which case a consensus position was formulated. Following international public consultation, the document was finalized and ratified, and the dataset, which includes a synoptic reporting guide, was published on the ICCR website. CONCLUSIONS: This first international dataset for invasive cancer of the breast is intended to promote high-quality, standardized pathology reporting. Its widespread adoption will improve consistency of reporting, facilitate multidisciplinary communication, and enhance comparability of data, all of which will help to improve the management of invasive breast cancer patients.

Assuntos

Neoplasias da Mama , Humanos , Neoplasias da Mama/patologia , Feminino , Patologia Clínica/normas , Conjuntos de Dados como Assunto/normas

10.

A multicenter high-quality data registry for advanced proton therapy approaches: the POWER registry.

Alterio, Daniela; Vincini, Maria Giulia; Volpe, Stefania; Bergamaschi, Luca; Zaffaroni, Mattia; Gandini, Sara; Peruzzotti, Giulia; Cattani, Federica; Garibaldi, Cristina; Jereczek-Fossa, Barbara Alicja; Orecchia, Roberto.

BMC Cancer ; 24(1): 333, 2024 Mar 12.

Artigo em Inglês | MEDLINE | ID: mdl-38475762

RESUMO

BACKGROUND: Paucity and low evidence-level data on proton therapy (PT) represent one of the main issues for the establishment of solid indications in the PT setting. Aim of the present registry, the POWER registry, is to provide a tool for systematic, prospective, harmonized, and multidimensional high-quality data collection to promote knowledge in the field of PT with a particular focus on the use of hypofractionation. METHODS: All patients with any type of oncologic disease (benign and malignant disease) eligible for PT at the European Institute of Oncology (IEO), Milan, Italy, will be included in the present registry. Three levels of data collection will be implemented: Level (1) clinical research (patients outcome and toxicity, quality of life, and cost/effectiveness analysis); Level (2) radiological and radiobiological research (radiomic and dosiomic analysis, as well as biological modeling); Level (3) biological and translational research (biological biomarkers and genomic data analysis). Endpoints and outcome measures of hypofractionation schedules will be evaluated in terms of either Treatment Efficacy (tumor response rate, time to progression/percentages of survivors/median survival, clinical, biological, and radiological biomarkers changes, identified as surrogate endpoints of cancer survival/response to treatment) and Toxicity. The study protocol has been approved by the IEO Ethical Committee (IEO 1885). Other than patients treated at IEO, additional PT facilities (equipped with Proteus®ONE or Proteus®PLUS technologies by IBA, Ion Beam Applications, Louvain-la-Neuve, Belgium) are planned to join the registry data collection. Moreover, the registry will be also fully integrated into international PT data collection networks.

Assuntos

Neoplasias , Terapia com Prótons , Humanos , Biomarcadores , Estudos Prospectivos , Qualidade de Vida , Sistema de Registros , Estudos Multicêntricos como Assunto

11.

Introducing a core dataset for real-world data in multiple sclerosis registries and cohorts: Recommendations from a global task force.

Parciak, Tina; Geys, Lotte; Helme, Anne; van der Mei, Ingrid; Hillert, Jan; Schmidt, Hollie; Salter, Amber; Zakaria, Magd; Middleton, Rodden; Stahmann, Alexander; Dobay, Pamela; Hernandez Martinez-Lapiscina, Elena; Iaffaldano, Pietro; Plueschke, Kelly; Rojas, Juan I; Sabidó, Meritxell; Magyari, Melinda; van der Walt, Anneke; Arickx, Francis; Comi, Giancarlo; Peeters, Liesbet M.

Mult Scler ; 30(3): 396-418, 2024 Mar.

Artigo em Inglês | MEDLINE | ID: mdl-38140852

RESUMO

BACKGROUND: As of September 2022, there was no globally recommended set of core data elements for use in multiple sclerosis (MS) healthcare and research. As a result, data harmonisation across observational data sources and scientific collaboration is limited. OBJECTIVES: To define and agree upon a core dataset for real-world data (RWD) in MS from observational registries and cohorts. METHODS: A three-phase process approach was conducted combining a landscaping exercise with dedicated discussions within a global multi-stakeholder task force consisting of 20 experts in the field of MS and its RWD to define the Core Dataset. RESULTS: A core dataset for MS consisting of 44 variables in eight categories was translated into a data dictionary that has been published and disseminated for emerging and existing registries and cohorts to use. Categories include variables on demographics and comorbidities (patient-specific data), disease history, disease status, relapses, magnetic resonance imaging (MRI) and treatment data (disease-specific data). CONCLUSION: The MS Data Alliance Core Dataset guides emerging registries in their dataset definitions and speeds up and supports harmonisation across registries and initiatives. The straight-forward, time-efficient process using a dedicated global multi-stakeholder task force has proven to be effective to define a concise core dataset.

Assuntos

Esclerose Múltipla , Humanos , Sistema de Registros

12.

Actionable Predictions of Human Pharmacokinetics at the Drug Design Stage.

Komissarov, Leonid; Manevski, Nenad; Groebke Zbinden, Katrin; Schindler, Torsten; Zitnik, Marinka; Sach-Peltason, Lisa.

Mol Pharm ; 2024 Aug 12.

Artigo em Inglês | MEDLINE | ID: mdl-39132855

RESUMO

We present a novel computational approach for predicting human pharmacokinetics (PK) that addresses the challenges of early stage drug design. Our study introduces and describes a large-scale data set of 11 clinical PK end points, encompassing over 2700 unique chemical structures to train machine learning models. To that end multiple advanced training strategies are compared, including the integration of in vitro data and a novel self-supervised pretraining task. In addition to the predictions, our final model provides meaningful epistemic uncertainties for every data point. This allows us to successfully identify regions of exceptional predictive performance, with an absolute average fold error (AAFE/geometric mean fold error) of less than 2.5 across multiple end points. Together, these advancements represent a significant leap toward actionable PK predictions, which can be utilized early on in the drug design process to expedite development and reduce reliance on nonclinical studies.

13.

Efficient data integration under prior probability shift.

Huang, Ming-Yueh; Qin, Jing; Huang, Chiung-Yu.

Biometrics ; 80(2)2024 Mar 27.

Artigo em Inglês | MEDLINE | ID: mdl-38768225

RESUMO

Conventional supervised learning usually operates under the premise that data are collected from the same underlying population. However, challenges may arise when integrating new data from different populations, resulting in a phenomenon known as dataset shift. This paper focuses on prior probability shift, where the distribution of the outcome varies across datasets but the conditional distribution of features given the outcome remains the same. To tackle the challenges posed by such shift, we propose an estimation algorithm that can efficiently combine information from multiple sources. Unlike existing methods that are restricted to discrete outcomes, the proposed approach accommodates both discrete and continuous outcomes. It also handles high-dimensional covariate vectors through variable selection using an adaptive least absolute shrinkage and selection operator penalty, producing efficient estimates that possess the oracle property. Moreover, a novel semiparametric likelihood ratio test is proposed to check the validity of prior probability shift assumptions by embedding the null conditional density function into Neyman's smooth alternatives (Neyman, 1937) and testing study-specific parameters. We demonstrate the effectiveness of our proposed method through extensive simulations and a real data example. The proposed methods serve as a useful addition to the repertoire of tools for dealing dataset shifts.

Assuntos

Algoritmos , Simulação por Computador , Modelos Estatísticos , Probabilidade , Humanos , Funções Verossimilhança , Biometria/métodos , Interpretação Estatística de Dados , Aprendizado de Máquina Supervisionado

14.

Recommendations for data collection in cohort studies of preterm born individuals - The RECAP Preterm Core Dataset.

Powell, Charlotte; Bamber, Deborah; Collins, Helen E; Draper, Elizabeth S; Manktelow, Bradley; Kajante, Eero; Cuttini, Marina; Wolke, Dieter; Maier, Rolf F; Zeitlin, Jennifer; Johnson, Samantha.

Paediatr Perinat Epidemiol ; 2024 Jun 17.

Artigo em Inglês | MEDLINE | ID: mdl-38886295

RESUMO

BACKGROUND: Preterm birth (before 37 completed weeks of gestation) is associated with an increased risk of adverse health and developmental outcomes relative to birth at term. Existing guidelines for data collection in cohort studies of individuals born preterm are either limited in scope, have not been developed using formal consensus methodology, or did not involve a range of stakeholders in their development. Recommendations meeting these criteria would facilitate data pooling and harmonisation across studies. OBJECTIVES: To develop a Core Dataset for use in longitudinal cohort studies of individuals born preterm. METHODS: This work was carried out as part of the RECAP Preterm project. A systematic review of variables included in existing core outcome sets was combined with a scoping exercise conducted with experts on preterm birth. The results were used to generate a draft core dataset. A modified Delphi process was implemented using two stages with three rounds each. Three stakeholder groups participated: RECAP Preterm project partners; external experts in the field; people with lived experience of preterm birth. The Delphi used a 9-point Likert scale. Higher values indicated greater importance for inclusion. Participants also suggested additional variables they considered important for inclusion which were voted on in later rounds. RESULTS: An initial list of 140 data items was generated. Ninety-six participants across 22 countries participated in the Delphi, of which 29% were individuals with lived experience of preterm birth. Consensus was reached on 160 data items covering Antenatal and Birth Information, Neonatal Care, Mortality, Administrative Information, Organisational Level Information, Socio-economic and Demographic information, Physical Health, Education and Learning, Neurodevelopmental Outcomes, Social, Lifestyle and Leisure, Healthcare Utilisation and Quality of Life. CONCLUSIONS: This core dataset includes 160 data items covering antenatal care through outcomes in adulthood. Its use will guide data collection in new studies and facilitate pooling and harmonisation of existing data internationally.

15.

MultiWD: Multi-label wellness dimensions in social media posts.

Garg, Muskan; Liu, Xingyi; Sathvik, M S V P J; Raza, Shaina; Sohn, Sunghwan.

J Biomed Inform ; 150: 104586, 2024 02.

Artigo em Inglês | MEDLINE | ID: mdl-38191011

RESUMO

BACKGROUND: Halbert L. Dunn's concept of wellness is a multi-dimensional aspect encompassing social and mental well-being. Neglecting these dimensions over time can have a negative impact on an individual's mental health. The manual efforts employed in in-person therapy sessions reveal that underlying factors of mental disturbance if triggered, may lead to severe mental health disorders. OBJECTIVE: In our research, we introduce a fine-grained approach focused on identifying indicators of wellness dimensions and mark their presence in self-narrated human-writings on Reddit social media platform. DESIGN AND METHOD: We present the MultiWD dataset, a curated collection comprising 3281 instances, as a specifically designed and annotated dataset that facilitates the identification of multiple wellness dimensions in Reddit posts. In our study, we introduce the task of identifying wellness dimensions and utilize state-of-the-art classifiers to solve this multi-label classification task. RESULTS: Our findings highlights the best and comparative performance of fine-tuned large language models with fine-tuned BERT model. As such, we set BERT as a baseline model to tag wellness dimensions in a user-penned text with F1 score of 76.69. CONCLUSION: Our findings underscore the need of trustworthy and domain-specific knowledge infusion to develop more comprehensive and contextually-aware AI models for tagging and extracting wellness dimensions.

Assuntos

Transtornos Mentais , Mídias Sociais , Humanos , Saúde Mental , Conscientização

16.

Enhancement of cyber security in IoT based on ant colony optimized artificial neural adaptive Tensor flow.

Sadu, Vijaya Bhaskar; Abhishek, Kumar; Al-Omari, Omaia Mohammed; Nallola, Sandhya Rani; Sharma, Rajeev Kumar; Khan, Mohammad Shadab.

Network ; : 1-17, 2024 Jul 15.

Artigo em Inglês | MEDLINE | ID: mdl-39007930

RESUMO

The Internet of Things (IoT) is a network that connects various hardware, software, data storage, and applications. These interconnected devices provide services to businesses and can potentially serve as entry points for cyber-attacks. The privacy of IoT devices is increasingly vulnerable, particularly to threats like viruses and illegal software distribution lead to the theft of critical information. Ant Colony-Optimized Artificial Neural-Adaptive Tensorflow (ACO-ANT) technique is proposed to detect malicious software illicitly disseminated through the IoT. To emphasize the significance of each token in source duplicate data, the noise data undergoes processing using tokenization and weighted attribute techniques. Deep learning (DL) methods are then employed to identify source code duplication. Also the Multi-Objective Recurrent Neural Network (M-RNN) is used to identify suspicious activities within an IoT environment. The performance of proposed technique is examined using Loss, accuracy, F measure, precision to identify its efficiency. The experimental outcomes demonstrate that the proposed method ACO-ANT on Malimg dataset provides 12.35%, 14.75%, 11.84% higher precision and 10.95%, 15.78%, 13.89% higher f-measure compared to the existing methods. Further, leveraging block chain for malware detection is a promising direction for future research the fact that could enhance the security of IoT and identify malware threats.

17.

Adaptive activation Functions with Deep Kronecker Neural Network optimized with Bear Smell Search Algorithm for preventing MANET Cyber security attacks.

Shanmugham, E V R M Kalaimani; Dhatchnamurthy, Saravanan; Pakkiri, Prabbu Sankar; Garg, Neha.

Network ; : 1-25, 2024 Mar 14.

Artigo em Inglês | MEDLINE | ID: mdl-38482862

RESUMO

An Adaptive activation Functions with Deep Kronecker Neural Network optimized with Bear Smell Search Algorithm (BSSA) (ADKNN-BSSA-CSMANET) is proposed for preventing MANET Cyber security attacks. The mobile users are enrolled with Trusted Authority Using a Crypto Hash Signature (SHA-256). Every mobile user uploads their finger vein biometric, user ID, latitude and longitude for confirmation. The packet analyser checks if any attack patterns are identified. It is implemented using adaptive density-based spatial clustering (ADSC) that deems information from packet header. Geodesic filtering (GF) is used as a pre-processing method for eradicating the unsolicited content and filtering pertinent data. Group Teaching Algorithm (GTA)-based feature selection is utilized for ideal collection of features and Adaptive Activation Functions along Deep Kronecker Neural Network (ADKNN) is used to categorizing normal and attack packets (DoS, Probe, U2R, and R2L). Then BSSA is utilized for optimizing the weight parameters of ADKNN classifier for optimal classification. The proposed technique is executed in python and its efficiency is evaluated by several performances metrics, such as Accuracy, Attack Detection Rate, Detection Delay, Packet Delivery Ratio, Throughput, and Energy Consumption. The proposed technique provides 36.64%, 33.06%, and 33.98% lower Detection Delay on NSL-KDD dataset compared with the existing methods.

18.

Development and validation of dietary depression index in Chinese adults.

Gao, Min; Zheng, Jiali; Li, Fangyu; Yan, Yumeng; Wu, Yin; Li, Sha; Li, Jun; Li, Xiaoguang; Wang, Hui.

Nutr Neurosci ; : 1-11, 2024 Jul 24.

Artigo em Inglês | MEDLINE | ID: mdl-39046352

RESUMO

Objective: Previous studies have suggested diet was associated with depressive symptoms. We aimed to develop and validate Dietary Depression Index (DDI) based on dietary prediction of depression in a large Chinese cancer screening cohort.Methods: In the training set (n = 2729), we developed DDI by using intake of 20 food groups derived from a food frequency questionnaire to predict depression as assessed by Patient Health Questionnaire-9 based on the reduced rank regression method. Sensitivity, specificity, positive predictive value, and negative predictive value were used to assess the performance of DDI in evaluating depression in the validation dataset (n = 1176).Results: Receiver operating characteristic analysis was constructed to determine the best cut-off value of DDI in predicting depression. In the study population, the DDI ranged from -3.126 to 1.810. The discriminative ability of DDI in predicting depression was good with the AUC of 0.799 overall, 0.794 in males and 0.808 in females. The best cut-off values of DDI for depression prediction were 0.204 overall, 0.330 in males and 0.034 in females. DDI was a validated method to assess the effects of diet on depression.Conclusion: Among individual food components in DDI, fermented vegetables, fresh vegetables, whole grains and onions were inversely associated, whereas legumes, pickled vegetables and rice were positively associated with depressive symptoms.

19.

Refining neural network algorithms for accurate brain tumor classification in MRI imagery.

Alshuhail, Asma; Thakur, Arastu; Chandramma, R; Mahesh, T R; Almusharraf, Ahlam; Vinoth Kumar, V; Khan, Surbhi Bhatia.

BMC Med Imaging ; 24(1): 118, 2024 May 21.

Artigo em Inglês | MEDLINE | ID: mdl-38773391

RESUMO

Brain tumor diagnosis using MRI scans poses significant challenges due to the complex nature of tumor appearances and variations. Traditional methods often require extensive manual intervention and are prone to human error, leading to misdiagnosis and delayed treatment. Current approaches primarily include manual examination by radiologists and conventional machine learning techniques. These methods rely heavily on feature extraction and classification algorithms, which may not capture the intricate patterns present in brain MRI images. Conventional techniques often suffer from limited accuracy and generalizability, mainly due to the high variability in tumor appearance and the subjective nature of manual interpretation. Additionally, traditional machine learning models may struggle with the high-dimensional data inherent in MRI images. To address these limitations, our research introduces a deep learning-based model utilizing convolutional neural networks (CNNs).Our model employs a sequential CNN architecture with multiple convolutional, max-pooling, and dropout layers, followed by dense layers for classification. The proposed model demonstrates a significant improvement in diagnostic accuracy, achieving an overall accuracy of 98% on the test dataset. The proposed model demonstrates a significant improvement in diagnostic accuracy, achieving an overall accuracy of 98% on the test dataset. The precision, recall, and F1-scores ranging from 97 to 98% with a roc-auc ranging from 99 to 100% for each tumor category further substantiate the model's effectiveness. Additionally, the utilization of Grad-CAM visualizations provides insights into the model's decision-making process, enhancing interpretability. This research addresses the pressing need for enhanced diagnostic accuracy in identifying brain tumors through MRI imaging, tackling challenges such as variability in tumor appearance and the need for rapid, reliable diagnostic tools.

Assuntos

Neoplasias Encefálicas , Aprendizado Profundo , Imageamento por Ressonância Magnética , Redes Neurais de Computação , Humanos , Neoplasias Encefálicas/diagnóstico por imagem , Neoplasias Encefálicas/classificação , Imageamento por Ressonância Magnética/métodos , Algoritmos , Interpretação de Imagem Assistida por Computador/métodos , Masculino , Feminino

20.

Research on segmentation model of optic disc and optic cup in fundus.

Chen, Naigong; Lv, Xiujuan.

BMC Ophthalmol ; 24(1): 273, 2024 Jun 28.

Artigo em Inglês | MEDLINE | ID: mdl-38943095

RESUMO

BACKGROUND: Glaucoma is a worldwide eye disease that can cause irreversible vision loss. Early detection of glaucoma is important to reduce vision loss, and retinal fundus image examination is one of the most commonly used solutions for glaucoma diagnosis due to its low cost. Clinically, the cup-disc ratio of fundus images is an important indicator for glaucoma diagnosis. In recent years, there have been an increasing number of algorithms for segmentation and recognition of the optic disc (OD) and optic cup (OC), but these algorithms generally have poor universality, segmentation performance, and segmentation accuracy. METHODS: By improving the YOLOv8 algorithm for segmentation of OD and OC. Firstly, a set of algorithms was designed to adapt the REFUGE dataset's result images to the input format of the YOLOv8 algorithm. Secondly, in order to improve segmentation performance, the network structure of YOLOv8 was improved, including adding a ROI (Region of Interest) module, modifying the bounding box regression loss function from CIOU to Focal-EIoU. Finally, by training and testing the REFUGE dataset, the improved YOLOv8 algorithm was evaluated. RESULTS: The experimental results show that the improved YOLOv8 algorithm achieves good segmentation performance on the REFUGE dataset. In the OD and OC segmentation tests, the F1 score is 0.999. CONCLUSIONS: We improved the YOLOv8 algorithm and applied the improved model to the segmentation task of OD and OC in fundus images. The results show that our improved model is far superior to the mainstream U-Net model in terms of training speed, segmentation performance, and segmentation accuracy.

Assuntos

Algoritmos , Fundo de Olho , Glaucoma , Disco Óptico , Disco Óptico/diagnóstico por imagem , Humanos , Glaucoma/diagnóstico

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

Assuntos

RESUMO

RESUMO

RESUMO

RESUMO

Assuntos

RESUMO

Assuntos

ENVIAR RESULTADO:

SELEÇÃO DE REFERÊNCIAS

DETALHE DA PESQUISA