Results 1 - 20 of 657
1.
Genet Epidemiol ; 2024 Aug 13.
Article in English | MEDLINE | ID: mdl-39138631

ABSTRACT

Mendelian randomization (MR) is an epidemiological approach that utilizes genetic variants as instrumental variables to estimate the causal effect of an exposure on a health outcome. This paper investigates an MR scenario in which genetic variants aggregate into clusters that identify heterogeneous causal effects. Such variant clusters are likely to emerge if they affect the exposure and outcome via distinct biological pathways. In the multi-outcome MR framework, where a shared exposure causally impacts several disease outcomes simultaneously, these variant clusters can provide insights into the common disease-causing mechanisms underpinning the co-occurrence of multiple long-term conditions, a phenomenon known as multimorbidity. To identify such variant clusters, we adapt the general method of agglomerative hierarchical clustering to the multi-sample summary-data MR setup, enabling cluster detection based on variant-specific ratio estimates. In particular, we tailor the method for multi-outcome MR to aid in elucidating the causal pathways through which a common risk factor contributes to multiple morbidities. We show in simulations that our "MR-AHC" method detects clusters with high accuracy, outperforming existing methods. We apply the method to investigate the causal effects of high body fat percentage on type 2 diabetes and osteoarthritis, uncovering interconnected cellular processes underlying this multimorbid disease pair.
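
The idea of clustering variant-specific ratio estimates can be illustrated with a minimal sketch. This is not the authors' MR-AHC procedure: it only computes the ratio of each variant's outcome effect to its exposure effect and groups the sorted estimates wherever neighbouring values are close, which in one dimension should match a single-linkage dendrogram cut at that distance. The summary statistics, the `max_gap` threshold, and the function names are all hypothetical.

```python
def ratio_estimates(beta_exposure, beta_outcome):
    """Variant-specific ratio estimates: outcome effect divided by exposure effect."""
    return [by / bx for bx, by in zip(beta_exposure, beta_outcome)]


def cluster_by_gap(estimates, max_gap):
    """Group 1-D estimates by sorting them and opening a new cluster whenever
    the gap between neighbouring estimates exceeds max_gap (an assumed threshold).
    Returns a list of clusters, each a list of variant indices."""
    if not estimates:
        return []
    order = sorted(range(len(estimates)), key=lambda i: estimates[i])
    clusters = [[order[0]]]
    for prev, cur in zip(order, order[1:]):
        if estimates[cur] - estimates[prev] > max_gap:
            clusters.append([cur])      # large gap: start a new cluster
        else:
            clusters[-1].append(cur)    # small gap: same causal-effect cluster
    return clusters


# Hypothetical summary statistics for six variants (not taken from the paper).
beta_x = [0.10, 0.20, 0.15, 0.10, 0.25, 0.20]    # variant-exposure effects
beta_y = [0.05, 0.10, 0.075, 0.20, 0.50, 0.41]   # variant-outcome effects

ratios = ratio_estimates(beta_x, beta_y)
clusters = cluster_by_gap(ratios, max_gap=0.5)
print(clusters)
```

A real analysis would also account for the standard error of each ratio estimate when deciding whether to merge clusters; that step is omitted here.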

2.
Genes Cells ; 29(2): 169-177, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38158708

ABSTRACT

Hypoxia-inducible factor 1 (HIF1) is a transcription factor that is stabilized under hypoxic conditions via post-translational modifications. In cancer cells, HIF1 regulates tumor malignancy and metastasis through the transcription of genes such as those related to the Warburg effect and angiogenesis. However, the HIF1 downstream genes show varied expression patterns in different cancer types. Herein, we performed hierarchical clustering based on the HIF1 downstream gene expression patterns of 1406 cancer cell lines spanning 30 types of cancer to understand the relationship between HIF1 downstream genes and the metastatic potential of cancer cell lines. Two types of cancer, bone and breast cancer, were classified based on HIF1 downstream genes into subtypes with significantly altered metastatic potentials. Furthermore, different HIF1 downstream gene subsets were extracted to discriminate each subtype for these cancer types. HIF1 downstream subtyping classification will help provide novel insight into tumor malignancy and metastasis in each cancer type.


Subject(s)
Breast Neoplasms , Hypoxia-Inducible Factor 1 , Humans , Female , Hypoxia-Inducible Factor 1/genetics , Hypoxia-Inducible Factor 1/metabolism , Cell Line , Breast Neoplasms/pathology , Hypoxia-Inducible Factor 1 alpha Subunit/genetics , Hypoxia-Inducible Factor 1 alpha Subunit/metabolism , Tumor Cell Line , Cell Hypoxia/physiology
3.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36642409

ABSTRACT

Protein language models, trained on millions of biologically observed sequences, generate feature-rich numerical representations of protein sequences. These representations, called sequence embeddings, can infer structure-functional properties, despite protein language models being trained on primary sequence alone. While sequence embeddings have been applied toward tasks such as structure and function prediction, applications toward alignment-free sequence classification have been hindered by the lack of studies to derive, quantify and evaluate relationships between protein sequence embeddings. Here, we develop workflows and visualization methods for the classification of protein families using sequence embedding derived from protein language models. A benchmark of manifold visualization methods reveals that Neighbor Joining (NJ) embedding trees are highly effective in capturing global structure while achieving similar performance in capturing local structure compared with popular dimensionality reduction techniques such as t-SNE and UMAP. The statistical significance of hierarchical clusters on a tree is evaluated by resampling embeddings using a variational autoencoder (VAE). We demonstrate the application of our methods in the classification of two well-studied enzyme superfamilies, phosphatases and protein kinases. Our embedding-based classifications remain consistent with and extend upon previously published sequence alignment-based classifications. We also propose a new hierarchical classification for the S-Adenosyl-L-Methionine (SAM) enzyme superfamily which has been difficult to classify using traditional alignment-based approaches. Beyond applications in sequence classification, our results further suggest NJ trees are a promising general method for visualizing high-dimensional data sets.


Subject(s)
Amino Acid Sequence , Proteins , Cluster Analysis , Proteins/chemistry , Sequence Alignment
4.
Am J Epidemiol ; 193(8): 1146-1154, 2024 Aug 05.
Article in English | MEDLINE | ID: mdl-38576181

ABSTRACT

Multimorbidity, defined as having 2 or more chronic conditions, is a growing public health concern, but research in this area is complicated by the fact that multimorbidity is a highly heterogenous outcome. Individuals in a sample may have a differing number and varied combinations of conditions. Clustering methods, such as unsupervised machine learning algorithms, may allow us to tease out the unique multimorbidity phenotypes. However, many clustering methods exist, and choosing which to use is challenging because we do not know the true underlying clusters. Here, we demonstrate the use of 3 individual algorithms (partition around medoids, hierarchical clustering, and probabilistic clustering) and a clustering ensemble approach (which pools different clustering approaches) to identify multimorbidity clusters in the AIDS Linked to the Intravenous Experience cohort study. We show how the clusters can be compared based on cluster quality, interpretability, and predictive ability. In practice, it is critical to compare the clustering results from multiple algorithms and to choose the approach that performs best in the domain(s) that aligns with plans to use the clusters in future analyses.


Subject(s)
Algorithms , Multimorbidity , Humans , Cluster Analysis , Female , Male , Middle Aged , Unsupervised Machine Learning , Adult
5.
Brief Bioinform ; 23(2)2022 03 10.
Article in English | MEDLINE | ID: mdl-35108376

ABSTRACT

Metagenomic next-generation sequencing (mNGS) enables comprehensive pathogen detection and has become increasingly popular in clinical diagnosis. The distinct pathogenic traits between strains require mNGS to achieve a strain-level resolution, but an equivocal concept of 'strain' as well as the low pathogen loads in most clinical specimens hinders such strain awareness. Here we introduce a metagenomic intra-species typing (MIST) tool (https://github.com/pandafengye/MIST), which hierarchically organizes reference genomes based on average nucleotide identity (ANI) and performs maximum likelihood estimation to infer the strain-level compositional abundance. In silico analysis using synthetic datasets showed that MIST accurately predicted the strain composition at a 99.9% average nucleotide identity (ANI) resolution with a merely 0.001× sequencing depth. When applying MIST on 359 culture-positive and 359 culture-negative real-world specimens of infected body fluids, we found the presence of multiple-strain reached considerable frequencies (30.39%-93.22%), which were otherwise underestimated by current diagnostic techniques due to their limited resolution. Several high-risk clones were identified to be prevalent across samples, including Acinetobacter baumannii sequence type (ST)208/ST195, Staphylococcus aureus ST22/ST398 and Klebsiella pneumoniae ST11/ST15, indicating potential outbreak events occurring in the clinical settings. Interestingly, contaminations caused by the engineered Escherichia coli strain K-12 and BL21 throughout the mNGS datasets were also identified by MIST instead of the statistical decontamination approach. Our study systemically characterized the infected body fluids at the strain level for the first time. Extension of mNGS testing to the strain level can greatly benefit clinical diagnosis of bacterial infections, including the identification of multi-strain infection, decontamination and infection control surveillance.


Subject(s)
Bacterial Infections , Body Fluids , Bacterial Infections/diagnosis , High-Throughput Nucleotide Sequencing/methods , Humans , Metagenomics/methods , Nucleotides
6.
Eur J Clin Invest ; : e14261, 2024 Jun 08.
Article in English | MEDLINE | ID: mdl-38850064

ABSTRACT

BACKGROUND: Comorbidities in primary care do not occur in isolation but tend to cluster together, causing various clinically complex phenotypes. This study aimed to distinguish phenotype clusters and identify the risks of all-cause mortality in primary care. METHODS: The baseline cohort of the LIPIDOGEN2015 sub-study involved 1779 patients recruited by 438 primary care physicians. To identify different phenotype clusters, we used hierarchical clustering and investigated differences in clinical characteristics and mortality between clusters. We then used causal mediation analysis to explore potential mediators between different clusters and all-cause mortality. RESULTS: A total of 1756 patients were included (mean age 51.2, SD 13.0; 60.3% female), with a median follow-up of 5.7 years. Three clusters were identified: Cluster 1 (n = 543) was characterised by overweight/obesity (body mass index ≥ 25 kg/m²), older age (≥ 65 years), and more comorbidities; Cluster 2 (n = 459) was characterised by non-overweight/obesity, younger age, and fewer comorbidities; Cluster 3 (n = 754) was characterised by overweight/obesity, younger age, and fewer comorbidities. Adjusted Cox regression showed that, compared with Cluster 2, Cluster 1 had a significantly higher risk of all-cause mortality (HR 3.87, 95% CI: 1.24-15.91), whereas the risk for Cluster 3 did not differ significantly. Causal mediation analyses showed that decreased protein thiol groups mediated the hazard effect of all-cause mortality in Cluster 1 compared with Cluster 2, but not between Clusters 1 and 3. CONCLUSION: Older patients with overweight/obesity and more comorbidities had the highest risk of long-term all-cause mortality, whereas in the younger group overweight/obesity did not significantly increase the risk over long-term follow-up, providing a basis for stratified phenotypic risk management.

7.
Brain Behav Immun ; 2024 Sep 26.
Article in English | MEDLINE | ID: mdl-39341467

ABSTRACT

Alterations in DNA methylation and inflammation could represent valid biomarkers for the stratification of patients with major depressive disorder (MDD). This study explored the use of DNA-methylation based immunological cell-type profiles in the context of MDD and symptom severity over time. In 119 individuals with MDD, DNA-methylation was assessed on whole blood using the Illumina Infinium MethylationEPIC 850k BeadChip. Quality control and data processing, as well as cell type estimation, were conducted using the RnBeads package. The cell type composition was estimated using epigenome-wide DNA methylation signatures, applying the Houseman method, considering six cell types (neutrophils, natural killer (NK) cells, B cells, CD4+ T cells, CD8+ T cells and monocytes). Two cytokines (IL-6 and IL-1β) and hsCRP were quantified in serum. We performed a hierarchical cluster analysis on the six estimated cell-types and tested the differences between these clusters in relation to the two cytokines and hsCRP, depression severity at baseline, and after 6 weeks of treatment (celecoxib/placebo + vortioxetine). We performed a second cluster analysis with cell-types and cytokines combined. ANCOVA was used to test for differences across clusters. We applied the Bonferroni correction. After quality control, we included 113 participants. Two clusters were identified: cluster 1 was high in CD4+ cells and NK cells; cluster 2 was high in CD8+ T-cells and B-cells, with similar fractions of neutrophils and monocytes. The clusters were not associated with either of the two cytokines and hsCRP, or depression severity at baseline, but cluster 1 showed higher depression severity after 6 weeks, corrected for baseline (p = 0.0060). The second cluster analysis found similar results: cluster 1 was low in CD8+ T-cells, B-cells, and IL-1β. Cluster 2 was low in CD4+ cells and natural killer cells. Neutrophils, monocytes, IL-6 and hsCRP were not different between the clusters. Participants in cluster 1 showed higher depression severity at baseline than cluster 2 (p = 0.034), but no difference in depression severity after 6 weeks. DNA-methylation based cell-type profiles may be valuable in the immunological characterization and stratification of patients with MDD. Future models should consider the inclusion of more cell-types and cytokines for a better prediction of treatment outcomes.

8.
Pathobiology ; 91(5): 313-325, 2024.
Article in English | MEDLINE | ID: mdl-38527431

ABSTRACT

INTRODUCTION: Over the past decade, classifications using immune cell infiltration have been applied to many types of tumors; however, mesotheliomas have been less frequently evaluated. METHODS: In this study, 60 well-characterized pleural mesotheliomas (PMs) were evaluated immunohistochemically for the characteristics of immune cells within tumor microenvironment (TME) using 10 immunohistochemical markers: CD3, CD4, CD8, CD56, CD68, CD163, FOXP3, CD27, PD-1, and TIM-3. For further characterization of PMs, hierarchical clustering analyses using these 10 markers were performed. RESULTS: Among the immune cell markers, CD3 (p < 0.0001), CD4 (p = 0.0016), CD8 (p = 0.00094), CD163+ (p = 0.042), and FOXP3+ (p = 0.025) were significantly associated with an unfavorable clinical outcome. Immune checkpoint receptor expressions on tumor-infiltrating lymphocytes such as PD-1 (p = 0.050), CD27 (p = 0.014), and TIM-3 (p = 0.0098) were also associated with unfavorable survival. Hierarchical clustering analyses identified three groups showing specific characteristics and significant associations with patient survival (p = 0.016): the highest number of immune cells (ICHigh); the lowest number of immune cells, especially CD8+ and CD163+ cells (ICLow); and intermediate number of immune cells (ICInt). ICHigh tumors showed significantly higher expression of PD-L1 (p = 0.00038). Cox proportional hazard model identified ICHigh [hazard ratio (HR) = 2.90] and ICInt (HR = 2.97) as potential risk factors compared with ICLow. Tumor CD47 (HR = 2.36), tumor CD70 (HR = 3.04), and tumor PD-L1 (HR = 3.21) expressions were also identified as potential risk factors for PM patients. CONCLUSION: Our findings indicate immune checkpoint and/or immune cell-targeting therapies against CD70-CD27 and/or CD47-SIRPA axes may be applied for PM patients in combination with PD-L1-PD-1 targeting therapies in accordance with their tumor immune microenvironment characteristics.


Subject(s)
Tumor Biomarkers , Tumor-Infiltrating Lymphocytes , Pleural Neoplasms , Tumor Microenvironment , Humans , Tumor Microenvironment/immunology , Male , Female , Middle Aged , Aged , Cluster Analysis , Pleural Neoplasms/immunology , Pleural Neoplasms/pathology , Tumor-Infiltrating Lymphocytes/immunology , Mesothelioma/immunology , Mesothelioma/pathology , Adult , Malignant Mesothelioma/immunology , Malignant Mesothelioma/pathology , Aged 80 and over , Prognosis , Immunohistochemistry
9.
Environ Sci Technol ; 58(9): 4268-4280, 2024 Mar 05.
Article in English | MEDLINE | ID: mdl-38393751

ABSTRACT

Sub-Saharan Africa is a hotspot for biomass burning (BB)-derived carbonaceous aerosols, including light-absorbing organic (brown) carbon (BrC). However, the chemically complex nature of BrC in BB aerosols from this region is not fully understood. We generated smoke in a chamber through smoldering combustion of common sub-Saharan African biomass fuels (hardwoods, cow dung, savanna grass, and leaves). We quantified aethalometer-based, real-time light-absorption properties of BrC-containing organic-rich BB aerosols, accounting for variations in wavelength, fuel type, relative humidity, and photochemical aging conditions. In filter samples collected from the chamber and Botswana in the winter, we identified 182 BrC species, classified into lignin pyrolysis products, nitroaromatics, coumarins, stilbenes, and flavonoids. Using an extensive set of standards, we determined species-specific mass and emission factors. Our analysis revealed a linear relationship between the combined BrC species contribution to chamber-measured BB aerosol mass (0.4-14%) and the mass-absorption cross-section at 370 nm (0.2-2.2 m² g⁻¹). Hierarchical clustering resolved key molecular-level components from the BrC matrix, with photochemically aged emissions from leaf and cow-dung burning showing BrC fingerprints similar to those found in Botswana aerosols. These quantitative findings could potentially help refine climate model predictions, aid in source apportionment, and inform effective air quality management policies for human health and the global climate.


Subject(s)
Air Pollutants , Air Pollution , Humans , Aged , Carbon , Biomass , Environmental Monitoring , Air Pollution/analysis , Aerosols/analysis , Air Pollutants/analysis , Particulate Matter/analysis
10.
Pathol Int ; 74(1): 13-25, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38050808

ABSTRACT

The present study analyzed the expression of five independent immunohistochemical markers, CD4, CD8, CD66b, CD68, and CD163, on immune cells within the colorectal cancer (CRC) tumor microenvironment (TME). Using hierarchical clustering, patients were successfully classified according to significant associations with clinicopathological features and/or survival. Patients with mismatch repair-proficient (pMMR) CRC were categorized into four groups with survival differences (p = 0.0084): CD4Low, CD4High, MΦHigh, and CD8Low. MΦHigh tumors showed significantly higher expression of CD47 (p < 0.0001), a phagocytosis checkpoint molecule. These tumors contained significantly greater numbers of PD-1+ (p < 0.0001), TIM-3+ (p < 0.0001), and SIRPA+ (p < 0.0001) immune cells. Notably, 10% of the patients with pMMR CRC expressed PD-L1 (CD274) on tumor cells with significantly worse survival (p = 0.00064). The Cox proportional hazards model identified MΦHigh (hazard ratio [HR] = 2.02, 95%, p = 0.032), CD8Low (HR = 2.45, p = 0.011), and tumor PD-L1 expression (HR = 2.74, p = 0.0061) as potential risk factors. PD-L1-PD-1 and/or CD47-SIRPA axes targeting immune checkpoint therapies might be considered for patients with pMMR CRC according to their tumor cells and tumor immune microenvironment characteristics.


Subject(s)
Colorectal Neoplasms , Humans , Colorectal Neoplasms/pathology , CD47 Antigen , B7-H1 Antigen/metabolism , Programmed Cell Death 1 Receptor/metabolism , Tumor Biomarkers/analysis , Tumor Microenvironment
11.
J Biopharm Stat ; : 1-19, 2024 Jun 18.
Article in English | MEDLINE | ID: mdl-38888431

ABSTRACT

Pharmaceutical researchers are continually searching for techniques to improve both drug development processes and patient outcomes. An area of recent interest is the potential for machine learning (ML) applications within pharmacology. One such application not yet given close study is the unsupervised clustering of plasma concentration-time curves, hereafter, pharmacokinetic (PK) curves. In this paper, we present our findings on how to cluster PK curves by their similarity. Specifically, we find clustering to be effective at identifying similar-shaped PK curves and informative for understanding patterns within each cluster of PK curves. Because PK curves are time series data objects, our approach utilizes the extensive body of research related to the clustering of time series data as a starting point. As such, we examine many dissimilarity measures between time series data objects to find those most suitable for PK curves. We identify Euclidean distance as generally most appropriate for clustering PK curves, and we further show that dynamic time warping, Fréchet, and structure-based measures of dissimilarity like correlation may produce unexpected results. As an illustration, we apply these methods in a case study with 250 PK curves used in a previous pharmacogenomic study. Our case study finds that an unsupervised ML clustering with Euclidean distance, without any subject genetic information, is able to independently validate the same conclusions as the reference pharmacogenomic results. To our knowledge, this is the first such demonstration. Further, the case study demonstrates how the clustering of PK curves may generate insights that could be difficult to perceive solely with population level summary statistics of PK metrics.
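
The contrast between Euclidean distance and a correlation-based dissimilarity can be seen in a small sketch. It assumes two hypothetical PK curves sampled at the same time points, one a tenfold scaling of the other; the values and function names are illustrative and are not taken from the study.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two curves sampled at the same time points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def correlation_dissimilarity(a, b):
    """One minus the Pearson correlation: a shape-only (structure-based) measure."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(var_a * var_b)


# Hypothetical plasma concentrations at five shared sampling times.
low_exposure = [0.0, 2.0, 4.0, 3.0, 1.0]
high_exposure = [10 * c for c in low_exposure]   # same shape, tenfold exposure

print(euclidean(low_exposure, high_exposure))
print(correlation_dissimilarity(low_exposure, high_exposure))
```

The correlation-based measure treats the two curves as essentially identical because it ignores magnitude, while Euclidean distance keeps them far apart. This is one way a structure-based measure could group subjects with very different exposure levels.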

12.
J Biopharm Stat ; : 1-11, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38557411

ABSTRACT

The incorporation of real-world data (RWD) into medical product development and evaluation has exhibited consistent growth. However, there is no universally adopted method for determining how much information to borrow from external data. This paper proposes a study design methodology called Tree-based Monte Carlo (TMC) that dynamically integrates patients from various RWD sources to calculate the treatment effect based on the similarity between the clinical trial and RWD. Initially, a propensity score is developed to gauge the resemblance between clinical trial data and each real-world dataset. Utilizing this similarity metric, we construct a hierarchical clustering tree that delineates varying degrees of similarity between each RWD source and the clinical trial data. Ultimately, a Gaussian process methodology is employed across this hierarchical clustering framework to synthesize the projected treatment effects of the external group. Simulation results show that our clustering tree can successfully identify similarity. Data sources exhibiting greater similarity with the clinical trial are accorded higher weights in the treatment estimation process, while less congruent sources receive comparatively lower emphasis. Compared with another Bayesian method, the meta-analytic predictive prior (MAP), our proposed method's estimator is closer to the true value and has smaller bias.

13.
J Med Internet Res ; 26: e50976, 2024 May 30.
Article in English | MEDLINE | ID: mdl-38815258

ABSTRACT

BACKGROUND: Due to their accessibility and anonymity, web-based counseling services are expanding at an unprecedented rate. One of the most prominent challenges such services face is repeated users, who represent a small fraction of total users but consume significant resources by continually returning to the system and reiterating the same narrative and issues. A deeper understanding of repeated users and tailoring interventions may help improve service efficiency and effectiveness. Previous studies on repeated users were mainly on telephone counseling, and the classification of repeated users tended to be arbitrary and failed to capture the heterogeneity in this group of users. OBJECTIVE: In this study, we aimed to develop a systematic method to profile repeated users and to understand what drives their use of the service. By doing so, we aimed to provide insight and practical implications that can inform the provision of service catering to different types of users and improve service effectiveness. METHODS: We extracted session data from 29,400 users from a free 24/7 web-based counseling service from 2018 to 2021. To systematically investigate the heterogeneity of repeated users, hierarchical clustering was used to classify the users based on 3 indicators of service use behaviors, including the duration of their user journey, use frequency, and intensity. We then compared the psychological profile of the identified subgroups including their suicide risks and primary concerns to gain insights into the factors driving their patterns of service use. RESULTS: Three clusters of repeated users with clear psychological profiles were detected: episodic, intermittent, and persistent-intensive users. Generally, compared with one-time users, repeated users showed higher suicide risks and more complicated backgrounds, including more severe presenting issues such as suicide or self-harm, bullying, and addictive behaviors. 
Higher frequency and intensity of service use were also associated with elevated suicide risk levels and a higher proportion of users citing mental disorders as their primary concerns. CONCLUSIONS: This study presents a systematic method of identifying and classifying repeated users in web-based counseling services. The proposed bottom-up clustering method identified 3 subgroups of repeated users with distinct service behaviors and psychological profiles. The findings can facilitate frontline personnel in delivering more efficient interventions and the proposed method can also be meaningful to a wider range of services in improving service provision, resource allocation, and service effectiveness.


Subject(s)
Counseling , Humans , Longitudinal Studies , Cluster Analysis , Female , Adult , Male , Counseling/methods , Counseling/statistics & numerical data , Middle Aged , Text Messaging/statistics & numerical data , Young Adult
14.
Multivariate Behav Res ; 59(2): 266-288, 2024.
Article in English | MEDLINE | ID: mdl-38361218

ABSTRACT

The walktrap algorithm is one of the most popular community-detection methods in psychological research. Several simulation studies have shown that it is often effective at determining the correct number of communities and assigning items to their proper community. Nevertheless, it is important to recognize that the walktrap algorithm relies on hierarchical clustering because it was originally developed for networks much larger than those encountered in psychological research. In this paper, we present and demonstrate a computational alternative to the hierarchical algorithm that is conceptually easier to understand. More importantly, we show that better solutions to the sum-of-squares optimization problem that is heuristically tackled by hierarchical clustering in the walktrap algorithm can often be obtained using exact or approximate methods for K-means clustering. Three simulation studies and analyses of empirical networks were completed to assess the impact of better sum-of-squares solutions.
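
The sum-of-squares criterion mentioned above can be made concrete with a short sketch. It assumes items are represented as coordinate tuples and a candidate solution as a list of community labels; the points and both partitions are hypothetical, and the walktrap algorithm itself, including its random-walk distances, is not reproduced here.

```python
def within_cluster_ss(points, labels):
    """Sum of squared Euclidean distances from each point to its cluster centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        total += sum((m[d] - centroid[d]) ** 2 for m in members for d in range(dim))
    return total


# Four hypothetical items and two competing two-cluster partitions.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
partition_a = [0, 0, 1, 1]   # groups items that are close together
partition_b = [0, 1, 0, 1]   # groups items that are far apart

print(within_cluster_ss(points, partition_a))
print(within_cluster_ss(points, partition_b))
```

A lower value indicates a better solution under this criterion, so the two partitions can be ranked directly, however each was obtained.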


Subject(s)
Algorithms , Computer Simulation , Cluster Analysis
15.
Sensors (Basel) ; 24(8)2024 Apr 09.
Article in English | MEDLINE | ID: mdl-38676020

ABSTRACT

The objective of content-based image retrieval (CBIR) is to locate samples from a database that are akin to a query, relying on the content embedded within the images. A contemporary strategy involves calculating the similarity between compact vectors by encoding both the query and the database images as global descriptors. In this work, we propose an image retrieval method by using hierarchical K-means clustering to efficiently organize the image descriptors within the database, which aims to optimize the subsequent retrieval process. Then, we compute the similarity between the descriptor set within the leaf nodes and the query descriptor to rank them accordingly. Three tree search algorithms are presented to enable a trade-off between search accuracy and speed that allows for substantial gains at the expense of a slightly reduced retrieval accuracy. Our proposed method demonstrates enhancement in image retrieval speed when applied to the CLIP-based model, UNICOM, designed for category-level retrieval, as well as the CNN-based R-GeM model, tailored for particular object retrieval by validating its effectiveness across various domains and backbones. We achieve an 18-times speed improvement while preserving over 99% accuracy when applied to the In-Shop dataset, the largest dataset in the experiments.

16.
Sensors (Basel) ; 24(15)2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39123872

ABSTRACT

Hierarchical clustering is a widely used data analysis technique. Typically, tools for this method operate on data in its original, readable form, raising privacy concerns when a clustering task involving sensitive data that must remain confidential is outsourced to an external server. To address this issue, we developed a method that integrates Cheon-Kim-Kim-Song homomorphic encryption (HE), allowing the clustering process to be performed without revealing the raw data. In hierarchical clustering, the two nearest clusters are repeatedly merged until the desired number of clusters is reached. The proximity of clusters is evaluated using various metrics. In this study, we considered two well-known metrics: single linkage and complete linkage. Applying HE to these methods involves sorting encrypted distances, which is a resource-intensive operation. Therefore, we propose a cooperative approach in which the data owner aids the sorting process and shares a list of data positions with a computation server. Using this list, the server can determine the clustering of the data points. The proposed approach ensures secure hierarchical clustering using single and complete linkage methods without exposing the original data.
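
The merge loop described above, with single and complete linkage, can be sketched on plaintext data as follows. No homomorphic encryption or data-owner cooperation is modelled, and the function name, the sample points, and the target of two clusters are assumptions made for illustration.

```python
import math


def agglomerate(points, k, linkage="single"):
    """Repeatedly merge the two nearest clusters until k clusters remain.

    linkage="single" uses the closest pair of members as the cluster distance;
    linkage="complete" uses the farthest pair.
    Returns a list of clusters, each a list of point indices.
    """
    if linkage not in ("single", "complete"):
        raise ValueError("linkage must be 'single' or 'complete'")
    if k < 1:
        raise ValueError("k must be at least 1")
    pick = min if linkage == "single" else max
    clusters = [[i] for i in range(len(points))]

    def cluster_distance(a, b):
        return pick(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > k:
        # Find the pair of clusters with the smallest linkage distance.
        _, a, b = min(
            (cluster_distance(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical 1-D data with steadily widening gaps between neighbours.
data = [(0.0,), (1.0,), (2.1,), (3.3,), (4.6,)]
single = agglomerate(data, k=2, linkage="single")
complete = agglomerate(data, k=2, linkage="complete")
print(single, complete)
```

On this data single linkage chains the first four points together, while complete linkage yields a more balanced split, which shows how the choice of metric changes the outcome. Finding the minimum in each iteration is the step that, in the encrypted setting, corresponds to the costly sorting of distances.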

17.
J Environ Manage ; 355: 120496, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38437742

ABSTRACT

The contamination detection technology helps in water quality management and protection in surface water. It is important to detect sudden contamination events timely from dynamic variations due to various interference factors in online water quality monitoring data. In this study, a framework named "Prediction - Detection - Judgment" is proposed with a method framework of "Time series increment - Hierarchical clustering - Bayes' theorem model". Time to detection is used as an evaluation index of contamination detection methods, along with the probability of detection and false alarm rate. The proposed method is tested with available public data and further applied in a monitoring site of a river. Results showed that the method could detect the contamination events with a 100% probability of detection, a 17% false alarm rate and a time to detection close to 4 monitoring intervals. The proposed index time to detection evaluates the timeliness of the method, and timely detection ensures that contamination events can be responded to and dealt with in time. The site application also demonstrates the feasibility and practicability of the framework proposed in this study and its potential for extensive implementation.


Subject(s)
Judgment , Water Supply , Bayes Theorem , Water Quality , Water Pollution
18.
J Environ Manage ; 351: 119852, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38159309

ABSTRACT

This study proposes a research system for water ecosystem services (WES), covering classification, benefit quantification and spatial radiation effects, with the goal of promoting harmonious coexistence between humans and nature and providing a theoretical foundation for optimizing water resources management. Hierarchical cluster analysis was applied to categorize WES, taking into account four constraints: product nature, energy flow relationships, circularity, and human social utility. A multi-dimensional benefit quantification methodology for WES was constructed by combining emergy theory with multidisciplinary methods from ecology, economics, and sociology. Based on the theories of spatial autocorrelation and breaking point, we investigated the spatial radiation effects of typical services in the cyclic regulation category. The proposed methodology has been applied to Luoyang, China. The results show that the Resource Provisioning (RP) and Cultural Addition (CA) services change greatly over time and drive the overall WES to increase and then decrease. The spatial and temporal distribution of water resources is uneven, with WES being slightly better in the southern region than in the northern region. Additionally, the spatial radiation effects of typical regulating services are most prominent in S County. This finding suggests establishing scientific and rational intra-basin or inter-basin water management systems to expand the beneficial impacts of water-rich areas on neighboring regions.


Subject(s)
Conservation of Natural Resources, Ecosystem, Humans, Spatial Analysis, Ecology, China
19.
AAPS PharmSciTech ; 25(5): 127, 2024 Jun 06.
Article in English | MEDLINE | ID: mdl-38844724

ABSTRACT

The success of obtaining solid dispersions for solubility improvement invariably depends on the miscibility of the drug and polymeric carriers. This study aimed to categorize and select polymeric carriers via the classical group contribution method, using multivariate analysis of the calculated solubility parameters of RX-HCl. The total, partial, and derived parameters for RX-HCl were calculated. The data were compared with the results for excipients (N = 36), and a hierarchical clustering analysis was then performed. Solid dispersions of selected polymers at different drug loads were produced by solvent casting and characterized via X-ray diffraction, infrared spectroscopy, and scanning electron microscopy. RX-HCl presented a Hansen solubility parameter (HSP) of 23.52 MPa^(1/2). The exploratory analysis of HSP and relative energy difference (RED) yielded a classification into miscible (n = 11), partially miscible (n = 15), and immiscible (n = 10) combinations. The experimental validation, followed by a principal component regression, exhibited a significant correlation between the crystallinity reduction and the calculated parameters, whereas the spectroscopic evaluation highlighted the hydrogen-bonding contribution towards amorphization. The systematic approach presented a high discrimination ability, contributing to optimal excipient selection for obtaining solid solutions of RX-HCl.
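The quantities mentioned in the abstract follow standard Hansen definitions: the total parameter is the root sum of squares of the dispersion, polar, and hydrogen-bonding components, and RED is the Hansen distance Ra divided by an interaction radius R0. The sketch below illustrates those formulas only. All numeric inputs are hypothetical placeholders, not values from the study, and the function names are assumptions.

```python
import math


def hsp_total(delta_d, delta_p, delta_h):
    """Total Hansen solubility parameter in MPa^(1/2) from its three components."""
    return math.sqrt(delta_d ** 2 + delta_p ** 2 + delta_h ** 2)


def hsp_distance(drug, polymer):
    """Hansen distance Ra between two (delta_d, delta_p, delta_h) triples.

    The factor 4 on the dispersion term is part of the standard definition.
    """
    dd = drug[0] - polymer[0]
    dp = drug[1] - polymer[1]
    dh = drug[2] - polymer[2]
    return math.sqrt(4.0 * dd ** 2 + dp ** 2 + dh ** 2)


def relative_energy_difference(drug, polymer, r0):
    """RED = Ra / R0; values below 1 place the polymer inside the drug's sphere."""
    return hsp_distance(drug, polymer) / r0


# Hypothetical component values, for illustration only.
drug = (18.0, 5.0, 7.0)
polymer = (17.0, 5.0, 7.0)
ra = hsp_distance(drug, polymer)
red = relative_energy_difference(drug, polymer, r0=4.0)
```

By convention a RED below 1 indicates good affinity and a RED above 1 poor affinity. The cut-offs that separate the three classes reported in the abstract are not given there, so none are encoded here.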


Subject(s)
Pharmaceutical Chemistry, Excipients, Polymers, Raloxifene Hydrochloride, Solubility, X-Ray Diffraction, Polymers/chemistry, Excipients/chemistry, Raloxifene Hydrochloride/chemistry, Multivariate Analysis, X-Ray Diffraction/methods, Pharmaceutical Chemistry/methods, Drug Carriers/chemistry, Drug Compounding/methods, Scanning Electron Microscopy/methods, Hydrogen Bonding, Crystallization/methods
20.
Entropy (Basel) ; 26(3)2024 Mar 18.
Article in English | MEDLINE | ID: mdl-38539779

ABSTRACT

We address the challenge of identifying meaningful communities by proposing a model based on convex game theory and a measure of community strength. Many existing community detection methods fail to provide unique solutions, and it remains unclear how the solutions depend on initial conditions. Our approach identifies strong communities with a hierarchical structure that can be visualized as a dendrogram and computed in polynomial time using submodular function minimization. This framework extends beyond graphs to hypergraphs and even polymatroids. When the model is graphical, a more efficient algorithm based on max-flow min-cut can be devised. Although this does not achieve near-linear time complexity, the pursuit of practical algorithms is an intriguing avenue for future research. Our work serves as a foundation, offering an analytical framework that yields unique solutions with clear operational meaning for the communities identified.
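The link between graphs and submodular minimization can be illustrated with the graph cut function, the textbook example of a submodular set function. The sketch below checks the submodular inequality by enumeration and finds a minimum cut by brute force on a toy graph. It is not the model or algorithm of the paper: the graph, the weights, and the function names are assumptions, and the exhaustive search is exponential, whereas the abstract refers to polynomial-time methods.

```python
from itertools import combinations


def cut_value(edges, subset):
    """Total weight of edges with exactly one endpoint in `subset`."""
    return sum(w for (u, v), w in edges.items() if (u in subset) != (v in subset))


def all_subsets(nodes):
    """Yield every subset of `nodes` as a frozenset, smallest first."""
    nodes = list(nodes)
    for size in range(len(nodes) + 1):
        for combo in combinations(nodes, size):
            yield frozenset(combo)


def is_submodular(edges, nodes):
    """Check f(A) + f(B) >= f(A | B) + f(A & B) for all pairs of subsets."""
    sets = list(all_subsets(nodes))
    for a in sets:
        for b in sets:
            lhs = cut_value(edges, a) + cut_value(edges, b)
            rhs = cut_value(edges, a | b) + cut_value(edges, a & b)
            if lhs < rhs - 1e-12:
                return False
    return True


def min_cut_brute_force(edges, nodes):
    """Minimize the cut over non-empty proper subsets (exponential time)."""
    nodes = list(nodes)
    candidates = [s for s in all_subsets(nodes) if 0 < len(s) < len(nodes)]
    best = min(candidates, key=lambda s: cut_value(edges, s))
    return best, cut_value(edges, best)


# Hypothetical graph: two tightly knit triangles joined by one weak edge.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = {
    ("a", "b"): 3, ("b", "c"): 3, ("a", "c"): 3,
    ("d", "e"): 3, ("e", "f"): 3, ("d", "f"): 3,
    ("c", "d"): 1,
}
submodular = is_submodular(edges, nodes)
community, weight = min_cut_brute_force(edges, nodes)
```

On this toy graph the cheapest cut severs only the weak bridge, which separates the two triangles. That is the intuition behind reading minimum cuts as community boundaries.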
