Results 1 - 20 of 194
1.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38627939

ABSTRACT

The latest breakthroughs in spatially resolved transcriptomics technology offer comprehensive opportunities to delve into gene expression patterns within the tissue microenvironment. However, the precise identification of spatial domains within tissues remains challenging. In this study, we introduce AttentionVGAE (AVGN), which integrates slice images, spatial information and raw gene expression while calibrating low-quality gene expression. By combining the variational graph autoencoder with multi-head attention blocks (MHA blocks), AVGN captures spatial relationships in tissue gene expression, adaptively focusing on key features and alleviating the need for prior knowledge of cluster numbers, thereby achieving superior clustering performance. Particularly, AVGN attempts to balance the model's attention focus on local and global structures by utilizing MHA blocks, an aspect that current graph neural networks have not extensively addressed. Benchmark testing demonstrates its significant efficacy in elucidating tissue anatomy and interpreting tumor heterogeneity, indicating its potential in advancing spatial transcriptomics research and understanding complex biological phenomena.


Subjects
Benchmarking , Gene Expression Profiling , Cluster Analysis , Neural Networks, Computer
2.
Brief Bioinform ; 24(2)2023 03 19.
Article in English | MEDLINE | ID: mdl-36781228

ABSTRACT

Recent advances in spatial transcriptomics have enabled measurements of gene expression at cell/spot resolution while retaining both the spatial information and the histology images of the tissues. Accurately identifying the spatial domains of spots is a vital step for various downstream tasks in spatial transcriptomics analysis. To remove noise in gene expression, several methods have been developed that incorporate histopathological images into the analysis of spatial transcriptomics data. However, these methods either use the image only for the spatial relations between spots, or learn the embeddings of the gene expression and image separately without fully coupling the information. Here, we propose a novel method, ConGI, to accurately exploit spatial domains by adapting gene expression with histopathological images through contrastive learning. Specifically, we designed three contrastive loss functions within and between two modalities (the gene expression and image data) to learn the common representations. The learned representations are then used to cluster the spatial domains on both tumor and normal spatial transcriptomics datasets. ConGI was shown to outperform existing methods for spatial domain identification. In addition, the learned representations have also been shown to be powerful for various downstream tasks, including trajectory inference, clustering, and visualization.
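
The abstract above does not specify its three contrastive losses, so the following is only a minimal sketch of one generic cross-modal contrastive loss (InfoNCE), not ConGI's own formulation. The names `gene_emb` and `image_emb`, the cosine similarity, and the temperature of 0.5 are all assumptions made for illustration: each spot's gene-expression embedding is pulled toward the image embedding of the same spot and pushed away from the image embeddings of other spots.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(gene_emb, image_emb, temperature=0.5):
    """Mean InfoNCE loss; (gene_emb[i], image_emb[i]) is the positive pair."""
    n = len(gene_emb)
    total = 0.0
    for i in range(n):
        logits = [cosine(gene_emb[i], image_emb[j]) / temperature for j in range(n)]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        total += log_denominator - logits[i]
    return total / n


# Hypothetical two-spot example: matched pairs give a lower loss than swapped pairs.
gene = [[1.0, 0.0], [0.0, 1.0]]
aligned_images = [[0.9, 0.1], [0.1, 0.9]]
swapped_images = [[0.1, 0.9], [0.9, 0.1]]
loss_aligned = info_nce(gene, aligned_images)
loss_swapped = info_nce(gene, swapped_images)
```

In this toy case the loss is lower when the image embeddings line up with the gene-expression embeddings of the same spot, which is the behaviour a cross-modal contrastive objective is meant to reward.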


Subjects
Learning , Transcriptome , Gene Expression Profiling , Cluster Analysis , Data Analysis
3.
Brief Bioinform ; 24(3)2023 05 19.
Article in English | MEDLINE | ID: mdl-37080761

ABSTRACT

Advancing spatially resolved transcriptomics (ST) technologies help biologists comprehensively understand organ function and tissue microenvironment. Accurate spatial domain identification is the foundation for delineating genome heterogeneity and cellular interaction. Motivated by this perspective, a graph deep learning (GDL) based spatial clustering approach is constructed in this paper. First, the deep graph infomax module embedded with residual gated graph convolutional neural network is leveraged to address the gene expression profiles and spatial positions in ST. Then, the Bayesian Gaussian mixture model is applied to handle the latent embeddings to generate spatial domains. Designed experiments certify that the presented method is superior to other state-of-the-art GDL-enabled techniques on multiple ST datasets. The codes and dataset used in this manuscript are summarized at https://github.com/narutoten520/SCGDL.


Subjects
Deep Learning , Transcriptome , Bayes Theorem , Gene Expression Profiling , Cell Communication
4.
Biometrics ; 80(1)2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38372400

ABSTRACT

Camera traps or acoustic recorders are often used to sample wildlife populations. When animals can be individually identified, these data can be used with spatial capture-recapture (SCR) methods to assess populations. However, obtaining animal identities is often labor-intensive and not always possible for all detected animals. To address this problem, we formulate SCR, including acoustic SCR, as a marked Poisson process, comprising a single counting process for the detections of all animals and a mark distribution for what is observed (eg, animal identity, detector location). The counting process applies equally when it is animals appearing in front of camera traps and when vocalizations are captured by microphones, although the definition of a mark changes. When animals cannot be uniquely identified, the observed marks arise from a mixture of mark distributions defined by the animal activity centers and additional characteristics. Our method generalizes existing latent identity SCR models and provides an integrated framework that includes acoustic SCR. We apply our method to estimate density from a camera trap study of fisher (Pekania pennanti) and an acoustic survey of Cape Peninsula moss frog (Arthroleptella lightfooti). We also test it through simulation. We find latent identity SCR with additional marks such as sex or time of arrival to be a reliable method for estimating animal density.


Subjects
Population Density , Animals , Computer Simulation
5.
Biometrics ; 80(3)2024 Jul 01.
Article in English | MEDLINE | ID: mdl-39073775

ABSTRACT

Recent breakthroughs in spatially resolved transcriptomics (SRT) technologies have enabled comprehensive molecular characterization at the spot or cellular level while preserving spatial information. Cells are the fundamental building blocks of tissues, organized into distinct yet connected components. Although many non-spatial and spatial clustering approaches have been used to partition the entire region into mutually exclusive spatial domains based on the SRT high-dimensional molecular profile, most require an ad hoc selection of less interpretable dimensional-reduction techniques. To overcome this challenge, we propose a zero-inflated negative binomial mixture model to cluster spots or cells based on their molecular profiles. To increase interpretability, we employ a feature selection mechanism to provide a low-dimensional summary of the SRT molecular profile in terms of discriminating genes that shed light on the clustering result. We further incorporate the SRT geospatial profile via a Markov random field prior. We demonstrate how this joint modeling strategy improves clustering accuracy, compared with alternative state-of-the-art approaches, through simulation studies and 3 real data applications.
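
As a rough illustration of the count model named in the abstract above, the sketch below evaluates a zero-inflated negative binomial (ZINB) probability mass function. The mean/dispersion parameterization (`mu`, `phi`, with variance `mu + mu**2 / phi`) and the zero-inflation weight `pi` are assumptions chosen for this example; the abstract does not state which parameterization the authors use, and the mixture, feature selection, and Markov random field components are not shown.

```python
import math


def nb_pmf(y, mu, phi):
    """Negative binomial pmf with mean mu and dispersion phi."""
    log_p = (
        math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
        + phi * math.log(phi / (phi + mu))
        + y * math.log(mu / (phi + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, pi, mu, phi):
    """Zero-inflated NB: extra mass pi at zero, (1 - pi) on the NB component."""
    base = (1.0 - pi) * nb_pmf(y, mu, phi)
    return pi + base if y == 0 else base


# Hypothetical parameters for one gene in one cluster.
pi, mu, phi = 0.3, 3.0, 2.0
total_mass = sum(zinb_pmf(y, pi, mu, phi) for y in range(500))
mean_count = sum(y * zinb_pmf(y, pi, mu, phi) for y in range(500))
```

The probabilities sum to one and the mean equals `(1 - pi) * mu`, which is a quick sanity check on the parameterization before it is used inside a mixture likelihood.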


Subjects
Bayes Theorem , Computer Simulation , Gene Expression Profiling , Cluster Analysis , Gene Expression Profiling/methods , Gene Expression Profiling/statistics & numerical data , Humans , Transcriptome , Markov Chains , Models, Statistical , Data Interpretation, Statistical
6.
Network ; : 1-25, 2024 Mar 14.
Article in English | MEDLINE | ID: mdl-38482862

ABSTRACT

An Adaptive Activation Functions with Deep Kronecker Neural Network optimized with the Bear Smell Search Algorithm (BSSA) (ADKNN-BSSA-CSMANET) is proposed for preventing MANET cyber security attacks. Mobile users are enrolled with a Trusted Authority using a crypto hash signature (SHA-256). Every mobile user uploads their finger vein biometric, user ID, latitude and longitude for confirmation. The packet analyser checks whether any attack patterns are identified. It is implemented using adaptive density-based spatial clustering (ADSC), which considers information from the packet header. Geodesic filtering (GF) is used as a pre-processing method for removing unsolicited content and filtering pertinent data. Group Teaching Algorithm (GTA)-based feature selection is used to select an optimal set of features, and the Adaptive Activation Functions with Deep Kronecker Neural Network (ADKNN) is used to categorize normal and attack packets (DoS, Probe, U2R, and R2L). BSSA is then used to optimize the weight parameters of the ADKNN classifier for optimal classification. The proposed technique is implemented in Python and its efficiency is evaluated using several performance metrics, such as Accuracy, Attack Detection Rate, Detection Delay, Packet Delivery Ratio, Throughput, and Energy Consumption. The proposed technique provides 36.64%, 33.06%, and 33.98% lower Detection Delay on the NSL-KDD dataset compared with the existing methods.

7.
Int J Health Geogr ; 23(1): 16, 2024 Jun 26.
Article in English | MEDLINE | ID: mdl-38926856

ABSTRACT

BACKGROUND: Obesity in Malaysia is on an escalating trend, and the lack of evidence on the environmental influence on obesity is untenable. Obesogenic environmental factors often emerge as a result of shared environmental, demographic, or cultural effects among neighbouring regions that impact lifestyle. Employing spatial clustering can effectively elucidate the geographical distribution of obesity and pinpoint regions with potential obesogenic environments, thereby informing public health interventions and further exploration of the local environments. This study aimed to determine the spatial clustering of body mass index (BMI) among adults in Malaysia. METHOD: This study utilized information on respondents aged 18 to 59 years from the National Health and Morbidity Survey (NHMS) 2014 and 2015 in Peninsular Malaysia and East Malaysia. Fast food restaurant proximity, district population density, and district median household income were determined from other sources. The analysis was conducted for total respondents and stratified by sex. Multilevel regression was used to produce the BMI estimates on a set of variables, adjusted for data clustering at enumeration blocks. Global Moran's I and Local Indicator of Spatial Association statistics were applied to assess the general clustering and location of spatial clusters of BMI, respectively, using point locations of respondents and spatial weights of 8 km Euclidean radius or 5 nearest neighbours. RESULTS: Spatial clustering of BMI independent of individual sociodemographics was significant (p < 0.001) in Peninsular and East Malaysia with Global Moran's index of 0.12 and 0.15, respectively. High-BMI clusters (hotspots) were in suburban districts, whilst the urban districts were low-BMI clusters (cold spots). Spatial clustering was greater among males with hotspots located closer to urban areas, whereas hotspots for females were in less urbanized areas. CONCLUSION: Obesogenic environment was identified in suburban districts, where spatial clusters differ between males and females in certain districts. Future studies and interventions on creating a healthier environment should be geographically targeted and consider gender differences.
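
To make the Global Moran's I statistic mentioned in the abstract above concrete, here is a minimal sketch using binary distance-band weights (1 if two points lie within a fixed Euclidean radius, 0 otherwise). The function names, the toy coordinates, and the radius are hypothetical; the study's multilevel adjustment, its 8 km radius, and its significance testing are not reproduced.

```python
import math


def radius_weights(points, radius):
    """Binary spatial weights: w[i][j] = 1 when i != j and dist(i, j) <= radius."""
    n = len(points)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= radius:
                weights[i][j] = 1.0
    return weights


def morans_i(values, weights):
    """Global Moran's I = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in z)


# Four hypothetical respondents on a line, one unit apart.
locations = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
w = radius_weights(locations, radius=1.0)
clustered = morans_i([10.0, 10.0, 1.0, 1.0], w)    # similar values are neighbours
alternating = morans_i([10.0, 1.0, 10.0, 1.0], w)  # dissimilar values are neighbours
```

Positive values indicate that similar values sit near each other (clustering) and negative values indicate a checkerboard-like pattern, which is why the small positive indices reported above are read as evidence of spatial clustering.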


Subjects
Body Mass Index , Obesity , Humans , Male , Adult , Female , Malaysia/epidemiology , Obesity/epidemiology , Middle Aged , Young Adult , Adolescent , Cluster Analysis , Spatial Analysis , Environment , Health Surveys
8.
Environ Manage ; 73(5): 1016-1031, 2024 May.
Article in English | MEDLINE | ID: mdl-38345757

ABSTRACT

The modeling and mapping of ecosystem service (ES) hotspots and coldspots is an essential factor in the decision-making process for ES conservation. Moreover, spatial prioritization is a critical stage in conservation planning. In the present research, based on the InVEST software, Getis-Ord statistics (Gi*), and a set of GIS methods, we quantified and mapped the variation and overlap among three ESs (carbon storage, soil retention, and habitat quality). Furthermore, an approach was proposed for detecting priority areas to protect multiple ecosystem services. Hotspots identified via the Gi* statistic have a higher capacity for supplying ESs than other areas. This means that protecting areas with a larger number of overlapping hotspots can provide more services. Results indicated that population growth accompanied by the increase in construction sites and low-yield agricultural lands in the Zayanderood dam watershed basin has resulted in ES losses. This situation is represented by increasing soil erosion, reduced carbon storage, reduced biodiversity, and fragmented habitat distribution due to land-use change. The statistically significant carbon storage, soil retention, and habitat quality hotspots with above 95% confidence level account for 21.5%, 39.3%, and 16.9% of the study area, respectively. Therefore, a clear framework was presented in this study for setting ES-based conservation priority. Decision makers and land-use planners can also incorporate this technique into their framework to identify and conserve ES hotspots to support their targeted ecosystem policies.
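
The Getis-Ord Gi* statistic used in the abstract above can be sketched as a z-score computed per map cell. The example below assumes binary weights that include the cell itself and a hypothetical one-dimensional strip of six cells; it illustrates the standard formula only and is not the study's GIS workflow.

```python
import math


def getis_ord_gi_star(values, weights):
    """Gi* z-score per location; weights[i][i] should be non-zero (self included)."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for i in range(n):
        w_sum = sum(weights[i])
        w_sq_sum = sum(x * x for x in weights[i])
        numerator = sum(weights[i][j] * values[j] for j in range(n)) - mean * w_sum
        denominator = s * math.sqrt((n * w_sq_sum - w_sum * w_sum) / (n - 1))
        scores.append(numerator / denominator)
    return scores


# Hypothetical strip of six cells: high carbon storage on the left, low on the right.
storage = [9.0, 9.0, 9.0, 1.0, 1.0, 1.0]
n_cells = len(storage)
# Each cell is a neighbour of itself and of the adjacent cells.
w = [[1.0 if abs(i - j) <= 1 else 0.0 for j in range(n_cells)] for i in range(n_cells)]
z_scores = getis_ord_gi_star(storage, w)
```

A cell whose z-score exceeds 1.96 would be flagged as a hotspot at the 95% confidence level, and one below -1.96 as a coldspot, which matches the thresholding described in the abstract.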


Subjects
Conservation of Natural Resources , Ecosystem , Iran , Conservation of Natural Resources/methods , Soil , Carbon , China
9.
Muscle Nerve ; 68(3): 323-328, 2023 09.
Article in English | MEDLINE | ID: mdl-37466098

ABSTRACT

INTRODUCTION/AIMS: Several microgeographic clusters of higher/lower incidence of amyotrophic lateral sclerosis (ALS) have been identified worldwide. Differences in the distribution of local factors were proposed to explain the excess ALS risk, whereas the contribution of known genetic/epigenetic factors remains unclear. The aim is to identify restricted areas of higher risk in Sardinia and to assess whether age, sex, and the most common causative genetic mutations in Sardinia (C9orf72 and TARDBP mutations) contributed to the variation in the ALS risk. METHODS: We performed an ad hoc analysis of the 10-y population-based incident cohort of ALS cases from a recent study of a large Sardinian area. Cluster analysis was performed by age- and sex-adjusted Kulldorff's spatial scan statistic. RESULTS: We identified a statistically significant cluster of higher ALS incidence in a relatively large area including 34 municipalities and >100,000 individuals. The investigated genetic mutations were more frequent in the cluster area than outside. Regardless of the genetic mutations, the excess of ALS risk was significantly associated with either sex or with age ≥ 65 y. Finally, an additive interaction between older age and male sex contributed to the excess of ALS risk in the cluster area but not outside. DISCUSSION: Our analysis demonstrated that known genetic factors, age, and sex may contribute to microgeographic variation in ALS incidence. The significant additive interaction between older age and male sex we found in the high-incidence cluster could suggest the presence of a third factor connecting the analyzed risk factors.
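
The core of Kulldorff's spatial scan statistic, named in the abstract above, is a Poisson log-likelihood ratio evaluated over candidate circular zones. The sketch below is a simplified illustration under stated assumptions: the function names and toy data are hypothetical, zones are grown around each area centroid up to half of the total population, and the age and sex adjustment and the Monte Carlo significance test used in practice are omitted.

```python
import math


def poisson_llr(cases, expected, total_cases):
    """Log-likelihood ratio for a zone with an excess of cases (0 if no excess)."""
    if cases <= expected:
        return 0.0
    inside = cases * math.log(cases / expected)
    outside = 0.0
    if total_cases > cases:
        outside = (total_cases - cases) * math.log(
            (total_cases - cases) / (total_cases - expected)
        )
    return inside + outside


def scan_best_zone(areas, max_pop_fraction=0.5):
    """areas: list of (x, y, cases, population). Returns (best LLR, area indices)."""
    total_cases = sum(a[2] for a in areas)
    total_pop = sum(a[3] for a in areas)
    best = (0.0, [])
    for cx, cy, _, _ in areas:
        # Grow a circle around this centroid by adding areas nearest first.
        order = sorted(range(len(areas)), key=lambda k: math.dist((cx, cy), areas[k][:2]))
        cases = pop = 0
        zone = []
        for k in order:
            cases += areas[k][2]
            pop += areas[k][3]
            if pop > max_pop_fraction * total_pop:
                break
            zone.append(k)
            expected = total_cases * pop / total_pop
            llr = poisson_llr(cases, expected, total_cases)
            if llr > best[0]:
                best = (llr, sorted(zone))
    return best


# Five hypothetical municipalities on a line, 1000 inhabitants each.
municipalities = [(0, 0, 2, 1000), (1, 0, 3, 1000), (2, 0, 20, 1000),
                  (3, 0, 18, 1000), (4, 0, 2, 1000)]
best_llr, best_zone = scan_best_zone(municipalities)
```

In this toy example the two adjacent high-incidence municipalities form the most likely cluster; in a real analysis the maximum LLR would then be compared with its distribution under randomly simulated data sets.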


Subjects
Amyotrophic Lateral Sclerosis , Humans , Male , Amyotrophic Lateral Sclerosis/epidemiology , Amyotrophic Lateral Sclerosis/genetics , Mutation/genetics , Incidence , Risk Factors , Cluster Analysis , Italy/epidemiology
10.
Malar J ; 22(1): 75, 2023 Mar 04.
Article in English | MEDLINE | ID: mdl-36870976

ABSTRACT

BACKGROUND: Over the last decades, enormous successes have been achieved in reducing malaria burden globally. In Latin America, South East Asia, and the Western Pacific, many countries now pursue the goal of malaria elimination by 2030. It is widely acknowledged that Plasmodium spp. infections cluster spatially so that interventions need to be spatially informed, e.g. spatially targeted reactive case detection strategies. Here, the spatial signature method is introduced as a tool to quantify the distance around an index infection within which other infections significantly cluster. METHODS: Data were considered from cross-sectional surveys from Brazil, Thailand, Cambodia, and Solomon Islands, conducted between 2012 and 2018. Household locations were recorded by GPS and finger-prick blood samples from participants were tested for Plasmodium infection by PCR. Cohort studies from Brazil and Thailand with monthly sampling over a year from 2013 until 2014 were also included. The prevalence of PCR-confirmed infections was calculated at increasing distance around index infections (and growing time intervals in the cohort studies). Statistical significance was defined as prevalence outside of a 95%-quantile interval of a bootstrap null distribution after random re-allocation of locations of infections. RESULTS: Prevalence of Plasmodium vivax and Plasmodium falciparum infections was elevated in close proximity around index infections and decreased with distance in most study sites, e.g. from 21.3% at 0 km to the global study prevalence of 6.4% for P. vivax in the Cambodian survey. In the cohort studies, the clustering decreased with longer time windows. The distance from index infections to a 50% reduction of prevalence ranged from 25 m to 3175 m, tending to shorter distances at lower global study prevalence. CONCLUSIONS: The spatial signatures of P. vivax and P. falciparum infections demonstrate spatial clustering across a diverse set of study sites, quantifying the distance within which the clustering occurs. The method offers a novel tool in malaria epidemiology, potentially informing reactive intervention strategies regarding radius choices of operations around detected infections and thus strengthening malaria elimination endeavours.
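
The procedure described in the abstract above (prevalence around index infections, compared with a null distribution obtained by randomly re-allocating infections) can be sketched as follows. This is a simplified illustration, not the authors' implementation: the function names, the toy transect of 20 participants, the single radius, and the 200 permutations are assumptions, and a real analysis would repeat the calculation over a range of distances.

```python
import math
import random


def prevalence_within(points, infected, radius):
    """Pooled prevalence among other participants within `radius` of each index infection."""
    positives = total = 0
    for i, is_index in enumerate(infected):
        if not is_index:
            continue
        for j in range(len(points)):
            if j != i and math.dist(points[i], points[j]) <= radius:
                total += 1
                positives += infected[j]
    return positives / total if total else float("nan")


def null_interval(points, infected, radius, n_perm=200, seed=0):
    """95% quantile interval of prevalence after shuffling infection labels."""
    rng = random.Random(seed)
    labels = list(infected)
    simulated = []
    for _ in range(n_perm):
        rng.shuffle(labels)
        simulated.append(prevalence_within(points, labels, radius))
    simulated.sort()
    return simulated[int(0.025 * n_perm)], simulated[int(0.975 * n_perm) - 1]


# Hypothetical transect: 20 participants one unit apart, infections clustered at one end.
locations = [(float(x), 0.0) for x in range(20)]
status = [1] * 5 + [0] * 15
observed = prevalence_within(locations, status, radius=1.0)
lower, upper = null_interval(locations, status, radius=1.0)
```

Here the observed prevalence around index infections lies above the upper bound of the null interval, which is the criterion the abstract uses to call clustering at that distance statistically significant.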


Subjects
Malaria, Falciparum , Malaria, Vivax , Humans , Plasmodium vivax , Cross-Sectional Studies , Plasmodium falciparum , Cluster Analysis , Cohort Studies
11.
Stat Med ; 42(26): 4794-4823, 2023 Nov 20.
Article in English | MEDLINE | ID: mdl-37652405

ABSTRACT

In spatio-temporal epidemiological analysis, it is of critical importance to identify the significant covariates and estimate the associated time-varying effects on the health outcome. Due to the heterogeneity of spatio-temporal data, the subsets of important covariates may vary across space and the temporal trends of covariate effects could be locally different. However, many spatial models neglected the potential local variation patterns, leading to inappropriate inference. Thus, this article proposes a flexible Bayesian hierarchical model to simultaneously identify spatial clusters of regression coefficients with common temporal trends, select significant covariates for each spatial group by introducing binary entry parameters and estimate spatio-temporally varying disease risks. A multistage strategy is employed to reduce the confounding bias caused by spatially structured random components. A simulation study demonstrates the outperformance of the proposed method, compared with several alternatives based on different assessment criteria. The methodology is motivated by two important case studies. The first concerns the low birth weight incidence data in 159 counties of Georgia, USA, for the years 2007 to 2018 and investigates the time-varying effects of potential contributing covariates in different cluster regions. The second concerns the circulatory disease risks across 323 local authorities in England over 10 years and explores the underlying spatial clusters and associated important risk factors.

12.
BMC Public Health ; 23(1): 1652, 2023 08 29.
Article in English | MEDLINE | ID: mdl-37644452

ABSTRACT

BACKGROUND: Despite significant progress in sanitation status and public health awareness, intestinal infectious diseases (IID) have caused a serious disease burden in China. Little was known about the spatio-temporal pattern of IID at the county level in Zhejiang. Therefore, a spatio-temporal modelling study to identify high-risk regions of IID incidence and potential risk factors was conducted. METHODS: Reported cases of notifiable IID from 2008 to 2021 were obtained from the China Information System for Disease Control and Prevention. Moran's I index and the local indicators of spatial association (LISA) were calculated using GeoDa software to identify the spatial autocorrelation and high-risk areas of IID incidence. A Bayesian hierarchical model was used to explore socioeconomic and climate factors affecting IID incidence inequities from spatial and temporal perspectives. RESULTS: From 2008 to 2021, a total of 101 cholera, 55,298 bacterial dysentery, 131 amoebic dysentery, 5297 typhoid, 2102 paratyphoid, 27,947 HEV, 1,695,925 hand, foot and mouth disease (HFMD), and 1,505,797 other infectious diarrhea (OID) cases were reported in Zhejiang Province. The hot spots for bacterial dysentery, OID, and HEV incidence were found mainly in Hangzhou, while high-high cluster regions for incidence of enteric fever and HFMD were mainly located in Ningbo. The Bayesian model showed that areas with a high proportion of males had a lower risk of BD and enteric fever. People under the age of 18 may have a higher risk of IID. High urbanization rate was a protective factor against HFMD (RR = 0.91, 95% CI: 0.88-0.94), but was a risk factor for HEV (RR = 1.06, 95% CI: 1.01-1.10). BD risk (RR = 1.14, 95% CI: 1.10-1.18) and enteric fever risk (RR = 1.18, 95% CI: 1.10-1.27) seemed higher in areas with high GDP per capita. The greater the population density, the higher the risk of BD (RR = 1.29, 95% CI: 1.23-1.36), enteric fever (RR = 1.12, 95% CI: 1.00-1.25), and HEV (RR = 1.15, 95% CI: 1.09-1.21). Among climate variables, higher temperature was associated with a higher risk of BD (RR = 1.32, 95% CI: 1.23-1.41), enteric fever (RR = 1.41, 95% CI: 1.33-1.50), and HFMD (RR = 1.22, 95% CI: 1.08-1.38), and with lower risk of HEV (RR = 0.83, 95% CI: 0.78-0.89). Precipitation was positively correlated with enteric fever (RR = 1.04, 95% CI: 1.00-1.08), HFMD (RR = 1.03, 95% CI: 1.00-1.06), and HEV (RR = 1.05, 95% CI: 1.03-1.08). Higher HFMD risk was also associated with increasing relative humidity (RR = 1.20, 95% CI: 1.16-1.24) and lower wind velocity (RR = 0.88, 95% CI: 0.84-0.92). CONCLUSIONS: There was significant spatial clustering of IID incidence in Zhejiang Province from 2008 to 2021. Spatio-temporal patterns of IID risk could be largely explained by socioeconomic and meteorological factors. Preventive measures and enhanced monitoring should be taken in some high-risk counties in Hangzhou city and Ningbo city.
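
The high-high cluster regions in the abstract above come from local indicators of spatial association (LISA). A minimal sketch of the local Moran statistic with row-standardized neighbour weights is shown below; the function name, the neighbour lists, and the six hypothetical counties are illustrative assumptions, and the permutation-based significance test that software such as GeoDa applies before mapping clusters is omitted.

```python
def local_moran(values, neighbours):
    """Local Moran's I and quadrant label per area.

    neighbours[i] lists the indices adjacent to area i (row-standardized weights).
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n
    results = []
    for i in range(n):
        # Spatial lag: average deviation of the neighbouring areas.
        lag = sum(z[j] for j in neighbours[i]) / len(neighbours[i])
        statistic = z[i] * lag / m2
        if z[i] > 0 and lag > 0:
            label = "High-High"
        elif z[i] < 0 and lag < 0:
            label = "Low-Low"
        elif z[i] > 0:
            label = "High-Low"
        else:
            # Also catches the degenerate case z[i] == 0.
            label = "Low-High"
        results.append((statistic, label))
    return results


# Six hypothetical counties along a chain, with incidence per 100,000.
incidence = [8.0, 9.0, 10.0, 1.0, 2.0, 3.0]
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
lisa = local_moran(incidence, adjacency)
labels = [label for _, label in lisa]
```

Areas labelled High-High have above-average incidence and above-average neighbours, which corresponds to the hot spots reported for Hangzhou and Ningbo once significance has been established.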


Subjects
Communicable Diseases , Dysentery , Typhoid Fever , Male , Humans , Bayes Theorem , China/epidemiology , Communicable Diseases/epidemiology
13.
BMC Public Health ; 23(1): 1612, 2023 08 24.
Article in English | MEDLINE | ID: mdl-37612693

ABSTRACT

BACKGROUND: Child mortality is a major challenge to public health in Pakistan and other developing countries. Reduction of the child mortality rate would improve public health and enhance human well-being and prosperity. This study identifies the spatial clusters of child mortality across districts of Pakistan and the direct and spatial spillover effects of determinants on the Child Mortality Rate (CMR). METHOD: Data from the multiple indicators cluster survey (MICS) conducted by the United Nations International Children's Emergency Fund (UNICEF) were used to study the CMR. We used spatial univariate autocorrelation to test the spatial dependence between contiguous districts concerning CMR. We also applied the Spatial Durbin Model (SDM) to measure the spatial spillover effects of factors on CMR. RESULTS: The study results showed 31% significant spatial association across the districts and identified a cluster of hot spots characterized by the high-high CMR in the districts of Punjab province. The empirical analysis of the SDM confirmed that the direct and spatial spillover effects of the poorest wealth quintile and MPI vulnerability on CMR are positive, whereas access to postnatal care for newborn children and improved drinking water have negatively (directly and indirectly) affected the CMR in Pakistan. CONCLUSION: The results indicate that spatial dependence and significant spatial spillover effects concerning CMR exist across districts. Prioritization of the hot spot districts characterized by higher CMR can significantly reduce the CMR with improvement in financial statuses of households from the poorest quintile and MPI vulnerability as well as improvement in accessibility to postnatal care services and safe drinking water.
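
For readers unfamiliar with the Spatial Durbin Model named in the abstract above, its textbook form is given below. The notation is an assumption for illustration (the abstract does not give the authors' specification): y is the vector of district CMRs, W a row-standardized contiguity matrix, X the covariates, and rho the spatial autoregressive parameter.

```latex
% Spatial Durbin Model (standard form; notation assumed, not taken from the paper)
y = \rho W y + X \beta + W X \theta + \varepsilon,
\qquad \varepsilon \sim N(0, \sigma^{2} I_n)

% Effect of covariate k on y across all districts
S_k = (I_n - \rho W)^{-1} \left( I_n \beta_k + W \theta_k \right)

% Direct effect: average own-district impact; indirect (spillover) effect: the remainder
\text{direct}_k = \tfrac{1}{n} \operatorname{tr}(S_k),
\qquad
\text{indirect}_k = \tfrac{1}{n} \mathbf{1}^{\top} S_k \mathbf{1} - \text{direct}_k
```

Under this formulation a covariate such as postnatal care access can lower mortality in its own district (direct effect) and in neighbouring districts (indirect effect), which is the distinction the results above draw.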


Subjects
Child Mortality , Drinking Water , Child , Pregnancy , Female , Humans , Pakistan/epidemiology , Parturition , Poverty
14.
J Environ Manage ; 326(Pt B): 116715, 2023 Jan 15.
Article in English | MEDLINE | ID: mdl-36403464

ABSTRACT

The increasing environmental pressure of anthropogenic CO2 emissions is impeding the sustainability of urban agglomerations (UAs). Recent research has shown that the spatial clustering of UA elements reduces CO2 emissions but underestimates its impact on vegetation carbon sequestration. Using an extended IPAT equation analysis framework and the Logarithmic Mean Divisia Index decomposition approach, this study revealed the positive effects of the economy and population spatial clustering on carbon footprint pressure (CFP) mitigation. Specifically, improving economic spatial clustering mitigated the rise in UA's CFP caused by affluence and population growth. Furthermore, population clustering in core cities effectively mitigated CFP in neighboring cities. Additionally, we found that the efficiency improvement, i.e., the decrease in the ratio of carbon emissions and gross domestic product, should be the dominant driver of CFP mitigation, followed by improved vegetation carbon sequestration. However, these drivers have limited future potential. We believe that by improving UA's spatial clustering of the economy and population, future urban environmental pressures and climate risks will be mitigated.
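
The Logarithmic Mean Divisia Index (LMDI) decomposition mentioned in the abstract above can be illustrated on the classic three-factor IPAT identity (impact = population x affluence x intensity). The sketch below is a minimal additive LMDI under that assumption; the factor names and numbers are hypothetical, and the paper's extended framework with spatial clustering and carbon footprint pressure terms is not reproduced.

```python
import math


def log_mean(a, b):
    """Logarithmic mean L(a, b) = (a - b) / (ln a - ln b), with L(a, a) = a."""
    return a if a == b else (a - b) / (math.log(a) - math.log(b))


def lmdi_additive(base, final):
    """Split the change in a product of factors into one additive effect per factor."""
    impact_base = math.prod(base.values())
    impact_final = math.prod(final.values())
    weight = log_mean(impact_final, impact_base)
    return {name: weight * math.log(final[name] / base[name]) for name in base}


# Hypothetical city: emissions = population * GDP per capita * emissions per unit GDP.
year_0 = {"population": 100.0, "affluence": 2.0, "intensity": 0.5}
year_1 = {"population": 110.0, "affluence": 2.4, "intensity": 0.4}
effects = lmdi_additive(year_0, year_1)
total_change = sum(effects.values())
```

The effects add up exactly to the total change in emissions with no residual, and in this toy case the negative intensity effect partly offsets growth in population and affluence, mirroring the efficiency-driven mitigation described above.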


Subjects
Carbon Dioxide , Carbon Footprint , Carbon Dioxide/analysis , Cities , Spatial Analysis , Carbon , Cluster Analysis , China , Economic Development
15.
Behav Res Methods ; 55(8): 4086-4098, 2023 12.
Article in English | MEDLINE | ID: mdl-36357762

ABSTRACT

Synesthesia is a phenomenon where sensory stimuli or cognitive concepts elicit additional perceptual experiences. For instance, in a commonly studied type of synesthesia, stimuli such as words written in black font elicit experiences of other colors, e.g., red. In order to objectively verify synesthesia, participants are asked to choose colors for repeatedly presented stimuli and the consistency of their choices is evaluated (consistency test). Previously, there has been no publicly available and easy-to-use tool for analyzing consistency test results. Here, the R package synr is introduced, which provides an efficient interface for exploring consistency test data and applying common procedures for analyzing them. Importantly, synr also implements a novel method enabling identification of participants whose scores cannot be interpreted, e.g., who only give black or red color responses. To this end, density-based spatial clustering of applications with noise (DBSCAN) is applied in conjunction with a measure of spread in 3D space. An application of synr with pre-existing openly accessible data illustrating how synr is used in practice is presented. Also included is a comparison of synr's data validation procedure and human ratings, which found that synr had high correspondence with human ratings and outperformed human raters in situations where human raters were easily misled. Challenges for widespread adoption of synr as well as suggestions for using synr within the field of synesthesia and other areas of psychological research are discussed.
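
To show how DBSCAN can expose a participant who only gives black or red responses, the sketch below implements the textbook algorithm on points in a 3D color space. It is written in Python for illustration only and is not synr's R implementation; the `eps` and `min_pts` values and the normalized RGB responses are hypothetical, and the additional spread measure mentioned in the abstract is not included.

```python
import math


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    labels = [None] * len(points)

    def region(i):
        # All points within eps of point i, including i itself.
        return [j for j in range(len(points)) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region(i)
        if len(neighbours) < min_pts:
            labels[i] = -1  # provisionally noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = region(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, so keep expanding
    return labels


# Hypothetical color responses in normalized RGB: near-black, near-red, one stray gray.
responses = [
    (0.00, 0.00, 0.00), (0.02, 0.00, 0.00), (0.00, 0.02, 0.00), (0.01, 0.01, 0.00),
    (1.00, 0.00, 0.00), (0.98, 0.00, 0.00), (1.00, 0.02, 0.00), (0.99, 0.01, 0.00),
    (0.50, 0.50, 0.50),
]
cluster_labels = dbscan(responses, eps=0.05, min_pts=3)
```

A participant whose responses collapse into one or two tight clusters like this carries little information about color consistency, which is the situation the validation procedure described above is designed to flag.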


Subjects
Perceptual Disorders , Humans , Synesthesia , Color Perception/physiology
16.
BMC Bioinformatics ; 23(1): 187, 2022 May 17.
Article in English | MEDLINE | ID: mdl-35581558

ABSTRACT

The rapid global spread and dissemination of SARS-CoV-2 has provided the virus with numerous opportunities to develop several variants. Thus, it is critical to determine the degree of the variations and in which part of the virus those variations occurred. Therefore, in this study, methods that could be used to vectorize the sequence data, perform clustering analysis, and visualize the results were proposed using machine learning methods. To conduct this study, a total of 224,073 cases of SARS-CoV-2 sequence data were collected through NCBI and GISAID, and the data were visualized using dimensionality reduction and clustering analysis models such as t-SNE and DBSCAN. The SARS-CoV-2 virus, which was first detected, was distinguished from different variations, including Omicron and Delta, in the cluster results. Furthermore, it was possible to examine which codon changes in the spike protein caused the variants to be distinguished using feature importance extraction models such as Random Forest or Shapley value. The proposed method has the advantage of being able to analyse and visualize a large amount of data at once compared to the existing tree-based sequence data analysis. The proposed method was able to identify and visualize significant changes between the SARS-CoV-2 virus, which was first detected in Wuhan, China, in December 2019, and the newly formed mutant virus group. As a result of clustering analysis using sequence data, it was possible to confirm the formation of clusters among various variants in a two-dimensional graph, and by extracting the importance of variables, it was possible to confirm which codon changes played a major role in distinguishing variants. Furthermore, since the proposed method can handle a variety of data sequences, it can be used for all kinds of diseases, including influenza and SARS-CoV-2. Therefore, the proposed method has the potential to become widely used for the effective analysis of disease variations.


Subjects
COVID-19 , Magnoliopsida , Cluster Analysis , Codon , Machine Learning , SARS-CoV-2/genetics
17.
J Med Virol ; 94(11): 5354-5362, 2022 11.
Article in English | MEDLINE | ID: mdl-35864556

ABSTRACT

The Omicron variant was first reported to the World Health Organization (WHO) from South Africa on November 24, 2021; this variant is spreading rapidly worldwide. No study has conducted a spatiotemporal analysis of the morbidity of Omicron infection at the country level; hence, to explore the spatial transmission of the Omicron variant among 220 countries worldwide, we aimed to analyze its spatial autocorrelation and to conduct a multiple linear regression to investigate the underlying factors associated with the pandemic. This study was an ecological study. Data on the number of confirmed cases were extracted from the WHO website. The spatiotemporal characteristics were described in a thematic map. The Global Moran Index (Moran's I) was used to detect the spatial autocorrelation, while the local indicators of spatial association (LISA) were used to analyze the local spatial correlation characteristics. The joinpoint regression model was used to explore the change in the trend of the Omicron incidence over time. The association between the morbidity of Omicron and influencing factors was analyzed using multiple linear regression. The value of Moran's I was positive (Moran's I = 0.061, Z-score = 3.772, p = 0.007), indicating a spatial correlation of the morbidity of Omicron at the country level. From November 26, 2021 to February 26, 2022, the morbidity showed obvious spatial clustering. Hotspot clustering was observed mostly in Europe (locations in High-High category: 24). Coldspot clustering was observed mostly in Africa and Asia (locations in Low-Low category: 32). The result of joinpoint regression showed an increasing trend from December 21, 2021 to January 26, 2022. Results of the multiple linear regression analysis demonstrated that the morbidity of Omicron was strongly positively correlated with income support (coefficient = 1.905, 95% confidence interval [CI]: 1.354 to 2.456, p < 0.001) and strongly negatively correlated with close public transport (coefficient = -1.591, 95% CI: -2.461 to -0.721, p = 0.001). Omicron outbreaks exhibited spatial clustering at the country level worldwide; countries with higher disease morbidity could affect the countries surrounding and close to them. Locations in the High-High clustering category, that is, countries with higher disease morbidity, were mainly observed in Europe, and their adjoining countries also showed high spatial clustering. The morbidity of Omicron increased from December 21, 2021 to January 26, 2022. The higher morbidity of Omicron was associated with the economic and policy interventions implemented; hence, to deal with the epidemic, the prevention and control measures should be strengthened in all aspects.
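
As an illustration of the global statistic named in this abstract, the sketch below computes Moran's I for a handful of regions. It is a minimal example under stated assumptions, not the study's analysis: the function name `morans_i`, the binary adjacency weights, and the toy values are all hypothetical, and no significance test (Z-score or p-value) is included.

```python
def morans_i(values, weights):
    """Global Moran's I for region values and a spatial weight matrix.

    values  : list of floats, one per region
    weights : n x n list of lists; weights[i][j] > 0 when regions i and j
              are treated as neighbours (assumed binary adjacency here)
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations, counted only for neighbouring pairs.
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * num / den


# Toy example (hypothetical): four regions arranged in a line, 0-1-2-3.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = [1.0, 1.0, 5.0, 5.0]    # similar values sit next to each other
alternating = [1.0, 5.0, 1.0, 5.0]  # neighbours always differ

print(morans_i(clustered, chain))    # positive, about 0.33
print(morans_i(alternating, chain))  # negative, about -1.0
```

A positive value indicates that neighbouring regions tend to have similar values, which is the pattern the abstract reports at the country level; a negative value indicates that neighbours tend to differ.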


Subjects
Disease Outbreaks , Pandemics , Cluster Analysis , Humans , Incidence , South Africa/epidemiology , Spatio-Temporal Analysis
18.
Biometrics ; 78(2): 536-547, 2022 06.
Article in English | MEDLINE | ID: mdl-33544886

ABSTRACT

In this work, we propose a new Bayesian spatial homogeneity pursuit method for survival data under the proportional hazards model to detect spatially clustered patterns in baseline hazard and regression coefficients. Specifically, regression coefficients and baseline hazard are assumed to have a spatial homogeneity pattern. To capture such homogeneity, we develop a geographically weighted Chinese restaurant process prior to simultaneously estimate coefficients and baseline hazards and their uncertainty measures. An efficient Markov chain Monte Carlo (MCMC) algorithm is designed for our proposed methods. Performance is evaluated using simulated data, and the method is further applied to a real data analysis of respiratory cancer in the state of Louisiana.


Subjects
Neoplasms , Bayes Theorem , Humans , Markov Chains , Monte Carlo Method , Proportional Hazards Models
19.
Environ Res ; 212(Pt D): 113569, 2022 09.
Article in English | MEDLINE | ID: mdl-35636466

ABSTRACT

Monitoring of microplastics in environmental samples is relevant to the scientific world, as well as to environmental agencies and water authorities, in particular considering increasing efforts to decrease emissions and the growing concern of governments and the public. Therefore, rapid and accurate detection and identification of microplastics including polymers, despite their degradation in the environment, is crucial. The degradation has a significant impact on the infrared spectra of the microplastics and can impede the identification process. This work presents a novel approach to addressing the problem of identification of weathered microplastics. A quantum cascade laser (LDIR) was used to record the infrared spectra of various polymeric particles (81,291 individual particles). Using a combination of pristine and weathered particles, two supervised machine learning (ML) models, namely Subspace k-Nearest Neighbor (Sub-kNN) and Boosted Decision Tree (BDT), were trained to recognize the spectrum characteristics of labeled particles and then used to identify unlabeled samples, with identification accuracies of 89.7% and 77.1%, respectively, under 10-fold cross-validation. About 90% of the samples could be identified via the Sub-kNN or BDT models. Subsequently, an unsupervised ML model, namely Density-based Spatial Clustering of Applications with Noise (DBSCAN), was used to cluster samples that could not be labeled by the supervised ML models. This enabled the detection of additional subgroups of microplastics. Manual labeling can then be carried out on a selection of spectra per group (e.g., centroids of each cluster), hence accelerating the identification process and allowing new labeled samples to be added to the initial supervised ML models. Although expert efforts are still needed, the proposed method greatly lowers labeling efforts by using the combined supervised and unsupervised learning models. In the future, the use of deep neural networks could further boost the implementation of these kinds of approaches for polymer and microplastic identification in environmental settings.
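
To make the unsupervised step concrete, the following is a minimal sketch of the standard DBSCAN procedure on toy 2-D points. It is not the authors' pipeline: real inputs would be infrared spectra rather than 2-D coordinates, and the function name `dbscan`, the Euclidean distance, and the values of `eps` and `min_pts` are assumptions chosen for illustration.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def dbscan(points, eps, min_pts):
    """Return one cluster label per point (0, 1, ... or NOISE)."""
    labels = [None] * len(points)

    def neighbors(i):
        # All points within eps of point i (including i itself).
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # core point: keep growing the cluster
                queue.extend(nbrs)
    return labels


# Toy data (hypothetical): two tight groups and one isolated point.
points = [(0, 0), (0, 0.1), (0.1, 0),
          (5, 5), (5, 5.1), (5.1, 5),
          (10, 0)]
print(dbscan(points, eps=0.3, min_pts=3))  # [0, 0, 0, 1, 1, 1, -1]
```

Because DBSCAN needs no preset number of clusters and leaves outliers unlabeled, it suits the setting described in the abstract, where the leftover samples may contain an unknown number of subgroups.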


Subjects
Microplastics , Chemical Water Pollutants , Environmental Monitoring/methods , Semiconductor Lasers , Machine Learning , Plastics , Polymers , Chemical Water Pollutants/analysis
20.
Health Care Manag Sci ; 25(4): 574-589, 2022 Dec.
Article in English | MEDLINE | ID: mdl-35732967

ABSTRACT

Many public health policymaking questions involve data subsets representing application-specific attributes and geographic location. We develop and evaluate standard and tailored techniques for clustering via unsupervised learning (UL) algorithms on such amalgamated (dual-domain) data sets. The aim of the associated algorithms is to identify geographically efficient clusters that also maximize the number of statistically significant differences in disease incidence and demographic variables across top clusters. Two standard UL approaches, k-means with k++ initialization (k++) and the standard self-organizing map (SSOM), are considered along with a new, tailored version of the SOM (TSOM). The TSOM algorithm involves optimization of a customized objective function with terms promoting individual geographic cluster cohesion while also maximizing the number of differences across clusters, and two hyper-parameters controlling the relative weighting of geographic and attribute subspaces in a non-Euclidean distance measure within the clustering problem. The performance of these three techniques (k++, SSOM, TSOM) is compared and evaluated in the context of a data set for colorectal cancer incidence in the state of California, at the level of individual counties. Clusters are visualized via choropleth maps, and ordered graphs are also used to illustrate disparities in disease incidence among four identity groups. While all three approaches performed well, the TSOM identified the largest number of disease and demographic disparities while also yielding more geographically efficient top clusters. Techniques presented in this study are relevant to applications including the delivery of health care resources and identifying disparities among identity groups, and to questions involving coordination between county- and state-level policymakers.
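
For readers unfamiliar with the k++ baseline mentioned in this abstract, the sketch below shows the usual k-means++ seeding step: the first center is drawn uniformly, and each later center is drawn with probability proportional to its squared distance from the nearest center already chosen. This is a generic illustration, not the study's code; the function name `kmeans_pp_init`, the Euclidean distance, and the toy coordinates are assumptions, and the input is assumed to contain at least `k` distinct points.

```python
import math
import random


def kmeans_pp_init(points, k, seed=0):
    """Pick k initial centers from points using k-means++ seeding."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]  # first center: uniform at random
    while len(centers) < k:
        # Squared distance from each point to its nearest chosen center.
        d2 = [min(math.dist(p, c) ** 2 for c in centers) for p in points]
        # Far-away points are more likely to be picked; points already
        # chosen have weight zero and cannot be picked again.
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers


# Toy data (hypothetical): three loose groups of 2-D points.
points = [(0.0, 0.0), (0.1, 0.0),
          (5.0, 5.0), (5.1, 5.0),
          (10.0, 0.0), (10.0, 0.1)]
centers = kmeans_pp_init(points, k=3, seed=1)
print(centers)
```

The seeds would then be passed to ordinary k-means iterations. Spreading the initial centers in this way usually reduces the chance of a poor local optimum compared with purely uniform initialization.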


Subjects
Colorectal Neoplasms , Unsupervised Machine Learning , Humans , Incidence , Cluster Analysis , Algorithms , Colorectal Neoplasms/epidemiology