ABSTRACT
INTRODUCTION: Average age is increasing worldwide, raising the public health burden of age-related diseases, as more resources will be required to manage treatments. Phenotypic Age is a score that can provide an estimate of the probability of developing aging-related conditions, and such conditions could be prevented more efficiently by studying the mechanisms that lead to an increased phenotypic age. The objective of this study is to characterize the mechanisms that lead to aging acceleration from the interactions among socio-demographic factors, health predispositions and biological phenotypes. METHODS: We present an approach based on the combination of mediation analysis and structural equation models (SEM) to better characterize these mechanisms, quantifying the interactions between biological and external factors and the effects of preexisting health conditions and socioeconomic disparities. We use two independent cohorts of the NHANES dataset. We use the larger cohort (n = 13,186) to select the variables that enlarge the gap between phenotypic and chronological ages; we then create a SEM based on nested linear regressions to quantify the influence of all sociodemographic variables, expressed in three latent variables indicating ethnicity, socioeconomic status and preexisting health status. We then replicate the model and apply it to the second cohort (n = 4,425) to compare the results. RESULTS: Results show that phenotypic age increases with poor glucose control or obesity-related biomarkers, especially if combined with a low socioeconomic status or the presence of chronic or vascular diseases, and provide a framework to quantify these relationships. Black ethnicity, low income/education and a history of chronic diseases are also associated with a higher phenotypic age. 
Although these findings are already known in the literature, the proposed SEM-based framework provides a useful tool to assess the combinations of these heterogeneous factors from a quantitative point of view. CONCLUSION: In an aging society, phenotypic age is an important metric that can be used to estimate individual health risk; however, its value is influenced by a myriad of external factors, both biological and sociodemographic. The framework proposed in this paper can help quantify the combined effects of these factors and be a starting point for the creation of personalized prevention and intervention strategies.
ABSTRACT
We aim to measure and explain the perception of community resilience in Romania. We use survey data from a country-representative sample of 1500 respondents. We rely on factor-based partial least squares path modeling to measure five reflective latent constructs from a CCRAM-type questionnaire. We use these constructs to extract a second-order formative latent construct representing an overall measure of community resilience. Next, we use three sub-dimensions of family resilience, along with individual resilience and several control variables to explain community resilience. Among the five sub-dimensions of the overall measure of community resilience, social trust exerts the highest contribution, followed by place attachment. The predictors of community resilience with the largest effect sizes are the three sub-dimensions of family resilience. The policies geared towards increasing community resilience might not be able to address the most important factors, at least in the case of Romania, because they pertain to informal group interaction, and lie outside the reach of formal administrative authority.
ABSTRACT
Clustered current status data frequently occur in many fields of survival studies. Some potential factors related to the hazards of interest cannot be directly observed but are characterized through multiple correlated observable surrogates. In this article, we propose a joint modeling method for regression analysis of clustered current status data with latent variables and potentially informative cluster sizes. The proposed models consist of a factor analysis model to characterize latent variables through their multiple surrogates and an additive hazards frailty model to investigate covariate effects on the failure time and incorporate intra-cluster correlations. We develop an estimation procedure that combines the expectation-maximization algorithm and the weighted estimating equations. The consistency and asymptotic normality of the proposed estimators are established. The finite-sample performance of the proposed method is assessed via a series of simulation studies. This procedure is applied to analyze clustered current status data from the National Toxicology Program on a tumorigenicity study given by the United States Department of Health and Human Services.
ABSTRACT
Learning individualized treatment rules (ITRs) for a target patient population with mental disorders is confronted with many challenges. First, the target population may be different from the training population that provided data for learning ITRs. Ignoring differences between the training patient data and the target population can result in sub-optimal treatment strategies for the target population. Second, for mental disorders, a patient's underlying mental state is not observed but can be inferred from measures of high-dimensional combinations of symptomatology. Treatment mechanisms are unknown and can be complex, and thus treatment effect moderation can take complicated forms. To address these challenges, we propose a novel method that connects measurement models, efficient weighting schemes, and flexible neural network architecture through latent variables to tailor treatments for a target population. Patients' underlying mental states are represented by a compact set of latent state variables while preserving interpretability. Weighting schemes are designed based on lower-dimensional latent variables to efficiently balance population differences so that biases in learning the latent structure and treatment effects are mitigated. Extensive simulation studies demonstrated consistent superiority of the proposed method and the weighting approach. Applications to two real-world studies of patients with major depressive disorder have shown a broad utility of the proposed method in improving treatment outcomes in the target population.
ABSTRACT
For personalized medicine, we propose a general method of evaluating the potential performance of an individualized treatment rule in future clinical applications with new patients. We focus on rules that choose the most beneficial treatment for the patient out of two active (nonplacebo) treatments, which the clinician will prescribe regularly to the patient after the decision. We develop a measure of the individualization potential (IP) of a rule. The IP compares the expected effectiveness of the rule in a future clinical individualization setting versus the effectiveness of not trying individualization. We illustrate our evaluation method by explaining how to measure the IP of a useful type of individualized rules calculated through a new parametric interaction model of data from parallel-group clinical trials with continuous responses. Our interaction model implies a structural equation model we use to estimate the rule and its IP. We examine the IP both theoretically and with simulations when the estimated individualized rule is put into practice in new patients. Our individualization approach was superior to outcome-weighted machine learning according to simulations. We also show connections with crossover and N-of-1 trials. As a real data application, we estimate a rule for the individualization of treatments for diabetic macular edema and evaluate its IP.
Subjects
Statistical Models, Precision Medicine, Humans, Clinical Trials as Topic, Computer Simulation, Machine Learning, Diabetic Retinopathy/drug therapy
ABSTRACT
This paper tackles the challenge of estimating correlations between higher-level biological variables (e.g. proteins and gene pathways) when only lower-level measurements are directly observed (e.g. peptides and individual genes). Existing methods typically aggregate lower-level data into higher-level variables and then estimate correlations based on the aggregated data. However, different data aggregation methods can yield varying correlation estimates as they target different higher-level quantities. Our solution is a latent factor model that directly estimates these higher-level correlations from lower-level data without the need for data aggregation. We further introduce a shrinkage estimator to ensure the positive definiteness and improve the accuracy of the estimated correlation matrix. Furthermore, we establish the asymptotic normality of our estimator, enabling efficient computation of P-values for the identification of significant correlations. The effectiveness of our approach is demonstrated through comprehensive simulations and the analysis of proteomics and gene expression datasets. We develop the R package highcor for implementing our method.
ABSTRACT
Studying protein isoforms is an essential step in biomedical research; at present, the main approach for analyzing proteins is bottom-up mass spectrometry proteomics, which returns peptide identifications that are indirectly used to infer the presence of protein isoforms. However, the detection and quantification processes are noisy; in particular, peptides may be erroneously detected, and most peptides, known as shared peptides, are associated with multiple protein isoforms. As a consequence, studying individual protein isoforms is challenging, and inferred protein results are often abstracted to the gene level or to groups of protein isoforms. Here, we introduce IsoBayes, a novel statistical method to perform inference at the isoform level. Our method enhances the information available by integrating mass spectrometry proteomics and transcriptomics data in a Bayesian probabilistic framework. To account for the uncertainty in the measurement process, we propose a two-layer latent variable approach: first, we sample whether a peptide has been correctly detected (or, alternatively, filter peptides); second, we allocate the abundance of the selected peptides across the protein(s) they are compatible with. This enables us, starting from peptide-level data, to recover protein-level data; in particular, we: i) infer the presence/absence of each protein isoform (via a posterior probability), ii) estimate its abundance (and credible interval), and iii) target isoforms where transcript and protein relative abundances significantly differ. We benchmarked our approach in simulations and in two multi-protease real datasets: our method displays good sensitivity and specificity when detecting protein isoforms, its estimated abundances correlate highly with the ground truth, and it can detect changes between protein and transcript relative abundances. IsoBayes is freely distributed as a Bioconductor R package and is accompanied by an example usage vignette.
ABSTRACT
Although transcriptomics data is typically used to analyze mature spliced mRNA, recent attention has focused on jointly investigating spliced and unspliced (or precursor-) mRNA, which can be used to study gene regulation and changes in gene expression production. Nonetheless, most methods for spliced/unspliced inference (such as RNA velocity tools) focus on individual samples, and rarely allow comparisons between groups of samples (e.g., healthy vs. diseased). Furthermore, this kind of inference is challenging because spliced and unspliced mRNA abundance is characterized by a high degree of quantification uncertainty, due to the prevalence of multi-mapping reads, i.e., reads compatible with multiple transcripts (or genes), and/or with both their spliced and unspliced versions. Here, we present DifferentialRegulation, a Bayesian hierarchical method to discover changes between experimental conditions with respect to the relative abundance of unspliced mRNA (over the total mRNA). We model the quantification uncertainty via a latent variable approach, where reads are allocated to their gene/transcript of origin, and to the respective splice version. We designed several benchmarks where our approach shows good performance, in terms of sensitivity and error control, vs. state-of-the-art competitors. Importantly, our tool is flexible, and works with both bulk and single-cell RNA-sequencing data. DifferentialRegulation is distributed as a Bioconductor R package.
Subjects
Bayes Theorem, Humans, Messenger RNA/genetics, Gene Expression Profiling/methods, RNA Splicing/genetics, Gene Expression Regulation, Statistical Models
ABSTRACT
BACKGROUND: This study takes on the challenge of quantifying a complex causal loop diagram describing how poverty and health affect each other, and does so using longitudinal data from The Netherlands. Furthermore, this paper elaborates on its methodological approach in order to facilitate replication and methodological advancement. METHODS: After adapting a causal loop diagram that was built by stakeholders, a longitudinal structural equation modelling approach was used. A cross-lagged panel model with nine endogenous variables, two of which are latent variables, and three time-invariant exogenous variables was constructed. With this model, directional effects are estimated in a Granger-causal manner, using data from 2015 to 2019. Both the direct effects (with a one-year lag) and total effects over multiple (up to eight) years were calculated. Five sensitivity analyses were conducted. Two of these focus on lower-income and lower-wealth individuals. The other three each added one exogenous variable: work status, level of education, and home ownership. RESULTS: The effects of income and financial wealth on health are present, but are relatively weak for the overall population. Sensitivity analyses show that these effects are stronger for those with lower incomes or wealth. Physical capability does seem to have strong positive effects on both income and financial wealth. There are a number of other results as well, as the estimated models are extensive. Many of the estimated effects only become substantial after several years. CONCLUSIONS: Income and financial wealth appear to have limited effects on the health of the overall population of The Netherlands. However, there are indications that these effects may be stronger for individuals who are closer to the poverty threshold. 
Since the estimated effects of physical capability on income and financial wealth are more substantial, a broad recommendation would be that including physical capability in efforts that are aimed at improving income and financial wealth could be useful and effective. The methodological approach described in this paper could also be applied to other research settings or topics.
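The distinction drawn above between one-year direct effects and effects accumulated over several years can be illustrated with a small sketch. It assumes a time-invariant matrix of lagged coefficients, in which case the effect carried forward over k years is the k-th power of that matrix. The two variables and all coefficient values below are hypothetical and are not taken from the paper's fitted model.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def effects_after(lagged, years):
    """Return the matrix of effects carried forward over `years` one-year lags."""
    n = len(lagged)
    # Start from the identity matrix (zero lags means no propagation yet).
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(years):
        result = mat_mul(lagged, result)
    return result


# Hypothetical coefficients: entry [i][j] is the effect of variable j in year t
# on variable i in year t + 1 (variable order: income, health).
LAGGED = [[0.5, 0.1],
          [0.2, 0.5]]

two_year = effects_after(LAGGED, 2)
# two_year[1][0] is the effect of income in year t on health in year t + 2,
# summed over every path through the intermediate year.
```

In this toy example the one-year effect of income on health is 0.2, and the two-year effect sums the paths through each variable in the intermediate year, which is one way an effect can remain visible or grow in relative importance at longer lags.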
Subjects
Poverty, Humans, Netherlands, Longitudinal Studies, Latent Class Analysis, Female, Male, Income, Health Status, Adult, Middle Aged
ABSTRACT
We consider measurement error models for two variables observed repeatedly and subject to measurement error. One variable is continuous, while the other variable is a mixture of continuous and zero measurements. This second variable has two sources of zeros. The first source is episodic zeros, wherein some of the measurements for an individual may be zero and others positive. The second source is hard zeros, i.e., some individuals will always report zero. An example is the consumption of alcohol from alcoholic beverages: some individuals consume alcoholic beverages episodically, while others never consume alcoholic beverages. However, with a small number of repeat measurements from individuals, it is not possible to determine those who are episodic zeros and those who are hard zeros. We develop a new measurement error model for this problem, and use Bayesian methods to fit it. Simulations and data analyses are used to illustrate our methods. Extensions to parametric models and survival analysis are discussed briefly.
Subjects
Bayes Theorem, Statistical Models, Humans, Computer Simulation, Survival Analysis, Alcohol Drinking, Statistical Data Interpretation
ABSTRACT
Species-to-species and species-to-environment interactions are key drivers of community dynamics. Disentangling these drivers in species-rich assemblages is challenging due to the high number of potentially interacting species (the 'curse of dimensionality'). We develop a process-based model that quantifies how intraspecific and interspecific interactions, and species' covarying responses to environmental fluctuations, jointly drive community dynamics. We fit the model to reef fish abundance time series from 41 reefs of Australia's Great Barrier Reef. We found that fluctuating relative abundances are driven by species' heterogeneous responses to environmental fluctuations, whereas interspecific interactions are negligible. Species differences in long-term average abundances are driven by interspecific variation in the magnitudes of both conspecific density-dependence and density-independent growth rates. This study introduces a novel approach to overcoming the curse of dimensionality, which reveals highly individualistic dynamics in coral reef fish communities that imply a high level of niche structure.
Subjects
Anthozoa, Coral Reefs, Animals, Fishes/physiology, Species Specificity, Time Factors, Anthozoa/physiology, Biodiversity
ABSTRACT
Open dumping is the prevailing municipal solid waste (MSW) disposal technique in India. Unsanitary landfills release leachate that contaminates valuable groundwater. Hence, the present study was carried out in the vicinity of the Saduperi open dumpsite, Vellore, Tamil Nadu, India, to explore the key factors that influence groundwater contamination. A total of 216 groundwater samples were collected between May 2021 and April 2022. These samples were categorised into four seasons: summer, southwest monsoon (SWM), northeast monsoon (NEM), and winter. Pollution indices such as the Leachate Pollution Index (LPI) and the Heavy Metal Pollution Index (HPI) were used to evaluate the contamination potential. The calculated LPI > 35 in all seasons indicates the prevailing poor environmental condition. It was observed that about 56% of the sampling sites were affected by heavy metals such as Cd, Cr, and Ni. The HPI value was found to be more than the critical value of 100 in the 10 sampling wells for all seasons. Partial least squares-structural equation modelling (PLS-SEM) was also carried out in this study to link latent variables such as 'IOT Parameters', 'Leachate Parameters', 'Heavy Metal', and 'Groundwater Quality', with the links quantified by the resulting R² values. The R² values for the sampling wells ahead of the dumpsite and along the direction of groundwater flow range from 24.7 to 86.5% in comparison to the wells located behind the dumpsite, which are prone to more contamination due to migration of leachate. Hence, the present study shows various influencing factors that affect groundwater quality.
Subjects
Environmental Monitoring, Groundwater, Heavy Metals, Chemical Water Pollutants, Groundwater/chemistry, India, Chemical Water Pollutants/analysis, Environmental Monitoring/methods, Heavy Metals/analysis, Water Quality, Seasons
ABSTRACT
Causal-formative indicators are often used in social science research. To achieve identification in causal-formative indicator modeling, constraints need to be applied. A conventional method is to constrain the weight of a formative indicator to be 1. The selection of which indicator has the fixed weight, however, may influence statistical inferences of the structural path coefficients from the causal-formative construct to outcomes. Another conventional method is to use equal weights (e.g., 1) and assume that all indicators contribute equally to the latent construct, which can be a strong assumption. To address the limitations of the conventional methods, we proposed an alternative constraint method, in which the sum of the weights is constrained to be a constant. We analytically studied the relations and interpretations of structural path coefficients from the constraint methods, and the results showed that the proposed method yields better interpretations of path coefficients. Simulation studies were conducted to compare the performance of the weight constraint methods in causal-formative indicator modeling with one or two outcomes. Results showed that higher biases in the path coefficient estimates were observed from the conventional methods compared to the proposed method. The proposed method had negligible bias and satisfactory coverage rates in the studied conditions. This study emphasizes the importance of using an appropriate weight constraint method in causal-formative indicator modeling.
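The role of the weight constraint can be illustrated with a short sketch of the underlying scale indeterminacy. It assumes the simplest case of a composite with no disturbance term and a single outcome, so that multiplying all weights by a constant and dividing the path coefficient by the same constant leaves every weight-times-path product unchanged. The function name and the numeric values are hypothetical, and this is not the authors' estimation procedure.

```python
def rescale_to_sum(weights, path, total=1.0):
    """Rescale formative weights to sum to `total`, adjusting the path to match.

    Each product weight * path is preserved, so the implied effect of every
    indicator on the outcome does not change; only the scaling convention does.
    """
    current = sum(weights)
    if current == 0:
        raise ValueError("weights must not sum to zero")
    factor = total / current
    new_weights = [w * factor for w in weights]
    new_path = path / factor
    return new_weights, new_path


# Hypothetical solution under the "fix the first weight to 1" convention.
fixed_weights = [1.0, 0.5, 0.5]
fixed_path = 0.4

sum_weights, sum_path = rescale_to_sum(fixed_weights, fixed_path, total=1.0)
```

Here the path coefficient changes from 0.4 to 0.8 purely because of the scaling convention, which is why the interpretation of a path coefficient depends on the constraint that was chosen.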
Subjects
Statistical Models, Humans, Computer Simulation, Causality, Social Sciences/methods, Statistical Data Interpretation
ABSTRACT
Spearman (Am J Psychol 15(1):201-293, 1904. https://doi.org/10.2307/1412107) marks the birth of factor analysis. Many articles and books have extended his landmark paper by permitting multiple factors and determining the number of factors, developing ideas about simple structure and factor rotation, and distinguishing between confirmatory and exploratory factor analysis (CFA and EFA). We propose a new model-implied instrumental variable (MIIV) approach to EFA that allows intercepts for the measurement equations, correlated common factors, correlated errors, standard errors of factor loadings and measurement intercepts, overidentification tests of equations, and a procedure for determining the number of factors. We also permit simpler structures by removing nonsignificant loadings. Simulations of factor analysis models with and without cross-loadings demonstrate the impressive performance of the MIIV-EFA procedure in recovering the correct number of factors and in recovering the primary and secondary loadings. For example, in nearly all replications MIIV-EFA finds the correct number of factors when N is 100 or more. Even the primary and secondary loadings of the most complex models were recovered when the sample sizes were at least 500. We discuss limitations and future research areas. Two appendices describe alternative MIIV-EFA algorithms and the sensitivity of the algorithm to cross-loadings.
Subjects
Statistical Models, Psychometrics, Factor Analysis, Humans, Computer Simulation
ABSTRACT
For many years, the economic literature has recognized the role of attitudes, beliefs, and perceptions in estimating the value of a statistical life (VSL). However, few applications have attempted to include them. This article incorporates the perceived controllability and concern about traffic and cardiorespiratory risks to estimate VSL using a hybrid choice model (HCM). The HCM allows us to include unobserved heterogeneity and improve behavioral realism explicitly. Using data from a choice experiment conducted in Santiago, Chile, we estimate a VSL of US$3.78 million for traffic risks and US$2.06 million for cardiorespiratory risks. We found that higher controllability decreases the likelihood that the respondents would be willing to pay for risk reductions in both risks. On the other hand, concern about these risks decreases the willingness to pay for traffic risk reductions but increases it for cardiorespiratory risk reductions.
Subjects
Value of Life, Humans, Chile, Statistical Models, Choice Behavior, Male, Traffic Accidents, Female
ABSTRACT
Purpose: The present paper presents developments and advanced practical applications of Rasch's theory and statistical analysis to construct questionnaires for measuring a person's traits. The flaws of questionnaires providing raw scores are well known. Scores only approximate objective, linear measures. The Rasch Analysis allows you to turn raw scores into measures with an error estimate, satisfying fundamental measurement axioms (e.g., unidimensionality, linearity, generalizability). A previous companion article illustrated the most frequent graphic and numeric representations of results obtained through Rasch Analysis. A more advanced description of the method is presented here. Conclusions: Measures obtained through Rasch Analysis may foster the advancement of the scientific assessment of behaviours, perceptions, skills, attitudes, and knowledge so frequently faced in Physical and Rehabilitation Medicine, not less than in social and educational sciences. Furthermore, suggestions are given on interpreting and managing the inevitable discrepancies between observed scores and ideal measures (data-model "misfit"). Finally, twelve practical take-home messages for appraising published results are provided. Implications for rehabilitation: The current work is the second of two papers addressed to rehabilitation clinicians looking for an in-depth introduction to the Rasch analysis. The first paper illustrates the most common results reported in published papers presenting the Rasch analysis of questionnaires. The present article illustrates more advanced applications of the Rasch analysis, also frequently found in publications. Twelve take-home messages are given for a critical appraisal of the results.
Subjects
Attitude, Physical Examination, Humans, Psychometrics, Surveys and Questionnaires, Research Design, Reproducibility of Results
ABSTRACT
Purpose: The present article summarises the characteristics of Rasch's theory, providing an original metrological model for persons' measurements. Properties describing the person "as a whole" are key outcome variables in Medicine. This is particularly true in Physical and Rehabilitation Medicine, targeting the person's interaction with the outer world. Such variables include independence, pain, fatigue, balance, and the like. These variables can only be observed through behaviours of various complexity, deemed representative of a given "latent" person's property. So how to infer its "quantity"? Usually, behaviours (items) are scored ordinally, and their "raw" scores are summed across item lists (questionnaires). The limits and flaws of scores (i.e., multidimensionality, non-linearity) are well known, yet they still dominate the measurement in Medicine. Conclusions: Through Rasch's theory and statistical analysis, scores are transformed and tested for their capacity to respect fundamental measurement axioms. Rasch analysis returns the linear measure of the person's property ("ability") and the item's calibrations ("difficulty"), concealed by the raw scores. The difference between a person's ability and item difficulty determines the probability that a "pass" response is observed. The discrepancy between observed scores and the ideal measures (i.e., the residual) invites diagnostic reasoning. In a companion article, advanced applications of Rasch modelling are illustrated. Implications for rehabilitation: Questionnaires' ordinal scores are poor approximations of measures. 
The Rasch analysis turns questionnaires' scores into interval measures, provided that its assumptions are respected. Thanks to the Rasch analysis, accurate measures of independence, pain, fatigue, cognitive capacities and other whole person's variables of paramount importance in rehabilitation are available. The current work is addressed to rehabilitation professionals looking for an introduction to interpreting published results based on Rasch analysis. The first of a series of two, the present article illustrates the most common graphic and numeric outputs found in published papers presenting the Rasch analysis of questionnaires.
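The statement that the difference between ability and difficulty determines the probability of a "pass" can be made concrete with a minimal sketch of the dichotomous Rasch model, in which that probability is the logistic function of the difference, with both quantities expressed in logits. The function and variable names are illustrative and do not come from the article.

```python
import math


def rasch_pass_probability(ability: float, difficulty: float) -> float:
    """Probability of a 'pass' under the dichotomous Rasch model.

    Both arguments are in logits; only their difference matters.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def residual(observed: int, ability: float, difficulty: float) -> float:
    """Observed score (0 or 1) minus the model-expected score."""
    return observed - rasch_pass_probability(ability, difficulty)


# When ability equals difficulty, a pass and a fail are equally likely.
p_equal = rasch_pass_probability(1.0, 1.0)

# A failed response on an item well below the person's ability gives a
# large negative residual, the kind of discrepancy that invites diagnosis.
r_unexpected_fail = residual(0, ability=2.0, difficulty=-1.0)
```

Because only the difference enters the formula, adding the same constant to every ability and every difficulty leaves all probabilities unchanged, which is why the origin of a Rasch scale is fixed by convention.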
Subjects
Pain, Physical Examination, Humans, Fatigue/diagnosis, Psychometrics, Reproducibility of Results, Surveys and Questionnaires
ABSTRACT
The present study examines whether the association of the neighborhood environment and overweight in children is moderated by age. This was a cross-sectional study of 832 children aged 3 to 10 years living in the city of Oporto (Portugal). Children were recruited under the scope of the project "Inequalities in Childhood Obesity: The impact of the socioeconomic crisis in Portugal from 2009 to 2015." Overweight was defined according to the International Obesity Task Force criteria. Parents completed a self-administered questionnaire capturing sociodemographic characteristics and their perceptions of their neighborhood environment. Logistic regressions were used to examine the influence of parental perceived neighborhood characteristics (latent variables: attractiveness, traffic safety, crime safety, and walkability) on overweight in children. A stratified analysis by age category was conducted. Overall, 27.8% of the children were overweight, 17.4% were aged 3 to 5 years, and 31.8% were aged 6 to 10 years. Children aged 3 to 5 years were more sensitive to the neighborhood environment than children aged 6 to 10 years. For children aged 3 to 5 years, the risk of overweight was inversely associated with neighborhood crime safety (OR = 1.84; 95% CI 1.07-3.15; p = 0.030). Conclusion: Our study suggests the existence of a sensitive age period in childhood at which exposure to a hostile neighborhood environment is most determining for weight gain. Until today, it was thought that the impact of the neighborhood environment on younger children would be less important as they are less autonomous. But it may not be true. What is Known: • The neighborhood environment may adversely affect children's weight status. However, the moderating role of child age in the association between neighborhood environment and overweight is uncertain. What is New: • The study highlights that the association between the neighborhood environment and child overweight is attenuated by age. 
It is stronger for preschoolers than for early school-age children.
Subjects
Overweight, Pediatric Obesity, Humans, Child, Overweight/epidemiology, Overweight/etiology, Pediatric Obesity/epidemiology, Pediatric Obesity/etiology, Cross-Sectional Studies, Weight Gain, Parents, Residence Characteristics
ABSTRACT
Quantifying a person's cumulative exposure burden to per- and polyfluoroalkyl substances (PFAS) mixtures is important for risk assessment, biomonitoring, and reporting of results to participants. However, different people may be exposed to different sets of PFAS due to heterogeneity in the exposure sources and patterns. Applying a single measurement model for the entire population (e.g., by summing concentrations of all PFAS analytes) assumes that each PFAS analyte is equally informative about PFAS exposure burden for all individuals. This assumption may not hold if PFAS exposure sources systematically differ within the population. However, the sociodemographic, dietary, and behavioral characteristics that underlie systematic exposure differences may not be known, or the differences may be due to a combination of these factors. Therefore, we used mixture item response theory, an unsupervised psychometrics and data science method, to develop a customized PFAS exposure burden scoring algorithm. This scoring algorithm ensures that PFAS burden scores can be equitably compared across population subgroups. We applied our methods to PFAS biomonitoring data from the United States National Health and Nutrition Examination Survey (2013-2018). Using mixture item response theory, we found that participants with higher household incomes had higher PFAS burden scores. Asian Americans had significantly higher PFAS burden compared with non-Hispanic Whites and other race/ethnicity groups. However, some disparities were masked when using summed PFAS concentrations as the exposure metric. This work demonstrates that our summary PFAS burden metric, accounting for sources of exposure variation, may be a fairer and more informative estimate of PFAS exposure.
Subjects
Alkanesulfonic Acids, Environmental Pollutants, Fluorocarbons, Humans, United States, Nutrition Surveys, Environmental Health
ABSTRACT
Emotional perception and expression are very important for building intelligent conversational systems that are human-like and attractive. Although deep neural approaches have made great progress in the field of conversation generation, there is still a lot of room for research on how to guide systems in generating responses with appropriate emotions. Meanwhile, the problem of systems' tendency to generate high-frequency universal responses remains largely unsolved. To solve this problem, we propose a method to generate diverse emotional responses through selective perturbation. Our model includes a selective word perturbation module and a global emotion control module. The former is used to introduce disturbance factors into the generated responses and enhance their expression diversity. The latter maintains the coherence of the response by limiting the emotional distribution of the response and preventing excessive deviation of emotion and meaning. Experiments are designed on two datasets, and corresponding results show that our model outperforms existing baselines in terms of emotional expression and response diversity.