Results 1 - 20 of 74
1.
Am J Hum Genet ; 111(8): 1782-1795, 2024 Aug 08.
Article in English | MEDLINE | ID: mdl-39053457

ABSTRACT

In Mendelian randomization, two methods based on single SNP-trait correlations have been developed to infer the causal direction between an exposure (e.g., a gene) and an outcome (e.g., a trait): MR Steiger's method and its recent extension, Causal Direction-Ratio (CD-Ratio). Here we propose an approach based on R2, the coefficient of determination, to combine information from multiple (possibly correlated) SNPs to simultaneously infer the presence and direction of a causal relationship between an exposure and an outcome. Our proposed method generalizes Steiger's method from using a single SNP to using multiple SNPs as instrumental variables (IVs). It is especially useful in transcriptome-wide association studies (TWASs) (and similar applications) with typically small sample sizes for gene expression (or another molecular trait) data, providing a more flexible and powerful approach to inferring causal directions. It can be applied to GWAS summary data with a reference panel. We also discuss the influence of invalid IVs and introduce a new approach called R2S to select and remove invalid IVs (if any) to enhance the robustness. We compared the performance of the proposed method with existing methods in simulations to demonstrate its advantages. We applied the methods to identify causal genes for high/low-density lipoprotein cholesterol (HDL/LDL) using the individual-level GTEx gene expression data and UK Biobank GWAS data. The proposed method was able to confirm some well-known causal genes while identifying some novel ones. Additionally, we illustrated an application of the proposed method to GWAS summary data to infer causal relationships between HDL/LDL and stroke/coronary artery disease (CAD).


Subjects
Genome-Wide Association Study , Single Nucleotide Polymorphism , Transcriptome , Humans , Genome-Wide Association Study/methods , Transcriptome/genetics , Mendelian Randomization Analysis/methods , Genetic Models , LDL Cholesterol/genetics , LDL Cholesterol/blood , Phenotype
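
The single-SNP idea that entry 1 generalizes can be sketched as follows. This is a minimal illustration, not the authors' multi-SNP method: it only compares the squared correlation of one SNP with the exposure against its squared correlation with the outcome, and it omits the significance test on the difference. The simulated effect sizes and the function name `infer_direction` are assumptions made for the example.

```python
import random

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def infer_direction(snp, exposure, outcome):
    """Hypothetical helper: pick the direction whose SNP-trait R2 is larger."""
    r2_exposure = pearson_r(snp, exposure) ** 2
    r2_outcome = pearson_r(snp, outcome) ** 2
    if r2_exposure > r2_outcome:
        return "exposure -> outcome", r2_exposure, r2_outcome
    return "outcome -> exposure", r2_exposure, r2_outcome

# Simulated data (assumed effect sizes): the SNP acts on the exposure,
# and the exposure acts on the outcome.
rng = random.Random(0)
n = 5000
snp = [rng.choice([0, 1, 2]) for _ in range(n)]
exposure = [0.5 * g + rng.gauss(0, 1) for g in snp]
outcome = [0.4 * x + rng.gauss(0, 1) for x in exposure]

direction, r2_x, r2_y = infer_direction(snp, exposure, outcome)
print(direction, round(r2_x, 3), round(r2_y, 3))
```

Because the SNP influences the outcome only through the exposure in this simulation, its R2 with the exposure should be the larger of the two.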
2.
Am J Hum Genet ; 110(2): 349-358, 2023 02 02.
Article in English | MEDLINE | ID: mdl-36702127

ABSTRACT

The coefficient of determination (R2) is a well-established measure to indicate the predictive ability of polygenic scores (PGSs). However, the sampling variance of R2 is rarely considered so that 95% confidence intervals (CI) are not usually reported. Moreover, when comparisons are made between PGSs based on different discovery samples, the sampling covariance of R2 is required to test the difference between them. Here, we show how to estimate the variance and covariance of R2 values to assess the 95% CI and p value of the R2 difference. We apply this approach to real data calculating PGSs in 28,880 European participants derived from UK Biobank (UKBB) and Biobank Japan (BBJ) GWAS summary statistics for cholesterol and BMI. We quantify the significantly higher predictive ability of UKBB PGSs compared to BBJ PGSs (p value 7.6e-31 for cholesterol and 1.4e-50 for BMI). A joint model of UKBB and BBJ PGSs significantly improves the predictive ability, compared to a model of UKBB PGS only (p value 3.5e-05 for cholesterol and 1.3e-28 for BMI). We also show that the predictive ability of regulatory SNPs is significantly enriched over non-regulatory SNPs for cholesterol (p value 8.9e-26 for UKBB and 3.8e-17 for BBJ). We suggest that the proposed approach (available in R package r2redux) should be used to test the statistical significance of difference between pairs of PGSs, which may help to draw a correct conclusion about the comparative predictive ability of PGSs.


Subjects
Multifactorial Inheritance , Single Nucleotide Polymorphism , Humans , Genome-Wide Association Study
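
To make the point of entry 2 concrete (an R2 reported without its sampling uncertainty is hard to compare), the sketch below attaches a 95% interval to an R2 value. It uses a percentile bootstrap, which is a swapped-in generic technique and not the analytic variance estimator the abstract describes or the r2redux API. The simulated polygenic score and phenotype are assumptions.

```python
import random

def r_squared(x, y):
    """Squared Pearson correlation between a predictor x and a phenotype y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def bootstrap_ci(x, y, n_boot=500, alpha=0.05, seed=1):
    """Percentile bootstrap interval for R2 (resampling individuals)."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(r_squared([x[i] for i in idx], [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int(alpha / 2 * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Simulated (assumed) polygenic score and phenotype.
rng = random.Random(0)
pgs = [rng.gauss(0, 1) for _ in range(400)]
phenotype = [0.3 * s + rng.gauss(0, 1) for s in pgs]

r2_hat = r_squared(pgs, phenotype)
ci_low, ci_high = bootstrap_ci(pgs, phenotype)
print(round(r2_hat, 3), round(ci_low, 3), round(ci_high, 3))
```

Comparing two scores on the same individuals would additionally require the covariance between their R2 values, which is the part the abstract's approach addresses and this sketch does not.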
3.
Sensors (Basel) ; 23(24)2023 Dec 11.
Article in English | MEDLINE | ID: mdl-38139609

ABSTRACT

Blockchain technologies have gained widespread use in security-sensitive applications due to their robust data protection. However, as blockchains are increasingly integrated into critical data management systems, they have become attractive targets for attackers. Among the various attacks on blockchain systems, distributed denial of service (DDoS) attacks are one of the most significant and potentially devastating. These attacks render the systems incapable of processing transactions, causing the blockchain to come to a halt. To address the challenge of detecting DDoS attacks on blockchains, existing visualization schemes have been developed. However, these schemes often fail to provide early DDoS detection since they typically display only past and current system status. In this paper, we present a novel visualization scheme that not only portrays past and current values but also forecasts future expected system statuses. We achieve these future predictions by utilizing polynomial regression with blockchain data. Additionally, we offer an alternative DDoS detection method employing statistical analysis, specifically the coefficient of determination, to enhance accuracy. Through our experiments, we demonstrate that our proposed scheme excels at predicting future blockchain statuses and anticipating DDoS attacks with minimal error. Our work empowers system managers of blockchain-based applications to identify and mitigate DDoS attacks at an earlier stage.
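
The forecasting step in entry 3 (polynomial regression on past values, with the coefficient of determination as a fit statistic) can be sketched as below. The abstract does not say which blockchain metric is modelled, which polynomial degree is used, or how R2 enters the detection rule, so the series, the degree of 2, and all function names here are assumptions; the sketch only fits, scores, and extrapolates one step.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest order first),
    obtained by solving the normal equations with Gauss-Jordan elimination."""
    m = degree + 1
    # Augmented matrix [A^T A | A^T y] for the monomial basis.
    aug = [[sum(x ** (i + j) for x in xs) for j in range(m)]
           + [sum(y * x ** i for x, y in zip(xs, ys))] for i in range(m)]
    for col in range(m):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(m):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][m] / aug[i][i] for i in range(m)]

def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

def r_squared(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical, noise-free system metric observed at ten time steps.
times = list(range(10))
load = [2 + 0.5 * t + 0.3 * t * t for t in times]

coeffs = fit_polynomial(times, load, degree=2)
fit_r2 = r_squared(load, [predict(coeffs, t) for t in times])
forecast = predict(coeffs, 10)  # one step beyond the observed window
print(round(fit_r2, 6), round(forecast, 3))
```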

4.
Sensors (Basel) ; 20(9)2020 May 09.
Article in English | MEDLINE | ID: mdl-32397472

ABSTRACT

Brain source imaging and time frequency mapping (TFM) are commonly used in magneto/electro encephalography (M/EEG) imaging. However, these methods suffer from important limitations. Source imaging is based on an ill-posed inverse problem leading to instability of source localization solutions, has a limited capacity to localize high frequency oscillations and loses its robustness for induced responses (ill-defined trigger). The drawback of TFM is that it involves independent analysis of signals from a number of frequency bands, and from co-localized sensors. In the present article, a regression-based multi-sensor space-time-frequency analysis (MSA) approach, which integrates co-localized sensors and/or multi-frequency information, is proposed. To estimate task-specific brain activations, MSA uses cross-validated, shifted, multiple Pearson correlation, calculated from the time-frequency transformed brain signal and the binary signal of stimuli. The results are projected from the sensor space onto the cortical surface. To assess MSA performance, the proposed method was compared to the weighted minimum norm estimate (wMNE) source imaging method, in terms of spatial selectivity and robustness against an ill-defined trigger. Magnetoencephalography (MEG) recordings were performed in fourteen subjects during two motor tasks: finger tapping and elbow flexion/extension. In particular, our results show that the MSA approach provides good localization performance when compared to wMNE and statistically significant improvement of robustness against ill-defined trigger.


Subjects
Brain Mapping , Magnetoencephalography , Motor Cortex , Electroencephalography , Humans , Spatio-Temporal Analysis
5.
Twin Res Hum Genet ; 22(3): 187-194, 2019 06.
Article in English | MEDLINE | ID: mdl-31169112

ABSTRACT

The seasonality of demographic data has been of great interest. It depends mainly on the climatic conditions, and the findings may vary from study to study. Commonly, the studies are based on monthly data. The population at risk plays a central role. For births or deaths over short periods, the population at risk is proportional to the lengths of the months. Hence, one must analyze the number of births (and deaths) per day. If one studies the seasonality of multiple maternities, the population at risk is the total monthly number of confinements, and the number of multiple maternities in a given month must be compared with the monthly number of all maternities. Consequently, when one considers the monthly rates of multiple maternities, the monthly number of births is eliminated and one obtains an unaffected seasonality measure of the rates. In general, comparisons between the seasonality of different data sets presuppose standardization of the data to indices with common means, mainly 100. If one assumes seasonality as 'non-flatness' throughout a year, a chi-squared test would be an option, but this test calculates only the heterogeneity, and the same test statistic can be obtained for data sets with extreme values occurring in consecutive months or in separate months. Hence, chi-squared tests for seasonality are weak because of this arbitrariness and cannot be considered a model test. When seasonal models are applied, one must pay special attention to how well the applied model fits the data. If the goodness of fit is poor, the nonsignificant models obtained can erroneously lead to statements that the seasonality is slight, although the observed seasonal fluctuations are marked. In this study, we investigate how seasonal models can be applied to different demographic variables.


Subjects
Birth Rate , Demography , Theoretical Models , Seasons , Triplets/statistics & numerical data , Twins/statistics & numerical data , Female , Finland/epidemiology , Humans , Population Surveillance , Pregnancy
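
Two steps named in entry 5, dividing monthly counts by the length of the month and standardizing to an index with mean 100, can be sketched as follows. The counts are hypothetical and the function name `daily_rate_index` is an assumption; the abstract does not prescribe this exact computation.

```python
import calendar

def daily_rate_index(monthly_counts, year):
    """Convert twelve monthly counts to per-day rates, then scale them
    so that the mean of the twelve indices is 100."""
    per_day = [count / calendar.monthrange(year, month)[1]
               for month, count in enumerate(monthly_counts, start=1)]
    mean_rate = sum(per_day) / len(per_day)
    return [100 * rate / mean_rate for rate in per_day]

# Hypothetical counts that are exactly proportional to month length,
# i.e. a constant number of births per day and therefore no seasonality.
flat_counts = [10 * calendar.monthrange(2019, m)[1] for m in range(1, 13)]
index = daily_rate_index(flat_counts, 2019)
print([round(v, 1) for v in index])
```

Raw monthly counts in this example differ between February and July, yet the per-day index is flat, which is the reason the abstract gives for analyzing cases per day.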
6.
Multivariate Behav Res ; 54(4): 514-529, 2019.
Article in English | MEDLINE | ID: mdl-30822143

ABSTRACT

Whenever multiple regression is applied to a multiply imputed data set, several methods for combining significance tests for R2 and the change in R2 across imputed data sets may be used: the combination rules by Rubin, the Fisher z-test for R2 by Harel, and F-tests for the change in R2 by Chaurasia and Harel. For pooling R2 itself, Harel proposed a method based on a Fisher z transformation. In the current article, it is argued that the pooled R2 based on the Fisher z transformation, the Fisher z-test for R2, and the F-test for the change in R2 have some theoretical flaws. An argument is made for using Rubin's method for pooling significance tests for R2 instead, and alternative procedures for pooling R2 are proposed: simple averaging and a pooled R2 constructed from the pooled significance test by Rubin. Simulations show that the Fisher z-test and Chaurasia and Harel's F-tests generally give inflated type-I error rates, whereas the type-I error rates of Rubin's method are correct. Of the methods for pooling the point estimates of R2, no method clearly performs best, but it is argued that the average of the R2 values across imputed data sets is preferred.


Subjects
Algorithms , Statistical Data Interpretation , Statistical Models , Humans , Multivariate Analysis
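
Two of the pooling rules for the R2 point estimate mentioned in entry 6 can be compared in a few lines. The Fisher z variant below assumes the transformation is applied to the square root of each R2, averaged, and back-transformed; the abstract does not spell out these details, so treat that as an assumption. The five R2 values are hypothetical.

```python
import math

def pool_r2_average(r2_values):
    """Simple average of R2 across imputed data sets."""
    return sum(r2_values) / len(r2_values)

def pool_r2_fisher_z(r2_values):
    """Fisher z pooling (assumed form): z = atanh(sqrt(R2)),
    average the z values, back-transform with tanh, and square."""
    z_values = [math.atanh(math.sqrt(r2)) for r2 in r2_values]
    return math.tanh(sum(z_values) / len(z_values)) ** 2

# Hypothetical R2 values from five imputed data sets.
r2_imputed = [0.30, 0.34, 0.28, 0.36, 0.32]

pooled_avg = pool_r2_average(r2_imputed)
pooled_z = pool_r2_fisher_z(r2_imputed)
print(round(pooled_avg, 4), round(pooled_z, 4))
```

With values this close together the two rules nearly agree; the abstract's argument concerns their theoretical properties and type-I error rates rather than large numerical gaps in examples like this one.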
7.
Sensors (Basel) ; 18(9)2018 Sep 12.
Article in English | MEDLINE | ID: mdl-30213097

ABSTRACT

Advanced technology for process monitoring and fault diagnosis is widely used in complex industrial processes. An important issue that needs to be considered is the ability to monitor key performance indicators (KPIs), which often cannot be measured sufficiently quickly or accurately. This paper proposes a data-driven approach based on maximizing the coefficient of determination for probabilistic soft sensor development when data are missing. Firstly, the problem of missing data in the training sample set is solved using the expectation maximization (EM) algorithm. Then, by maximizing the coefficient of determination, a probability model between secondary variables and the KPIs is developed. Finally, a Gaussian mixture model (GMM) is used to estimate the joint probability distribution in the probabilistic soft sensor model, whose parameters are estimated using the EM algorithm. An experimental case study on the alumina concentration in the aluminum electrolysis industry is investigated to demonstrate the advantages and the performance of the proposed approach.

8.
Entropy (Basel) ; 20(9)2018 Aug 24.
Article in English | MEDLINE | ID: mdl-33265723

ABSTRACT

In factor analysis, factor contributions of latent variables are assessed conventionally by the sums of the squared factor loadings related to the variables. First, the present paper considers issues in the conventional method. Second, an alternative entropy-based approach for measuring factor contributions is proposed. The method measures the contribution of the common factor vector to the manifest variable vector and decomposes it into contributions of factors. A numerical example is also provided to demonstrate the present approach.

9.
Twin Res Hum Genet ; 20(3): 250-256, 2017 06.
Article in English | MEDLINE | ID: mdl-28434430

ABSTRACT

The seasonality of population data has been of great interest in demographic studies. When seasonality is analyzed, the population at risk plays a central role. In a study of the monthly number of births and deaths, the population at risk is the product of the size of the population and the length of the month. Usually, the population can be assumed to be constant, and consequently, the population at risk is proportional to the length of the month. Hence, the number of cases per day has to be analyzed. If one studies the seasonal variation in twin or multiple maternities, the population at risk is the total number of monthly confinements, and the study should be based on the rates of the multiple maternities. Consequently, if one considers monthly twinning rates, the monthly number of birth data is eliminated and one obtains an unaffected seasonality measure of the twin maternities. The strength of the seasonality is measured by a chi-squared test or by the standard deviation. When seasonal models are applied, one must pay special attention to how well the model fits the data. If the goodness of fit is poor, it can erroneously result in a statement that the seasonality is slight, although the observed seasonal fluctuations are marked.


Subjects
Birth Rate , Multiple Pregnancy/genetics , Twins/genetics , Female , Humans , Male , Population Dynamics , Pregnancy , Seasons
10.
Prep Biochem Biotechnol ; 47(7): 709-719, 2017 Aug 09.
Article in English | MEDLINE | ID: mdl-28448745

ABSTRACT

Methylobacillus sp. zju323 was adopted to improve the biosynthesis of pyrroloquinoline quinone (PQQ) by systematic optimization of the fermentation medium. The Plackett-Burman design was implemented to screen for the key medium components for PQQ production. CoCl2 · 6H2O, p-aminobenzoic acid, and MgSO4 · 7H2O were found to enhance PQQ production most significantly. A five-level three-factor central composite design was used to investigate the direct and interactive effects of these variables. Both response surface methodology (RSM) and artificial neural network-genetic algorithm (ANN-GA) were used to predict the PQQ production and to optimize the medium composition. The results showed that the medium optimized by ANN-GA was better than that optimized by RSM in maximizing PQQ production, and the experimental PQQ concentration in the ANN-GA-optimized medium was improved by 44.3% compared with that in the unoptimized medium. Further study showed that this ANN-GA-optimized medium was also effective in improving PQQ production in fed-batch mode, reaching the highest PQQ accumulation of 232.0 mg/L, an increase of about 47.6% relative to that in the original medium. The present work provided an optimized medium and developed a fed-batch strategy that might be applicable in industrial PQQ production.


Subjects
Industrial Microbiology/methods , Methylobacillus/metabolism , PQQ Cofactor/metabolism , Algorithms , Culture Media/metabolism , Fermentation , Computer Neural Networks
11.
BMC Bioinformatics ; 17(1): 407, 2016 Oct 06.
Article in English | MEDLINE | ID: mdl-27716040

ABSTRACT

BACKGROUND: Knowledge base-driven pathway analysis is becoming the first choice for many investigators, in that it not only can reduce the complexity of functional analysis by grouping thousands of genes into just several hundred pathways, but also can increase the explanatory power for the experiment by identifying active pathways in different conditions. However, current approaches are designed to analyze a biological system assuming that each pathway is independent of the other pathways. RESULTS: A decision analysis model is developed in this article that accounts for dependence among pathways in time-course experiments and multiple-treatment experiments. This model introduces a decision coefficient, a designed index, to identify the most relevant pathways in a given experiment by taking into account not only the direct determination factor of each Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway itself, but also the indirect determination factors from its related pathways. Meanwhile, the direct and indirect determination factors of each pathway are employed to demonstrate the regulation mechanisms among KEGG pathways, and the sign of the decision coefficient can be used to preliminarily estimate the impact direction of each KEGG pathway. The simulation study demonstrated the application of the decision analysis model to KEGG pathway analysis. CONCLUSIONS: A microarray dataset from bovine mammary tissue over the entire lactation cycle was used to further illustrate our strategy. The results showed that the decision analysis model can provide promising and more biologically meaningful results. Therefore, the decision analysis model is an initial attempt at optimizing pathway analysis methodology.


Subjects
Decision Support Techniques , Gene Expression Profiling/methods , Lactation/genetics , Animal Mammary Glands/metabolism , Signal Transduction , Transcriptome/genetics , Animals , Cattle , Computational Biology/methods , Factual Databases , Female , Genome
12.
Hum Brain Mapp ; 37(12): 4566-4580, 2016 12.
Article in English | MEDLINE | ID: mdl-27464464

ABSTRACT

Spontaneous fluctuations of blood-oxygenation level-dependent functional magnetic resonance imaging (BOLD fMRI) signals are highly synchronous between brain regions that serve similar functions. This provides a means to investigate functional networks; however, most analysis techniques assume functional connections are constant over time. This may be problematic in the case of neurological disease, where functional connections may be highly variable. Recently, several methods have been proposed to determine moment-to-moment changes in the strength of functional connections over an imaging session (so-called dynamic connectivity). Here a novel analysis framework based on a hierarchical observation modeling approach was proposed, to permit statistical inference of the presence of dynamic connectivity. A two-level linear model composed of overlapping sliding windows of fMRI signals, incorporating the fact that overlapping windows are not independent, was described. To test this approach, datasets were synthesized whereby functional connectivity was either constant (significant or insignificant) or modulated by an external input. The method successfully determines the statistical significance of a functional connection in phase with the modulation, and it exhibits greater sensitivity and specificity in detecting regions with variable connectivity, when compared with sliding-window correlation analysis. For real data, this technique possesses greater reproducibility and provides a more discriminative estimate of dynamic connectivity than sliding-window correlation analysis. © 2016 Wiley Periodicals, Inc.


Subjects
Brain Mapping/methods , Brain/diagnostic imaging , Brain/physiology , Magnetic Resonance Imaging , Bayes Theorem , Cerebrovascular Circulation/physiology , Computer Simulation , Humans , Linear Models , Magnetic Resonance Imaging/methods , Neurological Models , Neural Pathways/diagnostic imaging , Neural Pathways/physiology , Oxygen/blood , ROC Curve , Rest
13.
Asian-Australas J Anim Sci ; 29(9): 1215-21, 2016 Sep.
Article in English | MEDLINE | ID: mdl-26954134

ABSTRACT

The prediction of carcass composition in Hanwoo steers is very important for value-based marketing, and the improvement of prediction accuracy and precision can be achieved through the analyses of independent variables using a prediction equation with a sufficient dataset. The present study was conducted to develop a prediction equation for Hanwoo carcass composition, using data collected from 7,907 Hanwoo steers raised at a private farm in Gangwon Province, South Korea, and slaughtered in the period between January 2009 and September 2014. Carcass traits such as carcass weight (CWT), back fat thickness (BFT), eye-muscle area (EMA), and marbling score (MAR) were used as independent variables for the development of a prediction equation for carcass composition, such as retail cut weight and percentage (RC and %RC, respectively), trimmed fat weight and percentage (FAT and %FAT, respectively), and separated bone weight and percentage (BONE and %BONE, respectively), and its feasibility for practical use was evaluated using the estimated retail yield percentage (ELP) currently used in Korea. The equations were functions of all the variables, and the significance was estimated via stepwise regression analyses. Further, the model equations were verified by means of the residual standard deviation and the coefficient of determination (R2) between the predicted and observed values. In the stepwise analyses, CWT was the most important single variable in the equations for RC and FAT, and BFT was the most important variable in the equations for %RC and %FAT. The precision and accuracy of the three-variable equation consisting of CWT, BFT, and EMA were very similar to those of the four-variable equation that included all four independent variables (CWT, BFT, EMA, and MAR) for RC and FAT, while the three-variable equations provided a more accurate prediction for %RC. Consequently, the three-variable equation might be more appropriate for practical use than the four-variable equation because its measurements are easy and cost-effective. However, a relatively high average difference for the ELP in absolute value implies that a revision of the official equation may be required, although the current official equation for predicting RC with three variables is still valid.

14.
Stat Med ; 34(3): 432-43, 2015 Feb 10.
Article in English | MEDLINE | ID: mdl-25345392

ABSTRACT

Tests for regression coefficients such as global, local, and partial F-tests are common in applied research. In the framework of multiple imputation, there are several papers addressing tests for regression coefficients. However, for simultaneous hypothesis testing, the existing methods are computationally intensive because they involve calculation with vectors and (inversion of) matrices. In this paper, we propose a simple method based on the scalar entity, coefficient of determination, to perform (global, local, and partial) F-tests with multiply imputed data. The proposed method is evaluated using simulated data and applied to suicide prevention data.


Subjects
Analysis of Variance , Statistical Data Interpretation , Linear Models , Biometry/methods , Computer Simulation , Humans , Regression Analysis , Research Design , Suicide Prevention
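
The reason a scalar R2 is enough to run the F-tests named in entry 14 is that, for a single complete data set, the global and partial F statistics are standard functions of R2, the sample size, and the number of predictors. The sketch below shows only those textbook complete-data formulas; how the paper pools across imputed data sets is not stated in the abstract and is not reproduced here. The numbers are hypothetical.

```python
def global_f(r2, n, k):
    """Global F statistic for a model with k predictors fitted to n cases:
    F = (R2 / k) / ((1 - R2) / (n - k - 1))."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

def partial_f(r2_full, r2_reduced, n, k_full, q):
    """Partial F statistic for dropping q predictors from a full model
    with k_full predictors:
    F = ((R2_full - R2_reduced) / q) / ((1 - R2_full) / (n - k_full - 1))."""
    return ((r2_full - r2_reduced) / q) / ((1 - r2_full) / (n - k_full - 1))

# Hypothetical values: 103 cases, 2 predictors, R2 = 0.5 for the full model
# and R2 = 0.4 after dropping one predictor.
f_global = global_f(0.5, n=103, k=2)
f_partial = partial_f(0.5, 0.4, n=103, k_full=2, q=1)
print(round(f_global, 3), round(f_partial, 3))
```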
15.
J Environ Manage ; 132: 338-45, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24342875

ABSTRACT

Greywater flows and concentrations vary greatly, thus evaluation and prediction of the response of on-site treatment filters to variable loading regimes is challenging. The performance of 0.6 m × 0.2 m (height × diameter) filters of bark, activated charcoal and sand in reduction of biochemical oxygen demand (BOD5), chemical oxygen demand (COD), total nitrogen (Tot-N) and total phosphorus (Tot-P) under variable loading regimes was investigated and modelled. During seven runs, the filters were fed with synthetic greywater at hydraulic loading rates (HLR) of 32-128 L m⁻² day⁻¹ and organic loading rates (OLR) of 13-76 g BOD5 m⁻² day⁻¹. Based on the changes in HLR and OLR, the reduction in pollutants was modelled using multiple linear regression. The models showed that increasing the HLR from 32 to 128 L m⁻² day⁻¹ decreased COD reduction in the bark filters from 74 to 40%, but increased COD reduction in the charcoal and sand filters from 76 to 90% and 65 to 83%, respectively. Moreover, the models showed that increasing the OLR from 13 to 76 g BOD5 m⁻² day⁻¹ enhanced the pollutant reduction in all filters except for Tot-P in the bark filters, which decreased slightly from 81 to 73%. Decreasing the HLR from 128 to 32 L m⁻² day⁻¹ enhanced the pollutant reduction in all filters, but decreasing the OLR from 76 to 14 g BOD5 m⁻² day⁻¹ detached biofilm and decreased the Tot-N and Tot-P reduction in the bark and sand filters. Overall, the bark filters had the capacity to treat high OLR, while the charcoal filters had the capacity to treat high HLR and high OLR. Both bark and charcoal filters had higher capacity than sand filters in dealing with high and variable loads. Bark seems to be an attractive substitute for sand filters in settings short in water and its effluent would be valuable for irrigation, while charcoal filters should be an attractive alternative for settings both rich and short in water supply and when environmental eutrophication has to be considered.


Subjects
Charcoal/chemistry , Filtration/methods , Plant Bark/chemistry , Silicon Dioxide/chemistry , Liquid Waste Disposal/methods , Chemical Water Pollutants/chemistry , Environmental Monitoring , Theoretical Models
16.
BMC Med Genomics ; 17(1): 132, 2024 May 16.
Article in English | MEDLINE | ID: mdl-38755654

ABSTRACT

BACKGROUND: Polygenic risk scores (PRS) quantify an individual's genetic predisposition for different traits and are expected to play an increasingly important role in personalized medicine. A crucial challenge in clinical practice is the generalizability and transferability of PRS models to populations with different ancestries. When assessing the generalizability of PRS models for continuous traits, the R2 is a commonly used measure to evaluate prediction accuracy. While the R2 is a well-defined goodness-of-fit measure for statistical linear models, there exist different definitions for its application on test data, which complicates interpretation and comparison of results. METHODS: Based on large-scale genotype data from the UK Biobank, we compare three definitions of the R2 on test data for evaluating the generalizability of PRS models to different populations. Polygenic models for several phenotypes, including height, BMI and lipoprotein A, are derived based on training data with European ancestry using state-of-the-art regression methods and are evaluated on various test populations with different ancestries. RESULTS: Our analysis shows that the choice of the R2 definition can lead to considerably different results on test data, making the comparison of R2 values from the literature problematic. While the definition as the squared correlation between predicted and observed phenotypes solely addresses the discriminative performance and always yields values between 0 and 1, definitions of the R2 based on the mean squared prediction error (MSPE) with reference to intercept-only models assess both discrimination and calibration. These MSPE-based definitions can yield negative values indicating miscalibrated predictions for out-of-target populations. We argue that the choice of the most appropriate definition depends on the aim of PRS analysis - whether it primarily serves for risk stratification or also for individual phenotype prediction. Moreover, both correlation-based and MSPE-based definitions of R2 can provide valuable complementary information. CONCLUSIONS: Awareness of the different definitions of the R2 on test data is necessary to facilitate the reporting and interpretation of results on PRS generalizability. It is recommended to explicitly state which definition was used when reporting R2 values on test data. Further research is warranted to develop and evaluate well-calibrated polygenic models for diverse populations.


Subjects
Genetic Models , Multifactorial Inheritance , Humans , Phenotype , Genetic Predisposition to Disease
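
The contrast drawn in entry 16 between a correlation-based R2 and an MSPE-based R2 can be reproduced with a toy test set. The sketch assumes the MSPE-based form is 1 minus the ratio of the model's mean squared prediction error to that of an intercept-only prediction; which mean the intercept-only model uses (training or test) is one of the points on which definitions differ, and here the test-set mean is used as an assumption. The data are hypothetical.

```python
def r2_squared_correlation(observed, predicted):
    """R2 as the squared Pearson correlation (discrimination only)."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    sop = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    soo = sum((o - mo) ** 2 for o in observed)
    spp = sum((p - mp) ** 2 for p in predicted)
    return sop * sop / (soo * spp)

def r2_mspe(observed, predicted, reference_mean):
    """R2 as 1 - MSPE(model) / MSPE(intercept-only prediction)."""
    n = len(observed)
    mspe_model = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    mspe_reference = sum((o - reference_mean) ** 2 for o in observed) / n
    return 1 - mspe_model / mspe_reference

# Hypothetical test data: predictions rank individuals perfectly
# but are shifted by a constant, i.e. they are miscalibrated.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [o + 3.0 for o in observed]

r2_corr = r2_squared_correlation(observed, predicted)
r2_err = r2_mspe(observed, predicted, reference_mean=sum(observed) / len(observed))
print(r2_corr, r2_err)
```

The squared correlation stays at its maximum because the ranking is perfect, while the MSPE-based value is negative because the shifted predictions are worse than simply predicting the mean.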
17.
Plant Divers ; 46(4): 542-546, 2024 Jul.
Article in English | MEDLINE | ID: mdl-39280972

ABSTRACT

Generalized Additive Models (GAMs) are widely employed in ecological research, serving as a powerful tool for ecologists to explore complex nonlinear relationships between a response variable and predictors. Nevertheless, evaluating the relative importance of predictors with concurvity (analogous to collinearity) on response variables in GAMs remains a challenge. To address this challenge, we developed an R package named gam.hp. The package calculates individual R2 values for predictors, based on the concept of 'average shared variance', a method previously introduced for multiple regression and canonical analyses. Through these individual R2 values, which add up to the overall R2, researchers can evaluate the relative importance of each predictor within GAMs. We illustrate the utility of the gam.hp package by evaluating the relative importance of emission sources and meteorological factors in explaining ozone concentration variability in air quality data from London, UK. We believe that the gam.hp package will improve the interpretation of results obtained from GAMs.

18.
J Exp Anal Behav ; 2024 Aug 19.
Article in English | MEDLINE | ID: mdl-39160655

ABSTRACT

Literature concerning operant behavioral economics shows a strong preference for the coefficient of determination (R2) metric to (a) describe how well an applied model accounts for variance and (b) depict the quality of collected data. Yet R2 is incompatible with nonlinear modeling. In this report, we provide an updated discussion of the concerns with R2. We first review recent articles that have been published in the Journal of the Experimental Analysis of Behavior that employ nonlinear models, noting recent trends in goodness-of-fit reporting, including the continued reliance on R2. We then examine the tendency for these metrics to bias against linear-like patterns via a positive correlation between goodness of fit and the primary outputs of behavioral-economic modeling. Mathematically, R2 is systematically more stringent for lower values for discounting parameters (e.g., k) in discounting studies and lower values for the elasticity parameter (α) in demand analysis. The study results suggest there may be heterogeneity in how this bias emerges in data sets of varied composition and origin. There are limitations when using any goodness-of-fit measure to assess the systematic nature of data in behavioral-economic studies, and to address those we recommend the use of algorithms that test fundamental expectations of the data.

19.
J Voice ; 37(2): 299.e1-299.e8, 2023 Mar.
Article in English | MEDLINE | ID: mdl-33455851

ABSTRACT

PURPOSE: Speech fundamental frequency (SFF) assessment is essential for all dysphonia patients to effectively evaluate the therapeutic effects of voice therapy, especially in patients with disturbances in their voice pitch due to mutational dysphonia, Reinke's edema, or as side effects of hormone therapy. A standard method of SFF measurement remains unknown. Speech tasks such as sustained vowel phonation, counting, reading passage, and spontaneous speech have generally been used for SFF measurements. Ideally, spontaneous speech best reflects SFF; however, this task has not yet been clearly defined and is limited with regard to its adaptation to a clinical setting. A reliable task for SFF measurement in Japanese, which corresponds to a speech task that most closely reflects the value that would be observed with typical spontaneous speech, has not been investigated. This study aimed to identify a reliable speech task by measuring the SFF values elicited by different widely used speech tasks in Japanese, and to assess its reliability and coefficient of determination (R2). METHODS: Sixty healthy volunteers (30 men and 30 women; aged 19-30 years; mean age 22.5 years) were enrolled. All experimental procedures were performed in Japanese. The SFF values for the speech tasks were determined from voice samples recorded using a Pulse Code Modulation (PCM) recorder. Each task, except spontaneous speech, was repeated five times, and the average fundamental frequency in each task was determined as the SFF. To assess the reliability of the SFF values across daily variations within individual speakers, the SFF measurements were repeated on two different days, separated by at least 1 week. RESULTS: The SFF values of sustained /a/ phonation, sustained vowel-average, counting, reading passage, and spontaneous speech had excellent reliability, in terms of their reproduction based on intraclass correlation. Significantly high SFF values were observed, in decreasing order, for sustained vowel-average, counting, reading passage, and spontaneous speech in both males and females. The highest R2 for spontaneous speech was that of reading passage in both males (R2 = 0.771) and females (R2 = 0.806) (P < 0.01). CONCLUSION: When spontaneous speech was taken as the task most reflective of daily conversation, reading passage was determined to be the reliable task for assessing the therapeutic effect of voice therapy in Japanese.


Subjects
Dysphonia , Speech , Adult , Female , Humans , Male , Young Adult , Reproducibility of Results , Speech Acoustics , Speech Production Measurement/methods , Voice Quality , Language
20.
Diagnostics (Basel) ; 13(9)2023 May 06.
Article in English | MEDLINE | ID: mdl-37175032

ABSTRACT

This research aims to enhance the performance of the Adaptive Neuro-Fuzzy Inference System (ANFIS) in order to ensure the accuracy of existing time-series modeling. The COVID-19 pandemic has been a global threat for the past three years. Therefore, advance forecasting of confirmed infection cases is essential to alleviate the crisis brought about by COVID-19. An adaptive neuro-fuzzy inference system-reptile search algorithm (ANFIS-RSA) is developed to effectively anticipate COVID-19 cases. The proposed model integrates a machine-learning model (ANFIS) with a nature-inspired Reptile Search Algorithm (RSA). The RSA technique is used to tune the parameters in order to improve the ANFIS modeling, since the performance of the ANFIS model depends on optimizing its parameters. Statistics on infected cases in China and India were taken from WHO reports. To ensure the accuracy of our estimations, error indicators such as RMSE, RMSRE, MAE, and MAPE were evaluated together with the coefficient of determination (R2). The recommended approach applied to the China dataset was compared with other upgraded ANFIS methods to identify the best error metrics, resulting in an R2 value of 0.9775. ANFIS-CEBAS and the Flower Pollination Algorithm and Salp Swarm Algorithm variant (FPASSA-ANFIS) attained values of 0.9645 and 0.9763, respectively. Furthermore, the ANFIS-RSA technique was applied to the India dataset to examine its efficiency and achieved the best R2 value (0.98). Consequently, the suggested technique was found to be more beneficial for high-precision forecasting of COVID-19 on time-series data.
