ABSTRACT
Data analyses typically rely upon assumptions about the missingness mechanisms that lead to observed versus missing data, assumptions that are typically unassessable. We explore an approach where the joint distribution of observed data and missing data is specified in a nonstandard way. In this formulation, which traces back to a representation of the joint distribution of the data and missingness mechanism apparently first proposed by J. W. Tukey, the modeling assumptions about the distributions are either assessable or are designed to allow relatively easy incorporation of substantive knowledge about the problem at hand, thereby offering a possibly realistic portrayal of the data, both observed and missing. We develop Tukey's representation for exponential-family models, propose a computationally tractable approach to inference in this class of models, and offer some general theoretical comments. We then illustrate the utility of this approach with an example in systems biology.
ABSTRACT
As a supporting factor, transportation is an important element of destination image and provides a base for a successful tourism industry. It is like the blood vessels of an area and is considered a determinant in developing a tourist destination. This article aims to characterize the status of transportation accessibility in Kinnaur. A GARMIN handheld GPS (Global Positioning System) unit was used to identify damaged roads from their start to end points. In addition, a simple random sampling technique was used to register the opinions of 280 tourists about the transport facilities. The results suggest that the bad condition of National Highway-22 is one of the barriers to tourism development in Kinnaur. No significant differences were found between the selected destinations; overall, the district headquarters, Kalpa, received significantly higher agreement ratings from tourists. The government should ensure that the Border Road Organization, the organization entrusted with the construction and maintenance of roads in international border areas, has sufficient resources to invest in transport development and maintenance.
ABSTRACT
When a ranking of institutions such as medical centers or universities is based on a numerical measure of performance provided with a standard error, confidence intervals (CIs) should be calculated to assess the uncertainty of these ranks. We present a novel method based on Tukey's honest significant difference test to construct simultaneous CIs for the true ranks. When all the true performances are equal, the coverage probability of our method attains the nominal level. When the true performance measures have no exact ties, our method is conservative. For this situation, we propose rescaling to the nominal level, which results in shorter CIs while maintaining control of the simultaneous coverage. We also show that a similar rescaling can be applied to correct a recently proposed Monte Carlo-based method, which is anticonservative. After rescaling, the two methods perform very similarly; however, the rescaling of the Monte Carlo-based method is computationally much more demanding and becomes infeasible when the number of institutions is larger than 30-50. We discuss another recently proposed method, similar to ours, based on simultaneous CIs for the true performances, and show that our method provides uniformly shorter CIs for the same confidence level. We illustrate the superiority of our new methods with a data analysis of travel time to work in the United States and of rankings of 64 hospitals in the Netherlands.
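The rank-interval idea above can be sketched in a few lines. This is a hedged illustration of the general principle (inverting simultaneous pairwise comparisons into bounds on ranks), not the paper's exact construction; the function name, the externally supplied critical value `q_crit`, and the normal-approximation standard errors of differences are all assumptions:

```python
import math

def rank_confidence_intervals(estimates, std_errors, q_crit):
    """Simultaneous CIs for true ranks from pairwise Tukey-HSD-style
    comparisons (a sketch of the general idea only). Rank 1 = best
    (highest performance). Institution i's rank lower bound is
    1 + (number of institutions significantly better than i); its
    upper bound is n - (number significantly worse). q_crit is an
    assumed simultaneous critical value supplied by the caller."""
    n = len(estimates)
    cis = []
    for i in range(n):
        better = worse = 0
        for j in range(n):
            if j == i:
                continue
            se_ij = math.hypot(std_errors[i], std_errors[j])
            diff = estimates[j] - estimates[i]
            if diff > q_crit * se_ij:      # j significantly better than i
                better += 1
            elif diff < -q_crit * se_ij:   # j significantly worse than i
                worse += 1
        cis.append((1 + better, n - worse))
    return cis
```

For three institutions with estimates 10.0, 0.0, 0.1 and unit standard errors, the first is ranked exactly 1 while the other two share the rank interval [2, 3].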
Subject(s)
Hospitals, Research Design, Confidence Intervals, Monte Carlo Method, Probability, United States
ABSTRACT
This article proposes a new robust smooth-threshold estimating equation to select important variables and automatically estimate parameters for high-dimensional longitudinal data. A novel working correlation matrix is proposed to capture correlations within the same subject. The proposed procedure works well when the number of covariates pn increases as the number of subjects n increases. The proposed estimates are competitive with the estimates obtained with the true correlation structure, especially when the data are contaminated. Moreover, the proposed method is robust against outliers in the response variables and/or covariates. Furthermore, the oracle properties for robust smooth-threshold estimating equations under "large n, diverging pn" are established under some regularity conditions. Extensive simulation studies and a yeast cell-cycle dataset are used to evaluate the performance of the proposed method, and the results show that the proposed method is competitive with existing robust variable selection procedures.
Subject(s)
Data Analysis, Statistical Models, Computer Simulation, Humans, Research Design
ABSTRACT
This work proposes a new wave-period estimation method (L-dB) based on power-spectral-density (PSD) estimation from the pitch and roll motional time series of a Doppler wind lidar buoy, under the assumptions of small angles (±22 deg) and slow yaw drifts (1 min), and neglecting translational motion. We revisit the buoy's simplified two-degrees-of-freedom (2-DoF) motional model and formulate the PSD associated with the eigenaxis tilt of the lidar buoy, which was modelled as a complex-valued random process. From this, we present the L-dB method, which estimates the wave period as the average wavelength associated with the cutoff frequency span at which the spectral components drop off L decibels from the peak level. In the framework of the IJmuiden campaign (North Sea, 29 March-17 June 2015), the L-dB method is compared against the most common oceanographic wave-period estimation methods using a Triaxys™ buoy. Parametric analysis showed good agreement (correlation coefficient ρ = 0.86, root-mean-square error (RMSE) = 0.46 s, and mean difference MD = 0.02 s) between the proposed L-dB method and the oceanographic zero-crossing method when the threshold L was set at 8 dB.
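The core of the L-dB rule can be sketched as follows. This is a simplified reading of the method: the paper averages wavelengths over the cutoff span, whereas this sketch simply reports the period at the midpoint of the span within L dB of the spectral peak, so the exact definition here is an assumption:

```python
def l_db_wave_period(freqs, psd, L=8.0):
    """Sketch of the L-dB wave-period idea: find the span of frequencies
    whose PSD lies within L dB of the peak, and return the period at the
    midpoint of that span (a simplification of the paper's averaging)."""
    peak = max(psd)
    threshold = peak * 10 ** (-L / 10.0)   # power level L dB below the peak
    in_band = [f for f, p in zip(freqs, psd) if p >= threshold]
    f_mid = 0.5 * (min(in_band) + max(in_band))
    return 1.0 / f_mid
```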
ABSTRACT
This study harnessed some of the many opportunities provided by the TRMM 3B43 data to generate maps illustrating the spatial and temporal distribution of significant linear rates of change of annual total precipitation for the surface of the Earth bounded by latitudes 50° S and 50° N for the years 1998-2018, by applying pixel-based simple linear regression. These maps are valuable for many applications and should enhance our understanding of global precipitation patterns and trigger more research to explain what has not yet been explained. It was found that the whole study area had a mean significant linear rate of change of -0.4 mm/year. Nearly half of its area had significant linear rates of increase with a mean of 8.5 mm/year, while the other half had significant linear rates of decrease with a mean of -7.6 mm/year. The landmass alone can be divided into nearly two halves; the first had significant linear rates of increase with a mean of 5.2 mm/year, while the second had significant linear rates of decrease with a mean of -7.0 mm/year. Water areas alone can also be divided into nearly two halves; the first showed significant linear rates of increase with a mean of 9.6 mm/year, while the second showed significant linear rates of decrease with a mean of -7.8 mm/year. Grouping the whole study area into six climatic zones and 21 administrative land and water regions and applying a pixel-based Tukey test showed that the obtained significant linear rates of change varied significantly among these climatic and administrative regions.
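The per-pixel trend computation underlying these maps is ordinary simple linear regression of annual precipitation on year; a minimal sketch of the slope calculation (significance testing of the slope is omitted):

```python
def linear_rate(years, values):
    """Simple linear regression slope (e.g. mm/year) for one pixel's
    annual-precipitation time series: slope = Sxy / Sxx."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx
```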
Subject(s)
Environmental Monitoring, Rain, Linear Models, Water
ABSTRACT
Kresoxim methyl sorption in soils of five agro-climatic zones of India varied from 41.6% to 84.7%. The highest sorption was recorded in the organic carbon-rich Almora soil. Isotherm parameters for the linear and non-linear Freundlich and Temkin models were almost the same, whereas the Langmuir parameter Q0 for the linear (1.60 to 9.434 µg g-1) and non-linear (8.48 to 17.129 µg g-1) models differed considerably. For isotherm optimization, different error functions such as the sum of squares error (SSE), root mean square error (RMSE), chi-square error, hybrid fractional error (HYBRID), and average relative error (ARE) were calculated. The lowest error function values were obtained for the Freundlich isotherm in all the soils except the inceptisol (Kolkata), for which the Langmuir isotherm gave the best fit. Statistical analysis using SAS 9.3 software and Tukey's HSD test revealed a significant effect (p < 0.001) of soil type on sorption. Sorption correlated positively with the organic carbon and clay contents of the soil.
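Three of the error functions named above are straightforward to state in code; the chi-square and HYBRID variants follow the same pattern. A minimal sketch (the percent scaling of ARE is a common convention and an assumption here):

```python
def sse(obs, pred):
    """Sum of squares error between observed and model-predicted sorption."""
    return sum((o - p) ** 2 for o, p in zip(obs, pred))

def rmse(obs, pred):
    """Root mean square error."""
    return (sse(obs, pred) / len(obs)) ** 0.5

def are(obs, pred):
    """Average relative error, expressed in percent (a common convention)."""
    return 100.0 / len(obs) * sum(abs((o - p) / o) for o, p in zip(obs, pred))
```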
Subject(s)
Environmental Monitoring/methods, Theoretical Models, Soil Pollutants/analysis, Soil/chemistry, Strobilurins/analysis, Adsorption, Agriculture, India, Linear Models, Nonlinear Dynamics
ABSTRACT
Minimum density power divergence estimation provides a general framework for robust statistics, depending on a parameter α, which determines the robustness properties of the method. The usual estimation method is numerical minimization of the power divergence. This paper considers the special case of linear regression. We developed an alternative estimation procedure using the methods of S-estimation. The rho function so obtained is proportional to one minus a suitably scaled normal density raised to the power α. We used the theory of S-estimation to determine the asymptotic efficiency and breakdown point for this new form of S-estimation. Two sets of comparisons were made. In one, S power divergence is compared with other S-estimators using four distinct rho functions. Plots of efficiency against breakdown point show that the properties of S power divergence are close to those of Tukey's biweight. The second set of comparisons is between S power divergence estimation and numerical minimization. Monitoring these two procedures in terms of breakdown point shows that numerical minimization yields a procedure with larger robust residuals and a lower empirical breakdown point, thus providing an estimate of α that leads to more efficient parameter estimates.
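The two rho functions being compared can be sketched directly. The power-divergence form below follows the verbal description above (one minus a suitably scaled normal density raised to the power α), with the exact scaling constant left as an assumption; the biweight uses the standard textbook form with the conventional 95%-efficiency tuning constant:

```python
import math

def rho_power_divergence(u, alpha, scale=1.0):
    """Rho proportional to 1 minus a scaled normal density raised to the
    power alpha, as described verbally above; the scaling here (so that
    rho(0) = 0) is an assumption, not the paper's exact normalization."""
    z = u / scale
    return 1.0 - math.exp(-alpha * z * z / 2.0)

def rho_tukey_biweight(u, c=4.685):
    """Tukey's biweight rho in its standard textbook form, for comparison;
    c = 4.685 is the usual 95%-efficiency tuning constant."""
    if abs(u) <= c:
        return (c * c / 6.0) * (1.0 - (1.0 - (u / c) ** 2) ** 3)
    return c * c / 6.0
```

Both functions are bounded and flat for large residuals, which is the shared feature behind their similar efficiency/breakdown trade-offs.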
ABSTRACT
A distribution that maximizes an entropy can be found by applying two different principles. On the one hand, Jaynes (1957a,b) formulated the maximum entropy principle (MaxEnt) as the search for a distribution maximizing a given entropy under some given constraints. On the other hand, Kapur (1994) and Kesavan and Kapur (1989) introduced the generalized maximum entropy principle (GMaxEnt) as the derivation of an entropy for which a given distribution has the maximum entropy property under some given constraints. In this paper, both principles are considered for cumulative entropies. Such entropies depend either on the distribution function (direct), on the survival function (residual), or on both (paired). We incorporate cumulative direct, residual, and paired entropies into one approach called cumulative Φ entropies. Maximizing this entropy without any constraints produces an extremely U-shaped (i.e., bipolar) distribution. Maximizing the cumulative entropy under the constraints of fixed mean and variance transforms a distribution in the direction of a bipolar distribution, as far as the constraints allow. A bipolar distribution represents so-called contradictory information, in contrast to minimum or no information. In the literature to date, only a few maximum entropy distributions for cumulative entropies have been derived. In this paper, we extend the results to well-known flexible distributions (like the generalized logistic distribution) and derive some special distributions (like the skewed logistic, the skewed Tukey λ, and the extended Burr XII distributions). The generalized maximum entropy principle is applied to the generalized Tukey λ distribution and the Fechner family of skewed distributions. Finally, cumulative entropies are estimated under the assumption that the data are drawn from a maximum entropy distribution. This estimator is applied to daily S&P 500 returns and to time durations between mine explosions.
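For reference, the symmetric Tukey λ family mentioned above is easiest to state through its quantile function; the skewed and generalized variants used in the paper add separate tail parameters on top of this base form:

```python
import math

def tukey_lambda_quantile(p, lam):
    """Quantile function of the symmetric Tukey lambda distribution:
    Q(p) = (p**lam - (1-p)**lam) / lam for lam != 0, and the logistic
    quantile log(p / (1-p)) in the lam -> 0 limit."""
    if lam == 0:
        return math.log(p / (1.0 - p))
    return (p ** lam - (1.0 - p) ** lam) / lam
```

Note how λ tunes the tails: λ = 1 gives a uniform distribution on (-1, 1), while λ → 0 gives the logistic.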
ABSTRACT
BACKGROUND: Definition and elimination of outliers is a key element for medical laboratories establishing or verifying reference intervals (RIs), especially as inclusion of just a few outlying observations may seriously affect the determination of the reference limits. Many methods have been developed for the definition of outliers. Several of these methods assume a normal distribution, and data often require transformation before outlier elimination. METHODS: We have developed a non-parametric, transformation-independent outlier definition. The new method relies on drawing reproducible histograms, using defined bin sizes above and below the median. The method is compared with the method recommended by CLSI/IFCC, which uses Box-Cox transformation (BCT) and Tukey's fences for outlier definition. The comparison is done on eight simulated distributions and an indirect clinical dataset. RESULTS: The comparison on simulated distributions shows that, with no outliers added, the recommended method generally defines fewer outliers. However, when outliers are added on one side, the proposed method often produces better results; with outliers on both sides, the methods are equally good. Furthermore, the presence of outliers affects the BCT and subsequently the limits determined by the currently recommended method, especially in skewed distributions. The proposed outlier definition reproduced current RI limits on clinical data containing outliers. CONCLUSIONS: We find our simple transformation-independent outlier detection method to be as good as or better than the currently recommended methods.
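The comparison method's outlier step, Tukey's fences, can be sketched as follows (the Box-Cox transformation that precedes it in the CLSI/IFCC recommendation is omitted, and quartile conventions vary between software packages):

```python
import statistics

def tukey_fences(data, k=1.5):
    """Flag observations outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences).
    Quartiles use Python's default 'exclusive' convention; other packages
    may compute them slightly differently."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lo or x > hi]
```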
Subject(s)
Nonparametric Statistics, Adult, Blood Chemical Analysis/standards, Female, Humans, Hospital Laboratories, Male, Reference Values
ABSTRACT
While discriminant function analysis is an inherently Bayesian method, researchers attempting to estimate ancestry in human skeletal samples often follow discriminant function analysis with the calculation of frequentist-based typicalities for assigning group membership. Such an approach is problematic because it fails to account for admixture and for variation in why individuals may be classified as outliers or nonmembers of particular groups. This article presents an argument and methodology for employing a fully Bayesian approach in discriminant function analysis applied to cases of ancestry estimation. The approach requires adding the calculation, or estimation, of predictive distributions as the final step in ancestry-focused discriminant analyses. The methods for a fully Bayesian multivariate discriminant analysis are illustrated using craniometrics from identified population samples within the Howells published data. The article also presents ways to visualize predictive distributions calculated in more than three dimensions, explains the limitations of typicality measures, and suggests an analytical route for future studies of ancestry and admixture based in discriminant function analysis.
Subject(s)
Cephalometry/methods, Discriminant Analysis, Skeleton/anatomy & histology, Algorithms, Bayes Theorem, Female, Forensic Anthropology/methods, Humans, Male, Polynesia/ethnology, Predictive Value of Tests
ABSTRACT
Pairwise comparison is a very common multiple comparison problem. It is known that Fisher's LSD test does not control the familywise error rate (FWER) when there are more than three groups to be compared. Improved testing strategies include the Tukey-Kramer (TK) test, which eliminates the F-test step, and the two-step Fisher-Hayter (FH) test, which requires a significant F-test. We propose a modified FH test that is uniformly more powerful than the original version and relies on an exact size-α test under the balanced model. We provide simulations to show that the new procedure is preferable to the FH test and the TK test.
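The pairwise statistics shared by the TK and FH tests can be sketched as below; computing the studentized-range critical values (with k groups for TK, k-1 for FH after a significant F-test) is omitted, so this sketch produces only the statistics, not test decisions:

```python
import math

def tukey_kramer_stats(groups):
    """Pairwise studentized-range statistics for possibly unbalanced groups:
    q_ij = |mean_i - mean_j| / sqrt((MSE/2) * (1/n_i + 1/n_j)).
    Each q_ij would be compared with a studentized-range critical value,
    which is not computed here."""
    k = len(groups)
    n_tot = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    sse = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    mse = sse / (n_tot - k)   # pooled within-group variance
    stats = {}
    for i in range(k):
        for j in range(i + 1, k):
            se = math.sqrt((mse / 2.0) * (1.0 / len(groups[i]) + 1.0 / len(groups[j])))
            stats[(i, j)] = abs(means[i] - means[j]) / se
    return stats
```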
Subject(s)
Biometry/methods, Statistical Models, Computer Simulation
ABSTRACT
AIM: To estimate the angular response deviation of MOSFETs in the realm of intraoperative electron radiotherapy (IOERT), review their energy dependence, and propose unambiguous names for detector rotations. BACKGROUND: MOSFETs have been used in IOERT. Movement of the detector, namely rotations, can spoil results. MATERIALS AND METHODS: We propose yaw, pitch, and roll to name the three possible rotations in space, as these unequivocally name aircraft rotations. Reinforced mobile MOSFETs (model TN-502RDM-H) and an Elekta Precise linear accelerator were used. Two detectors were placed in air for the angular response study, and the whole set of five detectors was calibrated as usual to evaluate energy dependence. RESULTS: The maximum readout was obtained with a roll of 90° and 4 MeV. With regard to pitch movement, a substantial drop in readout occurred at 90°. Significant overresponse was measured at 315° with 4 MeV and at 45° with 15 MeV. Energy response does not differ within the following groups of energies: 4, 6, and 9 MeV; and 12, 15, and 18 MeV. CONCLUSIONS: Our proposal for naming MOSFET rotations solves the problem of defining sensor orientations. Angular response could explain lower-than-expected results when the tip of the detector is lifted due to inadvertent movements. MOSFET energy response is independent within each group of energies and differs by a maximum of 3.4% when dependent. This can limit dosimetry errors and makes it possible to calibrate the detectors only once for each group of energies, which saves time and optimizes the lifespan of MOSFETs.
ABSTRACT
The year 2012 marks the 50th anniversary of the death of Sir Ronald A. Fisher, one of the two Fathers of Statistics and a Founder of the International Biometric Society (the "Society"). To celebrate the extraordinary genius of Fisher and the far-sighted vision of Fisher and Chester Bliss in organizing and promoting the formation of the Society, this article looks at the origins and growth of the Society, some of the key players and events, and especially the roles played by Fisher himself as the First President. A fresh look at Fisher, the man rather than the scientific genius, is also presented.
Subject(s)
Biometry/history, Scientific Societies/history, 20th Century History, Humans, Internationality, Scientific Societies/organization & administration
ABSTRACT
While there has been extensive research developing gene-environment interaction (GEI) methods in case-control studies, little attention has been given to sparse and efficient modeling of GEI in longitudinal studies. In a two-way table for GEI with rows and columns as categorical variables, a conventional saturated interaction model involves estimation of a specific parameter for each cell, with constraints ensuring identifiability. The estimates are unbiased but are potentially inefficient because the number of parameters to be estimated can grow quickly with increasing categories of row/column factors. On the other hand, Tukey's one-degree-of-freedom model for non-additivity treats the interaction term as a scaled product of row and column main effects. Because of the parsimonious form of interaction, the interaction estimate leads to enhanced efficiency, and the corresponding test could lead to increased power. Unfortunately, Tukey's model gives biased estimates and low power if the model is misspecified. When screening multiple GEIs where each genetic and environmental marker may exhibit a distinct interaction pattern, a robust estimator for interaction is important for GEI detection. We propose a shrinkage estimator for interaction effects that combines estimates from both Tukey's and saturated interaction models and use the corresponding Wald test for testing interaction in a longitudinal setting. The proposed estimator is robust to misspecification of interaction structure. We illustrate the proposed methods using two longitudinal studies: the Normative Aging Study and the Multi-Ethnic Study of Atherosclerosis.
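Tukey's one-degree-of-freedom model can be sketched for a single two-way table: estimate row and column main effects, then regress the additive-model residuals on their product to obtain the single interaction parameter γ. The longitudinal and shrinkage machinery of the paper is not reproduced here; this only illustrates the parsimonious interaction form:

```python
def tukey_one_df_interaction(table):
    """Estimate gamma in Tukey's one-df non-additivity model for a complete
    two-way table: cell (i, j) interaction = gamma * a_i * b_j, where a_i
    and b_j are row/column main effects. gamma is the least-squares
    coefficient of the residuals on the products a_i * b_j."""
    r, c = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (r * c)
    a = [sum(row) / c - grand for row in table]                      # row effects
    b = [sum(table[i][j] for i in range(r)) / r - grand for j in range(c)]  # column effects
    num = den = 0.0
    for i in range(r):
        for j in range(c):
            resid = table[i][j] - grand - a[i] - b[j]
            num += resid * a[i] * b[j]
            den += (a[i] * b[j]) ** 2
    return num / den if den else 0.0
```

On a table generated exactly as mu + a_i + b_j + gamma * a_i * b_j, the estimator recovers gamma.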
Subject(s)
Atherosclerosis/etiology, Atherosclerosis/genetics, Environmental Exposure/adverse effects, Gene-Environment Interaction, Lead/adverse effects, Aged, Aged 80 and over, Aging/physiology, Atherosclerosis/ethnology, Bone and Bones/drug effects, Bone and Bones/metabolism, Computer Simulation, Environmental Exposure/statistics & numerical data, Ethnicity/genetics, Ethnicity/statistics & numerical data, Female, Humans, Iron/metabolism, Lead/metabolism, Least-Squares Analysis, Likelihood Functions, Longitudinal Studies, Male, Middle Aged, Genetic Models, United States/epidemiology, United States Department of Veterans Affairs
ABSTRACT
Systematics of the recently proposed Geastrum sect. Schmidelia are addressed through statistical analyses of quantitative morphological variables and phylogenetic reconstructions based on a multilocus approach. Emphasis is given to the taxonomic placement of G. schmidelii var. parvisporum. This variety is found not to be phylogenetically close to G. schmidelii var. schmidelii, the type species of G. sect. Schmidelia, and it is therefore excluded from this section, taxonomically raised to species rank (as G. parvisporum), and included as a member of G. sect. Hariotia. A second species in G. sect. Schmidelia is recognized and formally described as G. senoretiae. It is characterized by small basidiomata, a non-hygrometric exoperidium, a subsessile endoperidium, and a finely plicate, indistinctly delimited peristome, and is so far known only from Spain. Photographs and drawings are included, along with a comparison of morphologically close taxa. The presence of sclerified basidia in the mature gleba, previously not reported in the genus, is commented on.
Subject(s)
Basidiomycota/classification, Base Sequence, Basidiomycota/cytology, Basidiomycota/genetics, Basidiomycota/isolation & purification, Fungal DNA/chemistry, Fungal DNA/genetics, Ribosomal DNA/genetics, Ribosomal Spacer DNA/chemistry, Ribosomal Spacer DNA/genetics, Fungal Fruiting Bodies, Fungal Proteins/genetics, Mitochondrial Proton-Translocating ATPases/genetics, Molecular Sequence Data, Phylogeny, RNA Polymerase I/genetics, DNA Sequence Analysis, Fungal Spores
ABSTRACT
Attitude measurement is a basic technique for monitoring vehicle motion states and safety. The spin motion of a vehicle couples the attitude angles with each other, which has an impact on the navigation and control of the vehicle. Global navigation satellite system (GNSS) signal-based roll angle measurement methods are important for vehicle attitude measurement. Most existing studies use continuous signal power, but the case of loss of loop lock leading to discontinuous power reception has not been considered. A robust estimation method for the roll angle based on the Tukey weight function is proposed to improve the measurement accuracy in cases of discontinuous reception. The characteristics of the GNSS signals and the geometric relationship between the signal power and the roll angle of the vehicle are discussed. By installing a GNSS receiver with a single patch antenna on a rotating platform with a controllable rolling speed, the proposed method was verified experimentally. The robust estimation errors of different weight functions are analyzed. Based on the characteristics of the gross measurement errors, a robust estimation method using multisatellite power observations is proposed to obtain a high-precision and stable estimate of the vehicle roll angle. The results show that the proposed algorithm can improve the accuracy of roll angle estimation even with gross measurement errors. In the experiments, the estimation errors of the algorithm are 6.57° at a confidence level of 68% and 15.49° at a confidence level of 95%; in contrast, they are 11.38° and 37.31° for the traditional LS method. Moreover, the estimation accuracy of the algorithm is not significantly correlated with the vehicle rotational speed. Therefore, the vehicle roll angle can be estimated with high accuracy under a variety of rotational speeds.
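The Tukey (bisquare) weight function, and a minimal IRLS sketch of how such weights suppress gross errors, are shown below for a scalar location parameter. The paper's multisatellite roll-angle estimator is more involved; this is only an illustration of the weighting idea, and the scale estimate and tuning constant are conventional assumptions:

```python
def tukey_weight(u, c=4.685):
    """Tukey's bisquare weight: w(u) = (1 - (u/c)^2)^2 for |u| < c, else 0,
    so gross errors receive zero weight."""
    z = u / c
    return (1.0 - z * z) ** 2 if abs(u) < c else 0.0

def robust_location(values, c=4.685, iters=50):
    """Minimal IRLS sketch using Tukey weights for a scalar location;
    analogous in spirit to, but not the same as, the paper's estimator."""
    est = sorted(values)[len(values) // 2]  # start from a median-like value
    for _ in range(iters):
        resid = [v - est for v in values]
        # MAD-like scale so the tuning constant c is meaningful
        s = sorted(abs(r) for r in resid)[len(values) // 2] / 0.6745 or 1.0
        w = [tukey_weight(r / s, c) for r in resid]
        if sum(w) == 0.0:
            break
        est = sum(wi * v for wi, v in zip(w, values)) / sum(w)
    return est
```

With one gross error in six observations, the weighted estimate stays near the clean cluster while an ordinary mean would be dragged toward the outlier.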
ABSTRACT
PURPOSE: Scheffé's method obtains differences between paired comparisons on an interval scale and can judge the superiority or inferiority of the samples being compared, with no restriction on the observation image, by statistical significance. However, Scheffé's method cannot serve as a single image-quality indicator. Therefore, I examined a method that can evaluate the association between the average degree of preference from Scheffé's method and the physical quantities that make up the image. METHODS: This study exploits the fact that the average degree of preference from Scheffé's method is quantitative data on an interval scale, so that multiple regression analysis is possible. In the multiple regression analysis, the average degree of preference for images of simulated pulmonary adenocarcinoma acquired at different exposure doses was used as the objective variable, and the exposure dose and noise (standard deviation [SD]) were used as the explanatory variables. Nakaya's modification of Scheffé's method was used. RESULTS: In the multiple regression analysis, SD was significant (P=0.027). By substituting the threshold value of the intersection of the exposure dose and SD into the multiple regression equation (predictive model), the average degree of preference (Y) was calculated. Y(Scheffé; Gy, SD) was -0.147, which was about half the value for exposure dose alone (-0.150). CONCLUSION: Multiple regression analysis of Scheffé's average degree of preference on physical image factors (exposure dose and noise) makes it possible to design images that reduce exposure dose while maintaining adequate image quality.
Subject(s)
X-Rays, Matched-Pair Analysis
ABSTRACT
Viable methods for the identification of item misfit or Differential Item Functioning (DIF) are central to scale construction and sound measurement. Many approaches rely on the derivation of a limiting distribution under the assumption that a certain model fits the data perfectly. Typical DIF assumptions such as the monotonicity and population independence of item functions are present even in classical test theory but are more explicitly stated when using item response theory or other latent variable models for the assessment of item fit. The work presented here provides a robust approach for DIF detection that does not assume perfect model data fit, but rather uses Tukey's concept of contaminated distributions. The approach uses robust outlier detection to flag items for which adequate model data fit cannot be established.
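The flagging idea can be sketched with a robust (median/MAD) z-score: treat most item-fit statistics as coming from a clean distribution with a contaminated fraction, and flag items whose robust z-score exceeds a cutoff. The specific statistic, scaling constant, and cutoff below are assumptions, not the paper's exact choices:

```python
import statistics

def flag_misfit_items(fit_stats, cutoff=3.0):
    """Flag potentially misfitting (DIF) items whose fit statistic is an
    outlier relative to the bulk, using a median/MAD robust z-score; the
    0.6745 factor rescales the MAD to a normal-theory sigma."""
    med = statistics.median(fit_stats)
    mad = statistics.median(abs(x - med) for x in fit_stats)
    scale = mad / 0.6745 if mad else 1.0
    return [i for i, x in enumerate(fit_stats) if abs(x - med) / scale > cutoff]
```

Because the median and MAD are themselves resistant to contamination, a few strongly misfitting items do not mask one another, which is the point of the contaminated-distribution framing.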
ABSTRACT
UV spectroscopy is considered one of the simplest and most cost- and time-efficient techniques in analytical research. Besides its lowered solvent and energy consumption leading to greener outcomes, it is practical and suitable for a wide range of applications. Multicomponent mixtures always present a problematic challenge for any analytical technique; fortunately, UV spectroscopic methods have found many ways to tackle such mixtures. Fourier self-deconvolution (FSD) was recently applied in UV spectroscopy as an effective tool for the resolution of binary mixtures; unfortunately, like any other method, it may fail to completely resolve severely overlapping mixtures. In this paper, we present the newly developed deconvoluted amplitude factor (DAF) spectrophotometric approach, which couples the concepts of both FSD and the amplitude-factor method for the resolution of tadalafil (TAD) in its binary mixtures with dapoxetine hydrochloride (DAP) or tamsulosin hydrochloride (TAM). The approach was assessed for greenness using different assessment protocols, giving clear evidence of its sustainability. The approach improved the resolution of binary mixtures and showed high sensitivity, with limits of detection and quantitation of (0.374, 1.136 µg/mL), (0.269, 0.817 µg/mL), and (0.518, 1.569 µg/mL) for TAD, DAP, and TAM, respectively. The method was validated as per ICH guideline recommendations and was statistically compared with recently reported methods, revealing no statistically significant difference. A reader-friendly data presentation approach was followed for ease of statistical data interpretation and evaluation.