Results 1 - 10 of 10
1.
BMC Bioinformatics ; 19(1): 371, 2018 Oct 11.
Article in English | MEDLINE | ID: mdl-30309317

ABSTRACT

BACKGROUND: With the exponential growth in available biomedical data, there is a need for data integration methods that can extract information about relationships between the data sets. However, these data sets might have very different characteristics. For interpretable results, data-specific variation needs to be quantified. For this task, Two-way Orthogonal Partial Least Squares (O2PLS) has been proposed. To facilitate application and development of the methodology, free and open-source software is required. However, no such software is available for O2PLS. RESULTS: We introduce OmicsPLS, an open-source implementation of the O2PLS method in R. It can handle both low- and high-dimensional datasets efficiently. Generic methods for inspecting and visualizing results are implemented. Both a standard and a faster alternative cross-validation method are available to determine the number of components. A simulation study shows good performance of OmicsPLS compared to alternatives, in terms of accuracy and CPU runtime. We demonstrate OmicsPLS by integrating genetic and glycomic data. CONCLUSIONS: We propose the OmicsPLS R package: a free and open-source implementation of O2PLS for statistical data integration. OmicsPLS is available at https://cran.r-project.org/package=OmicsPLS and can be installed in R via install.packages("OmicsPLS").


Subjects
Genomics/methods , Metabolomics/methods , Humans , Least-Squares Analysis , Software
2.
BMC Bioinformatics ; 17 Suppl 2: 11, 2016 Jan 20.
Article in English | MEDLINE | ID: mdl-26822911

ABSTRACT

BACKGROUND: Rapid computational and technological developments have made large amounts of omics data available at different biological levels. It is becoming clear that simultaneous data analysis methods are needed for better interpretation and understanding of the underlying systems biology. Different methods have been proposed for this task, among them Partial Least Squares (PLS) related methods. To also deal with orthogonal variation, that is, systematic variation in one data set that is unrelated to the other, we consider the Two-way Orthogonal PLS (O2PLS): an integrative data analysis method which is capable of modeling systematic variation, while providing more parsimonious models aiding interpretation. RESULTS: A simulation study to assess the performance of O2PLS showed positive results in both low and higher dimensions. More noise (50% of the data) only affected the systematic part estimates. A data analysis was conducted using data on metabolomics and transcriptomics from a large Finnish cohort (DILGOM). A previous sequential study, using the same data, showed significant correlations between the Lipo-Leukocyte (LL) module and lipoprotein metabolites. The O2PLS results were in agreement with these findings, identifying almost the same set of co-varying variables. Moreover, our integrative approach identified other associative genes and metabolites, while taking into account systematic variation in the data. Including orthogonal components enhanced overall fit, but the orthogonal variation was difficult to interpret. CONCLUSIONS: Simulations showed that the O2PLS estimates were close to the true parameters in both low and higher dimensions. In the presence of more noise (50%), the orthogonal part estimates could not distinguish well between joint and unique variation. The joint estimates were not systematically affected. Simultaneous analysis with O2PLS on metabolome and transcriptome data showed that the LL module, together with VLDL and HDL metabolites, was important for the metabolomic and transcriptomic relation. This is in agreement with an earlier study. In addition, more genes and metabolites were identified as being important for the joint covariation.


Subjects
Genomics/methods , Metabolomics/methods , Statistics as Topic/methods , Transcriptome , Adult , Aged , Diet , Female , Humans , Least-Squares Analysis , Male , Middle Aged , Obesity/genetics , Obesity/metabolism , Systems Biology/methods
3.
J Appl Stat ; 50(13): 2836-2856, 2023.
Article in English | MEDLINE | ID: mdl-37720244

ABSTRACT

Random forest is a popular prediction approach for handling high dimensional covariates. However, it often becomes infeasible to interpret the obtained high dimensional and non-parametric model. Aiming for an interpretable predictive model, we develop a forward variable selection method using the continuous ranked probability score (CRPS) as the loss function. Our stepwise procedure selects at each step a variable that minimizes the CRPS risk, and a stopping criterion for selection is designed based on an estimation of the CRPS risk difference of two consecutive steps. We provide mathematical motivation for our method by proving that, in a population sense, the method attains the optimal set. In a simulation study, we compare the performance of our method with an existing variable selection method, for different sample sizes and correlation strengths of the covariates. Our method is observed to have a much lower false positive rate. We also demonstrate an application of our method to statistical post-processing of daily maximum temperature forecasts in the Netherlands. Our method selects about 10% of the covariates while retaining the same predictive power.
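
The CRPS named in this abstract can be estimated from a sample of the predictive distribution. The following is a minimal sketch, not the authors' procedure: it assumes the forecast is given as a list of draws, uses the standard identity CRPS(F, y) = E|X - y| - 0.5 E|X - X'|, and the function names are hypothetical.

```python
def crps_ensemble(samples, y):
    """Empirical CRPS of a forecast given as draws, for one observation y.

    Uses CRPS(F, y) = E|X - y| - 0.5 * E|X - X'|, where X and X' are
    independent draws from F (here: the empirical distribution of samples).
    """
    n = len(samples)
    # Average distance between the forecast draws and the observation.
    term_obs = sum(abs(x - y) for x in samples) / n
    # Average distance between all pairs of forecast draws (spread of F).
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return term_obs - 0.5 * term_spread


def mean_crps(forecasts, observations):
    """Average CRPS over cases; a sample analogue of the CRPS risk."""
    scores = [crps_ensemble(f, y) for f, y in zip(forecasts, observations)]
    return sum(scores) / len(scores)
```

For a single-draw forecast the score reduces to the absolute error. A forward selection loop would, under these assumptions, compare `mean_crps` between candidate covariate sets.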

4.
J Appl Stat ; 49(9): 2208-2227, 2022.
Article in English | MEDLINE | ID: mdl-35898619

ABSTRACT

Investigating the main determinants of the mechanical performance of metals is not a simple task. Known, physically inspired qualitative relations between 2D microstructure characteristics and 3D mechanical properties can act as the starting point of the investigation. Isotonic regression allows ordering relations to be taken into account and leads to more efficient and accurate results when the underlying assumptions actually hold. The main goal of this paper is to test order relations in a model inspired by a materials science application. The statistical estimation procedure is described considering three different scenarios according to the knowledge of the variances: known variance ratio, completely unknown variances, and variances under order restrictions. New likelihood ratio tests are developed in the last two cases. Both parametric and non-parametric bootstrap approaches are developed for finding the distribution of the test statistics under the null hypothesis. Finally, an application to the relation between geometrically necessary dislocations and the number of observed microstructure precipitations is shown.
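
Isotonic regression under a simple non-decreasing order can be computed with the pool-adjacent-violators algorithm. The sketch below is a generic illustration, not the estimation procedure of the paper: it assumes a total order, positive weights, and a least-squares criterion, and the function name is hypothetical.

```python
def isotonic_fit(y, w=None):
    """Weighted least-squares fit of a non-decreasing sequence to y.

    Pool-adjacent-violators: scan left to right and merge neighbouring
    blocks whenever their means violate the required ordering.
    """
    if w is None:
        w = [1.0] * len(y)
    # Each block is [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Merge backwards while the last two blocks are out of order.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    # Expand each block back to one fitted value per input point.
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit
```

Points that already respect the ordering are left unchanged; violating neighbours are replaced by their pooled weighted mean.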

5.
Materials (Basel) ; 15(3), 2022 Jan 24.
Article in English | MEDLINE | ID: mdl-35160838

ABSTRACT

This study proposes a new approach to determine phenomenological or physical relations between microstructure features and the mechanical behavior of metals. It bridges advanced statistics and materials science in a study of the effect of hard precipitates on the hardening of metal alloys. Synthetic microstructures were created using multi-level Voronoi diagrams in order to control microstructure variability and were then used as samples for virtual tensile tests in a full-field crystal plasticity solver. A data-driven model based on Functional Principal Component Analysis (FPCA) was compared with the classical Voce law for the description of uniaxial tensile curves of synthetic AISI 420 steel microstructures consisting of a ferritic matrix and increasing volume fractions of M23C6 carbides. The parameters of the two models were interpreted in terms of carbide volume fractions and texture using linear mixed-effects models.

6.
Risk Anal ; 31(4): 523-32, 2011 Apr.
Article in English | MEDLINE | ID: mdl-21175713

ABSTRACT

The need to identify toxicologically equivalent doses across different species is a major issue in toxicology and risk assessment. In this article, we investigate interspecies scaling based on the allometric equation applied to the single, oral LD50 data previously analyzed by Rhomberg and Wolff. We focus on the statistical approach, namely, regression analysis of the mentioned data. In contrast to Rhomberg and Wolff's analysis of species pairs, we perform an overall analysis based on the whole data set. From our study it follows that if one assumes one single scaling rule for all species and substances in the data set, then β = 1 is the most natural choice among a set of candidates known in the literature. In fact, we obtain quite narrow confidence intervals for this parameter. However, the estimate of the variance in the model is relatively high, resulting in rather wide prediction intervals.
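
The role of the scaling exponent can be illustrated with a small sketch. It assumes the allometric equation has the form dose = a * BW^β, with the dose expressed as an absolute amount and BW as body weight, so that β = 1 corresponds to equal doses per kilogram. The abstract does not state the exact parametrization or regression model, so this form, the function names, and the numbers in the usage note are illustrative assumptions only.

```python
import math


def scale_dose(dose_mg, bw_from_kg, bw_to_kg, beta=1.0):
    """Scale an absolute dose between species, assuming dose = a * BW**beta."""
    return dose_mg * (bw_to_kg / bw_from_kg) ** beta


def fit_allometric(body_weights_kg, doses_mg):
    """Ordinary least squares on the log-log scale.

    Fits log(dose) = log(a) + beta * log(BW) and returns (beta, a).
    A simplification: one substance, no species or substance effects.
    """
    xs = [math.log(w) for w in body_weights_kg]
    ys = [math.log(d) for d in doses_mg]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    a = math.exp(mean_y - beta * mean_x)
    return beta, a
```

With β = 1, a made-up dose of 10 mg in a 0.25 kg animal scales to 2800 mg for a 70 kg body weight, which is the same 40 mg/kg in both cases.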


Subjects
Lethal Dose 50 , Animals , Humans , Species Specificity
7.
Forensic Sci Int ; 297: 342-349, 2019 Apr.
Article in English | MEDLINE | ID: mdl-30903935

ABSTRACT

In the evaluation of measurements on characteristics of forensic trace evidence, Aitken and Lucy (2004) model the data as a two-level model using assumptions of normality where likelihood ratios are used as a measure for the strength of evidence. A two-level model assumes two sources of variation: the variation within measurements in a group (first level) and the variation between different groups (second level). Estimates of the variation within groups, the variation between groups and the overall mean are required in this approach. This paper discusses three estimators for the overall mean. In forensic science, two of these estimators are known as the weighted and unweighted mean. For an optimal choice between these estimators, the within- and between-group covariance matrices are required. In this paper a generalization to the latter two mean estimators is suggested, which is referred to as the generalized weighted mean. The weights of this estimator can be chosen such that they minimize the variance of the generalized weighted mean. These optimal weights lead to a "toy estimator", because they depend on the unknown within- and between-group covariance matrices. Using these optimal weights with estimates for the within- and between-group covariance matrices leads to the third estimator, the optimal "plug-in" generalized weighted mean estimator. The three estimators and the toy estimator are compared through a simulation study. Under conditions generally encountered in practice, we show that the unweighted mean can be preferred over the weighted mean. Moreover, in these situations the unweighted mean and the optimal generalized weighted mean behave similarly. An artificial choice of parameters is used to provide an example where the optimal generalized weighted mean outperforms both the weighted and unweighted mean. Finally, the three mean estimators are applied to real XTC data to illustrate the impact of the choice of overall mean estimator.
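
The mean estimators discussed in this abstract can be illustrated in the univariate case. The sketch below is a simplification under stated assumptions, not the paper's multivariate method: it assumes the "weighted" mean weights each group mean by its group size, the "unweighted" mean averages the group means equally, and scalar within- and between-group variances stand in for the covariance matrices. All function names are hypothetical.

```python
def group_means(groups):
    """Mean of the measurements within each group."""
    return [sum(g) / len(g) for g in groups]


def unweighted_mean(groups):
    """Average of the group means, each group counted once."""
    means = group_means(groups)
    return sum(means) / len(means)


def weighted_mean(groups):
    """Average of all measurements, i.e. group means weighted by group size."""
    n_total = sum(len(g) for g in groups)
    return sum(sum(g) for g in groups) / n_total


def generalized_weighted_mean(groups, weights):
    """Weighted combination of group means; weights are assumed to sum to 1."""
    return sum(w * m for w, m in zip(weights, group_means(groups)))


def optimal_weights(group_sizes, var_within, var_between):
    """Variance-minimizing weights in the univariate two-level model.

    A group mean has variance var_between + var_within / n, so the optimal
    weights are proportional to the inverse of that quantity.
    Assumes the variance of every group mean is strictly positive.
    """
    inverse_vars = [1.0 / (var_between + var_within / n) for n in group_sizes]
    total = sum(inverse_vars)
    return [v / total for v in inverse_vars]
```

In this simplified setting the two classical estimators appear as limiting cases: with no within-group variance the optimal weights are equal (unweighted mean), and with no between-group variance they are proportional to group size (weighted mean). Plugging estimated variances into `optimal_weights` corresponds to the "plug-in" idea described above.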

8.
Scand Stat Theory Appl ; 35(3): 385-399, 2008 Sep 01.
Article in English | MEDLINE | ID: mdl-19763283

ABSTRACT

In this paper, we study an algorithm (which we call the support reduction algorithm) that can be used to compute non-parametric M-estimators in mixture models. The algorithm is compared with natural competitors in the context of convex regression and the 'Aspect problem' in quantum physics.

9.
Neuroimage ; 19(3): 1170-9, 2003 Jul.
Article in English | MEDLINE | ID: mdl-12880842

ABSTRACT

Iterative reconstructions are increasingly used for clinical PET studies owing to the better noise properties compared with filtered backprojection (FBP). The purpose of the present study was to compare ordered subsets expectation maximization (OSEM) iterative reconstruction with FBP as input for statistical parametric mapping (SPM) analysis of PET activation studies. Two phantom studies were performed simulating both motor and cognitive tasks and acquiring data with both high and low statistics. In contrast to clinical studies, where no a priori information is known, phantom studies allow for an accurate and detailed comparison between different reconstruction techniques. The significance of "activations" during "tasks" was determined using SPM99 software. Using region of interest analysis of SPM results, it was found that the maximum and average t values within each hot spot of the phantom were higher for OSEM than for FBP. In addition, OSEM4 x 16 (4 iterations, 16 subsets) produced fewer false-positive voxels than FBP, OSEM1 x 16 and OSEM2 x 16. In conclusion, for PET activation studies use of OSEM4 x 16 seems to give the best tradeoff between signal detection and noise reduction.
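
The notation "4 iterations, 16 subsets" refers to how OSEM cycles through groups of projection data. The sketch below shows the textbook multiplicative update on a toy dense system matrix. It is not the scanner software used in the study, and the interleaved subset choice, the uniform starting image, and the function name are assumptions made for illustration.

```python
def osem(system, counts, n_iterations, n_subsets):
    """Toy ordered subsets expectation maximization.

    system[i][j] is the contribution of pixel j to projection bin i,
    counts[i] is the measured value in bin i. Returns the image estimate.
    """
    n_bins, n_pixels = len(system), len(system[0])
    # Interleaved subsets of projection bins (an illustrative choice).
    subsets = [list(range(s, n_bins, n_subsets)) for s in range(n_subsets)]
    image = [1.0] * n_pixels
    for _ in range(n_iterations):
        for bins in subsets:
            # Forward-project the current image for the bins in this subset.
            projected = {
                i: sum(system[i][j] * image[j] for j in range(n_pixels))
                for i in bins
            }
            for j in range(n_pixels):
                sensitivity = sum(system[i][j] for i in bins)
                if sensitivity == 0:
                    continue  # this subset carries no information on pixel j
                # Back-project the ratio of measured to projected counts.
                ratio = sum(
                    system[i][j] * counts[i] / projected[i]
                    for i in bins
                    if projected[i] > 0
                )
                image[j] *= ratio / sensitivity
    return image
```

Under these assumptions, `osem(system, counts, n_iterations=4, n_subsets=16)` would correspond to the OSEM4 x 16 setting, and `n_subsets=1` reduces the update to plain MLEM.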


Subjects
Brain Mapping/methods , Brain/physiology , Image Processing, Computer-Assisted/methods , Oxygen/blood , Algorithms , Artifacts , Cognition/physiology , Humans , Image Processing, Computer-Assisted/statistics & numerical data , Models, Anatomic , Movement/physiology , Oxygen Radioisotopes , Software , Tomography, Emission-Computed
10.
Neuroimage ; 20(2): 898-908, 2003 Oct.
Article in English | MEDLINE | ID: mdl-14568460

ABSTRACT

The outcome of Statistical Parametric Mapping (SPM) analyses of PET activation studies depends, among other things, on the quality of the reconstructed data. In general, filtered back-projection (FBP) is used for reconstruction in PET activation studies. There is, however, increasing interest in iterative reconstruction algorithms such as ordered subset expectation maximization (OSEM) algorithms. The aim of the present study was to investigate the effects of reconstruction techniques and attenuation correction (AC) on the detection of activation foci following statistical analysis with SPM. First, a replicate study was performed to assess the effects of the reconstruction method on pixel variance. Second, a phantom study was performed to evaluate the influence of both the location of an activated area and the applied reconstruction method on SPM outcome. A volumetric method was used to compute the number of false positive voxels for all reconstructions. In addition, average t values within activation foci and for false positive voxels were calculated. For the assessment of the effects of reconstruction on clinical data, a group of 11 patients was studied. For all reconstructions SPM maps were created and compared. Both the clinical and the phantom data showed that use of iterative reconstruction methods reduced false positive results, while showing similar SPM results within activated areas as FBP. Reconstruction of data without attenuation correction reduced noise for FBP only, but did not affect the quality of SPM results for OSEM. It is concluded that OSEM is a good alternative to FBP reconstruction, providing SPM results with less noise.


Subjects
Brain Mapping/methods , Brain/diagnostic imaging , Brain/physiology , Image Processing, Computer-Assisted/methods , Tomography, Emission-Computed/methods , Algorithms , False Positive Reactions , Humans , Image Processing, Computer-Assisted/statistics & numerical data , Models, Anatomic , Obsessive-Compulsive Disorder/diagnostic imaging , Obsessive-Compulsive Disorder/physiopathology , Reproducibility of Results , Tomography, Emission-Computed/instrumentation , Tomography, Emission-Computed/statistics & numerical data