Results 1 - 8 of 8
1.
Stat Appl Genet Mol Biol; 22(1), 2023 Jan 01.
Article in English | MEDLINE | ID: mdl-37622330

ABSTRACT

Permutation tests are widely used for statistical hypothesis testing when the sampling distribution of the test statistic under the null hypothesis is analytically intractable or unreliable due to finite sample sizes. One critical challenge in applying permutation tests to genomic studies is that an enormous number of permutations is often needed to obtain reliable estimates of very small p-values, leading to intensive computational effort. To address this issue, we develop algorithms for the accurate and efficient estimation of small p-values in permutation tests for paired and independent two-group genomic data. Our approaches leverage a novel framework that parameterizes the permutation sample spaces of these two data types using the Bernoulli and conditional Bernoulli distributions, respectively, combined with the cross-entropy method. We demonstrate the performance of the proposed algorithms on two simulated datasets and on two real-world gene expression datasets generated by microarray and RNA-Seq technologies, comparing against existing methods such as crude permutations and SAMC; the results show that our approaches achieve orders-of-magnitude gains in computational efficiency when estimating small p-values. Our approaches offer promising solutions for improving the computational efficiency of existing permutation test procedures and for developing new permutation-based testing methods in genomic data analysis.
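The paired-data idea above can be sketched in a few lines: under the null, each paired difference flips sign with probability 0.5, and the cross-entropy method tilts those Bernoulli flip probabilities toward the tail so that a small p-value can be estimated by importance sampling. The data, sample sizes, and tuning constants below are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy paired differences standing in for gene expression data
d = rng.normal(0.4, 1.0, size=30)
T_obs = d.sum()                      # observed paired test statistic

n, N, rho = len(d), 2000, 0.1        # dimension, batch size, elite fraction
q = np.full(n, 0.5)                  # Bernoulli parameters, start at the null

# CE iterations: tilt q toward sign patterns that reach the tail T >= T_obs
for _ in range(50):
    flips = rng.random((N, n)) < q           # s_i = +1 with probability q_i
    s = np.where(flips, 1.0, -1.0)
    T = s @ d
    # likelihood ratio of the null sampler (q = 0.5) vs the tilted sampler
    logw = np.where(flips, np.log(0.5 / q), np.log(0.5 / (1 - q))).sum(axis=1)
    w = np.exp(logw)
    gamma = min(np.quantile(T, 1 - rho), T_obs)
    elite = T >= gamma
    # importance-weighted CE update of the Bernoulli parameters
    q = (w[elite, None] * flips[elite]).sum(axis=0) / w[elite].sum()
    q = np.clip(q, 0.01, 0.99)               # keep the sampler non-degenerate
    if gamma >= T_obs:
        break

# final importance-sampling estimate of the small p-value
flips = rng.random((100_000, n)) < q
s = np.where(flips, 1.0, -1.0)
T = s @ d
logw = np.where(flips, np.log(0.5 / q), np.log(0.5 / (1 - q))).sum(axis=1)
p_hat = np.mean(np.exp(logw) * (T >= T_obs))
print(p_hat)
```

A crude permutation estimate of the same p-value would need on the order of 1/p samples; the tilted sampler concentrates draws in the tail instead.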


Subjects
Genomics, Research Design, Entropy, Algorithms, Data Analysis
2.
Nonlinear Dyn; 111(10): 9649-9679, 2023.
Article in English | MEDLINE | ID: mdl-37025428

ABSTRACT

This paper proposes a data-driven approximate Bayesian computation framework for parameter estimation and uncertainty quantification of epidemic models, which incorporates two novelties: (i) identification of the initial conditions using plausible dynamic states that are compatible with observational data; (ii) learning of an informative prior distribution for the model parameters via the cross-entropy method. The effectiveness of the new methodology is illustrated with actual data from the COVID-19 epidemic in the city of Rio de Janeiro, Brazil, employing an ordinary differential equation-based model with a generalized SEIR mechanistic structure that includes a time-dependent transmission rate, asymptomatic cases, and hospitalizations. A minimization problem with two cost terms (number of hospitalizations and deaths) is formulated, and twelve parameters are identified. The calibrated model provides a consistent description of the available data and can extrapolate forecasts over a few weeks, making the proposed methodology very appealing for real-time epidemic modeling.
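The core approximate Bayesian computation step can be illustrated with a minimal rejection sampler: draw parameters from a prior, simulate data, and keep draws whose simulated summary statistic lands close to the observed one. This sketch uses a toy Poisson count model rather than the paper's SEIR structure, and omits the CE-learned prior; the rate, prior range, and tolerance are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy observed counts standing in for epidemic data
obs = rng.poisson(4.2, size=60)

def simulate(rate):
    """Forward model: simulate a dataset of the same size as obs."""
    return rng.poisson(rate, size=obs.size)

# ABC rejection: keep parameter draws whose simulated summary statistic
# (here, the sample mean) is within eps of the observed summary.
prior_draws = rng.uniform(0.0, 10.0, size=20_000)
eps = 0.2
accepted = [r for r in prior_draws
            if abs(simulate(r).mean() - obs.mean()) < eps]

posterior_mean = np.mean(accepted)
print(len(accepted), posterior_mean)
```

The paper's contribution sharpens exactly the weak point of this sketch: a vague uniform prior wastes most draws, whereas a CE-learned informative prior concentrates them where the data are plausible.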

3.
Microsc Microanal; 25(3): 743-752, 2019 Jun.
Article in English | MEDLINE | ID: mdl-31038096

ABSTRACT

Far-field three-dimensional X-ray diffraction microscopy allows for quick measurement of the centers of mass and volumes of a large number of grains in a polycrystalline material, along with their crystal lattice orientations and internal stresses. However, the grain boundaries (and therefore the individual grain shapes) are not observed directly. The present paper aims to overcome this shortcoming by reconstructing grain shapes based only on the incomplete morphological data described above. To this end, cross-entropy (CE) optimization is employed to find a Laguerre tessellation that minimizes the discrepancy between its centers of mass and cell sizes and those of the measured grain data. The proposed algorithm is highly parallel and is thus capable of handling many grains (>8,000). The validity and stability of the CE approach are verified on simulated and experimental datasets.

4.
Entropy (Basel); 21(5), 2019 May 14.
Article in English | MEDLINE | ID: mdl-33267208

ABSTRACT

Global optimization, especially on a large scale, is challenging due to nonlinearity and multimodality. In this paper, to enhance the global search ability of the bio-inspired firefly algorithm (FA), a novel hybrid metaheuristic algorithm is proposed by embedding the cross-entropy (CE) method into the firefly algorithm. With adaptive smoothing and co-evolution, the proposed method fully absorbs the ergodicity, adaptability, and robustness of the cross-entropy method. The new hybrid algorithm achieves an effective balance between exploration and exploitation, avoiding local optima, enhancing global search ability, and improving the convergence rate. Numerical experiments show that the new hybrid algorithm possesses stronger global search capacity, higher optimization precision, and greater robustness.
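For readers unfamiliar with the CE component being embedded here, a minimal sketch of plain cross-entropy optimization (without the firefly coupling) on a standard multimodal test function follows. The test function, dimension, batch sizes, and the smoothing factor alpha are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def rastrigin(x):
    """Classic multimodal benchmark; global minimum 0 at the origin."""
    return 10 * x.shape[-1] + np.sum(x**2 - 10 * np.cos(2 * np.pi * x), axis=-1)

dim, N, n_elite, alpha = 5, 500, 50, 0.7    # alpha: adaptive smoothing factor
mu = np.full(dim, 3.0)                      # initial sampling mean
sigma = np.full(dim, 3.0)                   # initial sampling spread

for _ in range(100):
    # sample a population from the current Gaussian sampling distribution
    X = rng.normal(mu, sigma, size=(N, dim))
    # select the elite (lowest-cost) samples
    elite = X[np.argsort(rastrigin(X))[:n_elite]]
    # smoothed CE update: blending with the old parameters keeps sigma
    # from collapsing too fast, which would cause premature convergence
    mu = alpha * elite.mean(axis=0) + (1 - alpha) * mu
    sigma = alpha * elite.std(axis=0) + (1 - alpha) * sigma

print(mu, rastrigin(mu))
```

The exploration/exploitation balance the abstract mentions shows up here directly: a large sigma explores, the elite update exploits, and the smoothing factor controls how quickly one gives way to the other.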

5.
Entropy (Basel); 20(2), 2018 Jan 31.
Article in English | MEDLINE | ID: mdl-33265191

ABSTRACT

A fully adaptive particle filtering algorithm is proposed in this paper that can update both the state process model and the measurement model separately and simultaneously. The approach is a significant step toward more realistic online monitoring or damage tracking. Most existing Bayes filtering methods are based on predefined, fixed state process and measurement models. Simultaneous estimation of both state and model parameters has gained attention in the recent literature, and some work has been done on updating the state process model; however, few studies address updating the measurement model. In most real-world applications, the correlation between measurements and the hidden damage state is not defined in advance, so presuming a fixed offline measurement model is not promising. The proposed approach is based on optimizing relative entropy, or Kullback-Leibler divergence, through a particle filtering algorithm. The proposed algorithm is successfully applied to a case study of online fatigue damage estimation in composite materials.
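As background for what the paper makes adaptive, here is the fixed-model baseline it generalizes: a bootstrap particle filter with predefined state and measurement models. The scalar linear-Gaussian model, noise levels, and particle count below are illustrative, not the paper's damage model.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy state-space model (fixed, known to the filter):
#   x_t = 0.9 x_{t-1} + N(0, 0.5^2),   y_t = x_t + N(0, 0.5^2)
T, Np = 100, 1000
x_true = np.zeros(T)
for t in range(1, T):
    x_true[t] = 0.9 * x_true[t - 1] + rng.normal(0, 0.5)
y = x_true + rng.normal(0, 0.5, size=T)

particles = rng.normal(0, 1, size=Np)
est = np.zeros(T)
for t in range(T):
    # propagate particles through the (fixed) state process model
    particles = 0.9 * particles + rng.normal(0, 0.5, size=Np)
    # weight by the (fixed) Gaussian measurement likelihood
    w = np.exp(-0.5 * ((y[t] - particles) / 0.5) ** 2)
    w /= w.sum()
    est[t] = np.dot(w, particles)          # posterior-mean state estimate
    # multinomial resampling to avoid weight degeneracy
    particles = rng.choice(particles, size=Np, p=w)

rmse = np.sqrt(np.mean((est - x_true) ** 2))
print(rmse)
```

The paper's point is that in practice the propagation and weighting lines above are not known in advance; its contribution is updating both online, guided by a Kullback-Leibler criterion.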

6.
Cell Rep Methods; 2(11): 100332, 2022 Nov 21.
Article in English | MEDLINE | ID: mdl-36452867

ABSTRACT

Markers are increasingly being used for several high-throughput data analysis and experimental design tasks. Examples include the use of markers for assigning cell types in scRNA-seq studies, for deconvolving bulk gene expression data, and for selecting marker proteins in single-cell spatial proteomics studies. Most marker selection methods focus on differential expression (DE) analysis. Although such methods work well for data with a few non-overlapping marker sets, they are not appropriate for large atlas-size datasets where several cell types and tissues are considered. To address this, we define the phenotype cover (PC) problem for marker selection and present algorithms that can improve the discriminative power of marker sets. Analysis of these sets on several marker-selection tasks suggests that these methods can lead to solutions that accurately distinguish different phenotypes in the data.
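The cover intuition can be made concrete with a greedy sketch: treat each candidate marker as "covering" the phenotype pairs it can tell apart, and greedily pick markers until every pair is separated. The cell types, marker names, and expression sets below are toy illustrations, and the greedy rule is a generic stand-in rather than the paper's PC algorithms.

```python
from itertools import combinations

# Toy phenotypes and the phenotypes in which each marker is expressed
phenotypes = ["B cell", "T cell", "NK", "monocyte"]
pairs = set(combinations(phenotypes, 2))
expressed = {
    "CD19": {"B cell"},
    "CD3":  {"T cell"},
    "CD14": {"monocyte"},
    "NKG7": {"NK", "T cell"},
}

def separated(marker):
    """Phenotype pairs this marker distinguishes (on in one, off in the other)."""
    on = expressed[marker]
    return {(a, b) for a, b in pairs if (a in on) != (b in on)}

# Greedy cover: repeatedly take the marker separating the most
# still-unseparated phenotype pairs.
chosen, uncovered = [], set(pairs)
while uncovered:
    best = max(expressed, key=lambda m: len(separated(m) & uncovered))
    gained = separated(best) & uncovered
    if not gained:
        break                     # remaining pairs are inseparable
    chosen.append(best)
    uncovered -= gained

print(chosen, len(uncovered))
```

Note how the first pick is the marker separating the most pairs rather than the most differentially expressed one, which is the shift in objective the abstract argues for on atlas-size data.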


Subjects
Gene Expression Profiling, Single-Cell Analysis, Gene Expression Profiling/methods, Single-Cell Analysis/methods, Cluster Analysis, Algorithms, Phenotype
7.
Mol Ecol Resour; 14(4): 857-870, 2014 Jul.
Article in English | MEDLINE | ID: mdl-24373173

ABSTRACT

For wildlife populations, it is often difficult to determine biological parameters that indicate breeding patterns and population mixing, but knowledge of these parameters is essential for effective management. A pedigree encodes the relationships between individuals and can provide insight into the dynamics of a population over its recent history. Here, we present a method for reconstructing pedigrees of wild populations of animals that live long enough to breed multiple times over their lifetime and that have complex or unknown generational structures. Reconstruction was based on microsatellite genotype data along with ancillary biological information: sex, and observed body size class as an indicator of the relative age of individuals within the population. Using body size-class data to infer relative age has not previously been considered in wildlife genealogy, and it provides a marked improvement in the accuracy of pedigree reconstruction. Body size-class data are particularly useful for wild populations because they are much easier to collect noninvasively than absolute age data. This new pedigree reconstruction system, PR-genie, performs reconstruction using maximum likelihood with optimization driven by the cross-entropy method. We demonstrated pedigree reconstruction performance on simulated populations (comparing reconstructed pedigrees to known true pedigrees) over a wide range of population parameters and under assortative and intergenerational mating schemes. Reconstruction accuracy increased with the presence of size-class data and as the amount and quality of genetic data increased. We provide recommendations on the amount and quality of data necessary to gain insight into detailed familial relationships in a wildlife population using this pedigree reconstruction technique.


Subjects
Genetic Markers, Pedigree, Software, Animals, Animals, Wild, Body Size
8.
Ann Oper Res; 189(1): 187-203, 2011 Sep.
Article in English | MEDLINE | ID: mdl-27182098

ABSTRACT

Network Growth Models such as Preferential Attachment and Duplication/Divergence are popular generative models with which to study complex networks in biology, sociology, and computer science. However, analyzing them within the framework of model selection and statistical inference is often complicated and computationally difficult, particularly when comparing models that are not directly related or nested. In practice, ad hoc methods are often used with uncertain results. If possible, the use of standard likelihood-based statistical model selection techniques is desirable. With this in mind, we develop an Adaptive Importance Sampling algorithm for estimating likelihoods of Network Growth Models. We introduce the use of the classic Plackett-Luce model of rankings as a family of importance distributions. Updates to importance distributions are performed iteratively via the Cross-Entropy Method with an additional correction for degeneracy/over-fitting inspired by the Minimum Description Length principle. This correction can be applied to other estimation problems using the Cross-Entropy method for integration/approximate counting, and it provides an interpretation of Adaptive Importance Sampling as iterative model selection. Empirical results for the Preferential Attachment model are given, along with a comparison to an alternative established technique, Annealed Importance Sampling.
