1.
Proc Natl Acad Sci U S A ; 117(48): 30055-30062, 2020 12 01.
Article in English | MEDLINE | ID: mdl-32471948

ABSTRACT

Many domains of science have developed complex simulations to describe phenomena of interest. While these simulations provide high-fidelity models, they are poorly suited for inference and lead to challenging inverse problems. We review the rapidly developing field of simulation-based inference and identify the forces giving additional momentum to the field. Finally, we describe how the frontier is expanding so that a broad audience can appreciate the profound influence these developments may have on science.

2.
Proc Natl Acad Sci U S A ; 117(10): 5242-5249, 2020 03 10.
Article in English | MEDLINE | ID: mdl-32079725

ABSTRACT

Simulators often provide the best description of real-world phenomena. However, the probability density that they implicitly define is often intractable, leading to challenging inverse problems for inference. Recently, a number of techniques have been introduced in which a surrogate for the intractable density is learned, including normalizing flows and density ratio estimators. We show that additional information that characterizes the latent process can often be extracted from simulators and used to augment the training data for these surrogate models. We introduce several loss functions that leverage these augmented data and demonstrate that these techniques can improve sample efficiency and quality of inference.
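As a rough illustration of the density ratio estimators mentioned above, the sketch below trains a classifier to separate samples drawn from a simulator at two parameter values and reads the log likelihood ratio off the classifier's logit. It is the plain classifier-based ratio estimate only, not the augmented loss functions the abstract introduces. The Gaussian `simulate` function, the parameter values, and all function names are assumptions made for the example.

```python
import numpy as np

def simulate(theta, n, rng):
    # Hypothetical toy simulator: x ~ Normal(theta, 1).
    # A real simulator would have an intractable density.
    return rng.normal(loc=theta, scale=1.0, size=n)

def fit_logistic(x, y, steps=25):
    # Logistic regression on features [1, x], fitted with Newton's method.
    X = np.column_stack([np.ones_like(x), x])
    w = np.zeros(2)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        grad = X.T @ (p - y)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        w -= np.linalg.solve(hess, grad)
    return w

rng = np.random.default_rng(0)
x0 = simulate(0.0, 20000, rng)  # samples at theta0, labelled 0
x1 = simulate(1.0, 20000, rng)  # samples at theta1, labelled 1
x = np.concatenate([x0, x1])
y = np.concatenate([np.zeros(x0.size), np.ones(x1.size)])
w = fit_logistic(x, y)

def log_ratio(x_new):
    # With balanced classes, the classifier logit estimates
    # log p(x | theta1) - log p(x | theta0).
    return w[0] + w[1] * np.asarray(x_new)
```

For this toy case the exact log ratio is x - 0.5, so the fitted intercept and slope can be checked against -0.5 and 1.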

3.
Brain ; 141(11): 3179-3192, 2018 11 01.
Article in English | MEDLINE | ID: mdl-30285102

ABSTRACT

Determining the state of consciousness in patients with disorders of consciousness is a challenging practical and theoretical problem. Recent findings suggest that multiple markers of brain activity extracted from the EEG may index the state of consciousness in the human brain. Furthermore, machine learning has been found to optimize their capacity to discriminate different states of consciousness in clinical practice. However, it is unknown how dependable these EEG markers are in the face of signal variability because of different EEG configurations, EEG protocols and subpopulations from different centres encountered in practice. In this study we analysed 327 recordings of patients with disorders of consciousness (148 unresponsive wakefulness syndrome and 179 minimally conscious state) and 66 healthy controls obtained in two independent research centres (Paris Pitié-Salpêtrière and Liège). We first show that a non-parametric classifier based on ensembles of decision trees provides robust out-of-sample performance on unseen data with a predictive area under the curve (AUC) of ~0.77 that was only marginally affected when using alternative EEG configurations (different numbers and positions of sensors, numbers of epochs, average AUC = 0.750 ± 0.014). In a second step, we observed that classifiers based on multiple as well as single EEG features generalize to recordings obtained from different patient cohorts, EEG protocols and different centres. However, the multivariate model always performed best with a predictive AUC of 0.73 for generalization from Paris 1 to Paris 2 datasets, and an AUC of 0.78 from Paris to Liège datasets. Using simulations, we subsequently demonstrate that multivariate pattern classification has a decisive performance advantage over univariate classification as the stability of EEG features decreases, as different EEG configurations are used for feature-extraction or as noise is added. 
Moreover, we show that the generalization performance from Paris to Liège remains stable even if up to 20% of the diagnostic labels are randomly flipped. Finally, consistent with recent literature, analysis of the learned decision rules of our classifier suggested that markers related to dynamic fluctuations in theta and alpha frequency bands carried independent information and were most influential. Our findings demonstrate that EEG markers of consciousness can be reliably, economically and automatically identified with machine learning in various clinical and acquisition contexts.


Subjects
Consciousness Disorders/diagnosis , Consciousness/classification , Electroencephalography , Adult , Consciousness/physiology , Consciousness Disorders/classification , Entropy , Female , Humans , Information Theory , Male , Middle Aged , Wakefulness , Young Adult
4.
Phys Rev Lett ; 121(11): 111801, 2018 Sep 14.
Article in English | MEDLINE | ID: mdl-30265125

ABSTRACT

We present powerful new analysis techniques to constrain effective field theories at the LHC. By leveraging the structure of particle physics processes, we extract extra information from Monte Carlo simulations, which can be used to train neural network models that estimate the likelihood ratio. These methods scale well to processes with many observables and theory parameters, do not require any approximations of the parton shower or detector response, and can be evaluated in microseconds. We show that they allow us to put significantly stronger bounds on dimension-six operators than existing methods, demonstrating their potential to improve the precision of the LHC legacy constraints.

5.
Bioinformatics ; 32(9): 1395-401, 2016 05 01.
Article in English | MEDLINE | ID: mdl-26755625

ABSTRACT

MOTIVATION: Collaborative analysis of massive imaging datasets is essential to enable scientific discoveries. RESULTS: We developed Cytomine to foster active and distributed collaboration of multidisciplinary teams for large-scale image-based studies. It uses web development methodologies and machine learning to readily organize, explore, share and analyze (semantically and quantitatively) multi-gigapixel imaging data over the internet. We illustrate how it has been used in several biomedical applications. AVAILABILITY AND IMPLEMENTATION: Cytomine (http://www.cytomine.be/) is freely available under an open-source license from http://github.com/cytomine/. A documentation wiki (http://doc.cytomine.be) and a demo server (http://demo.cytomine.be) are also available. CONTACT: info@cytomine.be SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Image Interpretation, Computer-Assisted , Statistics as Topic , Internet , Software
6.
Front Artif Intell ; 6: 1128153, 2023.
Article in English | MEDLINE | ID: mdl-37091301

ABSTRACT

The genetic code is textbook scientific knowledge that was soundly established without resorting to Artificial Intelligence (AI). The goal of our study was to check whether a neural network could re-discover, on its own, the mapping between codons and amino acids and build the complete deciphering dictionary when presented with transcript-protein training pairs. We compared different Deep Learning neural network architectures and estimated quantitatively the size of the human transcriptomic training set required to achieve the best possible accuracy in the codon-to-amino-acid mapping. We also investigated the effect of a codon embedding layer, which assesses the semantic similarity between codons, on the rate of increase of the training accuracy. We further investigated the benefit of quantifying and using the unbalanced representation of amino acids within real human proteins for a faster deciphering of the codons of rare amino acids. Deep neural networks require huge amounts of data to train. Deciphering the genetic code by a neural network is no exception. A test accuracy of 100% and the unequivocal deciphering of rare codons such as the tryptophan codon or the stop codons require a training dataset on the order of 4-22 million cumulative pairs of codons with their associated amino acids, presented to the neural network over around 7-40 training epochs, depending on the architecture and settings. We confirm that the wide generic capacities and modularity of deep neural networks allow them to be customized easily to learn the deciphering task of the genetic code efficiently.
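To make the notion of codon and amino-acid training pairs concrete, the sketch below builds the standard genetic code table, turns a coding sequence into (codon, amino acid) pairs, and recovers the mapping by simple vote counting. This is a counting baseline for illustration only, not the neural network approach of the study. The function names and the toy sequence are assumptions, and the sequence uses the DNA alphabet (T rather than U).

```python
from collections import Counter
from itertools import product

# Standard genetic code, bases ordered T, C, A, G; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def training_pairs(cds):
    # Split a coding sequence into codons and pair each with its amino acid.
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    return [(codon, CODON_TABLE[codon]) for codon in codons]

def learn_mapping(pairs):
    # Recover a codon -> amino acid dictionary by majority vote over the pairs.
    votes = {}
    for codon, amino in pairs:
        votes.setdefault(codon, Counter())[amino] += 1
    return {codon: counts.most_common(1)[0][0] for codon, counts in votes.items()}

toy_cds = "ATGGCTTGGAAATAA"  # hypothetical sequence: M A W K stop
pairs = training_pairs(toy_cds)
learned = learn_mapping(pairs)

# Tryptophan has a single codon and there are three stop codons, which is
# why such classes are seen rarely in real training data.
n_trp_codons = sum(1 for a in CODON_TABLE.values() if a == "W")
n_stop_codons = sum(1 for a in CODON_TABLE.values() if a == "*")
```

A codon that never appears in the training pairs is simply absent from `learned`, which mirrors why rare codons need large training sets before they are deciphered.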

7.
Arch Public Health ; 80(1): 71, 2022 Mar 04.
Article in English | MEDLINE | ID: mdl-35241162

ABSTRACT

BACKGROUND: The role played by large-scale repetitive SARS-CoV-2 screening programs within university populations interacting continuously with an urban environment is unknown. Our objective was to develop a model capable of predicting the dispersion of viral contamination among university populations dividing their time between social and academic environments. METHODS: Data were collected through real, large-scale testing carried out at the University of Liège, Belgium, during the period Sept. 28th-Oct. 29th 2020. The screening, offered to students and staff (n = 30,000), began 2 weeks after the re-opening of the campus but had to be halted after 5 weeks due to an imposed general lockdown. The data were then used to feed a two-population model (University + surrounding environment) implementing a generalized susceptible-exposed-infected-removed compartmental modeling framework. RESULTS: The considered two-population model was sufficiently versatile to capture the known dynamics of the pandemic. The reproduction number was estimated to be significantly larger on campus than in the urban population, with a net difference of 0.5 in the most severe conditions. The low adherence rate for screening (22.6% on average) and the large reproduction number meant the pandemic could not be contained. However, the weekly screening could have prevented 1393 cases (i.e. 4.6% of the university population; 95% CI: 4.4-4.8%) compared to a modeled situation without testing. CONCLUSION: In a real-life setting on a university campus, periodic screening could contribute to limiting the SARS-CoV-2 pandemic cycle but is highly dependent on its environment.
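A two-population susceptible-exposed-infected-removed model of the kind described above can be sketched as follows. This is a plain SEIR system with a contact-mixing matrix, integrated with forward Euler steps. It omits the screening component and is not the generalized model of the study. Apart from the university population size of 30,000, every rate, the mixing matrix, the urban population size, and the initial case counts are hypothetical values chosen for illustration.

```python
import numpy as np

def seir_two_pop(beta, sigma, gamma, mix, y0, days, dt=0.1):
    # beta: transmission rate per population (length 2)
    # sigma: rate of leaving the exposed state; gamma: removal rate
    # mix[i, j]: fraction of population i's contacts made with population j
    # y0: array of shape (2, 4) holding S, E, I, R for each population
    y = np.array(y0, dtype=float)
    beta = np.asarray(beta, dtype=float)
    mix = np.asarray(mix, dtype=float)
    traj = [y.copy()]
    for _ in range(int(round(days / dt))):
        S, E, I, R = y.T
        N = y.sum(axis=1)
        force = beta * (mix @ (I / N))  # force of infection per population
        new_exposed = force * S
        new_infectious = sigma * E
        new_removed = gamma * I
        dy = np.column_stack([
            -new_exposed,
            new_exposed - new_infectious,
            new_infectious - new_removed,
            new_removed,
        ])
        y = y + dt * dy  # forward Euler step
        traj.append(y.copy())
    return np.array(traj)  # shape (steps + 1, 2, 4)

# Row 0: university (30,000 people), row 1: surrounding city (assumed size).
y0 = [[29990, 0, 10, 0], [199900, 0, 100, 0]]
traj = seir_two_pop(
    beta=[0.6, 0.4],            # assumed: higher transmission on campus
    sigma=1 / 3, gamma=1 / 7,   # assumed rates (per day)
    mix=[[0.7, 0.3], [0.05, 0.95]],
    y0=y0, days=30,
)
```

Because each Euler step only moves people between compartments, the total in each population stays constant, which is a quick sanity check on the implementation.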

8.
PLoS One ; 9(4): e93379, 2014.
Article in English | MEDLINE | ID: mdl-24695491

ABSTRACT

The primary goal of genome-wide association studies (GWAS) is to discover variants that could lead, in isolation or in combination, to a particular trait or disease. Standard approaches to GWAS, however, are usually based on univariate hypothesis tests and therefore can account neither for correlations due to linkage disequilibrium nor for combinations of several markers. To discover and leverage such potential multivariate interactions, we propose in this work an extension of the Random Forest algorithm tailored for structured GWAS data. In terms of risk prediction, we show empirically on several GWAS datasets that the proposed T-Trees method significantly outperforms both the original Random Forest algorithm and standard linear models, thereby suggesting the actual existence of multivariate non-linear effects due to the combinations of several SNPs. We also demonstrate that variable importances as derived from our method can help identify relevant loci. Finally, we highlight the strong impact that quality control procedures may have, both in terms of predictive power and loci identification. Variable importance results and T-Trees source code are all available at www.montefiore.ulg.ac.be/~botta/ttrees/ and github.com/0asa/TTree-source respectively.


Subjects
Polymorphism, Single Nucleotide/genetics , Algorithms , Genetic Loci/genetics , Genome-Wide Association Study/methods , Humans , Linear Models , Linkage Disequilibrium/genetics , Models, Genetic , Risk , Software