Results 1 - 14 of 14
1.
J Phys Chem A ; 127(27): 5745-5759, 2023 Jul 13.
Article in English | MEDLINE | ID: mdl-37381078

ABSTRACT

Markov State Models (MSMs) and related techniques have gained significant traction as tools for analyzing and guiding molecular dynamics (MD) simulations due to their ability to extract structural, thermodynamic, and kinetic information on proteins using computationally feasible MD simulations. The MSM analysis often relies on spectral decomposition of empirically generated transition matrices. This work discusses an alternative approach for extracting the thermodynamic and kinetic information from the so-called rate/generator matrix rather than the transition matrix. Although the rate matrix itself is built from the empirical transition matrix, it provides an alternative approach for estimating both thermodynamic and kinetic quantities, particularly in diffusive processes. A fundamental issue with this approach is known as the embeddability problem. The key contribution of this work is the introduction of a novel method to address the embeddability problem as well as the collection and utilization of existing algorithms previously used in the literature. The algorithms are tested on data from a one-dimensional toy model to show the workings of these methods and to discuss how the robustness of each method depends on lag time and trajectory length.
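
As a small illustration of the embeddability problem mentioned in this abstract, the sketch below uses the closed-form matrix logarithm of a two-state transition matrix. It is not the method proposed in the paper: the function name `two_state_rate_matrix` and the parameters `a`, `b`, and `tau` are hypothetical, and the two-state restriction is an assumption made only so the example stays self-contained.

```python
import math

def two_state_rate_matrix(a, b, tau):
    """Rate matrix K with expm(K * tau) == T for T = [[1-a, a], [b, 1-b]].

    a and b are transition probabilities observed at lag time tau.
    Returns None when T has no valid rate matrix (not embeddable).
    """
    if not (0.0 <= a <= 1.0 and 0.0 <= b <= 1.0):
        raise ValueError("a and b must be probabilities")
    s = a + b
    if s == 0.0:
        # No transitions observed: the zero generator reproduces T = I.
        return [[0.0, 0.0], [0.0, 0.0]]
    if s >= 1.0:
        # The second eigenvalue of T, 1 - s, is not positive, so the
        # real matrix logarithm with non-negative off-diagonals is missing.
        return None
    # T = I + M with M = [[-a, a], [b, -b]] and M @ M = -s * M, so
    # expm(c * M) = I + M * (1 - exp(-c * s)) / s; solve for c * tau.
    scale = -math.log(1.0 - s) / (s * tau)
    return [[-a * scale, a * scale], [b * scale, -b * scale]]

K = two_state_rate_matrix(0.1, 0.2, tau=1.0)
not_embeddable = two_state_rate_matrix(0.6, 0.7, tau=1.0)
```

In this toy case the rows of `K` sum to zero and its off-diagonal entries are non-negative, while the second call returns `None` because the empirical matrix cannot arise from any continuous-time Markov chain.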

2.
Microsc Microanal ; : 1-14, 2022 Mar 28.
Article in English | MEDLINE | ID: mdl-35343415

ABSTRACT

Spatially resolved in situ transmission electron microscopy (TEM), equipped with direct electron detection systems, is a suitable technique for recording atomic-scale dynamics in materials with millisecond temporal resolution. However, characterizing dynamics or fluxional behavior requires processing short-exposure images, which usually have severely degraded signal-to-noise ratios. The poor signal-to-noise ratio associated with high temporal resolution makes it challenging to determine the position and intensity of atomic columns in materials undergoing structural dynamics. To address this challenge, we propose a noise-robust processing approach based on blob detection, which is well established in the computer vision community for identifying objects in images. In particular, a blob detection algorithm has been tailored to deal with noisy TEM image series from nanoparticle systems. In the presence of high noise content, our blob detection approach is demonstrated to outperform other algorithms, enabling the determination of atomic column positions and intensities with a higher degree of precision.

3.
Article in English | MEDLINE | ID: mdl-35992040

ABSTRACT

Independent component analysis (ICA) is an unsupervised learning method popular in functional magnetic resonance imaging (fMRI). Group ICA has been used to search for biomarkers in neurological disorders including autism spectrum disorder and dementia. However, current methods use a principal component analysis (PCA) step that may remove low-variance features. Linear non-Gaussian component analysis (LNGCA) enables simultaneous dimension reduction and feature estimation including low-variance features in single-subject fMRI. A group LNGCA model is proposed to extract group components shared by more than one subject. Unlike group ICA methods, this novel approach also estimates individual (subject-specific) components orthogonal to the group components. To determine the total number of components in each subject, a parametric resampling test is proposed that samples spatially correlated Gaussian noise to match the spatial dependence observed in data. In simulations, estimated group components achieve higher accuracy compared to group ICA. The method is applied to a resting-state fMRI study on autism spectrum disorder in 342 children (252 typically developing, 90 with autism), where the group signals include resting-state networks. The discovered group components appear to exhibit different levels of temporal engagement in autism versus typically developing children, as revealed using group LNGCA. This novel approach to matrix decomposition is a promising direction for feature detection in neuroimaging.

4.
Neuroimage ; 142: 280-292, 2016 Nov 15.
Article in English | MEDLINE | ID: mdl-27211473

ABSTRACT

Estimating spatiotemporal models for multi-subject fMRI is computationally challenging. We propose a mixed model for localization studies with spatial random effects and time-series errors. We develop method-of-moment estimators that leverage population and spatial information and are scalable to massive datasets. In simulations, subject-specific estimates of activation are considerably more accurate than the standard voxel-wise general linear model. Our mixed model also allows for valid population inference. We apply our model to cortical data from motor and theory of mind tasks from the Human Connectome Project (HCP). The proposed method results in subject-specific predictions that appear smoother and less noisy than those from the popular single-subject univariate approach. In particular, the regions of motor cortex associated with a left-hand finger-tapping task appear to be more clearly delineated. Subject-specific maps of activation from task fMRI are increasingly used in pre-surgical planning for tumor removal and in locating targets for transcranial magnetic stimulation. Our findings suggest that using spatial and population information is a promising avenue for improving clinical neuroimaging.


Subjects
Connectome/methods , Data Interpretation, Statistical , Magnetic Resonance Imaging/methods , Models, Statistical , Humans , Motor Activity/physiology , Motor Cortex/diagnostic imaging , Motor Cortex/physiology , Theory of Mind/physiology
5.
Biometrics ; 70(1): 224-36, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24350655

ABSTRACT

We examine differences between independent component analyses (ICAs) arising from different assumptions, measures of dependence, and starting points of the algorithms. ICA is a popular method with diverse applications including artifact removal in electrophysiology data, feature extraction in microarray data, and identifying brain networks in functional magnetic resonance imaging (fMRI). ICA can be viewed as a generalization of principal component analysis (PCA) that takes into account higher-order cross-correlations. Whereas the PCA solution is unique, there are many ICA methods, whose solutions may differ. Infomax, FastICA, and JADE are commonly applied to fMRI studies, with FastICA being arguably the most popular. Hastie and Tibshirani (2003) demonstrated that ProDenICA outperformed FastICA in simulations with two components. We introduce the application of ProDenICA to simulations with more components and to fMRI data. ProDenICA was more accurate in simulations, and we identified differences between biologically meaningful ICs from ProDenICA versus other methods in the fMRI analysis. ICA methods require nonconvex optimization, yet current practices do not recognize the importance of, nor adequately address sensitivity to, initial values. We found that local optima led to dramatically different estimates in both simulations and group ICA of fMRI, and we provide evidence that the global optimum from ProDenICA is the best estimate. We applied a modification of the Hungarian (Kuhn-Munkres) algorithm to match ICs from multiple estimates, thereby gaining novel insights into how brain networks vary in their sensitivity to initial values and ICA method.
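
The matching step described at the end of this abstract is an assignment problem: pair each component from one estimate with exactly one component from another so that total similarity is maximal. The sketch below solves that problem by brute force over permutations, which is only practical for a handful of components and is not the modified Hungarian algorithm used by the authors. The similarity measure (absolute Pearson correlation, since ICA components have arbitrary sign), the function names, and the toy data are all assumptions.

```python
from itertools import permutations

def correlation(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / (var_x * var_y) ** 0.5

def match_components(run_a, run_b):
    """Return perm such that run_a[i] is paired with run_b[perm[i]].

    Assumes both runs contain the same number of components.
    """
    # Absolute correlation ignores the sign ambiguity of ICA components.
    sim = [[abs(correlation(a, b)) for b in run_b] for a in run_a]
    k = len(run_a)
    # Exhaustive search over all one-to-one pairings (k! candidates).
    best = max(permutations(range(k)),
               key=lambda p: sum(sim[i][p[i]] for i in range(k)))
    return list(best)

# Hypothetical components from two runs: same signals, reordered,
# one sign-flipped and one slightly perturbed.
run_a = [[1.0, 2.0, 3.0, 4.0], [1.0, -1.0, 1.0, -1.0]]
run_b = [[-1.0, 1.0, -1.0, 1.0], [2.0, 4.0, 6.0, 8.5]]
pairing = match_components(run_a, run_b)
```

Here `pairing` maps the first component of `run_a` to the second of `run_b` and vice versa. For realistic numbers of components, a polynomial-time assignment solver would replace the permutation search.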


Subjects
Algorithms , Brain Mapping/methods , Magnetic Resonance Imaging/methods , Models, Statistical , Principal Component Analysis/methods , Adolescent , Attention Deficit Disorder with Hyperactivity/diagnosis , Attention Deficit Disorder with Hyperactivity/physiopathology , Child , Child, Preschool , Computer Simulation , Humans
6.
J Appl Stat ; 51(6): 1210-1226, 2024.
Article in English | MEDLINE | ID: mdl-38628445

ABSTRACT

We examine the use of time series data, derived from Electric Cell-substrate Impedance Sensing (ECIS), to differentiate between standard mammalian cell cultures and those infected with a mycoplasma organism. With the goal of easy visualization and interpretation, we perform low-dimensional feature-based classification, extracting application-relevant features from the ECIS time courses. We can achieve very high classification accuracy using only two features, which depend on the cell line under examination. Initial results also show the existence of experimental variation between plates and suggest types of features that may prove more robust to such variation. Our paper is the first to perform a broad examination of ECIS time course features in the context of detecting contamination; to combine different types of features to achieve classification accuracy while preserving interpretability; and to describe and suggest possibilities for ameliorating plate-to-plate variation.

7.
Surgery ; 175(1): 121-127, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37925261

ABSTRACT

BACKGROUND: Machine learning has been increasingly used to develop algorithms that can improve medical diagnostics and prognostication and has shown promise in improving the classification of thyroid ultrasound images. This proof-of-concept study aims to develop a multimodal machine-learning model to classify follicular carcinoma from adenoma. METHODS: This is a retrospective study of patients with follicular adenoma or carcinoma at a single institution between 2010 and 2022. Demographics, imaging, and perioperative variables were collected. The region of interest was annotated on ultrasound and used to perform radiomics analysis. Imaging features and clinical variables were then used to create a random forest classifier to predict malignancy. Leave-one-out cross-validation was conducted to evaluate classifier performance using the area under the receiver operating characteristic curve. RESULTS: Patients with follicular adenomas (n = 7) and carcinomas (n = 11) with complete imaging and perioperative data were included. A total of 910 features were extracted from each image. The t-distributed stochastic neighbor embedding method reduced the dimension to 2 primary represented components. The random forest classifier achieved an area under the receiver operating characteristic curve of 0.76 (clinical only), 0.29 (image only), and 0.79 (multimodal data). CONCLUSION: Our multimodal machine learning model demonstrates promising results in classifying follicular carcinoma from adenoma. This approach can potentially be applied in future studies to generate models for preoperative differentiation of follicular thyroid neoplasms.


Subjects
Adenocarcinoma, Follicular , Adenoma , Thyroid Neoplasms , Humans , Artificial Intelligence , Retrospective Studies , Thyroid Neoplasms/diagnostic imaging , Thyroid Neoplasms/pathology , Adenocarcinoma, Follicular/diagnostic imaging , Adenoma/diagnostic imaging
8.
J Am Stat Assoc ; 118(541): 571-582, 2023.
Article in English | MEDLINE | ID: mdl-37346226

ABSTRACT

The Vector AutoRegressive Moving Average (VARMA) model is fundamental to the theory of multivariate time series; however, identifiability issues have led practitioners to abandon it in favor of the simpler but more restrictive Vector AutoRegressive (VAR) model. We narrow this gap with a new optimization-based approach to VARMA identification built upon the principle of parsimony. Among all equivalent data-generating models, we use convex optimization to seek the parameterization that is simplest in a certain sense. A user-specified strongly convex penalty is used to measure model simplicity, and that same penalty is then used to define an estimator that can be efficiently computed. We establish consistency of our estimators in a double-asymptotic regime. Our non-asymptotic error bound analysis accommodates both model specification and parameter estimation steps, a feature that is crucial for studying large-scale VARMA algorithms. Our analysis also provides new results on penalized estimation of infinite-order VAR, and elastic net regression under a singular covariance structure of regressors, which may be of independent interest. We illustrate the advantage of our method over VAR alternatives on three real data examples.

9.
Data Min Knowl Discov ; 35(5): 1882-1905, 2021.
Article in English | MEDLINE | ID: mdl-34177356

ABSTRACT

The ability to accurately and consistently discover anomalies in time series is important in many applications. Fields such as finance (fraud detection), information security (intrusion detection), healthcare, and others all benefit from anomaly detection. Intuitively, anomalies in time series are time points or sequences of time points that deviate from normal behavior characterized by periodic oscillations and long-term trends. For example, the typical activity on e-commerce websites exhibits weekly periodicity and grows steadily before holidays. Similarly, domestic usage of electricity exhibits daily and weekly oscillations combined with long-term season-dependent trends. How can we accurately detect anomalies in such domains while simultaneously learning a model for normal behavior? We propose a robust offline unsupervised framework for anomaly detection in seasonal multivariate time series, called AURORA. A key innovation in our framework is a general background behavior model that unifies periodicity and long-term trends. To this end, we leverage a Ramanujan periodic dictionary and a spline-based dictionary to capture both seasonal and trend patterns. We conduct experiments on both synthetic and real-world datasets and demonstrate the effectiveness of our method. AURORA has significant advantages over existing models for anomaly detection, including high accuracy (AUC of up to 0.98), interpretability of recovered normal behavior (100% accuracy in period detection), and the ability to detect both point and contextual anomalies. In addition, AURORA is orders of magnitude faster than baselines.

10.
PLoS One ; 16(9): e0255519, 2021.
Article in English | MEDLINE | ID: mdl-34495951

ABSTRACT

Advances in remote sensing and machine learning enable increasingly accurate, inexpensive, and timely estimation of poverty and malnutrition indicators to guide development and humanitarian agencies' programming. However, state of the art models often rely on proprietary data and/or deep or transfer learning methods whose underlying mechanics may be challenging to interpret. We demonstrate how interpretable random forest models can produce estimates of a set of (potentially correlated) malnutrition and poverty prevalence measures using free, open access, regularly updated, georeferenced data. We demonstrate two use cases: contemporaneous prediction, which might be used for poverty mapping, geographic targeting, or monitoring and evaluation tasks, and a sequential nowcasting task that can inform early warning systems. Applied to data from 11 low and lower-middle income countries, we find predictive accuracy broadly comparable for both tasks to prior studies that use proprietary data and/or deep or transfer learning methods.


Subjects
Machine Learning , Malnutrition/epidemiology , Poverty/statistics & numerical data , Social Problems/statistics & numerical data , Developing Countries/economics , Developing Countries/statistics & numerical data , Humans , Malnutrition/economics , Multivariate Analysis , Prevalence
11.
Environ Syst Decis ; 41(4): 594-615, 2021.
Article in English | MEDLINE | ID: mdl-34306961

ABSTRACT

The electric power grid is a critical societal resource connecting multiple infrastructural domains such as agriculture, transportation, and manufacturing. The electrical grid as an infrastructure is shaped by human activity and public policy in terms of demand and supply requirements. Further, the grid is subject to changes and stresses due to diverse factors including solar weather, climate, hydrology, and ecology. The emerging interconnected and complex network dependencies make such interactions increasingly dynamic, posing novel risks, and presenting new challenges to manage the coupled human-natural system. This paper provides a survey of models and methods that seek to explore the significant interconnected impact of the electric power grid and interdependent domains. We also provide relevant critical risk indicators (CRIs) across diverse domains that may be used to assess risks to electric grid reliability, including climate, ecology, hydrology, finance, space weather, and agriculture. We discuss the convergence of indicators from individual domains to explore possible systemic risk, i.e., holistic risk arising from cross-domain interconnections. Further, we propose a compositional approach to risk assessment that incorporates diverse domain expertise and information, data science, and computer science to identify domain-specific CRIs and their union in systemic risk indicators. Our study provides an important first step towards data-driven analysis and predictive modeling of risks in interconnected human-natural systems.

12.
Stat (Int Stat Inst) ; 9(1): e302, 2020 Dec.
Article in English | MEDLINE | ID: mdl-32837718

ABSTRACT

Social distancing measures have been imposed across the United States in order to stem the spread of COVID-19. We quantify the reduction in the doubling rate, by state, that is associated with this intervention. Using the earlier of K-12 school closures and restaurant closures, by state, to define the start of the intervention, and considering daily confirmed cases through April 23, 2020, we find that social distancing is associated with a statistically significant (p < 0.01) reduction in the doubling rate for all states except for Nebraska, North Dakota, and South Dakota, when controlling for false discovery, with the doubling rate averaged across the states falling from 0.302 (0.285, 0.320) days⁻¹ to 0.010 (-0.007, 0.028) days⁻¹. However, we do not find that social distancing has made the spread subcritical. Instead, social distancing has merely stabilized the spread of the disease. We provide an illustration of our findings for each state, including estimates of the effective reproduction number, R, both with and without social distancing. We also discuss the policy implications of our findings.
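
The abstract reports doubling rates in units of days⁻¹ but does not define the quantity. The sketch below assumes one common reading, the reciprocal of the doubling time, estimated as the least-squares slope of log2(case count) against day index. That definition, the function name `doubling_rate`, and the case counts are assumptions for illustration, not the authors' estimator or data.

```python
import math

def doubling_rate(cases):
    """Least-squares slope of log2(cases) versus day index, in 1/days.

    A value of 1.0 means the count doubles every day; 0.0 means no growth.
    All counts must be positive.
    """
    n = len(cases)
    days = list(range(n))
    logs = [math.log2(c) for c in cases]
    mean_t = sum(days) / n
    mean_y = sum(logs) / n
    num = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, logs))
    den = sum((t - mean_t) ** 2 for t in days)
    return num / den

# Hypothetical series: counts doubling daily, then a flat series.
growing = doubling_rate([100, 200, 400, 800])
stable = doubling_rate([500, 500, 500, 500])
```

Under this definition, a rate near zero corresponds to a roughly constant count, which matches the abstract's description of the spread as stabilized.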

13.
Int J Biostat ; 16(1), 2019 Dec 05.
Article in English | MEDLINE | ID: mdl-31811802

ABSTRACT

We present new methods for cell line classification using multivariate time series bioimpedance data obtained from electric cell-substrate impedance sensing (ECIS) technology. The ECIS technology, which monitors the attachment and spreading of mammalian cells in real time through the collection of electrical impedance data, has historically been used to study one cell line at a time. However, we show that if applied to data from multiple cell lines, ECIS can be used to classify unknown or potentially mislabeled cells, factors which have previously been associated with the reproducibility crisis in the biological literature. We assess a range of approaches to this new problem, testing different classification methods and deriving a dictionary of 29 features to characterize ECIS data. Most notably, our analysis enriches the current field by making use of simultaneous multi-frequency ECIS data, where previous studies have focused on only one frequency; using classification methods to distinguish multiple cell lines, rather than simple statistical tests that compare only two cell lines; and assessing a range of features derived from ECIS data based on their classification performance. In classification tests on fifteen mammalian cell lines, we obtain very high out-of-sample predictive accuracy. These preliminary findings provide a baseline for future large-scale studies in this field.


Subjects
Biophysics/methods , Cell Line/classification , Cytological Techniques/methods , Supervised Machine Learning , Animals , Electric Impedance , Humans
14.
Clin Epidemiol ; 10: 1801-1816, 2018.
Article in English | MEDLINE | ID: mdl-30584374

ABSTRACT

PURPOSE: A reliable definition of exposure and knowledge about long-term medication patterns is important for drug safety studies during pregnancy. Few studies have investigated these measures for thyroid hormone replacement therapy (THRT). The purpose of this study was to 1) calculate the agreement between self-report and dispensed prescriptions of THRT and 2) classify women with similar adherence patterns to THRT into disjoint longitudinal trajectories. METHODS: Our analysis used data from the Norwegian Mother and Child Cohort Study (MoBa), a prospective population-based cohort study. MoBa was linked to prescription records from the Norwegian Prescription Database (NorPD). We estimated Cohen's kappa coefficients (k) and approximate 95% CIs for agreement between self-report and prescription records for the 6-month period prior to pregnancy and for each pregnancy trimester. Using group-based trajectory models (GBTMs), we estimated adherence trajectories among women who self-reported and had a THRT prescription. RESULTS: There were 56,148 women in MoBa, who had both a record in NorPD and available prescription history up to 1 year prior to pregnancy. Of these, 1,171 (2.1%) self-reported and received a prescription for THRT. Agreement was "perfect" in the 6-month period prior to pregnancy (k=0.86; CI 0.85-0.88), in the first (k=0.83; CI 0.82-0.85) and in the second trimesters (k=0.89; CI 0.87-0.90), while this was moderate (k=0.57; CI 0.54-0.59) in the third trimester. Among the subset of the 1,171 women, we identified four disjoint GBTM adherence groups: Constant-High (50.2%), Constant-Medium (32.9%), Increasing-Medium (11.0%), and Decreasing-Low (5.8%). CONCLUSION: Agreement between self-report and prescription records was high for THRT in the early pregnancy period. Based on our GBTM results, about one in two women with hypothyroidism had adequate adherence to prescribed THRT throughout pregnancy. Given the potential consequences, evidence of low adherence in 5.8% of pregnant women with hypothyroidism is of concern.
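
Cohen's kappa, the agreement statistic reported above, compares observed agreement between two sources with the agreement expected by chance. A minimal sketch for a 2x2 table follows. The function name, argument names, and counts are hypothetical and are not taken from the MoBa or NorPD data, and the confidence-interval calculation used in the study is not reproduced.

```python
def cohens_kappa(both_yes, only_a, only_b, both_no):
    """Cohen's kappa for two binary sources A and B from 2x2 cell counts.

    both_yes: A and B both report exposure; only_a / only_b: one source only;
    both_no: neither reports exposure. Undefined if chance agreement is 1.
    """
    n = both_yes + only_a + only_b + both_no
    observed = (both_yes + both_no) / n
    a_yes = (both_yes + only_a) / n  # marginal "yes" proportion for source A
    b_yes = (both_yes + only_b) / n  # marginal "yes" proportion for source B
    # Agreement expected by chance if A and B were independent.
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    return (observed - expected) / (1 - expected)

# Hypothetical counts, e.g. self-report (A) versus prescription record (B).
kappa = cohens_kappa(both_yes=40, only_a=10, only_b=5, both_no=45)
```

With these illustrative counts, observed agreement is 0.85 and chance agreement is 0.50, giving a kappa of 0.70.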
