1.
J Chem Inf Model ; 62(3): 423-432, 2022 02 14.
Article in English | MEDLINE | ID: mdl-35029112

ABSTRACT

PoreMatMod.jl is a free, open-source, user-friendly, and documented Julia package for modifying crystal structure models of porous materials such as metal-organic frameworks (MOFs). PoreMatMod.jl functions as a find-and-replace algorithm on crystal structures by leveraging (i) Ullmann's algorithm to search for subgraphs of the crystal structure graph that are isomorphic to the graph of a query fragment and (ii) the orthogonal Procrustes algorithm to align a replacement fragment with a targeted substructure of the crystal structure for installation. The prominent application of PoreMatMod.jl is to generate libraries of hypothetical structures for virtual screenings. For example, one can install functional groups on the linkers of a parent MOF, mimicking postsynthetic modification. Other applications of PoreMatMod.jl to modify crystal structure models include introducing defects with precision and correcting artifacts of X-ray structure determination (adding missing hydrogen atoms, resolving disorder, and removing guest molecules). The find-and-replace operations implemented by PoreMatMod.jl can be applied broadly to diverse atomistic systems for various in silico structural modification tasks.
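The "find" step described above can be illustrated with a small labeled-graph search. The following is a minimal Python sketch, not PoreMatMod.jl's API: it uses plain backtracking with a label-and-degree candidate filter in the spirit of Ullmann's algorithm, and it omits Ullmann's refinement step and any handling of periodic boundaries. The function name, the graph encoding (adjacency sets), and the toy fragment are all assumptions made for illustration.

```python
from typing import Dict, List, Set

Graph = Dict[int, Set[int]]  # undirected graph as adjacency sets


def find_subgraph_matches(query: Graph, q_labels: Dict[int, str],
                          parent: Graph, p_labels: Dict[int, str]) -> List[Dict[int, int]]:
    """Return every injective map query-node -> parent-node that preserves
    atom labels and sends each query edge onto a parent edge."""
    # Visit high-degree query nodes first so dead ends are found early.
    order = sorted(query, key=lambda q: -len(query[q]))
    # Candidate filter: same label and at least as many neighbours.
    candidates = {
        q: [p for p in parent
            if p_labels[p] == q_labels[q] and len(parent[p]) >= len(query[q])]
        for q in query
    }
    matches: List[Dict[int, int]] = []

    def extend(depth: int, mapping: Dict[int, int]) -> None:
        if depth == len(order):
            matches.append(dict(mapping))
            return
        q = order[depth]
        for p in candidates[q]:
            if p in mapping.values():
                continue  # keep the map injective
            # Every already-mapped neighbour of q must land on a neighbour of p.
            if all(mapping[n] in parent[p] for n in query[q] if n in mapping):
                mapping[q] = p
                extend(depth + 1, mapping)
                del mapping[q]

    extend(0, {})
    return matches


# Toy parent "structure": a C-C-O-H chain. Toy query fragment: a C-O bond.
parent = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
p_labels = {0: "C", 1: "C", 2: "O", 3: "H"}
query = {0: {1}, 1: {0}}
q_labels = {0: "C", 1: "O"}
matches = find_subgraph_matches(query, q_labels, parent, p_labels)
print(matches)
```

Each returned dictionary identifies one targeted substructure. In a full find-and-replace workflow, the matched atoms would then be the points against which a replacement fragment is aligned (the abstract names the orthogonal Procrustes algorithm for that step).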


Subjects
Algorithms, Metal-Organic Frameworks, Metal-Organic Frameworks/chemistry, Porosity
2.
J Chem Phys ; 157(3): 034102, 2022 Jul 21.
Article in English | MEDLINE | ID: mdl-35868929

ABSTRACT

Pesticides benefit agriculture by increasing crop yield, quality, and security. However, pesticides may inadvertently harm bees, which are valuable as pollinators. Thus, candidate pesticides in development pipelines must be assessed for toxicity to bees. Leveraging a dataset of 382 molecules with toxicity labels from honey bee exposure experiments, we train a support vector machine (SVM) to predict the toxicity of pesticides to honey bees. We compare two representations of the pesticide molecules: (i) a random walk feature vector listing counts of length-L walks on the molecular graph with each vertex- and edge-label sequence and (ii) the Molecular ACCess System (MACCS) structural key fingerprint (FP), a bit vector indicating the presence/absence of a list of pre-defined subgraph patterns in the molecular graph. We explicitly construct the MACCS FPs but rely on the fixed-length-L random walk graph kernel (RWGK) in place of the dot product for the random walk representation. The L-RWGK-SVM achieves an accuracy, precision, recall, and F1 score (mean over 2000 runs) of 0.81, 0.68, 0.71, and 0.69, respectively, on the test data set, with L = 4 being the modal (most frequent) optimal walk length. The MACCS-FP-SVM performs on par with or marginally better than the L-RWGK-SVM and lends more interpretability, but varies more in performance. We interpret the MACCS-FP-SVM by illuminating which subgraph patterns in the molecules tend to strongly push them toward the toxic/non-toxic side of the separating hyperplane.
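The relationship between the explicit random walk feature vector and the kernel that replaces its dot product can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' code: the molecule encoding, the function names, and the two toy heavy-atom skeletons are hypothetical. `walk_counts` builds the explicit feature vector, while `rw_kernel` obtains the same dot product by dynamic programming over pairs of label-matched vertices, without enumerating label sequences.

```python
from collections import Counter
from typing import Dict, Tuple

# A molecule is (vertex_labels, edges); edges maps an undirected pair (u, v) to a bond label.
Molecule = Tuple[Dict[int, str], Dict[Tuple[int, int], str]]


def neighbours(mol: Molecule) -> Dict[int, list]:
    """Adjacency list with (neighbour, bond label) entries."""
    labels, edges = mol
    adj = {v: [] for v in labels}
    for (u, v), bond in edges.items():
        adj[u].append((v, bond))
        adj[v].append((u, bond))
    return adj


def walk_counts(mol: Molecule, length: int) -> Counter:
    """Explicit feature vector: number of length-`length` walks per label sequence."""
    labels = mol[0]
    adj = neighbours(mol)
    # Key: (current end vertex, label sequence so far) -> number of walks.
    frontier = Counter({(v, (labels[v],)): 1 for v in labels})
    for _ in range(length):
        nxt = Counter()
        for (v, seq), n in frontier.items():
            for u, bond in adj[v]:
                nxt[(u, seq + (bond, labels[u]))] += n
        frontier = nxt
    counts = Counter()
    for (_, seq), n in frontier.items():
        counts[seq] += n
    return counts


def rw_kernel(a: Molecule, b: Molecule, length: int) -> int:
    """Dot product of the two walk-count vectors, computed implicitly."""
    la, lb = a[0], b[0]
    adj_a, adj_b = neighbours(a), neighbours(b)
    pairs = [(v, x) for v in la for x in lb if la[v] == lb[x]]
    # w[(v, x)]: number of label-matched walk pairs of the current length starting at (v, x).
    w = {p: 1 for p in pairs}
    for _ in range(length):
        w = {
            (v, x): sum(w.get((u, y), 0)
                        for u, bu in adj_a[v] for y, by in adj_b[x] if bu == by)
            for v, x in pairs
        }
    return sum(w.values())


# Toy heavy-atom skeletons (hydrogens omitted): C-C-O and C-O, single bonds only.
ethanol = ({0: "C", 1: "C", 2: "O"}, {(0, 1): "-", (1, 2): "-"})
methanol = ({0: "C", 1: "O"}, {(0, 1): "-"})
print(rw_kernel(ethanol, methanol, 1))
```

A matrix of such kernel values over all molecule pairs could then be handed to an SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.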


Subjects
Pesticides, Animals, Bees, Pesticides/analysis, Pesticides/toxicity, Support Vector Machine
3.
Am J Epidemiol ; 189(1): 55-67, 2020 01 31.
Article in English | MEDLINE | ID: mdl-31595960

ABSTRACT

Heterogeneous exposure associations (HEAs) can be defined as differences in the association of an exposure with an outcome among subgroups that differ by a set of characteristics. In this article, we intend to foster discussion of HEAs in the epidemiologic literature and present a variant of the random forest algorithm that can be used to identify HEAs. We demonstrate the use of this algorithm in the setting of the association between systolic blood pressure and death in older adults. The training set included pooled data from the baseline examination of the Cardiovascular Health Study (1989-1993), the Health, Aging, and Body Composition Study (1997-1998), and the Sacramento Area Latino Study on Aging (1998-1999). The test set included data from the National Health and Nutrition Examination Survey (1999-2002). The hazard ratios ranged from 1.25 (95% confidence interval: 1.13, 1.37) per 10-mm Hg increase in systolic blood pressure among men aged ≤67 years with diastolic blood pressure greater than 80 mm Hg to 1.00 (95% confidence interval: 0.96, 1.03) among women with creatinine concentration ≤0.7 mg/dL and a history of hypertension. HEAs have the potential to improve our understanding of disease mechanisms in diverse populations and guide the design of randomized controlled trials to control exposures in heterogeneous populations.


Subjects
Blood Pressure, Statistical Data Interpretation, Epidemiologic Methods, Hypertension/mortality, Observational Studies as Topic/statistics & numerical data, Aged, Algorithms, Blood Pressure Determination, Cohort Studies, Female, Humans, Hypertension/etiology, Male, Nutrition Surveys, Proportional Hazards Models
4.
J Acoust Soc Am ; 131(6): 4640-50, 2012 Jun.
Article in English | MEDLINE | ID: mdl-22712937

ABSTRACT

Although field-collected recordings typically contain multiple simultaneously vocalizing birds of different species, acoustic species classification in this setting has received little study so far. This work formulates the problem of classifying the set of species present in an audio recording using the multi-instance multi-label (MIML) framework for machine learning, and proposes a MIML bag generator for audio, i.e., an algorithm which transforms an input audio signal into a bag-of-instances representation suitable for use with MIML classifiers. The proposed representation uses a 2D time-frequency segmentation of the audio signal, which can separate bird sounds that overlap in time. Experiments using audio data containing 13 species collected with unattended omnidirectional microphones in the H. J. Andrews Experimental Forest demonstrate that the proposed methods achieve high accuracy (96.1% true positives/negatives). Automated detection of bird species occurrence using MIML has many potential applications, particularly in long-term monitoring of remote sites, species distribution modeling, and conservation planning.


Subjects
Acoustics, Birds/classification, Animal Vocalization/classification, Algorithms, Animals, Birds/physiology, Noise/prevention & control, Perceptual Masking/physiology, Reproducibility of Results, Sound Spectrography, Tape Recording, Animal Vocalization/physiology
5.
mSystems ; 7(1): e0105821, 2022 02 22.
Article in English | MEDLINE | ID: mdl-35040699

ABSTRACT

A growing body of research has established that the microbiome can mediate the dynamics and functional capacities of diverse biological systems. Yet, we understand little about what governs the response of these microbial communities to host or environmental changes. Most efforts to model microbiomes focus on defining the relationships between the microbiome, host, and environmental features within a specified study system and therefore fail to capture those that may be evident across multiple systems. In parallel with these developments in microbiome research, computer scientists have developed a variety of machine learning tools that can identify subtle, but informative, patterns from complex data. Here, we recommend using deep transfer learning to resolve microbiome patterns that transcend study systems. By leveraging diverse public data sets in an unsupervised way, such models can learn contextual relationships between features and build on those patterns to perform subsequent tasks (e.g., classification) within specific biological contexts.


Subjects
Microbiota, Microbiota/physiology, Machine Learning
6.
J Phys Condens Matter ; 33(46), 2021 Sep 07.
Article in English | MEDLINE | ID: mdl-34404041

ABSTRACT

Metal-organic frameworks (MOFs) are nanoporous materials with good prospects as recognition elements for gas sensors owing to their adsorptive sensitivity and selectivity. A gravimetric, MOF-based sensor functions by measuring the mass of gas adsorbed in a MOF. Changes in the gas composition are expected to produce detectable changes in the mass of gas adsorbed in the MOF. In practical settings, multiple components of the gas adsorb into the MOF and contribute to the sensor response. As a result, there are typically many distinct gas compositions that produce the same single-sensor response. The response vector of a gas sensor array places multiple constraints on the gas composition. Still, if the number of degrees of freedom in the gas composition is greater than the number of MOFs in the sensor array, the map from gas compositions to response vectors will be non-injective (many-to-one). Here, we outline a mathematical method to determine undetectable changes in gas composition to which non-injective gas sensor arrays are unresponsive. This is important for understanding their limitations and vulnerabilities. We focus on gravimetric, MOF-based gas sensor arrays. Our method relies on a mixed-gas adsorption model in the MOFs comprising the sensor array, which gives the mass of gas adsorbed in each MOF as a function of the gas composition. The singular value decomposition of the Jacobian matrix of the adsorption model uncovers (i) the unresponsive directions and (ii) the responsive directions, ranked by sensitivity, in gas composition space. We illustrate the identification of unresponsive subspaces and ranked responsive directions for gas sensor arrays based on Co-MOF-74 and HKUST-1 aimed at quantitative sensing of CH₄/N₂/CO₂/C₂H₆ mixtures relevant to natural gas sensing.
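The idea of an unresponsive direction can be made concrete with a toy array of two sensors and three gas components. The sketch below is an assumption-laden illustration in Python, not the paper's method or data: the Langmuir-type model, its affinity constants, and the operating point are invented. It also swaps the singular value decomposition for a shortcut that only works in this 2 x 3 case, where the null space of the Jacobian is spanned by the cross product of its two rows (the same direction, up to sign, that the SVD would return as the right singular vector with zero singular value).

```python
from typing import List, Sequence

# Hypothetical affinities K[i][j] of MOF i for gas j (arbitrary units, made up).
K = [[2.0, 0.5, 4.0],
     [1.0, 3.0, 0.2]]
MASS = [16.0, 28.0, 44.0]  # approximate molar masses of CH4, N2, CO2 in g/mol


def response(x: Sequence[float]) -> List[float]:
    """Toy mixed-gas adsorption model: adsorbed mass in each MOF."""
    out = []
    for k_row in K:
        numerator = sum(m * k * xj for m, k, xj in zip(MASS, k_row, x))
        denominator = 1.0 + sum(k * xj for k, xj in zip(k_row, x))
        out.append(numerator / denominator)
    return out


def jacobian(f, x: Sequence[float], h: float = 1e-6) -> List[List[float]]:
    """Central finite differences; rows are sensors, columns are gas components."""
    n_out = len(f(x))
    jac = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        up, down = list(x), list(x)
        up[j] += h
        down[j] -= h
        f_up, f_down = f(up), f(down)
        for i in range(n_out):
            jac[i][j] = (f_up[i] - f_down[i]) / (2 * h)
    return jac


def cross(a: Sequence[float], b: Sequence[float]) -> List[float]:
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


x0 = [0.3, 0.5, 0.2]  # hypothetical composition at which the array is linearized
J = jacobian(response, x0)
n = cross(J[0], J[1])  # orthogonal to both rows, so J applied to n is zero
norm = sum(c * c for c in n) ** 0.5
unresponsive = [c / norm for c in n]
print(unresponsive)
```

A small composition change along `unresponsive` leaves both sensor readings unchanged to first order, whereas a change along either Jacobian row does not. For arrays with more sensors or more gas components the cross product no longer applies, and a general SVD routine would be needed.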

7.
Article in English | MEDLINE | ID: mdl-35403175

ABSTRACT

Infants' free-play behavior is highly variable. However, in developmental science, traditional analysis tools for modeling and understanding variable behavior are limited. Here, we used Hidden Markov Models (HMMs) to capture behavioral states that govern infants' toy selection during 20 minutes of free play in a new environment. We demonstrate that applying HMMs to infant data can identify hidden behavioral states and thereby reveal the underlying structure of infant toy selection and how toy selection changes in real time during spontaneous free play. More broadly, we propose that hidden-state models provide a fruitful avenue for understanding individual differences in spontaneous infant behavior.

8.
IEEE Trans Med Imaging ; 39(10): 3125-3136, 2020 10.
Article in English | MEDLINE | ID: mdl-32305904

ABSTRACT

Histopathological image analysis is a challenging task due to a diverse histology feature set as well as due to the presence of large non-informative regions in whole slide images. In this paper, we propose a multiple-instance learning (MIL) method for image-level classification as well as for annotating relevant regions in the image. In MIL, a common assumption is that negative bags contain only negative instances while positive bags contain one or more positive instances. This asymmetric assumption may be inappropriate for some application scenarios where negative bags also contain representative negative instances. We introduce a novel symmetric MIL framework associating each instance in a bag with an attribute which can be either negative, positive, or irrelevant. We extend the notion of relevance by introducing control over the number of relevant instances. We develop a probabilistic graphical model that incorporates the aforementioned paradigm and a corresponding computationally efficient inference for learning the model parameters and obtaining an instance level attribute-learning classifier. The effectiveness of the proposed method is evaluated on available histopathology datasets with promising results.


Subjects
Computer-Assisted Image Processing, Statistical Models
9.
IEEE Trans Pattern Anal Mach Intell ; 39(12): 2381-2394, 2017 12.
Article in English | MEDLINE | ID: mdl-28103189

ABSTRACT

Labeling data for classification requires significant human effort. To reduce labeling cost, instead of labeling every instance, a group of instances (bag) is labeled by a single bag label. Computer algorithms are then used to infer the label for each instance in a bag, a process referred to as instance annotation. This task is challenging due to the ambiguity regarding the instance labels. We propose a discriminative probabilistic model for the instance annotation problem and introduce an expectation maximization framework for inference, based on the maximum likelihood approach. For many probabilistic approaches, brute-force computation of the instance label posterior probability given its bag label is exponential in the number of instances in the bag. Our contribution is a dynamic programming method for computing the posterior that is linear in the number of instances. We evaluate our method using both benchmark and real world data sets, in the domain of bird song, image annotation, and activity recognition. In many cases, the proposed framework outperforms, sometimes significantly, the current state-of-the-art MIML learning methods, both in instance label prediction and bag label prediction.

10.
Biotechnol Prog ; 25(4): 1009-17, 2009.
Article in English | MEDLINE | ID: mdl-19610124

ABSTRACT

The nitrogen (N) concentration and pH of culture media were optimized for increased fermentative hydrogen (H₂) production from the cyanobacterium, Synechocystis sp. PCC 6803. The optimization was conducted using two procedures, response surface methodology (RSM), which is commonly used, and a memory-based machine learning algorithm, Q2, which has not been used previously in biotechnology applications. Both RSM and Q2 were successful in predicting optimum conditions that yielded higher H₂ than the media reported by Burrows et al., Int J Hydrogen Energy. 2008;33:6092-6099 optimized for N, S, and C (called EHB-1 media hereafter), which itself yielded almost 150 times more H₂ than Synechocystis sp. PCC 6803 grown on sulfur-free BG-11 media. RSM predicted an optimum N concentration of 0.63 mM and pH of 7.77, which yielded 1.70 times more H₂ than EHB-1 media when normalized to chlorophyll concentration (0.68 ± 0.43 µmol H₂ mg Chl⁻¹ h⁻¹) and 1.35 times more when normalized to optical density (1.62 ± 0.09 nmol H₂ OD₇₃₀⁻¹ h⁻¹). Q2 predicted an optimum of 0.36 mM N and pH of 7.88, which yielded 1.94 and 1.27 times more H₂ than EHB-1 media when normalized to chlorophyll concentration (0.77 ± 0.44 µmol H₂ mg Chl⁻¹ h⁻¹) and optical density (1.53 ± 0.07 nmol H₂ OD₇₃₀⁻¹ h⁻¹), respectively. Both optimization methods have unique benefits and drawbacks that are identified and discussed in this study.
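The RSM procedure mentioned above generally amounts to fitting a second-order polynomial to yields measured at designed factor settings and then solving for the stationary point of the fitted surface. The following Python sketch shows that calculation on synthetic data only: the design centre, the factor ranges, and the yield function (whose peak is placed arbitrarily at 0.5 mM N and pH 7.5) are invented for illustration and are not the study's measurements or its design.

```python
from itertools import product

CENTER = (0.6, 7.6)      # hypothetical design centre (N in mM, pH)
HALF_RANGE = (0.4, 0.6)  # hypothetical half-ranges used to code factors to [-1, 1]


def code(point):
    return tuple((v - c) / h for v, c, h in zip(point, CENTER, HALF_RANGE))


def decode(coded):
    return tuple(c + h * v for v, c, h in zip(coded, CENTER, HALF_RANGE))


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    size = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, size):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, size + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * size
    for r in reversed(range(size)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, size))
        x[r] = (aug[r][size] - tail) / aug[r][r]
    return x


def fit_quadratic(coded_points, yields):
    """Least-squares fit of y = b0 + b1*a + b2*b + b11*a^2 + b22*b^2 + b12*a*b."""
    rows = [[1.0, a, b, a * a, b * b, a * b] for a, b in coded_points]
    k = 6
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, yields)) for i in range(k)]
    return solve(ata, aty)  # normal equations


def stationary_point(beta):
    _, b1, b2, b11, b22, b12 = beta
    # Setting the gradient to zero gives a 2 x 2 linear system in the coded factors.
    return tuple(solve([[2 * b11, b12], [b12, 2 * b22]], [-b1, -b2]))


def synthetic_yield(n, ph):
    """Made-up yield surface with its maximum at N = 0.5 mM, pH = 7.5."""
    return 2.0 - 3.0 * (n - 0.5) ** 2 - 1.5 * (ph - 7.5) ** 2 + 0.4 * (n - 0.5) * (ph - 7.5)


design = list(product([0.2, 0.6, 1.0], [7.0, 7.6, 8.2]))  # 3 x 3 factorial design
beta = fit_quadratic([code(p) for p in design],
                     [synthetic_yield(n, ph) for n, ph in design])
n_opt, ph_opt = decode(stationary_point(beta))
# The stationary point is a maximum only if the quadratic part is negative definite.
is_maximum = beta[3] < 0 and 4 * beta[3] * beta[4] - beta[5] ** 2 > 0
print(n_opt, ph_opt, is_maximum)
```

Coding the factors to [-1, 1] before fitting keeps the normal equations well conditioned. With real, noisy measurements the fitted optimum would carry uncertainty that this sketch does not estimate.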


Subjects
Hydrogen/metabolism, Nitrogen/metabolism, Synechocystis/chemistry, Synechocystis/metabolism, Systems Biology/methods, Fermentation, Hydrogen-Ion Concentration, Statistical Models