Results 1 - 7 of 7
1.
J Comput Chem ; 45(15): 1289-1302, 2024 Jun 05.
Article in English | MEDLINE | ID: mdl-38357973

ABSTRACT

Reinforcement learning (RL) methods have helped to define the state of the art in the field of modern artificial intelligence, mostly after the breakthrough involving AlphaGo and the discovery of novel algorithms. In this work, we present an RL method, based on Q-learning, for the structural determination of adsorbate@substrate models in silico, where the minimization of the energy landscape resulting from adsorbate interactions with a substrate is made by actions on states (translations and rotations) chosen from an agent's policy. The proposed RL method is implemented in an early version of the reinforcement learning software for materials design and discovery (RLMaterial), developed in Python 3.x. RLMaterial interfaces with deMon2k, DFTB+, ORCA, and Quantum Espresso codes to compute the adsorbate@substrate energies. The RL method was applied for the structural determination of (i) the amino acid glycine and (ii) 2-amino-acetaldehyde, both interacting with a boron nitride (BN) monolayer, (iii) host-guest interactions between phenylboronic acid and β-cyclodextrin and (iv) ammonia on naphthalene. Density functional tight binding calculations were used to build the complex search surfaces with a reasonably low computational cost for systems (i)-(iii) and DFT for system (iv). Artificial neural network and gradient boosting regression techniques were employed to approximate the Q-matrix or Q-table for better decision making (policy) on next actions. Finally, we have developed a transfer-learning protocol within the RL framework that allows learning from one chemical system and transferring the experience to another, as well as from different DFT or DFTB levels.
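As a rough illustration of the Q-learning idea described in this abstract, the sketch below runs tabular Q-learning on a toy one-dimensional energy profile. It is not RLMaterial code: the function names (`toy_energy`, `train_q_table`, `greedy_walk`), the grid, the reward (energy decrease per step), and all hyperparameters are assumptions chosen for the example, and the two actions stand in for single-step translations only.

```python
import random

def toy_energy(x):
    # Hypothetical 1D energy profile with one minimum at grid position 7.
    return (x - 7) ** 2 / 10.0

def train_q_table(n_states=12, episodes=2000, alpha=0.5, gamma=0.9,
                  epsilon=0.2, seed=0):
    """Tabular Q-learning; actions are one-step translations left/right."""
    rng = random.Random(seed)
    moves = (-1, +1)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = rng.randrange(n_states)          # random starting position
        for _ in range(30):
            if rng.random() < epsilon:       # explore
                a = rng.randrange(2)
            else:                            # exploit current Q estimates
                a = 0 if q[s][0] >= q[s][1] else 1
            s_next = min(max(s + moves[a], 0), n_states - 1)
            # Reward is the energy lowering achieved by the move.
            reward = toy_energy(s) - toy_energy(s_next)
            target = reward + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q

def greedy_walk(q, start, steps=30):
    """Follow the greedy policy and return the lowest-energy state visited."""
    moves = (-1, +1)
    s = best = start
    for _ in range(steps):
        a = 0 if q[s][0] >= q[s][1] else 1
        s = min(max(s + moves[a], 0), len(q) - 1)
        if toy_energy(s) < toy_energy(best):
            best = s
    return best

q_table = train_q_table()
lowest = greedy_walk(q_table, start=0)
```

In a real adsorbate@substrate search the state would encode the adsorbate position and orientation, the energy would come from an external electronic-structure code, and, as the abstract notes, a regressor could replace the explicit table.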

2.
J Occup Rehabil ; 30(3): 303-307, 2020 09.
Article in English | MEDLINE | ID: mdl-32623556

ABSTRACT

Rapid development in computer technology has led to sophisticated methods of analyzing large datasets with the aim of improving human decision making. Artificial Intelligence and Machine Learning (ML) approaches hold tremendous potential for solving complex real-world problems such as those faced by stakeholders attempting to prevent work disability. These techniques are especially appealing in work disability contexts that collect large amounts of data such as workers' compensation settings, insurance companies, large corporations, and health care organizations, among others. However, the approaches require thorough evaluation to determine if they add value to traditional statistical approaches. In this special series of articles, we examine the role and value of ML in the field of work disability prevention and occupational rehabilitation.


Subject(s)
Artificial Intelligence; Disabled Persons; Machine Learning; Workers' Compensation; Humans
3.
J Occup Rehabil ; 30(3): 318-330, 2020 09.
Article in English | MEDLINE | ID: mdl-31267266

ABSTRACT

Purpose: The Work Assessment Triage Tool (WATT) is a clinical decision support tool developed using machine learning to help select interventions for patients with musculoskeletal disorders. The WATT categorizes patients based on individual characteristics according to likelihood of successful return to work following rehabilitation. A previous validation showed acceptable classification accuracy, but we re-examined accuracy using a new dataset drawn from the same system 2 years later. Methods: A population-based cohort design was used, with data extracted from a Canadian compensation database on workers considered for rehabilitation between January 2013 and December 2016. Data were obtained on demographic, clinical, and occupational characteristics, type of rehabilitation undertaken, and return to work outcomes. Analysis included classification accuracy statistics of WATT recommendations. Results: The sample included 28,919 workers (mean age 43.9 years, median duration 56 days), of whom 23,124 experienced a positive outcome within 30 days following return to work assessment. Sensitivity of the WATT for selecting successful programs was 0.13 while specificity was 0.87. Overall accuracy was 0.60 while human recommendations were higher at 0.72. Conclusions: Overall accuracy of the WATT for selecting successful rehabilitation programs declined in a more recent cohort and proved less accurate than human clinical recommendations. Algorithm revision and further validation are needed.
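For readers unfamiliar with the classification accuracy statistics named above, the following sketch computes sensitivity, specificity, and overall accuracy from a 2×2 confusion matrix using their textbook definitions. The abstract does not report the underlying confusion matrix or the exact way the study scored recommendations, so the function name `classification_metrics` and the counts used here are purely hypothetical and are not intended to reproduce the reported values.

```python
def classification_metrics(tp, fp, tn, fn):
    """Textbook accuracy statistics from confusion-matrix counts."""
    return {
        # Share of truly successful cases that the tool flagged as successful.
        "sensitivity": tp / (tp + fn),
        # Share of truly unsuccessful cases that the tool flagged as unsuccessful.
        "specificity": tn / (tn + fp),
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts for illustration only (not study data).
metrics = classification_metrics(tp=40, fp=20, tn=30, fn=10)
```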


Subject(s)
Musculoskeletal Diseases; Triage; Workers' Compensation; Adult; Canada; Cohort Studies; Humans
4.
Environ Int ; 131: 104972, 2019 10.
Article in English | MEDLINE | ID: mdl-31299602

ABSTRACT

BACKGROUND: Adverse birth outcomes (ABO) such as prematurity and small for gestational age confer a high risk of mortality and morbidity. ABO have been linked to air pollution; however, relationships with mixtures of industrial emissions are poorly understood. The exploration of relationships between ABO and mixtures is complex when hundreds of chemicals are analyzed simultaneously, requiring the use of novel approaches. OBJECTIVE: We aimed to generate robust hypotheses spatially linking mixtures and the occurrence of ABO using a spatial data mining algorithm and subsequent geographical and statistical analysis. The spatial data mining approach aimed to reduce data dimensionality and efficiently identify spatial associations between multiple chemicals and ABO. METHODS: We discovered co-location patterns of mixtures and ABO in Alberta, Canada (2006-2012). An ad-hoc spatial data mining algorithm allowed the extraction of primary co-location patterns of 136 chemicals released into the air by 6279 industrial facilities (National Pollutant Release Inventory), wind patterns from 182 stations, and 333,247 singleton live births at the maternal postal code at delivery (Alberta Perinatal Health Program), from which we identified cases of preterm birth, small for gestational age, and low birth weight at term. We selected secondary patterns using a lift ratio metric from ABO and non-ABO impacted by the same mixture. The relevance of the secondary patterns was estimated using logistic models (adjusted by socioeconomic status and ABO-related maternal factors) and a geographic-based assignment of maternal exposure to the mixtures as calculated by kernel density. RESULTS: From 136 chemicals and three ABO, spatial data mining identified 1700 primary patterns from which five secondary patterns of three-chemical mixtures, including particulate matter, methyl-ethyl-ketone, xylene, carbon monoxide, 2-butoxyethanol, and n-butyl alcohol, were subsequently analyzed. The significance of the associations (odds ratio > 1) between the five mixtures and ABO provided statistical support for a new set of hypotheses. CONCLUSION: This study demonstrated that, in complex research settings, spatial data mining followed by pattern selection and geographic and statistical analyses can catalyze future research on associations between air pollutant mixtures and adverse birth outcomes.


Subject(s)
Air Pollutants/toxicity; Air Pollution/adverse effects; Maternal Exposure; Particulate Matter/toxicity; Pregnancy Outcome; Air Pollutants/analysis; Air Pollution/analysis; Alberta; Carbon Monoxide/analysis; Female; Humans; Industry; Infant, Low Birth Weight; Infant, Newborn; Logistic Models; Male; Odds Ratio; Particulate Matter/analysis; Pregnancy; Premature Birth/epidemiology
5.
Data Brief ; 25: 104104, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31334309

ABSTRACT

Power converters are essential for the use of renewable energy resources. For example, a photovoltaic system produces DC energy that is transformed into AC by the voltage source inverter (VSI). This power is used by a motor drive that operates at different speeds, generating variable loads. Two parameters, namely resistance and inductance, are essential to correctly adjust the model predictive control (MPC) in a VSI. In this paper, we describe the data from a VSI that incorporates an MPC. We generate four datasets, each consisting of 399 cases or instances (rows). Two datasets comprise the simulations varying the inductance (continuous and discrete versions) and the other two comprise those varying the resistance (continuous and discrete versions). The motivation behind this data is to support the design and development of nonintrusive models to predict the resistance and inductance of a VSI under different conditions.

6.
BMC Med Inform Decis Mak ; 19(1): 112, 2019 06 17.
Article in English | MEDLINE | ID: mdl-31208407

ABSTRACT

BACKGROUND: Data mining tools have been increasingly used in health research, with the promise of accelerating discoveries. Lift is a standard association metric in the data mining community. However, health researchers struggle with the interpretation of lift. As a result, dissemination of data mining results can be met with hesitation. The relative risk and odds ratio are standard association measures in the health domain, due to their straightforward interpretation and comparability across populations. We aimed to investigate the lift-relative risk and the lift-odds ratio relationships, and provide tools to convert lift to the relative risk and odds ratio. METHODS: We derived equations linking lift-relative risk and lift-odds ratio. We discussed how lift, relative risk, and odds ratio behave numerically with varying association strengths and exposure prevalence levels. The lift-relative risk relationship was further illustrated using a high-dimensional dataset which examines the association of exposure to airborne pollutants and adverse birth outcomes. We conducted spatial association rule mining using the Kingfisher algorithm, which identified association rules using its built-in lift metric. We directly estimated relative risks and odds ratios from 2 by 2 tables for each identified rule. These values were compared to the corresponding lift values, and relative risks and odds ratios were computed using the derived equations. RESULTS: As the exposure-outcome association strengthens, the odds ratio and relative risk move away from 1 faster numerically than lift, i.e. |log(odds ratio)| ≥ |log(relative risk)| ≥ |log(lift)|. In addition, lift is bounded by the smaller of the inverse probability of outcome or exposure, i.e. lift ≤ min(1/P(O), 1/P(E)). Unlike the relative risk and odds ratio, lift depends on the exposure prevalence for fixed outcomes. For example, when an exposure A and a less prevalent exposure B have the same relative risk for an outcome, exposure A has a lower lift than B. CONCLUSIONS: Lift, relative risk, and odds ratio are positively correlated and share the same null value. However, lift depends on the exposure prevalence, and thus is not straightforward to interpret or to use to compare association strength. Tools are provided to obtain the relative risk and odds ratio from lift.
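The relationships summarized in this abstract can be checked numerically. The sketch below computes lift, relative risk, and odds ratio from a 2 by 2 table and converts lift to relative risk using the identity RR = lift·(1 − P(E)) / (1 − lift·P(E)), which follows from the law of total probability. The abstract does not reproduce the paper's own equations, so this form, the function names, and the table counts are assumptions made for illustration.

```python
import math

def association_measures(a, b, c, d):
    """a: exposed with outcome, b: exposed without outcome,
    c: unexposed with outcome, d: unexposed without outcome."""
    n = a + b + c + d
    p_e = (a + b) / n                      # exposure prevalence P(E)
    p_o = (a + c) / n                      # outcome prevalence P(O)
    lift = (a / n) / (p_e * p_o)           # P(E and O) / (P(E) * P(O))
    rr = (a / (a + b)) / (c / (c + d))     # risk in exposed / risk in unexposed
    odds_ratio = (a * d) / (b * c)
    return {"p_e": p_e, "p_o": p_o, "lift": lift, "rr": rr, "or": odds_ratio}

def rr_from_lift(lift, p_e):
    # Invert lift = RR / (P(E) * RR + 1 - P(E)).
    return lift * (1 - p_e) / (1 - lift * p_e)

def lift_from_rr(rr, p_e):
    return rr / (p_e * rr + 1 - p_e)

# Hypothetical 2 by 2 table for illustration only.
m = association_measures(a=30, b=70, c=10, d=90)
recovered_rr = rr_from_lift(m["lift"], m["p_e"])

# Same relative risk, different exposure prevalence: the more prevalent
# exposure has the lower lift.
lift_common = lift_from_rr(3.0, p_e=0.5)
lift_rare = lift_from_rr(3.0, p_e=0.1)
```

With these counts the lift is 1.5 while the relative risk is 3, and the same relative risk of 3 yields a lift of 1.5 at 50% exposure prevalence but 2.5 at 10%, consistent with the dependence on prevalence described above.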


Subject(s)
Data Mining; Epidemiologic Studies; Odds Ratio; Risk; Alberta/epidemiology; Female; Humans; Infant, Low Birth Weight; Infant, Newborn; Male; Maternal Exposure/statistics & numerical data
7.
BMC Public Health ; 17(1): 907, 2017 11 28.
Article in English | MEDLINE | ID: mdl-29179711

ABSTRACT

BACKGROUND: Data measuring airborne pollutants, public health and environmental factors are increasingly being stored and merged. These big datasets offer great potential, but also challenge traditional epidemiological methods. This has motivated the exploration of alternative methods to make predictions, find patterns and extract information. To this end, data mining and machine learning algorithms are increasingly being applied to air pollution epidemiology. METHODS: We conducted a systematic literature review on the application of data mining and machine learning methods in air pollution epidemiology. We carried out our search process in PubMed, the MEDLINE database and Google Scholar. Research articles applying data mining and machine learning methods to air pollution epidemiology were queried and reviewed. RESULTS: Our search queries resulted in 400 research articles. Our fine-grained analysis employed our inclusion/exclusion criteria to reduce the results to 47 articles, which we separate into three primary areas of interest: 1) source apportionment; 2) forecasting/prediction of air pollution/quality or exposure; and 3) generating hypotheses. Early applications had a preference for artificial neural networks. In more recent work, decision trees, support vector machines, k-means clustering and the APRIORI algorithm have been widely applied. Our survey shows that the majority of the research has been conducted in Europe, China and the USA, and that data mining is becoming an increasingly common tool in environmental health. For potential new directions, we have identified that deep learning and geospatial pattern mining are two burgeoning areas of data mining that have good potential for future applications in air pollution epidemiology. CONCLUSIONS: We carried out a systematic review identifying the current trends, challenges and new directions to explore in the application of data mining methods to air pollution epidemiology. This work shows that data mining is increasingly being applied in air pollution epidemiology. The potential to support air pollution epidemiology continues to grow with advancements in data mining related to temporal and geospatial mining, and deep learning. This is further supported by new sensors and storage media that enable larger, better-quality data. This suggests that many more fruitful applications can be expected in the future.


Subject(s)
Air Pollution; Data Mining/statistics & numerical data; Epidemiologic Studies; Machine Learning/statistics & numerical data; Data Mining/methods; Humans