Results 1 - 14 of 14
1.
Glob Chang Biol ; 30(3): e17221, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38450880

ABSTRACT

Communities interspersed throughout the Canadian wildland are threatened by fires that have become bigger and more frequent in some parts of the country in recent decades. Identifying the fireshed (source area) and pathways from which wildland fire may ignite and spread from the landscape to a community is crucial for risk-reduction strategy and planning. We used outputs from a fire simulation model, including fire polygons and rate of spread, to map firesheds, fire pathways and corridors and spread distances for 1980 communities in the forested areas of Canada. We found fireshed sizes are larger in the north, where the mean distances between ecumene and fireshed perimeters were greater than 10 km. The Rayleigh Z test indicated that simulated fires around a large proportion of communities show significant directional trends, and these trends are stronger in the Boreal Plains and Shields than in the Rocky Mountain area. The average distance from which fire, when spreading at the maximum simulated rate, could reach the community perimeter was approximately 5, 12 and 18 km in 1, 2 and 3 days, respectively. The average daily spread distances increased latitudinally, from south to north. Spread distances were the shortest in the Pacific Maritime, Atlantic Maritime and Boreal Plains Ecozones, implying lower rates of spread compared to the rest of the country. The fire corridors generated from random ignitions and from ignitions predicted from local fire history differ, indicating that factors other than fuel (e.g. fire weather, ignition pattern) play a significant role in determining the direction that fires burn into a community.


Subject(s)
Disasters , Wildfires , Canada , Computer Simulation , Forests
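
The directional-trend test named in the abstract above can be sketched as follows. This is a minimal illustration of a Rayleigh Z test on spread directions, not the authors' code: the function name `rayleigh_z`, the use of degrees, and the series approximation for the p-value are assumptions made here.

```python
import math

def rayleigh_z(angles_deg):
    """Return (mean resultant length, Z statistic, approximate p-value).

    angles_deg: spread directions in degrees (hypothetical input format).
    """
    n = len(angles_deg)
    if n == 0:
        raise ValueError("need at least one direction")
    # Sum the unit vectors that point in each direction.
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    r_bar = math.hypot(c, s) / n          # 0 = no common direction, 1 = identical directions
    z = n * r_bar ** 2                    # Rayleigh Z statistic
    # Series approximation to the p-value; it can stray outside [0, 1]
    # for small n and large Z, so the result is clamped.
    p = math.exp(-z) * (
        1
        + (2 * z - z ** 2) / (4 * n)
        - (24 * z - 132 * z ** 2 + 76 * z ** 3 - 9 * z ** 4) / (288 * n ** 2)
    )
    return r_bar, z, min(max(p, 0.0), 1.0)

clustered = [40, 45, 50, 42, 48, 44, 46, 43, 47, 45]   # made-up directions
balanced = [0, 90, 180, 270]                            # no preferred direction
print(rayleigh_z(clustered))
print(rayleigh_z(balanced))
```

A small p-value suggests that the directions cluster around a preferred bearing; the test assumes a single preferred direction, so a bimodal pattern can go undetected.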
2.
Sci Total Environ ; 869: 161831, 2023 Apr 15.
Article in English | MEDLINE | ID: mdl-36708831

ABSTRACT

A spread day is defined as a day on which fires grow by a substantial area; such days usually occur during high or extreme fire weather conditions. The identification and prediction of spread days from fire weather conditions could improve both our understanding of fire regimes and the operational forecasting and management of fires. This study explores the relationships between fire weather and spread days in the forested areas of Canada by spatially and temporally matching a daily fire growth database to a daily gridded fire weather database that spans from 2001 to 2019. By examining the correlations between spread day fire weather conditions and location, conifer coverage (%), and elevation, we found that a spread day happens under less severe fire weather conditions as latitude increases for the entire study area and as conifer coverage increases within non-mountainous study areas. In the western mountain areas, however, more severe fire weather conditions are required for a spread day to occur as conifer coverage increases. Using two modeling approaches, we were able to identify spread day indicators (generalized additive model) and to predict the occurrence of spread days (semi-binomial regression model) by Canadian Ecozones both annually and seasonally. Overall, Fine Fuel Moisture Code (FFMC), Initial Spread Index (ISI), and Fire Weather Index (FWI) performed best in all models built for spread day identification and prediction, although their performance varied with the conditions mentioned above. FFMC was the most consistent across all spatial and temporal scales.

3.
J Cheminform ; 14(1): 50, 2022 Jul 28.
Article in English | MEDLINE | ID: mdl-35902962

ABSTRACT

In virtual screening for drug discovery, hit enrichment curves are widely used to assess the performance of ranking algorithms with regard to their ability to identify early enrichment. Unfortunately, researchers almost never consider the uncertainty associated with estimating such curves before declaring differences between the performance of competing algorithms. Uncertainty is often large because the testing fractions of interest to researchers are small. Appropriate inference is complicated by two sources of correlation that are often overlooked: correlation across different testing fractions within a single algorithm, and correlation between competing algorithms. Additionally, researchers are often interested in making comparisons along the entire curve, not only at a few testing fractions. We develop inferential procedures to address the needs of both those interested in a few testing fractions and those interested in the entire curve. For the former, four hypothesis testing and (pointwise) confidence interval procedures are investigated, and a newly developed EmProc approach is found to be most effective. For inference along entire curves, EmProc-based confidence bands are recommended for simultaneous coverage and minimal width. While we focus on the hit enrichment curve, this work is also appropriate for lift curves that are used throughout the machine learning community. Our inferential procedures trivially extend to enrichment factors, as well.
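
For readers unfamiliar with the quantities discussed in the abstract above, the following sketch computes one point on a hit enrichment curve and the matching enrichment factor. It does not implement the EmProc inference. The function names, the tie-breaking by input order, and the toy data are assumptions introduced for illustration.

```python
import math

def _hits_in_top(scores, is_active, fraction):
    """Count actives among the top `fraction` of compounds ranked by score."""
    n = len(scores)
    n_tested = math.ceil(fraction * n)
    # Rank from highest to lowest score; ties keep their input order.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return sum(is_active[i] for i in ranked[:n_tested]), n_tested

def hit_enrichment(scores, is_active, fraction):
    """Share of all actives recovered when testing the top `fraction`."""
    total_actives = sum(is_active)
    if total_actives == 0:
        raise ValueError("no active compounds")
    hits, _ = _hits_in_top(scores, is_active, fraction)
    return hits / total_actives

def enrichment_factor(scores, is_active, fraction):
    """Hit rate among tested compounds divided by the overall hit rate."""
    total_actives = sum(is_active)
    if total_actives == 0:
        raise ValueError("no active compounds")
    hits, n_tested = _hits_in_top(scores, is_active, fraction)
    return (hits / n_tested) / (total_actives / len(scores))

# Toy example: 10 compounds, 3 of them active (made-up data).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
is_active = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]
for f in (0.2, 0.5):
    print(f, hit_enrichment(scores, is_active, f), enrichment_factor(scores, is_active, f))
```

With so few compounds in the tested fraction, a single additional hit changes the estimate substantially, which is the source of the uncertainty the abstract highlights.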

4.
J Cheminform ; 10(1): 57, 2018 Nov 28.
Article in English | MEDLINE | ID: mdl-30488298

ABSTRACT

The goal of chemmodlab is to streamline the fitting and assessment pipeline for many machine learning models in R, making it easy for researchers to compare the utility of these models. While focused on implementing methods for model fitting and assessment that have been accepted by experts in the cheminformatics field, all of the methods in chemmodlab have broad utility for the machine learning community. chemmodlab contains several assessment utilities, including a plotting function that constructs accumulation curves and a function that computes many performance measures. The most novel feature of chemmodlab is the ease with which statistically significant performance differences for many machine learning models are presented by means of the multiple comparisons similarity plot. Differences are assessed using repeated k-fold cross validation, where blocking increases precision and multiplicity adjustments are applied. chemmodlab is freely available on CRAN at https://cran.r-project.org/web/packages/chemmodlab/index.html .
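
The repeated k-fold scheme mentioned in the abstract above can be illustrated with a short sketch. This is Python rather than R and is not chemmodlab's API; the function name `repeated_kfold_assignments` and the reading of "blocking" as every model being scored on the same splits are assumptions.

```python
import random

def repeated_kfold_assignments(n_obs, k, n_repeats, seed=0):
    """Return one list of fold labels (0..k-1) per repeat.

    Reusing the same assignments for every model means each split acts as a
    block, so model differences are compared within identical folds.
    """
    if not 2 <= k <= n_obs:
        raise ValueError("k must be between 2 and n_obs")
    rng = random.Random(seed)
    splits = []
    for _ in range(n_repeats):
        # Balanced labels: fold sizes differ by at most one observation.
        folds = [i % k for i in range(n_obs)]
        rng.shuffle(folds)
        splits.append(folds)
    return splits

splits = repeated_kfold_assignments(n_obs=10, k=5, n_repeats=3)
for folds in splits:
    print(folds)
```

Each model would then be fitted and scored once per fold of each repeat, giving paired performance values that can be compared with a multiplicity adjustment.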

5.
Molecules ; 23(12)2018 Nov 24.
Article in English | MEDLINE | ID: mdl-30477249

ABSTRACT

Permeation of chemical solutes through skin can create major health issues. Using the membrane-coated fiber (MCF) as a solid phase membrane extraction (SPME) approach to simulate skin permeation, we obtained partition coefficients for 37 solutes under 90 treatment combinations that broadly represent formulations associated with occupational skin exposure. These formulations were designed to mimic fluids in the metalworking process, and they are defined in this manuscript using: one of mineral oil, polyethylene glycol-200, soluble oil, synthetic oil, or semi-synthetic oil; at a concentration of 0.05, 0.5, or 5 percent; with solute concentration of 0.01, 0.05, 0.1, 0.5, 1, or 5 ppm. A single linear free-energy relationship (LFER) model was shown to be inadequate, but extensions that account for experimental conditions provide important improvements in estimating solute partitioning from selected formulations into the MCF. The benefit of the Expanded Nested-Solute-Concentration LFER model over the Expanded Crossed-Factors LFER model is only revealed through a careful leave-one-solute-out cross-validation that properly addresses the existence of replicates to avoid an overly optimistic view of predictive power. Finally, the partition theory that accompanies the MCF approach is thoroughly tested and found not to be supported under complex experimental settings that mimic occupational exposure in the metalworking industry.


Subject(s)
Metals/metabolism , Models, Theoretical , Skin Physiological Phenomena , Algorithms , Permeability , Reproducibility of Results
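
The leave-one-solute-out idea in the abstract above can be sketched as a grouped split in which all replicates of a solute are held out together. This is an illustrative sketch only; the function name and the placeholder solute labels "A", "B", "C" are hypothetical, and no LFER model is fitted here.

```python
def leave_one_solute_out(solute_ids):
    """Yield (held_out_solute, train_indices, test_indices).

    Every row belonging to the held-out solute goes to the test set, so no
    replicate of that solute leaks into training.
    """
    # Preserve first-seen order of the distinct solutes.
    distinct = list(dict.fromkeys(solute_ids))
    for solute in distinct:
        test = [i for i, s in enumerate(solute_ids) if s == solute]
        train = [i for i, s in enumerate(solute_ids) if s != solute]
        yield solute, train, test

# Six measurements on three placeholder solutes, with replicates.
rows = ["A", "A", "B", "B", "B", "C"]
for solute, train, test in leave_one_solute_out(rows):
    print(solute, train, test)
```

Splitting rows at random instead would let replicates of the same solute appear in both training and test sets, which is the overly optimistic setting the abstract warns about.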
6.
Technometrics ; 55(2): 161-173, 2013.
Article in English | MEDLINE | ID: mdl-23878407

ABSTRACT

A new classification method called the Optimal Bit String Tree is proposed to identify quantitative structure-activity relationships (QSARs). The method introduces the concept of a chromosome to describe the presence/absence context of a combination of descriptors. A descriptor set and its optimal chromosome form the splitting variable. A new stochastic searching scheme that contains a weighted sampling scheme, simulated annealing, and a trimming procedure optimizes the choice of splitting variable. Simulation studies and an application to screening monoamine oxidase (MAO) inhibitors show that OBSTree is advantageous in accurately and effectively identifying QSAR rules and finding different classes of active compounds. Details of the algorithm, SAS code, and simulated and real datasets are available online as supplementary materials.

7.
In Silico Biol ; 11(1-2): 61-81, 2011.
Article in English | MEDLINE | ID: mdl-22475752

ABSTRACT

ChemModLab, written by the ECCR @ NCSU consortium under NIH support, is a toolbox for fitting and assessing quantitative structure-activity relationships (QSARs). Its elements are: a cheminformatic front end used to supply molecular descriptors for use in modeling; a set of methods for fitting models; and methods for validating the resulting model. Compounds may be input as structures from which standard descriptors will be calculated using the freely available cheminformatic front end PowerMV; PowerMV also supports compound visualization. In addition, the user can directly input their own choices of descriptors, so the capability for comparing descriptors is effectively unlimited. The statistical methodologies comprise a comprehensive collection of approaches whose validity and utility have been accepted by experts in the fields. As far as possible, these tools are implemented in open-source software linked into the flexible R platform, giving the user the capability of applying many different QSAR modeling methods in a seamless way. As promising new QSAR methodologies emerge from the statistical and data-mining communities, they will be incorporated in the laboratory. The web site also incorporates links to public-domain data sets that can be used as test cases for proposed new modeling methods. The capabilities of ChemModLab are illustrated using a variety of biological responses, with different modeling methodologies being applied to each. These show clear differences in quality of the fitted QSAR model, and in computational requirements. The laboratory is web-based, and use is free. Researchers with new assay data, a new descriptor set, or a new modeling method may readily build QSAR models and benchmark their results against other findings. Users may also examine the diversity of the molecules identified by a QSAR model. Moreover, users have the choice of placing their data sets in a public area to facilitate communication with other researchers; or can keep them hidden to preserve confidentiality.


Subject(s)
Informatics/methods , Internet , Quantitative Structure-Activity Relationship , Data Mining , Models, Molecular , Neural Networks, Computer , Software , Support Vector Machine
8.
J Chem Inf Model ; 49(8): 1857-65, 2009 Aug.
Article in English | MEDLINE | ID: mdl-19807194

ABSTRACT

Ensemble methods have become popular for QSAR modeling, but most studies have assumed balanced data, consisting of approximately equal numbers of active and inactive compounds. Cheminformatics data are often far from being balanced. We extend the application of ensemble methods to include cases of imbalance of class membership and to more adequately assess model output. Based on the extension, we propose an ensemble method called MBEnsemble that automatically determines the appropriate tuning parameters to provide reliable predictions and maximize the F-measure. Results from multiple data sets demonstrate that the proposed ensemble technique works well on imbalanced data.


Subject(s)
Quantitative Structure-Activity Relationship , Computer Simulation , Models, Chemical
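
To make the F-measure criterion in the abstract above concrete, the sketch below computes it and picks the score cutoff that maximizes it on a small imbalanced example. This is not the MBEnsemble algorithm; the function names, the threshold search over observed scores, and the data are assumptions for illustration.

```python
def f_measure(y_true, y_pred, beta=1.0):
    """F-measure for the active class (label 1); 0.0 when no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

def best_threshold(y_true, scores):
    """Return (threshold, F) for the cutoff among observed scores with highest F."""
    def f_at(t):
        return f_measure(y_true, [1 if s >= t else 0 for s in scores])
    best = max(sorted(set(scores)), key=f_at)
    return best, f_at(best)

# Two actives among ten compounds (made-up scores).
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
scores = [0.9, 0.1, 0.2, 0.3, 0.1, 0.4, 0.2, 0.1, 0.6, 0.5]
print(best_threshold(y_true, scores))
```

Predicting every compound as inactive would score 80% accuracy on this example but an F-measure of zero, which is why accuracy alone is a poor guide for imbalanced data.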
9.
Arthritis Res Ther ; 11(5): 252, 2009.
Article in English | MEDLINE | ID: mdl-19863777

ABSTRACT

The majority of autoimmune diseases predominate in females. In searching for an explanation for this female excess, most attention has focused on hormonal changes--both exogenous changes (for example, oral contraceptive pill) and fluctuations in endogenous hormone levels particularly related to menstruation and pregnancy history. Other reasons include genetic differences, both direct (influence of genes on sex chromosomes) and indirect (such as microchimerism), as well as gender differences in lifestyle factors. These will all be reviewed, focusing on the major autoimmune connective tissue disorders: rheumatoid arthritis, systemic lupus erythematosus and scleroderma.


Subject(s)
Autoimmune Diseases/genetics , Autoimmune Diseases/metabolism , Genetic Predisposition to Disease , Rheumatic Diseases/genetics , Rheumatic Diseases/metabolism , Female , Gonadal Steroid Hormones/metabolism , Humans , Pregnancy/immunology , Risk Factors , Sex Factors
10.
Arthritis Res Ther ; 11(3): 223, 2009.
Article in English | MEDLINE | ID: mdl-19490599

ABSTRACT

This article will review how epidemiological studies have advanced our knowledge of both genetic and environmental risk factors for rheumatic diseases over the past decade. The major rheumatic diseases, including rheumatoid arthritis, juvenile idiopathic arthritis, psoriatic arthritis, ankylosing spondylitis, systemic lupus erythematosus, scleroderma, osteoarthritis, gout, and fibromyalgia, and chronic widespread pain, will be covered. Advances discussed will include how a number of large prospective studies have improved our knowledge of risk factors, including diet, obesity, hormones, and smoking. The change from small-scale association studies to genome-wide association studies using gene chips to reveal new genetic risk factors will also be reviewed.


Subject(s)
Rheumatic Diseases/epidemiology , Rheumatic Diseases/etiology , Animals , Environment , Genetic Predisposition to Disease/epidemiology , Genetic Predisposition to Disease/etiology , Genetic Predisposition to Disease/genetics , Humans , Rheumatic Diseases/genetics , Risk Factors
11.
Environmetrics ; 20(5): 575-594, 2008 Sep 26.
Article in English | MEDLINE | ID: mdl-19936263

ABSTRACT

We suggest a parametric modeling approach for nonstationary spatial processes driven by point sources. Baseline near-stationarity, which may be reasonable in the absence of a point source, is modeled using a conditional autoregressive (CAR) Markov random field. Variability due to the point source is captured by our proposed autoregressive point source (ARPS) model. Inference proceeds according to the Bayesian hierarchical paradigm, and is implemented using Markov chain Monte Carlo (MCMC) methods. The parametric approach allows a formal test of effectiveness of the point source. Application is made to a real dataset on electric potential measurements in a field containing a metal pole and the finding is that our approach captures the pole's impact on small-scale variability of the electric potential process.

12.
Bioinformatics ; 23(10): 1225-34, 2007 May 15.
Article in English | MEDLINE | ID: mdl-17379692

ABSTRACT

MOTIVATION: New biological systems technologies give scientists the ability to measure thousands of bio-molecules including genes, proteins, lipids and metabolites. We use domain knowledge, e.g. the Gene Ontology, to guide analysis of such data. Focusing on domain-aggregated results at, say, the molecular function level gives biological scientists greater interpretability than is possible when results are presented at the gene level. RESULTS: We use a 'top-down' approach to perform domain aggregation by first combining gene expressions before testing for differentially expressed patterns. This is in contrast to the more standard 'bottom-up' approach, where genes are first tested individually then aggregated by domain knowledge. The benefit is greater sensitivity for detecting signals. Our method, domain-enhanced analysis (DEA), is assessed and compared to other methods using simulation studies and analysis of two publicly available leukemia data sets. AVAILABILITY: Our DEA method uses functions available in R (http://www.r-project.org/) and SAS (http://www.sas.com/). The two experimental data sets used in our analysis are available in R as Bioconductor packages, 'ALL' and 'golubEsets' (http://www.bioconductor.org/). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Burkitt Lymphoma/genetics , Computational Biology , Leukemia-Lymphoma, Adult T-Cell/genetics , Oligonucleotide Array Sequence Analysis , Software , Computer Simulation , Gene Expression Profiling , Humans , Sensitivity and Specificity
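
The 'top-down' order of operations described in the abstract above (aggregate first, test second) can be sketched as follows. This is not the published DEA method: averaging as the aggregation rule, the Welch t statistic as the test, and the gene names `g1` and `g2` are all assumptions chosen for illustration.

```python
import math
from statistics import mean, variance

def aggregate_by_domain(expr, domain_genes):
    """Average expression across the genes of one domain, sample by sample.

    expr: dict mapping gene name -> list of per-sample expression values.
    """
    genes = [g for g in domain_genes if g in expr]
    if not genes:
        raise ValueError("no genes from this domain were measured")
    n_samples = len(expr[genes[0]])
    return [mean(expr[g][j] for g in genes) for j in range(n_samples)]

def welch_t(x, y):
    """Welch two-sample t statistic (unequal variances)."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / se

# Two hypothetical genes in one domain; samples 0-2 are group 1, 3-5 group 2.
expr = {
    "g1": [1.0, 2.0, 3.0, 5.0, 6.0, 7.0],
    "g2": [2.0, 2.0, 2.0, 6.0, 6.0, 6.0],
}
domain_score = aggregate_by_domain(expr, ["g1", "g2"])
print(domain_score)
print(welch_t(domain_score[:3], domain_score[3:]))
```

A bottom-up analysis would instead test `g1` and `g2` separately and then summarize the per-gene results by domain.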
13.
Curr Opin Rheumatol ; 18(2): 141-6, 2006 Mar.
Article in English | MEDLINE | ID: mdl-16462519

ABSTRACT

PURPOSE OF REVIEW: This review aims to summarize articles published between October 2004 and November 2005 that have investigated the genetic epidemiology of rheumatoid arthritis. RECENT FINDINGS: The consistent replication of an association between the R620W single nucleotide polymorphism in PTPN22 and rheumatoid arthritis clearly establishes this polymorphism as an important risk factor for rheumatoid arthritis. SUMMARY: Genetic investigations of rheumatoid arthritis have predominantly been single nucleotide polymorphism-based candidate gene association studies searching for markers of susceptibility, severity or treatment response. Studies of the human leukocyte antigen region have refined and added to our understanding of the complex associations to polymorphisms with this locus. PTPN22 has emerged strongly as a genuine rheumatoid arthritis susceptibility gene with replications of the association to the R620W single nucleotide polymorphism. Many investigations have been conducted on the genetics of treatment response -- some 'generic' and others specific in terms of identifying genetic influences to the mode of action and metabolism of particular agents.


Subject(s)
Arthritis, Rheumatoid/epidemiology , Arthritis, Rheumatoid/genetics , Genetic Predisposition to Disease/genetics , Polymorphism, Single Nucleotide/genetics , Protein Tyrosine Phosphatases/genetics , Antirheumatic Agents/pharmacology , Arthritis, Rheumatoid/drug therapy , Drug Resistance/genetics , HLA Antigens/genetics , Humans , Mutation/genetics , Protein Tyrosine Phosphatase, Non-Receptor Type 22
14.
J Chem Inf Comput Sci ; 42(5): 1221-9, 2002.
Article in English | MEDLINE | ID: mdl-12377012

ABSTRACT

Drug discovery is dependent on finding a very small number of biologically active or potent compounds among millions of compounds stored in chemical collections. Quantitative structure-activity relationships suggest that potency of a compound is highly related to that compound's chemical makeup or structure. To improve the efficiency of cell-based analysis methods for high throughput screening, where information of a compound's structure is used to predict potency, we consider a number of potentially influential factors in the cell-based approach. A fractional factorial design is implemented to evaluate the effects of these factors, and lift chart results show that the design scheme is able to find conditions that enhance hit rates.


Subject(s)
Drug Design , Drug Evaluation, Preclinical/statistics & numerical data , Anti-HIV Agents/chemistry , Anti-HIV Agents/pharmacology , Cells , Computer Simulation , Databases, Factual , Humans , Microbial Sensitivity Tests/statistics & numerical data , Quantitative Structure-Activity Relationship
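
As a reminder of what a fractional factorial design looks like, the sketch below builds a two-level half fraction in which the last factor is set to the product of the others. The abstract above does not state which factors or which fraction the authors used, so the four-factor example and the function name `half_fraction` are assumptions.

```python
from itertools import product

def half_fraction(n_factors):
    """Return the runs of a 2^(k-1) design with levels coded -1 and +1.

    The first k-1 factors form a full factorial; the last factor is their
    product (for k = 4 this is the generator D = ABC).
    """
    if n_factors < 2:
        raise ValueError("need at least two factors")
    runs = []
    for levels in product((-1, 1), repeat=n_factors - 1):
        last = 1
        for v in levels:
            last *= v
        runs.append(levels + (last,))
    return runs

design = half_fraction(4)   # 8 runs instead of the 16 of a full factorial
for run in design:
    print(run)
```

Halving the number of runs comes at the cost of aliasing: in this design the main effect of the last factor cannot be separated from the three-way interaction of the others.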