Towards a better understanding of deep convolutional neural network processes for recognizing organic chemicals of environmental concern.
Sun, Xiangfei; Zhang, Xianming; Wang, Luyao; Li, Yuanxin; Muir, Derek C G; Zeng, Eddy Y.
Affiliation
  • Sun X; Guangdong Key Laboratory of Environmental Pollution and Health, School of Environment, Jinan University, Guangzhou 511443, China.
  • Zhang X; Department of Chemistry and Biochemistry, Concordia University, Montreal, Quebec H4B 1R6, Canada.
  • Wang L; Guangdong Key Laboratory of Environmental Pollution and Health, School of Environment, Jinan University, Guangzhou 511443, China.
  • Li Y; Guangdong Key Laboratory of Environmental Pollution and Health, School of Environment, Jinan University, Guangzhou 511443, China.
  • Muir DCG; Guangdong Key Laboratory of Environmental Pollution and Health, School of Environment, Jinan University, Guangzhou 511443, China; Environment and Climate Change Canada, Aquatic Contaminants Research Division, 867 Lakeshore Road, Burlington, Ontario L7S 1A1, Canada.
  • Zeng EY; Guangdong Key Laboratory of Environmental Pollution and Health, School of Environment, Jinan University, Guangzhou 511443, China. Electronic address: eddyzeng@jnu.edu.cn.
J Hazard Mater; 421: 126746, 2022 01 05.
Article in En | MEDLINE | ID: mdl-34388923
ABSTRACT
Deep convolutional neural networks (DCNNs) have proved to be a promising tool for identifying organic chemicals of environmental concern. However, the uncertainty associated with DCNN predictions remains to be quantified. The training process involves many random configurations, including dataset segmentation, input sequences, and initial weights. Moreover, the DCNN working mechanism is non-linear and opaque. To increase confidence in using this novel approach, persistent, bioaccumulative, and toxic substances (PBTs) were used as representative chemicals of environmental concern to estimate the prediction uncertainty under five distinct datasets and ten different molecular descriptor (MD) arrangements with 111,852 chemicals and 2424 available MDs. An internal correlation coefficient test indicated that the prediction confidence reached 0.98 when the mean of 50 DCNNs' predictions was used instead of a single DCNN prediction. A threshold for PBT categorization was determined by considering the relative costs of false-negative and false-positive predictions. As revealed by the guided backpropagation-class activation mapping (GBP-CAM) saliency images, only 12% of all selected MDs were activated by DCNN and influenced the decision-making process. However, the activated MDs not only varied among chemical classes but also shifted with different DCNNs. Principal component analysis indicated that the 2424 MDs could be transformed into 370 orthogonal variables. Both results suggest that redundancy exists among the selected MDs. Yet, DCNN was found to adapt to redundant data by focusing on the most important information for better prediction performance.
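The two decision steps mentioned in the abstract, averaging the predictions of several DCNNs and choosing a categorization threshold from the costs of false negatives and false positives, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy scores, and the 5:1 cost ratio are all assumptions made for illustration, since the abstract gives neither the cost values nor the resulting threshold.

```python
from statistics import mean


def ensemble_mean(predictions_per_model):
    """Average per-chemical scores across models (one inner list per model)."""
    return [mean(scores) for scores in zip(*predictions_per_model)]


def pick_threshold(scores, labels, cost_fn=5.0, cost_fp=1.0):
    """Return the candidate threshold with the lowest total misclassification cost.

    cost_fn and cost_fp are hypothetical weights, not values from the paper.
    A chemical is classified as PBT when its score is >= the threshold.
    """
    best_threshold, best_cost = None, float("inf")
    for t in sorted(set(scores)):
        fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < t)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= t)
        cost = cost_fn * fn + cost_fp * fp
        if cost < best_cost:
            best_threshold, best_cost = t, cost
    return best_threshold, best_cost


# Made-up scores from three models for four chemicals (1 = PBT, 0 = non-PBT).
model_scores = [
    [0.9, 0.2, 0.6, 0.1],
    [0.8, 0.3, 0.4, 0.2],
    [0.7, 0.1, 0.5, 0.3],
]
labels = [1, 0, 1, 0]

avg = ensemble_mean(model_scores)
threshold, cost = pick_threshold(avg, labels)
```

Weighting false negatives more heavily than false positives, as assumed here, pushes the threshold lower, so that fewer true PBTs are missed at the expense of more false alarms.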

Full text: 1 Database: MEDLINE Main subject: Organic Chemicals / Neural Networks, Computer Language: En Publication year: 2022 Document type: Article
