Machine Learning Screening Model for Chemicals Inducing Autonomic Dysfunction
Abstract: Chemicals can induce secondary autonomic dysfunction (AD), which harms human health through adverse effects on the autonomic nervous system. Screening AD-inducing chemicals through animal experiments and clinical testing is complex, time-consuming, and expensive, so high-throughput screening methods are needed. The mechanisms by which chemicals induce AD are complex, and machine learning (ML) models for screening AD-inducing chemicals have been lacking. Based on literature and database mining, this study constructed a data set of 466 positive and 427 negative samples covering four clinical adverse symptoms related to AD: orthostatic hypotension, incontinence, urinary incontinence, and anal incontinence. ToxPrint fingerprints were calculated for this data set, and recursive feature elimination with cross-validation was applied to select 120 ToxPrints for ML modelling. Five ML algorithms (decision tree, support vector machine, k-nearest neighbor, random forest, and gradient boosting decision tree) were used to build models for screening AD-inducing chemicals. The random forest model showed the best classification performance, with a training set accuracy of 0.738 and a validation set accuracy of 0.737. The random forest model was also assessed by Y-scrambling, which indicated that its performance was not obtained by chance. When the applicability domain of the model was taken into account with a similarity threshold of 0.75, the validation set accuracy increased to 0.752, indicating that chemicals inside the domain are classified more accurately. In addition, 16 structural alerts for AD were identified by coupling the SHAP (SHapley Additive exPlanations) method with substructure frequency analysis. These structural alerts comprise 9 types of bonds [CN_amine_sec-NH_generic, CN_amine_aliphatic_generic, X[any_!C]_halide_inorganic, C(=O)N_carboxamide_generic, X[any]_halide, CN_amine_ter-N_aliphatic, CN_amine_alicyclic_generic, CN_amine_sec-NH_alkyl, C(=O)N_carboxamide_(NR2)], 3 types of chains [alkaneLinear_propyl_C3, aromaticAlkane_Ph-C1_cyclic, alkaneLinear_ethyl_C2(H_gt_1)], 3 types of rings [hetero_[5]_Z_1-Z, hetero_[5_6]_Z_generic, hetero_[5]_N_pyrrole_generic], and 1 type of group (aminoAcid_aminoAcid_generic). The frequency values of all 16 substructures were greater than 1, indicating that these fragments occur more frequently in positive chemicals than in negative chemicals and supporting their use as alerts for AD induction. The ML model developed in this study can serve as a tool for efficient screening of AD-inducing chemicals and extends the understanding of AD mechanisms, and the structural alerts provide a reference for the screening and evaluation of neurotoxic chemicals.
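The frequency criterion mentioned in the abstract (values greater than 1 mark fragments enriched in positive chemicals) can be illustrated with a small sketch. The abstract does not give the formula, so the code below assumes a commonly used enrichment ratio: the share of fragment-containing chemicals that are positive, divided by the share of all chemicals that are positive. The function name `fragment_frequency` and the toy data are hypothetical and are not taken from the study.

```python
def fragment_frequency(has_fragment, is_positive):
    """Enrichment ratio of one substructure among positive chemicals.

    has_fragment: list of 0/1 flags, one per chemical (fragment present?)
    is_positive:  list of 0/1 labels, one per chemical (AD-positive?)

    Assumed definition (not stated in the abstract):
        (N_fragment_positive * N_total) / (N_fragment_total * N_positive)
    A value above 1 means the fragment is over-represented in positives.
    """
    if len(has_fragment) != len(is_positive):
        raise ValueError("inputs must have the same length")
    n_total = len(is_positive)
    n_positive = sum(is_positive)
    n_fragment_total = sum(has_fragment)
    n_fragment_positive = sum(f * p for f, p in zip(has_fragment, is_positive))
    if n_positive == 0 or n_fragment_total == 0:
        return float("nan")  # ratio is undefined without positives or hits
    return (n_fragment_positive * n_total) / (n_fragment_total * n_positive)


# Toy data: 10 chemicals, 5 positive; the fragment occurs in 4, of which 3 are positive.
toy_fragment = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
toy_labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
print(fragment_frequency(toy_fragment, toy_labels))  # 1.5
```

In this toy case 75% of the fragment-containing chemicals are positive against a 50% base rate, so the ratio is 1.5. With a real fingerprint matrix, the same function would be applied to each ToxPrint column in turn.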
Key words:
- autonomic dysfunction
- machine learning
- screening model
- structural alerts
- neurotoxicity
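The abstract reports that restricting predictions to the applicability domain, with a similarity threshold of 0.75, raises validation accuracy from 0.737 to 0.752. A minimal sketch of such a domain check follows. The abstract does not state which similarity metric was used or how it was aggregated, so the code assumes Tanimoto similarity on binary fingerprints and a nearest-neighbour rule. The names `tanimoto` and `in_domain` and the toy fingerprints are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two sets of 'on' fingerprint bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def in_domain(query, training, threshold=0.75):
    """Assumed rule: a query chemical is inside the applicability domain if
    its most similar training chemical reaches the similarity threshold."""
    return max(tanimoto(query, t) for t in training) >= threshold


# Toy fingerprints given as sets of active bit positions.
training_set = [{1, 2, 3, 4}, {2, 3, 5}]
close_query = {1, 2, 3, 4, 6}   # shares 4 of 5 distinct bits with the first entry
distant_query = {2, 3}          # at most 2/3 similar to any training entry

print(in_domain(close_query, training_set))    # True
print(in_domain(distant_query, training_set))  # False
```

Predictions for chemicals outside the domain would be flagged as less reliable instead of being reported with the same confidence. Other aggregation rules, such as the mean similarity to several nearest neighbours, are equally plausible readings of the abstract.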