Machine-learning based prediction models for assessing skin irritation and corrosion potential of liquid chemicals using physicochemical properties by XGBoost

Yeonsoo Kang; Myeong Gyu Kim; Kyung-Min Lim

doi:10.1007/s43188-022-00168-8

Toxicological Research

Volume 39, Issue 2 / Pages 295-305 / 2023 / 1976-8257 (pISSN) / 2234-2753 (eISSN)

Korean Society of Toxicology & Korea Environmental Mutagen Society



Yeonsoo Kang (College of Pharmacy, Ewha Womans University) ;
Myeong Gyu Kim (College of Pharmacy, Ewha Womans University) ;
Kyung‑Min Lim (College of Pharmacy, Ewha Womans University)

Submitted: 2022.10.31
Reviewed: 2022.12.23
Published: 2023.04.15

https://doi.org/10.1007/s43188-022-00168-8

Abstract

The skin irritation test is an essential part of the safety assessment of chemicals. Recently, computational models for predicting skin irritation have drawn attention as alternatives to animal testing. We developed prediction models for the skin irritation/corrosion of liquid chemicals using machine learning algorithms, with 34 physicochemical descriptors calculated from the chemical structure. A training and test dataset of 545 liquid chemicals with reliable in vivo skin hazard classifications based on the UN Globally Harmonized System [category 1 (corrosive, Cat 1), 2 (irritant, Cat 2), 3 (mild irritant, Cat 3), and no category (nonirritant, NC)] was collected from public databases. After curation of the input data through removal and correlation analysis, every model was constructed to predict the skin hazard classification of liquid chemicals with 22 physicochemical descriptors. Seven machine learning algorithms [logistic regression, naïve Bayes, k-nearest neighbor, support vector machine, random forest, extreme gradient boosting (XGB), and neural net] were applied to ternary and binary classification of skin hazard. The XGB model demonstrated the highest accuracy (0.73-0.81), sensitivity (0.71-0.92), and positive predictive value (0.65-0.81). The contribution of the physicochemical descriptors to the classification was analyzed using Shapley Additive exPlanations plots to provide insight into the skin irritation of chemicals.
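The three performance measures reported in the abstract (accuracy, sensitivity, and positive predictive value) can be illustrated with a minimal Python sketch for the binary case. The function name `binary_metrics` and the label vectors are hypothetical and chosen only for illustration; they are not the study's data or code, and the abstract does not state how the authors computed these values.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity and positive predictive value (PPV).

    Hypothetical helper for illustration; not taken from the paper.
    """
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            counts["tp"] += 1  # hazard correctly flagged
        elif truth == positive:
            counts["fn"] += 1  # hazard missed
        elif pred == positive:
            counts["fp"] += 1  # non-hazard wrongly flagged
        else:
            counts["tn"] += 1  # non-hazard correctly cleared

    total = sum(counts.values())
    actual_pos = counts["tp"] + counts["fn"]
    predicted_pos = counts["tp"] + counts["fp"]
    nan = float("nan")
    return {
        # fraction of all chemicals classified correctly
        "accuracy": (counts["tp"] + counts["tn"]) / total if total else nan,
        # fraction of true hazards that the model detects
        "sensitivity": counts["tp"] / actual_pos if actual_pos else nan,
        # fraction of predicted hazards that are true hazards
        "ppv": counts["tp"] / predicted_pos if predicted_pos else nan,
    }


# Made-up labels: 1 = irritant/corrosive, 0 = no category (assumption).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In a hazard-screening setting, sensitivity is often read alongside positive predictive value: the first reflects how many hazardous chemicals are caught, the second how trustworthy a positive call is.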


Funding information

This study was a cosmetic safety evaluation project carried out by the Korea Cosmetic Industry Institute (KCII), funded by the Ministry of Health and Welfare, and the Korea Environment Industry and Technology Institute (KEITI), funded by the Korea Ministry of Environment (MOE) (2021002970001, 1485017976).

