The construction industry is one of the most dangerous industries in many countries. To improve the situation, senior managers overseeing portfolios of construction projects need to understand the safety risk levels of their projects so that interventions can be implemented proactively. Safety leading indicators are one way to flag sites that are at higher risk. However, there is a lack of validated leading indicators that can reliably classify sites according to their safety risk levels. Moreover, despite the success of machine learning (ML) approaches in other domains, they are not widely used in the construction industry, especially in the development of safety leading indicators. This paper presents an ML approach to developing leading indicators that classify construction sites according to their safety risk. The study was guided by the industry-recognized Cross-Industry Standard Process for Data Mining (CRISP-DM) framework, and the key types of data used include safety inspection records, accident cases and project-related data. The data were obtained from a large contractor in Singapore and were accumulated from 2010 to 2016. Out of 33 input variables (also known as features or independent variables), 13 were selected using a combination of the Boruta feature selection technique and a decision tree. Of the 13 selected input variables, six are project-related (project type, project ownership, contract sum, percent completed, magnitude of delay and project manpower) and seven are items in the contractor’s safety inspection checklists (crane/lifting operations, scaffold, mechanical-elevated working platform, falling hazards/openings, environmental management, good practices and weighted safety inspection score). Five popular ML algorithms were then used to train models to predict accident occurrence and severity.
During validation, random forest (RF) provided the best prediction performance, with an accuracy of 0.78 and a weighted kappa statistic of 0.70, which indicates a substantial strength of agreement. Compared with similar studies, this result is promising. The prediction (i.e. the output variable) provided by the RF model can be used as a safety leading indicator of the risk level of a site. It is recommended that the predictive RF model be deployed in construction organizations, especially large public and private developers, contractors and industry associations, to provide monthly forecasts of project safety performance so that pre-emptive inspections and interventions can be implemented in a more targeted manner.
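To make the two reported validation metrics concrete, the following is a minimal sketch of how accuracy and Cohen's weighted kappa can be computed for ordinal class predictions. It is an illustration only, not the authors' code: the three-level coding (0 = no accident, 1 = minor, 2 = major), the toy labels, the function names and the choice of linear weights are all assumptions, since the abstract does not state the weighting scheme or the class definitions.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of sites whose predicted class equals the observed class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_kappa(y_true, y_pred, n_classes, power=1):
    """Cohen's weighted kappa for ordinal labels coded 0..n_classes-1.

    power=1 gives linear disagreement weights, power=2 quadratic weights.
    The weighting used in the study is not stated; linear is an assumption.
    Undefined (division by zero) when both raters use one identical class.
    """
    n = len(y_true)
    observed = Counter(zip(y_true, y_pred))  # confusion-matrix cell counts
    true_counts = Counter(y_true)            # row marginals
    pred_counts = Counter(y_pred)            # column marginals
    max_dist = (n_classes - 1) ** power

    obs_disagreement = 0.0
    exp_disagreement = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            # Disagreement weight grows with the distance between classes.
            w = abs(i - j) ** power / max_dist
            obs_disagreement += w * observed[(i, j)] / n
            exp_disagreement += w * true_counts[i] * pred_counts[j] / (n * n)

    # 1.0 means perfect agreement, 0.0 means agreement no better than chance.
    return 1.0 - obs_disagreement / exp_disagreement


# Hypothetical observed and predicted risk classes for six sites.
toy_true = [0, 0, 1, 1, 2, 2]
toy_pred = [0, 1, 1, 1, 2, 1]
print(accuracy(toy_true, toy_pred))
print(weighted_kappa(toy_true, toy_pred, n_classes=3))
```

Unlike accuracy, the weighted kappa corrects for chance agreement and penalizes a prediction more heavily the further it lies from the observed class. On the commonly used Landis and Koch scale, values between 0.61 and 0.80 are described as substantial agreement, which matches the wording used for the reported 0.70.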
This article is one of a series of articles published in an international Elsevier journal and was written by Clive Q.X. Poh, Chalani Udhyami Ubeynarayana and Yang Miang Goh.