Abstract
Emerging machine learning capabilities can help make transportation infrastructure safer and reduce fatalities by informing decisions about which countermeasures to apply at crash-prone locations. At present, project prioritization typically involves assessing effectiveness, cost-benefit ratios, and available funding. Crash Modification Factors (CMFs) play an essential role in project assessment by predicting the effectiveness of safety countermeasures. Their applicability is limited, however, and some of these limitations may be overcome with innovative approaches such as knowledge mining. The US Department of Transportation’s (DOT) CMF Clearinghouse provides practitioners with a list of reliable CMFs developed from individual studies. However, the available CMFs do not cover all scenarios of interest to State DOTs, because unique projects may feature novel infrastructure types or countermeasures. Experimental and observational studies are the dominant tools for estimating CMFs, but these approaches may require years of effort to collect adequate crash data. To address these challenges, the research team developed a machine learning framework that mines CMF Clearinghouse data to uncover previously unidentified relationships. The framework offers a cost-effective and time-efficient way to estimate CMFs not covered by the CMF Clearinghouse, and it fully exploits existing CMF data. The research team trained and tested the proposed approach extensively on CMF Clearinghouse data, and the experiments showed that the framework can predict CMFs with reasonable accuracy. The framework flexibly incorporates heterogeneous data from the CMF Clearinghouse, captures the semantic contexts of countermeasures, and maintains data compatibility.