Predicting Road Accidents and Analyzing their Patterns Using Supervised Machine Learning

Oyoo, James Oduor

dc.contributor.author	Oyoo, James Oduor
dc.date.accessioned	2025-12-03T13:35:36Z
dc.date.available	2025-12-03T13:35:36Z
dc.date.issued	2025-12-03
dc.identifier.citation	OyooJO2025	en_US
dc.identifier.uri	http://localhost/xmlui/handle/123456789/6876
dc.description	Master of Science in Computer Systems	en_US
dc.description.abstract	Road traffic collisions are among the most serious problems facing the world. They cause many fatalities, injuries, and financial losses, with low- and middle-income countries (LMICs) bearing a disproportionate share of the cases. Previous studies have examined this problem using a variety of methods and strategies on different road sections and crossings. Conventional statistical methods such as logit and probit models have been used extensively to predict road accidents. However, these techniques have limitations: they require a predetermined mathematical form, and missing values and outliers in the dataset degrade the outcomes of the prediction model. In contrast to statistical techniques, machine learning (ML) techniques can handle outliers and missing values in the dataset. Designing accurate predictive models for road accidents is an important task for the transportation network, and it has led researchers to develop prediction models (PM) and to investigate the factors that contribute to these accidents. This thesis therefore aims to develop and evaluate a prediction model using an ensemble ML technique that incorporates supervised ML algorithms such as AdaBoost, K-Nearest Neighbors (K-NN), Decision Trees (DT), and Naive Bayes (NB) to predict road accidents and their patterns. A driving simulator was used as the research instrument to collect data. The collected data was then cleaned and normalized for analysis using the scikit-learn Python library. The synthetic minority oversampling technique (SMOTE) was applied to address the data imbalance before training the model. The particle swarm optimization (PSO) algorithm was used to identify the most important features in the dataset. The primary performance indicators (testing accuracy, precision, recall, and F1 score) were used to assess the models and compare their outcomes.
The findings of this study indicate that the two-layer ensemble model outperforms the four base classification models on the four performance indicators, with 88% testing accuracy, 86% precision, 83% recall, and an 84% F1 score. The proposed two-layer ensemble model can be used in the future for both theoretical and practical applications, such as road safety management to improve the existing conditions of the road network and to inform evidence-based traffic safety policies. Ultimately, the results showed that ML-based models outperformed statistical models. Keywords: machine learning, data imbalance, road safety, driving simulation, SMOTE	en_US
dc.description.sponsorship	Dr. Kennedy Ogada Odhiambo, PhD, JKUAT, Kenya; Dr. Jael Sanyanda Wekesa, PhD, JKUAT, Kenya	en_US
dc.language.iso	en	en_US
dc.publisher	JKUAT-COPAS	en_US
dc.subject	Predicting Road Accidents	en_US
dc.subject	Supervised Machine Learning	en_US
dc.subject	Analyzing their Patterns	en_US
dc.subject	SMOTE	en_US
dc.title	Predicting Road Accidents and Analyzing their Patterns Using Supervised Machine Learning	en_US
dc.type	Thesis	en_US
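
The abstract names SMOTE as the step used to address data imbalance before training. As an illustration only, the following is a minimal pure-Python sketch of the interpolation idea behind SMOTE: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its k nearest minority-class neighbours. This is not the thesis's code, which this record does not include; the function name `smote_sketch`, its parameters, and the toy data are hypothetical. In practice a maintained library implementation would normally be preferred.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples from minority-class feature vectors.

    minority: list of equal-length lists of floats (one class only).
    Hypothetical helper for illustration, not the thesis's implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    k = min(k, len(minority) - 1)      # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base sample.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()             # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with two features per sample (made-up values).
minority_class = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_sketch(minority_class, n_new=6, k=2)
for point in new_points:
    print(point)
```

Because every synthetic sample is a convex combination of two existing minority samples, the new points stay within the range of the observed minority data. Oversampling of this kind is normally applied to the training split only, so that the test set still reflects the original class distribution.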

Files in this item

Name: James Oyoo Msc ...

Size: 1.492Mb

Format: PDF

Description: MSc Thesis

This item appears in the following Collection(s)

College of Health Sciences JKUAT (COHES) [880]
Medical Laboratory; Agriculture & Environmental Biotechnology; Biochemistry; Molecular Medicine; Applied Epidemiology; Medicinal Phytochemistry; Public Health
