Medal Mavericks: SHAP-Driven Hybrid Random Forest Model for Olympic Forecasting

Authors

  • Junhong Li
  • Enrui Hu

DOI:

https://doi.org/10.56028/aetr.14.1.1113.2025

Keywords:

Random Forest; SHAP Model; Gradient Boosting Tree; Feature Engineering.

Abstract

This study builds an Olympic medal prediction model from historical data using machine learning. The model predicts that the United States and China will lead both the gold medal table and the overall medal table. Medal counts for established sporting powers such as Russia and the United Kingdom tend to stabilize, while those of countries such as Brazil and Australia may decline. The model also predicts that about 10 developing countries, including Saint Kitts and Nevis (athletics), Bhutan (archery), and South Sudan (long-distance running), will win their first Olympic gold medals at the 2028 Olympics. SHAP analysis shows that the number of events is significantly and positively associated with the number of medals, and that host countries hold an advantage in the events they prioritize for development. The effect of skill-based and physically demanding events on medal counts varies significantly between countries. The study also finds that outstanding coaches can substantially increase a country's total number of gold medals. These conclusions can inform the International Olympic Committee's resource allocation and the strategic decisions of individual countries.
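
The abstract does not describe how the SHAP attributions were computed, so the following is only a minimal sketch of the Shapley-value idea that SHAP is built on, not the authors' pipeline. The predictor `toy_medal_model`, its coefficients, and the three features (events entered, host flag, coach flag) are hypothetical and chosen purely for illustration; a real analysis would apply a SHAP explainer to the fitted random forest instead of this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def toy_medal_model(x):
    """Hypothetical medal predictor (an assumption, not the paper's model)."""
    events, host, coach = x
    # Linear effects plus a host-coach interaction term.
    return 0.5 * events + 3.0 * host + 2.0 * host * coach


def shapley_values(model, x, baseline):
    """Exact Shapley attributions of model(x) relative to model(baseline).

    Features outside a coalition are set to their baseline value.
    The cost grows exponentially with the number of features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical country: enters 20 events, is the host, has a top coach.
x = [20, 1, 1]
baseline = [0, 0, 0]
phi = shapley_values(toy_medal_model, x, baseline)
for name, contribution in zip(["events", "host", "coach"], phi):
    print(f"{name}: {contribution:.2f}")
```

In this toy setting the attributions sum to the difference between the prediction for `x` and the prediction for `baseline`, and the host-coach interaction is split equally between the two features involved. That additivity is the property that lets a statement such as "the number of events contributes positively to the medal count" be read directly from per-feature values.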

Published

2025-07-21