# FIFA-World-Cup-Qatar-2022-Prediction: Project Overview

Using international matches played since the 1990s, each team's ratings from its recent matches, and the potential of each side, this project aims to forecast the outcomes of the Qatar 2022 World Cup.

Resources

Python Version: 3.10
Packages: pandas, NumPy, scikit-learn, TensorFlow, and seaborn.
Data:
- international_matches.csv - This dataset provides a complete overview of all international soccer matches played since the 1990s. It also describes the strength of each team through the actual FIFA rankings and through player strengths taken from the EA Sports FIFA video game.
- players_22.csv - This dataset contains the player data for FIFA 22 Career Mode.

Data preparation and dataset creation

Both datasets, international_matches.csv and players_22.csv, were prepared for analysis and for building the training dataset of the machine learning model. The preparation involved removing the information of teams that will not participate in the cup and fixing the missing (NA) values.
The training dataset training.csv and the inference dataset last_team_scores.csv are created from international_matches.csv. training.csv contains the names of the teams facing each other, the FIFA ranking of each team, and the ratings of both teams' defense, midfield, and offense. The inference dataset contains the scores of each team as of its last FIFA date.
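The preparation steps above can be sketched with pandas as follows. This is a minimal illustration, not the notebook's actual code: the column names, the toy values, the three-team subset of participants, and the choice to fill missing scores with the column mean are all assumptions.

```python
import pandas as pd

# Toy stand-in for international_matches.csv. Column names and values are
# assumptions made for illustration, not the real file contents.
matches = pd.DataFrame({
    "date": pd.to_datetime(["2021-06-01", "2021-09-02", "2022-03-24", "2022-06-05"]),
    "home_team": ["Brazil", "Uruguay", "Argentina", "Uruguay"],
    "away_team": ["Argentina", "Brazil", "Chile", "Argentina"],
    "home_team_fifa_rank": [3, 16, 4, 13],
    "away_team_fifa_rank": [8, 2, 28, 4],
    "home_team_mean_offense_score": [86.0, None, 88.0, 79.0],
    "away_team_mean_offense_score": [88.0, 86.0, 77.0, 88.5],
})

# Hypothetical subset of teams taking part in the cup.
participants = {"Brazil", "Argentina", "Uruguay"}


def prepare_training(matches, participants):
    """Keep matches between participating teams and fill missing scores."""
    both_in_cup = (
        matches["home_team"].isin(participants)
        & matches["away_team"].isin(participants)
    )
    training = matches.loc[both_in_cup].copy()
    score_cols = [c for c in training.columns if c.endswith("_score")]
    # Assumption: missing scores are replaced by the column mean.
    training[score_cols] = training[score_cols].fillna(training[score_cols].mean())
    return training.reset_index(drop=True)


def last_team_scores(matches):
    """Return one row per team, taken from its most recent match."""
    frames = []
    for side in ("home", "away"):
        cols = {
            f"{side}_team": "team",
            f"{side}_team_fifa_rank": "fifa_rank",
            f"{side}_team_mean_offense_score": "offense_score",
        }
        frames.append(matches[["date", *cols]].rename(columns=cols))
    long = pd.concat(frames, ignore_index=True)
    latest = long.sort_values("date").drop_duplicates("team", keep="last")
    return latest.set_index("team")


training = prepare_training(matches, participants)
latest = last_team_scores(training)
```

Reshaping the home and away columns into one long table lets a single sort by date pick each team's latest values, regardless of which side the team played on.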

Exploratory Data Analysis

From datasets international_matches.csv and players_22.csv, the notebooks QATAR22_EDA+Data_Preparation.ipynb and Getting_Squads_Stats.ipynb answer the questions listed below. These questions allow us to get an idea of the favorites to win the cup according to statistics.

National soccer teams with the best offense

National soccer teams with the best defense

National soccer team with the best midfield

Teams with the highest winning percentage

The best players in Qatar 2022

The most promising teams

Is there any advantage to being the home team?

This question is fundamental. The pie chart below shows that home teams win more than 50% of their home games. Several factors contribute: players are more comfortable in familiar surroundings, the home crowd provides additional support, the home side avoids travel, and the playing area can be configured to suit the home team. For example, when the Colombian national team visits the Maracana stadium to play Brazil, it tends to lose or draw, whereas it tends to draw or win when it plays at home in the Metropolitano stadium. For this reason, predicting match results with a machine learning model requires defining which side is the home team and which is the visiting team.
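The shares behind such a pie chart can be computed along these lines. The column name `home_team_result`, its labels, and the eight toy results are assumptions for illustration only and say nothing about the real data.

```python
import pandas as pd

# Made-up results from the home team's point of view (illustrative only).
results = pd.DataFrame({
    "home_team_result": ["Win", "Win", "Draw", "Lose", "Win", "Lose", "Win", "Draw"],
})


def result_shares(results, column="home_team_result"):
    """Return the fraction of matches per home-team outcome."""
    return results[column].value_counts(normalize=True)


shares = result_shares(results)
print(shares)
```

Passing `normalize=True` returns fractions instead of counts, which is the quantity a pie chart of outcomes would display.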

Modeling and Tuning

The Modeling+Tuning.ipynb notebook trains the machine learning models that predict the outcome of the World Cup matches. It chooses one model for the group stage matches and another for the knockout stage. The difference is that a group stage match can end in a loss, a draw, or a win, whereas a knockout match can only end in defeat or victory. The best model for each stage is chosen among the following algorithms:

Random Forest
AdaBoost Classifier
XGBoost
Neural Networks
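One way to compare such candidates is cross-validated accuracy, once on a three-class target (group stage) and once on a binary target (knockout stage). The sketch below is an assumption-laden illustration: it uses synthetic data, scikit-learn's `GradientBoostingClassifier` as a stand-in for XGBoost, and `MLPClassifier` as a stand-in for the neural network, so it does not reproduce the notebook's models or scores.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier


def candidates(seed=0):
    """Build fresh, untrained candidate models (stand-ins, see lead-in)."""
    return {
        "random_forest": RandomForestClassifier(n_estimators=50, random_state=seed),
        "ada_boost": AdaBoostClassifier(n_estimators=50, random_state=seed),
        "gradient_boosting": GradientBoostingClassifier(n_estimators=50, random_state=seed),
        "neural_network": MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                                        random_state=seed),
    }


def compare(X, y, cv=3):
    """Return the mean cross-validated accuracy of every candidate."""
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
        for name, model in candidates().items()
    }


# Synthetic features: 3 classes mimic loss/draw/win, 2 classes mimic defeat/victory.
X_group, y_group = make_classification(n_samples=300, n_features=8, n_informative=5,
                                       n_classes=3, random_state=0)
X_knockout, y_knockout = make_classification(n_samples=300, n_features=8,
                                             n_informative=5, n_classes=2,
                                             random_state=0)

group_scores = compare(X_group, y_group)
knockout_scores = compare(X_knockout, y_knockout)
best_group = max(group_scores, key=group_scores.get)
best_knockout = max(knockout_scores, key=knockout_scores.get)
```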

The XGBoost model presents the best performance in both stages. Therefore, it is tuned, validated, and exported as a pipeline so that inference is easy to run.
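The tune, validate, and export steps could look roughly like the sketch below. It is built on assumptions: synthetic binary data, `GradientBoostingClassifier` in place of XGBoost, a small hypothetical parameter grid, and a hypothetical file name `knockout_model.joblib`.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary problem standing in for knockout matches.
X, y = make_classification(n_samples=300, n_features=8, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Preprocessing and model travel together, so inference needs one object.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", GradientBoostingClassifier(random_state=0)),
])

# Hypothetical grid; the notebook's real search space is not documented here.
param_grid = {"model__n_estimators": [50, 100], "model__max_depth": [2, 3]}
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(X_train, y_train)

# Validate on held-out data with a confusion matrix.
test_predictions = search.predict(X_test)
matrix = confusion_matrix(y_test, test_predictions)

# Export the tuned pipeline and load it back, as an inference notebook would.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "knockout_model.joblib"
    joblib.dump(search.best_estimator_, path)
    restored = joblib.load(path)

restored_predictions = restored.predict(X_test)
```

Exporting the whole pipeline rather than the bare classifier means the same preprocessing is applied at inference time without extra code.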

Confusion matrix of the tuned and validated group stage model

Confusion matrix of the tuned and validated knockout stage model

Predictions

Finally, the Predictions.ipynb notebook uses the inference datasets and the trained models to predict each World Cup match and thus the winner of the tournament. Because the model needs a home team for every match, the notebook uses the squad_stats.csv dataset, which provides the potential of each team: the team with the greater potential is treated as the home team.
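The home-team rule described above can be expressed as a small helper. The function name, the potential values, and the tie-breaking behavior (the first team stays home when potentials are equal) are assumptions for illustration.

```python
def assign_home_away(team_a, team_b, potential):
    """Return (home, away), treating the side with the higher potential as home.

    Assumption: if both potentials are equal, team_a is kept as the home team.
    """
    if potential[team_b] > potential[team_a]:
        return team_b, team_a
    return team_a, team_b


# Made-up potentials standing in for values read from squad_stats.csv.
potential = {"Brazil": 88.0, "Serbia": 79.5}

home, away = assign_home_away("Serbia", "Brazil", potential)
print(home, away)
```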

ogbeiedward/FIFA-World-Cup-Qatar-2022-Prediction
