The Titanic dataset is widely used in data science and machine learning. It contains information about passengers aboard the RMS Titanic, which sank on its maiden voyage in 1912 after colliding with an iceberg. The dataset includes the following fields, of which Survived is the prediction target and the rest are candidate features:
- Survived: Whether the passenger survived (0 = No, 1 = Yes)
- Pclass: Ticket class (1 = 1st, 2 = 2nd, 3 = 3rd)
- Name: Passenger's name
- Sex: Passenger's sex (male or female)
- Age: Passenger's age in years
- SibSp: Number of siblings/spouses aboard the Titanic
- Parch: Number of parents/children aboard the Titanic
- Ticket: Ticket number
- Fare: Passenger fare
- Cabin: Cabin number
- Embarked: Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)
The usual goal when analyzing this dataset is to predict whether a passenger survived the sinking, using features such as ticket class, sex, and age.
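Most classifiers expect numeric input, so text fields such as Sex and Embarked have to be encoded before modelling. The sketch below shows one possible encoding for a single passenger record. The function name `encode_passenger`, the default age used for missing values, and the example record are all assumptions made for illustration; the source does not describe how the features were actually prepared.

```python
# Hypothetical encoding choices; not taken from the original analysis.
SEX_CODES = {"male": 0.0, "female": 1.0}
PORTS = ("C", "Q", "S")  # Cherbourg, Queenstown, Southampton


def encode_passenger(row, default_age=30.0):
    """Turn one raw passenger record (a dict) into a numeric feature vector."""
    # Age may be missing; fall back to an assumed default value.
    age = row.get("Age")
    if age is None:
        age = default_age

    # One-hot encode the port so that no artificial ordering is implied.
    port_flags = [1.0 if row["Embarked"] == port else 0.0 for port in PORTS]

    return [
        float(row["Pclass"]),
        SEX_CODES[row["Sex"]],
        float(age),
        float(row["SibSp"]),
        float(row["Parch"]),
        float(row["Fare"]),
    ] + port_flags


# Made-up record for illustration only, not a row from the dataset.
example = {
    "Pclass": 3, "Sex": "female", "Age": None,
    "SibSp": 1, "Parch": 0, "Fare": 7.25, "Embarked": "S",
}
features = encode_passenger(example)
```

Name, Ticket, and Cabin are left out of this sketch because they are free-form text and would need separate handling.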
In this analysis, the following machine learning algorithms were used to predict survival:
- Support Vector Classifier (SVC)
- K Neighbors Classifier
- Decision Tree Classifier
- Stochastic Gradient Descent Classifier (SGDClassifier)
- Logistic Regression
- Gaussian Naive Bayes (GaussianNB)
- Gradient Boosting Classifier
- Random Forest Classifier
Each algorithm was applied to the passenger data to explore patterns and to build a model that predicts whether a passenger survived.
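A comparison of the eight classifiers listed above could be set up roughly as follows, assuming scikit-learn is the library in use (the class names in the list match its estimators, but the source does not state this). The data here is a synthetic stand-in generated with `make_classification`, not the Titanic data, and the hyperparameters, the 5-fold cross-validation, and the accuracy metric are all assumptions rather than the settings used in the original analysis.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary-classification data standing in for the encoded
# passenger features (X) and the Survived column (y).
X, y = make_classification(n_samples=200, n_features=7, random_state=0)

# Mostly default hyperparameters; random_state is fixed where available
# so that repeated runs give the same scores.
models = {
    "SVC": SVC(),
    "KNeighborsClassifier": KNeighborsClassifier(),
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
    "SGDClassifier": SGDClassifier(random_state=0),
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "GaussianNB": GaussianNB(),
    "GradientBoostingClassifier": GradientBoostingClassifier(random_state=0),
    "RandomForestClassifier": RandomForestClassifier(random_state=0),
}

# Mean 5-fold cross-validated accuracy for each model.
mean_accuracy = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}

# Print the models from highest to lowest mean accuracy.
for name, score in sorted(mean_accuracy.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:.3f}")
```

Cross-validation is used here instead of a single train/test split so that each model is scored on several held-out folds, which makes the comparison less sensitive to how one particular split happens to fall.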