Tools : Jupyter Notebook
Programming Language : Python
Libraries : NumPy, Pandas
Visualization : Matplotlib, Seaborn
Dataset : Hotel Bookings
A mini project created by an expert tutor of Rakamin Academy. In this project, I acted as a Data Scientist at a hotel company, responsible for analyzing customers' hotel booking behavior and booking cancellations, and for interpreting the analysis with Python visualizations.
Investigated hotel business performance and its ability to reach the market based on month, stay duration, and lead time in relation to booking cancellations.
Created data-driven visualizations that yield insights for the hotel business.
- First, I ran descriptive statistics. This dataset contains information on hotel bookings made by customers. It has 119390 rows and 29 columns with the data types float64(4), int64(16), and object(9).
- Checked missing values. After an overview of the dataset, I checked the values of each column. There are missing values in the `company` (94.30%), `agent` (13.68%), `city` (0.40%), and `children` (0.003%) columns. I imputed all missing values with '0' for numerical columns and 'Unknown' for categorical columns.
- Handled odd values. The `meal` column had the unique values Breakfast, Full Board, Dinner, No Meal, and Undefined. Undefined has the same meaning as No Meal, so I replaced the Undefined values with No Meal.
- Handled unnecessary values. I found that some bookings have 0 guests, so further analysis should only include bookings with more than 0 guests. I made a copy of the dataset in a new variable.
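The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook code: the tiny DataFrame is made up, and the `adults` and `babies` column names are assumptions (only `company`, `agent`, `city`, `children`, and `meal` are named in the text).

```python
import numpy as np
import pandas as pd

# Tiny hypothetical stand-in for the Hotel Bookings data; the real values differ.
df = pd.DataFrame({
    "company": [np.nan, 40.0, np.nan, np.nan],
    "agent": [9.0, np.nan, 240.0, 9.0],
    "city": ["City A", None, "City B", "City A"],
    "children": [0.0, np.nan, 2.0, 0.0],
    "adults": [2, 0, 2, 1],   # assumed column name
    "babies": [0, 0, 1, 0],   # assumed column name
    "meal": ["Breakfast", "Undefined", "No Meal", "Full Board"],
})

# Overview: number of rows and columns.
n_rows, n_cols = df.shape

# Percentage of missing values per column, computed before imputation.
missing_pct = df.isna().mean() * 100

# Impute: 0 for numerical columns, 'Unknown' for categorical columns.
num_cols = df.select_dtypes(include="number").columns
cat_cols = df.select_dtypes(exclude="number").columns
df[num_cols] = df[num_cols].fillna(0)
df[cat_cols] = df[cat_cols].fillna("Unknown")

# Odd values: treat 'Undefined' meals as 'No Meal'.
df["meal"] = df["meal"].replace("Undefined", "No Meal")

# Unnecessary values: keep only bookings with at least one guest,
# stored as a copy in a new variable.
total_guests = df["adults"] + df["children"] + df["babies"]
df_clean = df[total_guests > 0].copy()
```

Working on `df_clean` rather than filtering `df` in place keeps the imputed but unfiltered data available for comparison.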
- Monthly Hotel Booking Analysis Based on Hotel Type
Observe and analyze growth in monthly hotel bookings by hotel type. Bookings for both hotel types tend to increase in the holiday season. However, the number of City Hotel bookings appears to decrease from August to September.
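One possible way to build the table behind such a monthly chart is sketched below. The sample rows are invented, and the column names `hotel` and `arrival_date_month` are assumptions about the dataset's schema rather than something stated in this write-up.

```python
import pandas as pd

# Hypothetical sample rows; the column names are assumed.
bookings = pd.DataFrame({
    "hotel": ["City Hotel", "City Hotel", "Resort Hotel",
              "City Hotel", "Resort Hotel", "Resort Hotel"],
    "arrival_date_month": ["January", "July", "July",
                           "July", "January", "August"],
})

# Order months chronologically instead of alphabetically.
month_order = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]
bookings["arrival_date_month"] = pd.Categorical(
    bookings["arrival_date_month"], categories=month_order, ordered=True
)

# Count bookings per hotel type and month (only observed combinations).
monthly = (
    bookings.groupby(["hotel", "arrival_date_month"], observed=True)
    .size()
    .reset_index(name="bookings")
)
```

The resulting long-format table, with one row per hotel type and month, can then be passed to a Seaborn or Matplotlib line or bar plot.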
Figure 3: Distribution of Monthly Hotel Booking Based on Hotel Type
- Impact Analysis of Stay Duration on Hotel Bookings Cancellation Rates
Analyze the relationship between stay duration and hotel booking cancellation rates. City Hotel had a higher cancellation rate than Resort Hotel, reaching almost 100%, while Resort Hotel's rate stayed below 50%.
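A minimal sketch of how a cancellation rate per stay-duration group might be computed is shown here. The column names (`stays_in_weekend_nights`, `stays_in_week_nights`, `is_canceled`), the weekly bins, and the sample rows are all assumptions for illustration; the grouping actually used in the project is not stated.

```python
import pandas as pd

# Hypothetical sample rows; the column names are assumed.
bookings = pd.DataFrame({
    "hotel": ["City Hotel", "City Hotel", "City Hotel",
              "Resort Hotel", "Resort Hotel", "Resort Hotel"],
    "stays_in_weekend_nights": [1, 2, 4, 2, 2, 4],
    "stays_in_week_nights": [2, 3, 10, 5, 5, 6],
    "is_canceled": [0, 1, 1, 0, 1, 0],
})

# Total stay duration in nights.
bookings["stay_duration"] = (
    bookings["stays_in_weekend_nights"] + bookings["stays_in_week_nights"]
)

# Group durations into weekly buckets (illustrative bin edges).
bins = [0, 7, 14, 21, float("inf")]
labels = ["1 week", "2 weeks", "3 weeks", "4+ weeks"]
bookings["stay_group"] = pd.cut(
    bookings["stay_duration"], bins=bins, labels=labels, include_lowest=True
)

# Cancellation rate (%) = mean of the 0/1 cancellation flag per group.
cancel_rate = (
    bookings.groupby(["hotel", "stay_group"], observed=True)["is_canceled"]
    .mean()
    .mul(100)
    .reset_index(name="cancel_rate_pct")
)
```

Because `is_canceled` is a 0/1 flag, its mean within a group is the share of canceled bookings, which makes rates comparable between hotel types with different booking volumes.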
Figure 4: Distribution of Stays Duration on Hotel Bookings Cancellation Rates
- Impact Analysis of Lead Time on Hotel Bookings Cancellation Rates
Analyze the relationship between lead time and cancellation rates. For City Hotel, an increase in lead time affects booking cancellations. Meanwhile, Resort Hotel shows a fluctuating distribution.
Figure 5: Distribution of Lead Time on Hotel Bookings Cancellation Rates
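The lead-time comparison could be tabulated along the following lines. This is only a sketch under assumptions: the `lead_time` (in days) and `is_canceled` column names, the bin edges, and the sample rows are hypothetical and not taken from the project.

```python
import pandas as pd

# Hypothetical sample rows; the column names are assumed.
bookings = pd.DataFrame({
    "hotel": ["City Hotel"] * 5 + ["Resort Hotel"] * 4,
    "lead_time": [10, 20, 100, 200, 400, 5, 150, 95, 370],
    "is_canceled": [0, 0, 1, 1, 1, 0, 1, 0, 0],
})

# Bucket lead time in days (illustrative bin edges).
bins = [0, 90, 180, 365, float("inf")]
labels = ["0-90", "91-180", "181-365", ">365"]
bookings["lead_group"] = pd.cut(
    bookings["lead_time"], bins=bins, labels=labels, include_lowest=True
)

# Cancellation rate (%) per lead-time bucket, one column per hotel type.
# Buckets with no bookings for a hotel type show up as NaN.
lead_table = (
    bookings.groupby(["lead_group", "hotel"], observed=True)["is_canceled"]
    .mean()
    .mul(100)
    .unstack("hotel")
)
```

A wide table with one column per hotel type makes it easy to compare a steadily changing rate against a fluctuating one, either by reading the rows or by plotting each column.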