This repository contains the documentation, code, and resources for cleaning, transforming, and analyzing airline delay data using Power Query and Power BI. The dataset includes 30 columns with detailed flight information, and the goal is to clean the data, handle missing values, transform data types, and prepare it for analysis and visualization. The final output is an interactive Power BI dashboard that provides actionable insights into flight delays, cancellations, and operational efficiency.
The dataset contains the following key columns:
- FlightID: Unique identifier for each flight.
- FlightNum: Flight number.
- TailNum: Aircraft tail number.
- CarrierCode: Carrier code.
- Year, Month, DayofMonth: Date components.
- CRSDepTime, DepTime: Scheduled and actual departure times.
- CRSArrTime, ArrTime: Scheduled and actual arrival times.
- CRSElapsedTime, ActualElapsedTime: Scheduled and actual flight duration.
- DepDelay, ArrDelay: Departure and arrival delays.
- CarrierDelay, WeatherDelay, NASDelay, SecurityDelay, LateAircraftDelay: Delay causes.
- Origin, Dest: Origin and destination airport codes.
- Distance: Flight distance in miles.
- TaxiIn, TaxiOut: Taxi-in and taxi-out durations.
- Cancelled, Diverted: Binary indicators for cancellations and diversions.
- CancellationCode: Reason for cancellation.
- Affected Columns:
CarrierDelay
,WeatherDelay
,NASDelay
,SecurityDelay
,LateAircraftDelay
- Replace null values with
0
to ensure consistency. - Power Query Step:
ReplaceNullValues = Table.ReplaceValue(Source, null, 0, Replacer.ReplaceValue, {"CarrierDelay", "WeatherDelay", "NASDelay", "SecurityDelay", "LateAircraftDelay"})
- Affected Columns:
Cancelled
,Diverted
- Convert numeric values (0,1) to Boolean (
True/False
) for better readability. - Power Query Step:
ChangeType = Table.TransformColumnTypes(ReplaceNullValues, {{"Cancelled", type logical}, {"Diverted", type logical}})
- Affected Columns:
Year
,Month
,DayofMonth
- Merge these into a
Date
column inyyyy/MM/dd
format. - Generate
DateKey
inYYYYMMDD
format for linking with a date dimension. - Power Query Step:
AddDateColumn = Table.AddColumn(ChangeType, "Date", each Date.FromText(Text.From([Year]) & "/" & Text.From([Month]) & "/" & Text.From([DayofMonth])), type date) AddDateKey = Table.AddColumn(AddDateColumn, "DateKey", each Date.ToText([Date], "YYYYMMDD"), type text)
- Affected Columns:
DepTime
,CRSDepTime
,ArrTime
,CRSArrTime
- Convert numeric values to proper
hh:mm
time format. - Power Query Step:
ConvertTime = Table.TransformColumns(AddDateKey, { {"DepTime", each Time.FromText(Text.PadStart(Text.From(_), 4, "0")), type time}, {"CRSDepTime", each Time.FromText(Text.PadStart(Text.From(_), 4, "0")), type time}, {"ArrTime", each Time.FromText(Text.PadStart(Text.From(_), 4, "0")), type time}, {"CRSArrTime", each Time.FromText(Text.PadStart(Text.From(_), 4, "0")), type time} })
- Affected Columns:
Year
,Month
,DayofMonth
,DayOfWeek
- These columns become redundant after creating
Date
andDateKey
. - Power Query Step:
RemoveColumns = Table.RemoveColumns(ConvertTime, {"Year", "Month", "DayofMonth", "DayOfWeek"})
- Ensure unique entries for each airport based on
IATA
andICAO
codes.
The data model is designed to:
- Analyze causes and patterns of flight delays.
- Understand reasons for flight cancellations.
- Identify trends and correlations between carriers, airports, and dates.
- Provide actionable insights for improving operational efficiency.
- Contains detailed flight data, including delays, cancellations, and durations.
- Key columns:
FlightID
,DateKey
,CarrierCode
,Dest
,DepDelay
,ArrDelay
,Cancelled
,Diverted
, etc.
- Dim Airport: Contains airport details (
IATA
,ICAO
,Airport Name
,Country Code
,Latitude
,Longitude
,Region
). - Dim Carrier: Contains carrier details (
Carrier Name
,IATA Carrier Code
,ICAO Carrier Code
). - Dim Cancellation: Contains cancellation reasons (
Cancellation Code
,Cancellation Reason
). - Dim Date: Contains date details (
Date
,DateKey
,Day
,Day Name
,Month
,Month Name
,Year
).
- Fact Flights ↔ Dim Date:
DateKey
- Fact Flights ↔ Dim Airport:
Dest = IATA
- Fact Flights ↔ Dim Carrier:
Carrier Code
- Fact Flights ↔ Dim Cancellation:
CancellationCode
The Power BI dashboard provides an interactive and visually appealing way to explore flight delay data. It consists of multiple pages, each focusing on a specific aspect of the data.
-
- Project title and navigation buttons to access the dashboard, project description, and Q&A section.
-
- Key metrics: Total flights, offline flights, early flights, and late flights.
- Top 10 carriers and airports by flights.
- Flight trends by month and quarter.
- Regional flight comparison.
-
- Total delay hours by region (displayed on a map).
- Top 5 and bottom 5 regions by delays.
- Delay ratio by carrier and region.
-
Drill-Through Pages:
-
- Delay trends by month, day of the week, and hour.
- Breakdown of delay types (Carrier, NAS, Security, Weather, Late Aircraft).
-
Cancellations & Diversions Analysis Page:
- Key KPIs: Canceled flights, cancellation rate, diverted flights, diversion ratio.
- Trend analysis of cancellations and diversions over time.
- Cancellation reasons and trends by carrier.
-
slicer:
The slicer allows users to filter flight data based on:
- Airport – Select a specific airport or view all.
- Carrier – Filter by airline carrier.
- Date – Choose a date range.
- Hour – Select specific hours of the day.
To use it, simply adjust the dropdowns and sliders, then apply the filters to update the dashboard.
- Interactive Visuals: Users can filter and slice data to focus on specific categories.
- Slicer Filters: Easily refine data by selecting specific airports, carriers, dates, and hours.
- Drill-Through Capabilities: Analyze data at granular levels (e.g., specific regions or airports).
- Geographical Mapping: Visualize regional impacts of flight delays.
- Time-Based Trend Analysis: Identify delay patterns over time.
- Power Query: For data cleaning and transformation.
- Power BI Desktop: For data modeling and visualization.
- DAX (Data Analysis Expressions): For calculations and measures.
- Open the Power BI file (.pbix).
- Navigate through different pages using the sidebar menu.
- Use filters and slicers to explore specific insights.
- Click on regions, carriers, or airports for drill-through analysis.
- Integration with real-time flight data APIs for up-to-date insights.
- Predictive analytics for forecasting delays and cancellations.
- Additional user customization options for personalized dashboards.
This README provides a comprehensive guide to the Airline-Delay project. For further details, explore the repository or reach out to the maintainers. Happy analyzing! 🚀