Introduction to Exploratory Data Analysis in Python

By Satyam Gadekar

What is Exploratory Data Analysis?

In statistics, exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. A statistical model can be used or not, but primarily EDA is for seeing what the data can tell us beyond the formal modeling or hypothesis testing task.

How to perform Exploratory Data Analysis?

This is a question everyone wants answered, and the answer depends on the data set you are working with. There is no single method for performing EDA, but this project walks through some common methods and plots used in the EDA process.

Advantages of EDA

It gives us valuable insights into the data
It helps with feature selection and dimensionality reduction (e.g., using PCA)
Visualization is an effective way of detecting outliers
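
To make the outlier point concrete, here is a minimal sketch of the 1.5 × IQR rule, which is the same rule a standard boxplot uses to draw its whiskers. The helper name `iqr_outlier_mask` and the sample prices are assumptions made up for illustration; they are not taken from the House Price Prediction dataset.

```python
import pandas as pd


def iqr_outlier_mask(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return a boolean mask that is True where a value lies outside the IQR fences."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return (values < lower) | (values > upper)


# Hypothetical house prices (in thousands); the last value is deliberately extreme.
prices = pd.Series([210, 225, 240, 250, 265, 280, 300, 1500], name="price")

mask = iqr_outlier_mask(prices)
outliers = prices[mask]
print(outliers)
```

A boxplot of the same series would show the flagged value as a point beyond the upper whisker, which is why the plot is a quick visual check for outliers.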

Dataset Introduction

We will perform exploratory data analysis on the House Price Prediction dataset.
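
A first look at a dataset usually starts with its shape, column types, summary statistics, and missing-value counts. The sketch below uses a small made-up DataFrame as a stand-in; the column names (`area_sqft`, `bedrooms`, `neighborhood`, `price`) and the file name in the comment are assumptions, since the actual columns of the dataset are not listed here.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the house price data. In practice you would load
# the real file instead, e.g. df = pd.read_csv("house_prices.csv")
# (the file name is an assumption).
df = pd.DataFrame(
    {
        "area_sqft": [850, 1200, 1500, np.nan, 2100],
        "bedrooms": [2, 3, 3, 4, 4],
        "neighborhood": ["A", "B", "A", "C", "B"],
        "price": [120000, 175000, 210000, 260000, 320000],
    }
)

print(df.shape)   # (number of rows, number of columns)
print(df.dtypes)  # data type of each column
print(df.head())  # first few rows

# Summary statistics for the numeric columns.
summary = df.describe()
print(summary)

# Number of missing values in each column.
missing_per_column = df.isna().sum()
print(missing_per_column)
```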

EDA in Python

Python has many libraries for this, such as pandas, NumPy, matplotlib, and seaborn, which we can use to analyze the data and draw out helpful insights. I will be using Jupyter Notebook along with these libraries. Some of the key steps in EDA are:

Identifying the features
Basics of EDA
Handling Missing values
Detecting Outliers
Handling Outliers
Histogram
Correlation Heatmap
Scatterplot
Boxplot
Feature Engineering
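
The steps above can be sketched roughly as follows. This is one possible pass under stated assumptions, not the only way to do it: the data is made up, the column names are hypothetical, median imputation and capping are just one choice each for handling missing values and outliers, and `plot_overview` is an illustrative helper rather than part of any library.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the house price dataset.
df = pd.DataFrame(
    {
        "area_sqft": [850, 1200, np.nan, 1800, 2100, 2400],
        "bedrooms": [2, 3, 3, 4, 4, 5],
        "neighborhood": ["A", "B", "A", "C", "B", "C"],
        "price": [120000, 175000, 210000, 260000, 320000, 2500000],
    }
)

# Identifying the features: separate numeric columns from the rest.
numeric_cols = df.select_dtypes(include="number").columns.tolist()

# Handling missing values: fill numeric gaps with the column median.
df["area_sqft"] = df["area_sqft"].fillna(df["area_sqft"].median())

# Detecting outliers: flag prices outside the 1.5 * IQR fences.
q1, q3 = df["price"].quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
is_outlier = (df["price"] < lower) | (df["price"] > upper)

# Handling outliers: cap extreme prices at the fences instead of dropping rows.
df["price_capped"] = df["price"].clip(lower=lower, upper=upper)

# Correlation matrix of the numeric features (the input to a heatmap).
corr = df[["area_sqft", "bedrooms", "price_capped"]].corr()

# Feature engineering: derive a new feature from existing ones.
df["price_per_sqft"] = df["price_capped"] / df["area_sqft"]


def plot_overview(frame: pd.DataFrame, corr_matrix: pd.DataFrame):
    """Draw a histogram, correlation heatmap, scatterplot, and boxplot."""
    # Imported here so the numeric steps above run without plotting libraries.
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, axes = plt.subplots(2, 2, figsize=(10, 8))
    sns.histplot(data=frame, x="price_capped", ax=axes[0, 0])
    sns.heatmap(corr_matrix, annot=True, cmap="coolwarm", ax=axes[0, 1])
    sns.scatterplot(data=frame, x="area_sqft", y="price_capped", ax=axes[1, 0])
    sns.boxplot(x=frame["price"], ax=axes[1, 1])
    fig.tight_layout()
    return fig
```

In a Jupyter Notebook, calling `plot_overview(df, corr)` in a cell would render the four plots together. The boxplot is drawn on the uncapped `price` column so the outlier remains visible.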

Endnotes

The steps above are some of the general steps involved in exploratory data analysis. They are a good starting point whenever you perform EDA, and the exact mix depends on the data set.
