This project explores medical insurance claims data. Descriptive analysis and exploratory data analysis (univariate, bivariate, and multivariate) are performed to explore the data and how the features are correlated with each other. Finally, hypothesis testing is performed using the t-test, the chi-squared test, and one-way ANOVA.

kedarghule/Statistical-Analysis-of-Insurance-Claims

Statistical-Analysis-of-Insurance-Claims

Problem Statement

Customer information is valuable to most businesses. For an insurance company, customer attributes such as those in the dataset below can be crucial in making business decisions. This project aims to identify relationships and correlations among the attributes in the dataset and to summarize what the data contains.

Dataset

Link to Dataset: https://www.kaggle.com/datasets/mirichoi0218/insurance

Dataset Information:

  • age: age of primary beneficiary

  • sex: gender of the insurance contractor (female, male)

  • bmi: body mass index, an objective measure of body weight relative to height (kg / m ^ 2), computed as weight divided by height squared; the ideal range is 18.5 to 24.9

  • children: Number of children covered by health insurance / Number of dependents

  • smoker: whether the beneficiary smokes (yes, no)

  • region: the beneficiary's residential area in the US, northeast, southeast, southwest, northwest.

  • charges: Individual medical costs billed by health insurance
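
As a quick illustration of the bmi definition above, here is a minimal sketch of the formula. The function names and the example person are hypothetical and are not taken from the project's notebook.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2: weight divided by height squared."""
    if height_m <= 0:
        raise ValueError("height_m must be positive")
    return weight_kg / height_m ** 2


def in_ideal_range(value: float, low: float = 18.5, high: float = 24.9) -> bool:
    """True when the BMI falls inside the 18.5 to 24.9 range quoted above."""
    return low <= value <= high


# Hypothetical person: 70 kg, 1.75 m tall.
example = bmi(70.0, 1.75)
example_is_ideal = in_ideal_range(example)
```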

Our dataset has the following types of variables:

  • Categorical variables: sex, smoker, region, children
  • Quantitative variables: age, bmi, charges. Here, children is a discrete variable, whereas age, bmi, and charges are continuous variables.
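
The split above could be encoded in pandas roughly as follows. This is a sketch only: the four rows are made-up illustrative values rather than records from the Kaggle file, and casting with `astype("category")` is an assumption about how one might mark the categorical columns, not necessarily what the notebook does.

```python
import pandas as pd

# Made-up rows with the same column names as the dataset (not real records).
sample = pd.DataFrame({
    "age": [20, 30, 40, 50],
    "sex": ["female", "male", "female", "male"],
    "bmi": [22.0, 27.5, 31.0, 25.0],
    "children": [0, 1, 2, 0],
    "smoker": ["yes", "no", "no", "no"],
    "region": ["northeast", "southeast", "southwest", "northwest"],
    "charges": [15000.0, 3000.0, 6000.0, 8000.0],
})

categorical_cols = ["sex", "smoker", "region", "children"]
quantitative_cols = ["age", "bmi", "charges"]

# Mark the categorical columns explicitly so later steps can select by dtype.
typed = sample.copy()
for col in categorical_cols:
    typed[col] = typed[col].astype("category")

numeric_only = list(typed.select_dtypes("number").columns)
```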

Descriptive Statistics

image

image
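
Summary tables like the ones pictured above are commonly produced with `DataFrame.describe` for numeric columns and `value_counts` for categorical ones. The sketch below assumes that approach and uses made-up values, so the numbers it produces are not the project's results.

```python
import pandas as pd

# Made-up values for illustration only.
data = pd.DataFrame({
    "age": [20, 30, 40, 50],
    "bmi": [22.0, 27.5, 31.0, 25.0],
    "charges": [15000.0, 3000.0, 6000.0, 8000.0],
    "smoker": ["yes", "no", "no", "no"],
})

# count, mean, std, min, quartiles, and max for each numeric column.
numeric_summary = data[["age", "bmi", "charges"]].describe()

# Frequency of each level of a categorical column.
smoker_counts = data["smoker"].value_counts()
```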

Exploratory Data Analysis

Univariate Analysis

Box Plot

image

Histogram and Rug Plot

image image image image

Bar Plot

image

Bivariate and Multivariate Analysis

Correlation among Features

image
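
A correlation matrix for the quantitative features can be computed along these lines. This is a sketch assuming Pearson correlation via `DataFrame.corr`; the source does not state which coefficient or which columns the heatmap uses, and the values below are made up.

```python
import pandas as pd

# Made-up values for illustration only.
numeric = pd.DataFrame({
    "age": [20, 30, 40, 50, 60],
    "bmi": [22.0, 27.5, 31.0, 25.0, 29.0],
    "charges": [3000.0, 4500.0, 9000.0, 8000.0, 12000.0],
})

# Pairwise Pearson correlation; the result is symmetric with 1.0 on the diagonal.
corr = numeric.corr(method="pearson")
```

Categorical columns such as sex or smoker would need to be encoded numerically before they could appear in such a matrix.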

Pair Plot

image image

Sex vs All Numerical Features

image image

Smokers vs All Numerical Features

image image

Region vs All Numerical Features

image image

Comparing Categorical Features

image image image image image

Comparing Numerical Features and Categorical Features

image image image image image

BMI by Age Group Comparison

image
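
An age-group comparison like this one requires binning age first. The sketch below uses `pandas.cut`; the bin edges and labels are assumptions chosen for illustration, since the source does not say which age groups were used, and the values are made up.

```python
import pandas as pd

# Made-up values for illustration only.
people = pd.DataFrame({
    "age": [18, 25, 35, 36, 50, 55, 56, 64],
    "bmi": [22.0, 24.0, 26.0, 28.0, 30.0, 32.0, 27.0, 29.0],
})

# Hypothetical bins; intervals are right-inclusive: (17, 35], (35, 55], (55, 64].
people["age_group"] = pd.cut(
    people["age"],
    bins=[17, 35, 55, 64],
    labels=["18-35", "36-55", "56-64"],
)

# Mean BMI per age group.
bmi_by_group = people.groupby("age_group", observed=True)["bmi"].mean()
```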

Age Group by Charges Comparison

image

More EDA

image image image image image

Statistical Analysis

image image image image image image image image
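
The three tests named in the project description (t-test, chi-squared test, one-way ANOVA) can be run with `scipy.stats` as sketched below. The pairing of each test with particular variables is an assumption made for illustration, as is the use of Welch's variant of the t-test; the source does not spell out the hypotheses. The twelve rows are made up, so the p-values say nothing about the real dataset.

```python
import pandas as pd
from scipy import stats

# Made-up rows for illustration only.
df = pd.DataFrame({
    "sex": ["female", "male"] * 6,
    "smoker": ["yes", "yes", "yes", "no", "no", "no"] * 2,
    "region": ["northeast", "southeast", "southwest", "northwest"] * 3,
    "bmi": [22.0, 31.5, 27.3, 25.1, 24.8, 33.0,
            29.9, 26.4, 21.7, 35.2, 28.6, 23.9],
    "charges": [30000.0, 32000.0, 35000.0, 5000.0, 7000.0, 6000.0,
                40000.0, 28000.0, 33000.0, 9000.0, 8000.0, 6500.0],
})

# Two-sample t-test (Welch): do mean charges differ between smokers and non-smokers?
smokers = df.loc[df["smoker"] == "yes", "charges"]
non_smokers = df.loc[df["smoker"] == "no", "charges"]
t_stat, t_p = stats.ttest_ind(smokers, non_smokers, equal_var=False)

# Chi-squared test of independence between two categorical variables.
# With so few rows the expected counts are small; real use needs more data.
table = pd.crosstab(df["sex"], df["smoker"])
chi2, chi_p, dof, expected = stats.chi2_contingency(table)

# One-way ANOVA: does mean bmi differ across the four regions?
groups = [g["bmi"].to_numpy() for _, g in df.groupby("region")]
f_stat, f_p = stats.f_oneway(*groups)
```

In each case a p-value below the chosen significance level (commonly 0.05) would lead to rejecting the null hypothesis of no difference or no association.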
