Data Science Tools


  • Analyzing numerical and categorical data
  • Count plots of categorical variables
  • Histogram plots of numerical variables
  • Box plots of numerical variables grouped by a categorical variable
  • Histograms of numerical variables grouped by a categorical variable
  • Transformation of numerical variables
  • Normalization or standardization
  • Missing values of numerical variables can be imputed with the median; missing values of categorical variables can be imputed with the mode.
  • Several group-by methods can also be tried to decide how to impute.
  • Handling of outliers
  • Dummy variable creation for categorical variables (avoiding multicollinearity)
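
The preprocessing steps listed above can be sketched with pandas roughly as follows. This is a minimal illustration, not the repository's own code: the toy DataFrame, the column names `age` and `city`, the helper names, and the choice of IQR clipping for outliers are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "age": [22.0, 35.0, np.nan, 41.0, 29.0, 120.0],
    "city": ["A", "B", "A", None, "B", "A"],
})

def impute(frame):
    """Fill numeric columns with the median and other columns with the mode."""
    out = frame.copy()
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].median())
        else:
            out[col] = out[col].fillna(out[col].mode().iloc[0])
    return out

def impute_by_group(frame, target, by):
    """Fill `target` with the median of its group, falling back to the overall median."""
    group_median = frame.groupby(by)[target].transform("median")
    return frame[target].fillna(group_median).fillna(frame[target].median())

def clip_outliers_iqr(series, k=1.5):
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR] (one assumed outlier rule)."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return series.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)

def standardize(series):
    """Rescale to zero mean and unit (population) standard deviation."""
    return (series - series.mean()) / series.std(ddof=0)

clean = impute(df)
clean["age"] = clip_outliers_iqr(clean["age"])
clean["age_z"] = standardize(clean["age"])

# drop_first=True drops one dummy level per variable, which avoids the
# perfect collinearity between the full set of dummy columns.
encoded = pd.get_dummies(clean, columns=["city"], drop_first=True)
```

`impute_by_group(frame, "age", "city")` would be the group-wise alternative to the global median used in `impute`; which one is preferable depends on how strongly the numerical variable varies across groups.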

  • EDA is applied extensively, and several Scikit-Learn models are used: Logistic Regression, Support Vector Machine, K-Nearest Neighbours, and Random Forest.
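
As a rough sketch of how the four model families named above might be compared with Scikit-Learn: the synthetic dataset from `make_classification`, the train/test split, and all hyperparameters below are assumptions chosen for illustration, not settings taken from this repository.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data; a real project would load its own features and labels.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Scale-sensitive models are wrapped in a pipeline with StandardScaler;
# the random forest is tree-based and does not need feature scaling.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "svm": make_pipeline(StandardScaler(), SVC()),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model and record its accuracy on the held-out test set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

Putting the scaler inside the pipeline means it is fitted on the training split only, so no information from the test split leaks into preprocessing.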

License

MIT