Predicting Home Prices Using Linear Regression

This repository contains a notebook and dataset used to build a pipeline of functions that runs scikit-learn's linear regression model to predict a home's sale price.

The pipeline contains three main functions to quickly iterate on different models.

  • transform_features function is for feature engineering
  • select_features function is used to select features
  • train_and_test function trains and tests the model using linear regression and returns the RMSE error metric
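
The three functions above could be wired together roughly as follows. This is only a minimal sketch, not the notebook's actual code: the function bodies, the `target` parameter, the toy column names (`area`, `rooms`, `garage`, `street`) and the simple half-and-half holdout split are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error


def transform_features(df):
    # Assumed feature engineering step: drop every column with missing values.
    return df.dropna(axis=1)


def select_features(df):
    # Assumed selection step: keep numeric columns only (the target is numeric too).
    return df.select_dtypes(include="number")


def train_and_test(df, target="SalePrice"):
    # Assumed holdout validation: first half for training, second half for testing.
    half = len(df) // 2
    train, test = df.iloc[:half], df.iloc[half:]
    features = df.columns.drop(target)

    model = LinearRegression()
    model.fit(train[features], train[target])
    predictions = model.predict(test[features])

    # RMSE is the square root of the mean squared error.
    return float(np.sqrt(mean_squared_error(test[target], predictions)))


# Toy data (hypothetical, not the Ames dataset): price is an exact linear
# function of area and rooms, so the fitted model should be nearly perfect.
toy = pd.DataFrame({
    "area": [50.0, 80.0, 100.0, 120.0, 150.0, 200.0, 90.0, 60.0],
    "rooms": [2, 3, 3, 4, 5, 6, 4, 1],
    "garage": [None, 1.0, 1.0, 2.0, 2.0, None, 1.0, 0.0],
    "street": ["a", "b", "a", "b", "a", "b", "a", "b"],
})
toy["SalePrice"] = 1000 * toy["area"] + 5000 * toy["rooms"]

cleaned = transform_features(toy)      # drops "garage" (has missing values)
selected = select_features(cleaned)    # drops "street" (non-numeric)
rmse = train_and_test(selected)
print(f"RMSE: {rmse:.6f}")
```

Keeping each stage as a separate function means a different cleaning rule or feature subset can be tried by changing one function and re-running `train_and_test`, which is what makes quick iteration on models practical.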

I am going to work with housing data for the city of Ames, Iowa, in the United States, covering 2006 to 2010. The dataset contains 2930 observations and 80 explanatory variables (23 nominal, 23 ordinal, 14 discrete, and 20 continuous) involved in assessing home values. For information on why the data was collected, go here. More information about the different columns in the data can be found here.