A comparison of runtimes to fit OLS regression models using different Python libraries (Scikit-learn, statsmodels, Numpy matrix multiplication)

raytighe/linear_regression_speeds

Fastest Way to Fit OLS Regression Models in Python

This analysis compares the runtimes of three methods for fitting an ordinary least squares (OLS) regression model in Python. I fit a multivariate normal random sample using Scikit-Learn's linear_model module, statsmodels' sm module, and plain NumPy matrix multiplication. The result was three clearly separated runtime distributions: matrix multiplication had the fastest mean runtime, followed by Scikit-Learn and then statsmodels. This suggests that NumPy's vectorized matrix multiplication is the most computationally efficient way to fit an OLS model with a zero intercept. In practice, however, the average runtimes differ by only hundredths of a second, so the efficiency gain may be negligible.
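The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual benchmark script. The sample size, number of features, repeat count, and all function names (`make_sample`, `fit_numpy`, `fit_sklearn`, `fit_statsmodels`, `time_fit`) are assumptions chosen for the example. It also assumes the matrix approach solves the normal equations with `np.linalg.solve`, which may differ from how the original analysis computed the coefficients.

```python
import time

import numpy as np


def make_sample(n_obs=1000, n_features=5, seed=0):
    """Draw a multivariate normal design matrix and a noisy linear response.

    Sizes and coefficients are illustrative assumptions, not the source's values.
    """
    rng = np.random.default_rng(seed)
    X = rng.multivariate_normal(np.zeros(n_features), np.eye(n_features), size=n_obs)
    beta = np.arange(1, n_features + 1, dtype=float)
    y = X @ beta + rng.normal(size=n_obs)
    return X, y, beta


def fit_numpy(X, y):
    # Zero-intercept OLS via the normal equations: (X'X) beta = X'y.
    # Solving the system avoids forming an explicit matrix inverse.
    return np.linalg.solve(X.T @ X, X.T @ y)


def fit_sklearn(X, y):
    # Imported lazily so the sketch still runs when scikit-learn is absent.
    from sklearn.linear_model import LinearRegression

    return LinearRegression(fit_intercept=False).fit(X, y).coef_


def fit_statsmodels(X, y):
    # sm.OLS adds no constant column by default, so this is also zero-intercept.
    import statsmodels.api as sm

    return sm.OLS(y, X).fit().params


def time_fit(fit, X, y, repeats=20):
    """Return a list of wall-clock runtimes (seconds) for repeated fits."""
    fit(X, y)  # warm-up call so one-time import cost is not timed
    runtimes = []
    for _ in range(repeats):
        start = time.perf_counter()
        fit(X, y)
        runtimes.append(time.perf_counter() - start)
    return runtimes


if __name__ == "__main__":
    X, y, _ = make_sample()
    for name, fit in [
        ("numpy", fit_numpy),
        ("scikit-learn", fit_sklearn),
        ("statsmodels", fit_statsmodels),
    ]:
        try:
            runtimes = time_fit(fit, X, y)
        except ImportError:
            print(f"{name}: library not installed, skipped")
            continue
        print(f"{name}: mean runtime {np.mean(runtimes):.6f} s")
```

All three fitters are configured for a zero-intercept model so that they estimate the same coefficients. Absolute timings depend on hardware, library versions, and the sample size, so the numbers printed by this sketch should not be expected to match the results reported here.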
