The goal of this analysis is to use historical data to verify and validate automotive features and designs ahead of future testing. We provide visualizations of the statistical tests and interpret the results for management.
We conducted a multiple linear regression analysis on the MechaCar data. Below is a summary of the linear regression model.
Which variables/coefficients provided a non-random amount of variance to the mpg values in the dataset?
Vehicle weight, spoiler_angle, and AWD contributed a non-random amount of variance to the mpg values.
Is the slope of the linear model considered to be zero? Why or why not?
The slope is not considered to be zero. In a single-predictor regression, the slope is the correlation coefficient R scaled by the ratio of the standard deviations of the two variables, and here neither R nor the standard deviations are zero. In addition, the p-value indicates a significant relationship between the dependent and independent variables, which is evidence of a non-zero slope.
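The relationship between slope, R, and standard deviation can be sketched as follows. This is a minimal illustration for the single-predictor case only; the helper names and the data values are hypothetical and are not taken from the MechaCar dataset. In a multiple regression, each coefficient has its own estimate and significance test.

```python
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient between xs and ys."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (stdev(xs) * stdev(ys))


def slope_from_r(xs, ys):
    """Least-squares slope for one predictor: r * (s_y / s_x)."""
    return pearson_r(xs, ys) * stdev(ys) / stdev(xs)


# Hypothetical illustration values (assumption, not MechaCar data).
predictor = [1.0, 2.0, 3.0, 4.0, 5.0]
response = [2.1, 3.9, 6.2, 7.8, 10.1]

slope = slope_from_r(predictor, response)
print(f"slope = {slope:.2f}")
```

The slope is zero only when R is zero or the response has no spread, which is why a non-zero R and non-zero standard deviations imply a non-zero fitted slope.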
Does this linear model predict mpg of MechaCar prototypes effectively? Why or why not?
The model predicts the mpg of MechaCar prototypes effectively: its R-squared is 71%, which means the model explains about 71% of the variability in mpg.
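As a rough sketch of what R-squared measures, the function below computes it as one minus the ratio of residual to total sum of squares. The function name and the observed and predicted values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def r_squared(observed, predicted):
    """Fraction of the variance in `observed` explained by `predicted`."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical mpg values and model predictions (assumption).
observed = [10.0, 12.0, 14.0, 16.0, 18.0]
predicted = [11.0, 12.0, 13.0, 16.0, 18.0]

r2 = r_squared(observed, predicted)
print(f"R-squared = {r2:.2f}")
```

An R-squared of 0.71 would therefore mean that the residual sum of squares is 29% of the total, not that 71% of the individual data points lie on the fitted line.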
In this analysis, we created a statistical summary of the Suspension_Coil dataset.
In the total_summary data frame for the entire production, the variance of the coils is 62.29 PSI, which is within the 100 PSI variance specification. The Lot_summary gives more detail: Lot 1 and Lot 2 are well within the specification, with variances of 0.98 and 7.47 respectively, but Lot 3 does not meet it, showing a much larger variance of 170.29. The overall variance is still within the threshold.
A one-sample t-test was conducted for each lot to determine whether the lot's mean differs significantly from the population mean. Lot 1 and Lot 2 have large p-values of 1 and 0.607, respectively, so there is no statistically significant difference between their means and the population mean. Lot 3 has a p-value of 0.041, which is below the conventional 0.05 significance level, so we reject the null hypothesis: the Lot 3 mean differs significantly from the population mean.
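The per-lot check described above could be sketched like this. The record layout, lot labels, and PSI readings are hypothetical, and treating the specification as "variance must not exceed 100" is an assumption about how the threshold is applied.

```python
from collections import defaultdict
from statistics import variance

MAX_VARIANCE = 100.0  # threshold from the text; "must not exceed" is assumed


def lot_variances(records):
    """Group (lot, psi) records by lot and return the sample variance of each."""
    by_lot = defaultdict(list)
    for lot, psi in records:
        by_lot[lot].append(psi)
    return {lot: variance(values) for lot, values in by_lot.items()}


# Hypothetical readings (assumption, not the Suspension_Coil data).
records = [
    ("Lot1", 1500.0), ("Lot1", 1501.0), ("Lot1", 1499.0),
    ("Lot3", 1480.0), ("Lot3", 1500.0), ("Lot3", 1520.0),
]

per_lot = lot_variances(records)
within_spec = {lot: v <= MAX_VARIANCE for lot, v in per_lot.items()}
overall = variance([psi for _, psi in records])
print(per_lot, within_spec, overall)
```

Comparing `overall` with the per-lot values shows how a pooled variance can hide one lot that is out of specification.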
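For reference, the test statistic behind a one-sample t-test can be sketched with the standard library alone. The sample values and the two candidate population means are hypothetical. Converting the statistic to a p-value requires the t distribution, which a statistics package such as SciPy's `scipy.stats.ttest_1samp` provides; that step is left out here.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, population_mean):
    """t statistic: (sample mean - population mean) / standard error."""
    standard_error = stdev(sample) / sqrt(len(sample))
    return (mean(sample) - population_mean) / standard_error


# Hypothetical PSI readings for a single lot (assumption).
sample = [1498.0, 1500.0, 1502.0, 1504.0, 1496.0]

t_equal = one_sample_t(sample, 1500.0)    # sample mean equals the reference
t_shifted = one_sample_t(sample, 1498.0)  # sample mean is 2 PSI above it
print(t_equal, t_shifted)
```

A statistic of exactly zero corresponds to a two-sided p-value of 1, and the p-value shrinks as the statistic moves away from zero.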
Write a short description of a statistical study that can quantify how the MechaCar performs against the competition. In your study design, think critically about what metrics would be of interest to a consumer: for example, cost, city or highway fuel efficiency, horsepower, maintenance cost, or safety rating.
• We will test the fuel efficiency of MechaCar vehicles, measured in miles per gallon (mpg), and compare it with that of competitors.
• The null hypothesis is that mean mpg is the same across all car manufacturers. The alternative hypothesis is that the mean mpg of at least one manufacturer differs.
• We will use ANOVA on fuel efficiency (mpg) for the same type of car. ANOVA is preferred over a t-test because it compares the means of a continuous numerical variable (mpg) across more than two groups (manufacturers) in a single test.
• We will collect measured mpg data for both MechaCar and its competitors, using a larger sample size to reduce error and narrow the confidence interval.
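The ANOVA comparison in the study design above can be sketched as a one-way F statistic. The function name and the three groups of mpg values are hypothetical and stand in for measurements that would still need to be collected. Turning F into a p-value requires the F distribution, for example through SciPy's `scipy.stats.f_oneway`.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic: between-group mean square over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical mpg samples for three manufacturers (assumption).
mpg_by_manufacturer = [
    [30.0, 32.0, 34.0],
    [28.0, 30.0, 32.0],
    [35.0, 37.0, 39.0],
]

f_stat = one_way_anova_f(mpg_by_manufacturer)
print(f"F = {f_stat:.2f}")
```

A large F means the manufacturer means differ by more than the spread within each manufacturer would suggest, which is the evidence against the null hypothesis of equal mean mpg.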