amitesh0109/factor-analysis

Factor-analysis on cereal dataset

In this project we conduct a factor analysis on the cereal dataset. Factor analysis is used to assess the structure of our data by evaluating the correlations between variables. It summarizes the data in a few dimensions by condensing a large number of variables into a smaller set of hidden factors that we do not directly measure or observe, but which may be easier to interpret. Using this analysis we can model each original variable as a linear function of these underlying factors. For the analysis to be successful, variables within the same group should be highly correlated, with only small correlations among variables from different groups.

Principal component analysis (PCA) and factor analysis are very similar methods for reducing multivariate data. The difference between them is that factor analysis assumes the existence of a few common factors driving the variation in the data, while principal component analysis makes no such assumption. Other techniques are often used alongside them to explore high-dimensional data, such as heatmaps and t-SNE plots.
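The statement that each original variable is a linear function of the common factors can be written out as the standard common factor model. This is a textbook formulation rather than anything specific to this project; the symbols (p observed variables, m factors, loadings λ, unique errors ε) are notation introduced here for illustration.

```latex
x_j = \mu_j + \sum_{k=1}^{m} \lambda_{jk} f_k + \varepsilon_j,
\qquad j = 1, \dots, p, \quad m < p

% Assuming the factors f_k are uncorrelated with unit variance and are
% uncorrelated with the errors \varepsilon_j, the implied covariance is
\Sigma = \Lambda \Lambda^{\top} + \Psi,
\qquad \Psi = \operatorname{diag}(\psi_1, \dots, \psi_p)
```

Here Λ is the p × m matrix of loadings and Ψ holds the unique variances. The ΛΛᵀ term is what produces high correlations within a group of variables that load on the same factor.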

A PCA plot condenses the correlations among all of the observations into a 2D graph, in which observations that are highly correlated cluster together. The analysis is conducted on a multidimensional dataset that includes data on different cereal brands and the responses to questions asked in a survey. Here I have used PCA, which is not completely different from factor analysis and gets the job done. The statistics involved amount to finding the principal components and using them to deduce the factors that affect a particular cereal brand.
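To make the PCA step concrete, here is a minimal sketch in plain Python. It is not the code used in this project: the data is synthetic (five hypothetical survey items, three driven by one hidden factor and two by another), and the eigenvectors are found by power iteration with deflation purely to keep the example dependency-free. In practice a library routine such as NumPy's eigendecomposition or scikit-learn's PCA would be the usual choice.

```python
import math
import random


def standardize(rows):
    """Center each column and scale it to unit sample standard deviation."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]


def correlation_matrix(z):
    """Correlation matrix of data that has already been standardized."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]


def top_component(mat, rng, iters=500):
    """Leading eigenvector and eigenvalue of a symmetric matrix (power iteration)."""
    p = len(mat)
    v = [rng.gauss(0, 1) for _ in range(p)]
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the unit vector v.
    eigval = sum(v[i] * sum(mat[i][j] * v[j] for j in range(p)) for i in range(p))
    return v, eigval


def pca(rows, k=2, seed=0):
    """Return the first k components, their explained-variance ratios, and scores."""
    rng = random.Random(seed)
    z = standardize(rows)
    mat = correlation_matrix(z)
    p = len(mat)
    components, eigvals = [], []
    for _ in range(k):
        v, lam = top_component(mat, rng)
        components.append(v)
        eigvals.append(lam)
        # Deflation: remove the component just found so the next one surfaces.
        mat = [[mat[i][j] - lam * v[i] * v[j] for j in range(p)] for i in range(p)]
    # Project each standardized observation onto the components (2D coordinates).
    scores = [[sum(r[j] * v[j] for j in range(p)) for v in components] for r in z]
    # The trace of a correlation matrix is p, so eigenvalue / p is the variance share.
    explained = [lam / p for lam in eigvals]
    return components, explained, scores


# Synthetic stand-in for survey data (an assumption, not the cereal dataset):
# items 0-2 follow hidden factor a, items 3-4 follow hidden factor b.
rng = random.Random(0)
data = []
for _ in range(300):
    a, b = rng.gauss(0, 1), rng.gauss(0, 1)
    data.append([a + 0.4 * rng.gauss(0, 1) for _ in range(3)]
                + [b + 0.6 * rng.gauss(0, 1) for _ in range(2)])

components, explained, scores = pca(data, k=2)
print("loadings of component 1:", [round(x, 2) for x in components[0]])
print("explained variance ratios:", [round(x, 2) for x in explained])
```

The `scores` list holds the 2D coordinates that a PCA plot would draw. The loadings in `components` show which items move together, which is the information used to read the components as underlying factors.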
