A system that automatically tags questions on Stack Overflow, so that information is classified efficiently and the user experience improves.
->Multi-label classification system that automatically tags users’ questions.
->Linear SVC, logistic regression, and SGD models, each trained with the one-vs-rest classification method.
->Aims to predict as many tags as possible with a high F1 score and a low Hamming loss.
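The two evaluation goals named above can be made concrete with a small sketch. It assumes each question's tags are held as a Python set and uses micro-averaged F1, which is only one common choice, since the averaging scheme is not stated here. The function names, tag vocabulary, and toy data are hypothetical and are not taken from the project.

```python
from typing import List, Set


def hamming_loss(true_tags: List[Set[str]], pred_tags: List[Set[str]],
                 all_tags: List[str]) -> float:
    """Fraction of (question, tag) slots that were predicted wrongly.

    Assumes every true and predicted tag belongs to all_tags.
    """
    wrong = 0
    for truth, pred in zip(true_tags, pred_tags):
        # Symmetric difference counts missed tags plus spurious tags.
        wrong += len(truth ^ pred)
    return wrong / (len(true_tags) * len(all_tags))


def micro_f1(true_tags: List[Set[str]], pred_tags: List[Set[str]]) -> float:
    """F1 computed from counts pooled over all questions and tags."""
    tp = sum(len(t & p) for t, p in zip(true_tags, pred_tags))
    fp = sum(len(p - t) for t, p in zip(true_tags, pred_tags))
    fn = sum(len(t - p) for t, p in zip(true_tags, pred_tags))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: three questions, four possible tags.
all_tags = ["python", "pandas", "java", "sql"]
y_true = [{"python", "pandas"}, {"java"}, {"sql", "python"}]
y_pred = [{"python"}, {"java", "sql"}, {"sql", "python"}]

print(round(hamming_loss(y_true, y_pred, all_tags), 3))  # 0.167 (2 wrong slots out of 12)
print(micro_f1(y_true, y_pred))                          # 0.8 (tp=4, fp=1, fn=1)
```

A lower Hamming loss means fewer wrong tag decisions overall, while a higher F1 rewards recovering the true tags without adding spurious ones. scikit-learn provides equivalent metrics as `sklearn.metrics.hamming_loss` and `sklearn.metrics.f1_score`.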
Prediction Models
->Logistic Regression - a binary classification model that estimates the probability of a discrete outcome given an input.
->Linear SVC (Support Vector Classifier) - fits a "best fit" hyperplane that separates the data you provide into classes.
->Stochastic Gradient Descent (SGD) - a variant of gradient descent in which each update uses randomly selected samples instead of the whole dataset.
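As a rough illustration of how these ideas fit together, the sketch below trains one logistic regression model per tag (one-vs-rest) and fits each one with stochastic gradient descent, visiting samples in random order. It is a standard-library toy, not the project's implementation. The feature vectors, tag names, and hyperparameters are all hypothetical. A real pipeline would more likely use scikit-learn's `OneVsRestClassifier` with `LogisticRegression`, `LinearSVC`, or `SGDClassifier`.

```python
import math
import random
from typing import Dict, List, Set, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(w: List[float], b: float, x: List[float]) -> float:
    """Linear score w.x + b."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_binary_sgd(X: List[List[float]], y: List[int], epochs: int = 200,
                     lr: float = 0.5, seed: int = 0) -> Tuple[List[float], float]:
    """Logistic regression for a single tag, updated one random sample at a time."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)  # random sample order is the "stochastic" part
        for i in order:
            err = sigmoid(score(w, b, X[i])) - y[i]  # log-loss gradient w.r.t. the score
            w = [wj - lr * err * xj for wj, xj in zip(w, X[i])]
            b -= lr * err
    return w, b


def train_one_vs_rest(X: List[List[float]], Y: List[Set[str]],
                      tags: List[str]) -> Dict[str, Tuple[List[float], float]]:
    """One independent binary model per tag: 'has this tag' vs. 'does not'."""
    return {tag: train_binary_sgd(X, [1 if tag in labels else 0 for labels in Y])
            for tag in tags}


def predict_tags(models: Dict[str, Tuple[List[float], float]], x: List[float],
                 threshold: float = 0.5) -> Set[str]:
    """Return every tag whose predicted probability reaches the threshold."""
    return {tag for tag, (w, b) in models.items()
            if sigmoid(score(w, b, x)) >= threshold}


# Hypothetical bag-of-words features over the vocabulary ["dataframe", "jvm", "select"].
X = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 1], [0, 1, 1]]
Y = [{"python"}, {"java"}, {"sql"}, {"python", "sql"}, {"java", "sql"}]
tags = ["python", "java", "sql"]

models = train_one_vs_rest(X, Y, tags)
print(predict_tags(models, [1, 0, 1]))
```

Because each tag gets its own binary model, a question can receive any number of tags, which is what makes the one-vs-rest setup suitable for multi-label tagging. Replacing the log-loss gradient with a hinge-loss update would give a linear SVM-style classifier in the same framework.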