This is the code used for the classification.
-
cluster_kmeans clusters the data with k-means; the results were not good, though.
-
classification extracts feature vectors from the descriptions using TF-IDF, then classifies them with several different methods. This is the main part of the code.
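As a rough illustration of the TF-IDF step, here is a minimal, dependency-free sketch. It is not the code in classification: the function names, the sample descriptions, the labels, and the nearest-centroid classifier are all assumptions made for the example, and in practice a library such as scikit-learn would normally do this work.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Simplistic tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


def fit_idf(docs):
    """Compute a smoothed inverse document frequency for every training term."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))  # count each term once per document
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}


def tfidf(doc, idf):
    """Return an L2-normalised sparse TF-IDF vector; unseen terms are dropped."""
    tf = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}


def train_centroids(docs, labels, idf):
    """Average the TF-IDF vectors of each class into one centroid per label."""
    sums = defaultdict(Counter)
    counts = Counter(labels)
    for doc, label in zip(docs, labels):
        for t, v in tfidf(doc, idf).items():
            sums[label][t] += v
    return {
        label: {t: v / counts[label] for t, v in vec.items()}
        for label, vec in sums.items()
    }


def predict(doc, idf, centroids):
    """Pick the label whose centroid has the largest dot product with the document."""
    vec = tfidf(doc, idf)

    def score(label):
        return sum(v * centroids[label].get(t, 0.0) for t, v in vec.items())

    return max(centroids, key=score)


# Hypothetical training data, for illustration only.
train_docs = [
    "wireless bluetooth headphones with microphone",
    "usb charging cable for phone",
    "cotton t-shirt short sleeve",
    "denim jeans slim fit",
]
train_labels = ["electronics", "electronics", "clothing", "clothing"]

idf = fit_idf(train_docs)
centroids = train_centroids(train_docs, train_labels, idf)
print(predict("bluetooth phone charger", idf, centroids))
```

The nearest-centroid rule is only one possible classifier; any method that accepts the same feature vectors could be swapped in.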
-
Keyword_search_classifying creates our training dataset. We label the data by keyword search, then check the labels to make sure they are correct.
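The keyword-search labelling could look roughly like the sketch below. The keyword lists, the label names, and the label_by_keywords function are hypothetical, not taken from Keyword_search_classifying. Returning None for descriptions that match no label or more than one is an assumption made here so that such cases are left for checking.

```python
# Hypothetical keyword lists; the real ones depend on the dataset.
KEYWORDS = {
    "electronics": ["bluetooth", "usb", "charger"],
    "clothing": ["shirt", "jeans", "cotton"],
}


def label_by_keywords(description, keywords=KEYWORDS):
    """Return the single matching label, or None if zero or several labels match."""
    text = description.lower()
    # Plain substring matching, so "shirt" also matches "t-shirt".
    hits = [
        label
        for label, words in keywords.items()
        if any(word in text for word in words)
    ]
    return hits[0] if len(hits) == 1 else None


descriptions = [
    "USB charger for laptops",
    "Cotton shirt",
    "Garden hose",
    "Shirt with USB heater",
]
labels = [label_by_keywords(d) for d in descriptions]
print(labels)
```

Descriptions that come back as None are the ones to label or discard by hand before the set is used for training.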