Solution that placed 39th on the leaderboard.
https://www.hackerearth.com/machine-learning-india-hacks-2016/machine-learning/will-bill-solve-it/
Both the training and testing datasets consist of 3 files:
Attributes of a user:
user_id - the user id
skills - all of the user's skills, separated by the delimiter '|'
solved_count - number of problems solved by the user
attempts - total number of incorrect submissions made by the user
user_type - type of user (S - Student, W - Working, NA - No Information Available)
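
As a minimal sketch of how the '|'-delimited skills field could be turned into model inputs, the snippet below splits each string and builds a multi-hot vector. The helper names (`parse_skills`, `multi_hot`) and the sample values are illustrative assumptions, not taken from run.py.

```python
def parse_skills(raw):
    """Split a '|'-delimited skills string into a list of skill names."""
    if raw is None or raw.strip() == "":
        return []
    return [part.strip() for part in raw.split("|") if part.strip()]


def multi_hot(skills, vocabulary):
    """Return a 0/1 vector marking which vocabulary entries the user has."""
    return [1 if name in skills else 0 for name in vocabulary]


# Made-up example values for the skills column (assumption).
raw_skills = ["C++|Java", "Python", ""]

parsed = [parse_skills(value) for value in raw_skills]
# Build the vocabulary from every skill seen, in a stable order.
vocabulary = sorted({name for skills in parsed for name in skills})
encoded = [multi_hot(skills, vocabulary) for skills in parsed]
```

An empty skills string yields an all-zero vector, so users with no listed skills stay in the data rather than being dropped.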
Attributes of a problem:
problem_id - the id of the problem
level - difficulty of the problem (Very-Easy, Easy, Easy-Medium, Medium, Medium-Hard, Hard)
accuracy - the accuracy score for the problem
solved_count - number of people who have solved it
error_count - number of people who have submitted an incorrect solution to it
rating - star (quality) rating of the problem on a scale of 0-5
tag1 - tag of the problem representing the type e.g. Data Structures
tag2 - tag of the problem
tag3 - tag of the problem
tag4 - tag of the problem
tag5 - tag of the problem
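
Because the level values are ordered by difficulty, one possible encoding is an integer rank, with the five tag columns collected into a single set. This is a hedged sketch: the names `LEVELS`, `encode_level`, and `collect_tags`, the `-1` marker for unknown levels, and the sample row are assumptions for illustration.

```python
# Difficulty levels in the order listed in the dataset description.
LEVELS = ["Very-Easy", "Easy", "Easy-Medium", "Medium", "Medium-Hard", "Hard"]
LEVEL_RANK = {name: rank for rank, name in enumerate(LEVELS)}


def encode_level(level, missing=-1):
    """Map a level name to its difficulty rank; unknown values map to `missing`."""
    return LEVEL_RANK.get(level, missing)


def collect_tags(row):
    """Gather the non-empty values of tag1..tag5 into a set."""
    tags = set()
    for i in range(1, 6):
        value = row.get("tag%d" % i)
        if value:
            tags.add(value)
    return tags


# Made-up problem row (assumption).
problem = {"level": "Medium", "tag1": "Data Structures", "tag2": "Trees", "tag3": ""}
rank = encode_level(problem["level"])
tags = collect_tags(problem)
```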
Problem-user interactions and the final result of each attempt a user made to solve a particular problem:
user_id - the id of the user who made a submission
problem_id - the id of the problem that was attempted
solved_status - indicates whether the submission was correct (SO: Solved or Correct solution, AT: Attempted or Incorrect solution)
result - result of the code execution (PAC: Partially Accepted, AC: Accepted, TLE: Time Limit Exceeded, CE: Compilation Error, RE: Runtime Error, WA: Wrong Answer)
language_used - the language used by the user to code the solution
execution_time - the execution time of the solution
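
The three files share the user_id and problem_id keys, so each submission can be joined with its user and problem attributes. Below is a minimal sketch of that join, assuming solved_status is the prediction target and mapping SO to 1 and AT to 0. The sample records and the `build_rows` helper are hypothetical. Column names are prefixed because solved_count appears in both the user and problem files.

```python
# Made-up records keyed by id (assumption); real data would be read from the CSV files.
users = {"u1": {"solved_count": 12, "attempts": 30, "user_type": "S"}}
problems = {"p1": {"level": "Easy", "accuracy": 0.62, "solved_count": 150}}
submissions = [
    {"user_id": "u1", "problem_id": "p1", "solved_status": "SO"},
    {"user_id": "u1", "problem_id": "p1", "solved_status": "AT"},
    {"user_id": "u9", "problem_id": "p1", "solved_status": "AT"},  # unknown user
]


def build_rows(submissions, users, problems):
    """Join each submission with its user and problem attributes."""
    rows = []
    for sub in submissions:
        user = users.get(sub["user_id"])
        problem = problems.get(sub["problem_id"])
        if user is None or problem is None:
            continue  # skip submissions that cannot be matched
        row = {"user_" + key: value for key, value in user.items()}
        row.update({"problem_" + key: value for key, value in problem.items()})
        row["solved"] = 1 if sub["solved_status"] == "SO" else 0
        rows.append(row)
    return rows


rows = build_rows(submissions, users, problems)
```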
Calculated a custom feature from the available user's tags. Used a hard-voting classifier combining AdaBoostClassifier and RandomForestClassifier.
python run.py
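
For reference, a hard-voting ensemble of AdaBoostClassifier and RandomForestClassifier can be sketched with scikit-learn's `VotingClassifier`. This is not the contents of run.py: the synthetic data, hyperparameters, and variable names are assumptions chosen only to make the example self-contained. Note that with two voters a disagreement is a tie, which scikit-learn resolves by picking the class that comes first in ascending sort order.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the merged user/problem feature matrix (assumption).
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hard voting: each estimator casts one vote per sample for its predicted class.
ensemble = VotingClassifier(
    estimators=[
        ("ada", AdaBoostClassifier(n_estimators=50, random_state=0)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    voting="hard",
)
ensemble.fit(X_train, y_train)
predictions = ensemble.predict(X_test)
accuracy = ensemble.score(X_test, y_test)
```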