In the `data` folder of this repo there is a CSV file called `reactions.csv`. It contains real data corresponding to how users on Otta have reacted to (saved or skipped) jobs on the platform.
The reaction data consists of four columns:

- `user_id` - the integer ID of the user who liked or disliked the job
- `job_id` - the integer ID of the job the user interacted with
- `direction` - whether the user liked (`true`) or disliked (`false`) the job
- `time` - the timestamp corresponding to when they reacted to the job
Task: The similarity score between two users is the number of jobs which they both like. Find the two users with the highest similarity.
Answer: Users `5193` and `1791` have the highest similarity score of `181`.
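One possible way to compute this score is sketched below. It is a minimal illustration rather than the method used to produce the answer above. It assumes `direction` is stored as the text `true`/`false` and counts a repeated like of the same job by the same user only once; the function names `most_similar_users` and `load_rows` are hypothetical.

```python
import csv
from collections import Counter, defaultdict
from itertools import combinations


def most_similar_users(rows):
    """Return ((user_a, user_b), score) for the pair sharing the most liked jobs.

    `rows` is any iterable of dicts with `user_id`, `job_id` and `direction` keys.
    Returns None if no two users like the same job.
    """
    # Map each job to the set of users who liked it.
    # Assumption: `direction` holds the text "true" or "false".
    likers = defaultdict(set)
    for row in rows:
        if str(row["direction"]).strip().lower() == "true":
            likers[int(row["job_id"])].add(int(row["user_id"]))

    # Every pair of users who liked the same job gains one point.
    scores = Counter()
    for users in likers.values():
        scores.update(combinations(sorted(users), 2))

    return scores.most_common(1)[0] if scores else None


def load_rows(path):
    """Yield one dict per CSV row, keyed by the header names."""
    with open(path, newline="") as f:
        yield from csv.DictReader(f)
```

Calling `most_similar_users(load_rows("data/reactions.csv"))` would return the best pair and its score. Counting pairs per job avoids comparing every user with every other user, although a job liked by very many users still generates a quadratic number of pairs.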
In the `data` folder there is an additional CSV file called `jobs.csv`. It contains unique integer IDs for over 12,000 jobs, along with integer IDs for the job's associated company.
Task: The similarity score between two companies is the number of users who like at least one job at both companies. Using both the `reactions.csv` and `jobs.csv` data, find the two companies with the highest similarity score.
Answer: Companies `92` and `46` have the highest similarity score of `104`.
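The company score can be sketched in a similar way. This is an illustration under stated assumptions, not necessarily how the answer above was obtained: the column names `job_id` and `company_id` for `jobs.csv` are guesses (the file's header is not described here), `direction` is assumed to be the text `true`/`false`, and `most_similar_companies` is a hypothetical name.

```python
from collections import Counter, defaultdict
from itertools import combinations


def most_similar_companies(reactions, jobs):
    """Return ((company_a, company_b), score) for the most similar pair.

    `reactions` is an iterable of dicts with `user_id`, `job_id`, `direction`.
    `jobs` is an iterable of dicts with `job_id` and `company_id`
    (assumed column names). Returns None if no pair shares a user.
    """
    # Look up the company that each job belongs to.
    company_of = {int(j["job_id"]): int(j["company_id"]) for j in jobs}

    # For each user, collect the companies where they liked at least one job.
    # Using a set means several liked jobs at one company count only once.
    liked = defaultdict(set)
    for r in reactions:
        if str(r["direction"]).strip().lower() != "true":
            continue
        company = company_of.get(int(r["job_id"]))
        if company is not None:  # skip reactions to jobs missing from `jobs`
            liked[int(r["user_id"])].add(company)

    # Each user adds one point to every pair of companies they liked.
    scores = Counter()
    for companies in liked.values():
        scores.update(combinations(sorted(companies), 2))

    return scores.most_common(1)[0] if scores else None
```

Both arguments can be read with `csv.DictReader` from `data/reactions.csv` and `data/jobs.csv`, provided the header names match the assumptions above.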