This repo contains the data and code for our final project for CS 5112, Algorithms and Data Structures for Applications (Fall 2021). The project explores clustering and nearest-neighbor approaches for matching users, assuming that users prefer partners similar to themselves. The goal was to compare different algorithms for grouping users and then to provide many-to-one matches within these groups based on users’ sex and orientation.
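As a rough illustration of the cluster-then-match idea described above, the following is a minimal sketch and not the project's actual pipeline. It assumes purely numeric features, uses scikit-learn's `KMeans` and `NearestNeighbors` as stand-ins for whichever algorithms the project compares, and runs on synthetic data. The function name `match_within_clusters` and its parameters are hypothetical, and the sex and orientation filter is left out for brevity.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import NearestNeighbors


def match_within_clusters(features, n_clusters=3, k=2, seed=0):
    """Cluster users, then list up to k nearest users from the same cluster.

    Hypothetical helper for illustration only; a real pipeline would also
    filter candidates by sex and orientation.
    """
    labels = KMeans(
        n_clusters=n_clusters, n_init=10, random_state=seed
    ).fit_predict(features)

    matches = {}
    for cluster in np.unique(labels):
        idx = np.flatnonzero(labels == cluster)  # user indices in this cluster
        if len(idx) < 2:
            # A cluster with a single user has nobody to match with.
            for i in idx:
                matches[int(i)] = []
            continue
        # Ask for one extra neighbor because a user is its own nearest point.
        n_neighbors = min(k + 1, len(idx))
        nn = NearestNeighbors(n_neighbors=n_neighbors).fit(features[idx])
        _, neighbors = nn.kneighbors(features[idx])
        for row, i in enumerate(idx):
            candidates = [int(idx[j]) for j in neighbors[row] if idx[j] != i]
            matches[int(i)] = candidates[:k]
    return labels, matches


# Synthetic stand-in for the real user data (assumption: 30 users, 4 features).
rng = np.random.default_rng(0)
features = rng.normal(size=(30, 4))
labels, matches = match_within_clusters(features, n_clusters=3, k=2)
print(matches[0])  # candidate matches for user 0
```

Each user receives a short list of candidates drawn only from their own cluster, so one user can appear in several other users' lists. Mixed categorical data, which the kmodes and gower libraries suggest the project handles, would need a different distance measure than the Euclidean one assumed here.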
Libraries needed to run the code: pandas, numpy, matplotlib, scikit-learn (imported as sklearn), kmodes, umap-learn, gower
The raw data file is larger than 100 MB and therefore can't be pushed to GitHub. It can be found here: https://drive.google.com/file/d/1GcXNikA94s_APjyrUpNu8HJYNucVbHsY/view?usp=sharing