taykiathong/IR_Project

50.045 Information Retrieval Project

JobMatch

Our project focuses on search queries about jobs. As near-graduates ourselves, our team understands how painful job hunting can be, and we often go online to look for reviews of the job and/or the company we are applying to. These reviews help us better understand the position and the company, and give a general sense of how current and past staff feel about the role.

Job_reviews.csv → original CSV file

Cleaned_data.csv → CSV file after data cleaning and preprocessing

Train_cleaned_data.csv → training split: 80% of cleaned_data.csv, taken after shuffling

Test_cleaned_data.csv → test split: the remaining 20% of cleaned_data.csv

Cleandata.ipynb → notebook with the initial steps to clean and preprocess the original CSV file

CosineSimilarity.ipynb → notebook that runs cosine similarity (with and without POS tagging), relevance feedback, and Average Precision; also includes the other methods explored, such as AND, OR, Jaccard similarity, and Boolean retrieval

Bm25_og.ipynb → original BM25 model code

Bm25_postag.ipynb → BM25 model code with added weights for specific terms such as location, place, and position

bm25_ap → notebook for BM25 with relevance feedback and Average Precision
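
For readers who want a feel for the pipeline before opening the notebooks, here is a minimal sketch of three steps named in the file list: an 80/20 split after shuffling, BM25 ranking, and Average Precision. It is not the code from the notebooks. The function names, the whitespace tokenization, the sample reviews, the relevance labels, and the `k1`/`b` values (common defaults) are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def split_train_test(rows, train_fraction=0.8, seed=0):
    """Shuffle a copy of rows, then cut it into train and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(doc) for doc in docs_tokens) / n_docs
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs_tokens for term in set(doc))
    scores = []
    for doc in docs_tokens:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in term_freq:
                continue
            df = doc_freq[term]
            # Smoothed IDF variant that stays positive for common terms.
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            tf = term_freq[term]
            length_norm = k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores


def average_precision(ranked_ids, relevant_ids):
    """Mean of precision@k taken at each rank k that holds a relevant item."""
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids) if relevant_ids else 0.0


# Hypothetical reviews and relevance labels, made up for this sketch.
reviews = [
    "great pay and friendly team for software engineer",
    "long hours poor management in sales",
    "software engineer role with good work life balance",
]
docs = [review.split() for review in reviews]
scores = bm25_scores("software engineer".split(), docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
ap = average_precision(ranking, relevant_ids={0, 2})

train_rows, test_rows = split_train_test(list(range(10)))
print(ranking, ap, len(train_rows), len(test_rows))
```

Average Precision here divides by the total number of relevant documents, so relevant items that never appear in the ranking lower the score. The weighted variant described for Bm25_postag.ipynb could be approximated by multiplying a term's contribution by a per-term weight, but how the notebook actually does this is not shown here.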
