Preface

This repository is a continuation by members of Codaisseur classes #28 and #27 of the "Jobs Board" real-world project that was started by members of Codaisseur class #26. The original repo can be found at https://github.com/hastinc/Jobs-Board-Server.

Note: The /copy-jobs/:id endpoint can be used when you have to create a new database remotely (e.g. on Heroku). It copies the jobs from the Huntr API into the database in chunks of at most 1000 records instead of copying them all at once (which is what the /copy-jobs endpoint does). This works around Heroku's timeout issues. Call it with id = 1, 2, 3, ... until the response is 'No more jobs data available':

http POST :4000/copy-jobs/1
http POST :4000/copy-jobs/2
http POST :4000/copy-jobs/3
http POST :4000/copy-jobs/4
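
The manual sequence above can also be scripted. The following is a minimal sketch, not part of the repository: `copyAllJobs`, `postWithFetch` and the `maxBatches` safety limit are hypothetical names, and it assumes the text 'No more jobs data available' appears somewhere in the response body.

```javascript
// Hypothetical helper: `postBatch(id)` must send POST /copy-jobs/:id and
// resolve with the response body as text.
async function copyAllJobs(postBatch, maxBatches = 100) {
  for (let id = 1; id <= maxBatches; id++) {
    const body = await postBatch(id);
    // Stop as soon as the server reports that there is nothing left to copy.
    if (body.includes('No more jobs data available')) {
      return id - 1; // number of batches that actually copied data
    }
  }
  throw new Error(`Gave up after ${maxBatches} batches`);
}

// Example transport using the global fetch available in Node 18+.
const postWithFetch = (baseUrl) => async (id) => {
  const res = await fetch(`${baseUrl}/copy-jobs/${id}`, { method: 'POST' });
  return res.text();
};

// Usage: copyAllJobs(postWithFetch('http://localhost:4000')).then(console.log);
```

Passing the transport in as a function keeps the loop independent of the HTTP client, so it can be tried against a stub before pointing it at a real server.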

Connect to your database with:

Mac: Postico
Linux: DBeaver

Connect to API database:

Go to the Official Codaisseur Graduate Github --> Projects --> Jobs Board --> Credentials. Here you will find the credentials needed to access the API database.

If everything went well, you are now able to see populated companies, jobs, members, events and duplicates tables in your database.

Access

Go to the Official Codaisseur Graduate Github --> Projects --> Jobs Board --> Credentials. Here you will find the most recent token. If this token is no longer valid, ask your product owner for admin access to Codaisseur's Huntr account and then create a new token. If you cannot get admin access, ask your product owner for a valid token.

To implement the token:

Install the "dotenv" module.
Create a .env file in the root directory (/Jobs-Board-Server).
Copy the contents of .env.default into the .env file.
Insert the valid token manually into the .env file.

Your files should look like this:

./.env

API_TOKEN=<token>

./.env.default

API_TOKEN=
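
To fail early when the token is missing, a small guard can read it from the environment. This is only a sketch: `getApiToken` is a hypothetical helper, and it assumes `require('dotenv').config()` has already been called at the top of the entry file so that the values from ./.env are on `process.env`.

```javascript
// Assumes require('dotenv').config() already ran, so ./.env is loaded
// into process.env. `getApiToken` is a hypothetical helper name.
function getApiToken(env = process.env) {
  const token = (env.API_TOKEN || '').trim();
  if (!token) {
    throw new Error(
      'API_TOKEN is missing: copy .env.default to .env and fill in the token'
    );
  }
  return token;
}
```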

API

MODELS:

Companies -> employers inputted in Huntr by Codaisseur Graduates
Jobs -> jobs inputted in Huntr by Codaisseur Graduates. These are not open vacancies: each graduate creates a job when they apply for a position, so one real vacancy can have multiple jobs (when several Codaisseur Graduates applied to the same position)
Members -> Codaisseur Graduates
Events -> Actions performed by Codaisseur Graduates
Entries -> (not implemented in routes yet) timeline of Jobs in relation to Members

ENDPOINTS:

<base url> is either http://localhost:4000 for local development or https://frozen-meadow-51398.herokuapp.com for the deployed backend.

Fetches all the companies/jobs/members/events from the Huntr API and stores them in the database:

POST <base url>/copy-companies
POST <base url>/copy-members
POST <base url>/copy-jobs

Fetches jobs in batches of 1000 from the Huntr API and stores them in the database (used for pushing data to the Heroku database, to avoid the timeout). Usage: id=1 => jobs 0-999, id=2 => jobs 1000-1999, etc.:

POST <base url>/copy-jobs/:id
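
The id-to-range mapping described above is simple arithmetic. As an illustration only (the function name `batchRange` and its return shape are hypothetical, not the server's actual code):

```javascript
// Maps a 1-based batch id to the zero-based range of jobs it covers,
// e.g. id=1 covers jobs 0-999 and id=2 covers jobs 1000-1999.
function batchRange(id, batchSize = 1000) {
  if (!Number.isInteger(id) || id < 1) {
    throw new RangeError('id must be a positive integer');
  }
  const first = (id - 1) * batchSize;
  return { first, last: first + batchSize - 1 };
}
```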

WARNING: Only POST to /copy-events when running against your local database (http :4000/), NOT the Heroku deployment.

POST <base url>/copy-events

Fetches 12 companies from the database. Query parameters are page, sortBy and search:

GET <base url>/companies
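
A request URL with these query parameters can be built with the standard `URL` class. This is a sketch under assumptions: `companiesUrl` is a hypothetical helper, and the accepted values for `page` and `sortBy` are not documented here, so the ones in the usage comment are placeholders.

```javascript
// Builds a GET /companies URL; parameters that are not supplied are omitted.
function companiesUrl(baseUrl, { page, sortBy, search } = {}) {
  const url = new URL('/companies', baseUrl);
  if (page !== undefined) url.searchParams.set('page', String(page));
  if (sortBy) url.searchParams.set('sortBy', sortBy);
  if (search) url.searchParams.set('search', search);
  return url.toString();
}

// Usage (placeholder values):
// companiesUrl('http://localhost:4000', { page: 2, search: 'acme bv' })
```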

Fetches a company with a specified id from the database:

GET <base url>/companies/:id

Fetches all companies from the Huntr API without pagination:

GET <base url>/allcompanies

Fetches jobs with the Indeed scraper. Query parameters are query (i.e. description) and city:

GET <base url>/jobs

Webhook endpoint. Receives POST requests from the Huntr API every time a new “event” has occurred. See Huntr for more information:

POST <base url>/events

Fetches all events from the Huntr API:

GET <base url>/events

Fetches all active members from the Huntr API:

GET <base url>/members/active

Huntr

Token:

To create a valid token (if you have admin access to Huntr):

Admin —> developers —> Access Tokens —> Add Token

Webhook:

Current endpoint: https://frozen-meadow-51398.herokuapp.com/events

Please note that if you add a new endpoint or change the URL of the deployed API, it might take some time (e.g. 24 hours) before Huntr recognises it as a valid endpoint.

To create a new webhook endpoint: Admin —> developers —> Webhooks —> Add Endpoint

Also note that a webhook is always a POST endpoint and should always send back an HTTP status code of 200 as a response.

Events:

The Huntr API sends 2 types of events through to the webhook endpoint. These are identified by the “eventType” field: “JOB_ADDED” or “JOB_MOVED”. There are more event types; however, through testing we have noticed that Huntr only sends the 2 above-mentioned event types.
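
A webhook handler following these rules could look roughly like the sketch below. It is framework-independent and makes assumptions: `handleHuntrEvent` and the `stored` flag are hypothetical, and the only payload field relied on is “eventType”, since the rest of the payload shape is not described here.

```javascript
// The two event types that Huntr was observed to send.
const KNOWN_EVENT_TYPES = new Set(['JOB_ADDED', 'JOB_MOVED']);

// Decides what to do with an incoming webhook payload.
// Always answers with status 200, even for unknown or malformed payloads.
function handleHuntrEvent(payload) {
  const eventType = payload && payload.eventType;
  const stored = KNOWN_EVENT_TYPES.has(eventType);
  // A real handler would save the event to the database when `stored` is true.
  return { status: 200, stored };
}
```

In a route handler, the returned `status` would be sent back as the HTTP response code.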

Testing:

How to test incoming events:

Admin —> Boards —> Create Boards

Invite yourself or your colleague to the board and set the “advisor” to yourself. Test by adding jobs, moving jobs and setting dates. Expected result: Event entities created in the API database matching your input.

Notes:

The values of the date-related fields coming from Huntr are not accurate.

Please see the Huntr API documentation for more information.

De-duplication algorithm

Companies:

This module de-duplicates the companies you get by calling the Huntr API at the /employers endpoint.

The de-duplication algorithm first takes out the companies where no-one from Codaisseur applied.
Then we iterate over the list of companies and compare it to all the companies in the list for each iteration to find duplicates.
Before we actually compare the names we use regular expressions to transform the company names to lowercase and filter out all characters following “-” or “|”. This also ensures we only keep letters from the Latin alphabet and numbers.
Then we use the node package string-similarity to give us a matching score.
If it’s bigger than a set threshold we add the contents of the duplicate company to the company we’re comparing with. Then we remove the duplicate company.
Once we’re done iterating over the whole list of companies we add them to our database table called “companies”.
We also keep track of the thrown-away duplicates and store which company in the “companies” table they’re related to. This might be useful if you want to use other endpoints of the Huntr API and need the information of the thrown-away duplicates.
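
The steps above can be sketched as follows. This is an illustration, not the module's actual code: the field names (`id`, `name`, `jobCount`), the 0.8 threshold and all function names are assumptions, and a small bigram-overlap (Dice coefficient) function stands in for the string-similarity package so the block has no dependencies.

```javascript
// Lowercase, drop everything after "-" or "|", keep only a-z and 0-9.
function normalizeName(name) {
  return name
    .toLowerCase()
    .replace(/[-|].*$/, '')
    .replace(/[^a-z0-9]/g, '');
}

// Stand-in similarity score between 0 and 1 based on shared letter pairs.
function similarity(a, b) {
  if (a === b) return 1;
  if (a.length < 2 || b.length < 2) return 0;
  const pairs = new Map();
  for (let i = 0; i < a.length - 1; i++) {
    const pair = a.slice(i, i + 2);
    pairs.set(pair, (pairs.get(pair) || 0) + 1);
  }
  let overlap = 0;
  for (let i = 0; i < b.length - 1; i++) {
    const pair = b.slice(i, i + 2);
    const remaining = pairs.get(pair) || 0;
    if (remaining > 0) {
      pairs.set(pair, remaining - 1);
      overlap++;
    }
  }
  return (2 * overlap) / (a.length + b.length - 2);
}

// Returns the merged companies plus a record of which duplicate was
// folded into which kept company.
function dedupeCompanies(companies, threshold = 0.8) {
  const kept = [];
  const duplicates = [];
  // Step 1: drop companies nobody applied to (assumed to mean jobCount 0).
  for (const company of companies.filter((c) => c.jobCount > 0)) {
    const name = normalizeName(company.name);
    const match = kept.find(
      (k) => similarity(normalizeName(k.name), name) > threshold
    );
    if (match) {
      // Merge the duplicate into the company it matches, then remember it.
      match.jobCount += company.jobCount;
      duplicates.push({ id: company.id, companyId: match.id });
    } else {
      kept.push({ ...company });
    }
  }
  return { kept, duplicates };
}
```

With this sketch, “Acme B.V.” and “ACME BV | Amsterdam” both normalize to “acmebv” and are merged into one company.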

Jobs:

This module de-duplicates the jobs you get by calling the Huntr API at the /jobs endpoint.

The de-duplication algorithm first takes out the jobs where no-one from Codaisseur applied.
Then we iterate over the list of jobs and compare it to all the jobs in the list for each iteration to find duplicates.
It removes all duplicated job ids and returns only the non-duplicated ones.
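
One possible reading of the last step, assuming that two jobs count as duplicates when they share the same `id` (the exact criterion is not stated here, and `dedupeJobs` is a hypothetical name):

```javascript
// Keeps the first job seen for each id and drops later repeats.
function dedupeJobs(jobs) {
  const seen = new Set();
  return jobs.filter((job) => {
    if (seen.has(job.id)) return false;
    seen.add(job.id);
    return true;
  });
}
```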
