Problem statement: Scrape the data from the Techolution Careers website and store it, ordered by posting date (oldest first), as a DataFrame saved to CSV.
Website URL: https://techolution.app.param.ai/jobs/
To solve this problem the steps were:
1) Opened the website URL.
2) While inspecting the website I found that it sent three requests; one of them queried the job type, description, and location.
3) We copied this link: https://techolution.app.param.ai/api/career/get_job/?query=&locations=&category=&job_types= and also noted that the content type was JSON.
4) To extract information from this endpoint we used the requests Python package.
5) We loaded the response body as JSON using json.loads.
import requests
import json
import pandas as pd

# Fetch the job listings from the careers API endpoint found while inspecting the site
r = requests.get('https://techolution.app.param.ai/api/career/get_job/?query=&locations=&category=&job_types=')
# Decode the response body and parse it into a Python dict
j = json.loads(r.content.decode('UTF-8'))
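As a quick sanity check (a small sketch, not part of the original steps), we can confirm that the request succeeded and that the server really does return JSON, matching what we saw while inspecting the network requests:

# Confirm the request succeeded and the response is JSON
print(r.status_code)                     # expected: 200
print(r.headers.get('Content-Type'))     # expected to mention 'application/json'
# requests can also parse the body directly, equivalent to json.loads above
print(type(r.json()))                    # expected: <class 'dict'>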
We saved the JSON response to a file named data_file.json:
# Dump the parsed response to disk for later inspection
with open("data_file.json", "w") as write_file:
    json.dump(j, write_file)
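As an optional check (a sketch, assuming data_file.json was written to the working directory), the file can be read back to confirm it round-trips:

# Read the dump back and verify it has the same top-level keys
with open("data_file.json") as read_file:
    j_check = json.load(read_file)
assert j_check.keys() == j.keys()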
# Iterating over the dict yields its top-level keys
for row in j:
    print(row)
fil_locations
data
fil_job_types
fil_category
query_str
total_jobs
In the JSON response we found six keys, as shown above. On examining them, we observed that the data key alone is sufficient to give all the job-related information.
The structure of the data key is:
- data
  - categories (one key per job category)
    - jobs (list of job postings)
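To see this structure concretely, a small exploratory snippet (a sketch that mirrors the access pattern used in the extraction loop below) prints each category under data together with the number of jobs it holds and the fields available per job:

# Each key under 'data' is a job category; each category holds a 'jobs' list
for category in j['data']:
    jobs = j['data'][category]['jobs']
    print(category, len(jobs))
    if jobs:
        # Field names available for every job posting
        print(list(jobs[0].keys()))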
Inside jobs we can find the information related to each job. We store every job in a list arr; since locations is an array, we convert it into a string (keeping its first entry).
arr = []
for i in j['data']:
    # Each key under 'data' is a category holding a list of job postings
    for obj in j['data'][i]['jobs']:
        # 'locations' is a list; keep its first entry as a plain string
        locations = obj['locations']
        obj['locations'] = locations[0]
        arr.append(obj)
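The loop above keeps only the first location of each posting. If a job ever lists several locations (or none), a slightly more defensive variant (a sketch, assuming every entry in locations is a plain string, as the table further below suggests) joins them into one comma-separated string:

# Alternative to the loop above: keep every location, not just the first
arr = []
for i in j['data']:
    for obj in j['data'][i]['jobs']:
        # Join all listed locations into a single string; an empty list becomes ''
        obj['locations'] = ', '.join(obj['locations'])
        arr.append(obj)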
We collected the column names in a list obj and created a DataFrame from arr using those columns.
# Take the column names from the first collected job record
obj = []
for col in arr[0]:
    obj.append(col)
df = pd.DataFrame(arr, columns=obj)
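Since arr is already a list of dicts, pandas can also infer the column names on its own; the following one-liner (an equivalent shortcut, not what was used above) builds the same DataFrame:

# pandas derives the columns from the keys of each record in arr
df = pd.DataFrame(arr)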
We need to sort the DataFrame by posting date, so we first convert the created_at column with pandas to_datetime and then sort the DataFrame on that column.
# Parse the posting date and sort ascending so the oldest posting comes first
df['created_at'] = pd.to_datetime(df['created_at'])
df = df.sort_values('created_at')
df.head()
| | id | title | req_id | slug | created_at | locations | description | job_type | min_exp | max_exp | added_by | added_by_email | category | business_unit_name | organization_name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 20 | a933f8e2-dd5d-4a82-ab10-1209634d31c7 | Engineering Lead | 1861 | engineering-lead | 2019-02-08T13:11:49.966886Z | mauritius | <p><strong style="color: rgb(0, 0, 0); backgro... | Full-time | 84 | 216 | Rekha Allam | [email protected] | Information Technology | Cloud Automation - Mauritius | Techolution Mauritius |
| 19 | 4e17f47b-7916-411a-b0cd-90d5eeb6346f | DevOps Architect | 1873 | devops-architect | 2019-02-11T12:00:25.061831Z | Hyderabad | <p><span style="color: rgb(0, 0, 0); backgroun... | Full-time | 60 | 180 | Nikhil Shekhar | [email protected] | Information Technology | Cloud Automation - India | Techolution Pvt Ltd |
| 26 | 4e641217-901a-4670-886e-dd2946bf5476 | Machine Learning Engineer | 1898 | machine-learning-engineer | 2019-02-14T16:13:38.000894Z | Hyderabad | <p><strong style="color: rgb(51, 51, 51);">Tit... | Full-time | 36 | 60 | Madhav Kommineni | [email protected] | Facial recognition | FaceOpen | Techolution LLC |
| 18 | d4847f54-7a0a-44dd-b3a6-b93c7fb3cb7d | Sr SDET | 1903 | sr-sdet | 2019-02-14T16:38:50.411436Z | New York | <p>Techolution is a premier cloud, user interf... | Full-time | 36 | 120 | Satish Kumar | [email protected] | Information Technology | UI/UX Modernization - US | Techolution LLC |
| 17 | c4daf0d7-f86f-4d86-b62e-ab9117ba2800 | OSS DevOps Engineer | 1905 | oss-devops-engineer | 2019-02-14T16:55:20.844881Z | Hyderabad | <p><strong>Title : OSS DevOps Engineer</s... | Full-time | 72 | 144 | Pavan Kumar | [email protected] | Information Technology | Cloud Automation - India | Techolution Pvt Ltd |
# Write the sorted DataFrame to CSV without the index column
df.to_csv("jobfile.csv", encoding='utf-8', index=False)
We save the DataFrame as jobfile.csv; this is the required output file.
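As a final optional check (a sketch, assuming jobfile.csv sits in the working directory), the file can be read back to confirm the rows really are ordered oldest first:

# Read the CSV back and verify ascending posting-date order
check = pd.read_csv("jobfile.csv", parse_dates=['created_at'])
assert check['created_at'].is_monotonic_increasing
print(check[['title', 'created_at']].head())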