LIRNEasia/socioeconomic-index

Introduction

A socioeconomic index, also known as a deprivation index, is a single numerical figure that gauges the socioeconomic status of a predefined area. It combines multiple socioeconomic characteristics, weighted by their relative significance. Such an index allows direct comparison of socioeconomic status between regions and helps identify patterns and correlations between socioeconomic status and other attributes. This repository is our attempt at creating a socioeconomic index for Sri Lanka.

Dataset

We used the 2011 national census datasets. This repository contains a cleaned version of these datasets. The table below lists each variable in the datasets and its name in the code.

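| Dataset | Category | Variable | Name in Code |
| --- | --- | --- | --- |
| Household | Cooking Fuel | Firewood | coo_firewood |
| Household | Cooking Fuel | Kerosene | coo_kerosene |
| Household | Cooking Fuel | Gas | coo_gas |
| Household | Cooking Fuel | Electricity | coo_electricity |
| Household | Cooking Fuel | Sawdust / Paddy husk | coo_sawdust_paddyhusk |
| Household | Cooking Fuel | Other | coo_other |
| Household | Floor Material | Cement | flo_cement |
| Household | Floor Material | Tile / Granite / Terrazzo | flo_tile_granite_terrazzo |
| Household | Floor Material | Mud | flo_mud |
| Household | Floor Material | Wood | flo_wood |
| Household | Floor Material | Sand | flo_sand |
| Household | Floor Material | Concrete | flo_concrete |
| Household | Floor Material | Other | flo_other |
| Household | Housing | Permanent | hou_permanent |
| Household | Housing | Semi-permanent | hou_semipermanent |
| Household | Housing | Improvised | hou_improvised |
| Household | Housing | Unclassified | hou_unclassified |
| Household | Lighting | National Grid | lig_nationalgrid |
| Household | Lighting | Hydro Power | lig_hydro |
| Household | Lighting | Kerosene | lig_kerosene |
| Household | Lighting | Solar Power | lig_solar |
| Household | Lighting | Biogas | lig_biogas |
| Household | Lighting | Other | lig_other |
| Household | Roof Material | Tile | roo_tile |
| Household | Roof Material | Asbestos | roo_asbestos |
| Household | Roof Material | Concrete | roo_concrete |
| Household | Roof Material | Zinc / Aluminium sheet | roo_zinc_aluminium |
| Household | Roof Material | Metal sheet | roo_metal |
| Household | Roof Material | Cadjan / Palmyrah / Straw | roo_cadjan_palmyrah_straw |
| Household | Roof Material | Other | roo_other |
| Household | Structure | Single - 1 story | str_single_1 |
| Household | Structure | Single - 2 story | str_single_2 |
| Household | Structure | Single - 3+ story | str_single_3 |
| Household | Structure | Attached house / Annex | str_attachedhouse_annex |
| Household | Structure | Flat | str_flat |
| Household | Structure | Condominium | str_condominium |
| Household | Structure | Twin house | str_twinhouse |
| Household | Structure | Row / Line room | str_room |
| Household | Structure | Hut / Shanty | str_hut_shanty |
| Household | Tenure | Owned | ten_owned |
| Household | Tenure | Rent / Lease - government owned | ten_rent_gov |
| Household | Tenure | Rent / Lease - private owned | ten_rent_pvt |
| Household | Tenure | Rent free | ten_rent_free |
| Household | Tenure | Encroached | ten_encroached |
| Household | Tenure | Other | ten_other |
| Household | Toilet Facilities | Water Seal - connected to sewer | toi_waterseal_sewer |
| Household | Toilet Facilities | Water Seal - connected to septic tank | toi_waterseal_tank |
| Household | Toilet Facilities | Pour flush | toi_pourflush |
| Household | Toilet Facilities | Direct pit | toi_directpit |
| Household | Toilet Facilities | Other | toi_other |
| Household | Toilet Facilities | No toilet | toi_none |
| Household | Wall Material | Brick | wal_brick |
| Household | Wall Material | Cement block / Stone | wal_cementblock_stone |
| Household | Wall Material | Cabook | wal_cabook |
| Household | Wall Material | Soil brick | wal_soilbrick |
| Household | Wall Material | Mud | wal_mud |
| Household | Wall Material | Cadjan / Palmyrah | wal_cadjan_palmyrah |
| Household | Wall Material | Plank / Metal sheet | wal_plank_metal |
| Household | Wall Material | Other | wal_other |
| Household | Waste Disposal | Collected by government | was_collect_gov |
| Household | Waste Disposal | Burned | was_burn |
| Household | Waste Disposal | Buried | was_bury |
| Household | Waste Disposal | Composted | was_compost |
| Household | Waste Disposal | Disposed into environment | was_dispose_env |
| Household | Waste Disposal | Other | was_other |
| Household | Water Source | Protected well - within premises | wat_well_prot_in |
| Household | Water Source | Protected well - outside premises | wat_well_prot_out |
| Household | Water Source | Unprotected well | wat_well_unprot |
| Household | Water Source | Tap - within unit | wat_tap_unit_in |
| Household | Water Source | Tap - outside unit but within premises | wat_tap_prem_in |
| Household | Water Source | Tap - outside premises | wat_tap_prem_out |
| Household | Water Source | Rural water projects | wat_rural |
| Household | Water Source | Tube well | wat_tubewell |
| Household | Water Source | Bowser | wat_bowser |
| Household | Water Source | River / Tank / Stream | wat_river_tank_stream |
| Household | Water Source | Rain water | wat_rain |
| Household | Water Source | Bottled water | wat_bottled |
| Household | Water Source | Other | wat_other |
| Population | Age | 0 - 4 | age_y0_4 |
| Population | Age | 5 - 9 | age_y5_9 |
| Population | Age | 10 - 14 | age_y10_14 |
| Population | Age | 15 - 19 | age_y15_19 |
| Population | Age | 20 - 24 | age_y20_24 |
| Population | Age | 25 - 29 | age_y25_29 |
| Population | Age | 30 - 34 | age_y30_34 |
| Population | Age | 35 - 39 | age_y35_39 |
| Population | Age | 40 - 44 | age_y40_44 |
| Population | Age | 45 - 49 | age_y45_49 |
| Population | Age | 50 - 54 | age_y50_54 |
| Population | Age | 55 - 59 | age_y55_59 |
| Population | Age | 60 - 64 | age_y60_64 |
| Population | Age | 65 - 69 | age_y65_69 |
| Population | Age | 70 - 74 | age_y70_74 |
| Population | Age | 75 - 79 | age_y75_79 |
| Population | Age | 80 - 84 | age_y80_84 |
| Population | Age | 85 - 89 | age_y85_89 |
| Population | Age | 90 - 94 | age_y90_94 |
| Population | Age | 95 & above | age_y95_above |
| Population | Education | Primary | edu_primary |
| Population | Education | Secondary | edu_secondary |
| Population | Education | O Level | edu_olevel |
| Population | Education | A Level | edu_alevel |
| Population | Education | Degree & Above | edu_degree |
| Population | Education | No Schooling | edu_none |
| Population | Employment | Employed | emp_employed |
| Population | Employment | Unemployed | emp_unemployed |
| Population | Employment | Economically Inactive | emp_inactive |
| Population | Gender | Male | gen_male |
| Population | Gender | Female | gen_female |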

Note: The 2011 national census datasets are only available as summary counts at the Grama Niladhari Division (GND) level. The original categorical variables, surveyed at the household level, have been converted to binary variables and aggregated for each GND. This aggregation obscures certain correlations between variables, so our results are suboptimal.
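To illustrate the kind of aggregation the note describes, here is a minimal sketch using pandas. The household-level records, the `gnd` column, and the category labels are all made up for illustration; the repository only contains the aggregated counts, and the census office's actual processing may differ.

```python
import pandas as pd

# Hypothetical household-level records (not part of this repository).
households = pd.DataFrame({
    "gnd": ["A", "A", "A", "B", "B"],
    "cooking_fuel": ["firewood", "gas", "firewood", "gas", "gas"],
    "lighting": ["nationalgrid", "nationalgrid", "kerosene",
                 "nationalgrid", "nationalgrid"],
})

# Convert each categorical variable into binary indicator columns.
indicators = pd.get_dummies(
    households[["cooking_fuel", "lighting"]],
    prefix=["coo", "lig"],
    dtype=int,
)

# Sum the indicators within each GND to obtain summary counts.
gnd_counts = indicators.groupby(households["gnd"]).sum()
print(gnd_counts)

# At household level, the joint distribution is still visible ...
joint = pd.crosstab(households["cooking_fuel"], households["lighting"])
print(joint)
# ... but gnd_counts keeps only the per-variable totals for each GND.
```

After aggregation, it is no longer possible to tell which cooking fuel goes with which lighting source inside a GND, which is the loss of correlation the note refers to.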

Methodology

We employed principal component analysis (PCA) and extracted the first principal component to use as the socioeconomic index. We strongly recommend reading Vyas and Kumaranayake (2006) for a thorough justification of this method as well as an exploration of alternatives.
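As a compact way to read this, the index for a GND can be written as a weighted sum of its standardized variables. The notation below is ours, not taken from the whitepaper: $x_{gj}$ is the value of variable $j$ in GND $g$, $\bar{x}_j$ and $s_j$ are that variable's mean and standard deviation across GNDs, and $w_j$ is the weight of variable $j$ in the first principal component.

```latex
\[
  \mathrm{SEI}_g \;=\; \sum_{j=1}^{p} w_j \, z_{gj},
  \qquad
  z_{gj} \;=\; \frac{x_{gj} - \bar{x}_j}{s_j}
\]
```

Here $p$ is the number of variables retained, and the weight vector $(w_1, \dots, w_p)$ is the unit-length eigenvector of the correlation matrix associated with its largest eigenvalue.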

This whitepaper contains a thorough description of our process. In short, we followed this procedure:

  1. Curate the dataset to remove variables that are either redundant or non-indicative of socioeconomic status.
  2. Normalize the dataset with respect to each category within each GND.
  3. Standardize each variable.
  4. Run PCA on the standardized dataset.
  5. Extract the weights given by the first principal component.
  6. Multiply the standardized dataset by these weights.
  7. Sum the above scores for each GND to get the socioeconomic index.
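Steps 2 to 7 above can be sketched with NumPy and pandas as follows. This is an illustrative sketch, not the repository's code: the function name `socioeconomic_index`, the `categories` mapping, and the random example counts are assumptions, and step 2 is interpreted here as converting counts to within-category proportions for each GND. Step 1 (curation) is assumed to have been done already.

```python
import numpy as np
import pandas as pd


def socioeconomic_index(counts: pd.DataFrame,
                        categories: dict[str, list[str]]) -> pd.Series:
    """counts: one row per GND; categories: category name -> its columns."""
    # Step 2: normalize counts to proportions within each category per GND.
    shares = pd.concat(
        [counts[cols].div(counts[cols].sum(axis=1), axis=0)
         for cols in categories.values()],
        axis=1,
    )
    # Step 3: standardize each variable to zero mean and unit variance.
    z = (shares - shares.mean()) / shares.std(ddof=0)
    # Step 4: PCA via eigendecomposition of the correlation matrix.
    corr = np.corrcoef(z.to_numpy(), rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    # Step 5: the first principal component is the eigenvector with the
    # largest eigenvalue. Its sign is arbitrary and may need flipping.
    weights = eigvecs[:, np.argmax(eigvals)]
    # Steps 6 and 7: weight each standardized variable and sum per GND.
    return pd.Series(z.to_numpy() @ weights, index=counts.index, name="sei")


# Made-up counts for 50 GNDs, purely to exercise the function.
rng = np.random.default_rng(0)
columns = ["coo_firewood", "coo_gas", "coo_other",
           "lig_nationalgrid", "lig_kerosene"]
counts = pd.DataFrame(rng.integers(1, 100, size=(50, 5)), columns=columns)
categories = {"coo": columns[:3], "lig": columns[3:]}

index = socioeconomic_index(counts, categories)
print(index.head())
```

Because the sign of a principal component is arbitrary, it is worth checking the weights against a variable with a known direction (for example, a higher share of `lig_nationalgrid` should plausibly raise the index) and negating them if needed.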

Results

We separated the GNDs into seven quantiles by socioeconomic index and plotted them as a choropleth map. Below are our results using the combined, household, and population datasets.
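The quantile split can be sketched with `pandas.qcut`. The scores below are random placeholders rather than our results, and the series name `sei` is hypothetical. Drawing the choropleth itself also needs GND boundary geometries, which this sketch does not cover.

```python
import numpy as np
import pandas as pd

# Placeholder index scores for 700 GNDs (random, for illustration only).
rng = np.random.default_rng(0)
index = pd.Series(rng.normal(size=700), name="sei")

# Assign each GND to one of seven equal-sized quantile groups,
# labelled 1 (lowest scores) to 7 (highest scores).
groups = pd.qcut(index, q=7, labels=list(range(1, 8)))

print(groups.value_counts().sort_index())
```

Each group holds the same number of GNDs, so the map colors show relative rank rather than absolute differences in the index.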

Combined Dataset

Socioeconomic Index using the combined dataset

Household Dataset

Socioeconomic Index using the household dataset

Population Dataset

Socioeconomic Index using the population dataset
