Skip to content

Generating GTFS

Shekhar Krishnan edited this page Jun 30, 2015 · 2 revisions

Converting BEST Data to GTFS

Steps for making trips.txt file for GTFS from the route_atlas table:

First the route_atlas table was divided into two tables one which contained the timings column of first and last bus from source to destination names as uptrip and other contained the timings column of first and last bus from destination to source named as rt3.

Queries for creating those two tables.

create table uptrip as select route_no,froms,tos,sfirst,slast,h1,h2,h3,h4,h5,id,sch from atlasf order by id; create table downtrip as select route_no,froms,tos,dfirst,dlast,h1,h2,h3,h4,h5,id,sch from atlasf order by id;

Then from those 2 tables all those rows were deleted which does not have timings detail.

delete from uptrip where sfirst isnull and slast isnull; delete from downtrip where dfirst isnull and dlast isnull; select route_no,froms,tos,sfirst,slast,h1,h2,h3,h4,h5,id,sch from atlasf order by id;

According to GTFS format trips that span multiple dates will have stop times greater than 24:00:00. For example, if a trip begins at 10:30:00 p.m. and ends at 2:15:00 a.m. on the following day, the stop times would be 22:30:00 and 26:15:00. For that these queries have to be applied.

update uptrip set slast=slast+24.0 where sfirst >slast; update rt3 set slast=slast+24.0 where sfirst >slast;

Then there are some routes for which timings of first bus is not given but last bus is given. Remove that error using this query.

update uptrip set sfirst=slast where sfirst isnull; update uptrip set slast=null where sfirst=slast; update rt3 set sfirst=slast where sfirst isnull; update rt3 set slast=null where sfirst=slast;

For making trips.txt file for GTFS add 5 columns in both uptrip and rt3 table which should contain number of trips in a particular time slot. The day is divided into 5 time slots before 7 A.M., 7 to 11 A.M., 11 to 5 P.M, 5 to 8 P.M. and after 8 P.M.

Make a function named minut which calculates the difference between two timings and divide that difference by the headway of that particular slot and return number of trips in that slot of timing. Then use the following queries to update these five columns in both uptrip and rt3 according to different time slots. For each time slot we make 4 cases as follows:

  1. CASE 1: WHEN START_TIME LIES BETWEEN THE GIVEN INTERVAL AND END_TIME ALSO LIES BETWEEN SAME INTERVAL:
  2. CASE 2: WHEN START_TIME IS LESS THAN THE START INTERVAL AND END_TIME LIES BETWEEN THE SAME INTERVAL AND THUS SETS START_TIME TO THE START OF INTERVAL:
  3. **CASE 3 **: WHEN END_TIME IS MORE THAN THE END OF INTERVAL AND START_TIME LIES BETWEEN THE SAME INTERVAL AND THUS SETS END_TIME TO THE END OF INTERVAL:
  4. CASE 4: WHEN START_TIME IS LESS THAN THE START INTERVAL AND END_TIME IS GREATER THAN THE END INTERVAL AND THUS SETS START_TIME TO THE START OF INTERVAL AND END_TIME TO THE END OF INTERVAL:

CREATE OR REPLACE FUNCTION minut(input1 float,input2 float,input3 numeric) RETURNS int IMMUTABLE STRICT COST 500 LANGUAGE plpgsql AS $$ DECLARE diff1 int ; diff2 int ; BEGIN diff1=floor(input2)*60+(input2-floor(input2))*100; diff2=floor(input1)*60+(input1-floor(input1))*100; RETURN floor((diff1-diff2)/input3); END; $$;


update uptrip set t1=minut(sfirst,7.00,h1) where sfirst<7.00 and slast>=7.00 and h1 is not null;

update uptrip set t1=minut(sfirst,slast,h1) where sfirst<7.00 and slast<=7.00 and h1 is not null and t1 isnull;

update uptrip set t2= minut(sfirst,7.00,h1) where sfirst<=10.59 and sfirst>=7.00 and slast>=11.00 and h2 is not null;

For buses which run between 7 to 11 A.M.

update uptrip set t2=minut(sfirst,slast,h2) where sfirst<11.00 and sfirst>=7.00 and slast<11.00 and slast>=7.00 and h2 is not null and t2 isnull;

update uptrip set t2=minut(7.00,slast,h2) where sfirst<=6.59 and slast<=10.59 and slast>=7.00 and h2 is not null and t2 isnull;

update uptrip set t2=minut(7.00,11.00,h2) where sfirst<=6.59 and slast>=11.00 and h2 is not null and t2 isnull;

For buses which run between 11 to 5 P.M

update uptrip set t3=minut(sfirst,17.00,h3) where sfirst<=16.59 and sfirst>=11.00 and slast>=17.00 and h3 is not null;

update uptrip set t3=minut(sfirst,slast,h3) where sfirst<17.00 and sfirst>=11.00 and slast<17.00 and slast>=11.00 and h3 is not null and t3 isnull;

update uptrip set t3=minut(11.00,slast,h3) where sfirst<=10.59 and slast<=16.59 and slast>=11.00 and h3 is not null and t3 isnull;

update uptrip set t3=minut(11.00,17.00,h3) where sfirst<=10.59 and slast>=17.00 and h3 is not null and t3 isnull;

For buses which run between 5 to 8 P.M

update uptrip set t4=minut(sfirst,20.00,h4) where sfirst<=19.59 and sfirst>=17.00 and slast>=20.00 and h4 is not null;

update uptrip set t4=minut(sfirst,slast,h4) where sfirst<20.00 and sfirst>=17.00 and slast<20.00 and slast>=17.00 and h4 is not null and t4 isnull;

update uptrip set t4=minut(17.00,slast,h4) where sfirst<=16.59 and slast<=19.59 and slast>=17.00 and h4 is not null and t4 isnull;

update uptrip set t4=minut(17.00,20.00,h4) where sfirst<=16.59 and slast>=20.00 and h4 is not null and t4 isnull;

For buses which run after 8 P.M update uptrip set t5=minut(sfirst,slast, h5) where sfirst>=20.00 and h5 is not null and t5 isnull;

update uptrip set t5=minut(20.00,slast, h5) where sfirst<=19.59 and slast>=20.00 and h5 is not null and t5 isnull;

After updating these 5 columns add a column say sum in both tables uptrip and rt3. That column will contain the sum of 5 columns t1,t2,t3,t4 and t5.

Then create a table say trips in your database which would contain 3 columns route_id,schedule and trip_id. Then use this function to fill the trips table.Apply the function on both the tables uptrip and rt3.

Add a column in both uptrip and rt3 table named as temp which would contain the value equal to concatenation of route_id,schedule and id. ID column uniquely identifies each row in route_atlas.

Trip_id is formed by concatenating temp column in atlas with h1,h2,h3,h4,h5 for the five different time slots respectively,trip number in that particular time slot and 0 for a uptrip and 1 for a down trip.

create or replace function generate_trip() returns void as $$ declare curs1 cursor for select * from uptrip order by id; rec uptrip%rowtype; begin open curs1; loop fetch curs1 into rec; exit when not found; for i in 1..rec.t1 loop insert into trips values(rec.route_id,rec.sch,rec.temp||'_'||'h1_'||i,1); end loop; for i in 1..rec.t2 loop insert into trips values(rec.route_id,rec.sch,rec.temp||'_'||'h2_'||i,1); end loop; for i in 1..rec.t3 loop insert into trips values(rec.route_id,rec.sch,rec.temp||'_'||'h3_'||i,1); end loop; for i in 1..rec.t4 loop insert into trips values(rec.route_id,rec.sch,rec.temp||'_'||'h4_'||i,1); end loop; for i in 1..rec.t5 loop insert into trips values(rec.route_id,rec.sch,rec.temp||'_'||'h5_'||i,1); end loop; end loop; close curs1; end; $$ language plpgsql;

Generation of Stop_Times.txt

This takes input the column number of route id,start time,end time,starting serial number i.e srfirst,ending serial number i.e. srlast,id,running time for all time intervals,headway for all time intervals,stop codes,serial number from route details. stop_times.cpp

This code takes as input, three files namely uptrip,atlas and route details. Uptrip contains all the trips from route atlas which are ‘from’ (from stop) and ‘to’ (to stop) of route_atlas. It also contains two columns, srfirst and srlast. These two columns store the stop serial numbers of the from and to stops respectively i.e we matched the from stop of route atlas with the from stop in route details, both having the same route number and stored the corresponding stop serial number in srfirst and similarly dod it for srlast.

Clone this wiki locally