Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add rosters.txt #45

Open
BTollison opened this issue Nov 30, 2023 · 7 comments
Open

add rosters.txt #45

BTollison opened this issue Nov 30, 2023 · 7 comments

Comments

@BTollison
Copy link
Collaborator

BTollison commented Nov 30, 2023

In an effort to help distribute information between platforms, we wish to add rosters.txt. The intended use is that many operations cluster several duties (runs) within a roster to assign to a driver. The duties are usually clustered in a way that represents the amount of work a single driver will work in a week period. Some companies also cluster this work in several weeks for 1 drivers.

ods_roster_id
Persistent ID as proposed in issue #44 meant to help keep track of changes to a roster over time. Rosters can have long lifespans while the duties (runs) can change within them over time.

  • Required

roster_id

  • Required, primary key

service_id

  • Required, defined service period that the duties and roster_id is valid for.

monday_run

  • Required, value should be a run_id that operates on a monday during this service period.
  • If blank, no run_id is assigned.

tuesday_run

  • Required, value should be a run_id that operates on a tuesday during this service period.
  • If blank, no run_id is assigned.

wednesday_run

  • Required, value should be a run_id that operates on a wednesday during this service period.
  • If blank, no run_id is assigned.

thursday_run

  • Required, value should be a run_id that operates on a thursdsay during this service period.
  • If blank, no run_id is assigned.

friday_run

  • Required, value should be a run_id that operates on a friday during this service period.
  • If blank, no run_id is assigned.

saturday_run

  • Required, value should be a run_id that operates on a saturday during this service period.
  • If blank, no run_id is assigned.

sunday_run

  • Required, value should be a run_id that operates on a saturday during this service period.
  • If blank, no run_id is assigned.

week_sequence

  • if blank, then the roster represents only 1 week.
  • This value represents what what week this row of the roster should represent. If a roster_id has 2 weeks worth of work, then there should be a value 1 in the first week and a value 2 in the second week.
@skyqrose
Copy link
Contributor

skyqrose commented Dec 1, 2023

I can see how a grouping of many days' work is useful to represent even when there's no specific person assigned to it yet, as part of the scheduling process. So far, ODS has been mostly focused on data useful during operation, as opposed to during planning+scheduling. Would this be useful during operation (like if you need to check if the usual driver is working today), or is this intended to expand the scope of ODS into scheduling?

Would it be a specific week? I don't see a date field. There's a service_id field, but service_ids can be a many days or only 1, consecutive or not. What does the sunday_run field mean if applied to a service_id that only occurs on weekdays? How would you represent how holiday work is grouped with other work?

If it is meant to represent specific dates (as opposed to a generic week), it kind of turns into #28, with 1/7th the rows but 7x the columns. What if you leaned into that, and had a table with 3 columns, operational_date, run_id, roster_id?

@BTollison
Copy link
Collaborator Author

This is meant to take the scheduling data (Hastus, Optibus, etc) and send it to the operational side. Although I can see how on the scheduling side you could also assign drivers to every roster_id. In my previous work in the USA it was common to assign a driver to a roster during a bidding process. So, I am open to including this information somehow in a rosters.txt.

Since my intention here was to only model what comes from scheduling and not what occurs on the operational side (yet) I had not intended this to be tied to a specific week so much as tied to a specific service period (service_id). You raise a good point though, if a roster_id and service_id are overlapping you'll get duplicate roster_id's:

<style type="text/css"></style>

Calemdar.txt                  
service_id monday tuesday wednesday thursday friday saturday sunday start_date end_date
weekday 1 1 1 1 1 0 0 20240107 20240127
saturday 0 0 0 0 0 1 0 20240107 20240127
sunday 0 0 0 0 0 0 1 20240107 20240127

<style type="text/css"></style>

roster.txt                    
ods_roster_id roster_id service_id monday_run tuesday_run wednesday_run thursday_run friday_run saturday_run sunday_run week_sequence
1 1 weekday 4 5 6     3 2  
2 1 saturday 4 5 6     3 2  
3 1 sunday 4 5 6     3 2  

So this is not what I would consider ideal, but in some ways it makes sense because the duties are always grouped by the work per service day type, so unless we found a way to group service_id, at the moment I can't think of a less bad way to do this.

On your question about holiday work, maybe the workaround here can be similar to what happens with calendar_dates.txt where we simply have the roster_id + the duty that applies that day? This sounds like maybe what you are suggesting with issue #28?

There are places where a person may be doing a longer than 1 week roster, so that's why I have the field of week_sequence.

@skyqrose
Copy link
Contributor

skyqrose commented Dec 4, 2023

Thanks for the tables, that helps show the idea.

This is what I meant by the 3 columns in #28. There wouldn't be a separate exceptions file like calendar and calendar_dates, just a single file with all the dates. It's more duplicated rows cuz you need one for every date on the roster, but more straightforward, easier to have exceptions like holidays, and dodges some of the confusion about whether a service is active on a day of the week. I think it can represent everything that the day-of-the-week system can (as long as you want to say someone works on these 3 specific Mondays, and don't need to say someone typically works on Mondays in general.)

roster.txt    
date roster_id run_id
20240107 1 2
20240108 1 4
20240109 1 5
20240110 1 6
20240113 1 3
20240114 1 2
20240115 1 4
20240116 1 5
20240117 1 6
20240120 1 3
20240121 1 2
20240122 1 2 (MLK day)
20240123 1 5
20240124 1 6
20240127 1 3
20240107 2 ...

@BTollison
Copy link
Collaborator Author

Okay, I see what you're getting at now. My only concern with this is how many repeated rows we'd have.

I guess then for this to work, you need a way to distinguish which run_id is valid since there could be multiple run_id's with the same name, right?

@skyqrose
Copy link
Contributor

skyqrose commented Dec 5, 2023

Oh, yeah, my agency has issues with non-unique run ids, too, that's described in #12.
Potential solutions for here:

  • Use unique run_ids
  • Add a 4th column for the service_id
  • Assume run_ids are unique among the service_ids effective on the given date. To get the service_id, look up what service_ids are operating on that date, and find one with the given run. (Edit: I'm pretty sure this isn't a safe assumption, cuz there could be run_ids that are the same on bus and subway running on the same day.)

I'm gonna go make sure #28 doesn't run into this problem, too, thanks for noticing that.

@westontrillium
Copy link

westontrillium commented Dec 12, 2023

I would strongly advise against allowing duplicate primary keys. Even if their existence is commonplace, it goes against some fundamental relational database best practices.

I do agree the 3-column option makes it easier to be more explicit about dates of a given roster and requires less calculation/interpretation, though you'd probably want a second table to normalize rosters.

@BTollison
Copy link
Collaborator Author

BTollison commented Aug 26, 2024

So I'm currently attempting to revise rosters.txt proposal, but it occurs to me that in the calendar.txt style, we run into the problem of:

How do we handle a roster with service_id's? I end up in a situation where I need 3 rows to describe 1 week of rosters in a "simple" scenario where we have a weekday, saturday, and sunday service_id. We need a way to model this nicely to feed rosters in systems, the calendar_dates.txt style only works in an operational context I think.

The best idea so far that I have is a file just to group service_id's that are relevant...

grouped_calendar.txt

service_id group_service_id
weekday 9999
saturday 9999
sunday 9999

rosters.txt

group_service_id roster_id week_sequence monday_run tuesday_run wednesday_run thursday_run friday_run saturday_run sunday_run
9999 1 1 10 20 30 40 50    
9999 1 2 80 90 100 110     70
9999 1 3 120 130 140     60 150

roster_dates.txt

grouped_service_id date roster_id run_id exception_type
9999 20240830 1 555 1

For roster_dates.txt exception_type would have the proposed values:

exception_type Definition
NULL no rosters.txt, and is therefore the default
1 replacement
2 removed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants