Skip to content
This repository has been archived by the owner on May 1, 2023. It is now read-only.

Proposing a DB schema #9

Open
dj2 opened this issue Jul 28, 2021 · 4 comments
Open

Proposing a DB schema #9

dj2 opened this issue Jul 28, 2021 · 4 comments

Comments

@dj2
Copy link
Contributor

dj2 commented Jul 28, 2021

Wrote out some potential tables for the DB schema for the new service.

Organisations

Top level organisations. The organization has users and owns all of the scans run by that organization.

- id: UUID [primary key]
- name: String [unique]
- created_at: DateTime
- updated_at: DateTime

Users

User accounts for logging in. The session stores the access_token to make it easier to revoke and replace if need.

 - id: UUID [primary key]
 - organization_id: UUID [foreign key -> organisations]
 - name: String
 - email: String
 - password_hash: String
 - access_token: UUID
 - created_at: DateTime
 - updated_at: DateTime

ScanTypes

The different types of scans that can be run by the system. E.g. 'Accessibility', 'Security', etc.

 - id: UUID [primary key]
 - name: string

Templates

A template for a scan to be run. The token is provided through the API to specify which template to execute. Templates are owned by an organization and have a given name. Think, digitial.canada.ca as the template for run against the primary website.

 - id: UUID [primary key]
 - organization_id: UUID [foreign key -> organisations]
 - token: UUID [indexed]
 - name: String
 - created_at: DateTime
 - updated_at: DateTime

TemplateScans

Each template object can have multiple scans run against it. So, you may setup a scan for accessibilty and for security. When th template is executed all of the associated template scans are run. The scan_type_id specifies the type of scan to execute and the data stores any data needed to run the same. For example, the accessiblity scan my store the two top level urls for the website along with the spider flag and any other needed configuration for that scan.

 - id: UUID [primary key]
 - template_id: UUID [foreign key -> templates]
 - scan_type_id: UUID [foreign key -> scan_types]
 - data: jsonb <configuration data for the scan, URLs, spider, etc>

TemplateScanTriggers

The triggers are things that happen with a given TemplateScan encounters a violation. This maybe a callback to log a bug, or the response to the github action. (More info, or a data field maybe needed here to configure the callback?)

 - id: UUID [primary key]
 - template_scan_id: UUID [foreign key -> template_scans]
 - callback: text

Scans

The scans are the results table. So, a given organization executed a given template which had a given scan on a given date/time. This is the top level record of the scan that happened. Each TemplateScan would create a Scans record as they are run.

 - id: UUID [primary key]
 - organization_id: UUID [foreign key -> organisations]
 - template_id: UUID [foreign key -> Templates]
 - scan_type_id: UUID [foreign key -> scan_types]
 - created_at: DateTime
 - updated_at: DateTime

A11yReports

Specific scan type data for a scan. So, given a scan we can look up the table to retrieve the scan data from. This holds the basic information for an accessibility scan.

 - id: UUID [primary key]
 - scan_id: UUID [foreign key -> scan]
 - product: string
 - revision: string
 - url: string
 - ci: boolean [not null default false]
 - summary: jsonb
 - created_at: DateTime

A11yViolations

Stores individual violations from a given accessiblity report.

 - id: UUID [primary key]
 - report_id: UUID [foreign key -> A11yReports on delete cascade]
 - violation: string
 - impact: string
 - target: text
 - html: text
 - data: jsonb
 - tags: jsonb
 - message: text
 - url: string
 - created_at: DateTime

SecurityReports

Similar to A11yReports but for security reports.

 - id: UUID [primary key]
 - scan_id: UUID [foreign key -> scans]

SecurityViolations

Similar to A11yViolations but for security violations.

 - id: UUID [primary key]
 - report_id: UUID [foreign key -> SecurityReports on delete cascade]
@dj2
Copy link
Contributor Author

dj2 commented Jul 28, 2021

I'm not a huge fan of the TemplateScans::data field. Could this be broken out more? Are there consistent things we'll care about for a scan? Like the set of URLs to run over, do we spider, other?

Should we have the following extracted out of the data?

TemplateScanURLs

 - template_scan_id: UUID [foreign key -> TemplateScans on delete cascade]
 - url: string

@lchski
Copy link

lchski commented Jul 28, 2021

Wonder whether we actually need separate tables for A11yReports / SecurityReports, A11yViolations / SecurityViolations – seems that could just be a ScanType column on Reports and Violations tables, assuming all the other fields are the same.

@dj2
Copy link
Contributor Author

dj2 commented Jul 28, 2021

If we store the data as a JSON blob, we could, but that also makes it harder to work with through SQL. Although, maybe we don't need that flexibility?

@dj2
Copy link
Contributor Author

dj2 commented Jul 28, 2021

How do we deal with pages that require authentication? Do we store something in the TemplateScans::data? How do we secure the username/password used for authentication?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants