Skip to content

Conversation

@Bill-Haigh
Copy link
Contributor

Ticket number

PRSD-1610

Goal of change

SPIKE/ADR into if it is possible to generate enough data using plausible custom events in order to generate sankey diagrams.

Description of main change(s)

ADR created with summary of the considered options and the overall suggestion for how to proceed. Prototype exists on branch PRSD-1610/Front-end-referrer-header a screenshot of the outcome of the prototype can be seen below.

image

Copy link
Collaborator

@Travis-Softwire Travis-Softwire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few thoughts, but overall LGTM. Worth taking this PR out of draft and getting Rowan to review as well though

@Bill-Haigh Bill-Haigh marked this pull request as ready for review October 23, 2025 12:52
@Bill-Haigh Bill-Haigh requested a review from rowanhill October 23, 2025 12:53
Copy link
Collaborator

@rowanhill rowanhill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Sorry, I write all this and then failed to click the button for like a week)

I've left a few suggestions for changes, mostly to make the ADR a little terser and focused on the decision of how to create and send events to Plausible.

That said, let's hold off merging this for now, until we've discussed the results of our spike with the wider team. E.g. if there's not much appetite for this data, or if we decide to ditch Plausible, we wouldn't need this ADR.


## Status

{Draft ~~| Proposed | Accepted | Rejected | Superseded~~}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please can you update this - probably best to just change it to Accepted.


{Draft ~~| Proposed | Accepted | Rejected | Superseded~~}

Date of decision: {yyyy-MM-dd}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And a reminder to update this before you merge.

Comment on lines 24 to 30
It is certainly possible to get enough data from plausible to generate Sankey diagrams. This has been tested on the property registration journey, and this quickly became a complicated diagram.
As such it will need to be considered how best to present the data in a useful way. It is likely that the data collection and manipulation will be automated but the diagrams themselves may need to be manually adjusted for readability.

The chosen option is Front-end events with the referrer header, because it is the least intrusive to users while still providing useful data for generating Sankey diagrams. As part of the research into the options this quickly became the simplest and easiest of the options. It was clear that sending just the referrer header and current url as an event would create sufficient data to generate useful Sankey diagrams. The Front end was chosen as the delivery method as it was the simplest to implement with minimal impact on the existing webapp.
There is an existing prototype on branch `PRSD-1610/Front-end-referrer-header`, this has allowed me to implement this method and generate some mock data using plausible. I have then used the data to create a Sankey diagram using draxlr.com which has demonstrated that this method will work.

A tool will need to be created to generate the Sankey diagrams in a more useful format, this has not been implemented as part of the ADR but my suggestion based on the research I have done is to use D3.js with d3-sankey. This will allow for sufficient customisation to create useful diagrams for analysis.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit verbose, and includes info beyond just the decision outcome. We want these ADRs to be nice, short descriptions of decisions we've made and why.

I think this could just be:

Front-end events using the referrer header, because this provides sufficient data (as demonstrated by a spike) and is the simplest technical change.

Comment on lines 15 to 18
* Front-end custom events.
* Back-end custom events.
* Using the referrer header.
* Using Session ID.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's arrange these in two groups:

Event generation:

  • Front-end custom events
  • Back-end custom events

Event data:

  • Referrer header
  • HTTP session ID

We can then echo those groups in the headers below.

Comment on lines 20 to 21


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extremely minor, but this extra whitespace isn't needed.

* Good, because it does not rely on the browser to send the data, so it is less likely to be blocked by ad blockers or other privacy tools.
* Good, because it does not rely on JavaScript being enabled in the browser.
* Bad, because it requires additional implementation work to integrate with the journey framework.
* Bad, because it could potentially miss data due to server-side issues.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused by this one: what server-side issues would mean that we serve the page but fail to send the Plausible event (that don't have similar equivalents in the front-end case)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly re-reading it I'm not entirely sure what I was getting at. - I have removed it.

#### Using the referrer header

Send a custom event to Plausible on each page load with the following:
1. `"props": {"flow": "page-a", "page-b"}`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be { "flow": ["page-a", "page-b"] }? (And similarly below)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should actually be "Flow": {props: {"page-a", "page-b"}} So have updated it.

Comment on lines 72 to 73
1. `"props": {"flow": "page-a", "page-b", "session-id"}`
2. Where `page-a` is the referrer, `page-b` is the current page, and `session-id` is a unique identifier for the user's session.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the point of this one that we don't send page-a? And just infer the path from timestamps and the session-id?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On reflection getting a users full path would be easier using page-a, page-b with the session id rather than using the timestamps.

@@ -0,0 +1,84 @@
# ADR-0034: plausible custom events
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excruciatingly minor, but can we capitalise Plausible, please?


## More Information

- In relation to the generation of Sankey diagrams there will need to be a list of `nodes` which will represent the endpoints in the journey. d3 can then be used to determine the order of the nodes. This is important for readability as with the complexity of the journeys it quickly becomes unreadable without. In addition to this it is worth noting that you can align nodes to occupy the same horizontal space e.g. where several endpoints are part of the same step (such as licence type in the property registration journey)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd leave this out, IMO. We're not really sure what we're doing with Sankey diagram generation, if anything, and even if we were it's not the decision being documented by this ADR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants