Allow automated and customisable result schema creation using RMM package #43

azimov · 2023-03-04T21:32:54Z

Currently, results file management in this package is somewhat limiting because custom post-analysis scripts handle schema creation and the upload of any results files.

To resolve this, I propose using the RMM package to handle results upload automatically in most cases. For modules that need schema-specific creation parameters or data manipulation before upload, callbacks inside the module execution context would handle this.

For schema creation

Create a targets workflow for creating schemas for each module. This should be run only once.

In the default case:

Find the resultsDataModelSpecification.csv file (exposed by the module via the overridable function getDataModelSpecifications) and export it to the work folder
Call ResultModelManager::generateSqlSchema to create the schema SQL and execute it (setting the table prefix and database schema from the module settings)

In the case where customizability is required:

Pass jobContext to a function, createDataModelSchema, that can optionally be added to the module's Main.R file. This function should execute schema creation (e.g. PLP and CohortDiagnostics already have functions to do this)
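
The default-versus-custom dispatch described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Strategus implementation: `createModuleSchema`, `defaultCreateSchema`, and the shape of `jobContext` are hypothetical, and the module's Main.R is modelled as an environment of functions. In a real default path, the placeholder would call ResultModelManager::generateSqlSchema and execute the resulting SQL.

```r
# Hypothetical default: stands in for generating and executing schema SQL
# from resultsDataModelSpecification.csv.
defaultCreateSchema <- function(jobContext) {
  message("Default schema creation for ", jobContext$moduleName)
  "default"
}

# Use the module's createDataModelSchema callback when Main.R defines one;
# otherwise fall back to the default path.
createModuleSchema <- function(moduleEnv, jobContext) {
  hasCallback <- exists("createDataModelSchema", envir = moduleEnv, inherits = FALSE) &&
    is.function(moduleEnv$createDataModelSchema)
  if (hasCallback) {
    return(moduleEnv$createDataModelSchema(jobContext))
  }
  defaultCreateSchema(jobContext)
}

# Example: one module with a custom callback, one without.
customModule <- new.env()
customModule$createDataModelSchema <- function(jobContext) {
  message("Custom schema creation for ", jobContext$moduleName)
  "custom"
}
plainModule <- new.env()

jobContext <- list(moduleName = "ExampleModule")
customResult <- createModuleSchema(customModule, jobContext)
plainResult <- createModuleSchema(plainModule, jobContext)
```

Checking for the callback by name keeps the module contract optional: a module that defines nothing gets the default behaviour without any registration step.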

For results upload

A similar pattern is required. However, in this case the results upload should be a targets::tar_target instance that depends on the base module.

In the default case:

Find the resultsDataModelSpecification.csv file (exposed by the module via the overridable function getDataModelSpecifications) and export it to the work folder
(outside of the renv context) Upload the results files according to this model spec using ResultModelManager::uploadResults

In the custom case:

Pass the jobContext to a function, uploadResultsCallback, that can optionally be added to the module's Main.R file. This function should execute the results upload (e.g. PLP and CohortDiagnostics already have functions to do this, and implementation for @jreps and me would be trivial)
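
As a rough sketch of the dependency described above, a `_targets.R` file could declare the upload as a target that consumes the module's target. The helpers `runModule` and `uploadModuleResults` and the target names are hypothetical placeholders, not functions from Strategus or RMM; only `targets::tar_target` is a real call here.

```r
# _targets.R (sketch; runModule and uploadModuleResults are hypothetical helpers)
library(targets)

runModule <- function(moduleName) {
  # Placeholder for executing the module; returns the results folder path.
  file.path("results", moduleName)
}

uploadModuleResults <- function(resultsFolder) {
  # Placeholder for the default or custom upload step.
  message("Uploading results from ", resultsFolder)
  TRUE
}

list(
  tar_target(moduleResults, runModule("ExampleModule")),
  # Referencing moduleResults in the command makes the upload target
  # depend on the module target.
  tar_target(resultsUploaded, uploadModuleResults(moduleResults))
)
```

Because targets infers dependencies from the symbols used in each command, the upload reruns only when the module target it references changes.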

See incoming PR for an implementation.


anthonysena · 2023-09-18T15:03:48Z

A couple of notes after trying out this functionality:

The schema creation does not use the "work" folder approach which may be useful to store the target script generated, logs, etc.
The "database_meta_table" is created by Strategus itself and not by the modules; this is currently a gap since the schema creation does not include this table.
I didn't see a mechanism to detect if tables exist? Perhaps this is exposed in the RMM package and we just need to expose it via Strategus?

azimov · 2023-09-19T22:24:15Z

Thanks @anthonysena, to your points:

The schema creation does not use the "work" folder approach which may be useful to store the target script generated, logs, etc.

This is probably a relatively easy change, and a reasonable one since we will likely need to debug, so I can have a look at doing it.

The "database_meta_table" is created by Strategus itself and not by the modules; this is currently a gap since the schema creation does not include this table.

Strategus could use RMM to create the result schema, or it may be desirable to include this in RMM itself, as it's likely going to be used by all analytics result sets.

I didn't see a mechanism to detect if tables exist? Perhaps this is exposed in the RMM package and we just need to expose it via Strategus?

Do you mean if they have been created or just to check if the schema conforms to the spec? This hasn't been implemented in RMM but I'm not sure what the intended purpose is here?

anthonysena · 2023-10-02T17:43:10Z

Do you mean if they have been created or just to check if the schema conforms to the spec? This hasn't been implemented in RMM but I'm not sure what the intended purpose is here?

Sorry that my question was unclear. I think the upload mechanism always assumes it should create the target tables in the results schema, which might drop existing results if such a schema and results tables already exist. I believe both options are supported by RMM already; it's most likely that I just need to allow the user to specify this in the executionSettings object.
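
To make the concern concrete, here is a minimal sketch of a guard that only creates tables missing from the results schema unless the user opts into overwriting. Everything here is hypothetical: the helper names, the `overwrite` flag (standing in for a setting on executionSettings), and the example table names. In practice the list of existing tables would come from the database connection.

```r
# Hypothetical helper: return the prefixed spec tables not yet in the schema.
missingResultTables <- function(specTables, existingTables, tablePrefix = "") {
  expected <- paste0(tablePrefix, specTables)
  expected[!tolower(expected) %in% tolower(existingTables)]
}

# Hypothetical policy: create only missing tables unless overwrite is requested.
tablesToCreate <- function(specTables, existingTables, tablePrefix = "", overwrite = FALSE) {
  if (overwrite) {
    return(paste0(tablePrefix, specTables))
  }
  missingResultTables(specTables, existingTables, tablePrefix)
}

# Example with made-up table names: one table already exists.
toCreate <- tablesToCreate(
  specTables = c("table_a", "table_b"),
  existingTables = "cd_table_a",
  tablePrefix = "cd_"
)
```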

anthonysena · 2024-07-15T13:39:58Z

Linking this specific issue to the draft PR #145 since I think this need is addressed in that feature branch. I think that branch covers the desires of this issue but if not we can keep this open and address it specifically.

anthonysena · 2024-08-08T13:55:22Z

Closing this out since v1.0 of Strategus will allow each module to expose the mechanisms to create their results tables & upload results. In this case, either the module will use the ResultModelManager or have its own implementation of these functions.

chrisknoll · 2024-08-08T15:10:02Z

Might want to keep this open: my concern is that things like OHDSIShinyModules depend on a certain database results schema that may be beyond the concern of the individual statistics packages. Instead, I would propose a layer between the modules and the UI like:

ShinyModule  -------- >     Results Model    <------   Strategus Module

In this way, what the modules need to map their underlying package to is defined in the Results Model, and the data that the UI reports on is also defined in the same place (the Results Model). This allows us to coordinate changes between both sides by first defining them in the middle and then pushing the implementation of reporting and generation to the sides. It also means that the underlying packages can be free to develop new features independent of module versions, but once it's time to report the data in the standard UI app, it becomes part of the results model, the strategus module maps the new data into the results model, and the shiny module implements the report on it.

Other advantages to this type of separation: ShinyModule bug fixes can be developed independently of results model or strategus module releases. Hopefully this level of separation makes sense.

azimov mentioned this issue Mar 4, 2023

Results upload #44

Closed

anthonysena added this to the v0.1.0 milestone Mar 6, 2023

anthonysena modified the milestones: v0.1.0, v0.2.0 Oct 2, 2023

anthonysena modified the milestones: v0.1.1, v0.2.0 Dec 4, 2023

anthonysena mentioned this issue Jul 1, 2024

Document results data model for modules #143

Closed

anthonysena linked a pull request Jul 15, 2024 that will close this issue

Refactor Strategus module approach #145

Merged

anthonysena added the study results label Jul 16, 2024

anthonysena closed this as completed Aug 8, 2024
