-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restructure the analysis specification and remove sharedResources
#144
Comments
Thanks, @anthonysena for putting this issue out there for discussion. My feeling on the 'shared resources' element is that it's a bit of a redundant moniker on those elements of the Strategus JSON, where you could just have cohort definitions, negative control specs etc, just declared at the top-level of the strategus where all modules know that they exist. One complication/confusion we have today is that some people think about putting the design specification inside the module node of Strategus, while other things were found in 'shared'. My thought about structuring this would be:
So under this structure, it's pretty clear what the elements of the Strategus json. In this way, all the design elements have a 'home'. The 'Modules' one is where we currently specify which modules are in play for the given analsyis (that's where you find the version and names of the modules), however, I think we're moving away from the 'versioned modules', correct? Meaning: when bulding a Strategus spec, you will be using the Strategus API based on a given version of the library (managed in either a Docker container or renv.lock file) which will produce the JSON payload. then when you exeucute, you just use the proper runtime context for this specification (again, either run within a docker or initilized from a renv.lock). With the runtime context established, the process of executing the strategus analysis would be:
I also think we discussed separating the 'publish results' to a separate function, eithr as a separate function of Strategus, or depend on a completely external process for handing strategus output. |
I think we agree that the notion of Perhaps the full analysis specification should be made available to each module so that it may be interrogated? If another module (i.e. CohortDiagnostics) needs info about the cohorts, it can interrogate the specification for the CohortGenerator module?
For reference, here is how the design specification is nested inside of the module node as you described (in case anyone else reading this is wondering): "moduleSpecifications": [
{
"module": "CohortGeneratorModule",
"version": "0.1.0",
"remoteRepo": "github.com",
"remoteUsername": "ohdsi",
"settings": {
"incremental": true,
"generateStats": true
},
"attr_class": ["CohortGeneratorModuleSpecifications", "ModuleSpecifications"]
} Moving these to a top level of the design document would make this clearer to your point. I then presume you'd have only 1 specification for a module in a given analysis specification in this case? I think this is fair since a module can have its settings handle multiple analyses.
That's right - the version of the "module" will be dictated by the version of Strategus that is declared in the renv.lock file and the corresponding module's package version.
I think this is out of scope for this topic (analysis specification changes) but yes, I'd expect that results management is different from the analysis specification and execution. This difference will result in different functions of Strategus for these areas of concern. |
Adding to this issue based on #51 (comment), we'd like to integrate the renv.lock file contents into the analysis specification to ensure that the R packages used to create the specifications are the same as the ones used to execute the analysis. I think we need to elaborate on details in #148 before we decide on adding this to the analysis specification but wanted to make sure the idea was put forward here to track it. |
sharedResources
from analysis specificationsharedResources
The v0.x structure of a Strategus analysis specification is of the form
The original thinking here is that we might need to refer to different "shared resources" which has mainly been the cohort definitions (with subsets) and negative control outcomes. Under the hood, Strategus will take the
sharedResources
section of the analysis specification and the selected module specifications and pass that to the module for execution.@chrisknoll and others have proposed removing this section since it is a bit of a "catch all" that lacks structure. I'd propose that we simply store the elements of the cohort definitions, subsets and negative control outcomes with the
CohortGeneratorModule
specifications. We can then pass the CohortGeneratorModule specifications & the specific module's specifications into the job context for the module so that, if needed, these elements can be accessed.Full analysis specification example: https://github.com/ohdsi-studies/FluoroquinoloneAorticAneurysm/blob/master/inst/analysisSpecification.json
The text was updated successfully, but these errors were encountered: