Centralizing modules & renv lock file #29

anthonysena · 2023-01-18T15:10:18Z

From discussion with @schuemie, we'd like to update Strategus to have it be the central location for individual analytic modules while also allowing for the flexibility of using external modules as is currently supported. The goal here is to align Strategus's modules & renv.lock file with the planned regular release(s) of HADES. To do this, we discussed some ideas which I'll put forward in this issue.

Strategus would maintain a central lock file that is used by all of the modules. When a study is executed via Strategus, the renv.lock file would be included with the analysis specification and execution settings JSON to provide a complete record of the study. Specifically, the renv.lock file would detail all of the R dependencies (including Strategus), the analysis specification would detail the design choices for the study and the execution settings would detail the specific environment settings for executing the study.
Modules would reside inside of the Strategus package, in the R code folder. We could standardize the names of the module .R files to "Module.R". Additionally, it may be useful to use R6 classes for the modules to provide an interface for the modules and an inheritance structure we can use to centralize specific utility functions that modules may need (i.e. how to get the cohort definitions from the analysis specification).

The analysis specification itself would need an update, presumably we'd remove the version, remoteRepo and remoteUsername attributes in the module specification (shown below). In the case when these attributes are missing, Strategus would source the module from its local set of modules otherwise it would instantiate the module as is done today.

Strategus/inst/testdata/analysisSpecification.json

Lines 226 to 228 in c701540

    
           "version": "0.0.14", 
        
           "remoteRepo": "github.com", 
        
           "remoteUsername": "ohdsi",

Tagging @ablack3 for awareness.

The text was updated successfully, but these errors were encountered:

ablack3 · 2023-01-24T09:39:01Z

Thanks for sharing this plan @anthonysena.
Overall it makes sense.

while also allowing for the flexibility of using external modules

The one requirement I’d like to add (or highlight if it already exists) is that Strategus can be extended with new modules without ever requiring a new OHDSI repo or pull request on an existing OHDSI repo. This would give me and others the ability to experiment with extensions without needing permission/access rights. I think this is the idea behind the open-closed principle.

My other comment is that I might steer away from using R6 for modules and instead suggest defining generic functions that modules must implement (essentially the module interface). These generic functions could be defined in the Strategus package using S3 or S4 (or even the new R7) and then implemented in other packages that extend Strategus functionality. I have not used R6 much but what I think R6 is good for is maintaining and mutating state in an application as well as providing an alternative to R’s normal “copy on modify” behavior. I think the functional OOP approach implemented in S3, S4, and R7 would work well in this case and put the focus on the generic functions a module (which is likely to be somewhat synonymous with an analytic R package) needs to implement to work with Strategus. That’s my two cents though and you should go with what makes the most sense to you and the other developers working on Strategus.

anthonysena · 2023-01-24T15:23:33Z

The one requirement I’d like to add (or highlight if it already exists) is that Strategus can be extended with new modules without ever requiring a new OHDSI repo or pull request on an existing OHDSI repo. This would give me and others the ability to experiment with extensions without needing permission/access rights. I think this is the idea behind the open-closed principle.

Thanks @ablack3 for noting this. I'd like to point out this is currently possible given the design of Strategus. To illustrate this point a bit more, I'll use an example. In Strategus, I'm putting together a sample analysisSpecification.json document that we can use to test Strategus and run various modules. Here is an excerpt from that document:

Strategus/inst/testdata/analysisSpecification.json

Lines 225 to 228 in 01e63f7

    
           "module": "CohortGeneratorModule", 
        
           "version": "0.0.14", 
        
           "remoteRepo": "github.com", 
        
           "remoteUsername": "ohdsi",

As shown above, each module currently has a reference to the location of the module itself. So, if I wanted to use a module outside of OHDSI, I could reference it and make use of it in the same specification. To showcase how that is currently done, here is an excerpt from the vignette:

Strategus/vignettes/CreatingAnalysisSpecification.Rmd

Lines 76 to 97 in 01e63f7

    
           ## CohortGenerator Module Settings 
        
           The following code downloads the settings functions from the `CohortGeneratorModule` which then activates some additional functions we can use for creating the analysis specification. In the analysis specification, we will add the cohort definitions and negative control outcomes to the `sharedResources` section since these elements may be used by any of the HADES modules. To do this, we will use the `createCohortSharedResourceSpecifications` and `createNegativeControlOutcomeCohortSharedResourceSpecifications` functions respectively. In addition, we will use the  `cohortGeneratorModuleSpecifications` function to specify the cohort generation settings. 
        
           ```{r eval=FALSE} 
        
           source("https://raw.githubusercontent.com/OHDSI/CohortGeneratorModule/v0.0.14/SettingsFunctions.R") 
        
           # Create the cohort definition shared resource element for the analysis specification 
        
           cohortDefinitionSharedResource <- createCohortSharedResourceSpecifications( 
        
             cohortDefinitionSet = cohortDefinitionSet 
        
           ) 
        
           # Create the negative control outcome shared resource element for the analysis specification 
        
           ncoSharedResource <- createNegativeControlOutcomeCohortSharedResourceSpecifications( 
        
             negativeControlOutcomeCohortSet = ncoCohortSet, 
        
             occurrenceType = 'all', 
        
             detectOnDescendants = TRUE) 
        
           # Create the module specification 
        
           cohortGeneratorModuleSpecifications <- createCohortGeneratorModuleSpecifications( 
        
             incremental = TRUE, 
        
             generateStats = TRUE)

The block of code referenced above is used to create the CohortGeneratorModule settings that go into the analysisSpecification.json. As shown here, the elements of the analysisSpecification.json are taken from the function cohortGeneratorModuleSpecifications. Digging into this even deeper, if we take a look at cohortGeneratorModuleSpecifications you will see that the metadata is part of that function. Also worth noting that this function is auto-generated using code in ModuleMaintenance.R

My other comment is that I might steer away from using R6 for modules and instead suggest defining generic functions that modules must implement (essentially the module interface). These generic functions could be defined in the Strategus package using S3 or S4 (or even the new R7) and then implemented in other packages that extend Strategus functionality. I have not used R6 much but what I think R6 is good for is maintaining and mutating state in an application as well as providing an alternative to R’s normal “copy on modify” behavior. I think the functional OOP approach implemented in S3, S4, and R7 would work well in this case and put the focus on the generic functions a module (which is likely to be somewhat synonymous with an analytic R package) needs to implement to work with Strategus. That’s my two cents though and you should go with what makes the most sense to you and the other developers working on Strategus.

This would be worthwhile to discuss - I'm currently thinking that it may be desirable to create a StrategusModules package that contains the individual modules and supporting functions/classes to keep the main Strategus repo focused on executing the analyses. Happy to hear your thoughts on S3/S4/R6/R7 as I haven't done a lot with these OOP bits in R. Tagging @azimov @chrisknoll @levalle @mdlavallee92 as they may have some experience to share on this topic.

azimov · 2023-03-10T00:36:28Z

Modules would reside inside of the Strategus package, in the R code folder. We could standardize the names of the module .R files to "Module.R". Additionally, it may be useful to use R6 classes for the modules to provide an interface for the modules and an inheritance structure we can use to centralize specific utility functions that modules may need (i.e. how to get the cohort definitions from the analysis specification).

I don't think that a centralized module location would be particularly useful or a good idea.
On a user's system it should be possible to install multiple versions of the same module, instantiate them and use them in whatever executions they need to.

For a given study it may also be desirable to call a module that uses only proprietary code that will never see the light of day outside the organisation.

I think a much more desirable solution would be to allow users to install whatever modules they want into a local system path.
This could contain many different modules of different versions in both a global location that the user can backup or delete at will.
The module's in these folders would be able to list their source location (e.g. a github repository) but requiring a github doesn't aid reproducibility in my view (it's just a convenient way of sharing code).

This will also simplify development because module developers would be able to build and test in local environments. It's already quite painful to develop modules because they require a github release. Linking them all to a single github repository would make that even harder to push and test small changes to.

A centralized repository will also become extremely difficult to maintain over time. We'll basically be implementing something like CRAN and I don't really like the idea of any one organisation maintaining this because I fear it will hamper adoption amongst the community.
Study packages had problems but they also have the really appealing principle that anyone could just develop them and share them.

If user's can't do that then why wouldn't they just maintain a single renv.lock file and say "use this" with a script.R file they maintain?

The one requirement I’d like to add (or highlight if it already exists) is that Strategus can be extended with new modules without ever requiring a new OHDSI repo or pull request on an existing OHDSI repo. This would give me and others the ability to experiment with extensions without needing permission/access rights. I think this is the idea behind the open-closed principle.

I agree with this. Modules shouldn't need to be OHDSI owned (or even run on an OMOP CDM, for example). I think this package should limit its' scope to allowing users to orchestrate complex workflows in modules across different renv environments.

I think a simplification would be to model how npm/yarn work in the javascript world. You want to download the packages into a local bundle (or a global space on disk) and then use them in a study.

anthonysena · 2024-07-31T15:16:31Z

Keeping this open based on discussions with @egillax since we'd like to migrate the remaining modules into Strategus. Specifically:

anthonysena · 2024-09-16T13:31:37Z

Closing and tracking via #164.

anthonysena mentioned this issue Mar 6, 2023

Simplify module installation #45

Closed

schuemie mentioned this issue Mar 17, 2023

Executing settings functions inside renvs #51

Closed

mdlavallee92 mentioned this issue Sep 6, 2023

R6 v S3/S4 in HADES OHDSI/Capr#82

Closed

anthonysena added sos-challenge Issues raised during the Save Our Sisyphus challenge and removed sos-challenge Issues raised during the Save Our Sisyphus challenge labels Oct 2, 2023

schuemie added this to HADES Hack-a-thon Oct 21, 2023

schuemie moved this to In Progress in HADES Hack-a-thon Oct 21, 2023

anthonysena added this to the v0.2.0 milestone Dec 4, 2023

anthonysena mentioned this issue Jun 11, 2024

Eliminate keyring, renv and targets package requirements #135

Closed

This was referenced Jul 1, 2024

Adopt ResultsModelManager for results management OHDSI/CohortIncidence#58

Open

Refactor Strategus module approach #145

Merged

anthonysena linked a pull request Jul 15, 2024 that will close this issue

Refactor Strategus module approach #145

Merged

anthonysena mentioned this issue Jul 15, 2024

How to design and distribute a Strategus network study #148

Open

anthonysena added execution study design process labels Jul 16, 2024

anthonysena linked a pull request Sep 3, 2024 that will close this issue

Add PatientLevelPredictionValidationModule & ModelTransferModule #161

Draft

anthonysena mentioned this issue Sep 16, 2024

Migrate PatientLevelPredictionValidation & ModelTransfer Modules into Strategus #164

Open

anthonysena closed this as completed Sep 16, 2024

anthonysena removed a link to a pull request Sep 16, 2024

Add PatientLevelPredictionValidationModule & ModelTransferModule #161

Draft

schuemie removed this from HADES Hack-a-thon Sep 20, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Centralizing modules & renv lock file #29

Centralizing modules & renv lock file #29

anthonysena commented Jan 18, 2023

ablack3 commented Jan 24, 2023 •

edited

Loading

anthonysena commented Jan 24, 2023 •

edited

Loading

azimov commented Mar 10, 2023

anthonysena commented Jul 31, 2024

anthonysena commented Sep 16, 2024

Centralizing modules & renv lock file #29

Centralizing modules & renv lock file #29

Comments

anthonysena commented Jan 18, 2023

ablack3 commented Jan 24, 2023 • edited Loading

anthonysena commented Jan 24, 2023 • edited Loading

azimov commented Mar 10, 2023

anthonysena commented Jul 31, 2024

anthonysena commented Sep 16, 2024

ablack3 commented Jan 24, 2023 •

edited

Loading

anthonysena commented Jan 24, 2023 •

edited

Loading