[DRAFT] Execution Settings R6 Class #196

mdlavallee92 · 2024-10-24T16:34:55Z

Proposal: add an R6 class for execution settings that handles the routing of the connectionDetails and the database schemas. The class would also have methods to connect, disconnect, and identify the DBMS.

azimov · 2024-10-24T17:14:47Z

R/executionSettings.R

+    },
+
+    # disconnect to database
+    disconnect = function() {


Perhaps a withr-style "withConnection" context could be implemented as a method; see:

OHDSI/DatabaseConnector#269
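A minimal base-R sketch of the withr-style idea. The names `withConnection`, `connectFn`, and `disconnectFn` are hypothetical and are not taken from this PR or from DatabaseConnector; inside the class, they would presumably wrap the existing connect and disconnect methods.

```r
# Hypothetical helper: run `fn(connection)` and always close the connection,
# even if `fn` throws an error.
withConnection <- function(connectFn, disconnectFn, fn) {
  connection <- connectFn()
  # on.exit runs when withConnection exits, normally or via an error
  on.exit(disconnectFn(connection), add = TRUE)
  fn(connection)
}

# Usage with stand-in connect/disconnect functions (no database needed)
state <- new.env()
state$open <- FALSE
fakeConnect <- function() {
  state$open <- TRUE
  "fake-connection"
}
fakeDisconnect <- function(connection) {
  state$open <- FALSE
  invisible(TRUE)
}

result <- withConnection(fakeConnect, fakeDisconnect, function(connection) {
  paste("used", connection)
})
```

The caller never has to remember to disconnect, which is the main appeal of the pattern.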

azimov · 2024-10-24T17:53:16Z

R/executionSettings.R

+    },
+
+    #TODO make this more rigorous
+    # add warning if no connection available


Include a DatabaseConnector::dbIsValid call?

mdlavallee92 · 2024-10-25

Makes sense. I left this open as I need to think through all the checks on the database connection.
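One possible shape for such a check, as a sketch only. The function name `assertLiveConnection` and the `isValidFn` parameter are hypothetical; the default assumes the DBI package is installed and that the connection object has a `dbIsValid` method.

```r
# Hypothetical check: fail early when there is no usable connection.
# `isValidFn` defaults to DBI::dbIsValid; because R evaluates defaults lazily,
# DBI is only needed when a non-NULL connection is checked with the default.
assertLiveConnection <- function(connection, isValidFn = DBI::dbIsValid) {
  if (is.null(connection)) {
    stop("No connection available; call connect() first.")
  }
  if (!isTRUE(isValidFn(connection))) {
    stop("Connection is no longer valid; reconnect before running queries.")
  }
  invisible(connection)
}
```

Passing the validity function as an argument keeps the check testable without a live database.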

azimov · 2024-10-24T17:54:20Z

R/executionSettings.R

+                          connection = NULL,
+                          cdmDatabaseSchema = NULL,
+                          workDatabaseSchema = NULL,
+                          tempEmulationSchema = NULL,


tempEmulationSchema is normally an option

mdlavallee92 · 2024-10-25

Could we have it as an argument? Snowflake requires it for everything, so I selfishly hope to keep it here 😃

azimov · 2024-10-29

Yes, keep it as an argument, but the current convention across other packages is to use

tempEmulationSchema = getOption('sqlRenderTempEmulationSchema')

So I'm proposing that this behaviour be consistent with that convention.
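A small sketch of how that convention behaves in practice. The constructor name `newSettings` is hypothetical and a plain list stands in for the R6 object.

```r
# Hypothetical constructor showing the default-from-option convention.
newSettings <- function(cdmDatabaseSchema,
                        tempEmulationSchema = getOption("sqlRenderTempEmulationSchema")) {
  list(
    cdmDatabaseSchema = cdmDatabaseSchema,
    tempEmulationSchema = tempEmulationSchema
  )
}

# An explicit argument wins; otherwise the session option is used.
oldOptions <- options(sqlRenderTempEmulationSchema = "scratch")
fromOption <- newSettings("cdm")
explicit <- newSettings("cdm", tempEmulationSchema = "my_temp")
options(oldOptions) # restore the previous option value
```

This keeps the argument available for platforms that need it while letting a session-wide option supply the default.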

azimov · 2024-10-24T17:55:20Z

R/executionSettings.R

+#' An R6 class to define an ExecutionSettings object
+#'
+#' @export
+ExecutionSettings <- R6::R6Class(


@anthonysena I think this is a useful abstraction that can be used in almost all our packages and CohortGenerator seems an appropriate place to put it.

Some initial points that I think we need:

Changing the generate cohorts API so that it creates this object OR uses it if specified; this way we can maintain compatibility for existing scripts. Alternatively, we should provide dual APIs (old and new) by creating a new function executeCohorts.

We will need to write a vignette.

We should update all functions in this package to use this.

We should look at the Strategus execution context list to see if there is additional stuff we should replicate there.

Perhaps we also move or replicate the CDM source metadata function from Strategus here?

We should add a helper function to create an instance.

anthonysena · 2024-10-25

Agreed @azimov and thanks for this contribution @mdlavallee92!

> Changing the generate cohorts API so that it creates this object OR uses it if specified; this way we can maintain compatibility for existing scripts. Alternatively, we should provide dual APIs (old and new) by creating a new function executeCohorts.

Let's aim to add this as an additional way of encapsulating the settings as you suggested and then fully adopt it when we have the opportunity to break compatibility with v1.x.
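The "create it or use it if specified" approach could look roughly like the sketch below. The function name `resolveExecutionSettings` and its argument list are hypothetical, and a plain list stands in for the ExecutionSettings object.

```r
# Hypothetical resolver: use the settings object when supplied, otherwise
# build one from the legacy arguments so existing scripts keep working.
resolveExecutionSettings <- function(executionSettings = NULL,
                                     connectionDetails = NULL,
                                     cdmDatabaseSchema = NULL,
                                     cohortDatabaseSchema = NULL) {
  if (!is.null(executionSettings)) {
    return(executionSettings)
  }
  if (is.null(connectionDetails)) {
    stop("Provide either executionSettings or connectionDetails.")
  }
  # A plain list stands in for the ExecutionSettings R6 object here
  list(
    connectionDetails = connectionDetails,
    cdmDatabaseSchema = cdmDatabaseSchema,
    cohortDatabaseSchema = cohortDatabaseSchema
  )
}

legacy <- resolveExecutionSettings(
  connectionDetails = "details-placeholder",
  cdmDatabaseSchema = "cdm",
  cohortDatabaseSchema = "results"
)
viaObject <- resolveExecutionSettings(executionSettings = legacy)
```

Existing functions would call the resolver first and work only with the resulting object afterwards, so both calling styles share one code path.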

mdlavallee92 · 2024-10-25T20:12:13Z

@azimov, @anthonysena and @chrisknoll this is the file I use for the R6Class functions. I added it to the PR for execution settings for now, but it would be better if this lived in some "tools" package we maintain elsewhere.

Tagging @alondhe and @katy-sadowski for awareness.

chrisknoll · 2024-10-27T19:17:27Z

> @azimov, @anthonysena and @chrisknoll this is the file I use for the R6Class functions. I added it to the PR for execution settings for now, but it would be better if this lived in some "tools" package we maintain elsewhere.

These would make good utility functions. I'm not sure I'd use them in my case, as there are cases where multiple checks are made (see here or here). But if it makes sense to use them, I'm all for it.

chrisknoll · 2024-10-27T19:24:04Z

R/executionSettings.R

+ExecutionSettings <- R6::R6Class(
+  classname = "ExecutionSettings",
+  public = list(
+    initialize = function(connectionDetails = NULL,


Just FYI, the form of initialize here is different from what we discussed about R6 classes in the monthly meeting:

Initialize takes a list of lists (i.e., the result of fromJSON()) or a string in JSON format, and the object is initialized from that.

The objective of those R6 classes (in CohortIncidence) was just to enshroud object state in a formalized construct. So they weren't meant to have behavior such as connecting, and the entire object model should be serializable to and from JSON; I'm not sure connection details fit that.

So, based on the above, I was imagining the execution settings would contain user-input data about schemas, runtime options, etc. To execute a study, you would then provide the analysis, execution settings, and connection details separately, especially because sometimes you pass along an actual open connection (for whatever reason) or connection details, and the code decides which to use.
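A rough sketch of the list-or-JSON initialization described above, for the data-only case. The function name `newExecutionData` is hypothetical, the field names are taken from the diff in this PR, and parsing a JSON string assumes the jsonlite package is installed. This is not the CohortIncidence implementation.

```r
# Hypothetical data-only settings: accepts a list or a single JSON string.
newExecutionData <- function(x) {
  if (is.character(x) && length(x) == 1) {
    # Assumes jsonlite is available; keep nested lists as lists
    x <- jsonlite::fromJSON(x, simplifyVector = FALSE)
  }
  if (!is.list(x)) {
    stop("Expected a list or a single JSON string.")
  }
  known <- c("cdmDatabaseSchema", "workDatabaseSchema", "tempEmulationSchema")
  unknown <- setdiff(names(x), known)
  if (length(unknown) > 0) {
    stop("Unknown settings: ", paste(unknown, collapse = ", "))
  }
  # Return the known fields in a fixed order
  x[intersect(known, names(x))]
}

fromList <- newExecutionData(
  list(cdmDatabaseSchema = "cdm", workDatabaseSchema = "scratch")
)
```

Because the object holds only plain data, it can round-trip through JSON, while connections are supplied separately at execution time.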

azimov · 2024-10-29

I think the context of this class is useful for a different reason than serialization or deserialization. This would allow us to standardize across packages how we encapsulate inputs. At the moment these are fairly inconsistent.

I would say we explicitly never want to serialize connectionDetails as this adds the potential to leak security details. So the construction should be:

createExecutionSettings <- function(connection = NULL,
                                    connectionDetails = NULL,
                                    cdmDatabaseSchema,
                                    cohortDatabaseSchema = ... etc) {
  data <- list(
    <insertAllCommonSettings>
  )
  ExecutionSettings$new(connection = connection, connectionDetails = connectionDetails, data = data)
}
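To make the "never serialize connectionDetails" point concrete, here is a small sketch. The names `newSettingsObject` and `serializableData` are hypothetical, a plain list stands in for the R6 class, and the credentials are dummy values.

```r
# Hypothetical container: connection fields live next to, not inside, `data`.
newSettingsObject <- function(connection = NULL, connectionDetails = NULL, data = list()) {
  list(connection = connection, connectionDetails = connectionDetails, data = data)
}

# Only `data` is returned for serialization; credentials never leave the object.
serializableData <- function(settings) {
  settings$data
}

settings <- newSettingsObject(
  connectionDetails = list(user = "me", password = "secret"),
  data = list(cdmDatabaseSchema = "cdm", cohortDatabaseSchema = "results")
)
exported <- serializableData(settings)
```

Keeping the split at construction time means any later export or logging code only ever sees the `data` part.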

chrisknoll · 2024-10-29

> I think the context of this class is useful for a different reason than serialization or deserialization.

That's a fair point, but this PR seems to set the basis for one example of implementing the other R6 classes. I was looking at the R6 classes as a means to define the analytic input API (at least at a data structure level, not a functional level). So there will be some that are pure data oriented (to define structure and handle serialization) and maybe there will be some that have behavior (for example, CI has a query builder R6 class I believe).

Maybe we need to see an example of both in this PR so we see where lines are drawn between behavior classes and structure classes.

azimov · 2024-10-29T15:35:01Z

> > @azimov, @anthonysena and @chrisknoll this is the file I use for the R6Class functions. I added it to the PR for execution settings for now, but it would be better if this lived in some "tools" package we maintain elsewhere.

> These would make good utility functions. I'm not sure I'd use them in my case, as there are cases where multiple checks are made (see here or here). But if it makes sense to use them, I'm all for it.

I think we should seek to expand this list. There will always be exceptions, but there are basic things that require a double check. For example, if a value is a big int, this is not readily supported by checkmate. The best solution I could think of was to check that the value is numeric and that value modulo 1 == 0. Agreeing on this and standardizing would be incredibly useful.

It could also be possible to use ... notation when the arguments are passed only to a single checkmate validator, with some reasonable defaults to override behaviour. E.g.:
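The numeric-and-modulo check could be sketched as follows. The function name `isWholeNumber` is hypothetical, and the extra guards against NA and infinite values are an assumption about what a standardized helper would want. Doubles represent whole numbers exactly only up to 2^53, so this does not cover the full 64-bit integer range.

```r
# Hypothetical validator: TRUE for whole-number numerics, including values
# too large for R's 32-bit integer type (which R stores as doubles).
isWholeNumber <- function(value) {
  is.numeric(value) && length(value) > 0 && !anyNA(value) &&
    all(is.finite(value)) && all(value %% 1 == 0)
}

isWholeNumber(3000000000) # larger than .Machine$integer.max
isWholeNumber(1.5)
isWholeNumber("3")
```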

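.setCharacter <- function(private, key, value, min.chars = 1, null.ok = FALSE, ...) {
  # Validate first; extra arguments in ... are forwarded to checkmate
  checkmate::assert_character(x = value, min.chars = min.chars, null.ok = null.ok, ...)
  # private is an R6 environment, so this assignment modifies it in place
  private[[key]] <- value
  invisible(private)
}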

anthonysena and others added 2 commits October 2, 2024 12:01

Release v0.11.2

fc28683

add execution settings

7365ac0

mdlavallee92 and others added 3 commits October 24, 2024 14:03

Merge branch 'OHDSI:main' into mles

8be0478

add createExecutionSettings

3c923f0

add R6 class checkmate fn

df94fc2
