Skip to content

Conversation

gogonzo
Copy link
Contributor

@gogonzo gogonzo commented Sep 11, 2025

Closes insightsengineering/NEST-roadmap#36

WIP description. Comments blocked

Introduction

Current design bases on an idea that a module can consume its arguments referring to any variable in any dataset. Consider following example, where:

  • a module uses x, y and facet arguments to create an interactive inputs,
  • user can select a variable from any dataset for x, y, facet
  • visualization will be build on a merged dataset containing these three variables
Pseudo example
# pseudocode
tm_example <- function(x, y, facet) {
  ui = function(id, x, y, facet) ...., # creates placeholders for inputs
  server = function(id, x, y, facet) {
    moduleServer(id, function(input, output, session) {
      output$plot <- renderPlot({
        merged_dataset |>
          ggplot(
            aes(
              x = <selected x var>,
              y = <selected y var>
            )          
          ) + geom_point() + facet_wrap(vars(<selected facet var>))
      })
    })
  }
}

To provide choices and default selection for x, y and facet we propose following api:

# pseudocode
tm_example(
  x = picks(
    datasets(<choices>, <selected>),
    variables(<choices>, <selected>)
  ),
  y = picks(
    datasets(<choices>, <selected>),
    variables(<choices>, <selected>)
  ),
  facet = picks(
    datasets(<choices>, <selected>),
    variables(<choices>, <selected>)
  )
)

Where each function creates an object which holds the information consumed by the framework. choices and selected can be either:

  • explicit character denoting the name of the objects
  • Natural number denoting index of possible
  • tidyselect selection_helpers (?tidyselect::language)

Example settings

Strict variables picks

picks below will create an input in the module where single variable can be selected from c("Sepal.Length", "Sepal.Width"). multiple = FALSE disallow user to select more than one choice.

picks(
  datasets(choices = "iris", selected = "iris"),
  variables(choices = c("Sepal.Length", "Sepal.Width"), selected = "Sepal.Length", multiple = FALSE)
)

Dynamic variables choices

Following picks will create an input in the module where user will be able to select any variable from iris (any = everything()) and by default 1-st will be selected. Be careful, setting explicit selected when choices throws a warning as it is not certain for example that "Species" %in% everything().

picks(
  datasets(choices = "iris", selected = "iris"),
  variables(choices = tidyselect::everything(), selected = 1, multiple = FALSE)
)

Dynamic variables from multiple datasets

Consider a situation when one wants to select a variable from either iris or mtcars. Instead of forcing app-developer to enumerate all possible choices for iris and mtcars. Following picks will create two related inputs for datasets and for variables. Input for variables will automatically update when dataset selection changes.

picks(
  datasets(choices = c("iris", "mtcars"), selected = "iris"),
  variables(choices = tidyselect::everything(), selected = 1, multiple = FALSE)
)

Dynamic everything

In extreme scenario also lists of datasets could be unknown. Or to avoid writing too much text, one can specify following picks.

picks(
  datasets(choices = tidyselect::where(is.data.frame), selected = 1),
  variables(choices = tidyselect::everything(), selected = 1, multiple = FALSE)
)

Implementation in teal_module

teal_module will accept x, y and facet and hand-over them to both ui and server.

code
tm_example <- function(x, y, facet) {
  module(
    ui = ui_example,
    server = srv_example,
    ui_args = list(x = x, y = y, facet = facet),
    server_args = list(x = x, y = y, facet = facet)
  )
}

On the ui part it is necessary to call module_input_ui for each picks object.

code
ui_example <- function(id, x, y, facet) {
  ns <- NS(id)
  div(
    module_input_ui(id = ns("x"), spec = x),
    module_input_ui(id = ns("y"), spec = y),
    module_input_ui(id = ns("facet"), spec = facet),
    plotOutput(ns("plot"))
  )
}

In the server, picks are utilized in module_input_srv which can be called per each pick, or for all at once (as in the example below). module_input_srv is used only to resolve dynamic choices/selected and to handle interaction between inputs. selectors contain a list of selected datasets/variables for each pick. In this example selectors structure looks like this:

x: (reactiveVal)
  datasets: 
    choices: ...
    selected: ...
  variables: 
    choices: ...
    selected: ...
y: ...
facet: ...

module_input_srv doesn't do anything else that controlling a selection for a number of reasons:

code
srv_example <- function(id, data, x, y, facet) {
  moduleServer(id, function(input, output, session) {
    selectors <- module_input_srv(data = data, spec = list(x = x, y = y, facet = facet))

    merged_q <- reactive({
      req(data(), map_merged(selectors))
      qenv_merge_selectors(x = data(), selectors = selectors)
    })

    plot_q <- reactive({
      within(merged_q(),
        {
          merged %>%
            ggplot(aes(x = x, y = y)) +
            geom_point() +
            facet_wrap(vars(facet))
        },
        x = str2lang(map_merged(selectors)$x$variables),
        y = str2lang(map_merged(selectors)$y$variables),
        facet = str2lang(map_merged(selectors)$facet$variables)
      )
    })

    output$plot <- renderPlot({
      req(plot_q())
      rev(get_outputs(plot_q()))[[1]]
    })
  })
}

App example

code
devtools::load_all("teal.data")
devtools::load_all("teal.transform")
devtools::load_all("teal")
devtools::load_all("teal.modules.general")
library(dplyr)

data <- within(teal.data::teal_data(), {
  customers <- tibble::tribble(
    ~id, ~name, ~age, ~country,
    1, "Alice Johnson", 30, "USA",
    2, "Bob Smith", 25, "Canada",
    3, "Charlie Brown", 35, "UK",
    4, "David Wilson", 28, "Australia",
    5, "Emma Davis", 32, "USA",
    6, "Frank Miller", 27, "Canada",
    7, "Grace Taylor", 29, "UK",
    8, "Henry Clark", 33, "Australia",
    9, "Isabella Martinez", 26, "USA",
    10, "Jack Thompson", 31, "Canada"
  )

  orders <- tibble::tribble(
    ~id, ~customer_id, ~order_date, ~total_amount,
    101, 1, as.Date("2024-01-15"), 250.00,
    102, 1, as.Date("2024-02-01"), 150.00,
    103, 2, as.Date("2024-02-10"), 125.00,
    104, 3, as.Date("2024-02-15"), 200.00,
    105, 4, as.Date("2024-02-20"), 175.00,
    106, 5, as.Date("2024-03-01"), 300.00,
    107, 6, as.Date("2024-03-05"), 50.00,
    108, 7, as.Date("2024-03-10"), 225.00,
    109, 8, as.Date("2024-03-12"), 100.00,
    110, 9, as.Date("2024-03-15"), 275.00,
    111, 10, as.Date("2024-03-18"), 125.00,
    112, 2, as.Date("2024-03-20"), 150.00
  )

  order_items <- tibble::tribble(
    ~id, ~order_id, ~product_id, ~quantity, ~unit_price, ~total_price,
    201, 101, 401, 2, 100.00, 200.00,
    202, 101, 402, 1, 50.00, 50.00,
    203, 102, 402, 3, 50.00, 150.00,
    204, 103, 402, 1, 50.00, 50.00,
    205, 103, 403, 1, 75.00, 75.00,
    206, 104, 401, 2, 100.00, 200.00,
    207, 105, 403, 2, 75.00, 150.00,
    208, 105, 402, 1, 50.00, 50.00,
    209, 106, 401, 3, 100.00, 300.00,
    210, 107, 402, 1, 50.00, 50.00,
    211, 108, 401, 1, 100.00, 100.00,
    212, 108, 403, 2, 75.00, 150.00,
    213, 109, 402, 2, 50.00, 100.00,
    214, 110, 401, 2, 100.00, 200.00,
    215, 110, 403, 1, 75.00, 75.00,
    216, 111, 402, 2, 50.00, 100.00,
    217, 111, 401, 1, 100.00, 100.00,
    218, 112, 403, 2, 75.00, 150.00
  )

  order_files <- tibble::tribble(
    ~id, ~order_id, ~file_name, ~file_type,
    301, 101, "invoice_101.pdf", "invoice",
    302, 102, "receipt_102.pdf", "receipt",
    303, 103, "invoice_103.pdf", "invoice",
    304, 104, "receipt_104.pdf", "receipt",
    305, 105, "invoice_105.pdf", "invoice",
    306, 106, "receipt_106.pdf", "receipt",
    307, 107, "invoice_107.pdf", "invoice",
    308, 108, "receipt_108.pdf", "receipt",
    309, 109, "invoice_109.pdf", "invoice",
    310, 110, "receipt_110.pdf", "receipt",
    311, 111, "invoice_111.pdf", "invoice",
    312, 112, "receipt_112.pdf", "receipt"
  )

  products <- tibble::tribble(
    ~id, ~name, ~price, ~category, ~stock_quantity,
    401, "Laptop Pro", 100.00, "Electronics", 15,
    402, "Wireless Mouse", 50.00, "Electronics", 50,
    403, "Office Chair", 75.00, "Furniture", 8
  )

  product_components <- tibble::tribble(
    ~id, ~product_id, ~component_name, ~component_type, ~quantity_required, ~cost,
    501, 401, "CPU", "Processor", 1, 25.00,
    502, 401, "RAM", "Memory", 2, 15.00,
    503, 401, "SSD", "Storage", 1, 20.00,
    504, 401, "Screen", "Display", 1, 30.00,
    505, 402, "Optical Sensor", "Sensor", 1, 8.00,
    506, 402, "Wireless Module", "Connectivity", 1, 12.00,
    507, 402, "Battery", "Power", 1, 5.00,
    508, 403, "Steel Frame", "Structure", 1, 35.00,
    509, 403, "Cushion", "Comfort", 1, 20.00,
    510, 403, "Wheels", "Mobility", 5, 3.00
  )

  iris <- iris
  mtcars <- mtcars
  iris$id <- seq_len(nrow(iris))
  mtcars$id <- seq_len(nrow(mtcars))
  ADSL <- rADSL
  ADTTE <- rADTTE
  ADRS <- rADRS
  ADAE <- rADAE
  ADLB <- rADLB
  # ADSL <- tmc_ex_adsl
  # ADQS <- tmc_ex_adqs %>%
  #   filter(ABLFL != "Y" & ABLFL2 != "Y") %>%
  #   filter(AVISIT %in% c("WEEK 1 DAY 8", "WEEK 2 DAY 15", "WEEK 3 DAY 22")) %>%
  #   mutate(
  #     AVISIT = as.factor(AVISIT),
  #     AVISITN = rank(AVISITN) %>%
  #       as.factor() %>%
  #       as.numeric() %>%
  #       as.factor()
  #   )
  ADTR <- rADTR
})

join_keys(data) <- c(
  teal.data::default_cdisc_join_keys[c("ADSL", "ADTTE", "ADRS", "ADAE", "ADQS", "ADTR", "ADLB")],
  teal.data::join_keys(
    join_key("iris", keys = "id"),
    join_key("mtcars", keys = "id"),
    teal.data::join_key("customers", keys = "id"),
    teal.data::join_key("orders", keys = c("id")),
    teal.data::join_key("products", keys = c("id")),
    teal.data::join_key("product_components", keys = c("id")),
    # foreign keys
    teal.data::join_key("customers", "orders", keys = c(id = "customer_id")),
    teal.data::join_key("products", "order_items", keys = c(id = "product_id")),
    teal.data::join_key("products", "product_components", keys = c(id = "product_id")),
    teal.data::join_key("orders", "order_items", keys = c(id = "order_id")),
    # add missing keys
    teal.data::join_key("ADTR", "ADTR", keys = c("STUDYID", "USUBJID", "PARAMCD", "AVISIT")),
    teal.data::join_key("ADSL", "ADTR", keys = c("STUDYID", "USUBJID"))
  )
)

app <- init(
  data = data,
  modules = modules(
      tm_merge(
      label = "adam",
      inputs = list(
        a = picks(
          datasets("ADTTE"),
          variables(
            # choices = where(~ is.factor(.x) | is.character(.x)) & !any_of(c("USUBJID", "STUDYID", "SITEID", "SUBJID")),
            multiple = TRUE
          )
        ),
        b = picks(
          datasets(choices = tidyselect::where(is.data.frame), selected = "ADSL"),
          variables(is_categorical(min.len = 2, max.len = 20) & !is_key(), selected = 1, multiple = TRUE)
        ),
        c = picks(
          datasets(tidyselect::everything(), "ADTTE"),
          variables(choices = c(AGE:ARM, PARAMCD), selected = AGE, multiple = TRUE)
        ),
        d = picks(
          datasets(choices = "ADRS", selected = "ADRS"),
          variables(choices = "PARAM", selected = "PARAM"),
          values(selected = tidyselect::everything(), multiple = TRUE)
        ),
        e = picks(
          datasets(selected = "ADSL"),
          variables(
            choices = variable_choices("whatever", subset = function(data) {
              idx <- vapply(data, is.factor, logical(1))
              names(data)[idx]
            })
          )
        )
      )
    ),
    tm_merge(
      label = "non adam",
      inputs = list(
        a = picks(
          datasets(
            choices = tidyselect::where(is.data.frame) & !tidyselect::starts_with("AD"),
            selected = "orders"
          ),
          variables(
            selected = "order_date",
            multiple = TRUE
          )
        ),
        b = picks(
          datasets(selected = "products"),
          variables(selected = "price", multiple = TRUE)
        ),
        c = picks(
          datasets(selected = "order_items"),
          variables(multiple = TRUE)
        )
      )
    ),
    tm_p_spiderplot(
      time_var = picks(
        datasets("ADTR"),
        variables(c("ADY", "AVISITN"))
      ),
      value_var = picks(
        datasets(c(ADSL, ADTR)),
        variables(choices = where(is.numeric), selected = any_of(c("BMRKR1", "AVAL")))
      ),
      subject_var = picks(
        datasets("ADSL"),
        variables("USUBJID"),
        values(
          selected = tidyselect::everything(),
          multiple = TRUE
        )
      ),
      color_var = picks(
        datasets("ADSL"),
        variables(c("SEX", "RACE", "COUNTRY", "ARM")),
        values(selected = tidyselect::everything(), multiple = FALSE)
      ),
      decorators = list(
        teal_transform_module(
          server = function(id, data) {
            moduleServer(id, function(input, output, session) {
              reactive({
                within(
                  data(), plot <- plot %>% plotly::layout(title = title),
                  title = sprintf("Spiderplot of %s", toString(unique(data()$ADTR$PARAMCD)))
                )
              })
            })
          }
        )
      ),
      transformators = list(
        teal_transform_filter(
          label = "Select Endpoint",
          picks(
            datasets("ADTR"),
            variables("PARAMCD"),
            values(choices = everything(), multiple = FALSE)
          )
        )
      )
    )
  )
)

shinyApp(app$ui, app$server, enableBookmarking = "server")

@insightsengineering insightsengineering locked as off-topic and limited conversation to collaborators Sep 11, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Design data extract and data merge
2 participants