[help] Why does dynamic branching change the rules for when a target becomes outdated? #1369
Help
Description
In a chain of three targets
targets::tar_script({
  # first run
  make_dfr <- function() tibble::tibble(x = 1:2, a = 1:2)
  list(
    targets::tar_target(dfr, make_dfr()),
    targets::tar_target(dfr_x, dplyr::select(dfr, x)),
    targets::tar_target(result, max(dfr_x$x))
  )
})
targets::tar_make()
#> > dispatched target dfr
#> o completed target dfr [0.001 seconds]
#> > dispatched target dfr_x
#> o completed target dfr_x [0.035 seconds]
#> > dispatched target result
#> o completed target result [0 seconds]
#> > ended pipeline [0.069 seconds]

targets::tar_script({
  # second run
  make_dfr <- function() tibble::tibble(x = 1:2, a = 1:2 * 5)
  list(
    targets::tar_target(dfr, make_dfr()),
    targets::tar_target(dfr_x, dplyr::select(dfr, x)),
    targets::tar_target(result, max(dfr_x$x))
  )
})
targets::tar_make()
#> > dispatched target dfr
#> o completed target dfr [0.002 seconds]
#> > dispatched target dfr_x
#> o completed target dfr_x [0.033 seconds]
#> v skipped target result
#> > ended pipeline [0.075 seconds]

targets::tar_script({
  # first run dynamic branching
  make_dfr <- function() list(tibble::tibble(x = 1:2, a = 1:2))
  list(
    targets::tar_target(dfr, make_dfr(), iteration = "list"),
    targets::tar_target(
      dfr_x,
      dplyr::select(dfr, x),
      pattern = map(dfr),
      iteration = "list"
    ),
    targets::tar_target(result, max(dfr_x$x), pattern = map(dfr_x))
  )
})
targets::tar_make()
#> > dispatched target dfr
#> o completed target dfr [0.002 seconds]
#> > dispatched branch dfr_x_39b4e0c71f9c48fa
#> o completed branch dfr_x_39b4e0c71f9c48fa [0.033 seconds]
#> o completed pattern dfr_x
#> > dispatched branch result_89fb999fd0673ae5
#> o completed branch result_89fb999fd0673ae5 [0 seconds]
#> o completed pattern result
#> > ended pipeline [0.077 seconds]

targets::tar_script({
  # second run dynamic branching
  make_dfr <- function() list(tibble::tibble(x = 1:2, a = 1:2 * 5))
  list(
    targets::tar_target(dfr, make_dfr(), iteration = "list"),
    targets::tar_target(
      dfr_x,
      dplyr::select(dfr, x),
      pattern = map(dfr),
      iteration = "list"
    ),
    targets::tar_target(result, max(dfr_x$x), pattern = map(dfr_x))
  )
})
targets::tar_make()
#> > dispatched target dfr
#> o completed target dfr [0.002 seconds]
#> > dispatched branch dfr_x_6e2886b62e71937a
#> o completed branch dfr_x_6e2886b62e71937a [0.033 seconds]
#> o completed pattern dfr_x
#> > dispatched branch result_d09dc4169add951f
#> o completed branch result_d09dc4169add951f [0 seconds]
#> o completed pattern result
#> > ended pipeline [0.075 seconds]

Created on 2024-11-10 with reprex v2.1.0
Replies: 1 comment 1 reply
It's a limitation of dynamic branching. The names of the dynamic branches are based on hashes of the upstream dependencies. Since all of `dfr` changed, each branch of `dfr_x` changed its name, and this change propagates all the way downstream.

Although hash-based naming caused targets to invalidate in this case, it is usually more stable than the obvious alternatives. For example, if the branches were named `dfr_x_1` and `dfr_x_2`, then each branch name would depend on the number and ordering of all the other branches, which would lead to even more wasted computation in the general case.
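The trade-off described in this reply can be sketched in base R. This is only an illustration under stated assumptions: `toy_hash()`, `hash_names()`, and `index_names()` are hypothetical helpers written for this example, and the toy hash is not the algorithm that targets actually uses to name branches. The sketch compares how many branches could be reused under each naming scheme when one element is edited and when a new element is inserted at the front.

```r
# Toy content hash over the deparsed object (hypothetical, NOT targets' algorithm).
toy_hash <- function(x) {
  bytes <- utf8ToInt(paste(deparse(x), collapse = ""))
  h <- Reduce(function(acc, b) (acc * 31 + b) %% 2147483647, bytes, 0)
  sprintf("%08x", as.integer(h))
}

# Hypothetical naming schemes for the branches of a pattern over `elements`.
hash_names <- function(prefix, elements) {
  paste0(prefix, "_", vapply(elements, toy_hash, character(1)))
}
index_names <- function(prefix, elements) {
  paste0(prefix, "_", seq_along(elements))
}

a <- data.frame(x = 1:2, a = 1:2)
b <- data.frame(x = 3:4, a = 3:4)
a_changed <- data.frame(x = 1:2, a = 1:2 * 5) # only column `a` differs
added <- data.frame(x = 5:6, a = 5:6)

before <- list(a, b)

# Case 1: one upstream element changes (as in the question).
# Its hash-based branch name changes even though column `x` is the same,
# so everything downstream of that branch looks new.
after_edit <- list(a_changed, b)
names_before <- hash_names("dfr_x", before)
names_edit <- hash_names("dfr_x", after_edit)
edited_branch_renamed <- names_before[1] != names_edit[1]
untouched_branch_kept <- names_before[2] == names_edit[2]

# Case 2: a new element is inserted at the front.
after_insert <- list(added, a, b)

# Hash-based names: the branches for `a` and `b` keep their names.
reusable_hash <- intersect(names_before, hash_names("dfr_x", after_insert))

# Index-based names: dfr_x_1 and dfr_x_2 still exist, but each one now
# refers to different data, so neither can be reused.
same_slot <- mapply(identical, before, after_insert[seq_along(before)])
reusable_index <- sum(same_slot)

print(index_names("dfr_x", after_insert))
print(length(reusable_hash))
print(reusable_index)
```

In this sketch the edit in case 1 renames only the affected branch, while the insertion in case 2 leaves both existing hash-named branches reusable and invalidates every index-named branch.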