Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

collector4 #2

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
76 changes: 76 additions & 0 deletions R/collector.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,76 @@
# initiate global tibble to store the results
globals <- new.env()
globals$archive <- tibble(
call = list(),
env = list()
)

# Add collect_and_rethrow() before the original body
# and set the original function as the "unmodified" attribute
set_collector <- function(fun, force = FALSE) {
new_fun <- fun
body(new_fun) <- call(
"{",
quote(collect_and_rethrow()),
body(fun)
)
# note: trace() calls it "original" so we picked a new name
attr(new_fun, "unmodified") <- fun
new_fun
}

# * copy caller env and all the chain up to a special env, but every binding is lazy
# * store the call and env in the global tibble
# * on.exit, go through the envs and remove every binding that is still lazy, it's
# not been used, by reference it will update the global tibble too
# * use the original function to eval the call, and do it in the new env, return
# this value from the original function call
collect_and_rethrow <- function() {
new_caller_env <- env_clone_lazy(parent.frame(2))
call = sys.call(-1)
archive(
call = call,
env = new_caller_env
)
rlang::eval_bare(bquote(on.exit(env_cleanup(.(new_caller_env)))), parent.frame())

call[[1]] <- attr(sys.function(-1), "unmodified")
rlang::return_from(parent.frame(), eval(call, new_caller_env))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return_from() is overkill, we can have eval(call, new_caller_env) and remove the original body in the curried function.

}



# note: env_clone doesn't do deep copy, i.e we're not cloning environments that
# are bound in `e`, or nested in lists, or found in attributes, or as function enclosures
env_clone_lazy <- function(env) {
# we stop the recursion when we find a special env, defined by having a name
if (!is.null(attr(env, "name")) || identical(env, emptyenv())) return(env)
parent_clone <- env_clone_lazy(env_parent(env))
clone <- rlang::env_clone(env, parent = parent_clone)
for (nm in names(clone)) {
#FIXME: assumes env contains no active or lazy bindings, this could be fixed
env_bind_lazy(clone, !!nm := !!env[[nm]])
}
clone
}

# drop lazy bindings, since these were not used
env_cleanup <- function(env) {
if (!is.null(attr(env, "name")) || identical(env, emptyenv())) return(env)
env_cleanup(env_parent(env))
lazy_lgl <- env_binding_are_lazy(env)
rm(list = names(lazy_lgl)[lazy_lgl], envir = env)
}

archive <- function(
call,
env
) {
globals$archive <- rbind(
globals$archive,
tibble(
call = list(call),
env = list(env)
)
)
}
1 change: 1 addition & 0 deletions R/summarise.R
Original file line number Diff line number Diff line change
Expand Up @@ -119,6 +119,7 @@
summarise <- function(.data, ..., .groups = NULL) {
UseMethod("summarise")
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops

#' @rdname summarise
#' @export
summarize <- summarise
Expand Down
4 changes: 4 additions & 0 deletions R/zzz.r
Original file line number Diff line number Diff line change
@@ -1,4 +1,8 @@
.onLoad <- function(libname, pkgname) {
mutate <<- set_collector(mutate)
summarise <<- set_collector(summarise)
summarize <<- set_collector(summarize)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

once in a package we can do this in place from outside, and we can easily have a function to reverse the operation, very similar to untrace()

op <- options()
op.dplyr <- list(
dplyr.show_progress = TRUE
Expand Down