I have a `targets` pipeline in which one target queries a database, saves a file to disk, and returns a file path. A downstream target reads in that path and does some further analysis. However, I sometimes need to edit the downstream function interactively. Is this the best way to write a function that both works in a `targets` pipeline and also allows for interactive development outside of the pipeline?

```r
explore_data <- function() {
  # load the analysis data
  if (!exists("tar_runtime")) {
    # running interactively, outside the pipeline
    library(tidyverse)
    library(arrow)
    source("R/util.R")
    analysis_data <- read_parquet("path/to/analysis_data.parquet")
  } else {
    # running inside the pipeline: load the analysis data
    # using the target name from _targets
    parquet_file_path <- tar_read(analysis_data)
    # read the parquet file into a data frame
    analysis_data <- read_parquet(parquet_file_path)
  }
  # rest of the script to explore the data
  analysis_data %>% ...
}
```
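As a side note (my addition, not part of the original question): the `exists("tar_runtime")` check could also be written with `targets::tar_active()`, which reports whether code is currently running inside a pipeline. A minimal sketch under that assumption, keeping the same file layout as above:

```r
# Hypothetical variant of the guard above, using targets::tar_active()
explore_data <- function() {
  if (targets::tar_active()) {
    # inside the pipeline: read the upstream target's stored output
    parquet_file_path <- targets::tar_read(analysis_data)
    analysis_data <- arrow::read_parquet(parquet_file_path)
  } else {
    # interactive session: load dependencies and read the file directly
    library(tidyverse)
    source("R/util.R")  # assumed helper file from the question
    analysis_data <- arrow::read_parquet("path/to/analysis_data.parquet")
  }
  # rest of the script to explore the data
  analysis_data
}
```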
What I would typically do is have the explore-data function take a file path as an argument. So your function would look like:

```r
explore_data <- function(parquet_file_path) {
  analysis_data <- read_parquet(parquet_file_path)
  # rest of the script to explore the data
  analysis_data %>% ...
}
```

And your list of targets in `_targets.R` would look like:

```r
list(
  tar_target(parquet_file_path, "path/to/analysis_data.parquet", format = "file"),
  tar_target(exploration, explore_data(parquet_file_path))
)
```
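To make the suggestion above concrete, here is one way a complete `_targets.R` could look (a sketch under my own assumptions, not from the thread; the helper file name and packages are taken from the question). Note that `format = "file"` tells `targets` to track the file's contents, so downstream targets rerun when the parquet file changes:

```r
# _targets.R -- hypothetical sketch
library(targets)
tar_option_set(packages = c("tidyverse", "arrow"))
source("R/util.R")  # assumed to define explore_data()

list(
  # Track the parquet file itself; targets invalidates the
  # exploration target whenever the file's contents change.
  tar_target(parquet_file_path, "path/to/analysis_data.parquet",
             format = "file"),
  tar_target(exploration, explore_data(parquet_file_path))
)
```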
In the above simple example, if I wanted to work on editing the `explore_data()` function, I'd probably run, in the console:

```r
tar_load(parquet_file_path)
```

After running that, there's a `parquet_file_path` object in my environment that prints/returns `"path/to/analysis_data.parquet"`. Then I'd open up the file with the `explore_data()` function, and because I named my file path target the same as the file argument to that function, I can just run the code inside the function one line at a time until I get to the part I want to edit. This pattern should work for anything. For example, if you had a function that …
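The workflow described above can be sketched as a console session (my reconstruction; it assumes the targets from the earlier reply have already been built with `tar_make()`):

```r
# Interactive console session -- sketch, assumes the pipeline has run
library(targets)

# Load the upstream target's value into the global environment
tar_load(parquet_file_path)
# `parquet_file_path` now holds "path/to/analysis_data.parquet"

# Because the object name matches the argument name of explore_data(),
# you can open the file defining explore_data() and run its body
# line by line, e.g.:
analysis_data <- arrow::read_parquet(parquet_file_path)
```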