I'm a member of the Observational Health Data Sciences and Informatics (OHDSI) community (ohdsi.org). OHDSI is a community focused on generating medical evidence from patient-level data, leveraging an agreed-upon standard data model and a set of R-based method libraries. One area where we have not yet found a solution is reproducible pipeline execution. I was recently tasked with evaluating the landscape of pipeline tooling available for R-based analyses, and, as you might suspect, I came across targets as part of that journey.
I have tried to script out the scenario the OHDSI community is often tasked with: generating evidence at scale across a distributed community of data partners who each hold person-level data. This task requires multiple steps, and a reliable, reproducible process is our goal. Here is a draft stub script, with some embedded questions, that I thought might help get the discussion started.
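In outline, the shape is roughly this (a minimal sketch only; `create_cohorts()`, `run_analytics()`, and `package_results()` are placeholder stubs standing in for real OHDSI method-library calls, not actual functions):

```r
# _targets.R: rough skeleton only. create_cohorts(), run_analytics(),
# and package_results() are placeholders for real OHDSI method-library
# calls executed at each data partner site.
library(targets)

# Placeholder implementations so the skeleton runs end to end.
create_cohorts <- function() {
  data.frame(cohort_id = 1:2, name = c("cases", "controls"))
}
run_analytics <- function(cohorts) {
  data.frame(cohort_id = cohorts$cohort_id, estimate = c(1.2, 0.9))
}
package_results <- function(results) {
  saveRDS(results, "results_bundle.rds")
  "results_bundle.rds" # return the path so targets can track the file
}

list(
  # Step 1: define study cohorts against the site's standardized data.
  tar_target(cohorts, create_cohorts()),
  # Step 2: execute the analytic method against those cohorts.
  tar_target(analysis, run_analytics(cohorts)),
  # Step 3: package aggregate results for return to the study coordinator.
  tar_target(bundle, package_results(analysis), format = "file")
)
```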
Please let me know if there are any details that I left out that would aid in the discussion.
Thanks!
Replies: 1 comment
-
I am not sure I completely follow the use case yet, but I can begin with some comments. It looks like you are getting your initial data from a database, and it would be ideal to automatically invalidate targets when that upstream data changes. Parallel computing and integration with cloud storage are documented in the targets user manual; cloud storage specifically at https://books.ropensci.org/targets/data.html#cloud-storage. Writing to a database is tricky: if you write to the same database table as the data you start with, then your pipeline is circular, which is not a good fit for the directed acyclic graph that targets expects.
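For the invalidation piece, one common pattern is to fingerprint the upstream table in a target that always runs, so downstream targets rebuild only when the data actually changes. A minimal sketch, assuming a local SQLite file and a hypothetical `person` table:

```r
# _targets.R: minimal sketch, assuming a local SQLite database
# "example.sqlite" with a hypothetical "person" table. Swap in your
# own DBI backend and credentials.
library(targets)

read_person <- function(path) {
  con <- DBI::dbConnect(RSQLite::SQLite(), path)
  on.exit(DBI::dbDisconnect(con))
  DBI::dbReadTable(con, "person")
}

list(
  # Reruns on every tar_make() because of the "always" cue, but
  # downstream targets invalidate only if the hash value changes.
  tar_target(
    person_fingerprint,
    digest::digest(read_person("example.sqlite")),
    cue = tar_cue(mode = "always")
  ),
  # Referencing person_fingerprint creates the dependency, so this
  # target reruns only when the upstream table actually changed.
  tar_target(
    person_data,
    {
      person_fingerprint
      read_person("example.sqlite")
    }
  )
)
```

With this in place, tar_make() re-hashes the table on every run, and the rest of the pipeline stays untouched unless the data changed.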