[help] reprex: with garbage_collection = TRUE, if I load DESeq2 in the target script, the pipeline goes from 200 workers to struggling to launch 50, with the rest of the code unaltered #1353
Description
This is bizarre. This reproducible pipeline launches 200 workers very quickly:
# _targets.R
library(targets)
library(crew) # Load the crew package for resource management
library(crew.cluster)

# Define the crew controller using Slurm
controller <- crew_controller_slurm(
  name = "tier_1",
  slurm_memory_gigabytes_per_cpu = 8,
  # script_lines = "#SBATCH --mem 15G", # Uncomment if you need custom SBATCH lines
  slurm_cpus_per_task = 1,
  workers = 200,
  # seconds_interval = 0.1,
  # tasks_max = 50, # Uncomment and adjust if needed
  verbose = TRUE
)

# Set target options to use the crew controller
tar_option_set(
  controller = controller,
  memory = "transient", garbage_collection = TRUE,
  storage = "worker", retrieval = "worker"
)

# Generate 1000 targets
list(
  tar_target(
    name = l,
    command = 1:1000, iteration = "list"
  ),
  tar_target(
    name = a,
    # Wait for 20 seconds
    command = { Sys.sleep(20); return("bla")},
    pattern = l, iteration = "list"
  )
)
2-3 jobs are launched at a time:
▶ dispatched branch a_5e041bac7fdda32a
▶ dispatched branch a_b4ef72d8cc373f87
▶ dispatched branch a_fb70227d5195e1b6
Submitted batch job 18893351
Submitted batch job 18893352
▶ dispatched branch a_708ff7775c222f46
▶ dispatched branch a_998a1f850b34f8bf
Submitted batch job 18893353
Submitted batch job 18893354
▶ dispatched branch a_d5bfb22e2970bf05
▶ dispatched branch a_614c910a6407ae48
Submitted batch job 18893355
Submitted batch job 18893356
▶ dispatched branch a_218a7e33829a6caa
▶ dispatched branch a_e5436d8e690b357c
Submitted batch job 18893357
Submitted batch job 18893358
▶ dispatched branch a_fad08582a31aa945
▶ dispatched branch a_49d8e9e910e006b4
▶ dispatched branch a_c29db86ad8864854
Submitted batch job 18893359
Submitted batch job 18893360
▶ dispatched branch a_173f18e5db84bdf6
Submitted batch job 18893361
Submitted batch job 18893362
▶ dispatched branch a_46f22c0f6ccb974d
▶ dispatched branch a_570472b1af5d6446
Submitted batch job 18893363
Submitted batch job 18893364
▶ dispatched branch a_89e035454124b5cb
▶ dispatched branch a_381eb4cc42050173
Submitted batch job 18893365
Submitted batch job 18893366
▶ dispatched branch a_6e1ac475f385e0ac
▶ dispatched branch a_dcfb404a18f8ef63
Submitted batch job 18893367
Submitted batch job 18893368
▶ dispatched branch a_94a7fc5f5112c097
▶ dispatched branch a_3d3e35c7518030af
The same pipeline with DESeq2 loaded at the top:
# _targets.R
library(DESeq2)
library(targets)
library(crew) # Load the crew package for resource management
library(crew.cluster)

# Define the crew controller using Slurm
controller <- crew_controller_slurm(
  name = "tier_1",
  slurm_memory_gigabytes_per_cpu = 8,
  # script_lines = "#SBATCH --mem 15G", # Uncomment if you need custom SBATCH lines
  slurm_cpus_per_task = 1,
  workers = 200,
  # seconds_interval = 0.1,
  # tasks_max = 50, # Uncomment and adjust if needed
  verbose = TRUE
)

# Set target options to use the crew controller
tar_option_set(
  controller = controller,
  memory = "transient", garbage_collection = TRUE,
  storage = "worker", retrieval = "worker"
)

# Generate 1000 targets
list(
  tar_target(
    name = l,
    command = 1:1000, iteration = "list"
  ),
  tar_target(
    name = a,
    # Wait for 20 seconds
    command = { Sys.sleep(20); return("bla")},
    pattern = l, iteration = "list"
  )
)
This version launches 30-50 workers at most, and launches them more slowly, one at a time:
▶ dispatched branch a_448277654bcddef7
Submitted batch job 18893726
▶ dispatched branch a_5fd4ae7fc058cbda
Submitted batch job 18893727
▶ dispatched branch a_08d5d4689f970b56
Submitted batch job 18893728
▶ dispatched branch a_3c17fccf298467fb
✔ skipped branch a_7944aa1e96b3e001
Submitted batch job 18893729
▶ dispatched branch a_fbc50ef3280274aa
Submitted batch job 18893730
▶ dispatched branch a_287ee8c9bb727c7b
Submitted batch job 18893731
▶ dispatched branch a_8d88d8d0e54f1aae
Submitted batch job 18893732
▶ dispatched branch a_12ebdeb201aedec7
Submitted batch job 18893733
▶ dispatched branch a_b2b238a68b3ae674
Submitted batch job 18893737
▶ dispatched branch a_84380e1f9da4a8a8
Submitted batch job 18893764
▶ dispatched branch a_26440d9480acc0ae
Submitted batch job 18893765
▶ dispatched branch a_506788df7a4160c1
Submitted batch job 18893766
▶ dispatched branch a_e52df70f047cdf09
Submitted batch job 18893767
▶ dispatched branch a_5fce7e42cc0c3aa2
Submitted batch job 18893768
▶ dispatched branch a_37e2f2be000352d2
Submitted batch job 18893769
▶ dispatched branch a_b2adccd98b257c18
Submitted batch job 18893771
▶ dispatched branch a_f2cdc78fa5018f3d
Submitted batch job 18893772
▶ dispatched branch a_93f67fd0197b0a51
Submitted batch job 18893773
▶ dispatched branch a_8e8195d32ab5ab98
Submitted batch job 18893774
▶ dispatched branch a_882db2f95243c0b4
Submitted batch job 18893775
▶ dispatched branch a_d5ba83b1bfd967a0
Submitted batch job 18893776
▶ dispatched branch a_df0f410e600a5f7c
Submitted batch job 18893777
▶ dispatched branch a_1dc931d7fad4b85e
I think you and I noticed this issue at around the same time. The problem is that base::gc() is computationally expensive, especially when run over and over again for large numbers of fast targets. In the development version of targets, I recently made the garbage_collection option of tar_option_set() more flexible so that e.g. tar_option_set(garbage_collection = 100) will only run gc() every hundredth target.