
Commit

LSF bug fixes.
The LSB_PJL_TASK_GEOMETRY env var can make bsub commands very
long for jobs with large numbers of cores. Rather than using
/bin/env to pass env vars, a sh wrapper script first sets the
variables using export statements and then calls the bsub
command, passing the usual options. Each job still gets its own
environment, even when jobs are submitted by concurrent threads,
and the bsub command is shorter because it no longer carries a
long list of environment variable settings.

NOTE: This commit has not been fully tested, but is being committed
to ease testing across multiple HPC systems (e.g. WCOSS and
Yellowstone).
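
The wrapper approach described above can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: `build_wrapper_script` is a hypothetical helper, the bsub command line is a placeholder, and the use of `Shellwords.escape` is an added assumption (the diff writes values unquoted), included so that values containing spaces or shell metacharacters survive the trip through sh.

```ruby
require 'shellwords'

# Hypothetical helper (not part of Rocoto): builds the text of a sh wrapper
# that exports each variable and then runs the given command line.
def build_wrapper_script(envars, command)
  lines = ["#!/bin/sh"]
  envars.each do |name, value|
    if value.nil?
      # No value given: export the name only, as the diff does
      lines << "export #{name}"
    else
      # Assumption: quote the value so sh treats it as a single word
      lines << "export #{name}=#{Shellwords.escape(value.to_s)}"
    end
  end
  lines << command
  lines.join("\n") + "\n"
end

# Placeholder variables and command, for illustration only
script = build_wrapper_script(
  { "LSB_PJL_TASK_GEOMETRY" => "{(0,1)(2,3)}", "EMPTY_VAR" => nil },
  "bsub -n 4 ./job.sh"
)
puts script
```

Because the variables live in the script body rather than on the command line, the bsub invocation itself stays the same length no matter how large the task geometry string grows.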
christopherwharrop-noaa committed Apr 28, 2015
1 parent 86e7ce9 commit f607041
Showing 1 changed file with 14 additions and 11 deletions.
25 changes: 14 additions & 11 deletions lib/workflowmgr/lsfbatchsystem.rb
@@ -15,6 +15,7 @@ class LSFBatchSystem
 require 'workflowmgr/utilities'
 require 'fileutils'
 require 'etc'
+require 'tempfile'

 @@qstat_refresh_rate=30
 @@max_history=3600*1
@@ -104,12 +105,12 @@ def submit(task)
 rocotodir=File.dirname(File.dirname(File.expand_path(File.dirname(__FILE__))))

 # Build up the string of environment settings
-envstr=""
+envstr="#!/bin/sh\n"
 task.envars.each { |name,env|
 if env.nil?
-envstr += "#{name}='' "
+envstr += "export #{name}\n"
 else
-envstr += "#{name}='#{env}' "
+envstr += "export #{name}=#{env}\n"
 end
 }

@@ -168,7 +169,7 @@ def submit(task)
 end
 cmd += " -R span[ptile=#{span}]"
 cmd += " -n #{nval}"
-envstr += "#{ROCOTO_TASK_GEO}='#{task_geometry}' "
+envstr += "export #{ROCOTO_TASK_GEO}=#{task_geometry}\n"
 end
 when :nodes
 # Get largest ppn*tpp to calculate ptile
@@ -211,7 +212,7 @@ def submit(task)
 cmd += " -n #{nnodes*ptile}"

 # Setenv the LSB_PJL_TASK_GEOMETRY to specify task layout
-envstr += "#{ROCOTO_TASK_GEO}='#{task_geometry}' "
+envstr += "export #{ROCOTO_TASK_GEO}=#{task_geometry}\n"
 when :walltime
 hhmm=WorkflowMgr.seconds_to_hhmm(WorkflowMgr.ddhhmmss_to_seconds(value))
 cmd += " -W #{hhmm}"
@@ -253,14 +254,16 @@ def submit(task)
 # Add the command to submit
 cmd += " #{rocotodir}/sbin/lsfwrapper.sh #{task.attributes[:command]}"

-# Prepend the environment settings
-cmd = "/bin/env " + envstr + cmd
+# Build a script to set env vars and then call bsub to submit the job
+tf=Tempfile.new('bsub.wrapper')
+tf.write(envstr + cmd)
+tf.flush()

-# Run the submit command
-output=`#{cmd} 2>&1`.chomp
+# Run the submit command script
+output=`/bin/sh #{tf.path} 2>&1`.chomp

-WorkflowMgr.log("Submitted #{task.attributes[:name]} using #{cmd} 2>&1 ==> #{output}")
-WorkflowMgr.stderr("Submitted #{task.attributes[:name]} using #{cmd} 2>&1 ==> #{output}",4)
+WorkflowMgr.log("Submitted #{task.attributes[:name]} using '/bin/sh #{tf.path} 2>&1' with input {{#{envstr + cmd}}}")
+WorkflowMgr.stderr("Submitted #{task.attributes[:name]} using '/bin/sh #{tf.path} 2>&1' with input {{#{envstr + cmd}}}",4)

 # Parse the output of the submit command
 if output=~/Job <(\d+)> is submitted to (default )*queue/
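
The last hunk writes the wrapper to a temporary file, runs it with /bin/sh, and matches the output against the bsub success pattern. A self-contained sketch of that flow is given below, under stated assumptions: the script body is a stand-in that echoes a line shaped like a bsub success message instead of calling bsub (so it runs without LSF), `ROCOTO_DEMO_VAR` and the job id are made up, and `Tempfile.create` with a block is a substitution for the diff's `Tempfile.new`, chosen here so the file is removed as soon as the block exits.

```ruby
require 'tempfile'

# Stand-in for the generated wrapper: it echoes a line shaped like bsub's
# success message instead of calling bsub, so the sketch runs without LSF.
script = <<~SH
  #!/bin/sh
  export ROCOTO_DEMO_VAR=demo
  echo "Job <12345> is submitted to queue <normal>."
SH

output = nil
Tempfile.create('bsub.wrapper') do |tf|
  tf.write(script)
  tf.flush  # make sure sh sees the complete script on disk
  output = `/bin/sh #{tf.path} 2>&1`.chomp
end         # the temporary file is closed and removed here

# Same pattern as the output check in the diff
jobid = nil
if output =~ /Job <(\d+)> is submitted to (default )*queue/
  jobid = $1
  puts "submitted job #{jobid}"
else
  puts "submission failed: #{output}"
end
```

Since each call gets its own temporary file, concurrent submitting threads do not share or overwrite one another's environment settings.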