Job Description

Antonio S. Cofiño edited this page Jan 10, 2022 · 2 revisions

This section explains how to write a job template, the file used to configure a DRM4G job.

Syntax

<VARIABLE> = ["]<VALUE>["]

Template options

  • NAME: Name of the job (filename of the Job Template by default).
  • EXECUTABLE: The executable file. If the executable is a shell command, you MUST specify its absolute path, for instance /usr/bin/python, /bin/bash or /usr/bin/perl.
  • ARGUMENTS: Arguments to the above executable.
  • NP: The number of processors to be allocated to a job.
  • INPUT_FILES: A comma-separated pair of local/remote filenames. If the remote filename is missing, the local filename will be preserved in the execution host.
  • OUTPUT_FILES: A comma-separated pair of remote/local filenames. If the local filename is missing, the remote filename will be preserved in the client host.
  • STDIN_FILE: Standard input file.
  • STDOUT_FILE: Standard output file. Defaults to stdout.${JOB_ID}.
  • STDERR_FILE: Standard error file. Defaults to stderr.${JOB_ID}.
  • REQUIREMENTS: A boolean expression evaluated for each available host. If it evaluates to true, the host is considered for job submission.
  • ENVIRONMENT: User defined, comma-separated environment variables.
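
To make the `<VARIABLE> = ["]<VALUE>["]` syntax concrete, the following is a minimal Python sketch of how such lines could be read into a dictionary. It is not DRM4G's own parser: the function name `parse_template` is hypothetical, and skipping blank lines and full-line `#` comments is an assumption made for the illustration.

```python
def parse_template(text):
    """Read 'VARIABLE = VALUE' lines into a dict (illustrative only)."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # assumption: blank lines and full-line comments are ignored
        # Split on the first '=' only, so values such as
        # 'HOSTNAME = "*.es"' keep their own '=' sign.
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a VARIABLE = VALUE line: {raw!r}")
        value = value.strip()
        # The surrounding double quotes are optional in the syntax.
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]
        options[name.strip()] = value
    return options


template = """
NAME = My_script
EXECUTABLE = /bin/bash
ARGUMENTS = "-c date"
NP = 1
"""
options = parse_template(template)
print(options["EXECUTABLE"])
print(options["ARGUMENTS"])
```

All values are kept as strings here; a real consumer would still have to interpret options such as NP or REQUIREMENTS.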

Examples

  • How to define and submit a template named template.job to execute an R script:
NAME = My_R_script
EXECUTABLE  = /usr/bin/R -f example.r
STDOUT_FILE = stdout_file.${JOB_ID}
STDERR_FILE = stderr_file.${JOB_ID}
INPUT_FILES = example.r
OUTPUT_FILES = output.data

To submit the job, use the drm4g job submit command:

drm4g job submit template.job

To submit an array job, use the --ntasks option to indicate the number of tasks to run:

drm4g job submit --ntasks 2 task.job

  • How to define a template to execute a Python script:
NAME = My_Python_script
EXECUTABLE  = example.py
STDOUT_FILE = stdout_file.${JOB_ID}
STDERR_FILE = stderr_file.${JOB_ID}
INPUT_FILES = example.py

Alternatively, invoke the Python interpreter explicitly:

NAME = My_Python_script
EXECUTABLE  = /usr/bin/python example.py
STDOUT_FILE = stdout_file.${JOB_ID}
STDERR_FILE = stderr_file.${JOB_ID}
INPUT_FILES = example.py

Requirement Expressions

Syntax

The syntax of the requirement expressions is defined as:

stmt::= expr
expr::= VARIABLE '=' INTEGER
         | VARIABLE '>' INTEGER
         | VARIABLE '<' INTEGER
         | VARIABLE '=' STRING
         | expr '&' expr
         | expr '|' expr
         | '!' expr
         | '(' expr ')'

Each expression evaluates to 1 (True) or 0 (False). Only hosts for which the requirement expression evaluates to True are considered for executing the job.

The comparison operators are '<' (less than), '>' (greater than) and '=' (equal to); the logical operators are '&' (AND), '|' (OR) and '!' (NOT). With integers, '=' tests for equality. With strings, '=' performs shell wildcard pattern matching.
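
The semantics of the operators can be sketched in Python. This is only an illustration, not DRM4G's evaluator: the `compare` helper and the host records are hypothetical, and using the case-sensitive `fnmatch.fnmatchcase` for the string '=' operator is an assumption about how the wildcard match behaves.

```python
from fnmatch import fnmatchcase


def compare(host, variable, op, value):
    """Evaluate one 'VARIABLE op VALUE' term against a host's attributes."""
    actual = host[variable]
    if op == "=" and isinstance(value, str):
        # String equality is a shell wildcard match, e.g. "*.es".
        return fnmatchcase(str(actual), value)
    if op == "=":
        return int(actual) == value
    if op == ">":
        return int(actual) > value
    if op == "<":
        return int(actual) < value
    raise ValueError(f"unsupported operator: {op!r}")


# Hypothetical hosts, described by some of the REQUIREMENTS variables.
hosts = [
    {"HOSTNAME": "node1.example.es", "ARCH": "x86_64", "QUEUE_MAXRUNNINGJOBS": 10},
    {"HOSTNAME": "node2.example.org", "ARCH": "x86_64", "QUEUE_MAXRUNNINGJOBS": 50},
    {"HOSTNAME": "old.example.es", "ARCH": "i686", "QUEUE_MAXRUNNINGJOBS": 5},
]

# Equivalent of: REQUIREMENTS = ARCH = "x86_64" & HOSTNAME = "*.es"
selected = [
    h["HOSTNAME"]
    for h in hosts
    if compare(h, "ARCH", "=", "x86_64") and compare(h, "HOSTNAME", "=", "*.es")
]
print(selected)
```

Here '&', '|' and '!' map onto Python's `and`, `or` and `not`; only the first host satisfies both terms.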

Variables

The variables that can be used in a REQUIREMENTS expression are:

  • HOSTNAME: Execution host (e.g. mycomputer).
  • ARCH: Architecture of the execution host (e.g. i686, x86_64).
  • LRMS_TYPE: Type of local DRM system for execution (e.g. pbs, sge).
  • QUEUE_NAME: Name of the queue (e.g. default, short).
  • QUEUE_MAXTIME: Maximum wall time of jobs in the queue.
  • QUEUE_MAXCPUTIME: Maximum CPU time of jobs in the queue.
  • QUEUE_MAXCOUNT: Maximum count of jobs that can be submitted in one request to the queue.
  • QUEUE_MAXRUNNINGJOBS: Maximum number of running jobs in the queue.
  • QUEUE_MAXJOBSINQUEUE: Maximum number of queued jobs in the queue.

Examples

REQUIREMENTS = LRMS_TYPE = "pbs"                   # Only use pbs
REQUIREMENTS = HOSTNAME = "*.es"                   # Only hosts ending in ".es"
REQUIREMENTS = HOSTNAME = "mycomputer"             # Only use mycomputer
REQUIREMENTS = ARCH = "x86_64"                     # Only hosts with x86_64 architecture
REQUIREMENTS = ARCH = "x86_64" & HOSTNAME = "*.es" # Only x86_64 hosts whose name ends in ".es"

Job Environment Expressions

Job environment variables can be set with the ENVIRONMENT parameter of the Job Template. These environment variables are parsed, so DRM4G variables used in them are substituted.

Note: The variables defined in the ENVIRONMENT are sourced in a bash shell. In this way you can take advantage of the bash substitution capabilities and built-in functions. For example:

ENVIRONMENT = VAR = "`expr ${JOB_ID} + 3`" # will set VAR to JOB_ID + 3 

In addition to the variables set in the ENVIRONMENT parameter, DRM4G sets the following variables, which can be used by your applications:

GW_RESTARTED
GW_EXECUTABLE
GW_HOSTNAME
GW_ARCH
GW_CPU_MHZ
GW_MEM_MB
GW_RESTART_FILES
GW_CPULOAD_THRESHOLD
GW_ARGUMENTS
GW_TASK_ID
GW_CPU_MODEL
GW_ARRAY_ID
GW_TOTAL_TASKS
GW_JOB_ID
GW_OUTPUT_FILES
GW_INPUT_FILES
GW_OS_NAME
GW_USER
GW_DISK_MB
GW_OS_VERSION
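
As an illustration of how an application might use these variables, the Python sketch below lets each task of an array job pick its own share of a work list from GW_TASK_ID and GW_TOTAL_TASKS. The function name `my_share` and the file names are hypothetical, and the sketch assumes that task IDs run from 0 to GW_TOTAL_TASKS - 1, which should be checked against your DRM4G installation.

```python
import os


def my_share(items, environ=os.environ):
    """Return the items this task should process in an array job."""
    # Fall back to a single task when the variables are not set,
    # e.g. when the script is run outside DRM4G.
    total = int(environ.get("GW_TOTAL_TASKS", "1"))
    task = int(environ.get("GW_TASK_ID", "0"))
    # Assumption: task IDs are 0-based (0 .. total - 1).
    return items[task::total]


if __name__ == "__main__":
    files = [f"chunk_{i}.dat" for i in range(6)]  # hypothetical inputs
    for name in my_share(files):
        print("processing", name)
```

With two tasks, the task with ID 0 would take chunk_0, chunk_2 and chunk_4, and the task with ID 1 the rest.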

Syntax

Environment expressions are written as a comma-separated list of VARIABLE = VALUE pairs:

stmt::= VARIABLE = VALUE, VARIABLE = VALUE, ...

LRMS Variables

The following variables, when defined in ENVIRONMENT, are translated into the LRMS job specification:

  • PPN: Number of processors per node requested for the job.
  • CPUTIME: Maximum amount of CPU time used by all processes in the job (HH:MM:SS).
  • WALLTIME: Maximum amount of real time during which the job can be in the running state (HH:MM:SS).
  • MEMORY: Maximum amount of physical memory used by the job (MB).

Examples

ENVIRONMENT = WALLTIME = 00:01:00                # 60 seconds of max walltime
ENVIRONMENT = WALLTIME = 00:01:00, MEMORY = 2000 # 60 seconds of max walltime and 2000 MB (about 2 GB) of memory
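
To check the units in these examples, the following Python sketch splits an ENVIRONMENT value into its pairs and converts the HH:MM:SS walltime into seconds. It is not DRM4G code: `parse_environment` and `hms_to_seconds` are hypothetical helpers, and the naive split assumes that values contain no commas.

```python
def parse_environment(value):
    """Split 'VAR = VALUE, VAR = VALUE' into a dict (naive: no commas in values)."""
    pairs = {}
    for item in value.split(","):
        name, sep, val = item.partition("=")
        if not sep:
            raise ValueError(f"not a VARIABLE = VALUE pair: {item!r}")
        pairs[name.strip()] = val.strip()
    return pairs


def hms_to_seconds(hms):
    """Convert an 'HH:MM:SS' string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


env = parse_environment("WALLTIME = 00:01:00, MEMORY = 2000")
print(hms_to_seconds(env["WALLTIME"]))  # 60 (seconds)
print(int(env["MEMORY"]))               # 2000 (MB)
```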