Add converted shell guidelines from Confluence
cxzk committed Jan 26, 2024
1 parent 21fd2ef commit 49b7a3a
Showing 14 changed files with 374 additions and 10 deletions.
24 changes: 24 additions & 0 deletions shell/foreword.rst
@@ -0,0 +1,24 @@
========
Foreword
========

We have a large collection of (mostly ksh93) shell scripts supporting
the IFS, which have grown somewhat organically to a high level of
complexity, with a variety of different styles and approaches. Although
rules have been suggested from multiple sources over the years, these
have generally been somewhat ad hoc and inconsistently applied, with
varying degrees of applicability to modern systems.

Further, the choice of ksh93 as a "standard" shell is an increasingly
niche one, carrying risks that (a) it may not be readily available on
newer platforms (e.g. in the context of DestinE), and (b) it's likely to
be increasingly unfamiliar to new starters.

Following discussion with stakeholders, what is presented here aims to
be a coherent set of standards and guidance for shell scripts (as we
already have, for example, for Fortran), promoting a consistent, structured,
and modern approach that is applicable across research, development,
testing and operational environments. While they should be sufficient to
start making improvements, it is expected that they will be further
refined and extended over time.
25 changes: 25 additions & 0 deletions shell/guidelines/dependencies.rst
@@ -0,0 +1,25 @@
Available commands and dependencies
-----------------------------------

Most of the time, our scripts currently run on a constrained set of
GNU/Linux-based systems, where the availability of various GNU
extensions (both additional commands and extra options to standard
ones) can be assumed on top of the commands specified by POSIX.

However, there are exceptions - for example, OpenIFS usage, or local
IFS development and testing, on non-GNU or non-Linux platforms (notably
macOS) that are nevertheless Unix-like and POSIX-compliant. (HPC
systems before the Cray were also not necessarily GNU/Linux-based.)

Scripts should therefore avoid GNU-specific commands or extensions
when an equally suitable POSIX command or syntax exists. Where there is
significant benefit, however (in terms of simplicity, clarity,
performance, reliability, etc.), GNU extensions may be used, but the
relevant GNU tools should be clearly documented as dependencies of the
script in case they need to be installed explicitly on non-GNU
platforms.
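
As an illustration of the kind of substitution intended (the file and
pattern names here are purely hypothetical), GNU ``sed -i`` edits a
file in place but is not part of POSIX and behaves differently under
BSD/macOS ``sed``, whereas a portable equivalent writes to a temporary
file and renames it:

::

    # GNU-only: in-place edit (not POSIX; different syntax on BSD/macOS sed)
    # sed -i 's/old/new/' settings.txt

    # POSIX-portable alternative
    sed 's/old/new/' settings.txt > settings.txt.tmp &&
        mv settings.txt.tmp settings.txt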

More broadly, *any* external command (or package of them) that's not
part of the POSIX standard should be documented as a dependency of the
script and/or the package of which it is part.
7 changes: 7 additions & 0 deletions shell/guidelines/ecflow/index.rst
@@ -0,0 +1,7 @@
ecFlow task wrappers
--------------------
.. toctree::
:maxdepth: 1

structure
telemetry
62 changes: 62 additions & 0 deletions shell/guidelines/ecflow/structure.rst
@@ -0,0 +1,62 @@
Structure
~~~~~~~~~

These are not “pure” shell scripts, but are preprocessed first by
ecFlow. In general, they should be simple shell wrappers (ideally
generated by pyFlow, but that is beyond the scope of this document)
around a standalone script (or a short sequence of such scripts with
minimal logic), passing through any ecFlow-variable-derived options via
command-line arguments. Where many such variables are needed, they may
be placed in a configuration file in the task's temporary working
directory that is then passed as a single command-line argument.
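
A minimal sketch of the intended shape is shown below (the script name,
option names and ecFlow variables are illustrative assumptions, and
``head.h`` / ``tail.h`` stand in for whatever standard boilerplate
includes are actually used):

::

    %include <head.h>

    # Pass ecFlow-derived values through as command-line arguments to a
    # standalone script that can also be run and tested outside ecFlow.
    process_obs.sh \
        --cycle  "%CYCLE%" \
        --expver "%EXPVER%" \
        --wdir   "%TASK_WORK_DIR%"

    %include <tail.h>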

Options that come not from ecFlow variables but directly from a
suite-level (or sub-suite component-level) configuration file (e.g.
current ``config.*.h`` or their successor) should be passed by one of
the following, depending on how many need to be supplied:

#. individually as command-line arguments, if only a few are required;
#. via a temporary configuration file as with ecFlow-derived variables,
if there are too many to pass individually;
#. by directly passing the entire higher-level configuration file, if a
large portion of its content is required;
#. by passing a coherent sub-portion of the higher-level configuration
structure, if a hierarchical configuration system is introduced in
future.

In very limited cases, but only where it corresponds to a
well-established interface (e.g. specifying paths to certain tools),
options may be used to set environment variables instead; however, as
these are effectively a form of global variable, this should not be the
norm, and should *never* be done for options that may vary between
components of a suite. Exporting the entire content of ``config.*.h`` to
the environment via ``set -a`` / ``-o allexport`` is strongly deprecated.

In all cases, the interface to the called script should be well
documented, so that the script which actually does the work can be
tested outside of ecFlow. This called script must correctly report any
failure via a non-zero return code to the task wrapper.

In light of the above, *only* the header and footer of the task
wrappers, or the boilerplate code directly ``%include``\ d in them,
should include ecFlow substitutions (``%VAR%``, further ``%include``\ s,
etc.). Called or sourced scripts should not include such syntax, as this
makes them impossible to use or test outside of ecFlow. *If* they are
deployed via a construct like this in the task wrapper:

::

    cat >"$TMPDIR/script" <<\EOF
    %includenopp <script>
    EOF
    chmod +x "$TMPDIR/script"

then this should *always* be done using ``%includenopp`` rather than
``%include`` to prevent such substitutions and the need to "double-up"
real ``%`` characters in the script. Similarly, this should be done with
``<<\`` rather than ``<<`` to prevent shell substitutions during
deployment (which should only happen at runtime). *However, alternative
future mechanisms for script deployment are possible; they are outside
the scope of this document and will be considered alongside the wider
evolution of workflow code for suite generation and deployment.*
39 changes: 39 additions & 0 deletions shell/guidelines/ecflow/telemetry.rst
@@ -0,0 +1,39 @@
ecFlow telemetry and trapping
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It is essential that ecFlow task wrappers call ``ecflow_client --init``
at the start and ``--complete`` on successful completion, and implement
proper trapping to ensure an ``--abort`` on all failures. This should
generally be done by standard “boilerplate” included at the start and
end of the task wrapper, rather than in an ad-hoc way. Trapping should
be set up as early as possible, and special care taken to ensure that
any preceding setup/configuration code will not cause the script to exit
without calling ``ecflow_client --abort``.
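
For illustration only, a minimal sketch of such boilerplate follows;
the real version should come from the standard, centrally maintained
includes, and the exact variables, signal list and abort reason will
differ:

::

    # --- header boilerplate (ecFlow-preprocessed) ---
    export ECF_NAME=%ECF_NAME% ECF_PASS=%ECF_PASS%
    export ECF_HOST=%ECF_HOST% ECF_PORT=%ECF_PORT%
    ecflow_client --init="$$"

    abort() { ecflow_client --abort="trap"; exit 1; }
    trap abort ERR                   # fires when a command fails
    trap abort HUP INT QUIT TERM     # common signals

    # ... task body ...

    # --- footer boilerplate ---
    trap - ERR HUP INT QUIT TERM
    ecflow_client --complete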

It is not normally necessary for scripts *called from* or *sourced from*
the task wrapper to implement trapping themselves, unless required for
their own internal cleanup – if a called script fails or receives a
signal, the task wrapper will see this as a non-zero return code, which
it can either handle or abort on explicitly, or allow “\ ``set -e``\ ”
to invoke the wrapper's traps.

Sourced scripts, all functions in Bash, and POSIX ``()`` functions in
Korn shell inherit the trapping environment in which they are called,
and failures within them behave like failures in the body of the
script. Changes to traps propagate back out to the calling environment,
so the original traps must be carefully restored on all exit paths.
Such scripts and functions should therefore not change the traps unless
this is precisely their intended and documented purpose.

Called scripts, and ``function``-keyword functions in Korn shell, have
their own trapping environment, so they may freely implement local trap
handlers without restoring them on exit. This is not necessary simply
to maintain correct trapping in the task as a whole, because the
calling script will see and handle the non-zero return code as outlined
above. While trapping within a standalone called script may be safe and
useful, local traps should *not* be set within functions unless a
specific exception is agreed, because this is Korn-shell-specific
behaviour: if the script is run in Bash, ``function``-keyword functions
behave the same as POSIX ``()`` functions, and changes to the traps
will propagate back out to the caller, potentially breaking the task
wrapper's trapping.
18 changes: 18 additions & 0 deletions shell/guidelines/google.rst
@@ -0,0 +1,18 @@
Google Shell Style Guide
------------------------

The `Google Shell Style
Guide <https://google.github.io/styleguide/shellguide.html>`__ provides
an existing well-thought-out set of detailed guidelines for shell
scripting (and there are very few others to be found) that we have
chosen to adopt as a foundation for our purposes. **It is recommended to
follow this guide as a baseline. For conciseness, the details are not
reproduced here, but the guide should be read in conjunction with this
document.**

The `ChromiumOS
extensions <https://chromium.googlesource.com/chromiumos/docs/+/HEAD/styleguide/shell.md>`__
to the general Google guidance also offer some useful additions which
should be considered (in particular around shell arithmetic, defaulting
variables, ``printf`` vs ``echo`` and argument parsing with
``getopts``).
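
By way of illustration only, a rough sketch of the flavour of those
points (not a substitute for reading the guides; the variable and
command names are placeholders):

::

    # Arithmetic with $(( ... )) rather than expr or let
    count=$(( count + 1 ))

    # Defaulting a variable if unset or empty
    : "${OUTPUT_DIR:=/tmp}"

    # printf rather than echo for anything beyond trivial output
    printf '%s\n' "${message}"

    # Argument parsing with getopts
    while getopts 'vo:' opt; do
      case "${opt}" in
        v) verbose=1 ;;
        o) output="${OPTARG}" ;;
        *) echo "usage: $0 [-v] [-o output]" >&2; exit 2 ;;
      esac
    done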
19 changes: 19 additions & 0 deletions shell/guidelines/index.rst
@@ -0,0 +1,19 @@
=====================================
Detailed guidelines for shell scripts
=====================================

New scripts should follow these guidelines, except where dependence on
legacy scripts prevents this; legacy scripts will be migrated over a
period of time according to a managed programme.

.. toctree::
:maxdepth: 1
:numbered:

when
which
options
google
ecflow/index
tools
dependencies
48 changes: 48 additions & 0 deletions shell/guidelines/options.rst
@@ -0,0 +1,48 @@
Preferred shell options
-----------------------

The following are proposed as a standard set of options to enable in all
scripts, except where there is a clear and documented need to do
otherwise:

- ``set -e`` / ``-o errexit``: exit immediately when a command fails.
  This reduces the verbosity of error handling, acting as a limited
  form of exceptions rather than C-style checking of every return code.
- ``set -u`` / ``-o nounset``: exit immediately when using an undefined
variable. This helps to catch the use of undefined or mistyped
variables, similar to using "``implicit none``" in Fortran.
- ``set -o pipefail``: consider a pipeline failed if *any* of its
commands fail, rather than ignoring errors in all but the *last*
command.
- ``shopt -s inherit_errexit``: this disables the Bash peculiarity
where "``set -e``" gets reset within ``$(...)`` command
substitutions.

These can be combined for brevity:

::

set -eu -o pipefail
shopt -s inherit_errexit

The following are worthy of consideration in cases where they are
appropriate:

- ``shopt -s extglob``: allow various extended globbing syntaxes.
- ``shopt -s failglob``: exit immediately if a glob matches no
filenames.
- ``shopt -s nullglob``: globs that match no files expand to an empty
string rather than themselves.
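
For example (with an illustrative file pattern), ``nullglob`` avoids
the classic surprise where an unmatched glob is passed through
literally:

::

    shopt -s nullglob
    files=( *.grib )
    if (( ${#files[@]} == 0 )); then
        echo "no GRIB files found" >&2
        exit 1
    fi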

Where verbose tracing of the script's execution is desired,
"``set -x`` / ``-o xtrace``" should be used.

More rarely, e.g. when tracking down hard-to-find parsing errors,
"``set -v`` / ``-o verbose``" may also be useful to output a trace as
the script is *read* rather than as it is executed, but this should not
normally be left on in production scripts.

"``set -a`` / ``set -o allexport``" may be used sparingly and locally,
if a whole list of variables is to be exported, but should *always* be
reset immediately afterwards to avoid polluting the environment with
unnecessary variables from the rest of the script.
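
For example (``exp_config.sh`` being a hypothetical file of plain
``VAR=value`` assignments):

::

    set -a                # export every variable assigned from here on
    . ./exp_config.sh
    set +a                # switch allexport off again immediately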
20 changes: 20 additions & 0 deletions shell/guidelines/principles.rst
@@ -0,0 +1,20 @@
==================
Guiding principles
==================

These indicate the high-level direction of what these guidelines are
intended to promote, with the detail of how to achieve this covered in
the subsequent sections.

#. Use the most suitable tool for the job, without excessive variety.
#. Consistency is more important than any particular stylistic choice.
#. Reusable modular components rather than duplication or repetition.
#. Robustness against run-time failures (both standalone and under
ecFlow).
#. Maximise use of automated check/lint/formatting tools.
#. Avoid either sticking with the past, or adopting the latest thing,
for the sake of it.
#. Make it *easier* to write and debug scripts that are needed, rather
than getting in the way of scientific or technical work.
#. Enable a gradual/piecemeal transition of existing scripts rather than
a "big bang" rewrite of everything at once.
33 changes: 33 additions & 0 deletions shell/guidelines/tools.rst
@@ -0,0 +1,33 @@
Use of shellcheck and shfmt
---------------------------

The "`shellcheck <https://www.shellcheck.net/wiki/Home>`__" tool can
catch many undesirable practices in shell scripts, including those
discouraged in the Google guide. All scripts should pass its validation
with all checks enabled, with warnings *sparingly* disabled per line or
file only where really necessary.
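
For example, a run with the optional checks enabled, and a narrowly
scoped per-line waiver (the check code, script and command names are
illustrative):

::

    shellcheck --enable=all myscript.sh

    # In the script, directly above the offending line:
    # the word splitting of the flags is intentional here
    # shellcheck disable=SC2086
    run_model ${EXTRA_FLAGS}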

The "`shfmt <https://github.com/mvdan/sh>`__" tool can be used to
auto-format shell scripts (much like "black" does for Python) in a
manner consistent with the Google guide, and (unless there are strong
arguments for an exception) all scripts should follow this with an
appropriate set of options (``-i 2 -ci -s``). Editor configuration files
should also be available to support correct formatting as scripts are
written.
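
For example, checking formatting (e.g. in CI) versus rewriting a file
in place, assuming the option set above:

::

    shfmt -i 2 -ci -s -d myscript.sh    # print a diff of required changes
    shfmt -i 2 -ci -s -w myscript.sh    # rewrite the file in place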

There will be a need for some wrapping around these tools: (a) to define
the prescribed set of options to use, (b) to support shell scripts
embedded in ``.ecf`` task wrappers, and (c) to provide appropriate
skeleton ``config.*.h`` files so that the config variables are not
falsely reported as undefined. These wrappers will probably live either
directly in ifs-scripts, or in ifs-git-tools (like "git ifsnorms" for
Fortran).

Attention should be paid during pull request reviews to whether any new
exceptions to the rules enforced by these tools are justified.

At some point, when we have a "clean" set of scripts, we could consider
enforcing these checks via Git commit hooks, but that's a decision for
the future rather than now. In the meantime, we should aim to check for
regressions against them (i.e. new violations that weren't already
there) via the continuous integration system.
33 changes: 33 additions & 0 deletions shell/guidelines/when.rst
@@ -0,0 +1,33 @@
When (not) to use shell
-----------------------

Shell scripting should be primarily considered for individual short
simple scripts, with more structured languages preferred for larger and
more complex code.

Shell is particularly suited to being the top-level wrapper or glue that
sets up the appropriate environment before executing a more complex
program, or a simple sequence of them. (For example, an ecFlow task
wrapper that extracts the relevant parameters from ecFlow variables and
passes them to a complex executable; or something like ``run_parallel``
which sets up the appropriate environment for srun.)
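
A sketch of that division of labour (all names here are purely
illustrative):

::

    #!/bin/bash
    set -eu -o pipefail

    # Environment setup and simple sequencing: shell's strength
    export OMP_NUM_THREADS=8

    # Complex configuration/text processing: delegate to Python
    python3 generate_namelists.py --config "${CONFIG_FILE}" --out "${WORKDIR}"

    # Launch the heavy lifting
    srun ./model_executable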

However, it is much less suitable for complex logic and data/text
processing. Tasks like the generation of one set of configuration files
from another (e.g. generating Fortran namelists based on a suite or
experiment configuration, possibly modulated by knowledge of current
state) can generally be expressed more clearly and robustly by calling
out to a more structured language such as Python. (Conversely, Python is
*not* generally a good choice as the wrapper which forks subprocesses,
even more so when parallelisation is required.)

It is therefore recommended that, while shell may be the appropriate
choice for the outermost layer and simple subtasks, this should delegate
to something else (typically Python, unless it’s handled by a compiled
language) for complex processing and logic.

Separate guidance will also be developed for Python, but in general
Python components need to have good interfaces, test coverage and
documentation to ensure that (as with compiled Fortran code) they are
unlikely to need on-the-fly debugging or internal modification in an
operational context.
30 changes: 30 additions & 0 deletions shell/guidelines/which.rst
@@ -0,0 +1,30 @@
Which shell to use
------------------

While there are a variety of different shells around, each with its own
nice features, no other shell has achieved the ubiquity of Bash. It is
the default login shell on most modern Linux systems, leading to
widespread familiarity, and readily available on almost all other
Unix-like platforms.

Greater portability *can* be achieved by sticking to POSIX-standard
shell syntax, which should behave the same in any POSIX-compatible
shell, but this comes at the cost of an extremely restricted feature set
(lacking arrays, ``[[ ... ]]`` conditional syntax, function-local
variables, etc.). Outside of very simple scripts that may require
portability to exotic or embedded systems which do not support Bash,
this is unlikely to be a good trade-off.
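
The kind of Bash functionality given up under a POSIX-only restriction
includes, for example (variable and function names are illustrative):

::

    # Arrays, with safe iteration over elements containing spaces
    members=( "ocean waves" "sea ice" "atmosphere" )
    for m in "${members[@]}"; do
        printf '%s\n' "${m}"
    done

    # [[ ... ]] conditionals with pattern matching and no word splitting
    if [[ "${filename}" == *.grib ]]; then
        process_grib "${filename}"
    fi

    # Function-local variables
    cleanup() {
        local tmpdir="$1"
        rm -rf -- "${tmpdir}"
    }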

It might be possible to define a supported set of shells, and restrict
to a common subset of their extended functionality and syntax (possibly
with a function library to encapsulate differences). However, it is not
clear *which* other shells ought to be included, and doing so may come
at the price of significant complexity in both design and ongoing
testing. (Such an approach might, however, have a *short-term* role
while new scripts co-exist with legacy ksh ones through common
includes, as discussed in the migration plan.)

**It is therefore recommended that Bash (version 4.4+) is the default
choice for all shell scripts**, except in rare cases where they must run
on esoteric systems where only a POSIX-compatible ``/bin/sh`` can be
assumed to be available.
16 changes: 16 additions & 0 deletions shell/index.rst
@@ -0,0 +1,16 @@
==============================
IFS shell scripting guidelines
==============================

.. toctree::
:maxdepth: 2

foreword
guidelines/index

Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
10 changes: 0 additions & 10 deletions shell/rules/I1.rst

This file was deleted.
