Merge pull request #16 from cxzk/cxzk_add_shell
Add IFS shell scripting guidelines
msleigh authored Feb 2, 2024
2 parents f385042 + 49b7a3a commit 0f7825d
Showing 15 changed files with 550 additions and 10 deletions.
176 changes: 176 additions & 0 deletions shell/conf.py
@@ -0,0 +1,176 @@
# -*- coding: utf-8 -*-
#
# Configuration file for the Sphinx documentation builder.
#
# This file does only contain a selection of the most common options. For a
# full list see the documentation:
# http://www.sphinx-doc.org/en/master/config

# -- Path setup --------------------------------------------------------------

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
# import os
# import sys
# sys.path.insert(0, os.path.abspath('.'))


# -- Project information -----------------------------------------------------

project = 'IFS_shell_guidelines'
copyright = '2023, ECMWF'
author = 'ECMWF'

# The short X.Y version
version = ''
# The full version, including alpha/beta/rc tags
release = ''


# -- General configuration ---------------------------------------------------

# If your documentation needs a minimal Sphinx version, state it here.
#
# needs_sphinx = '1.0'

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
]

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

# The suffix(es) of source filenames.
# You can specify multiple suffix as a list of string:
#
source_suffix = ['.rst', '.md']
# source_suffix = '.rst'

# The master toctree document.
master_doc = 'index'

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = 'en'

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']

# The name of the Pygments (syntax highlighting) style to use.
pygments_style = None


# -- Options for HTML output -------------------------------------------------

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
# html_theme = 'alabaster'
html_theme = 'sphinx_rtd_theme'

# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#
html_theme_options = {
'collapse_navigation': True,
}

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']

# Custom sidebar templates, must be a dictionary that maps document names
# to template names.
#
# The default sidebars (for documents that don't match any pattern) are
# defined by theme itself. Builtin themes are using these templates by
# default: ``['localtoc.html', 'relations.html', 'sourcelink.html',
# 'searchbox.html']``.
#
# html_sidebars = {}


# -- Options for HTMLHelp output ---------------------------------------------

# Output file base name for HTML help builder.
htmlhelp_basename = 'IFS_shell_guidelinesdoc'


# -- Options for LaTeX output ------------------------------------------------

latex_elements = {
# The paper size ('letterpaper' or 'a4paper').
#
# 'papersize': 'letterpaper',

# The font size ('10pt', '11pt' or '12pt').
#
# 'pointsize': '10pt',

# Additional stuff for the LaTeX preamble.
#
# 'preamble': '',

# Latex figure (float) alignment
#
# 'figure_align': 'htbp',
}

# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
# author, documentclass [howto, manual, or own class]).
latex_documents = [
(master_doc, 'IFS_shell_guidelines.tex', 'IFS\\_shell\\_guidelines Documentation',
'ECMWF', 'manual'),
]


# -- Options for manual page output ------------------------------------------

# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
(master_doc, 'ifs_shell_guidelines', 'IFS_shell_guidelines Documentation',
[author], 1)
]


# -- Options for Texinfo output ----------------------------------------------

# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
(master_doc, 'IFS_shell_guidelines', 'IFS_shell_guidelines Documentation',
author, 'IFS_shell_guidelines', 'One line description of project.',
'Miscellaneous'),
]


# -- Options for Epub output -------------------------------------------------

# Bibliographic Dublin Core info.
epub_title = project

# The unique identifier of the text. This can be a ISBN number
# or the project homepage.
#
# epub_identifier = ''

# A unique identification for the text.
#
# epub_uid = ''

# A list of files that should not be packed into the epub file.
epub_exclude_files = ['search.html']
24 changes: 24 additions & 0 deletions shell/foreword.rst
@@ -0,0 +1,24 @@
========
Foreword
========

We have a large collection of (mostly ksh93) shell scripts supporting
the IFS, which have grown somewhat organically to a high level of
complexity, with a variety of different styles and approaches. Although
there have been suggestions from multiple sources over the years of
rules that should be followed, these have generally been somewhat ad-hoc
and inconsistently applied, with various degrees of applicability to
modern systems.

Further, the choice of ksh93 as a "standard" shell is an increasingly
niche one, carrying risks that (a) it may not be readily available on
newer platforms (e.g. in the context of DestinE), and (b) it's likely to
be increasingly unfamiliar to new starters.

Following discussion with stakeholders, what is presented here aims to
be a coherent set of standards and guidance for shell scripts (as we
already have for example for Fortran) promoting a consistent, structured,
and modern approach that is applicable across research, development,
testing and operational environments. While they should be sufficient to
start making improvements, it is expected that they will be further
refined and extended over time.
25 changes: 25 additions & 0 deletions shell/guidelines/dependencies.rst
@@ -0,0 +1,25 @@
Available commands and dependencies
-----------------------------------

Most of the time, our scripts currently run on a constrained set of
GNU/Linux based systems, where the availability of various GNU
extensions can be assumed on top of the standard commands specified by
POSIX (both additional commands and options to standard ones).

However, there are exceptions - for example for OpenIFS usage, or local
IFS development and testing, on other non-GNU or non-Linux platforms
(notably macOS), even if they are Unix-like and POSIX-compliant.
(HPC systems that preceded the Cray were also not necessarily
GNU/Linux-based.)

Scripts should therefore avoid using GNU-specific commands or extensions
when there is an alternative POSIX command or syntax that is equally
suitable. Where there is significant benefit however (in terms of
simplicity, clarity, performance, reliability etc.), GNU extensions may
be used, but the relevant GNU tools should be clearly documented as a
dependency of the script in case they need to be installed explicitly on
non-GNU-based platforms.

More broadly, *any* external command (or package of them) that's not
part of the POSIX standard should be documented as a dependency of the
script and/or the package of which it is part.
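
One way to make such dependencies explicit at runtime, in addition to
documenting them, is to check for them before doing any work. The
following is a minimal sketch, assuming a hypothetical helper named
``require_commands`` (it is not part of any existing IFS script). It
relies only on the POSIX ``command -v`` built-in.

```shell
#!/bin/sh
# Hypothetical helper: report every missing external command named in the
# arguments, and return non-zero if at least one is missing.
require_commands() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            printf 'error: required command not found: %s\n' "$cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A script could then call, for example, ``require_commands gsed || exit 1``
near the top, where ``gsed`` stands in for whichever non-POSIX tool the
script has documented as a dependency.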
7 changes: 7 additions & 0 deletions shell/guidelines/ecflow/index.rst
@@ -0,0 +1,7 @@
ecFlow task wrappers
--------------------
.. toctree::
   :maxdepth: 1

   structure
   telemetry
62 changes: 62 additions & 0 deletions shell/guidelines/ecflow/structure.rst
@@ -0,0 +1,62 @@
Structure
~~~~~~~~~

These are not “pure” shell scripts, but are preprocessed first by
ecFlow. In general, they should be simple shell wrappers (ideally
generated by pyFlow, but that is beyond the scope of this document)
around a standalone script (or a short sequence of such scripts with
minimal logic), passing through any ecFlow-variable-derived options via
command line arguments. Where there are a large number of such
variables, they may be placed in a configuration file in the task's
temporary working directory that is then passed as a single command-line
argument.

Options that come not from ecFlow variables but directly from a
suite-level (or sub-suite component-level) configuration file (e.g.
current ``config.*.h`` or their successor) should be passed by one of
the following, depending on how many need to be supplied:

#. individually as command-line arguments, if only a few are required;
#. via a temporary configuration file as with ecFlow-derived variables,
   if there are too many to pass individually;
#. by directly passing the entire higher-level configuration file, if a
   large portion of its content is required;
#. by passing a coherent sub-portion of the higher-level configuration
   structure, if a hierarchical configuration system is introduced in
   future.

In very limited cases, but only where it corresponds to a
well-established interface (e.g. specifying paths to certain tools),
options may be used to set environment variables instead; however as
these are effectively a form of global variable this should not be the
norm, and should *never* be done for options that may vary between
components of a suite. Exporting the entire content of ``config.*.h`` to
the environment via ``set -a`` (``set -o allexport``) is strongly deprecated.

In all cases, the interface to the called script should be well
documented, so that the script which actually does the work can be
tested outside of ecFlow. This called script must correctly report any
failure via a non-zero return code to the task wrapper.
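
The interface of such a called script might look like the sketch below.
It is illustrative only: the option letters ``-c`` and ``-n``, and the
function names ``usage`` and ``main``, are assumptions rather than an
agreed convention. Every input arrives on the command line, and every
failure path returns a non-zero status.

```shell
#!/bin/sh
# Hypothetical standalone script: all inputs come from the command line,
# so it can be run and tested without ecFlow.

usage() {
    printf 'usage: %s -c CONFIG_FILE [-n STEPS]\n' "${0##*/}" >&2
}

main() {
    config=''
    steps=1
    OPTIND=1
    while getopts 'c:n:' opt; do
        case "$opt" in
            c) config=$OPTARG ;;
            n) steps=$OPTARG ;;
            *) usage; return 2 ;;
        esac
    done

    # Fail loudly, with a non-zero status, if the input is unusable.
    if [ -z "$config" ] || [ ! -r "$config" ]; then
        printf 'error: configuration file missing or unreadable: %s\n' "$config" >&2
        return 1
    fi

    printf 'running %s step(s) with %s\n' "$steps" "$config"
}
```

In a real script the last line would be ``main "$@"``, so that the
script's exit status is that of ``main`` and the task wrapper sees any
failure.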

In light of the above, *only* the header and footer of the task
wrappers, or the boilerplate code directly ``%include``\ d in them,
should include ecFlow substitutions (``%VAR%``, further ``%include``\ s
etc.). Called or sourced scripts should not include such syntax, as this
makes them impossible to use or test outside of ecFlow. *If* they are
deployed via a construct like this in the task wrapper:

::

   cat >"$TMPDIR/script" <<\EOF
   %includenopp <script>
   EOF
   chmod +x "$TMPDIR/script"

then this should *always* be done using ``%includenopp`` rather than
``%include`` to prevent such substitutions and the need to "double-up"
real ``%`` characters in the script. Similarly, the here-document
delimiter should be quoted (``<<\EOF`` rather than ``<<EOF``) to prevent
shell substitutions during deployment (which should only happen at
runtime). *Alternative mechanisms for script deployment may be introduced
in future; these are outside the scope of this document and will be
considered alongside the wider evolution of workflow code for suite
generation and deployment.*
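
The effect of quoting the here-document delimiter can be seen without
ecFlow. In this small sketch the variable ``stage`` and the two function
names are purely illustrative. Both functions emit the same source line,
but only the unquoted form substitutes the variable at the moment the
text is written out.

```shell
#!/bin/sh
# Illustrative only: compare a quoted and an unquoted here-document delimiter.
stage='deploy time'

# Quoted delimiter: the body is copied verbatim, so "$stage" is still a
# variable reference when the deployed script later runs.
emit_quoted() {
    cat <<\EOF
echo "expanded at: $stage"
EOF
}

# Unquoted delimiter: the deploying shell expands "$stage" immediately.
emit_unquoted() {
    cat <<EOF
echo "expanded at: $stage"
EOF
}
```

Calling ``emit_quoted`` prints the line with ``$stage`` intact, whereas
``emit_unquoted`` prints it with ``deploy time`` already filled in.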
39 changes: 39 additions & 0 deletions shell/guidelines/ecflow/telemetry.rst
@@ -0,0 +1,39 @@
ecFlow telemetry and trapping
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It is essential that ecFlow task wrappers do an ``ecflow_client --init``
at the start, ``--complete`` on successful completion, and implement
proper trapping to ensure an ``--abort`` on all failures. This should
generally be done by standard “boilerplate” included at the start and
end of the task wrapper, rather than in an ad-hoc way. Trapping should
be set up as early as possible, and special care taken to ensure that
any preceding setup/configuration code will not cause the script to exit
without calling ``ecflow_client --abort``.
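
As a rough illustration of what such boilerplate has to achieve, the
sketch below wraps the three ``ecflow_client`` calls in functions. The
names ``task_init``, ``task_complete`` and ``task_abort`` are
hypothetical, the choice of signals is an assumption, and the real
boilerplate (including the environment that ``ecflow_client`` needs in
order to reach the server) will differ.

```shell
#!/bin/sh
# Hypothetical wrapper boilerplate; function names are illustrative only.

# Failure path: tell ecFlow the task aborted, then exit non-zero.
task_abort() {
    rc=$?
    set +e                          # keep going even if reporting fails
    trap - EXIT HUP INT TERM        # avoid re-entering this handler
    [ "$rc" -ne 0 ] || rc=1         # an abort must never look like success
    ecflow_client --abort="trap (status $rc)"
    exit "$rc"
}

# Header: install the traps first, then register with ecFlow.
task_init() {
    set -eu
    trap task_abort EXIT HUP INT TERM
    ecflow_client --init="$$"
}

# Footer: clear the traps only once the work has succeeded.
task_complete() {
    trap - EXIT HUP INT TERM
    ecflow_client --complete
}
```

Because the ``EXIT`` trap is installed before anything else runs, any
early exit between ``task_init`` and ``task_complete`` ends in
``task_abort`` rather than leaving the task without a final status.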

It is not normally necessary for scripts *called from* or *sourced from*
the task wrapper to implement trapping themselves, unless required for
their own internal cleanup – if a called script fails or receives a
signal, the task wrapper will see this as a non-zero return code (which
it can handle or abort on explicitly or via “\ ``set -e``\ ” invoking
the wrapper's traps).

Sourced scripts, all functions in Bash, and POSIX ``()`` functions in
Korn shell, will inherit the trapping environment in which they are
called, and failures within them will behave similarly to those in the
body of the script. Changes to traps will propagate back out to the
calling environment, so must be carefully restored on all exit paths.
Such scripts and functions should therefore not change the traps unless
this is precisely their intended and documented purpose.

Called scripts, and ``function``-keyword functions in Korn shell, have
their own trapping environment, so may freely implement their own local
trap handlers without explicitly restoring them on exit, although this
is not necessary simply to maintain correct trapping in the task as a
whole, because the calling script will see and handle the non-zero
return code as outlined above. While trapping within a standalone called
script may be safe and useful, local traps should *not* be set within
functions unless a specific exception is agreed, because this is a
Korn-shell-specific behaviour – if the script is run in Bash, they will
behave the same as POSIX ``()`` functions and changes to the trap will
propagate back out to the caller, potentially breaking the task
wrapper's trapping.
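
The propagation problem can be reproduced in a few lines. In this
sketch, ``careless_helper`` and ``run_task`` are hypothetical names; the
body of ``run_task`` is a subshell so that the demonstration leaves the
caller's own traps untouched.

```shell
#!/bin/sh
# Illustrative only: a function that sets its own EXIT trap replaces the
# handler installed by its caller.
careless_helper() {
    trap 'echo "helper cleanup"' EXIT
}

run_task() (
    trap 'echo "wrapper abort handler"' EXIT
    careless_helper
    exit 1
)
```

Under Bash, calling ``run_task`` prints only ``helper cleanup``: the
wrapper's handler has been silently replaced, so the failure is never
reported through it.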
18 changes: 18 additions & 0 deletions shell/guidelines/google.rst
@@ -0,0 +1,18 @@
Google Shell Style Guide
------------------------

The `Google Shell Style
Guide <https://google.github.io/styleguide/shellguide.html>`__ provides
an existing well-thought-out set of detailed guidelines for shell
scripting (and there are very few others to be found) that we have
chosen to adopt as a foundation for our purposes. **It is recommended to
follow this guide as a baseline. For conciseness, the details are not
reproduced here but the guide should be read in conjunction with this
document.**

The `ChromiumOS
extensions <https://chromium.googlesource.com/chromiumos/docs/+/HEAD/styleguide/shell.md>`__
to the general Google guidance also offer some useful additions which
should be considered (in particular around shell arithmetic, defaulting
variables, ``printf`` vs ``echo`` and argument parsing with
``getopts``).
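
For orientation, the short sketch below shows the kind of construct
those topics concern. It is not quoted from either guide, and the
variable and function names are hypothetical.

```shell
#!/bin/sh
# Illustrative only; names are hypothetical.

# Default a variable only if the caller has not already set it.
: "${MAX_RETRIES:=3}"

# Shell arithmetic with $(( )) instead of an external tool such as expr.
next_attempt() {
    printf '%s\n' "$(( $1 + 1 ))"
}

# printf with a fixed format prints its argument verbatim, whereas echo
# may treat a leading "-n" or a backslash sequence specially.
log_info() {
    printf 'INFO: %s\n' "$*"
}
```

For instance, ``log_info '-n hello\tworld'`` prints the text exactly as
given, which ``echo`` does not guarantee across shells.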
19 changes: 19 additions & 0 deletions shell/guidelines/index.rst
@@ -0,0 +1,19 @@
=====================================
Detailed guidelines for shell scripts
=====================================

New scripts should follow these guidelines, except where dependence on
legacy scripts prevents this; legacy scripts will be migrated over a
period of time according to a managed programme.

.. toctree::
   :maxdepth: 1
   :numbered:

   when
   which
   options
   google
   ecflow/index
   tools
   dependencies