forked from oliver-sanders/cylc-tutorial
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathcylc-overview-apx.tex
61 lines (52 loc) · 3.27 KB
/
cylc-overview-apx.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
\section{Appendix: Cylc Overview}
\label{Appendix Cylc Overview}
\note{This Appendix contains extra explanatory material for Section~\ref{Cylc
Overview}.}
\terminology{A \underline{task} represents a \underline{job} (a script or
program) that runs on a computer.} We make this distinction because a {\em job}
exists only when it runs, but its representation in a cylc suite has a longer
lifetime. For instance, a ``waiting'' task represents a job that will run
sometime in the future because its inputs have not been satisfied yet.
\terminology{A \underline{cycle point} is a point on an integer or date-time
sequence.} Note that a date-time cycle point has no connection to real
time unless you attach a {\em clock trigger} to a task. Clock-triggers
say that in addition to any dependence on other tasks, a task cannot trigger
unless the wall-clock time is greater than or equal to its cycle point, or some
offset from that point. See Section~\ref{Clock Triggered Tasks}.
\terminology{A \underline{continuous workflow of cycling tasks} is a single
workflow composed of cycling tasks, not just a series of separate single-cycle
workflows, and it may extend indefinitely into the future.} This describes
cylc's classic {\em iterative cycling} mode in which the scheduler continually
extends the workflow to future cycle points as the suite runs, and each
instance of a repeated job is represented by the same logical task with a
different cycle point. {\em Parameterized cycling} is an alternative to this
for workflows that are {\em finite in extent and not too large}: each instance
of a repeated job is represented by a different logical task with a different
task name, and the entire workflow is mapped out at start-up rather than
extended as the suite runs. Parameterized cycling is generally not as flexible
or powerful as iterative cycling, and it has several significant disadvantages,
but it can sometimes be useful - see the cylc user guide for more information.
The Section~\ref{Cylc Overview} example workflow is repeated here:
\begin{center}
\includegraphics[width=0.4\columnwidth]{resources/tex/dep-two-cycles-linked}
\end{center}
Here's a job schedule for a single cycle of the workflow,
\begin{center}
\includegraphics[width=0.17\columnwidth]{resources/tex/timeline-zero.png}
\end{center}
Bar width is proportional to job run time, and the vertical axis has no
meaning. So {\em b} and {\em c} start running at the same time,
immediately after {\em a} finishes, and so on. The job schedule for repeatedly
cycling the same workflow is shown below, for cylc (top) and a traditional
fixed-cycle scheduler (bottom)
\begin{center}
\includegraphics[width=0.5\columnwidth]{resources/tex/timeline-two}
\end{center}
The different colours represent different cycle points. Cylc automatically
interleave cycles for faster scheduling throughput. In this case cylc is
running tasks from four different cycle points at once, most of the time. The
optimal result is shown (the white time-gaps between tasks are are required by
the dependency relationshionships). This assumes sufficient compute resource to
run every task as soon as its inputs are satisified. But if a task is delayed
(in a batch scheduler queue or otherwise) the system organically adapts, and
the rest of the workflow will carry on as dependencies allow.