\label{sec:introduction}
Computational science and engineering communities develop
complex applications to solve scientific and engineering challenges,
but these communities have a mixed record of using software
engineering best practices
\cite{hannay2009,Nguyen-Hoan}. Many codes developed by
scientific communities adopt standard software practices when the size
and complexity of an application become too unwieldy to continue
without them \cite{cc2012}. The driving force behind adoption is
usually the realization that without using software engineering
practices, the development, verification, and maintenance of
applications can become intractable. As more codes cross the threshold
into increasing complexity, software engineering processes are being
adopted from practices derived outside the scientific and
engineering domain. Yet the state of the art for software engineering
practices in scientific codes often lags behind that in the commercial
software space \cite{basili2008understanding, hochstein2008asc,segal2008developing}.
There are many reasons: lack of incentives, support, and
funding; a reward system favoring scientific results over software
development; and limited understanding of how software engineering
should be promoted to communities that have their own
specific needs and sociology \cite{carver2007software,Heroux2009}.

Some software engineering practices have been better accepted than others
among the developers of scientific codes. Commonly used practices
include repositories for code version control,
licensing processes, regular testing, documentation, release and distribution
policies, and contribution policies \cite{carver2012software, carver2007software,
cc2012, Dubey2014}. Less accepted practices include code review,
code deprecation, and adoption of specific practices from development
methodologies such as Agile \cite{agile}. Software best practices that
may be effective in commercial software development environments are
not always suited for scientific environments, partly for
sociological reasons and partly because of technical
challenges. The sociological resistance shows up as suspicion of an overly rigid
process or as a failure to see the point of adopting a practice. The
technical challenges arise from the nature of problems being addressed by
these codes. For example, multiphysics and multicomponent
codes that run on large high-performance computing
(HPC) platforms place a high premium on performance. In our
experience, good performance is most often achieved by sacrificing
some of the modularity in software architecture
(e.g., \cite{Dubey1999}). Similarly, lateral interactions in physics get
in the way of encapsulation (see Sections \ref{sec:domain-challenges}
and \ref{sec:institutional-challenges} for more examples and details).

This chapter elaborates on the challenges and how they were
addressed in FLASH \cite{Dubey2009, Fryxell2000} and Amanzi
\cite{moulton2011}, two codes with very
different development timeframes, and therefore very different
development paths. FLASH, whose development began in the late 1990s,
is among the first generation of codes that
adopted a software process. This was an era when the advantages of
software engineering were almost unknown in the scientific
world. Amanzi is from the ``enlightened'' era (by scientific software
standards), in which a minimal set of software practices is adopted by
most code projects intended for long-term use. A study of the software
engineering of these codes from different eras of scientific software
development highlights how these practices and the communities have
evolved.

FLASH was originally designed for computational
astrophysics. It has been in production and under development almost
continuously since 2000, with three major
revisions. It has exploited an extensible framework to expand its
reach and is now a community code for over half a dozen scientific
communities. The adoption of software engineering practices has
grown with each version change and expansion of capabilities. The
adopted practices themselves have evolved to meet the needs of the
developers at different stages of development. Amanzi, on the other
hand, started in 2012 and has been developed
from the ground up in C++ using relatively modern software engineering
practices. It still has one major target community but is also
designed with extensibility as an objective. Many other
similarities and some differences are described later in the chapter.
In particular, we address the issues related to software
architecture and modularization, design of a testing regime,
unique documentation needs and challenges, and the tension between intellectual property
management and open science.

The next few sections outline the challenges that
are either unique to scientific software or more dominant there
than elsewhere. Section \ref{sec:lifecycle} outlines the possible
lifecycle of a scientific code, followed by domain-specific
technical challenges in Section \ref{sec:domain-challenges}. Section
\ref{sec:institutional-challenges} describes the
technical and sociological challenges posed by the institutions
where such codes are usually developed. Section
\ref{sec:case-studies} presents case studies of FLASH and Amanzi
development. Sections \ref{sec:generalizations} and \ref{sec:future}
present general observations and additional considerations for
adapting the codes for the more challenging platforms expected in the
future.