eqtools_paper.tex

%\documentclass[12pt,floatfix,showpacs]{revtex4-1}
\documentclass{elsarticle}

\usepackage{amssymb,amsmath,amsfonts}
\usepackage{graphicx}
\usepackage{subfig}
\usepackage{epstopdf}
\usepackage{url}

\newcommand{\eg}{\emph{e.g., }}
\newcommand{\ie}{\emph{i.e., }}
\newcommand{\etal}{\emph{et al.}}
\newcommand{\diff}{\ensuremath\mathrm{d}}

% remove these for final publication 
\usepackage{color} 
\newcommand{\note}[1]{\textcolor{red}{#1}}
\newcommand{\gnote}[1]{\marginpar{\scriptsize\textcolor{red}{#1}}}

\graphicspath{{./graphics/}}

% Macro to consistently format the package name:
\usepackage{xspace}
\newcommand{\eqtools}{\texttt{eqtools}\xspace}
\newcommand{\profiletools}{\texttt{profiletools}\xspace}
\newcommand{\TRIPPy}{\texttt{TRIPPy}\xspace}
\newcommand{\gptools}{\texttt{gptools}\xspace}

\usepackage{siunitx}

%% This list environment is used for the references in the
%% Program Summary
%%
\newcounter{bla}
\newenvironment{refnummer}{%
\list{[\arabic{bla}]}%
{\usecounter{bla}%
 \setlength{\itemindent}{0pt}%
 \setlength{\topsep}{0pt}%
 \setlength{\itemsep}{0pt}%
 \setlength{\labelsep}{2pt}%
 \setlength{\listparindent}{0pt}%
 \settowidth{\labelwidth}{[9]}%
 \setlength{\leftmargin}{\labelwidth}%
 \addtolength{\leftmargin}{\labelsep}%
 \setlength{\rightmargin}{0pt}}}
 {\endlist}

\journal{Computer Physics Communications}

\begin{document}

%\begin{frontmatter}

\title{\eqtools: Modular, Extensible, Open-Source, Cross-Machine Python Tools for Working with Magnetic Equilibria}

\author[PSFC]{M. A. Chilenski\corref{cor1}}
\ead{markchil@mit.edu}

\author[PSFC]{I. C. Faust}
\ead{faustian@mit.edu}

\author[PSFC,cinch]{J. R. Walk}
\ead{jrwalk@mit.edu}

\cortext[cor1]{Corresponding author}

\address[PSFC]{Plasma Science and Fusion Center, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA, 02139, USA}
\address[cinch]{Cinch Financial, 24 School St, Boston, MA, 02108, USA}

\date{\today}

\begin{abstract}
As plasma physics research for fusion energy transitions to an increasing emphasis on cross-machine collaboration and numerical simulation, it becomes increasingly important that portable tools be developed to enable data from diverse sources to be analyzed in a consistent manner. 
This paper presents \eqtools, a modular, extensible, open-source toolkit implemented in the Python programming language for handling magnetic equilibria and associated data from tokamaks. 
\eqtools provides a single interface for working with magnetic equilibrium data, both for handling derived quantities and mapping between coordinate systems, extensible to function with data from different experiments, data formats, and magnetic reconstruction codes, replacing the diverse, non-portable solutions currently in use.  
Moreover, while the open-source Python programming language offers a number of advantages as a scripting language for research purposes, the lack of basic tokamak-specific functionality has impeded the adoption of the language for regular use.
Implementing equilibrium-mapping tools in Python removes a substantial barrier to new development in and porting legacy code into Python.
In this paper, we introduce the design of the \eqtools package and detail the workflow for usage and expansion to additional devices.  The implementation of a novel three-dimensional spline solution (in two spatial dimensions and in time) is also detailed.  
Finally, verification and benchmarking for accuracy and speed against existing tools are detailed. Wider deployment of these tools will enable efficient sharing of data and software between institutions and machines as well as self-consistent analysis of the shared data.
\end{abstract}

\begin{keyword}
plasma physics \sep tokamaks \sep magnetic equilibrium reconstruction \sep data analysis
\PACS 52.55.Fa \sep 07.05.Kf \sep 52.70.Ds
\end{keyword}

\maketitle

%\end{frontmatter}

% Computer program descriptions should contain the following
% PROGRAM SUMMARY.

{\bf PROGRAM SUMMARY}
  %Delete as appropriate.

\begin{small}
\noindent
{\em Manuscript Title:}                                       \eqtools: Modular, Extensible, Open-Source, Cross-Machine Python Tools for Working with Magnetic Equilibria\\
{\em Authors:}                                                M. A. Chilenski, I. C. Faust, J. R. Walk\\
{\em Program Title:}                                          \eqtools\\
{\em Journal Reference:}                                      \\
  %Leave blank, supplied by Elsevier.
{\em Catalogue identifier:}                                   \\
  %Leave blank, supplied by Elsevier.
{\em Licensing provisions:}                                   GNU General Public License Version 3\\
  %enter "none" if CPC non-profit use license is sufficient.
{\em Programming language:}                                   Python, C\\
{\em Computer:}                                               PCs\\
  %Computer(s) for which program has been designed.
{\em Operating system:}                                       Linux, Macintosh OS X, Microsoft Windows\\
  %Operating system(s) for which program has been designed.
{\em RAM:}                                              Several megabytes, depends on resolution of data\\
  %RAM in bytes required to execute program with typical data.
{\em Number of processors used:}                              1\\
  %If more than one processor.
{\em Keywords:} plasma physics, tokamaks, magnetic equilibrium reconstruction, data analysis  \\
  % Please give some freely chosen keywords that we can use in a
  % cumulative keyword index.
{\em Classification:}                 19.4                        \\
  %Classify using CPC Program Library Subject Index, see (
  % http://cpc.cs.qub.ac.uk/subjectIndex/SUBJECT_index.html)
  %e.g. 4.4 Feynman diagrams, 5 Computer Algebra.
{\em External routines/libraries:}             F2PY [1], matplotlib [2], MDSplus [3], NumPy [4], SciPy~[5]              \\
  % Fill in if necessary, otherwise leave out.
%{\em Subprograms used:}                                       \\
  %Fill in if necessary, otherwise leave out.
%{\em Catalogue identifier of previous version:}*              \\
  %Only required for a New Version summary, otherwise leave out.
%{\em Journal reference of previous version:}*                  \\
  %Only required for a New Version summary, otherwise leave out.
%{\em Does the new version supersede the previous version?:}*   \\
  %Only required for a New Version summary, otherwise leave out.
{\em Nature of problem:} Access to results from magnetic equilibrium reconstruction code, conversion between various coordinate systems tied to the magnetic equilibrium.\\
{\em Solution method:} Data are stored in an object-oriented data structure with human-readable getter methods. Coordinates are converted using bivariate or trivariate splines.\\
%{\em Reasons for the new version:}*\\
  %Only required for a New Version summary, otherwise leave out.
%   \\
%{\em Summary of revisions:}*\\
  %Only required for a New Version summary, otherwise leave out.
%   \\
%{\em Restrictions:}\\
  %Describe any restrictions on the complexity of the problem here.
%   \\
%{\em Unusual features:}\\
  %Describe any unusual features of the program/problem here.
%   \\
%{\em Additional comments:}\\
  %Provide any additional comments here.
%   \\
{\em Running time:} Coordinate transformations on a $66\times 66$ point spatial grid take between 1 and 5 milliseconds per time slice, depending on the transformation used and how many intermediate results have been stored.

\begin{thebibliography}{0}
\bibitem{1} P.~Peterson,
  International Journal of Computational Science \& Engineering 4~(4) (2009)
  296--305.
\bibitem{2} J. D. Hunter, Computing in Science and Engineering, 9~(3) (2007) 90--95.
\bibitem{3} J.~A. Stillerman, T.~W. Fredian, K.~A. Klare, G.~Manduchi, Review of Scientific Instruments 68~(1) (1997) 939--942.
\bibitem{4} S.~van der Walt, S. C. Colbert and G.~Varoquaux, Computing in Science \& Engineering 13~(2) (2011) 22--30.
\bibitem{5} E.~Jones, T.~Oliphant, P.~Peterson, et~al.,
  {{S}ci{P}y: Open source scientific tools for
  {P}ython} (2001--).
                               % This list is different from the bibliography at the end of 
                               % the Long Write-Up.
\end{thebibliography}
\end{small}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Introduction}\label{sec:intro}

The basic computational tasks associated with magnetic equilibrium reconstructions -- namely, the handling of derived quantities (\eg calculated plasma current, safety-factor profile) and the mapping between real-space and flux coordinate systems for experimental data -- are universal among tokamak experiments.  
Despite this commonality, experiments typically utilize locally-developed solutions developed for the particulars of that experiment's data storage and usage.  
This ad-hoc development of base-level functionality inhibits the mobility of higher-level codes to other devices (as the code may require substantial modification to address the particulars of the new data storage design).  
Moreover, such development is often quite static, such that the implementation is difficult to extend to new data formats (\eg to handle both a primary MDSplus-based data storage system \cite{MDSplus} and the \emph{eqdsk} storage files produced directly by the EFIT reconstruction code \cite{Lao1985}), necessitating parallel workflows for functionally identical tasks depending on the data source.  
Best design practices call for the vagaries of data source and storage implementation to be placed in the back end, presenting a consistent, straightforward interface common between machines and code implementations to the research scientist.

The new \eqtools package provides a modular, extensible, cross-machine toolkit developed in the Python programming language for handling magnetic equilibrium data.  
The \eqtools package provides a consistent and straightforward interface to the researcher for both coordinate mapping routines (historically handled in separate standalone routines) and access to derived quantities (often handled with manual hooks into data storage).  
Moreover, \eqtools is constructed with a modular, object-oriented design, such that the package is easily extensible to handle data from different experiments and reconstruction codes, giving the researcher a single unified interface for data from any machine or code.  
The implementation of reconstructed-equilibrium handling in the Python language removes a substantial barrier to the adoption of Python as a day-to-day analysis language for tokamak research, which offers numerous advantages in ease of use, computational speed, user/developer base, and free and open-source implementations compared to current common working languages for fusion research.  The \eqtools package is open-source and licensed under the GNU General Public License \cite{GPL}, and distributed via PyPI \cite{eqtools_pypi,eqtools_git,eqtools_readthedocs}.

This paper details the design and implementation of \eqtools, particularly the paradigm for extension to new machines (section \ref{sec:design}), describes the basic algorithms used to convert between coordinate systems (section \ref{sec:algo}), describes the implementation of a novel trivariate spline method for improved interpolation in the time dimension (section \ref{sec:trispline}), presents runtime and accuracy benchmarks against the current IDL\footnote{IDL is a registered trademark of Exelis Visual Information Solutions.} implementation at Alcator C-Mod (section \ref{sec:benchmark}), and shows examples of the code in use (section \ref{sec:examples}).

\section{Package Design and Use}\label{sec:design}

The \eqtools package is designed to present a consistent, human-readable interface to the user for both data handling of derived quantities and mapping routines between coordinate systems, all contained within a single persistent object (herein referenced in code snippets as \verb|eq|).
%For example, consider how the poloidal flux on-axis would be accessed using the current system on Alcator C-Mod using the MDSobjects \cite{MDSobjects} Python interface:
%\begin{verbatim}
%import MDSplus
%t = MDSplus.Tree('analysis', shotnum)
%n = t.getNode('efit.results.a_eqdsk:simagx')
%psi0 = n.data()
%\end{verbatim}
%And compare to the approach to access the same data using \eqtools:
%\begin{verbatim}
%import eqtools
%eq = eqtools.CModEFITTree(shotnum)
%psi0 = eq.getFluxAxis()
%\end{verbatim}
%Notice that the \eqtools approach requires fewer lines and that the cryptic six letter EFIT tag name ``simagx'' is replaced with the more descriptive ``getFluxAxis.''
For example, accessing the calculated value for the poloidal flux on-axis, $\psi_0$, is simply \verb|eq.getFluxAxis()| -- this replaces arrangements that may involve cumbersome manual hooks into the data system and interaction with opaque internal EFIT variable names.
Similarly, mapping routines are handled internally in the \verb|eq| object -- mapping from $(R, Z)$ coordinates to normalized poloidal flux $\psi_{\text{n}}$, for example, is simply \verb|eq.rz2psinorm(R,Z,t,...)|.  
One notable consequence of this object-oriented approach is that intermediate spline calculations for the mapping routines are stored in the persistent object -- compared to typical standalone mapping routines (which must start from scratch for each function call) this allows substantial speed gains for subsequent mapping calculations with the same equilibrium object, as is illustrated in section~\ref{sec:benchmark}.

The \eqtools package is structured such that the user-facing methods (for data access and coordinate mapping) are consistent for different machines or reconstruction codes.
Moreover, adding support for new machines or codes requires minimal new code development.  
The package inheritance structure is depicted in figure \ref{fig:flowchart}.  
The base level of the inheritance structure (shown in blue in figure \ref{fig:flowchart}) provides the abstract class \verb|Equilibrium| -- this defines the prototypical structure for the data (ensuring that child classes use a consistent format) and defines the mapping routines, which are inherited and used by each child class.  
The methods therein are sufficiently general to be used by any grid-based equilibrium reconstruction.

\begin{figure}[p]
 \includegraphics[width=\textwidth]{graphics/flowchart_new.pdf}
 \caption{Package structure for the \eqtools package.  The base abstract class (blue) provides a prototypical structure for the derived-data handlers, as well as providing the complete set of coordinate-mapping routines.  Intermediate abstract classes (yellow) prescribe the handling for data storage systems and codes -- at present the relatively-ubiquitous EFIT reconstruction stored in MDSplus tree structures is provided.  User-facing classes (green) handle the details of machine-specific implementations.  Dashed lines denote classes that have not yet been implemented, but which can be introduced into the package in a straightforward manner due to its modular construction.  The user interface is consistent, as it is provided by the parent classes -- code migration between machines requires only changing which child class is called for the reconstruction.  The \texttt{EqdskReader} class, which directly handles \emph{eqdsk} text files from EFIT, inherits directly from the base class as it is sufficiently unique to not warrant an intermediate abstract class.  Outside the \texttt{Equilibrium} inheritance structure, the package also provides the \texttt{PFileReader} class to handle the ``p-file'' plasma profile data associated with EFIT, as well as the \texttt{filewriter} module to produce portable g-files from \texttt{Equilibrium} objects.}
 \label{fig:flowchart}
\end{figure}

An intermediate inheritance level (shown in yellow in figure \ref{fig:flowchart}) provides support for specific codes and data systems (while still maintaining cross-machine generality) -- for example, the \verb|EFITTree| class provides methods for EFIT reconstructions stored in MDSplus tree structures (as is used on C-Mod, NSTX, and DIII-D).  
This minimizes the need for repeated code in machine-specific versions of common code implementations.  
The user-facing inheritance level (shown in green in figure \ref{fig:flowchart}) finalizes the specific details of implementation for a given reconstruction code, data-storage methodology, and machine implementation (\eg \verb|CModEFITTree|, \verb|NSTXEFITTree|, \verb|D3DEFITTree|).  
These user-facing implementations typically require relatively little code (on the order of 100 lines in the case of \verb|CModEFITTree| and \verb|NSTXEFITTree|), and extension modules to the code can be quickly developed in most cases.

In addition to the already-developed modules inheriting \verb|EFITTree|, the code currently contains the \verb|EqdskReader| module, which directly interfaces with the \emph{eqdsk} text files (specifically, the ``g-file'' and ``a-file'' containers, which store equilibrium and scalar derived quantities, respectively) generated by EFIT.  
This allows a single unified interface for both the MDSplus-based and portable text file data storage common to US experiments. 
Due to the unique structure of the code necessary to read text file data (which is done directly in \verb|EqdskReader|), \verb|EqdskReader| directly inherits mapping routines and structure from \verb|Equilibrium|, without an intermediate stage.
%However, as new text-file-based storage methods are implemented in \eqtools sufficient commonality may be found to necessitate the creation of an intermediate abstraction level for generalized text file storage systems in subsequent versions of the \eqtools package.

In addition to the \verb|EqdskReader| module, which handles equilibrium and derived-quantity data from g- and a-file outputs from EFIT, the \eqtools package provides a \verb|PFileReader| object to handle plasma-profile data from the ``p-files'' associated with EFIT.  
While this does not require access to the reconstructed equilibrium (and thus is separate from the \verb|Equilibrium| inheritance structure) profile data is commonly paired with the reconstructed equilibrium -- as such, the \eqtools package provides a built-in methodology to handle the additional data.  
The \eqtools package also contains a package-level method, \verb|filewriter|, to produce g-files from classes in the \verb|Equilibrium| inheritance tree, allowing easy generation of portable datasets from the main (for example, MDSplus tree) data-storage method.

\section{Details of the coordinate mapping routines}\label{sec:algo}
%One of the strengths of \eqtools is its support for conversions between the majority of coordinate systems in common use:
\eqtools supports transformations between a wide variety of coordinates in common use including real-space $(R, Z)$ coordinates, mapped outboard midplane major radius $R_{\text{mid}}$, normalized minor radius $r/a$, unnormalized poloidal flux $\psi$, normalized poloidal flux $\psi_{\text{n}}$, normalized toroidal flux $\phi_{\text{n}}$, normalized flux surface volume $V_{\text{n}}$ and the square roots of these quantities.

The most important transformation maps a given point $(R, Z)$ at a given time $t$ to the poloidal flux $\psi$ at that location as computed with the magnetic reconstruction code.
The default implementation uses a nearest-neighbor interpolation in time: the code first retrieves the flux reconstruction at the time closest to $t$, then a bivariate interpolating spline \cite{dierckxBook,scipy} is used to map from $(R, Z)$ to $\psi$.
The bivariate spline coefficients from each time are stored in memory as they are computed in order to speed up subsequent calculations at that time slice.
A more advanced method using a tricubic interpolating spline to interpolate smoothly in space \emph{and} time is described in section~\ref{sec:trispline}.

Once the coordinate has been mapped to $\psi$, subsequent calculations are simpler.
Normalized poloidal flux is defined as
\begin{gather}
	\psi_{n} = \frac{\psi - \psi_{0}}{\psi_{a} - \psi_{0}},
\end{gather}
where $\psi_{0}$ is the poloidal flux at the magnetic axis and $\psi_{a}$ is the poloidal flux at the boundary.
To ensure self-consistency, the default behavior is to use nearest-neighbor interpolation to get the values of $\psi_{0}$ and $\psi_{a}$ at the desired time.
When a tricubic spline is used to map from $(R, Z)$ to $\psi$ a cubic spline is used to interpolate $\psi_{0}$ and $\psi_{a}$ in time.

Normalized toroidal flux is defined in terms of normalized poloidal flux as
\begin{equation}
	\begin{gathered}
		\phi(\psi) = \int_{\psi_{0}}^{\psi} q(\psi')\,\diff\psi'\label{eq:phipsi}\\
		\phi_{\text{n}} = \frac{\phi}{\phi_{a}},
	\end{gathered}
\end{equation}
where $q(\psi) = \diff\phi/\diff\psi$ is the safety factor profile (typically computed when EFIT is run) and $\phi_{a}=\int_{\psi_{0}}^{\psi_{a}}q(\psi')\,\diff\psi'$ is the toroidal flux at the last closed flux surface.
The integral in (\ref{eq:phipsi}) is numerically evaluated using the trapezoid rule.

The normalized flux surface volume is defined as
\begin{gather}
	V_{\text{n}}(\psi) = \frac{V(\psi)}{V_{a}},
\end{gather}
where $V(\psi)$ is the volume enclosed by the (closed) flux surface with flux $\psi$ and $V_{a}$ is the volume enclosed by the last closed flux surface.
In the case of Alcator C-Mod, the flux surface volume $V(\psi)$ is computed automatically when EFIT is run, but it would be straightforward to override the \verb|getFluxVol()| method of the \verb|Equilibrium| class to compute this from the $\psi(R, Z)$ grid when this quantity is not already available in the tree.

Mapping from $\psi_{\text{n}}$ to $R_{\text{mid}}$ is accomplished by forming a dense radial grid of $R$ points that go from the magnetic axis to the edge of the grid the flux is reconstructed on, finding the vertical location of the magnetic axis at the desired time(s) (called $Z_{0}$), then converting the resulting $(R, Z_{0})$ points to $\psi_{\text{n}}$ with the routines described above.
This one-to-one mapping between $R_{\text{mid}}$ and $\psi_{\text{n}}$ is then interpolated to give the desired conversion from $\psi_{\text{n}}$ to $R_{\text{mid}}$.

By default, $r/a$ is defined in terms of $R_{\text{mid}}$ as
\begin{gather}
	r/a = \frac{R_{\text{mid}} - R_{0}}{R_{a} - R_{0}},
\end{gather}
where $R_{0}$ is the major radius of the magnetic axis and $R_{a}$ is the outboard midplane major radius of the last closed flux surface.
Since other definitions of $r/a$ are preferred when the Shafranov shift is high, the specific definition of $r/a$ can be changed simply by overriding the methods \verb|_rmid2roa(...)| and \verb|_roa2rmid(...)| in the \verb|Equilibrium| class.
Variants on these basic routines are then used to map between other pairs of coordinates.

\section{Tri-Spline Implementation}\label{sec:trispline}

\begin{figure}[ht]
 \includegraphics[width=\textwidth]{graphics/tspline_merge.eps}
 \caption{Flux surface reconstruction in time of an Alcator C-Mod discharge using nearest-neighbor time interpolation (a) and a tricubic spline interpolation (b). Aliasing behavior is removed by the use of a spline in the time-dimension. }
 \label{fig:tspline}
\end{figure}

%Time-dependent interpolation poses a difficult problem with respect to the instantaneous nature of most equilibrium reconstructions. 
Typically, coordinates are determined via nearest-neighbor interpolation to the reconstruction timebase.
When the timebase of the equilibrium reconstruction is coarser than and/or offset from the desired timebase this can induce aliasing and discontinuities (as can be seen in figure~\ref{fig:tspline}). 
These limit the nearest-neighbor approach when considering fast timescale behaviors with comparatively slow reconstructions.  
When reconstruction achieves the Nyquist frequency, an additional interpolation in time can accurately remove aliasing errors induced by this timebase mismatch. 
The additional temporal dimension requires the use of higher-dimensional interpolators, for which a fast tricubic interpolation scheme has been developed. 

A tricubic interpolating spline \cite{Lekien2005} is used to evaluate the three-dimensional flux grid as a function of $R$, $Z$ and $t$. 
The additional time dimension and subsequent higher dimensionality increases the evaluation time of the mapping routines. 
The $(R, Z, t)$ points are assumed to form a tensor product grid which allows for the use of finite element matrix methods for extracting the derivatives used to compute the necessary spline coefficients. 
The higher dimensionality of the problem scales the necessary computational time due to the increased necessary information (64-fold increase in matrix computation). 
Other non-three-dimensional parameters also require further computation due to the inherent complication of increasing the number of dimensions. 
Persistence of the coefficients allows for similar computational times to previous equilibrium mappings when subject to calculations in the same voxel, but the overall computation is still slower when compared to lower-dimensional mapping.

The code for the tricubic spline interpolation was written in C and was integrated into Python using the F2PY package \cite{Peterson2009}.
C was used both to help with performance and to allow the trispline code to be used with any programming language which provides a C API.
Two separate methods are used to calculate and evaluate the spline, dependent on the nature of the grid.  
In cases which the grid is regular, the novel inclusion of a finite-difference matrix into the spline coefficient matrix reduces computational time (see Appendix \ref{sec:App}).  
In other cases, the derivatives in the necessary directions and orders are calculated before being used in the spline coefficient matrix multiplication.
%The C code is compartmentalized and can accept arbitrary three dimensional data, so other programming languages and codes with sufficient C APIs can access the optimized trispline code for general use.

\section{Verification and Benchmarking}\label{sec:benchmark}
\eqtools was implemented using the vectorized array operations provided by NumPy \cite{vanderWaltCSE2011}, adhering to the general rule of thumb that any operation requiring a large number of for-loops should be re-cast as a vectorized operation or implemented directly in C (as was done for the tricubic spline interpolation routine).
Most of the computational effort is expended in the Fortran spline routines provided by SciPy \cite{scipy,dierckxBook}, which means that the overhead contributed by the fact that Python is an interpreted language is minimal.
The performance differences between pure Python, NumPy and compiled languages have been explored elsewhere \cite{NASA2009,IBM2016}, and the slight performance decrease compared to what might be expected from a pure C or Fortran implementation is felt to be more than acceptable because it dramatically increases the user-friendliness and expandability of the code.

\eqtools has been verified against the existing, thoroughly-tested IDL routines presently in use for handling coordinate mapping at Alcator C-Mod.
Figure~\ref{fig:rz2psi_diff} shows the discrepancy between the IDL routines and \eqtools for the conversion of $(R, Z)$ to $\psi$.
The differences are small (of order \SI{e-7}{Wb/rad}, compared to a signal of order $\pm\SI{0.5}{Wb/rad}$), and are consistent with the fact that Python used double precision whereas IDL used single precision for the test.\footnote{The data are retrieved from MDSplus in single precision but must be cast to double precision for the Python spline routines to work properly. While IDL can work in double precision, it defaults to single precision when given single precision input data.}
\begin{figure}
	\centering
	\includegraphics[width=0.6\textwidth]{graphics/absolute_compare}
	\caption{Absolute difference between calculations of a mapping between the $(R, Z)$ grid and poloidal flux $\psi$ for \eqtools and the current IDL implementation of the mapping routine for an example Alcator C-Mod reconstructed equilibrium.  Relative differences of order \SI{e-7}{Wb/rad} are typical (the value of $\psi$ is in the range $\pm\SI{0.5}{Wb/rad}$), and are consistent with the different float precisions used.}
	\label{fig:rz2psi_diff}
\end{figure}
Timing tests were conducted by converting a $66\times66$ element $(R, Z)$ grid (double the density of the grid EFIT provides the flux on) into each of the coordinates supported.
This conversion was performed at 180 time slices (double the temporal resolution output by EFIT) and nearest-neighbor temporal interpolation was used.
The test was run both with all points being processed at once (denoted ``all'' in Table~\ref{tab:benchmarkingtests}) and with the routine being called inside a loop over all time points (denoted ``loop'' in Table~\ref{tab:benchmarkingtests}).
The conversion was performed twice with the same \verb|Equilibrium| object in \eqtools in order to assess the time savings from storing the spline coefficients.
The IDL routines (with the exception of the conversion to $R_{\text{mid}}$) support reuse of the spline coefficients from a single time slice, so a second run of the IDL code with the stored spline coefficients was performed when looping over time slices.
The test was repeated 100 times, and the mean execution time to convert the $(R, Z)$ grid for all time points is given in Table~\ref{tab:benchmarkingtests}.
\begin{table}
	\caption{Time to convert all 180 time slices in milliseconds. ``All'' refers to passing all points to the routine at once, whereas ``loop'' refers to calling the routine in a loop over time points. ``First'' refers to the first call to the routine when data are collected from the server and spline coefficients are computed, ``second'' refers to the second call once the data and spline coefficients have been cached.}
	\label{tab:benchmarkingtests}
	\begin{tabular}{@{\extracolsep{4pt}}lS[round-mode=places,round-precision=0,table-format=4.0]cS[round-mode=places,round-precision=0,table-format=4.0]S[round-mode=places,round-precision=0,table-format=4.0]S[round-mode=places,round-precision=0,table-format=4.0]S[round-mode=places,round-precision=0,table-format=3.0]S[round-mode=places,round-precision=0,table-format=4.0]S[round-mode=places,round-precision=0,table-format=3.0]}
		\hline
		Conversion & \multicolumn{4}{c}{IDL} & \multicolumn{4}{c}{\eqtools}\\
		\cline{2-5}\cline{6-9}
		 & \multicolumn{2}{c}{all} & \multicolumn{2}{c}{loop} & \multicolumn{2}{c}{all} & \multicolumn{2}{c}{loop}\\
		 \cline{2-3}\cline{4-5}\cline{6-7}\cline{8-9}
		 & {first} & {second} & {first} & {second} & {first} & {second} & {first} & {second}\\
		\hline
		$(R, Z)\to\psi$ & 148.913 & {---} & 3147.166 & 3051.697 & 133.6 & 116.3 & 178.7 & 160.8\\
		$(R, Z)\to\psi_{\text{n}}$ & 162.863 & {---} & 4023.811 & 3879.576 & 137.1 & 119.3 & 183.7 & 165.0\\
		$(R, Z)\to\phi_{\text{n}}$ & 472.517 & {---} & 5307.059 & 5196.169 & 254.4 & 164.5 & 302.9 & 214.7\\
		$(R, Z)\to V_{\text{n}}$ & 467.648 & {---} & 5325.815 & 5220.426 & 253.1 & 163.6 & 302.0 & 214.5\\
		$(R, Z)\to R_{\text{mid}}$ & 1399.561 & {---} & 5327.507 & {---} & 941.7 & 186.9 & 992.5 & 238.0\\ 
		\hline
	\end{tabular}
\end{table}

In all cases \eqtools is faster than the IDL routines.
Furthermore, the test results highlight the additional flexibility \eqtools provides users.
For the most efficient case where all of the points are passed at once, \eqtools is only slightly faster for the simple conversions of $(R, Z)$ to $\psi$ and $\psi_{\text{n}}$.
But, for the more complicated conversions to $\phi_{\text{n}}$, $V_{\text{n}}$ and $R_{\text{mid}}$, \eqtools is anywhere from 1.5 to 1.9 times faster than the IDL routines.
Furthermore, the caching of intermediate results in \eqtools accelerates the second call to a routine by as much as a factor of 5 (for the conversion to $R_{\text{mid}}$).
Where \eqtools stands in stark contrast to the IDL routines is the case where the conversions are evaluated in a loop.
While this case is not something which would be used directly as both codes support the evaluation of multiple time points at once, it is indicative of the performance to be expected should a user write a script which needs to evaluate each time point on a different grid, or otherwise needs the flexibility to split an operation up into steps applied to each time point.
For this case, \eqtools is anywhere from 5 to 20 times faster than the IDL routines.
This is believed to be a result of the large overhead associated with function calls in the IDL language.
These test results highlight the improved performance and extra programming flexibility provided by \eqtools.

\section{Example Applications}
\label{sec:examples}

\subsection{Tomographic Reconstruction}

Tomography consists of using sets of unique line-integrated measurements and mathematically reducing the data to profiles of chosen basis functions.
These basis functions are typically radial in nature, with angular dependence represented by a Fourier series.
When combined, the data are used to reconstruct the characteristics within the volume of interest.
\eqtools provides a simple and direct interface for converting $(R, Z)$ coordinates into equilibrium-based coordinates.
A new, Python-based tomography code known as the Toroidal Inversion Radiation Protocol in Python (\TRIPPy) \cite{FaustTRIPPy} was implemented using \eqtools.
The underlying use of \eqtools allows for consistent tomographic reconstruction of various line-integrated data for a number of tokamaks.

\begin{figure}
	\subfloat[equilibrium with chord view]{
            	\includegraphics[trim=0in 0in .5in .5in,clip,width=0.36\textwidth]{graphics/XTOMO_3_13_view_1120824019_2.pdf}
        }
        	\subfloat[weighting profile versus $\psi_n$]{
            	\includegraphics[trim=.35in .0in .6in .6in,clip,width=0.64\textwidth]{graphics/XTOMO_3_13_weighting_1120824019_2.pdf}
        }
	\caption{A viewing chord of the soft X-ray tomography system (XTOMO) is shown on an example plasma in blue (a), the view weight is maximized at the location closest to the center of the plasma (in $\psi_n$). The effects of angular harmonics are shown by sinusoidal component weights which can negative. This example uses 150 bins in $\psi_n$ with a $10 \mu$m step size.}
	\label{fig:TRIPPydemo}
\end{figure}


\TRIPPy solves for the weighting of a line of sight on a flux-based radial coordinate and a Fourier angular coordinate.
Each measured line-integral through the plasma is discretized to small finite length segments.
Each segment is mapped using \eqtools, and its length is fractionally added a predetermined grid in the basis coordinates.
Many lines of sight create a linear mapping between the line-integrated measurements and the unknown profiles.
This inverse problem can be solved using any number of numerical methods.

\TRIPPy and \eqtools have been implemented for the Alcator C-Mod and NSTX-U tokamaks and has been used to benchmark tomography systems on those machines \cite{Faust2014}. 
The grid size in flux space must be larger than the size of the length discretization in order to properly resolve features in the plasma.
Shown in figure~\ref{fig:TRIPPydemo} is an example line of sight using the soft X-ray tomography system on Alcator C-Mod \cite{Granetz1991}.
Tomographic reconstruction through \TRIPPy can give insight of the emission of radiation and the temperature of the plasma while also determining the character of magnetohydrodynamic modes.
The consistent and extensible design of both \TRIPPy and \eqtools allows for tomographic reconstruction to be implemented for multiple diagnostics on many tokamaks.

\subsection{Profile fitting}
Combining data from multiple diagnostics into a consistent, smooth profile is a fundamental task in experimental plasma physics.
\profiletools \cite{profiletools_git,profiletools_readthedocs} is a Python package which aims to simplify this task by presenting a uniform interface to load, time-average, combine and fit data from different diagnostics.
\eqtools is used internally by \profiletools both to map diagnostics to a consistent grid and to output the smooth fit in the desired coordinates.
In line with the \eqtools philosophy of shielding the user from obtuse calls to the underlying data system (be it MDSplus or some other system), for each diagnostic supported, \profiletools provides a function to fetch the data (and the accompanying error bars).
These quantities are returned in a \verb|BivariatePlasmaProfile| object, where ``bivariate'' refers to the fact that the data are, in general, a function of space and time.
The coordinates the measurement locations are stored in can be converted using the \verb|convert_abscissa(...)| method.
Multiple \verb|BivariatePlasmaProfile| instances from different diagnostics or even different shots can be combined as necessary using the \verb|add_profile| method.
The combined data set can then be used to predict a smoothed profile using Gaussian process regression \cite{gpr_nf} through integration with the \gptools Python package \cite{gptools_git,gptools_readthedocs}.
An example of this sensor fusion task is shown in Figure~\ref{fig:profiledemo}.
\begin{figure}
	\subfloat[equilibrium]{
            	\includegraphics[width=0.35\textwidth]{graphics/eq}
        }
        	\subfloat[data and smoothed profile]{
            	\includegraphics[width=0.65\textwidth]{graphics/prof}
        }
	\caption{Equilibrium with diagnostic locations (a) and individual measurements with smoothed electron temperature profile (b). The data include edge \cite{Hughes2003} and core \cite{Hughes2001} Thomson Scattering systems (ETS, CTS) which measure the local temperature along a vertical chord and two electron cyclotron emission systems (GPC, GPC2) which measure the temperature at the midplane \cite{Basse2007}. The symbols are the same between the two figures. The fit and its 1- and 3-sigma error envelopes were computed using Gaussian process regression with the maximum a posteriori estimate for the hyperparameters.}
	\label{fig:profiledemo}
\end{figure}
The use of \eqtools to provide consistent coordinate transformations and a consistent interface for combining data from multiple diagnostics simplifies the task of building automated data analysis workflows.

\section{Summary}\label{sec:summary}

The \eqtools package provides a modular, extensible framework for handling the basic tasks associated with magnetic equilibrium reconstructions for tokamaks -- namely, accessing derived quantities and mapping between real-space and flux coordinate systems for experimental data.
This package, developed in the open-source Python programming language and freely available from Github and PyPI \cite{eqtools_pypi,eqtools_git,eqtools_readthedocs}, presents a consistent, user-friendly interface independent of data source or storage method.
\eqtools replaces machine- and storage-specific methods which are often nonintuitive to use and which use a static code structure that can be difficult to extend for experimental data from multiple machines, storage methods, or reconstruction codes.
Moreover, the \eqtools package is designed with a modular, object-oriented construction, such that the package is easily extended to include new experiments, reconstruction codes, and data storage methods.
In the current distribution version, EFIT reconstructions \cite{Lao1985} in MDSplus-based data storage from NSTX, C-Mod, and DIII-D are implemented, along with a class to read the portable \emph{eqdsk} datafile.  

The package includes a complete set of mapping routines between real-space machine coordinates (\ie the $(R, Z)$ grid defined for the reconstruction), midplane-mapped real-space coordinates (major radius and normalized minor radius), and flux-space coordinate systems (normalized poloidal and toroidal flux and normalized flux surface volume), as well as their square roots.
In short, the coordinate systems customarily used in a broad variety of analysis applications are supported within a single unified user interface.  
In addition to the standard bivariate analysis mapping the $R$ and $Z$ coordinate with nearest-neighbor interpolation in time (as is presently used in the mapping routines at C-Mod), \eqtools includes a novel trivariate spline implementation to provide smooth interpolation along the time axis as well.  
This allows the user to optionally trade computational time for a substantially more accurate treatment of the time variation in cases where experimental data are sampled at times significantly different from the reconstruction timebase.

The mapping routines in \eqtools have been thoroughly benchmarked against the existing IDL implementations used at Alcator C-Mod.  
Mapping results from \eqtools are consistent with the previous IDL results to within numerical error (arising from double-precision calculations in Python compared to the default single precision used in IDL).
Initial runs of the \eqtools mapping routines are a factor of three to ten faster than the previous IDL implementation.
Moreover, as the persistent \verb|Equilibrium| object stores intermediate calculations in the mapping routines, subsequent calculations with the same shot show additional speed improvements compared to the IDL implementation.

The development of \eqtools clears a substantial hurdle to the adoption of Python as a standard data analysis language for tokamak research, the use of which offers numerous advantages in terms of development base and support, computational speed, ease of adoption, and cost over other languages in common use at present.
Additionally, \eqtools allows for substantially more straightforward implementation of cross-machine analysis tools, a significant benefit in light of increasing emphasis on modeling and cross-machine collaboration in fusion research.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section*{Acknowledgements}
The authors would like to acknowledge their thanks for the work S.~Wolfe performed in writing the original IDL coordinate mapping routines for Alcator C-Mod.

This material is based upon work conducted using the Alcator C-Mod tokamak, a DOE Office of Science user facility.
This material is based upon work supported by the U.S.\ Department of Energy, Office of Science, Office of Fusion Energy Sciences under Award Number DE-FC02-99ER54512.
This material is based upon work supported in part by the U.S.\ Department of Energy Office of Science Graduate Research Fellowship Program (DOE SCGF), made possible in part by the American Recovery and Reinvestment Act of 2009, administered by ORISE-ORAU under contract number DE-AC05-06OR23100.

\appendix
\section{Generation of tricubic spline coefficients for a regular grid}\label{sec:App}

Significant time savings can be recovered for tricubic interpolation computation when the data grid ($f(x,y,z)$) is regular, which is often the case for sets of magnetic equilibria. 
Recovering the spline coefficients requires the calculation of a number of derivatives in each dimension, which for a regular grid can be achieved with finite difference matrix $\boldsymbol{B}$. 
The new \emph{a priori} inclusion of $\boldsymbol{B}$ in the spline calculation matrix $\boldsymbol{A_{inv}}$ (defined as $\boldsymbol{A_{inv}^*}$) removes the necessary derivative generation and reduces total computation by half.
This calculation requires a collection of the nearest $4\times4\times4$ grid of values $\boldsymbol{y}$ to the coordinates of interest $x,y,z$ to extract spline coefficients $\boldsymbol{x}$.
\begin{equation}
\boldsymbol{x} = \boldsymbol{A_{inv}} ( \boldsymbol{B} \boldsymbol{y}) = \boldsymbol{A_{inv}^*}\boldsymbol{y}
\end{equation}
Where
\begin{equation}
\boldsymbol{A_{inv}^*} = \boldsymbol{A_{inv}}\boldsymbol{B} 
\end{equation}
 $\boldsymbol{y}$ comprises the values of the grid necessary for calculating the derivatives at the corners of the voxel $f$, and must be made into a form which is appropriate for $A_{inv}$. $A_{inv}$ expects the values of $f$ at each voxel corners then $\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}, \frac{\partial^2 f}{\partial x \partial y}, \frac{\partial^2 f}{\partial y \partial z}, \frac{\partial^2 f}{\partial x \partial z}, \frac{\partial^3 f}{\partial x \partial y \partial z}$ in the same manner. $\boldsymbol{B}$ converts the points into these derivatives, through first order finite difference matrices for $\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z}$, second order finite difference matrices for $\frac{\partial^2 f}{\partial x \partial y}, \frac{\partial^2 f}{\partial y \partial z}, \frac{\partial^2 f}{\partial x \partial z}$, and a third-order for $\frac{\partial^3 f}{\partial x \partial y \partial z}$. The form of $\boldsymbol{B}$ is dependent on the form of $\boldsymbol{y}$ for the formation of the finite differences.

\section*{References}
\bibliographystyle{elsarticle-num-no-title}
\bibliography{eqtools_paper}

\end{document}