StepWise
is a light wrapper for the parts of your Elixir code which need to be debuggable in production. It does this by separating the implementation of your app logic from the implementation of how things like logs, metrics, and traces are created.
That means that it:
- ...encourages the breaking down of such code into steps
- ...requires that each step returns a success or failure state (via standard
{:ok, _}
and{:error, _}
tuples) - ...provides telemetry events to separate/centralize code for concerns such as logging, metrics, and tracing.
Let's start with some code...
defmodule MyApp.NotifyCommenters do
# This outer `step` isn't neccessary, but it is a useful convention to be able to
# track the status of the whole run in addition to individual steps.
def run(post, method) do
StepWise.step({:ok, post}, &steps/1)
end
def steps(post, method) do
{:ok, post}
|> StepWise.step(&MyApp.Posts.get_comments/1)
|> StepWise.map_step(fn comment ->
# get_commenter/1 doesn't return `{:ok, _}`, so we need
# to do that here
{:ok, MyApp.Posts.get_commenter(comment)}
end)
|> StepWise.step(¬ify_users/2, method)
end
def notify_users(user, method) do
# ...
end
end
You might notice that the step/1
and map_step/1
functions take function values. These can be anonymous (i.e. as in map_step
above), though errors will be clearer when using function values coming from named functions (e.g. MyApp.Posts.get_comments
and notify_users
in the example).
The step
and map_step
functions rescue
/ catch
anything which bubbles up so that you don't have to. All exceptions/throws will be returned as {:error, _}
tuples so that they can be handled. exit
s, however, are not caught on purpose because, as this Elixir guide says: "exit signals are an important part of the fault tolerant system provided by the Erlang VM..."
{:error, _}
tuples will always be returned with Exception values (all {:error, _}
tuples without exceptions will be wrapped). This means that you can:
- ...call
Exception.message
to get a string - ...
raise
the exception value if you want to raise the error - ...hand the exception to error-collecting services like Sentry, Rollbar, etc...
- ...pattern match or act upon on the structure and attributes of the exception
You might also check out this tweet from Andrea Leopardi and the linked blog post regarding Exception
values in error tuples.
If you are familiar with Elixir's with
, you may be wondering about it's relation to StepWise
since with
also helps you handle a series of statements which could succeed or fail. See below for more discussion StepWise
vs with
.
As my colleague put it: "Logging definitely feels like one of those areas where it very quickly jumps from 'these sprinkled out log calls are giving us a lot of value' to 'we now have a mess in both code and log output'"
Central to StepWise
is it's telemetry events to allow actions such as logging, metrics, and tracing be separated as a different concern from your code. There are three telemetry events:
[:step_wise, :step, :start]
Executed when a step starts with the following metadata:
id
: A unique ID generated by:erlang.unique_integer()
step_func
: The function object given to thestep
/step_map
functionmodule
: The module where thestep_func
is defined (for convenience)func_name
: The name of thestep_func
(for convenience)input
: The value that was given to the stepcontext
: The extra argument which is passed down into functions of arity2
.system_time
: The system time when the step was started
[:step_wise, :step, :stop]
Executed when a step stop with all of the same metadata as the start
event, but also with:
result
: the value ({:ok, _}
or{:error, _}
tuple) that was returned from the step functionsuccess
: A boolean describing if the result was a success (for convenience, based onresult
)
There is also a duration
measurement value to give the total time taken by the step (in the native
time unit)
If you use phoenix
you'll get telemetry_metrics
and a MyAppWeb.Telemetry
module by default. In that case you can easily get metrics for time and total counts for all steps that you create:
summary([:step_wise, :step, :stop, :duration],
unit: {:native, :millisecond},
tags: [:hostname, :module, :func_name]
),
counter([:step_wise, :step, :stop, :duration],
unit: {:native, :millisecond},
tags: [:hostname, :module, :func_name]
),
Here is an example of how you might implement logging for your steps (call MyApp.StepWiseIntegration.install()
somewhere like your MyApp.Application.start/2
) (see this wiki page for a more thorough example):
defmodule MyApp.StepWiseIntegration do
def install do
:telemetry.attach_many(
__MODULE__,
[
# [:step_wise, :step, :start],
[:step_wise, :step, :stop],
],
&__MODULE__.handle/4,
[]
)
end
def handle(
[:step_wise, :step, :stop],
%{duration: duration},
%{module: module, func_name: func_name, input: input, result: result},
_config
) do
case {func_name, result} do
{_, {:error, exception}} ->
# Getting a string via `Exception.message/1` will mention the `module` and `func_name`
# in the string, but if we add them as metadata (depending on where our logs go) we can
# more reliably filter to the correct logs.
Logger.error(Exception.message(exception), input: input, module: module, func_name: func_name)
# Since `StepWise` wraps all errors, calling `Exception.message` will return
# information about the if the error was returned/raised and about which
# step it came from. In the code above, calling `Exception.message` on a returned
# exception might give us a string like:
# "There was an error *returned* in MyApp.NotifyCommenters.notify_users/1:\n\n\"Email server is not available\""
# Above we log any errors that occur, but here we only log successes
# in the `steps` function so that we don't have a lot of logs.
# This serves as a starting point for many, but the point of using `telemetry`
# to separate your steps from how they are monitored is that you can organize
# your steps and log however you'd like.
{:steps, {:ok, value}} ->
Logger.info(
"#{module}.#{func_name} *succeeded* in #{duration}",
input: input, result: result,
module: module, func_name: func_name
)
end
end
end
First, while StepWise
has some overlap with with
's ability to handle errors, it's attempting to solve a specific problem (improving debugging of production code). Let's discuss some of the differences:
The with
clause in Elixir is a way to specify a pattern-matched "happy path" for a series of expressions. The first expression which does not match it's corresponding pattern will be either:
- ...returned from the
with
(if noelse
is given) - ...given to a series of pattern matches (using
else
)
with
also doesn't rescue
or catch
for you (for better or worse, the usefulness of that depends on the situation).
step_wise
requires that each step be either an {:ok, _}
or {:error, _}
pattern.
- ...uses functions to give identification to steps when something goes wrong.
- ...
rescue
s from exceptions andcatch
es throws. - ...requires the use of
{:ok, _}
/{:error, _}
tuples. - ...emits
telemetry
events to allow for integration with various debugging tools.
StepWise
supports configuration to disable the wrapping of exceptions and throws:
config :step_wise, :wrap_step_function_errors, false
This is primarily useful just for your test
environment. Since StepWise
wraps errors, you would need to test for StepWise.Error
and StepWise.StepFunctionError
values, which exist for wrapping and formatting errors in production environments.
Alternatively you might choose to create helpers which allow you to test for errors without needing to worry about the details of StepWise.Error
and StepWise.StepFunctionError
.
The example above is a primary use-case of chaining together functions in a pipe-like way. One advantage of pipes in Elixir is that it is easy to follow a single value through a serious of operations. For example:
session
|> get_user()
|> fetch_posts(10, public: true)
While pipes allow for functions of any number of arguments, the step
and map_step
functions take function values with a fixed arity. So in order to allow contextual arguments (like the arguments to fetch_posts
), it is possible to give step
and map_step
a function of arity 2
and then pass in a value (which could be a tuple/map/list/etc... which holds multiple values). For example:
{:ok, session}
|> step(&get_user/1)
|> step(&fetch_posts/2, limit: 10, opts: [public: true])
This also turns into the single context
key in the telemetry
events.
Sometimes, however, you may need to pass more complex objects through your StepWise
pipeline, which might look like...
In some cases, however, you may want to use a more GenServer
-like style where you have a state object that is modified along the way:
def EmailPost do
import StepWise
def run(user_id, post_id) do
%{user_id: user_id, post_id: post_id}
|> step(&MyApp.Posts.get_comments/1)
|> step(&fetch_user_data/1)
|> step(&fetch_post_data/1)
|> step(&finalize/1)
end
def fetch_user_data(%{user_id: id} = state) do
{:ok, Map.put(state, :user, MyApp.Users.get(id))}
end
def fetch_post_data(%{post_id: id} = state) do
{:ok, Map.put(state, :post, MyApp.Posts.get(id))}
end
def finalize(%{user: user, post: post}) do
# ...
end
end
Note that import StepWise
is used here. The first example used the StepWise
module explicitly to demonstrate the recommendation from the Elixir guides to prefer alias
over import
. But in self-contained modules you may find the style of import
preferable.
If available in Hex, the package can be installed
by adding step_wise
to your list of dependencies in mix.exs
:
def deps do
[
{:step_wise, "~> 0.6.1"}
]
end
Documentation can be generated with ExDoc and published on HexDocs. Once published, the docs can be found at https://hexdocs.pm/step_wise.