<?xml version="1.0" encoding="UTF-8"?><feed xmlns="http://www.w3.org/2005/Atom"><title>Pat Shaughnessy</title><id>http://patshaughnessy.net</id><updated>2022-02-19T18:22:06Z</updated><author><name>Pat Shaughnessy</name></author><entry><title>LLVM IR: The Esperanto of Computer Languages</title><link href="https://patshaughnessy.net/2022/2/19/llvm-ir-the-esperanto-of-computer-languages" rel="alternate"></link><id href="https://patshaughnessy.net/2022/2/19/llvm-ir-the-esperanto-of-computer-languages" rel="alternate"></id><published>2022-02-19T00:00:00Z</published><updated>2022-02-19T00:00:00Z</updated><category>Crystal</category><author><name>Pat Shaughnessy</name></author><summary type="html">&lt;div style=&quot;float: left; padding: 8px 30px 0px 0px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2022/2/19/esperanto.png&quot;&gt;&lt;br/&gt;
  &lt;i&gt; Esperanto grammar is logical and self&lt;br/&gt;
consistent, designed to be easy to learn. &lt;br/&gt;
  &lt;small&gt; &lt;a title=&quot;Renatoeo, CC BY-SA 4.0 &amp;lt;https://creativecommons.org/licenses/by-sa/4.0&amp;gt;, via Wikimedia Commons&quot; href=&quot;https:/</summary><content type="html">&lt;div style=&quot;float: left; padding: 8px 30px 0px 0px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2022/2/19/esperanto.png&quot;&gt;&lt;br/&gt;
  &lt;i&gt; Esperanto grammar is logical and self&lt;br/&gt;
consistent, designed to be easy to learn. &lt;br/&gt;
  &lt;small&gt; &lt;a title=&quot;Renatoeo, CC BY-SA 4.0 &amp;lt;https://creativecommons.org/licenses/by-sa/4.0&amp;gt;, via Wikimedia Commons&quot; href=&quot;https://commons.wikimedia.org/wiki/File:CARD_GRAM%C3%81TICA_ESPERANTO.png&quot;&gt;via Wikimedia Commons&lt;/a&gt;&lt;/small&gt; &lt;/i&gt;
&lt;/div&gt;
&lt;p&gt;I empathize with people who have to learn English as a foreign language. English
grammar is inconsistent, arbitrary and hard to master. English spelling is even
worse. I sometimes find myself apologizing for my language’s shortcomings. But
learning any foreign language as an adult is very difficult.&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Esperanto&quot;&gt;Esperanto&lt;/a&gt;, an “artificial language,”
is different. First published by Ludwik Zamenhof in 1887, Esperanto has a vocabulary
and grammar that are logical and consistent, designed to be easier to learn.
Zamenhof intended Esperanto to become the universal second language.&lt;/p&gt;
&lt;p&gt;Computers have to learn foreign languages too. Every time you compile and run
a program, your compiler translates your code into a foreign language: the
native machine language that runs on your target platform. Compilers should
have been called translators. And compilers struggle with the same things we
do: inconsistent grammar and vocabulary, and other peculiarities of the target
platform.&lt;/p&gt;
&lt;p&gt;Recently, however, more and more compilers translate your code to an
artificial machine language. They produce a simpler, more consistent, more
powerful machine language that doesn’t actually run on any machine. This
artificial machine language, LLVM IR, makes writing compilers simpler and
reading the code compilers produce simpler too.&lt;/p&gt;
&lt;p&gt;LLVM IR is becoming the universal second language for compilers.&lt;/p&gt;
&lt;h2&gt;One Line of LLVM IR&lt;/h2&gt;
&lt;p&gt;The &lt;a href=&quot;https://llvm.org&quot;&gt;Low Level Virtual Machine&lt;/a&gt; (LLVM) project had the novel
idea of inventing a virtual machine that was easy for compiler engineers to use
as a target platform. The LLVM team designed a special instruction set called
&lt;a href=&quot;https://llvm.org/docs/LangRef.html&quot;&gt;intermediate representation&lt;/a&gt; (IR). New,
modern languages such as Rust, Swift, Clang-based versions of C and many
others, first translate your code to LLVM IR. Then they use the LLVM framework
to convert the IR into actual machine language for any target platform LLVM
supports:&lt;/p&gt;
&lt;img style=&quot;width: 500px; margin-bottom: 20px&quot; src=&quot;https://patshaughnessy.net/assets/2022/2/19/platforms.svg&quot;&gt;
&lt;p&gt;LLVM is great for compilers. Compiler engineers don’t have to worry about the
detailed instruction set of each platform, and LLVM optimizes your code for
whatever platform you choose automatically. And LLVM is also great for people
like me who are interested in what machine language instructions look like and
how CPUs execute them. LLVM instructions are much easier to follow than real
machine instructions. Let’s take a look at one!&lt;/p&gt;
&lt;p&gt;Here’s a line of LLVM IR I generated from a simple
&lt;a href=&quot;https://crystal-lang.org&quot;&gt;Crystal&lt;/a&gt; program:&lt;/p&gt;
&lt;pre type=&quot;console&quot;&gt;%57 = call %&quot;Array(Int32)&quot;* @&quot;*Array(Int32)@Array(T)::unsafe_build&amp;lt;Int32&amp;gt;:Array(Int32)&quot;(i32 610, i32 2), !dbg !89&lt;/pre&gt;
&lt;p&gt;Wait a minute! This isn’t simple or easy to follow at all! What am I talking
about here? At first glance, this does look confusing. But as we’ll see, most
of the confusing syntax is related to Crystal, not LLVM. Studying this line of
code will reveal more about Crystal than it will about LLVM.&lt;/p&gt;
&lt;p&gt;The rest of this article will unpack and explain what this line of code means.
It looks complex, but is actually quite simple.&lt;/p&gt;
&lt;h2&gt;The Call Instruction&lt;/h2&gt;
&lt;p&gt;The instruction above is a function call in LLVM IR. To produce this code, I
wrote a small Crystal program and then translated it using this command:&lt;/p&gt;
&lt;pre type=&quot;console&quot;&gt;$ crystal build array_example.cr --emit llvm-ir&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;--emit&lt;/code&gt; option directed Crystal to generate a file called array_example.ll,
which contains the line above along with thousands of other lines. We’ll get to
the Crystal code in a minute. But for now, how do I get started understanding
what the LLVM code means?&lt;/p&gt;
&lt;p&gt;The &lt;a href=&quot;https://llvm.org/docs/LangRef.html&quot;&gt;LLVM Language Reference Manual&lt;/a&gt; has
documentation for &lt;code&gt;call&lt;/code&gt; and all of the other LLVM IR instructions. Here’s the
syntax for &lt;code&gt;call&lt;/code&gt;:&lt;/p&gt;
&lt;pre type=&quot;console&quot;&gt;&amp;lt;result&gt; = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] [addrspace(&amp;lt;num&gt;)]
         &amp;lt;ty&gt;|&amp;lt;fnty&gt; &amp;lt;fnptrval&gt;(&amp;lt;function args&gt;) [fn attrs] [ operand bundles ]&lt;/pre&gt;
&lt;p&gt;My example &lt;code&gt;call&lt;/code&gt; instruction doesn’t use many of these options. Removing the
unused options, I can see the actual, basic syntax of &lt;code&gt;call&lt;/code&gt;:&lt;/p&gt;
&lt;pre type=&quot;console&quot;&gt;&amp;lt;result&gt; = call &amp;lt;ty&gt; &amp;lt;fnptrval&gt;(&amp;lt;function args&gt;)&lt;/pre&gt;
&lt;p&gt;In order from left to right, these values are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;&amp;lt;result&amp;gt;&lt;/code&gt;: the register in which to save the result&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;&amp;lt;ty&amp;gt;&lt;/code&gt;: the type of the return value&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;&amp;lt;fnptrval&amp;gt;&lt;/code&gt;: a pointer to the function to call&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;&amp;lt;function args&amp;gt;&lt;/code&gt;: the arguments to pass to that function&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
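&lt;p&gt;To make those four parts concrete, here is a minimal Python sketch that splits a
simplified &lt;code&gt;call&lt;/code&gt; line into the same pieces. This is not something LLVM provides:
the example line, the &lt;code&gt;split_call&lt;/code&gt; helper and the pattern are all hypothetical,
and the pattern deliberately ignores the optional flags and quoted names that
real LLVM IR allows.&lt;/p&gt;
&lt;pre&gt;
```python
import re

# Hypothetical, simplified call line: the real line from the article has a
# quoted, mangled function name that this small pattern does not handle.
LINE = "%3 = call i32 @add(i32 1, i32 2)"

# Groups, in order: result register, return type, function pointer, arguments.
CALL_PATTERN = re.compile(r"(%\w+) = call (\S+) (@\w+)\((.*)\)")


def split_call(line):
    """Split a simplified `call` instruction into its four basic parts."""
    match = CALL_PATTERN.fullmatch(line)
    if match is None:
        raise ValueError("not a simplified call instruction: " + line)
    result, ty, fnptrval, args = match.groups()
    return {
        "result": result,
        "ty": ty,
        "fnptrval": fnptrval,
        "args": [arg.strip() for arg in args.split(",") if arg.strip()],
    }


parts = split_call(LINE)
print(parts)
```
&lt;/pre&gt;
&lt;p&gt;Each argument keeps its type prefix, just as it does in the IR itself.&lt;/p&gt;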
&lt;p&gt;What does all of this mean, exactly? Let’s find out!&lt;/p&gt;
&lt;h2&gt;A CPU With Infinite Registers&lt;/h2&gt;
&lt;p&gt;Starting on the left and moving right, let’s step through the &lt;code&gt;call&lt;/code&gt; instruction:&lt;/p&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2022/2/19/result.svg&quot;&gt;
&lt;p&gt;The token &lt;code&gt;%57&lt;/code&gt; to the left of the equals sign tells LLVM where to save the
return value of the function call that follows. This isn’t a normal variable;
&lt;code&gt;%57&lt;/code&gt; is an LLVM “register.”&lt;/p&gt;
&lt;p&gt;Registers are physical circuits located on microprocessor chips used to save
intermediate values. Saving a value in a CPU register is much faster than
saving a value in memory, since the register is located on the same chip as the
rest of the microprocessor. Saving a value in RAM, on the other hand,
requires transmitting that value from one chip to another and is much slower,
relatively speaking. Unfortunately, each CPU has a limited number of registers
available, and so compilers have to decide which values are used frequently
enough to warrant saving in nearby registers, and which other values can be
moved out to more distant memory.&lt;/p&gt;
&lt;p&gt;Unlike the limited number of registers available on a real CPU, the imaginary
LLVM microprocessor has an infinite number of them. Because of this, compilers
that target LLVM can simply save values to a register whenever they would like.
There’s no need to find an available register, or to move an existing value out
of a register before using it for something else. That is busy work that normal
machine language code can’t avoid.&lt;/p&gt;
&lt;p&gt;In this program, the Crystal compiler had already saved 56 other values in
“registers” and so for this line of LLVM IR, Crystal simply used the next
register, number 57.&lt;/p&gt;
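&lt;p&gt;Here is a rough Python sketch of why unlimited registers make life easy for a
code generator. The &lt;code&gt;RegisterNamer&lt;/code&gt; class and &lt;code&gt;emit_sum&lt;/code&gt; function are
names I made up for illustration, and the emitted text is only IR-like; real
compilers and LLVM’s own numbering rules are more involved.&lt;/p&gt;
&lt;pre&gt;
```python
import itertools


class RegisterNamer:
    """Hands out a fresh virtual register name on every request."""

    def __init__(self):
        self._counter = itertools.count(1)

    def fresh(self):
        # No pool to search and nothing to spill: just take the next number.
        return "%" + str(next(self._counter))


def emit_sum(namer, operands):
    """Emit simplified IR-like text that adds the operands left to right."""
    lines = []
    accumulator = operands[0]
    for operand in operands[1:]:
        register = namer.fresh()
        lines.append(register + " = add i32 " + accumulator + ", " + operand)
        accumulator = register
    return lines, accumulator


namer = RegisterNamer()
ir_lines, final_register = emit_sum(namer, ["1", "2", "3", "4"])
print("\n".join(ir_lines))
```
&lt;/pre&gt;
&lt;p&gt;Every intermediate sum gets its own register, and no register is ever reused.
Mapping them onto a real CPU’s limited registers is left to LLVM.&lt;/p&gt;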
&lt;h2&gt;LLVM Structure Types&lt;/h2&gt;
&lt;p&gt;Moving left to right, LLVM &lt;code&gt;call&lt;/code&gt; instructions next indicate the type of the
function call’s return value:&lt;/p&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2022/2/19/type.svg&quot;&gt;
&lt;p&gt;The name of this type, &lt;code&gt;Array(Int32)&lt;/code&gt;, is generated by the Crystal compiler, not
by LLVM. That is, this is a type from my Crystal program. It could have been
anything, and indeed other compilers that target LLVM will generate completely
different type names.&lt;/p&gt;
&lt;p&gt;The example Crystal program I used to generate this LLVM code was:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;arr &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;puts arr[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;When I compiled this program, Crystal generated the &lt;code&gt;call&lt;/code&gt; instruction above,
which returns a pointer to the new array, &lt;code&gt;arr&lt;/code&gt;. Since &lt;code&gt;arr&lt;/code&gt; is an array
containing integers, Crystal uses the generic type &lt;code&gt;Array(Int32)&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Machine languages that target real machines only support the types that the
hardware itself supports. For example, Intel x86 assembly language allows you to save
integers of different widths, 16, 32 or 64 bits for example, and an Intel x86
CPU has registers designed to hold values of each of these sizes.&lt;/p&gt;
&lt;p&gt;LLVM IR is more powerful. It supports “structure types,” similar to a C
structure or an object in a language like Crystal or Swift. Here the &lt;code&gt;%&amp;quot;…&amp;quot;&lt;/code&gt;
syntax indicates the name inside the quotes is the name of a structure type.
And the asterisk which follows, like in C, indicates the type of the return
value of my function call is a pointer to this structure.&lt;/p&gt;
&lt;p&gt;My example LLVM program defines the type &lt;code&gt;Array(Int32)&lt;/code&gt; like this:&lt;/p&gt;
&lt;pre type=&quot;console&quot;&gt;%&quot;Array(Int32)&quot; = type { i32, i32, i32, i32, i32* }&lt;/pre&gt;
&lt;p&gt;Structure types allow LLVM IR programs to create pointers to structures or
objects, and to access any of the values inside each object. That makes writing
a compiler much easier. In my example, the call instruction returns a pointer
to an object which contains four 32-bit integer values, followed by a pointer to
other 32-bit integer values. But what are all of these integer values? Above I said
this function call was returning a new array - how can that be the case?&lt;/p&gt;
&lt;p&gt;LLVM itself has no idea, and no opinion on the matter. To understand what these
values are, and what they have to do with the array in my program, we need to
learn more about the Crystal compiler that generated this LLVM IR code.&lt;/p&gt;
&lt;p&gt;Reading the &lt;a href=&quot;https://github.com/crystal-lang/crystal/blob/master/src/array.cr#L48&quot;&gt;Crystal standard
library&lt;/a&gt;,
we can see Crystal implements arrays like this:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;class &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Array&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(T)
&lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;include &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Indexable&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;::Mutable(T)
&lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;include &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Comparable(Array)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# Size of an Array that we consider small to do linear scans or other optimizations.
&lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;private &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;SMALL_ARRAY_SIZE &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;16 
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# The size of this array.
&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;@&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;size &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;: &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Int32
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# The capacity of `@buffer`.
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# Note that, because `@buffer` moves on shift, the actual
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# capacity (the allocated memory) starts at `@buffer - @offset_to_buffer`.
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# The actual capacity is also given by the `remaining_capacity` internal method.
&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;@&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;capacity &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;: &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Int32
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# Offset to the buffer that was originally allocated, and which needs to
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# be reallocated on resize. On shift this value gets increased, together with
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# `@buffer`. To reach the root buffer you have to do `@buffer - @offset_to_buffer`,
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# and this is also provided by the `root_buffer` internal method.
&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;@&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;offset_to_buffer &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;: &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Int32 &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;0
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# The buffer where elements start.
&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;@&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;buffer &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;: &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Pointer(T)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# In 64 bits the Array is composed then by:
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# - type_id            : Int32   # 4 bytes -|
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# - size               : Int32   # 4 bytes  |- packed as 8 bytes
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# - capacity           : Int32   # 4 bytes -|
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# - offset_to_buffer   : Int32   # 4 bytes  |- packed as 8 bytes
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# - buffer             : Pointer # 8 bytes  |- another 8 bytes&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;The comments above are very illustrative and complete - the Crystal team took
the time to document their standard library and explain not only how to use
each class, like &lt;code&gt;Array(T)&lt;/code&gt;, but also how each one is implemented internally.&lt;/p&gt;
&lt;p&gt;In this case, we can see the four &lt;code&gt;i32&lt;/code&gt; values inside the &lt;code&gt;Array(Int32)&lt;/code&gt; LLVM
structure type hold the size and capacity of the array, among other things.
And the &lt;code&gt;i32*&lt;/code&gt; value is a pointer to the actual contents of the array.&lt;/p&gt;
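&lt;p&gt;One way to get a feel for this layout is to mirror it with Python’s &lt;code&gt;ctypes&lt;/code&gt;
module. This is only a sketch: the field names and their order come from the
Crystal comment above, the class name &lt;code&gt;ArrayInt32&lt;/code&gt; is my own, and using &lt;code&gt;610&lt;/code&gt;
as the &lt;code&gt;type_id&lt;/code&gt; is an assumption based on the value in the &lt;code&gt;call&lt;/code&gt; instruction.&lt;/p&gt;
&lt;pre&gt;
```python
import ctypes


class ArrayInt32(ctypes.Structure):
    """Mirrors { i32, i32, i32, i32, i32* } using the commented field order."""

    _fields_ = [
        ("type_id", ctypes.c_int32),
        ("size", ctypes.c_int32),
        ("capacity", ctypes.c_int32),
        ("offset_to_buffer", ctypes.c_int32),
        ("buffer", ctypes.POINTER(ctypes.c_int32)),
    ]


# Stand-in for the memory Crystal would allocate for the elements.
elements = (ctypes.c_int32 * 2)(12345, 67890)

arr = ArrayInt32(
    type_id=610,  # assumption: the same tag that appears in the call instruction
    size=2,
    capacity=2,
    offset_to_buffer=0,
    buffer=ctypes.cast(elements, ctypes.POINTER(ctypes.c_int32)),
)

print(ctypes.sizeof(ArrayInt32))  # total bytes occupied by the struct
print(arr.buffer[1])              # follows the pointer, like arr[1] in Crystal
```
&lt;/pre&gt;
&lt;p&gt;On a typical 64-bit platform the struct occupies 24 bytes: four 4-byte
integers followed by one 8-byte pointer, matching the three 8-byte groups in
the comment.&lt;/p&gt;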
&lt;h2&gt;Functions&lt;/h2&gt;
&lt;p&gt;The target of the call instruction appears next, after the return type:&lt;/p&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2022/2/19/function.svg&quot;&gt;
&lt;p&gt;This is quite a mouthful! What sort of function is this?&lt;/p&gt;
&lt;p&gt;There are two steps to understanding this: First, the &lt;code&gt;@&amp;quot;…&amp;quot;&lt;/code&gt; syntax. This is
simply a global identifier in this LLVM program. So my &lt;code&gt;call&lt;/code&gt; instruction is just
calling a global function. In LLVM programs, all functions are global; there is
no concept of a class, module or similar groupings of code.&lt;/p&gt;
&lt;p&gt;But what in the world does that crazy identifier mean?&lt;/p&gt;
&lt;p&gt;LLVM ignores this complex name. For LLVM this is just a name like &lt;code&gt;foo&lt;/code&gt; or &lt;code&gt;bar&lt;/code&gt;.
But for Crystal, the name has much more significance. Crystal encoded a lot of
information into this one name. Crystal can do this because the LLVM code isn’t
intended for anyone to read directly. Crystal has created a “mangled name,”
meaning the original version of the function to call is there but it’s been
mangled or rewritten in a confusing manner.&lt;/p&gt;
&lt;p&gt;Crystal rewrites function names to ensure they are unique. In Crystal, like in
many other statically typed languages, functions with different argument types
or return value types are actually different functions. So in Crystal if I
write:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;foo&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(a : Int32)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;puts &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Int: &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;#{a}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;foo&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(a : String)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;puts &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;String: &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;#{a}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;foo(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;123&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#=&amp;gt; Int: 123
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;foo(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;123&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#=&amp;gt; String: 123&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;…I have two separate, different functions both called &lt;code&gt;foo&lt;/code&gt;. The type of the
parameter &lt;code&gt;a&lt;/code&gt; distinguishes one from the other.&lt;/p&gt;
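&lt;p&gt;To see why two functions with the same name need distinct names once
everything is flattened into one global namespace, here is a small Python sketch.
The &lt;code&gt;mangle&lt;/code&gt; format, the lookup table and the &lt;code&gt;call&lt;/code&gt; helper are all
invented for illustration. Crystal resolves overloads at compile time and uses
its own naming scheme, which we’ll look at next.&lt;/p&gt;
&lt;pre&gt;
```python
def mangle(name, arg_types):
    """Build a unique key from a function name and its argument types.

    The "name(Type, ...)" format is made up for this sketch; it is not the
    scheme the Crystal compiler uses.
    """
    return name + "(" + ", ".join(arg_types) + ")"


def foo_int32(a):
    return "Int: " + str(a)


def foo_string(a):
    return "String: " + a


# One flat, global table: every entry needs a distinct name.
FUNCTIONS = {
    mangle("foo", ["Int32"]): foo_int32,
    mangle("foo", ["String"]): foo_string,
}

# Hypothetical mapping from Python runtime types to Crystal-style type names.
TYPE_NAMES = {int: "Int32", str: "String"}


def call(name, *args):
    """Pick the overload whose mangled name matches the argument types."""
    key = mangle(name, [TYPE_NAMES[type(arg)] for arg in args])
    return FUNCTIONS[key](*args)


print(call("foo", 123))
print(call("foo", "123"))
```
&lt;/pre&gt;
&lt;p&gt;Both calls use the name &lt;code&gt;foo&lt;/code&gt;, but each one lands on a different entry in the table.&lt;/p&gt;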
&lt;p&gt;Crystal generates unique function names by encoding the argument types, the
return type and the type of the receiver into the function name string, making it
quite complex. Let’s break it down:&lt;/p&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2022/2/19/mangled.svg&quot;&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;Array(Int32)@Array(T)&lt;/code&gt; - this is the type of the receiver. That means the
&lt;code&gt;unsafe_build&lt;/code&gt; function is actually a method on the &lt;code&gt;Array(T)&lt;/code&gt; generic class.
And in this case, the receiver is an array holding 32 bit integers, the
&lt;code&gt;Array(Int32)&lt;/code&gt; class. Crystal includes both names in the mangled function name.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;unsafe_build&lt;/code&gt; - this is the function Crystal is calling.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;Int32&lt;/code&gt; - these are the function’s parameter types. In this case, Crystal is
passing in a single integer, so we just see one &lt;code&gt;Int32&lt;/code&gt; type.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;Array(Int32)&lt;/code&gt; - this is the return value type, a new array containing integers.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;As I discussed in &lt;a href=&quot;https://patshaughnessy.net/2022/1/22/visiting-an-abstract-syntax-tree&quot;&gt;my last
post&lt;/a&gt;,
the Crystal compiler internally rewrites my array literal expression &lt;code&gt;[12345, 67890]&lt;/code&gt; into code that creates and initializes a new array object:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;__temp_621 &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;::Array(typeof(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)).unsafe_build(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;__temp_622 &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; __temp_621.to_unsafe
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;__temp_622[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;0&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;] &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;__temp_622[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;] &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;__temp_621&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;In this expanded code, Crystal calls &lt;code&gt;unsafe_build&lt;/code&gt; and passes in &lt;code&gt;2&lt;/code&gt;, the
required capacity of the new array. And to distinguish this use of
&lt;code&gt;unsafe_build&lt;/code&gt; from other &lt;code&gt;unsafe_build&lt;/code&gt; functions that might exist in my
program, the compiler generated the mangled name we see above. &lt;/p&gt;
&lt;h2&gt;Arguments&lt;/h2&gt;
&lt;p&gt;Finally, after the function name the LLVM IR instruction shows the arguments
for the function call:&lt;/p&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2022/2/19/args.svg&quot;&gt;
&lt;p&gt;LLVM IR uses parentheses, like most languages, to enclose the arguments to a
function call. And the types precede each value: &lt;code&gt;610&lt;/code&gt; is a 32-bit integer and
&lt;code&gt;2&lt;/code&gt; is also a 32-bit integer.&lt;/p&gt;
&lt;p&gt;But wait a minute! We saw just above the expanded Crystal code for generating
the array literal passes a single value, &lt;code&gt;2&lt;/code&gt;, into the call to &lt;code&gt;unsafe_build&lt;/code&gt;.
And looking at the mangled function name above, we also see there is a single
&lt;code&gt;Int32&lt;/code&gt; parameter to the function call.&lt;/p&gt;
&lt;p&gt;But reading the LLVM IR code we can see a second value is also passed in:
&lt;code&gt;610&lt;/code&gt;. What in the world does &lt;code&gt;610&lt;/code&gt; mean? I don’t have 610 elements in my new
array, and 610 is not one of the array elements. So what is going on here?&lt;/p&gt;
&lt;p&gt;Crystal is an object oriented language, meaning that each function is
optionally associated with a class. In OOP parlance, we say that we are
“sending a message” to a “receiver.” In this case, &lt;code&gt;unsafe_build&lt;/code&gt; is the message,
and &lt;code&gt;::Array(typeof(12345, 67890))&lt;/code&gt; is the receiver. In fact, this function is
really a class method. We are calling &lt;code&gt;unsafe_build&lt;/code&gt; on the &lt;code&gt;Array(Int32)&lt;/code&gt; class,
not on an instance of one array.&lt;/p&gt;
&lt;p&gt;Regardless, LLVM IR doesn’t support classes or instance methods or class
methods. In LLVM IR, we only have simple, global functions. And indeed, the
LLVM virtual machine doesn’t care what these arguments are or what they mean.
LLVM doesn’t encode the meaning or purpose of each argument; it just does what
the Crystal compiler tells it to do.&lt;/p&gt;
&lt;p&gt;But Crystal, on the other hand, has to implement object oriented behavior
somehow. Specifically, the &lt;code&gt;unsafe_build&lt;/code&gt; function needs to behave differently
depending on which class it was called for, depending on what the receiver is.
For example:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;::Array(typeof(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)).unsafe_build(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;… has to return an array of two integers. While:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;::Array(typeof(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;abc&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;def&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)).unsafe_build(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;…has to return an array of two strings. How does this work in the LLVM IR code?&lt;/p&gt;
&lt;p&gt;To implement object oriented behavior, Crystal passes the receiver as a hidden,
special argument to the function call:&lt;/p&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2022/2/19/args2.svg&quot;&gt;
&lt;p&gt;This receiver argument is a reference or pointer to the receiver’s object, and
is normally known as &lt;code&gt;self&lt;/code&gt;. Here &lt;code&gt;610&lt;/code&gt; is a reference or tag corresponding to
the &lt;code&gt;Array(Int32)&lt;/code&gt; class, the receiver. And &lt;code&gt;2&lt;/code&gt; is the actual argument to the
&lt;code&gt;unsafe_build&lt;/code&gt; method.&lt;/p&gt;
&lt;p&gt;Reading the LLVM IR code, we’ve learned that Crystal secretly passes a hidden
&lt;code&gt;self&lt;/code&gt; argument to every method call to an object. Then inside each method, the
code has access to &lt;code&gt;self&lt;/code&gt;, to the object instance that code is running for. Some
languages, like Rust, require us to declare &lt;code&gt;self&lt;/code&gt; explicitly as a parameter of each method;
in Crystal this behavior is automatic and hidden.&lt;/p&gt;
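&lt;p&gt;The same idea can be sketched in a few lines of Python. Everything here is
hypothetical: the &lt;code&gt;TYPE_TAGS&lt;/code&gt; table, the &lt;code&gt;array_unsafe_build&lt;/code&gt; function and the
&lt;code&gt;ArrayClass&lt;/code&gt; wrapper are stand-ins I made up, and the tag &lt;code&gt;611&lt;/code&gt; for an array of
strings is invented. Only &lt;code&gt;610&lt;/code&gt; and &lt;code&gt;2&lt;/code&gt; come from the LLVM IR above.&lt;/p&gt;
&lt;pre&gt;
```python
# Hypothetical table of type tags. 610 is the value from the article's call
# instruction; 611 for Array(String) is invented for this sketch.
TYPE_TAGS = {610: "Int32", 611: "String"}


def array_unsafe_build(self_tag, capacity):
    """A plain global function: the receiver arrives as the first argument."""
    return {
        "element_type": TYPE_TAGS[self_tag],
        "capacity": capacity,
        "buffer": [None] * capacity,
    }


class ArrayClass:
    """Object-oriented front end that hides the receiver argument."""

    def __init__(self, tag):
        self.tag = tag

    def unsafe_build(self, capacity):
        # The call the programmer writes has one argument; the lowered call has two.
        return array_unsafe_build(self.tag, capacity)


array_of_int32 = ArrayClass(610)
built = array_of_int32.unsafe_build(2)
lowered = array_unsafe_build(610, 2)
print(built == lowered)
```
&lt;/pre&gt;
&lt;p&gt;The method call with one visible argument and the global call with two
arguments produce the same result. The receiver was there all along, just hidden.&lt;/p&gt;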
&lt;h2&gt;Learning How Compilers Work&lt;/h2&gt;
&lt;p&gt;LLVM IR is a simple language designed for compiler engineers. I think of it
like a blank slate for them to write on. Most LLVM instructions are quite
simple and easy to understand; as we saw above, understanding the basic syntax
of the call instruction wasn’t hard at all.&lt;/p&gt;
&lt;p&gt;The hard part was understanding how the Crystal compiler, which targets LLVM
IR, generates code. The LLVM syntax itself was easy to follow; it was the
Crystal language’s implementation that was harder to understand.&lt;/p&gt;
&lt;p&gt;And this is the real reason to learn about LLVM IR syntax. If you take the time
to learn how LLVM instructions work, then you can start to read the code your
favorite language’s compiler generates. And once you can do that, you can learn
more about how your favorite compiler works, and what your programs actually do
when you run them.&lt;/p&gt;
</content></entry><entry><title>Visiting an Abstract Syntax Tree</title><link href="https://patshaughnessy.net/2022/1/22/visiting-an-abstract-syntax-tree" rel="alternate"></link><id href="https://patshaughnessy.net/2022/1/22/visiting-an-abstract-syntax-tree" rel="alternate"></id><published>2022-01-22T00:00:00Z</published><updated>2022-01-22T00:00:00Z</updated><category>Crystal</category><author><name>Pat Shaughnessy</name></author><summary type="html">&lt;div style=&quot;float: left; padding: 8px 30px 0px 0px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2022/1/22/visit-tree.jpg&quot;&gt;&lt;br/&gt;
  &lt;i&gt;Joshua Tree National Park
  &lt;small&gt;(via: &lt;a href=&quot;https://commons.wikimedia.org/wiki/File:Backpacker_at_Sunset_(22849298523).jpg&quot;&gt;Wikimedia Commons&lt;/a&gt;)&lt;/small&gt;
  &lt;/i&gt;
&lt;/div&gt;
&lt;p&gt;In my &lt;a href=&quot;https://patshaughnessy.net/2021/1</summary><content type="html">&lt;div style=&quot;float: left; padding: 8px 30px 0px 0px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2022/1/22/visit-tree.jpg&quot;&gt;&lt;br/&gt;
  &lt;i&gt;Joshua Tree National Park
  &lt;small&gt;(via: &lt;a href=&quot;https://commons.wikimedia.org/wiki/File:Backpacker_at_Sunset_(22849298523).jpg&quot;&gt;Wikimedia Commons&lt;/a&gt;)&lt;/small&gt;
  &lt;/i&gt;
&lt;/div&gt;
&lt;p&gt;In my &lt;a href=&quot;https://patshaughnessy.net/2021/12/22/reading-code-like-a-compiler&quot;&gt;last
post&lt;/a&gt;, I
explored how &lt;a href=&quot;https://crystal-lang.org&quot;&gt;Crystal&lt;/a&gt; parsed a simple program and
produced a data structure called an &lt;a href=&quot;https://en.wikipedia.org/wiki/Abstract_syntax_tree&quot;&gt;abstract syntax
tree&lt;/a&gt; (AST). But what does
Crystal do with the AST? Why bother going to such lengths to create it?&lt;/p&gt;
&lt;p&gt;After Crystal parses my code, it repeatedly steps through all the entries or
nodes in the AST and builds up a description of the intended meaning and
behavior of my code. This process is known as &lt;em&gt;semantic analysis&lt;/em&gt;. Later,
Crystal will use this description to convert my program into a machine language
executable.&lt;/p&gt;
&lt;p&gt;But what does this description contain? What does it really mean for a compiler
to &lt;em&gt;understand&lt;/em&gt; anything? Let’s pack our bags and visit an abstract syntax tree
with Crystal to find out.&lt;/p&gt;
&lt;div style=&quot;clear: both&quot;&gt;&lt;/div&gt;
&lt;h2&gt;The Visitor Pattern&lt;/h2&gt;
&lt;p&gt;Imagine several tourists visiting a famous tree: Each of them sees the same
tree in a different way. The tree doesn’t change, but the perspective of each
person looking at it is different. They each take a different photo, or
remember different details.&lt;/p&gt;
&lt;p&gt;In Computer Science this separation of the data structure (the tree) from the
algorithms using it (the tourists) is known as the &lt;a href=&quot;https://en.wikipedia.org/wiki/Visitor_pattern&quot;&gt;visitor
pattern&lt;/a&gt;. This technique allows
compilers and other programs to run multiple algorithms on the same data
structure without making a mess.&lt;/p&gt;
&lt;p&gt;The visitor pattern calls for two functions: &lt;code&gt;accept&lt;/code&gt; and &lt;code&gt;visit&lt;/code&gt;. First, a
node in the data structure “accepts” a visitor:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;400px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/visitor1.svg&quot;&gt;
&lt;p&gt;After accepting a visitor, the node turns around and calls the &lt;code&gt;visit&lt;/code&gt; method on
&lt;code&gt;Visitor&lt;/code&gt;:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;400px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/visitor2.svg&quot;&gt;
&lt;p&gt;The &lt;code&gt;visit&lt;/code&gt; method implements whatever algorithm that visitor is interested in.&lt;/p&gt;
&lt;p&gt;This seems kind of pointless… why use &lt;code&gt;accept&lt;/code&gt; at all? We could just call
&lt;code&gt;visit&lt;/code&gt; directly. The key is that, after calling the visitor and passing
itself, the node passes the visitor to each of its children, recursively:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;400px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/visitor3.svg&quot;&gt;
&lt;p&gt;And then the visitor can visit each of the child nodes also. The &lt;code&gt;Visitor&lt;/code&gt;
class doesn’t necessarily need to know anything about how to navigate the node
data structure. And more and more visitor classes can implement new algorithms
without changing the underlying data structure and breaking each other.&lt;/p&gt;
&lt;h2&gt;The Visitor Pattern in the Crystal Compiler&lt;/h2&gt;
&lt;p&gt;In order to understand what my code means, Crystal reads through my program’s
AST over and over again using different visitors. Each algorithm looks for
certain syntax, records information about the types and objects my code uses or
possibly even transforms my code into a different form.&lt;/p&gt;
&lt;div style=&quot;float: right; padding: 8px 0px 0px 30px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2022/1/22/angel-oak.jpg&quot;&gt;&lt;br/&gt;
  &lt;i&gt;A photo I took in 2018 of &lt;a href=&quot;https://en.wikipedia.org/wiki/Angel_Oak&quot;&gt;Angel Oak&lt;/a&gt;,&lt;br/&gt; a 400-500 year old tree in South Carolina.&lt;/i&gt;
&lt;/div&gt;
&lt;p&gt;Crystal implements the basics of the visitor pattern in
&lt;a href=&quot;https://github.com/crystal-lang/crystal/blob/master/src/compiler/crystal/syntax/visitor.cr#L24&quot;&gt;visitor.cr&lt;/a&gt;,
inside the superclass of all AST nodes:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;class &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;ASTNode
&lt;/span&gt;&lt;span style=&quot;color:#343d46;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;accept&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(visitor)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;if&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; visitor.visit_any self
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;if&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; visitor.visit self
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        accept_children visitor
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      visitor.end_visit self
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      visitor.end_visit_any self
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Each subclass of &lt;code&gt;ASTNode&lt;/code&gt; implements its own version of &lt;code&gt;accept_children&lt;/code&gt;.&lt;/p&gt;
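&lt;p&gt;To make the mechanics concrete, here is a minimal sketch of the same idea in Crystal. It is not the compiler’s code: &lt;code&gt;Node&lt;/code&gt;, &lt;code&gt;Num&lt;/code&gt;, &lt;code&gt;Add&lt;/code&gt;, &lt;code&gt;CountVisitor&lt;/code&gt; and &lt;code&gt;PrintVisitor&lt;/code&gt; are hypothetical names I made up for illustration, and I left out &lt;code&gt;visit_any&lt;/code&gt; and &lt;code&gt;end_visit&lt;/code&gt; to keep it short:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
# A toy tree and two visitors; all names here are illustrative only.
abstract class Node
  def accept(visitor)
    # Visit this node, then descend only if the visitor returns true.
    if visitor.visit(self)
      accept_children(visitor)
    end
  end

  # Leaf nodes have no children, so the default does nothing.
  def accept_children(visitor)
  end
end

class Num &amp;lt; Node
  getter value : Int32

  def initialize(@value)
  end
end

class Add &amp;lt; Node
  getter left : Node
  getter right : Node

  def initialize(@left, @right)
  end

  # Each node type knows how to hand the visitor to its own children.
  def accept_children(visitor)
    left.accept(visitor)
    right.accept(visitor)
  end
end

# One algorithm: count every node in the tree.
class CountVisitor
  getter count = 0

  def visit(node : Node)
    @count += 1
    true
  end
end

# A second algorithm: one overloaded visit method per node type.
class PrintVisitor
  def visit(node : Num)
    puts &amp;quot;Num #{node.value}&amp;quot;
    true
  end

  def visit(node : Add)
    puts &amp;quot;Add&amp;quot;
    true
  end
end

tree = Add.new(Num.new(12345), Num.new(67890))
counter = CountVisitor.new
tree.accept(counter)
puts counter.count
tree.accept(PrintVisitor.new)&lt;/pre&gt;

&lt;p&gt;Neither visitor knows how to walk the tree; only &lt;code&gt;accept_children&lt;/code&gt; does. Adding a third algorithm means writing a third visitor class, without touching &lt;code&gt;Num&lt;/code&gt; or &lt;code&gt;Add&lt;/code&gt;.&lt;/p&gt;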
&lt;h2&gt;My Tiny Crystal Program&lt;/h2&gt;
&lt;p&gt;To get a sense of how the visitor pattern works inside of Crystal, let’s look
at one line of code from my last post:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;arr &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;As I explained last month, the Crystal parser generates this AST fragment:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;400px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/ast1.svg&quot;&gt;
&lt;p&gt;Once the parser is finished and has created this small tree, the Crystal
compiler steps through it a number of different times, looking for classes,
variables, type declarations, etc. Each of these passes through the AST is
performed by a different visitor class: &lt;code&gt;TopLevelVisitor&lt;/code&gt;,
&lt;code&gt;InstanceVarsInitializerVisitor&lt;/code&gt; or &lt;code&gt;ClassVarsInitializerVisitor&lt;/code&gt; among many
others.&lt;/p&gt;
&lt;p&gt;The most important visitor class the Crystal compiler uses is called simply
&lt;code&gt;MainVisitor&lt;/code&gt;. You can find the code for &lt;code&gt;MainVisitor&lt;/code&gt; in
&lt;a href=&quot;https://github.com/crystal-lang/crystal/blob/master/src/compiler/crystal/semantic/main_visitor.cr#L26&quot;&gt;main_visitor.cr&lt;/a&gt;:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a7adba;&quot;&gt;# This is the main visitor of the program, ran after types have been declared
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# and their type declarations (like `@x : Int32`) have been processed.
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# This visits the &amp;quot;main&amp;quot; code of the program and resolves calls, instantiates
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# methods and visits them, recursively, with other MainVisitors.
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# The visitor keeps track of a method&amp;#39;s variables (or the main program, split into
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# several files, in case of top-level code). It keeps track both of the type of a
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# variable at a single point (stored in @vars) and the combined type of all assignments
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# to it (in @meta_vars).
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# Call resolution logic is in `Call#recalculate`, where method lookup is done.
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;class &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;MainVisitor &lt;/span&gt;&lt;span style=&quot;color:#343d46;&quot;&gt;&amp;lt; &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;SemanticVisitor&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Since Crystal supports typed parameters and method overloading, the visitor
class implements a different &lt;code&gt;visit&lt;/code&gt; method for each type of node that it visits,
for example:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;class &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;MainVisitor &lt;/span&gt;&lt;span style=&quot;color:#343d46;&quot;&gt;&amp;lt; &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;SemanticVisitor
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;visit&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(node : Assign)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;visit&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(node : Var)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;visit&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(node : ArrayLiteral)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;visit&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(node : NumberLiteral)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Etc…&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Now let’s look at three examples of what the &lt;code&gt;MainVisitor&lt;/code&gt; class does with my
code: identifying variables, assigning types and expanding array literals. The
Crystal compiler is much too complex to describe in a single blog post, but
hopefully I can give you a glimpse into the sort of work Crystal does during
semantic analysis.&lt;/p&gt;
&lt;h2&gt;Identifying Variables&lt;/h2&gt;
&lt;p&gt;Obviously, my example code creates and initializes a variable called &lt;code&gt;arr&lt;/code&gt;:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;arr &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;But how does Crystal identify this variable’s name and value? What does it do
with &lt;code&gt;arr&lt;/code&gt;?&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;MainVisitor&lt;/code&gt; class starts to process my code by visiting the root node of
this branch of my AST, the &lt;code&gt;Assign&lt;/code&gt; node:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;375px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/visit-assign1.svg&quot;&gt;
&lt;p&gt;As you can see, earlier during the parsing phase Crystal had saved the target
variable and value of this assign statement in the child AST nodes. The target
variable, &lt;code&gt;arr&lt;/code&gt;, appears in the &lt;code&gt;Var&lt;/code&gt; node, and the value to assign is an
&lt;code&gt;ArrayLiteral&lt;/code&gt; node. The &lt;code&gt;MainVisitor&lt;/code&gt; now knows I declared a new variable, called
&lt;code&gt;arr&lt;/code&gt;, in the current lexical scope. Since my program has no classes, methods or
any other lexical scopes, Crystal saves this variable in a table of variables
for the top level program:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;300px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/table.svg&quot;&gt;
&lt;p&gt;Actually, to be more accurate, there will always be many other variables in
this table along with &lt;code&gt;arr&lt;/code&gt;. All Crystal programs automatically include the
standard library, so Crystal also saves all of the top level variables from the
standard library in this table.&lt;/p&gt;
&lt;p&gt;In a more normal program, there will be many lexical scopes for different
method and class or module definitions, and &lt;code&gt;MainVisitor&lt;/code&gt; will save each
variable in the corresponding table.&lt;/p&gt;
&lt;h2&gt;Assigning Types&lt;/h2&gt;
&lt;p&gt;Probably the most important function of &lt;code&gt;MainVisitor&lt;/code&gt; is to assign a type to each
value in my program. The simplest example of this is when &lt;code&gt;MainVisitor&lt;/code&gt; visits a
&lt;code&gt;NumberLiteral&lt;/code&gt; node:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;300px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/visit-number-literal.svg&quot;&gt;
&lt;p&gt;Looking at the size of the numeric value, Crystal determines the type should be
&lt;code&gt;Int32&lt;/code&gt;. Crystal then saves this type right inside of the &lt;code&gt;NumberLiteral&lt;/code&gt; node:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;114px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/updated-number-literal.svg&quot;&gt;
&lt;p&gt;Strictly speaking, this violates the visitor pattern because the visitors
shouldn’t be modifying the data structure they visit. But the type of each
node, the type of each programming construct in my program, is really an
integral part of that node. In this case the &lt;code&gt;MainVisitor&lt;/code&gt; is really just
completing each node. It’s not changing the structure of the AST in this case…
although as we’ll see in a minute the &lt;code&gt;MainVisitor&lt;/code&gt; does this for other nodes!&lt;/p&gt;
&lt;h2&gt;Type Inference&lt;/h2&gt;
&lt;p&gt;Sometimes type values can’t be determined from the intrinsic value of an AST
node. Often the type of a node is determined by other nodes in the AST.&lt;/p&gt;
&lt;p&gt;Recall my example line of code is:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;arr &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Here Crystal automatically sets the type of the &lt;code&gt;arr&lt;/code&gt; variable to the type of the
array literal expression: &lt;code&gt;Array(Int32)&lt;/code&gt;. In Computer Science, this is known as
&lt;em&gt;type inference&lt;/em&gt;. Because Crystal can automatically determine the type of
&lt;code&gt;arr&lt;/code&gt;, I don’t need to declare it explicitly by writing something more
complicated like this:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;arr &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; uninitialized Array(Int32)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;arr &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Type inference allows me to write concise, clean code with fewer type
annotations. Most modern, statically typed languages such as Swift, Rust,
Julia, Kotlin, etc., use type inference in the same way as Crystal. Even newer
versions of Java or C++ use type inference.&lt;/p&gt;
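&lt;p&gt;You can ask Crystal what it inferred by using &lt;code&gt;typeof&lt;/code&gt;. This is just a small sketch to try out; the variable &lt;code&gt;n&lt;/code&gt; is one I added for illustration:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
# No type annotations anywhere; the compiler works the types out itself.
arr = [12345, 67890]
n = 12345

puts typeof(arr) # Array(Int32)
puts typeof(n)   # Int32&lt;/pre&gt;
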
&lt;p&gt;The Crystal compiler implements type inference when the &lt;code&gt;MainVisitor&lt;/code&gt; encounters
an &lt;code&gt;Assign&lt;/code&gt; AST node, which we saw above.&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;375px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/visit-assign1.svg&quot;&gt;
&lt;p&gt;After encountering the &lt;code&gt;Assign&lt;/code&gt; node, Crystal recursively processes one of the
two child nodes, the &lt;code&gt;ArrayLiteral&lt;/code&gt; value, and its child nodes. When this process
finishes, Crystal knows the type of the &lt;code&gt;ArrayLiteral&lt;/code&gt; node is &lt;code&gt;Array(Int32)&lt;/code&gt;:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;425px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/set-type.svg&quot;&gt;
&lt;p&gt;I’ll take a closer look at how Crystal processes the &lt;code&gt;ArrayLiteral&lt;/code&gt; node next.
But for now, once Crystal has the type of the &lt;code&gt;ArrayLiteral&lt;/code&gt; node it copies that
type over to the &lt;code&gt;Var&lt;/code&gt; node and sets its type also:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;425px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/set-type2.svg&quot;&gt;
&lt;p&gt;But Crystal does something else interesting here: It sets up a dependency
between the two AST nodes: it “binds” the variable to the value:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;325px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/bind.svg&quot;&gt;
&lt;p&gt;This binding dependency allows Crystal to later update the type of the &lt;code&gt;arr&lt;/code&gt;
variable whenever necessary. In this case the value &lt;code&gt;[12345, 67890]&lt;/code&gt; will always
have the same type, but I suspect that sometimes the Crystal compiler can
update types during semantic analysis. In this way, if the Crystal compiler ever
changes its mind about the type of some value, it can easily update the types of
any dependent values. I also suspect Crystal uses these type dependency
connections to produce error messages whenever you pass an incorrect type to
some function, for example. These are just guesses, however; if anyone from the
Crystal team knows exactly what these type bindings are used for let me know.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Update:&lt;/b&gt; Ary Borenszweig explained that sometimes the Crystal compiler
updates the type of variables based on how they are used. He posted an
interesting example on &lt;a href=&quot;https://forum.crystal-lang.org/t/visiting-an-abstract-syntax-tree/4304&quot;&gt;The Crystal Programming Language
Forum&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Expanding an Array Literal&lt;/h2&gt;
&lt;p&gt;So far we’ve seen Crystal set the type of the &lt;code&gt;NumberLiteral&lt;/code&gt; node to &lt;code&gt;Int32&lt;/code&gt;,
and we’ve seen Crystal assign &lt;code&gt;arr&lt;/code&gt; a type of &lt;code&gt;Array(Int32)&lt;/code&gt;. But how did Crystal
determine the type of the array literal &lt;code&gt;[12345, 67890]&lt;/code&gt;?&lt;/p&gt;
&lt;p&gt;This is where things get even more complicated. Sometimes during semantic
analysis the Crystal compiler completely rewrites parts of your code, replacing
it with something else. This happens even with my simple example. When visiting
the &lt;code&gt;ArrayLiteral&lt;/code&gt; node, the &lt;code&gt;MainVisitor&lt;/code&gt; expands this simple line of code into
something more complex:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;__temp_621 &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;::Array(typeof(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)).unsafe_build(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;__temp_622 &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; __temp_621.to_unsafe
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;__temp_622[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;0&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;] &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;__temp_622[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;] &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;__temp_621&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Reading this, you can see how later my compiled program will create the new
array. First Crystal creates an empty array with a capacity of 2, and an
element type of &lt;code&gt;Int32&lt;/code&gt;. &lt;code&gt;typeof(12345, 67890)&lt;/code&gt; returns the type (or multiple
types inside a union type) found in the given set of values, in this case just
&lt;code&gt;Int32&lt;/code&gt;. Later Crystal sets the two elements in the array just by assigning
them.&lt;/p&gt;
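&lt;p&gt;You can call &lt;code&gt;typeof&lt;/code&gt; with several values yourself to see this. The mixed example below, using a string, is my own addition and doesn’t appear in my program:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
# Both values are Int32, so the combined type is simply Int32.
puts typeof(12345, 67890) # Int32

# With values of different types, typeof yields a union of Int32 and String.
puts typeof(12345, &amp;quot;hello&amp;quot;)

# An array literal mixing the two gets that same union as its element type.
puts typeof([12345, &amp;quot;hello&amp;quot;])&lt;/pre&gt;
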
&lt;p&gt;Crystal achieves this by replacing part of my program’s AST with a new branch:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;375px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/expanded-ast.svg&quot;&gt;
&lt;p&gt;For clarity, I’m not drawing the AST nodes for the inner assign operations,
only the first line:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;__temp_621 &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;::Array(typeof(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)).unsafe_build(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;2&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)&lt;/span&gt;&lt;/pre&gt;

&lt;h2&gt;Putting It All Together&lt;/h2&gt;
&lt;p&gt;With this new, updated AST we can see exactly how Crystal determines the type
of my variable &lt;code&gt;arr&lt;/code&gt;. Starting at the root of my AST, &lt;code&gt;MainVisitor&lt;/code&gt; visits all of
the AST nodes in this order in a series of recursive calls:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;114px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/call-recurse.svg&quot;&gt;
&lt;p&gt;And it determines the types of each of these nodes as it returns from the
recursive calls:&lt;/p&gt;
&lt;img class=&quot;svg&quot; width=&quot;240px&quot; src=&quot;https://patshaughnessy.net/assets/2022/1/22/return-recurse.svg&quot;&gt;
&lt;p&gt;There are some interesting details here that I don’t completely understand or
don’t have space to explain:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;The &lt;code&gt;TypeOf&lt;/code&gt; node calculates a common union type using a type formula. In this
example, it just returns &lt;code&gt;Int32&lt;/code&gt; because both elements of my array, &lt;code&gt;12345&lt;/code&gt; and
&lt;code&gt;67890&lt;/code&gt;, are simple 32 bit integers.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;I believe the &lt;code&gt;Generic&lt;/code&gt; node refers to a Crystal generic class via the &lt;code&gt;Path&lt;/code&gt; node
shown above, in this example &lt;code&gt;Array(T)&lt;/code&gt;. When &lt;code&gt;MainVisitor&lt;/code&gt; processes the &lt;code&gt;Generic&lt;/code&gt;
node, it sets &lt;code&gt;T&lt;/code&gt; to the type &lt;code&gt;Int32&lt;/code&gt;, arriving at the type &lt;code&gt;Array(Int32).class&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The &lt;code&gt;Call&lt;/code&gt; node looks up the method my code is calling (&lt;code&gt;unsafe_build&lt;/code&gt;) and
uses the type from that method’s return value. I didn’t have time to explore
how method lookup works in Crystal, however, so I’m not sure about this.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;Scratching the Surface&lt;/h2&gt;
&lt;p&gt;Today we looked at a tiny piece of what the Crystal compiler can do. There are
many more types of AST nodes, each of which the &lt;code&gt;MainVisitor&lt;/code&gt; class handles
differently. And there are many different visitor classes also, beyond
&lt;code&gt;MainVisitor&lt;/code&gt;. When analyzing a more complex program Crystal has to understand
class and module definitions, instance and class variables, type annotations,
different lexical scopes, macros, and much, much more. Crystal will need all of
this information later, during the code generation phase, the next step that
follows semantic analysis.&lt;/p&gt;
&lt;p&gt;But I hope this article gave you a sense of what sort of work a compiler has to
do in order to understand your code. As you can see, for a statically typed
language like Crystal the compiler spends much of its time identifying all of
the types in your code, and determining which programming constructs or AST
nodes have which types.&lt;/p&gt;
&lt;p&gt;Next time I’ll look at code generation: Now that Crystal has identified the
variables, function calls and types in my code it is ready to generate the
machine language code needed to execute my program. To do that, it will
leverage the LLVM framework.&lt;/p&gt;
</content></entry><entry><title>Reading Code Like a Compiler</title><link href="https://patshaughnessy.net/2021/12/22/reading-code-like-a-compiler" rel="alternate"></link><id href="https://patshaughnessy.net/2021/12/22/reading-code-like-a-compiler" rel="alternate"></id><published>2021-12-22T00:00:00Z</published><updated>2021-12-22T00:00:00Z</updated><category>Crystal</category><author><name>Pat Shaughnessy</name></author><summary type="html">&lt;div style=&quot;float: left; padding: 8px 30px 0px 0px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2021/12/22/depth-of-field.jpg&quot;&gt;&lt;br/&gt;
  &lt;i&gt;Imagine trying to read an entire book while &lt;br/&gt;
  focusing on only one or two words at a time
  &lt;/i&gt;
&lt;/div&gt;
&lt;p&gt;We use compilers every day to parse our code, find our programming mistakes and
then help us fix them. But h</summary><content type="html">&lt;div style=&quot;float: left; padding: 8px 30px 0px 0px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2021/12/22/depth-of-field.jpg&quot;&gt;&lt;br/&gt;
  &lt;i&gt;Imagine trying to read an entire book while &lt;br/&gt;
  focusing on only one or two words at a time
  &lt;/i&gt;
&lt;/div&gt;
&lt;p&gt;We use compilers every day to parse our code, find our programming mistakes and
then help us fix them. But how do compilers read and understand our code? What
does our code look like to them?&lt;/p&gt;
&lt;p&gt;We tend to read code like we would read a human language like English. We
don’t see letters; we see words and phrases. And in a very natural way we use
what we just read, the preceding sentence or paragraph, to give us the context
we need to understand the following text. And sometimes we just skim over text
quickly to glean a bit of the meaning without even reading every word.&lt;/p&gt;
&lt;div style=&quot;clear: both&quot;&gt;&lt;/div&gt;
&lt;p&gt;Compilers aren’t as smart as we are. They can’t read and understand entire
phrases or sentences all at once. They read text one letter, one word at a
time, meticulously building up a record of what they have read so far.&lt;/p&gt;
&lt;p&gt;I was curious to learn more about how compilers parse text, but where should I
look? Which compiler should I study? Once again, like in my last few posts,
Crystal was the answer.&lt;/p&gt;
&lt;h2&gt;Crystal: A Compiler Accessible to Everyone&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://crystal-lang.org&quot;&gt;Crystal&lt;/a&gt; is a unique combination of simple, human
syntax inspired by Ruby, with the speed and robustness enabled by static types
and the use of &lt;a href=&quot;https://llvm.org&quot;&gt;LLVM&lt;/a&gt;. But for me the most exciting thing
about Crystal is how the Crystal team implemented both its standard library and
compiler using the target language: Crystal. This makes Crystal’s internal
implementation accessible to anyone familiar with Ruby. For once, you don’t
have to be a hard core C or C++ developer to learn how a compiler works.
Reading code not much more complex than a Ruby on Rails web site, I can take a
peek under the hood of a real world compiler to see how it works internally.&lt;/p&gt;
&lt;p&gt;Not only did the Crystal team implement their compiler using Crystal, they also
wrote its parser by hand. Parsing is such a tedious task that often developers use a
parser generator, such as &lt;a href=&quot;https://www.gnu.org/software/bison/&quot;&gt;GNU Bison&lt;/a&gt;, to
automatically generate the parse code given a set of rules. This is how Ruby
works, for example. But the Crystal team wrote their parser directly in
Crystal, which you can read in
&lt;a href=&quot;https://github.com/crystal-lang/crystal/blob/master/src/compiler/crystal/syntax/parser.cr&quot;&gt;parser.cr&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Along with a readable compiler, I need a readable program to compile. I decided to
reuse the same array snippet from my last post:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;arr &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;puts arr[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;This tiny Crystal program creates an array of two numbers and then prints out
the second number. Simple enough: You and I can read and parse these two lines
of code in one glance and in a fraction of a second understand what they do.
Even if you’re not a Crystal or Ruby developer this syntax is so simple you can
still understand it.&lt;/p&gt;
&lt;p&gt;But the Crystal compiler can’t understand this code as easily as we can.
Parsing even this simple program is a complex task for a compiler.&lt;/p&gt;
&lt;h2&gt;How the Crystal Compiler Sees My Code&lt;/h2&gt;
&lt;p&gt;Before parsing or running the code above, Crystal converts it into a series of
tokens. To the Crystal compiler, my program looks like this:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://patshaughnessy.net/assets/2021/12/22/tokens.png&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;The first &lt;code&gt;IDENT&lt;/code&gt; token corresponds to the &lt;code&gt;arr&lt;/code&gt; variable at the beginning of the
first line. You can also see two &lt;code&gt;NUMBER&lt;/code&gt; tokens: the &lt;a href=&quot;https://github.com/crystal-lang/crystal/blob/master/src/compiler/crystal/syntax/lexer.cr&quot;&gt;Crystal tokenizer
code&lt;/a&gt;
converted each series of numerical digits into single tokens, one for 12345 and
the other for 67890. Along with these tokens you can also see other tokens for
punctuation used in Crystal syntax, like the equals sign and left and right
square brackets. There is also a new line token and one for the end of the
entire file.&lt;/p&gt;
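&lt;p&gt;To make the idea of a token stream concrete, here is a minimal tokenizer sketch written in Ruby rather than Crystal. It is not how Crystal’s lexer.cr works: the &lt;code&gt;tokenize&lt;/code&gt; method and every token name other than &lt;code&gt;IDENT&lt;/code&gt; and &lt;code&gt;NUMBER&lt;/code&gt; are assumptions made up for this example, and it only understands the handful of characters my two line program uses.&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
```ruby
require "strscan"

# Hypothetical tokenizer for illustration only; the token names are
# assumptions and do not match Crystal's real lexer.
def tokenize(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    if scanner.scan(/[ \t]+/)
      tokens.push([:SPACE, scanner.matched])
    elsif scanner.scan(/\n/)
      tokens.push([:NEWLINE, scanner.matched])
    elsif scanner.scan(/\d+/)
      # A whole run of digits becomes one NUMBER token.
      tokens.push([:NUMBER, scanner.matched])
    elsif scanner.scan(/[a-z_]\w*/)
      tokens.push([:IDENT, scanner.matched])
    elsif scanner.scan(/[=\[\],]/)
      tokens.push([:OP, scanner.matched])
    else
      raise "unexpected character: #{scanner.getch}"
    end
  end
  # Mark the end of the file with one final token.
  tokens.push([:EOF, ""])
end

tokenize("arr = [12345, 67890]\nputs arr[1]").each do |kind, text|
  puts "#{kind} #{text.inspect}"
end
```
&lt;/pre&gt;
&lt;p&gt;Each pass through the loop tries one pattern after another at the current position, which is why a run of digits like 12345 ends up as a single token instead of five.&lt;/p&gt;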
&lt;h2&gt;Reading a Book One Word at a Time&lt;/h2&gt;
&lt;p&gt;To understand my code, Crystal processes these tokens one at a time, stepping
tediously through the entire program. What a slow, painful process!&lt;/p&gt;
&lt;p&gt;How would we read if we could only see one word at a time? I imagine covering
the book I’m trying to read with a piece of paper or plastic that had a small
hole in it… and that through the hole I could only see one word at a time. How
would I read one entire page? Well, I’d have to move the paper around, showing
one word and then another and another. And how would I know where to move the
paper next? Would I simply move the paper forward one word at a time? What if
I forgot some word I had seen earlier? I’d have to backtrack - but how far back
to go? What if the meaning of the word I was looking at depended on the words
that followed it? This sounds like a nightmare.&lt;/p&gt;
&lt;p&gt;To read like this, if it was even possible at all, I’d have to have a well
thought out strategy. I’d have to know exactly how to move that plastic screen
around. When you can only read one word at a time, deciding which word to read
next becomes incredibly important. I would need an algorithm to follow.&lt;/p&gt;
&lt;p&gt;This is what a parser algorithm is: Some set of rules the parse code can use to
interpret each word, and, equally important, to decide which word to read next.
Crystal’s parse code is over 6000 lines long, so I won’t attempt to completely
explain it here. But there’s an underlying, high level algorithm the parse code
uses:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://patshaughnessy.net/assets/2021/12/22/pattern-recurse-record.png&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;First, the parser compares the current token, and possibly the following or
previous tokens as well, to a series of expected patterns. These patterns
define the syntax the parser is reading.  Second, the parser recurses. It calls
itself to parse the next token, or possibly multiple next tokens depending on
which pattern the parser just matched. Finally, the parser records what it saw:
which pattern matched the current token and the results of the recursive calls
to itself, for future reference.&lt;/p&gt;
&lt;h2&gt;Matching a Pattern&lt;/h2&gt;
&lt;p&gt;The best way to understand how this works is to see it in action. Let’s follow
along with the Crystal compiler as it parses the code I showed above:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;arr &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;puts arr[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Recall Crystal already converted this code into a token stream:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://patshaughnessy.net/assets/2021/12/22/token-line.png&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;(To be more accurate, Crystal actually converts my code into tokens as it goes.
The parse code calls the tokenizer code each time it needs a new token. But
this timing isn’t really important.)&lt;/p&gt;
&lt;p&gt;As you might expect, Crystal starts with the first token, &lt;code&gt;IDENT&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://patshaughnessy.net/assets/2021/12/22/process-token1.png&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;What does this mean? How does Crystal interpret &lt;code&gt;arr&lt;/code&gt;? &lt;code&gt;IDENT&lt;/code&gt; is short for
identifier, but what role does this identifier play? What meaning does &lt;code&gt;arr&lt;/code&gt; have
in my code?&lt;/p&gt;
&lt;p&gt;To decide on the correct meaning, the Crystal parser compares the &lt;code&gt;IDENT&lt;/code&gt; token
with a series of patterns. For example Crystal looks for patterns like:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;a ternary expression &lt;code&gt;a ? b : c&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;a range &lt;code&gt;a..b&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;an expression using a binary operator, such as: &lt;code&gt;a + b&lt;/code&gt;, etc.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;and many more…&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It turns out none of these patterns apply in this case, and Crystal ends up
selecting a default pattern which handles the most common code pattern: a
function call. Crystal decides that when I wrote &lt;code&gt;arr&lt;/code&gt; I intended to call a
function called &lt;code&gt;arr&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;I often tell people I work with at my day job that I have a really bad memory.
And it’s true. I constantly have to google the syntax or return values of
functions. I often forget what some code means even just a month after I wrote
it. And the Crystal compiler is no better: As soon as it processes that &lt;code&gt;IDENT&lt;/code&gt;
token above, it has to write down what it decided that token meant or else it
would forget.&lt;/p&gt;
&lt;p&gt;To record the function call, Crystal creates an object:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://patshaughnessy.net/assets/2021/12/22/ast1.png&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;As we’ll see in a moment, Crystal builds up a tree of these objects, called an
&lt;a href=&quot;https://en.wikipedia.org/wiki/Abstract_syntax_tree&quot;&gt;Abstract Syntax
Tree&lt;/a&gt; (AST). The AST will
later serve as a record of the syntactic structure of my code.&lt;/p&gt;
&lt;h2&gt;Recursively Calling Itself&lt;/h2&gt;
&lt;p&gt;Parsing is inherently a recursive process. Unlike English text, Crystal
expressions can be nested one inside another to any depth. Although I suppose
English grammar is somewhat recursive and can be nested to some degree. I
wonder if the grammars for some other human languages are more recursive than
English? Interesting question.&lt;/p&gt;
&lt;p&gt;For parsing a programming language like Crystal, the simplest thing for the
parser code to do is recursively call itself. And it does this based on the
pattern it just matched. For example, if Crystal had parsed a plus sign, it
would need to recursively call itself to parse the values that appeared before
and after the plus.&lt;/p&gt;
&lt;p&gt;In this example, Crystal has to decide what arguments to pass to this call to
the &lt;code&gt;arr&lt;/code&gt; function. Did I write &lt;code&gt;arr(1, 2, 3)&lt;/code&gt; or just &lt;code&gt;arr&lt;/code&gt;? Or &lt;code&gt;arr()&lt;/code&gt;? What were
the values 1, 2 and 3? Each of these could be a complex expression in its own
right, maybe appearing inside of parentheses, a compound value like an array or
maybe yet another function call.&lt;/p&gt;
&lt;p&gt;To find the arguments of the function call, inside the recursive call to the
parse code Crystal proceeds forward to process the next two tokens:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://patshaughnessy.net/assets/2021/12/22/process-token2.png&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;Crystal skips over the space, and then encounters the equals sign. Suddenly
Crystal realizes it was wrong! The &lt;code&gt;arr&lt;/code&gt; identifier wasn’t a reference to a
function at all; it was a variable being assigned a value. Yes, sometimes compilers change
their minds while reading, just like we do!&lt;/p&gt;
&lt;h2&gt;Recording an AST Node&lt;/h2&gt;
&lt;p&gt;To record this new, revised syntax, Crystal changes the &lt;code&gt;Call&lt;/code&gt; AST node it
created earlier to an &lt;code&gt;Assign&lt;/code&gt; AST node, and creates a new &lt;code&gt;Var&lt;/code&gt; AST node to
record the variable being assigned to:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://patshaughnessy.net/assets/2021/12/22/ast2.png&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;Now the AST is starting to resemble a tree. Because of the recursive nature of
the parse algorithm, this tree structure is an ideal way to record what the
compiler has parsed so far. Trees are recursive too: Each branch is a tree in
its own right.&lt;/p&gt;
&lt;h2&gt;Rinse and Repeat&lt;/h2&gt;
&lt;p&gt;But what value should Crystal assign to that variable? What should appear in
the AST as the value attribute of the &lt;code&gt;Assign&lt;/code&gt; node?&lt;/p&gt;
&lt;p&gt;To find out, the Crystal compiler recursively calls the same parsing algorithm
again, but starting with the &lt;code&gt;[&lt;/code&gt; token:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://patshaughnessy.net/assets/2021/12/22/process-token3.png&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;Following the pattern match, recurse and record process, the Crystal compiler
once again matches the new token, &lt;code&gt;[&lt;/code&gt;, with a series of expected patterns. This
time, Crystal decides that the left bracket is the start of an array literal
expression and records a new AST node:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://patshaughnessy.net/assets/2021/12/22/array-literal1.png&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;But before inserting it into the syntax tree, Crystal recursively calls itself
to parse each of the values that appear in the array. The array literal pattern
expects a series of values separated by commas, so Crystal proceeds
to process the following tokens, looking for those values:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://patshaughnessy.net/assets/2021/12/22/process-token4.png&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;After encountering the comma, Crystal recursively calls the same parse code
again on the previous token or tokens that appeared before the comma, because
the array value before the comma could be another expression of arbitrary depth
and complexity. In this example, Crystal finds a simple numeric array element,
and creates a new AST node to represent the numeric value:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://patshaughnessy.net/assets/2021/12/22/number-literal1.png&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;After reading the comma, Crystal calls its parser recursively again, and finds
the second number:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://patshaughnessy.net/assets/2021/12/22/number-literal2.png&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;Remember Crystal has a bad memory. With all these new AST nodes, Crystal will
quickly forget what they mean. Fortunately, Crystal reads in the right square
bracket and realizes I ended the array literal in my code:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://patshaughnessy.net/assets/2021/12/22/process-token5.png&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;Now those recursive calls to the parse code return, and Crystal assembles these
new AST nodes:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://patshaughnessy.net/assets/2021/12/22/array-literal2.png&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;…and then places them inside the larger, surrounding AST:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://patshaughnessy.net/assets/2021/12/22/ast3.png&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;After this, these recursive calls return and the Crystal compiler moves on to
parse the second line of my program.&lt;/p&gt;
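&lt;p&gt;The walk above can be sketched in a few dozen lines of Ruby. This is a simplified illustration of the match, recurse and record idea, not a copy of Crystal’s parser.cr: the &lt;code&gt;TinyParser&lt;/code&gt; class, the &lt;code&gt;Struct&lt;/code&gt; node types and the token names are all hypothetical, and the token list is typed in by hand with the spaces already removed.&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
```ruby
# Hypothetical node types, loosely named after the AST nodes in the diagrams.
Var = Struct.new(:name)
NumberLiteral = Struct.new(:number)
ArrayLiteral = Struct.new(:elements)
Assign = Struct.new(:target, :value)
Call = Struct.new(:name, :args)

class TinyParser
  def initialize(tokens)
    @tokens = tokens # array of [kind, text] pairs ending with an EOF token
    @pos = 0
  end

  def parse_expression
    kind, text = @tokens[@pos]
    case kind # match the current token against the known patterns
    when :NUMBER
      @pos += 1
      NumberLiteral.new(text.to_i)
    when :LBRACKET
      parse_array_literal
    when :IDENT
      @pos += 1
      node = Call.new(text, []) # default guess: a function call
      if peek_kind == :EQUALS   # change of mind: it is an assignment
        @pos += 1
        node = Assign.new(Var.new(text), parse_expression) # recurse
      end
      node
    else
      raise "unexpected token #{kind}"
    end
  end

  def parse_array_literal
    @pos += 1 # consume the left bracket
    elements = []
    until peek_kind == :RBRACKET
      elements.push(parse_expression) # recurse once per element
      @pos += 1 if peek_kind == :COMMA
    end
    @pos += 1 # consume the right bracket
    ArrayLiteral.new(elements) # record what was parsed
  end

  def peek_kind
    @tokens[@pos][0]
  end
end

tokens = [
  [:IDENT, "arr"], [:EQUALS, "="], [:LBRACKET, "["],
  [:NUMBER, "12345"], [:COMMA, ","], [:NUMBER, "67890"],
  [:RBRACKET, "]"], [:EOF, ""]
]
ast = TinyParser.new(tokens).parse_expression
p ast
```
&lt;/pre&gt;
&lt;p&gt;In this sketch the &lt;code&gt;Call&lt;/code&gt; node created for &lt;code&gt;arr&lt;/code&gt; is replaced as soon as the equals sign shows up, which is the same change of mind described above.&lt;/p&gt;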
&lt;h2&gt;A Complete Abstract Syntax Tree&lt;/h2&gt;
&lt;p&gt;After following the Crystal parser for a while, I added some debug logging code
to the compiler so I could see the result. Here’s my example code again:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;arr &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;puts arr[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;And here’s the complete AST the Crystal compiler generated after parsing my
code. My debug logging indented each line to indicate the AST structure:&lt;/p&gt;
&lt;pre type=&quot;console&quot;&gt;&amp;lt;Crystal::Expressions exp_count=3 &gt;
  &amp;lt;Crystal::Require string=prelude &gt;
  &amp;lt;Crystal::Assign target=Crystal::Var value=Crystal::ArrayLiteral &gt;
    &amp;lt;Crystal::Var name=arr &gt;
    &amp;lt;Crystal::ArrayLiteral element_count=2 of=Nil name=Nil &gt;
      &amp;lt;Crystal::NumberLiteral number=12345 kind=i32 &gt;
      &amp;lt;Crystal::NumberLiteral number=67890 kind=i32 &gt;
  &amp;lt;Crystal::Call obj= name=puts arg_count=1 &gt;
    &amp;lt;Crystal::Call obj=arr name=[] arg_count=1 &gt;
      &amp;lt;Crystal::Var name=arr &gt;
      &amp;lt;Crystal::NumberLiteral number=1 kind=i32 &gt;&lt;/pre&gt;
&lt;p&gt;Each of these values is an instance of a subclass of the &lt;code&gt;Crystal::ASTNode&lt;/code&gt; superclass.
Crystal defines all of these in the
&lt;a href=&quot;https://github.com/crystal-lang/crystal/blob/master/src/compiler/crystal/syntax/ast.cr&quot;&gt;ast.cr&lt;/a&gt;
file. Some interesting details to note:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;The top level node is called &lt;code&gt;Expressions&lt;/code&gt;, and more or less holds one
expression per line of code.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The second node, the first child node of &lt;code&gt;Expressions&lt;/code&gt;, is called &lt;code&gt;Require&lt;/code&gt;.
The surprise here is that I didn’t even put a &lt;code&gt;require&lt;/code&gt; keyword in my
program! Crystal silently inserts &lt;code&gt;require prelude&lt;/code&gt; at the beginning of
all Crystal programs. The “prelude” is the Crystal standard library, the code
that defines &lt;code&gt;Array&lt;/code&gt;, &lt;code&gt;String&lt;/code&gt; and many other core classes. Reading the AST allows
us to see how the Crystal compiler does this automatically.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;The third node and its children are the nodes we saw Crystal create above for
my first line of code, the array literal and the variable it is assigned to.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Finally, the last branch of the tree shows the call to &lt;code&gt;puts&lt;/code&gt;. This time
Crystal’s default guess about identifiers being function calls was correct.
Another interesting detail here is that the inner call to the &lt;code&gt;[]&lt;/code&gt; function
was not generated by an identifier, but by the &lt;code&gt;[&lt;/code&gt; token. This was one of the
patterns the Crystal parser checked for after one of the recursive parse
calls.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
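&lt;p&gt;An indented listing like the one above falls out naturally from walking the tree recursively. Here is a rough Ruby sketch of that idea. It is not the debug logging added to the compiler: the &lt;code&gt;children&lt;/code&gt; and &lt;code&gt;dump&lt;/code&gt; methods and the &lt;code&gt;Struct&lt;/code&gt; stand-ins for a few node classes are assumptions made for this example.&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
```ruby
# Hypothetical stand-ins for a few Crystal::ASTNode subclasses.
Var = Struct.new(:name)
NumberLiteral = Struct.new(:number)
ArrayLiteral = Struct.new(:elements)
Assign = Struct.new(:target, :value)

# Returns the child nodes of a node; leaf nodes have none.
def children(node)
  case node
  when Assign then [node.target, node.value]
  when ArrayLiteral then node.elements
  else []
  end
end

# Builds one line per node, indenting two spaces per level of depth.
def dump(node, depth = 0, lines = [])
  indent = "  " * depth
  lines.push(indent + node.class.name)
  children(node).each { |child| dump(child, depth + 1, lines) }
  lines
end

ast = Assign.new(
  Var.new("arr"),
  ArrayLiteral.new([NumberLiteral.new(12345), NumberLiteral.new(67890)])
)
puts dump(ast)
```
&lt;/pre&gt;
&lt;p&gt;Because every branch of the tree is a tree in its own right, the same method works at every depth.&lt;/p&gt;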
&lt;h2&gt;Next Time&lt;/h2&gt;
&lt;p&gt;What’s the point of all of this? What does the Crystal compiler do next with
the AST? This tree structure is a fantastic summary of how Crystal parsed my
code, and, as we’ll see later, also provides a convenient way for Crystal
to process my code and transform it in different ways.&lt;/p&gt;
&lt;p&gt;When I have time, I plan to write a few more posts about more of the inner
workings of the Crystal compiler and the LLVM framework, which Crystal later
uses to generate my x86 executable program.&lt;/p&gt;
</content></entry><entry><title>Find Your Language’s Primitives</title><link href="https://patshaughnessy.net/2021/11/29/find-your-languages-primitives" rel="alternate"></link><id href="https://patshaughnessy.net/2021/11/29/find-your-languages-primitives" rel="alternate"></id><published>2021-11-29T00:00:00Z</published><updated>2021-11-29T00:00:00Z</updated><category>Crystal</category><author><name>Pat Shaughnessy</name></author><summary type="html">&lt;div style=&quot;float: right; padding: 8px 0px 30px 30px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2021/11/29/dig1.jpg&quot;&gt;&lt;br/&gt;
  &lt;i&gt;If you dig into your programming language's syntax, you might &lt;br/&gt;discover that it is capable of much more than you thought it was.
  &lt;/i&gt;
&lt;/div&gt;
&lt;p&gt;Wikipedia defines “Language Primitive” &lt;a href=&quot;https://en.wikipedia.org/wi</summary><content type="html">&lt;div style=&quot;float: right; padding: 8px 0px 30px 30px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2021/11/29/dig1.jpg&quot;&gt;&lt;br/&gt;
  &lt;i&gt;If you dig into your programming language's syntax, you might &lt;br/&gt;discover that it is capable of much more than you thought it was.
  &lt;/i&gt;
&lt;/div&gt;
&lt;p&gt;Wikipedia defines “Language Primitive” &lt;a href=&quot;https://en.wikipedia.org/wiki/Language_primitive&quot;&gt;this
way&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
In computing, language primitives are the simplest elements available in a
programming language. A primitive is the smallest 'unit of processing'
available to a programmer of a given machine, or can be an atomic element of an
expression in a language.
&lt;/blockquote&gt;
&lt;p&gt;By looking at a language’s primitives, we can learn what kind of code will be
easy to write or impossible to express, and what types of problems the language
was intended to solve. Whether you’ve been using a language for years, or are just
now learning a new language for fun, take the time to find and learn about your
language’s primitives. You might discover something you never knew, and will
come away with a deeper understanding of how your programs work.&lt;/p&gt;
&lt;p&gt;As an example today, I’m going to look at how arrays work in three languages:
Ruby, Crystal and x86 Assembly Language.&lt;/p&gt;
&lt;h2&gt;Retrieving an Array Element In Ruby&lt;/h2&gt;
&lt;p&gt;In Ruby I can create an array and later access an element like this:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;arr &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;puts arr[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;This code would be the same or almost the same in many other programming
languages. It just means: “find the second element of the array and print it to
stdout.”&lt;/p&gt;
&lt;p&gt;But how does this actually work? In Ruby, the &lt;span class=&quot;code&quot;&gt;Array&lt;/span&gt;
class and all of its methods are language primitives. This means array methods
like &lt;span class=&quot;code&quot;&gt;[]&lt;/span&gt; or &lt;span class=&quot;code&quot;&gt;[]=&lt;/span&gt; cannot be
broken down into smaller pieces of Ruby code. As Wikipedia says, these methods
are the smallest unit of processing available to Ruby programmers working with
arrays.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://patshaughnessy.net/assets/2021/11/29/primitive1.png&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;Ruby hides the details of how arrays actually work from us. To learn how Ruby
actually saves and retrieves values from an array, we would need to switch
languages and drop down a level of abstraction, and read the C implementation
in the Ruby source code:
&lt;a href=&quot;https://github.com/ruby/ruby/blob/master/array.c&quot;&gt;array.c&lt;/a&gt;. There’s nothing
wrong with this, of course. Ruby developers use arrays every day without any
trouble. But switching from Ruby to C makes understanding internal details much
more difficult.&lt;/p&gt;
&lt;h2&gt;Retrieving an Array Element In Crystal&lt;/h2&gt;
&lt;p&gt;This Fall I decided to learn more about &lt;a href=&quot;https://crystal-lang.org&quot;&gt;Crystal&lt;/a&gt;, a
statically typed language with syntax that resembles Ruby. I expected to find a
similar &lt;span class=&quot;code&quot;&gt;Array#[]&lt;/span&gt; primitive.  But surprisingly, I was
wrong!&lt;/p&gt;
&lt;p&gt;The same code from above also works in Crystal:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;arr &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;12345&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;67890&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;puts arr[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;In Crystal, arrays are not language primitives because the Crystal standard
library implements arrays using Crystal itself. The &lt;span
class=&quot;code&quot;&gt;Array#[]&lt;/span&gt; method is not the smallest unit of processing
available to Crystal programmers. Let’s dig into the details and divide up the
&lt;span class=&quot;code&quot;&gt;[]&lt;/span&gt; method into smaller and smaller pieces to see how
the Crystal team implemented it.&lt;/p&gt;
&lt;p&gt;Reading
&lt;a href=&quot;https://github.com/crystal-lang/crystal/blob/master/src/indexable.cr#L56&quot;&gt;src/indexable.cr&lt;/a&gt;
in the Crystal standard library, here’s the implementation of &lt;span class=&quot;code&quot;&gt;Indexable#[]&lt;/span&gt;
which the array class uses when I call &lt;span class=&quot;code&quot;&gt;arr[1]&lt;/span&gt; above:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a7adba;&quot;&gt;# Returns the element at the given *index*.
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# Negative indices can be used to start counting from the end of the array.
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# Raises `IndexError` if trying to access an element outside the array&amp;#39;s range.
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# ```
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# ary = [&amp;#39;a&amp;#39;, &amp;#39;b&amp;#39;, &amp;#39;c&amp;#39;]
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# ary[0]  # =&amp;gt; &amp;#39;a&amp;#39;
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# ary[2]  # =&amp;gt; &amp;#39;c&amp;#39;
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# ary[-1] # =&amp;gt; &amp;#39;c&amp;#39;
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# ary[-2] # =&amp;gt; &amp;#39;b&amp;#39;
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# ary[3]  # raises IndexError
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# ary[-4] # raises IndexError
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# ```
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;@[AlwaysInline]
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;[]&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(index : Int)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  fetch(index) { &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;raise &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;IndexError&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;.&lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;new &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;}
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;The Crystal team implemented &lt;span class=&quot;code&quot;&gt;[]&lt;/span&gt; using another method
called &lt;span class=&quot;code&quot;&gt;fetch&lt;/span&gt;:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a7adba;&quot;&gt;# Returns the element at the given *index*, if in bounds,
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# otherwise executes the given block with the index and returns its value.
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# ```
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# a = [:foo, :bar]
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# a.fetch(0) { :default_value }    # =&amp;gt; :foo
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# a.fetch(2) { :default_value }    # =&amp;gt; :default_value
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# a.fetch(2) { |index| index * 3 } # =&amp;gt; 6
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# ```
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;fetch&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(index : Int)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  index &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; check_index_out_of_bounds(index) &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;do
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;return yield &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;index
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  unsafe_fetch(index)
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Neither the &lt;span class=&quot;code&quot;&gt;[]&lt;/span&gt; operator nor the &lt;span
class=&quot;code&quot;&gt;fetch&lt;/span&gt; method is a language primitive. To find a language
primitive, I need to keep dividing the code up into smaller and smaller pieces,
until it can’t be divided any further. It’s the same process a chemist would use to
break up some material into smaller and smaller molecules until they are left
with a set of atoms.&lt;/p&gt;
&lt;p&gt;Let’s continue by reading &lt;span class=&quot;code&quot;&gt;unsafe_fetch&lt;/span&gt;:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a7adba;&quot;&gt;# Returns the element at the given *index*, without doing any bounds check.
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# `Indexable` makes sure to invoke this method with *index* in `0...size`,
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# so converting negative indices to positive ones is not needed here.
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# Clients never invoke this method directly. Instead, they access
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# elements with `#[](index)` and `#[]?(index)`.
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# This method should only be directly invoked if you are absolutely
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# sure the index is in bounds, to avoid a bounds check for a small boost
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# of performance.
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;abstract &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;unsafe_fetch&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(index : Int)&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Since &lt;span class=&quot;code&quot;&gt;Indexable#unsafe_fetch&lt;/span&gt; is an abstract method, I
need to read how the &lt;span class=&quot;code&quot;&gt;Array&lt;/span&gt; class implements it back
in
&lt;a href=&quot;https://github.com/crystal-lang/crystal/blob/master/src/array.cr#L663&quot;&gt;src/array.cr&lt;/a&gt;:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;@[AlwaysInline]
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;unsafe_fetch&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(index : Int) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;: &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;T
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;@&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;buffer[index]
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;
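&lt;p&gt;The three methods read so far fit together roughly like this. The sketch below is plain Ruby, not the Crystal source: &lt;span class=&quot;code&quot;&gt;TinyArray&lt;/span&gt; is a hypothetical class, an ordinary Ruby array stands in for the &lt;span class=&quot;code&quot;&gt;@buffer&lt;/span&gt; instance variable, and the bounds check is written inline as an assumption about what &lt;span class=&quot;code&quot;&gt;check_index_out_of_bounds&lt;/span&gt; does.&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
```ruby
# Hypothetical class for illustration; not Crystal's Array or Indexable.
class TinyArray
  def initialize(*elements)
    @buffer = elements # a Ruby array standing in for the real buffer
  end

  def size
    @buffer.size
  end

  # Level 1: raise when the index is out of bounds.
  def [](index)
    fetch(index) { raise IndexError.new("index out of bounds") }
  end

  # Level 2: count negative indices from the end, then check the bounds.
  # Out of bounds indices are handed to the block instead.
  def fetch(index)
    normalized = index.negative? ? index + size : index
    return yield(index) unless normalized.between?(0, size - 1)
    unsafe_fetch(normalized)
  end

  # Level 3: no checks at all, just read from the buffer.
  def unsafe_fetch(index)
    @buffer[index]
  end
end

arr = TinyArray.new(12345, 67890)
puts arr[1] # prints 67890
```
&lt;/pre&gt;
&lt;p&gt;Each level does one small job and hands the rest to the level below it, so the only method that touches the stored elements is &lt;span class=&quot;code&quot;&gt;unsafe_fetch&lt;/span&gt;.&lt;/p&gt;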

&lt;div style=&quot;float: right; padding: 8px 0px 30px 30px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2021/11/29/dig2.jpg&quot;&gt;&lt;br/&gt;
	&lt;i&gt;&lt;small&gt;(source: &lt;a href=&quot;https://commons.wikimedia.org/wiki/File:Digging_in_permafrost.jpg&quot;&gt;Nick Bonzey via Wikimedia Commons&lt;/a&gt;)&lt;/small&gt;&lt;/i&gt;
&lt;/div&gt;
&lt;p&gt;So far, I’ve drilled down through 3 levels of Crystal implementation:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://patshaughnessy.net/assets/2021/11/29/primitive2.png&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;But I haven’t found a primitive function yet. Let’s keep digging!&lt;/p&gt;
&lt;h2&gt;The Crystal Array Class&lt;/h2&gt;
&lt;p&gt;To learn more, I need to scroll up and read the beginning of the Crystal &lt;span
class=&quot;code&quot;&gt;Array&lt;/span&gt; class definition:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a7adba;&quot;&gt;# An `Array` is an ordered, integer-indexed collection of objects of type T.
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# Array indexing starts at 0. A negative index is assumed to be
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# relative to the end of the array: -1 indicates the last element,
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# -2 is the next to last element, and so on.&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;etc...&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a7adba;&quot;&gt;# An `Array` is implemented using an internal buffer of some capacity
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# and is reallocated when elements are pushed to it when more capacity
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# is needed. This is normally known as a [dynamic array](http://en.wikipedia.org/wiki/Dynamic_array).
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;class &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Array&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(T)&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;etc...&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a7adba;&quot;&gt;# The buffer where elements start.
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;@&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;buffer &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;: &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Pointer(T)&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;I’ve deleted some of the comments and code for clarity. You can read the full,
original version in &lt;a href=&quot;https://github.com/crystal-lang/crystal/blob/master/src/array.cr&quot;&gt;src/array.cr&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Now I get a sense of how the &lt;span class=&quot;code&quot;&gt;unsafe_fetch&lt;/span&gt; method
above works. Here it is again:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;unsafe_fetch&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(index : Int) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;: &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;T
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;@&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;buffer[index]
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Crystal saves all of the elements in each array into a memory buffer called
&lt;span class=&quot;code&quot;&gt;@buffer&lt;/span&gt;. And when I access an element of the array
like this:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;puts arr[&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;]&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Crystal first checks that the array index (1 in this example) is valid, and
then loads the array element I want from the buffer using: &lt;span
class=&quot;code&quot;&gt;@buffer[1]&lt;/span&gt;.&lt;/p&gt;
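&lt;p&gt;To make those two steps concrete, here is a minimal sketch of my own. It is
not the actual code in src/array.cr, which also handles negative indexes; it
just spells out the check and the read by hand, using the public &lt;span
class=&quot;code&quot;&gt;unsafe_fetch&lt;/span&gt; method shown above:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
arr = [12345, 67890]
index = 1

# Step 1: check that the index is inside the array
# (simplified: this sketch ignores negative indexes).
if 0 &amp;lt;= index &amp;lt; arr.size
  # Step 2: read the element straight from the buffer, with no further checks.
  puts arr.unsafe_fetch(index)
else
  raise IndexError.new
end&lt;/pre&gt;
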
&lt;h2&gt;The Crystal Pointer Class&lt;/h2&gt;
&lt;p&gt;But how does &lt;span class=&quot;code&quot;&gt;@buffer[index]&lt;/span&gt; actually work? I haven’t learned anything yet! I’m
just going around in circles. So far all I’ve been able to find is that Crystal
implements &lt;span class=&quot;code&quot;&gt;Array#[]&lt;/span&gt; with a different &lt;span class=&quot;code&quot;&gt;[]&lt;/span&gt; operator, on a different class. What
type of object is &lt;span class=&quot;code&quot;&gt;@buffer&lt;/span&gt;? What does it do?&lt;/p&gt;
&lt;p&gt;Reading the array class declaration again more carefully, I can see that
&lt;span class=&quot;code&quot;&gt;@buffer&lt;/span&gt; is an instance of the &lt;span class=&quot;code&quot;&gt;Pointer&lt;/span&gt; class:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a7adba;&quot;&gt;# The buffer where elements start.
&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;@&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;buffer &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;: &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Pointer(T)&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Let’s read how Crystal implements &lt;span class=&quot;code&quot;&gt;Pointer#[]&lt;/span&gt; in
&lt;a href=&quot;https://github.com/crystal-lang/crystal/blob/master/src/pointer.cr#L107&quot;&gt;src/pointer.cr&lt;/a&gt;:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a7adba;&quot;&gt;# Gets the value pointed at this pointer&amp;#39;s address plus `offset * sizeof(T)`.
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# ```
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# ptr = Pointer.malloc(4) { |i| i + 10 }
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# ptr[0] # =&amp;gt; 10
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# ptr[1] # =&amp;gt; 11
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# ptr[2] # =&amp;gt; 12
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# ptr[3] # =&amp;gt; 13
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# ```
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;[]&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(offset)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  (self &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;+&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; offset).value
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Reading this, I discovered the Crystal team has written a class that represents
pointers! Just like using a pointer in C, Crystal code can refer to and access
any memory location directly. Because the &lt;span class=&quot;code&quot;&gt;Pointer&lt;/span&gt;
class is part of the language, Crystal allows us to implement our own data
structures and algorithms in a very detailed manner, allocating and accessing
memory just like the Crystal team has while implementing arrays, hashes and
other classes.&lt;/p&gt;
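&lt;p&gt;As a small illustration of what that looks like in practice, here is a
sketch of my own that allocates a buffer and reads it back. It uses the methods
quoted in this article plus &lt;span class=&quot;code&quot;&gt;Pointer#[]=&lt;/span&gt;; the variable
name is made up for the example:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
# Allocate room for three Int32 values.
buffer = Pointer(Int32).malloc(3)
buffer[0] = 10
buffer[1] = 20
buffer[2] = 30

puts buffer[1]           # element access through Pointer#[]
puts((buffer + 1).value) # the same read, written out the way Pointer#[] does it&lt;/pre&gt;

&lt;p&gt;Both lines read the same memory location, so they print the same value.&lt;/p&gt;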
&lt;p&gt;Now I’ve dug down through 4 levels of Crystal function calls, but I still
haven’t found a language primitive yet.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://patshaughnessy.net/assets/2021/11/29/primitive3.png&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;h2&gt;A Crystal Language Primitive&lt;/h2&gt;
&lt;p&gt;We still haven’t discovered how arrays actually work - how pointers actually
work. That is, reading this line of code above:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;(self &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;+&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; offset).value&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;…I understand the meaning and intent of pointer arithmetic, and how it’s used
by Crystal arrays, but I still don’t see how or where Crystal actually obtains
the array element referenced by a given pointer.&lt;/p&gt;
&lt;p&gt;Digging deeper, let’s read the &lt;span class=&quot;code&quot;&gt;Pointer#value&lt;/span&gt; method
to find out - this method’s implementation should tell me exactly how Crystal
obtains the value:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a7adba;&quot;&gt;# Gets the value pointed by this pointer.
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;#
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# ```
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# ptr = Pointer(Int32).malloc(4)
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# ptr.value = 42
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# ptr.value # =&amp;gt; 42
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# ```
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;@[Primitive(&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;:pointer_get&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)]
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;value&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; : T
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;As you can see, this is a language primitive - it says “primitive” right there
above the method definition!&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://patshaughnessy.net/assets/2021/11/29/primitive4.png&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;Returning to Wikipedia’s definition of a language primitive, this is an atomic
element of an expression. The Crystal compiler knows not to try to compile this
code but to assume this behavior is part of the language. In fact, there is no
implementation for &lt;span class=&quot;code&quot;&gt;Pointer#value&lt;/span&gt; here at all: the
method is empty!&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;value&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; : T
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;The empty &lt;span class=&quot;code&quot;&gt;value&lt;/span&gt; method above doesn’t tell us where
the value actually comes from, or how Crystal obtains it. To learn that, we
need to step down one level of abstraction - we need to use a lower level
language, not Crystal.&lt;/p&gt;
&lt;h2&gt;Retrieving an Array Element In x86 Assembly Language&lt;/h2&gt;
&lt;div style=&quot;float: right; padding: 18px 0px 30px 30px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2021/11/29/cave.png&quot;&gt;&lt;br/&gt;
&lt;/div&gt;
&lt;p&gt;What lower level language should we use? Since the Crystal team used the &lt;a href=&quot;https://llvm.org&quot;&gt;Low
Level Virtual Machine (LLVM)&lt;/a&gt; project to implement their
compiler, I could look at LLVM’s low level instruction language. But since I’m
not familiar with that, or with how the Crystal compiler works, I decided to
jump down to the lowest level of abstraction available to me on my Intel Mac:
x86 Assembly Language.&lt;/p&gt;
&lt;p&gt;Here’s my Crystal program again:&lt;/p&gt;
&lt;pre type=&quot;console&quot;&gt;$ cat array_example.cr
arr = [12345, 67890]
puts arr[1]&lt;/pre&gt;
&lt;p&gt;If I compile the program without running it, and then use the &lt;span
class=&quot;code&quot;&gt;llvm-objdump&lt;/span&gt; command, LLVM will give me a version of my
code converted into Intel x86 Assembly Language:&lt;/p&gt;
&lt;pre type=&quot;console&quot;&gt;$ crystal build array_example.cr
$ llvm-objdump -D array_example &gt; array_example.a&lt;/pre&gt;
&lt;p&gt;Now by reading the assembly produced by the Crystal compiler and LLVM, I can
see how the &lt;span class=&quot;code&quot;&gt;Pointer#[]&lt;/span&gt; and &lt;span
class=&quot;code&quot;&gt;Pointer#value&lt;/span&gt; methods actually work:&lt;/p&gt;
&lt;pre type=&quot;console&quot;&gt;0000000100089bb0 &amp;lt;_*Pointer(Int32)@Pointer(T)#[]&amp;lt;Int32&amp;gt;:Int32&amp;gt;:
100089bb0: 50                           pushq %rax
100089bb1: e8 0a 00 00 00               callq 0x100089bc0 &amp;lt;_*Pointer(Int32)@Pointer(T)#+&amp;lt;Int32&amp;gt;:Pointer(Int32)&amp;gt;
100089bb6: 8b 00                        movl  (%rax), %eax
100089bb8: 59                           popq  %rcx
100089bb9: c3                           retq

0000000100089bc0 &amp;lt;_*Pointer(Int32)@Pointer(T)#+&amp;lt;Int32&amp;gt;:Pointer(Int32)&amp;gt;:
100089bc0: 48 63 c6                     movslq  %esi, %rax
100089bc3: 48 c1 e0 02                  shlq  $2, %rax
100089bc7: 48 01 c7                     addq  %rax, %rdi
100089bca: 48 89 f8                     movq  %rdi, %rax
100089bcd: c3                           retq&lt;/pre&gt;
&lt;p&gt;Assembly language is just another programming language, but with a different
set of primitives. The primitives in this language are hardware instructions
that my laptop’s CPU can understand and execute directly:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://patshaughnessy.net/assets/2021/11/29/primitive5.png&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;I won’t pretend to understand all the details here, but if you’re curious about
what this code does - how my compiled Crystal program actually retrieves a
value from an array - here are a few highlights you can look for:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://patshaughnessy.net/assets/2021/11/29/assembly-table.svg&quot;&gt;&lt;br/&gt;&lt;/p&gt;
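&lt;p&gt;One detail in the listing can be checked from Crystal itself. In &lt;span
class=&quot;code&quot;&gt;Pointer#+&lt;/span&gt;, the &lt;span class=&quot;code&quot;&gt;shlq $2&lt;/span&gt; instruction
shifts the offset left by two bits, multiplying it by 4, and &lt;span
class=&quot;code&quot;&gt;addq&lt;/span&gt; then adds the result to the pointer’s address. Four is
the size of an &lt;span class=&quot;code&quot;&gt;Int32&lt;/span&gt; in bytes. Here is a minimal sketch
of my own, with made-up variable names, that shows the same arithmetic from the
Crystal side:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
ptr = Pointer(Int32).malloc(2)
next_ptr = ptr + 1

puts sizeof(Int32)                  # bytes per element
puts next_ptr.address - ptr.address # distance in bytes between adjacent elements&lt;/pre&gt;

&lt;p&gt;Both lines print 4: moving a &lt;span class=&quot;code&quot;&gt;Pointer(Int32)&lt;/span&gt; forward
by one element moves its address forward by four bytes.&lt;/p&gt;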
&lt;p&gt;If you’d like to learn more about x86 assembly language, I wrote an article a
few years ago explaining some of the basics: &lt;a href=&quot;https://patshaughnessy.net/2016/11/26/learning-to-read-x86-assembly-language&quot;&gt;Learning to Read x86 Assembly
Language&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Understand the Primitives of the Language You Are Using&lt;/h2&gt;
&lt;p&gt;Why did I bother with this exercise? To make sure I deeply understand the
programming languages I’m using. Ruby hides much of its implementation in C, so
I didn’t learn much looking at the Ruby array primitives in this example. And
the primitive functions of assembly language are by definition the instructions
my CPU can execute directly. It’s always fun trying to identify and understand
machine level instructions!&lt;/p&gt;
&lt;p&gt;But Crystal surprised me - I expected to see a set of primitive array functions
like we have in Ruby, but I was wrong. Instead, I learned that Crystal supports
pointers, just like C or other low level languages do. I discovered that
Crystal, unlike Ruby, might be an appropriate choice for low level systems
programming tasks. And I was able to learn all of this, along with the array
implementation details, because the Crystal team implemented its standard
library in the same, target language: Crystal. All I had to do was make an
effort to read some code.&lt;/p&gt;
&lt;p&gt;Dive into details and find out what the language primitives are in your
favorite programming language. You might be surprised and discover that your
language is capable of much more than you thought it was.&lt;/p&gt;
</content></entry><entry><title>Generic Types: Adding Math Puzzles To Your Code</title><link href="https://patshaughnessy.net/2021/11/6/generic-types-adding-math-puzzles-to-your-code" rel="alternate"></link><id href="https://patshaughnessy.net/2021/11/6/generic-types-adding-math-puzzles-to-your-code" rel="alternate"></id><published>2021-11-06T00:00:00Z</published><updated>2021-11-06T00:00:00Z</updated><category>Crystal</category><author><name>Pat Shaughnessy</name></author><summary type="html">&lt;div style=&quot;float: left; padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2021/11/6/formula.png&quot;&gt;&lt;br/&gt;
  &lt;i&gt;In this formula, x is the bound variable, a is&lt;br/&gt;the free variable and e is constant.&lt;/i&gt;
&lt;/div&gt;
&lt;p&gt;Most modern, statically typed languages allow us to use generic types. We write
a function once with generic type syntax, and </summary><content type="html">&lt;div style=&quot;float: left; padding: 8px 30px 30px 0px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2021/11/6/formula.png&quot;&gt;&lt;br/&gt;
  &lt;i&gt;In this formula, x is the bound variable, a is&lt;br/&gt;the free variable and e is constant.&lt;/i&gt;
&lt;/div&gt;
&lt;p&gt;Most modern, statically typed languages allow us to use generic types. We write
a function once with generic type syntax, and then the compiler can apply the
same code over and over again to different actual, concrete types. Hence the
name &lt;em&gt;generic&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;This is a powerful language feature, but generic code is often confusing and
hard to read. For me, generic code resembles something from my high school
algebra textbook. I see small math puzzles sprinkled around my computer
program. Why do this? Why add math problems to your code? Computer programming
is already hard enough; why make it even more complicated?&lt;/p&gt;
&lt;p&gt;Generic types allow us to get the best of both worlds: the safety and
performance of static types, with the flexibility and simplicity of a dynamic,
typeless language.&lt;/p&gt;
&lt;p&gt;But this comes at a steep price: Using generic types forces you to write two
programs, in a sense. Your normal code for runtime, and a second, parallel
type-specific program that runs at compile time. To see what I mean, let’s take
an example from the &lt;a href=&quot;https://github.com/crystal-lang/crystal/tree/master/src&quot;&gt;Crystal standard
library&lt;/a&gt; and explore
how the Crystal team used generic type syntax to implement their array class.&lt;/p&gt;
&lt;h2&gt;Array#uniq in Crystal&lt;/h2&gt;
&lt;p&gt;&lt;a href=&quot;https://patshaughnessy.net/2021/10/23/to-learn-a-new-language-read-its-standard-library&quot;&gt;Last
time&lt;/a&gt;
I looked at the Crystal standard library, specifically at how Crystal removes
duplicate elements from an array in
&lt;a href=&quot;https://github.com/crystal-lang/crystal/blob/master/src/array.cr#L1843&quot;&gt;Array#uniq&lt;/a&gt;.
I discussed a couple of optimizations the Crystal team used to implement &lt;code&gt;uniq&lt;/code&gt;
for small or empty arrays.&lt;/p&gt;
&lt;p&gt;But what about the general case? How does Crystal remove duplicate elements
from large arrays? If I remove the small array optimizations, the Crystal
implementation of &lt;code&gt;Array#uniq&lt;/code&gt; reads like this:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;class &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Array&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(T)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;uniq
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# Convert the Array into a Hash and then ask for its values
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    to_lookup_hash.values
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;protected &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;to_lookup_hash
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    to_lookup_hash { &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;elem&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; elem }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;protected &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;to_lookup_hash&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;: T &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;-&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;U) forall U
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    each_with_object(Hash(U, T).&lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;new&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;do &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;o, h&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      key &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;yield&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; o
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;unless&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; h.has_key?(key)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        h[key] &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; o
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;This is &lt;em&gt;almost&lt;/em&gt; easy to read. Admittedly, I’m a Ruby developer, so the Ruby-like
syntax makes perfect sense to me. However, most developers will be able to
figure this out without much effort.&lt;/p&gt;
&lt;p&gt;Crystal identifies the unique elements of the array by converting it into a
“lookup hash:”&lt;/p&gt;
&lt;p&gt;&lt;img width=&quot;650&quot; src=&quot;https://patshaughnessy.net/assets/2021/11/6/lookup-hash.svg&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;As you know, hash keys are unique. By converting the array into a hash, Crystal
has quickly identified the unique elements of that array.&lt;/p&gt;
&lt;h2&gt;Converting An Array To A Hash At Runtime&lt;/h2&gt;
&lt;p&gt;But if you read carefully, you’ll see that Crystal converts the array to a hash
twice: once at compile time and then later again at runtime. Let’s look at the
second, runtime program first, working our way from the inside out.&lt;/p&gt;
&lt;p&gt;First, the &lt;code&gt;unless&lt;/code&gt; clause inside the loop checks whether the hash already
contains a given element. If the element isn’t already in the hash, Crystal
adds it:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;unless&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; h.has_key?(key)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  h[key] &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; o
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;This is the crux of the unique algorithm. Crystal won’t insert a given key
value twice. (When the key is the element itself, as it is for &lt;code&gt;uniq&lt;/code&gt;, the
&lt;code&gt;unless&lt;/code&gt; clause is technically unnecessary; saving a repeated value would
be harmless, overwriting the previous copy in the hash with an equal one.)&lt;/p&gt;
&lt;p&gt;Looking up one line, we can see the &lt;code&gt;to_lookup_hash&lt;/code&gt; function accepts a block,
and calls it to calculate a key value for each array element:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;key &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;yield&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; o
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;unless&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; h.has_key?(key)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  h[key] &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; o
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;And reading farther up, we can see another definition of &lt;code&gt;to_lookup_hash&lt;/code&gt; passes
in such a block:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#795da3;&quot;&gt;protected &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;to_lookup_hash
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  to_lookup_hash { &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;elem&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; elem }
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Since the block &lt;code&gt;{ |elem| elem }&lt;/code&gt; just returns whatever was passed into it, the
keys and values of the lookup hash will be the same:&lt;/p&gt;
&lt;p&gt;&lt;img width=&quot;275&quot; src=&quot;https://patshaughnessy.net/assets/2021/11/6/keys-and-values.svg&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;This block adds a bit of flexibility to the code. The Crystal team might want
to reuse this function someday with a different block and set of keys.&lt;/p&gt;
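&lt;p&gt;To see what a different block would do, here is a hypothetical sketch. Since
&lt;code&gt;to_lookup_hash&lt;/code&gt; is protected, I’ve copied its loop into a standalone
snippet; the word list and the first-letter key are my own invention, not
something from the standard library:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
words = [&amp;quot;apple&amp;quot;, &amp;quot;avocado&amp;quot;, &amp;quot;banana&amp;quot;]

# Same loop as to_lookup_hash, but keyed by each word&amp;#39;s first letter
# instead of by the element itself.
lookup = words.each_with_object(Hash(Char, String).new) do |o, h|
  key = o[0]
  unless h.has_key?(key)
    h[key] = o
  end
end

puts lookup.values&lt;/pre&gt;

&lt;p&gt;This prints &lt;code&gt;[&amp;quot;apple&amp;quot;, &amp;quot;banana&amp;quot;]&lt;/code&gt;. With a block like this the keys
and values differ, and the &lt;code&gt;unless&lt;/code&gt; clause starts to matter: it keeps the
first word seen for each letter rather than the last.&lt;/p&gt;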
&lt;p&gt;Finally, let’s look at how Crystal iterates over the array:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;each_with_object(Hash(U, T).&lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;new&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;do &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;o, h&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  key &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;yield&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; o
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;unless&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; h.has_key?(key)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    h[key] &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; o
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Crystal calls &lt;code&gt;each_with_object&lt;/code&gt; on the array and provides a single argument:
&lt;code&gt;Hash(U, T).new&lt;/code&gt;. Here’s our first example of generic type syntax. I’ll come
back to that in a moment. For now, I can guess that &lt;code&gt;Hash(U, T).new&lt;/code&gt; creates a
new, empty hash.&lt;/p&gt;
&lt;p&gt;Next, &lt;code&gt;each_with_object&lt;/code&gt; loops over the receiver (the array), and calls the block
&lt;code&gt;do |o, h| … end&lt;/code&gt; for each element. It sets &lt;code&gt;o&lt;/code&gt; to each element’s value as it
iterates, and &lt;code&gt;h&lt;/code&gt; to the hash created with &lt;code&gt;Hash(U, T).new&lt;/code&gt;. As we saw above, the
block inserts each value &lt;code&gt;o&lt;/code&gt; into the hash &lt;code&gt;h&lt;/code&gt;, skipping duplicates.&lt;/p&gt;
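&lt;p&gt;If &lt;code&gt;each_with_object&lt;/code&gt; is unfamiliar, it may help to see the same loop
written with a plain &lt;code&gt;each&lt;/code&gt; and a hash created by hand. This is only a
sketch of the idea, using an example array of my own, not how the standard
library writes it:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
arr = [1, 2, 2, 3, 1]

# The object that each_with_object would create and pass along as h.
h = Hash(Int32, Int32).new

arr.each do |o|
  h[o] = o unless h.has_key?(o)
end

# The hash values are the unique elements, in first-seen order.
puts h.values == arr.uniq&lt;/pre&gt;

&lt;p&gt;This prints &lt;code&gt;true&lt;/code&gt;: the values left in the hash match what
&lt;code&gt;Array#uniq&lt;/code&gt; returns for the same array.&lt;/p&gt;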
&lt;p&gt;Finally, after the iteration completes, Crystal returns the values from the new
hash, a new array containing only the unique elements from the original array:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;to_lookup_hash.values&lt;/span&gt;&lt;/pre&gt;

&lt;h2&gt;Converting An Array To A Hash At Compile Time&lt;/h2&gt;
&lt;p&gt;But there’s more to this program than meets the eye. Crystal actually runs part
of this code, the generic type syntax, earlier at compile time. And, curiously,
that code also converts the array into a hash but in a different way. When the
Crystal team wrote &lt;code&gt;Array#uniq&lt;/code&gt;, they had to write two programs, not one!&lt;/p&gt;
&lt;p&gt;What exactly do I mean by “converting an array to a hash at compile time?”
How and why does this happen? And what’s the second program here?&lt;/p&gt;
&lt;p&gt;The answer has to do with the &lt;code&gt;Hash(U, T).new&lt;/code&gt; expression we read above. Crystal
needs to convert the type &lt;code&gt;Array(T)&lt;/code&gt; into the type &lt;code&gt;Hash(U, T)&lt;/code&gt;. Let’s step through
the second, type-level mirror program to find out how this works.&lt;/p&gt;
&lt;p&gt;You can imagine the Crystal compiler processing the generic type code like
this:&lt;/p&gt;
&lt;p&gt;&lt;img width=&quot;550&quot; src=&quot;https://patshaughnessy.net/assets/2021/11/6/solved-puzzle.svg&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;The first line is actually the most important:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;class &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Array&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(T)&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;This line means the Crystal team is not implementing a simple array of
elements. Instead, they are implementing an array that must contain elements of
the same type. And here on this line they name that type in a generic way: the
type variable &lt;code&gt;T&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Specifying an array element type provides two important benefits: First, it is
a nice safety net for us, the developers using Crystal arrays. Crystal’s
compiler will prevent us from accidentally inserting elements of a different
or unexpected type into the array. And the best part is that Crystal will tell
us about our mistake at compile time, before our code ever runs and does any
harm with real data. It’s annoying and difficult to write a second compile-time
program using types, but the compiler might - and probably will - find some of
our mistakes and tell us before the program even runs.&lt;/p&gt;
&lt;p&gt;Second, because Crystal knows that all of the elements of the array have the
same type, it can emit more efficient code that takes advantage of this
knowledge. The machine language code the compiler produces can save and copy
array elements faster because it knows how much memory each element occupies.&lt;/p&gt;
&lt;p&gt;And by using generic type syntax to write &lt;code&gt;Array#uniq&lt;/code&gt;, the Crystal team gives us
these benefits regardless of what kind of elements we add to our arrays. The
Crystal compiler automatically maps the type we happen to choose in our code to
the variable &lt;code&gt;T&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Next, take a look at the &lt;code&gt;to_lookup_hash&lt;/code&gt; function declaration:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#795da3;&quot;&gt;protected &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;to_lookup_hash&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;: T &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;-&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;U) forall U&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;What in the world does this mean? What is &lt;code&gt;forall U&lt;/code&gt; referring to?&lt;/p&gt;
&lt;p&gt;The first thing to note here is Crystal’s block parameter syntax: &lt;code&gt;&amp;amp; : T -&amp;gt; U&lt;/code&gt;.
Crystal has borrowed the &lt;code&gt;&amp;amp;&lt;/code&gt; character from Ruby to indicate the following
value is a block or closure, not a simple value. But in Crystal, the block
parameters and the block’s return value all must have types. And here in this
code those types are generic types: &lt;code&gt;T&lt;/code&gt; and &lt;code&gt;U&lt;/code&gt;. The arrow syntax tells us that the
block takes a single parameter of type &lt;code&gt;T&lt;/code&gt;, and returns a single value of type &lt;code&gt;U&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;But what do &lt;code&gt;T&lt;/code&gt; and &lt;code&gt;U&lt;/code&gt; mean? Where are they defined?&lt;/p&gt;
&lt;p&gt;This is our math puzzle for the day. Just like in a limit, integral or infinite
series from Calculus, this formula contains &lt;em&gt;bound and free variables&lt;/em&gt;:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp; : &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;T &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;-&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;U forall U&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Since the enclosing class statement above declared the class to be &lt;code&gt;Array(T)&lt;/code&gt;,
Crystal binds the &lt;code&gt;T&lt;/code&gt; type variable here to be the same type as above. In other
words, the type of values passed into this block must be the same as the type
of the elements in the array.&lt;/p&gt;
&lt;p&gt;But what about &lt;code&gt;U&lt;/code&gt;? What type is that?&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;forall U&lt;/code&gt; clause declares the &lt;code&gt;U&lt;/code&gt; type to be a &lt;em&gt;free variable&lt;/em&gt;. That means
that &lt;code&gt;U&lt;/code&gt;, unlike the type &lt;code&gt;T&lt;/code&gt;, is not bound to any known type value. &lt;code&gt;forall&lt;/code&gt;
tells the Crystal compiler that the following code should apply equally well to
any type &lt;code&gt;U&lt;/code&gt;, “for all” possible types &lt;code&gt;U&lt;/code&gt;.&lt;/p&gt;
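&lt;p&gt;To see a free variable at work outside of the standard library, here is a small
sketch of my own. The &lt;code&gt;Box&lt;/code&gt; class and its &lt;code&gt;transform&lt;/code&gt; method are hypothetical
names invented for this example and are not part of Crystal; only the
&lt;code&gt;forall&lt;/code&gt; syntax is borrowed from &lt;code&gt;to_lookup_hash&lt;/code&gt;:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
# Hypothetical generic class: T is bound by the class declaration.
class Box(T)
  def initialize(@value : T)
  end

  # U is a free variable: it is fixed only when a caller supplies a
  # block, and it becomes whatever type that block returns.
  def transform(&amp;amp; : T -&amp;gt; U) : Box(U) forall U
    result = yield @value
    Box(U).new(result)
  end
end

box = Box(Int32).new(42)                # T is Int32
text_box = box.transform { |n| n.to_s } # the block returns a String, so U is String
puts typeof(text_box)                   # Box(String)&lt;/pre&gt;

&lt;p&gt;Calling &lt;code&gt;transform&lt;/code&gt; with a block that returns some other type would bind
&lt;code&gt;U&lt;/code&gt; to that type instead.&lt;/p&gt;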
&lt;p&gt;The Crystal compiler solves this math puzzle using the &lt;code&gt;yield&lt;/code&gt; statement we saw
above:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;key &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;yield&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; o&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Here Crystal knows the value &lt;code&gt;o&lt;/code&gt; has a type of &lt;code&gt;T&lt;/code&gt;. How? Because Crystal knows
that all of the elements of the array have a type &lt;code&gt;T&lt;/code&gt; (this is the &lt;code&gt;Array(T)&lt;/code&gt;
class) and Crystal knows the variable &lt;code&gt;o&lt;/code&gt; was set earlier to an array element
by &lt;code&gt;each_with_object&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Next, the Crystal compiler can determine that &lt;code&gt;U == T&lt;/code&gt;, that both types are the
same. How? When Crystal compiles the block’s code above:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;to_lookup_hash { &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;elem&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; elem }&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;…Crystal notices that the block returns the very same value that was passed
into it: &lt;code&gt;elem&lt;/code&gt; goes in and &lt;code&gt;elem&lt;/code&gt; comes out. Then, the Crystal compiler maps
this return value to the block declaration: &lt;code&gt;&amp;amp; : T -&amp;gt; U&lt;/code&gt;. Because the block’s
argument has type &lt;code&gt;T&lt;/code&gt; and the block returns that argument unchanged, Crystal deduces
that the return type is the same as the argument type, that &lt;code&gt;U == T&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Finally, let’s return to the line above that iterates over the array:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;each_with_object(Hash(U, T).&lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;new&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;do &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;o, h&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Now the Crystal compiler can use the knowledge that types &lt;code&gt;U == T&lt;/code&gt; when it
creates the lookup hash. In Crystal when you create a hash, just as when you
create an array, you have to provide a type parameter for all the values. And
for hashes you also need to provide a type for the keys. &lt;code&gt;Hash(U, T).new&lt;/code&gt;
means: Create a new hash which has keys of type &lt;code&gt;U&lt;/code&gt; and values of type &lt;code&gt;T&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Armed with the knowledge that the keys of the lookup hash all have type &lt;code&gt;U&lt;/code&gt;, and
that &lt;code&gt;U == T&lt;/code&gt;, the Crystal compiler can emit the correct, optimized code for
finding and inserting hash values when it produces machine language
instructions for this passage:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;key &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;yield&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; o
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;unless&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; h.has_key?(key)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  h[key] &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; o
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Crystal knows that both &lt;code&gt;o&lt;/code&gt; and &lt;code&gt;key&lt;/code&gt; have the same type &lt;code&gt;T&lt;/code&gt;, which allows the
&lt;code&gt;has_key?&lt;/code&gt; and &lt;code&gt;[]=&lt;/code&gt; methods to run quickly and correctly.&lt;/p&gt;
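&lt;p&gt;Putting the pieces together, here is a standalone sketch of the same idea written
as a top-level method. The name &lt;code&gt;lookup_hash&lt;/code&gt; and the &lt;code&gt;values&lt;/code&gt; parameter are my
own, and because there is no enclosing &lt;code&gt;Array(T)&lt;/code&gt; class here I’m assuming both
&lt;code&gt;T&lt;/code&gt; and &lt;code&gt;U&lt;/code&gt; need to be declared as free variables. The body reuses the lines
quoted above:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
# Hypothetical top-level version of to_lookup_hash. T comes from the
# array argument and U from the return type of the block.
def lookup_hash(values : Array(T), &amp;amp; : T -&amp;gt; U) : Hash(U, T) forall T, U
  values.each_with_object(Hash(U, T).new) do |o, h|
    key = yield o
    unless h.has_key?(key)
      h[key] = o # keep only the first element seen for each key
    end
  end
end

# With an identity block, U is the same type as T (Int32 here).
hash = lookup_hash([3, 1, 3, 2, 1]) { |elem| elem }
puts typeof(hash) # Hash(Int32, Int32)
p hash.values     # [3, 1, 2]&lt;/pre&gt;
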
&lt;h2&gt;Are Generic Types Worth It?&lt;/h2&gt;
&lt;p&gt;It’s a good thing the Crystal compiler is good at solving math puzzles!
Crystal, like many other modern languages (Haskell, Swift, Rust, etc.), is able
to determine the actual, concrete types for variables like &lt;code&gt;T&lt;/code&gt; and &lt;code&gt;U&lt;/code&gt;, and for
all values in your code, using &lt;em&gt;type inference&lt;/em&gt;. The Crystal compiler can
deduce what the type of each variable is based on the usage of that variable
and the context of the surrounding code.&lt;/p&gt;
&lt;p&gt;But the problem is that I, a user of the Crystal language, have to be good at
math puzzles also. As we’ve just seen, in order to write code using Crystal or
any language with a modern type system, I have to write my code twice: once to
solve the problem I actually want to solve, and a second time to prove to the
compiler that my code is consistent and mathematically correct at a type level.&lt;/p&gt;
&lt;p&gt;Are the added performance and added safety worth it? That depends entirely on
what code you are writing, how fast it needs to run, how many times and for how
long that code will be used - and most importantly, how much time you have to
solve math problems.&lt;/p&gt;
</content></entry><entry><title>To Learn a New Language, Read Its Standard Library</title><link href="https://patshaughnessy.net/2021/10/23/to-learn-a-new-language-read-its-standard-library" rel="alternate"></link><id href="https://patshaughnessy.net/2021/10/23/to-learn-a-new-language-read-its-standard-library" rel="alternate"></id><published>2021-10-23T00:00:00Z</published><updated>2021-10-23T00:00:00Z</updated><category>Crystal</category><author><name>Pat Shaughnessy</name></author><summary type="html">&lt;div style=&quot;float: left; padding: 8px 30px 20px 0px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2021/10/23/chicken-little.png&quot;&gt;&lt;br/&gt;
  &lt;i&gt;If I was learning to read English as a foreign language,&lt;br/&gt; I would need something simple to get started.&lt;br/&gt;
  &lt;small&gt;(from The Remarkable Story of Chicken Little, 1840)&lt;/small&gt;&lt;/i&gt; 
&lt;/div&gt;
&lt;p&gt;The best way to learn a</summary><content type="html">&lt;div style=&quot;float: left; padding: 8px 30px 20px 0px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2021/10/23/chicken-little.png&quot;&gt;&lt;br/&gt;
  &lt;i&gt;If I was learning to read English as a foreign language,&lt;br/&gt; I would need something simple to get started.&lt;br/&gt;
  &lt;small&gt;(from The Remarkable Story of Chicken Little, 1840)&lt;/small&gt;&lt;/i&gt; 
&lt;/div&gt;
&lt;p&gt;The best way to learn a new programming language, just like a human language,
is from example. To learn how to write code you first need to read someone
else’s code. But who is the best person to learn from? Which code should we
read? Where should we look to find it?&lt;/p&gt;
&lt;p&gt;This year in my spare time I was learning about
&lt;a href=&quot;https://crystal-lang.org&quot;&gt;Crystal&lt;/a&gt;. I had played around with some simple
scripts, but I wanted to learn more. Then I stumbled onto Crystal’s &lt;a href=&quot;https://github.com/crystal-lang/crystal/tree/master/src&quot;&gt;standard
library&lt;/a&gt;. I was
relieved to see that Crystal’s core classes are implemented using Crystal
itself!&lt;/p&gt;
&lt;p&gt;Crystal’s standard library is clear, simple, concise and well documented.
Reading Crystal’s internal implementation of Array or Hash is like reading
a fairy tale in a children’s book. Anyone can understand it, even people
without a Ph.D. in Computer Science or systems programming experience.&lt;/p&gt;
&lt;div style=&quot;clear: left&quot;&gt;&lt;/div&gt;
&lt;p&gt;&lt;b&gt;Update:&lt;/b&gt; There was a &lt;a href=&quot;https://news.ycombinator.com/item?id=28975453&quot;&gt;long discussion on Hacker
News&lt;/a&gt; about whether reading the
standard library really is a good idea for various different languages.&lt;/p&gt;
&lt;h2&gt;At First Glance, Crystal Is Ruby&lt;/h2&gt;
&lt;p&gt;At first glance, when I read Crystal’s &lt;a href=&quot;https://github.com/crystal-lang/crystal/blob/master/src/array.cr&quot;&gt;Array
implementation&lt;/a&gt;,
I thought I was reading a Ruby program:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;class &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Array&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(T)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;include &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Indexable&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;::Mutable(T)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;include &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Comparable(Array)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# Size of an Array that we consider small to do linear scans or other optimizations.
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;private &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;SMALL_ARRAY_SIZE &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;16
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# The size of this array.
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;@&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;size &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;: &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Int32
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# The capacity of `@buffer`.
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# Note that, because `@buffer` moves on shift, the actual
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# capacity (the allocated memory) starts at `@buffer - @offset_to_buffer`.
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# The actual capacity is also given by the `remaining_capacity` internal method.
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;@&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;capacity &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;: &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Int32
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# Offset to the buffer that was originally allocated, and which needs to
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# be reallocated on resize. On shift this value gets increased, together with
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# `@buffer`. To reach the root buffer you have to do `@buffer - @offset_to_buffer`,
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# and this is also provided by the `root_buffer` internal method.
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;@&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;offset_to_buffer &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;: &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Int32 &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;0
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# The buffer where elements start.
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;@&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;buffer &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;: &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Pointer(T)&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;There are lots of familiar keywords, like &lt;code&gt;class&lt;/code&gt;, &lt;code&gt;include&lt;/code&gt; and &lt;code&gt;private&lt;/code&gt;. I also
see Ruby’s &lt;code&gt;@&lt;/code&gt; character indicating an instance variable. This code is about 100x
easier to read than &lt;a href=&quot;https://github.com/ruby/ruby/blob/master/array.c&quot;&gt;Ruby’s own C implementation of
Array&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Along with the familiar Ruby-like syntax, notice the helpful comments. Even
though I’ve just started reading I can already make an educated guess at how
Crystal arrays function internally. I can see there’s a pointer to memory which
holds the array elements, and that the code keeps track of the capacity of this
memory along with the actual size of the array. Finally, reading the comment for
&lt;code&gt;offset_to_buffer&lt;/code&gt; I can imagine there are some optimizations related to adding
and removing elements. The comment is both helpful and intriguing.&lt;/p&gt;
&lt;p&gt;But I’m not reading Ruby code. There are important differences here: the generic
type syntax and, most importantly, the fact that each instance variable is declared
with a static type known at compile time. How do I use static types in Crystal?
What types are available? What about the generic type parameter &lt;code&gt;T&lt;/code&gt;? Should I
use that in my own Crystal code? What other syntax differences vs. Ruby are
there?&lt;/p&gt;
&lt;p&gt;The best way to learn how to write Crystal code is simply to scroll down and
read one of the Array methods.&lt;/p&gt;
&lt;h2&gt;Array#uniq&lt;/h2&gt;
&lt;p&gt;Here’s how Crystal finds the unique elements of an array:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;uniq
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;if&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; size &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;lt;= &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;return &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;dup
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# Heuristic: for a small array it&amp;#39;s faster to do a linear scan
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# than creating a Hash to find out duplicates.
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;if&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; size &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;lt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;SMALL_ARRAY_SIZE
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    ary &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Array(T).&lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;new
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    each &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;do &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;elem&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      ary &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; elem &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;unless&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; ary.includes?(elem)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;return&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; ary
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# Convert the Array into a Hash and then ask for its values
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  to_lookup_hash.values
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;The first three lines handle the trivial case where an array is empty or
contains only one element:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;if&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; size &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;lt;= &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;1
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;return &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;dup
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Obviously in this case, there are no duplicate elements and &lt;code&gt;Array#uniq&lt;/code&gt; has
nothing to remove. One important detail: Crystal uses &lt;code&gt;dup&lt;/code&gt; to
return a copy of the array rather than the array itself. This reminds me that in Ruby &lt;code&gt;uniq&lt;/code&gt; returns a copy
of the receiver, while &lt;code&gt;uniq!&lt;/code&gt; mutates the receiver. My guess is that Crystal
implements Array methods in the same way…&lt;/p&gt;
&lt;p&gt;The second passage is an optimization:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a7adba;&quot;&gt;# Heuristic: for a small array it&amp;#39;s faster to do a linear scan
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;# than creating a Hash to find out duplicates.
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;if&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; size &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;lt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;SMALL_ARRAY_SIZE
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  ary &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Array(T).&lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;new
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  each &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;do &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;elem&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    ary &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; elem &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;unless&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; ary.includes?(elem)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;return&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; ary
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;For small arrays (16 or fewer elements) Crystal iterates over them and removes
duplicates using a simple algorithm. I’ll take a look at how that works in a
moment.&lt;/p&gt;
&lt;p&gt;The final line of code handles arrays with 17 or more elements:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a7adba;&quot;&gt;# Convert the Array into a Hash and then ask for its values
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;to_lookup_hash.values&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;As you might guess, Crystal removes duplicate values from larger arrays using a
hash. I'll dive into the details about how this works in my next post.&lt;/p&gt;
&lt;h2&gt;Arrays With 16 Or Fewer Elements&lt;/h2&gt;
&lt;p&gt;But first, let’s take a closer look at case #2 from above, when the array
contains 16 or fewer elements. First, Crystal creates a new, empty array called
&lt;code&gt;ary&lt;/code&gt;:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;ary &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Array(T).&lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;new&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Note the generic type syntax &lt;code&gt;Array(T).new&lt;/code&gt;. This tells the Crystal compiler
that the new array, what will become the return value from &lt;code&gt;Array#uniq&lt;/code&gt;, will
only contain elements of the same type as the original array.&lt;/p&gt;
&lt;p&gt;Ruby developers will find the rest of this code easy to follow…&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;each &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;do &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;elem&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  ary &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; elem &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;unless&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; ary.includes?(elem)
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Crystal calls &lt;code&gt;each&lt;/code&gt; to iterate over all the elements in the receiver, the
array we are calling &lt;code&gt;uniq&lt;/code&gt; on. Then using &lt;code&gt;&amp;lt;&amp;lt;&lt;/code&gt;, Crystal appends each of the
original array’s elements to the new array, unless the new array already
contains a given element.&lt;/p&gt;
&lt;p&gt;Like Ruby, Crystal implements the &lt;code&gt;includes?&lt;/code&gt; method inside the &lt;code&gt;Enumerable&lt;/code&gt;
module. Crystal arrays are enumerable because of the &lt;code&gt;include Indexable::Mutable(T)&lt;/code&gt; statement we read above. (&lt;code&gt;Indexable::Mutable&lt;/code&gt; includes
&lt;code&gt;Indexable&lt;/code&gt; which includes &lt;code&gt;Enumerable&lt;/code&gt;). You can find Crystal’s implementation
of &lt;code&gt;includes?&lt;/code&gt; (not &lt;code&gt;include?&lt;/code&gt; as in Ruby) in
&lt;a href=&quot;https://github.com/crystal-lang/crystal/blob/master/src/enumerable.cr&quot;&gt;enumerable.cr&lt;/a&gt;:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;includes?&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(obj) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;: &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Bool
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  any? { &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;e&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;|&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; e &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;==&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; obj }
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Here the &lt;code&gt;any?&lt;/code&gt; method calls the given block once for each element in the
array, and returns true if the block returns true for any of the elements. In
other words, this code searches the array in a linear fashion, one element at a
time. Crystal’s development team has decided that it’s faster to filter out
repeated elements from small arrays by repeatedly searching the array using
linear scans. Since there are never more than 16 elements, those scans won’t
take too much time.&lt;/p&gt;
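&lt;p&gt;To try this algorithm outside of the standard library, here is a small standalone
sketch. The method name &lt;code&gt;small_uniq&lt;/code&gt; and its &lt;code&gt;values&lt;/code&gt; parameter are my own
invention, and I’ve spelled out the linear scan with &lt;code&gt;any?&lt;/code&gt; to show what
&lt;code&gt;includes?&lt;/code&gt; is doing:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
# Hypothetical standalone version of the small-array branch of Array#uniq.
def small_uniq(values : Array(T)) : Array(T) forall T
  ary = Array(T).new
  values.each do |elem|
    # Linear scan of the elements kept so far, written out with any?
    seen = ary.any? { |e| e == elem }
    ary &amp;lt;&amp;lt; elem unless seen
  end
  ary
end

p small_uniq([3, 1, 3, 2, 1])                                    # [3, 1, 2]
p small_uniq([&amp;quot;a&amp;quot;, &amp;quot;b&amp;quot;, &amp;quot;a&amp;quot;]) # [&amp;quot;a&amp;quot;, &amp;quot;b&amp;quot;]&lt;/pre&gt;

&lt;p&gt;For a 16 element array with no duplicates, the worst case, the scans add up to
0 + 1 + … + 15 = 120 comparisons in total.&lt;/p&gt;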
&lt;h2&gt;Simple and Concise&lt;/h2&gt;
&lt;p&gt;You might be thinking: This is an incredibly simple algorithm; anyone could have
written this code! Why bother writing a blog post about this?&lt;/p&gt;
&lt;p&gt;That’s exactly my point: This is simple and concise code. I could have written
it - you could have also. There’s nothing superfluous, not an extra word here.
Just enough code to get the job done. And there’s no noise… no macros, no odd C
memory tricks, no weird bitwise mask operations. This is the kind of code I
need to read now when I’m learning how to use Crystal. As a side benefit, I
also get to learn how Crystal works internally.&lt;/p&gt;
&lt;p&gt;But what happens for longer arrays, with 100s or 1000s of elements? How does
Crystal remove duplicates from longer arrays efficiently? I'll take a look at
how that works in my next post.&lt;/p&gt;
</content></entry><entry><title>Downloading 100,000 Files Using Async Rust</title><link href="https://patshaughnessy.net/2020/1/20/downloading-100000-files-using-async-rust" rel="alternate"></link><id href="https://patshaughnessy.net/2020/1/20/downloading-100000-files-using-async-rust" rel="alternate"></id><published>2020-01-20T00:00:00Z</published><updated>2020-01-20T00:00:00Z</updated><category>Rust</category><author><name>Pat Shaughnessy</name></author><summary type="html">&lt;div style=&quot;float: left; padding: 8px 30px 20px 0px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2020/1/20/traffic-light.jpg&quot;&gt;&lt;br/&gt;
  &lt;i&gt;Rust's new async/await feature makes it &lt;br/&gt;
easy to stop and start asynchronous tasks&lt;/i&gt;&lt;br/&gt;
  &lt;small&gt;(from: &lt;a href=&quot;https://commons.wikimedia.org/wiki/File:Red_and_green_traffic_signals,_Stamford_Road,_Singapore_-_20</summary><content type="html">&lt;div style=&quot;float: left; padding: 8px 30px 20px 0px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2020/1/20/traffic-light.jpg&quot;&gt;&lt;br/&gt;
  &lt;i&gt;Rust's new async/await feature makes it &lt;br/&gt;
easy to stop and start asynchronous tasks&lt;/i&gt;&lt;br/&gt;
  &lt;small&gt;(from: &lt;a href=&quot;https://commons.wikimedia.org/wiki/File:Red_and_green_traffic_signals,_Stamford_Road,_Singapore_-_20111210.jpg&quot;&gt;Wikimedia Commons&lt;/a&gt;)&lt;/small&gt;&lt;/i&gt; 
&lt;/div&gt;
&lt;p&gt;Imagine if you had a text file containing thousands of URLs:&lt;/p&gt;
&lt;pre&gt;$ cat urls.txt
https://example.com/1.html
https://example.com/2.html
https://example.com/3.html

etc...

https://example.com/99999.html
https://example.com/100000.html&lt;/pre&gt;
&lt;p&gt;…and you needed to download all of those HTML pages efficiently. How would you
do it? Maybe a shell script using &lt;span class=&quot;code&quot;&gt;xargs&lt;/span&gt; and &lt;span
class=&quot;code&quot;&gt;curl&lt;/span&gt;? Maybe a simple Golang program? Go’s powerful
concurrency features would work well for this.&lt;/p&gt;
&lt;div style=&quot;clear: both&quot;&gt;&lt;/div&gt;
&lt;p&gt;Instead, I decided to try to use Rust. I’ve read a lot about safe concurrency
in Rust, but I’ve never tried it. I also wanted to learn what Rust’s new
“async/await” feature was all about. This seemed like the perfect task for
asynchronous Rust code.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;TL;DR&lt;/em&gt;: &lt;a href=&quot;https://gist.github.com/patshaughnessy/27b1611e2c912346b929df97998d488d&quot;&gt;Here’s the
code&lt;/a&gt;.
The rest of this post will explain how it works.&lt;/p&gt;
&lt;h2&gt;Getting Started With Reqwest&lt;/h2&gt;
&lt;p&gt;There are many different Rust HTTP clients to choose from, and &lt;a href=&quot;https://medium.com/@shnatsel/smoke-testing-rust-http-clients-b8f2ee5db4e6&quot;&gt;apparently some
controversy&lt;/a&gt;
about which works best. Because I’m a Rust newbie, I decided simply to pick the most
popular: &lt;a href=&quot;https://github.com/seanmonstar/reqwest&quot;&gt;reqwest&lt;/a&gt;. Reqwest is a
high-level, easy-to-use HTTP client, written by &lt;a href=&quot;https://seanmonstar.com/&quot;&gt;Sean
McArthur&lt;/a&gt;. He just updated it to work with async/await and Tokio, a popular
asynchronous runtime for Rust, so this is the perfect time to try using it.
Here’s the example from the readme:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;use &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;std::collections::HashMap;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;#[tokio::main]
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;async &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;fn &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;main&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;() -&amp;gt; Result&amp;lt;(), Box&amp;lt;dyn std::error::Error&amp;gt;&amp;gt; {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;let&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; resp &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;reqwest::get(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;https://httpbin.org/ip&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        .await&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;?
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        .json::&amp;lt;HashMap&amp;lt;String, String&amp;gt;&amp;gt;()
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        .await&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;?&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    println!(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;{:#?}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, resp);
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    Ok(())
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;}&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;This version downloads some JSON and parses it. Notice the new &lt;span
class=&quot;code&quot;&gt;async&lt;/span&gt; and &lt;span class=&quot;code&quot;&gt;await&lt;/span&gt; keywords. The
main function is &lt;span class=&quot;code&quot;&gt;async&lt;/span&gt;, which means the compiler
turns it into a state machine that Tokio runs. Inside an asynchronous function
you can use &lt;span class=&quot;code&quot;&gt;await&lt;/span&gt;, which pauses that function until
the awaited operation is ready, allowing other asynchronous functions to run
on the same thread in the meantime.&lt;/p&gt;
&lt;p&gt;I decided to modify this to print out the number of bytes downloaded instead; you could
easily change it to save the data to a file or do whatever you want.&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;let&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; path &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;https://httpbin.org/ip&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;;
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;match &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;reqwest::get(path).await {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    Ok(resp) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;{
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;match&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; resp.text().await {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;            Ok(text) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;{
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;                println!(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;RESPONSE: &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;{}&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt; bytes from &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;{}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, text.len(), path);
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;            }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;            Err(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;_&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;println!(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;ERROR reading &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;{}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, path),
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    Err(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;_&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;println!(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;ERROR downloading &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;{}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, path),
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;}
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Ok(())&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;This is a two-step process:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;First I call &lt;span class=&quot;code&quot;&gt;get(path)&lt;/span&gt; to send the HTTP GET request. Then I use &lt;span class=&quot;code&quot;&gt;await&lt;/span&gt; to wait for
the request to finish and return a result.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Second, if the request was successful, I call &lt;span
  class=&quot;code&quot;&gt;resp.text()&lt;/span&gt; to get the contents of the response body. And
I wait again while that is loaded.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I handle the errors explicitly and always return a unit result &lt;span
class=&quot;code&quot;&gt;Ok(())&lt;/span&gt; because that makes the code below simpler
when I start downloading more than one page concurrently.&lt;/p&gt;
&lt;p&gt;Visually, I can draw the &lt;span class=&quot;code&quot;&gt;get&lt;/span&gt; and &lt;span
class=&quot;code&quot;&gt;text&lt;/span&gt; calls like this:&lt;/p&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2020/1/20/get-and-text.png&quot;&gt;
&lt;p&gt;First I call &lt;span class=&quot;code&quot;&gt;get&lt;/span&gt; and wait, then I call &lt;span
class=&quot;code&quot;&gt;text&lt;/span&gt; and wait.&lt;/p&gt;
&lt;p&gt;But what is asynchronous about this? This reads like normal, single threaded
code. I do one thing, then I do another.&lt;/p&gt;
&lt;h2&gt;Sending 3 Concurrent Requests&lt;/h2&gt;
&lt;p&gt;The magic happens when I want to make more than one request at the same time. Let’s use three hard-coded path strings:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;let&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; paths &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;vec![
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;https://example.com/1.html&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;.to_string(),
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;https://example.com/2.html&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;.to_string(),
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;https://example.com/3.html&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;.to_string(),
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;];&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;To download the 3 HTML files in parallel, I spawn three Tokio “tasks” and wait
for them all to complete. (This requires adding the futures crate, which
provides &lt;span class=&quot;code&quot;&gt;join_all&lt;/span&gt;, to Cargo.toml.)&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a7adba;&quot;&gt;// Iterate over the paths.
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;let mut&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; tasks: Vec&amp;lt;JoinHandle&amp;lt;Result&amp;lt;(), ()&amp;gt;&amp;gt;&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;vec![];
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;for&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; path &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;in&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; paths {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;// Copy each path into a new string
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;// that can be consumed/captured by the task closure
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;let&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; path &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; path.clone();
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;// Create a Tokio task for each path
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    tasks.push(tokio::spawn(async &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;move &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;{
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;match &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;reqwest::get(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;path).await {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;            Ok(resp) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;{
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;                &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;match&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; resp.text().await {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;                    Ok(text) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;{
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;                        println!(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;RESPONSE: &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;{}&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt; bytes from &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;{}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, text.len(), path);
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;                    }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;                    Err(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;_&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;println!(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;ERROR reading &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;{}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, path),
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;                }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;            }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;            Err(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;_&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;println!(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;ERROR downloading &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;{}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, path),
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        Ok(())
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    }));
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;}
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#a7adba;&quot;&gt;// Wait for them all to finish
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;println!(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Started &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;{}&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt; tasks. Waiting...&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, tasks.len());
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;join_all(tasks).await;&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Each Tokio task is an &lt;span class=&quot;code&quot;&gt;async move&lt;/span&gt; block passed to
the &lt;span class=&quot;code&quot;&gt;tokio::spawn&lt;/span&gt; function; it captures variables
much like a closure does. I create a copy of each path, using &lt;span
class=&quot;code&quot;&gt;path.clone()&lt;/span&gt;, so the closure has its own copy of the path
string with its own lifetime.&lt;/p&gt;
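&lt;p&gt;To see the ownership rule on its own, here is a minimal sketch that swaps
Tokio tasks for plain OS threads from the standard library, so it runs without
any extra crates. It assumes the loop only borrows the vector (&lt;span
class=&quot;code&quot;&gt;paths.iter()&lt;/span&gt;), which is the case where the clone is
needed. The function name &lt;span class=&quot;code&quot;&gt;spawn_with_owned_paths&lt;/span&gt;
and the “would fetch” messages are made up for illustration, and nothing is
actually downloaded.&lt;/p&gt;

```rust
use std::thread;

// Hypothetical helper: spawns one OS thread per example URL and returns
// the messages they produce, one per line, in spawn order.
fn spawn_with_owned_paths(count: usize) -> String {
    // Build `count` example URLs (illustrative only).
    let mut paths = Vec::new();
    for i in 1..=count {
        paths.push(format!("https://example.com/{}.html", i));
    }

    let mut handles = Vec::new();
    for path in paths.iter() {
        // `paths.iter()` only lends each string, so clone it to give the
        // closure a copy it can own for as long as the thread runs.
        let path = path.clone();
        handles.push(thread::spawn(move || format!("would fetch {}", path)));
    }

    // Joining in spawn order keeps the result deterministic even though
    // the threads themselves may finish in any order.
    let mut lines = Vec::new();
    for handle in handles {
        lines.push(handle.join().expect("worker panicked"));
    }
    lines.join("\n")
}

fn main() {
    println!("{}", spawn_with_owned_paths(3));
}
```

&lt;p&gt;Both &lt;span class=&quot;code&quot;&gt;std::thread::spawn&lt;/span&gt; and &lt;span
class=&quot;code&quot;&gt;tokio::spawn&lt;/span&gt; refuse code that borrows from the spawning
function’s local variables, which is why &lt;span class=&quot;code&quot;&gt;move&lt;/span&gt;
and the clone appear together.&lt;/p&gt;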
&lt;p&gt;The complex type annotation on the &lt;span class=&quot;code&quot;&gt;tasks&lt;/span&gt; vector
indicates what each call to &lt;span class=&quot;code&quot;&gt;spawn&lt;/span&gt; returns: a &lt;span
class=&quot;code&quot;&gt;JoinHandle&lt;/span&gt; enclosing a &lt;span class=&quot;code&quot;&gt;Result&lt;/span&gt;. To
keep things simple, I handle all errors in the closure and just return &lt;span
class=&quot;code&quot;&gt;Ok(())&lt;/span&gt;.  This means each &lt;span
class=&quot;code&quot;&gt;JoinHandle&lt;/span&gt; contains a trivial result: &lt;span
class=&quot;code&quot;&gt;Result&amp;lt;(), ()&amp;gt;&lt;/span&gt;. I could have written the closure to return
some value and/or some error value instead.&lt;/p&gt;
&lt;p&gt;After the loop is finished and all three tasks have been spawned, I call &lt;span
class=&quot;code&quot;&gt;join_all(tasks).await&lt;/span&gt; to wait for them all to finish.&lt;/p&gt;
&lt;h2&gt;Asynchronous vs Multithreaded&lt;/h2&gt;
&lt;div style=&quot;float: right; padding: 8px 0px 20px 30px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2020/1/20/traffic-light2.jpg&quot;&gt;&lt;br/&gt;
  &lt;small&gt;(from: &lt;a href=&quot;https://commons.wikimedia.org/wiki/File:Traffic_lights,_Zl%C3%ADn.JPG&quot;&gt;Wikimedia Commons&lt;/a&gt;)&lt;/small&gt;
&lt;/div&gt;
&lt;p&gt;At first glance, it looks like this code is spawning three different threads. I
even call a spawn function. A multithreaded download might look like this:&lt;/p&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2020/1/20/multithreaded.png&quot;&gt;
&lt;p&gt;We have 3 paths, so we have 3 threads. Each thread calls &lt;span class=&quot;code&quot;&gt;get&lt;/span&gt; and waits, and
then calls &lt;span class=&quot;code&quot;&gt;text&lt;/span&gt; and waits.&lt;/p&gt;
&lt;p&gt;However, Rust’s Tokio engine doesn’t work that way. Instead of launching an
entirely new thread for each task, it runs all three tasks on the same thread.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Update&lt;/em&gt;: Wesley Moore &lt;a href=&quot;https://twitter.com/wezm/status/1219734031857635329&quot;&gt;pointed out on
Twitter&lt;/a&gt; that: &amp;quot;Tokio
multiplexes m tasks into a pool of n threads so it’s able to use all available
cores. (M:N threading).&amp;quot; It looks like Tokio supports both a Basic (single
threaded) and Threaded (thread pool) Scheduler; see &lt;a href=&quot;https://docs.rs/tokio/0.2.10/tokio/runtime/index.html#threaded-scheduler&quot;&gt;the
docs&lt;/a&gt;
for more information.&lt;/p&gt;
&lt;p&gt;I imagine three tasks running on one thread like this:&lt;/p&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2020/1/20/one-thread.png&quot;&gt;
&lt;p&gt;Each time I call &lt;span class=&quot;code&quot;&gt;await&lt;/span&gt;, Rust stops one task and
starts another using the same thread. In fact, depending on how long it takes
for each task to complete, they might be run in a different order:&lt;/p&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2020/1/20/different-order.png&quot;&gt;
&lt;p&gt;There’s no way to predict ahead of time what order the tasks will run in.
That’s why I needed to copy each path string above; each task needs its own copy
of the string with its own independent lifetime because it might be run at any
time.&lt;/p&gt;
&lt;p&gt;The only guarantee I have is that the &lt;span class=&quot;code&quot;&gt;join_all&lt;/span&gt; call
at the bottom will wait until all of the tasks have finished; that is, until
all of the futures I pushed onto the tasks vector have completed.&lt;/p&gt;
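&lt;p&gt;The same two properties, an unpredictable finishing order and a single
point where everything has completed, can be sketched with standard library
threads and a channel. This is only an analogy to the Tokio code above, not a
description of how Tokio schedules tasks. The helper name &lt;span
class=&quot;code&quot;&gt;ids_in_completion_order&lt;/span&gt; and the sleep times are
assumptions chosen to make later workers tend to finish first.&lt;/p&gt;

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Hypothetical helper: starts `count` workers and returns their ids in the
// order they finished, separated by spaces.
fn ids_in_completion_order(count: u64) -> String {
    let (tx, rx) = mpsc::channel();
    let mut handles = Vec::new();
    for id in 1..=count {
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            // Pretend that earlier workers have slower downloads.
            thread::sleep(Duration::from_millis((count - id) * 20));
            tx.send(id).expect("receiver dropped");
        }));
    }
    // Drop the original sender so the receive loop ends once every
    // worker has finished and dropped its own clone.
    drop(tx);

    // Ids arrive in completion order, which depends on timing rather
    // than on the order in which the workers were spawned.
    let mut finished = Vec::new();
    for id in rx {
        finished.push(id.to_string());
    }

    // Like join_all, this is the one point where all work is known to be done.
    for handle in handles {
        handle.join().expect("worker panicked");
    }
    finished.join(" ")
}

fn main() {
    // The printed order may differ from run to run.
    println!("{}", ids_in_completion_order(3));
}
```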
&lt;h2&gt;Sending 100,000 Concurrent Requests&lt;/h2&gt;
&lt;p&gt;I can scale this up to 100,000 requests by reading the URLs in from a file instead:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;fn &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;read_lines&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(path: &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;str&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) -&amp;gt; Result&amp;lt;Vec&amp;lt;String&amp;gt;, Box&amp;lt;dyn Error&amp;gt;&amp;gt; {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;let&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; file &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;File::open(path)&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;?&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;let&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; reader &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;BufReader::new(file);
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    Ok(
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        reader.lines().filter_map(Result::ok).collect()
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    )
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;}
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;#[tokio::main]
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;async &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;fn &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;main&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;() -&amp;gt; Result&amp;lt;(), Box&amp;lt;dyn Error&amp;gt;&amp;gt; {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;	&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;let&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; paths: Vec&amp;lt;String&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;read_lines(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;urls.txt&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;?&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;etc&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;...&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;When I tried this out for the first time I was excited: How long would it take
to download 100,000 HTML pages simultaneously like this? Would it be 100,000x
faster than downloading one file? I typed &lt;span class=&quot;code&quot;&gt;cargo run
--release&lt;/span&gt; to build my code in release mode and get the best possible
performance out of Rust. Asynchronous code, zero cost abstractions, no garbage
collector, this was going to be great!&lt;/p&gt;
&lt;p&gt;Of course, it didn’t work.&lt;/p&gt;
&lt;p&gt;What happened? The problem is the web server can't handle so many concurrent network
connections. Using my thread/task diagram, launching all 100,000 tasks might
look like this:&lt;/p&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2020/1/20/simultaneous.png&quot;&gt;
&lt;p&gt;I spawn all 100,000 tasks onto the same thread, and Tokio starts executing
them all. Each time my code above calls &lt;span
class=&quot;code&quot;&gt;get(&amp;amp;path).await&lt;/span&gt;, Tokio pauses that task and starts
another, which calls &lt;span class=&quot;code&quot;&gt;get(&amp;amp;path).await&lt;/span&gt; again, opening
yet another HTTP request. My laptop quickly runs out of network resources and
these tasks start to fail.&lt;/p&gt;
&lt;h2&gt;Sending a Buffered, Concurrent Stream of 100,000 Requests&lt;/h2&gt;
&lt;p&gt;Instead, I need to limit the number of concurrent Tokio tasks - the number of
concurrent HTTP requests. I need the diagram to look something like this:&lt;/p&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2020/1/20/buffered.png&quot;&gt;
&lt;p&gt;After the first 8 tasks are started, the first 8 blue boxes on the left, Tokio
waits for at least one of them to complete before starting a 9th task. I
indicate this with the “max concurrency” arrow.&lt;/p&gt;
&lt;p&gt;Once one of the first 8 calls to &lt;span class=&quot;code&quot;&gt;reqwest::get&lt;/span&gt;
completes, Tokio is free to run a 9th task; this is the first &amp;quot;pop from buffer&amp;quot; arrow.
Once that 9th task or any other task completes, Tokio starts a 10th task,
and so on, processing all 100,000 tasks 8 at a time.&lt;/p&gt;
&lt;p&gt;To achieve this, I can use the &lt;span class=&quot;code&quot;&gt;StreamExt&lt;/span&gt; trait’s &lt;a
href=&quot;https://rust-lang-nursery.github.io/futures-api-docs/0.3.0-alpha.5/futures/stream/trait.StreamExt.html#method.buffer_unordered&quot;&gt;&lt;span
class=&quot;code&quot;&gt;buffer_unordered&lt;/span&gt;&lt;/a&gt; method:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;let&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; fetches &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;futures::stream::iter(
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    paths.into_iter().map(|path| {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        async &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;move &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;{
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;            &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;match &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;reqwest::get(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;path).await {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;                Ok(resp) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;{
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;                    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;match&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; resp.text().await {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;                        Ok(text) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;{
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;                            println!(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;RESPONSE: &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;{}&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt; bytes from &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;{}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, text.len(), path);
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;                        }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;                        Err(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;_&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;println!(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;ERROR reading &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;{}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, path),
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;                    }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;                }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;                Err(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;_&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;println!(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;ERROR downloading &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;{}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, path),
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;            }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    })
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;).buffer_unordered(&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;8&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;).collect::&amp;lt;Vec&amp;lt;()&amp;gt;&amp;gt;();
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;println!(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Waiting...&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;);
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;fetches.await;&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;First I create an iterator that maps each path to a future, using a closure
that returns an &lt;span class=&quot;code&quot;&gt;async move&lt;/span&gt; block, and pass that
iterator to &lt;span class=&quot;code&quot;&gt;futures::stream::iter&lt;/span&gt;.
This creates a stream of futures, each one executing my download code.&lt;/p&gt;
&lt;p&gt;At the bottom I call &lt;span class=&quot;code&quot;&gt;buffer_unordered&lt;/span&gt; and pass in 8. The
code in &lt;span class=&quot;code&quot;&gt;buffer_unordered&lt;/span&gt; will execute up to 8 futures
from the stream concurrently, and then start to buffer the remaining futures.
As each task completes (each HTTP request, in my example), &lt;span
class=&quot;code&quot;&gt;buffer_unordered&lt;/span&gt; will pull another task out of its buffer
and execute it.&lt;/p&gt;
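&lt;p&gt;The “at most 8 at a time” idea can also be sketched without any async
code, using a fixed number of standard library threads that pull URLs from a
shared queue. This is a rough analogy only; it is not how &lt;span
class=&quot;code&quot;&gt;buffer_unordered&lt;/span&gt; is implemented, and the helper name
&lt;span class=&quot;code&quot;&gt;run_bounded&lt;/span&gt; is hypothetical. Nothing is
downloaded; each worker just counts the URL it took.&lt;/p&gt;

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: processes `count` pretend downloads with at most
// `limit` running at once, and returns how many were completed.
fn run_bounded(count: usize, limit: usize) -> usize {
    // Shared queue of pending URLs (illustrative only).
    let mut pending = Vec::new();
    for i in 1..=count {
        pending.push(format!("https://example.com/{}.html", i));
    }
    let queue = Arc::new(Mutex::new(pending));
    let done = Arc::new(Mutex::new(0usize));

    // Only `limit` workers exist, so no more than `limit` URLs are ever
    // being handled at the same moment.
    let mut workers = Vec::new();
    for _ in 0..limit {
        let queue = queue.clone();
        let done = done.clone();
        workers.push(thread::spawn(move || loop {
            // Take the next URL; the lock is released when this statement ends.
            let next = queue.lock().unwrap().pop();
            match next {
                Some(_path) => {
                    // A real worker would download `_path` here.
                    *done.lock().unwrap() += 1;
                }
                // The queue is empty, so this worker is finished.
                None => break,
            }
        }));
    }

    for worker in workers {
        worker.join().expect("worker panicked");
    }
    let total = *done.lock().unwrap();
    total
}

fn main() {
    println!("completed {} downloads", run_bounded(100, 8));
}
```

&lt;p&gt;As soon as one worker finishes a URL it takes the next one from the queue,
which mirrors the way a new request starts whenever one of the 8 in flight
completes.&lt;/p&gt;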
&lt;p&gt;This code will slowly but steadily iterate over the 100,000 URLs, downloading
them concurrently. When I experimented with this, the exact level of concurrency
didn’t seem to matter very much, although I found the best performance when I
picked a concurrency of 50. Using 50 concurrent Tokio tasks, it took about 30
minutes to download all one hundred thousand HTML files.&lt;/p&gt;
&lt;p&gt;However, none of that matters. I’m not measuring the performance of Rust, Tokio
or Reqwest. These numbers have more to do with the web server and network
connection I’m using. The real performance here was my own developer
performance: With just a few lines of code I was able to write an asynchronous
I/O program that can scale as much as I would like. The &lt;span
class=&quot;code&quot;&gt;async&lt;/span&gt; and &lt;span class=&quot;code&quot;&gt;await&lt;/span&gt; keywords make
this code easy to write and easy to read.&lt;/p&gt;
</content></entry><entry><title>Using Result Combinator Functions in Rust</title><link href="https://patshaughnessy.net/2019/11/19/using-result-combinator-functions-in-rust" rel="alternate"></link><id href="https://patshaughnessy.net/2019/11/19/using-result-combinator-functions-in-rust" rel="alternate"></id><published>2019-11-19T00:00:00Z</published><updated>2019-11-19T00:00:00Z</updated><category>Rust</category><author><name>Pat Shaughnessy</name></author><summary type="html">&lt;div style=&quot;float: right; padding: 8px 0px 20px 30px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2019/11/19/train-yard.jpeg&quot;&gt;&lt;br/&gt;
  &lt;i&gt;Rust’s Result type can help you control your program’s&lt;br/&gt;
  flow by checking for errors in a succinct, elegant way&lt;/i&gt;
&lt;/div&gt;
&lt;p&gt;Using Rust for the first time, error handling was my biggest stumbling block.
Was this </summary><content type="html">&lt;div style=&quot;float: right; padding: 8px 0px 20px 30px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2019/11/19/train-yard.jpeg&quot;&gt;&lt;br/&gt;
  &lt;i&gt;Rust’s Result type can help you control your program’s&lt;br/&gt;
  flow by checking for errors in a succinct, elegant way&lt;/i&gt;
&lt;/div&gt;
&lt;p&gt;Using Rust for the first time, error handling was my biggest stumbling block.
Was this value a &lt;span class=&quot;code&quot;&gt;Result&amp;lt;T, E&amp;gt;&lt;/span&gt; or just a T?  And the
right T? The right E? I couldn’t just write the code I wanted to write. It
felt confusing and overly elaborate.&lt;/p&gt;
&lt;p&gt;But after a while, I started to get a feel for the basics of using &lt;span
class=&quot;code&quot;&gt;Result&lt;/span&gt;. I discovered that the combinator methods Result
provides, like &lt;span class=&quot;code&quot;&gt;map&lt;/span&gt;, &lt;span class=&quot;code&quot;&gt;or_else&lt;/span&gt;
and &lt;span class=&quot;code&quot;&gt;ok&lt;/span&gt;, made error handling fun. Well, maybe
that's a bit of an overstatement. They made using &lt;span
class=&quot;code&quot;&gt;Result&lt;/span&gt; a bit easier, at least.&lt;/p&gt;
&lt;p&gt;So far my favorite &lt;span class=&quot;code&quot;&gt;Result&lt;/span&gt; combinator method is
&lt;a
href=&quot;https://doc.rust-lang.org/std/result/enum.Result.html#method.and_then&quot;&gt;&lt;span
class=&quot;code&quot;&gt;and_then&lt;/span&gt;&lt;/a&gt;. Using &lt;span class=&quot;code&quot;&gt;and_then&lt;/span&gt; &lt;em&gt;is&lt;/em&gt;
actually fun! For example, I wrote &lt;a href=&quot;https://github.com/patshaughnessy/patshaughnessy.github.io/blob/master/src/lib.rs#L43&quot;&gt;this Rust
code&lt;/a&gt;
to generate the static HTML pages for this blog site:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;let&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; count &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; all_posts.len();
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;all_posts.sort_by_key(|p| Reverse(p.date));
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;let&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; params &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; CompileParams {all_posts: all_posts, output_path: output_path, draft: draft};
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Ok(params).and_then(compile_posts)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;          .and_then(compile_home_page)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;          .and_then(compile_rss_feed)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;          .map(|_output| count)&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Ignoring the details about sorting and counting, my code:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;First creates a struct holding input parameters, and wraps it using &lt;span class=&quot;code&quot;&gt;Ok(params)&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;And then&lt;/em&gt; tries to compile all the posts in my blog, passing in the input parameters&lt;/li&gt;
&lt;li&gt;&lt;em&gt;And then&lt;/em&gt; if this was successful, it tries to compile the home page
(index.html)&lt;/li&gt;
&lt;li&gt;&lt;em&gt;And then&lt;/em&gt; if this was successful, it tries to compile the RSS feed (index.xml)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If there was an error at any time in this process, it short circuits and stops.
Here’s a flowchart that illustrates this control flow:&lt;/p&gt;
&lt;div style=&quot;margin-left: auto; margin-right: auto; width:235px&quot;&gt;
&lt;br/&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2019/11/19/flowchart.png&quot;&gt;
&lt;/div&gt;
&lt;p&gt;The happy path is from top to bottom, along the left side. If any of the
compile methods fail, &lt;span class=&quot;code&quot;&gt;and_then&lt;/span&gt; short circuits the
happy path and jumps to the end.&lt;/p&gt;
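&lt;p&gt;As a minimal sketch of this short-circuit behavior, the example below chains
three closures with &lt;span class=&quot;code&quot;&gt;and_then&lt;/span&gt;. The closures, the
&lt;span class=&quot;code&quot;&gt;run_stages&lt;/span&gt; function and its &lt;span class=&quot;code&quot;&gt;fail_at&lt;/span&gt;
parameter are hypothetical stand-ins, not the real compile functions: each one
just adds 1 to a counter or returns an error message.&lt;/p&gt;
&lt;pre&gt;
```rust
// Runs three stand-in stages in order and reports the outcome as text.
// `fail_at` picks which stage (1, 2 or 3) should fail; 0 means none fail.
fn run_stages(fail_at: u32) -> String {
    // Each closure takes the number of stages finished so far and returns
    // either Ok(done + 1) or Err(message). The names mirror the post, but
    // the bodies are illustrative assumptions only.
    let compile_posts = |done: u32| {
        if fail_at == 1 {
            Err(String::from("posts failed"))
        } else {
            Ok(done + 1)
        }
    };
    let compile_home_page = |done: u32| {
        if fail_at == 2 {
            Err(String::from("home page failed"))
        } else {
            Ok(done + 1)
        }
    };
    let compile_rss_feed = |done: u32| {
        if fail_at == 3 {
            Err(String::from("RSS feed failed"))
        } else {
            Ok(done + 1)
        }
    };

    // and_then only calls the next closure when the previous value is Ok;
    // the first Err is passed through untouched to the end of the chain.
    let outcome = Ok(0u32)
        .and_then(compile_posts)
        .and_then(compile_home_page)
        .and_then(compile_rss_feed);

    format!("{:?}", outcome)
}

fn main() {
    println!("{}", run_stages(0)); // prints: Ok(3)
    println!("{}", run_stages(2)); // prints: Err("home page failed")
}
```
&lt;/pre&gt;
&lt;p&gt;When the second stage fails, the third closure is never called, which is the
jump to the end shown in the flowchart.&lt;/p&gt;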
&lt;h2&gt;Matching Result Types&lt;/h2&gt;
&lt;p&gt;To chain &lt;span class=&quot;code&quot;&gt;and_then&lt;/span&gt; methods together like this, I used
the same input and output types for each of the compile methods:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;fn &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;compile_posts&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(params: CompileParams) -&amp;gt; Result&amp;lt;CompileParams, InvalidPostError&amp;gt;&lt;/span&gt;&lt;/pre&gt;

&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;fn &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;compile_home_page&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(params: CompileParams) -&amp;gt; Result&amp;lt;CompileParams, InvalidPostError&amp;gt;&lt;/span&gt;&lt;/pre&gt;

&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;fn &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;compile_rss_feed&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(params: CompileParams) -&amp;gt; Result&amp;lt;CompileParams, InvalidPostError&amp;gt;&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Each method expects a &lt;span class=&quot;code&quot;&gt;CompileParams&lt;/span&gt; struct, and
returns one wrapped in &lt;span class=&quot;code&quot;&gt;Result&lt;/span&gt;. Rust unwraps the &lt;span
class=&quot;code&quot;&gt;CompileParams&lt;/span&gt; from one call and passes it to the next.&lt;/p&gt;
&lt;p&gt;I use &lt;span class=&quot;code&quot;&gt;InvalidPostError&lt;/span&gt; throughout my code to provide
a consistent way to return errors. This was a bit of a challenge at first,
until I realized it was easy to convert other types of errors into
&lt;span class=&quot;code&quot;&gt;InvalidPostError&lt;/span&gt; like this:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;impl &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;From&amp;lt;std::io::Error&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;for &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;InvalidPostError {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;fn &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;from&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(e: std::io::Error) -&amp;gt; InvalidPostError {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;let&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; message &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;format!(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;{}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, e);
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        InvalidPostError::new(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;message)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;}&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Now the Rust compiler knows how to map a &lt;span class=&quot;code&quot;&gt;std::io::Error&lt;/span&gt; into an &lt;span class=&quot;code&quot;&gt;InvalidPostError&lt;/span&gt;.&lt;/p&gt;
&lt;h2&gt;Error Handling the Old Fashioned Way&lt;/h2&gt;
&lt;p&gt;Here’s the code I didn’t have to write: (This is Ruby; substitute your favorite
PL that doesn't support &lt;a href=&quot;https://medium.com/@huund/monadic-error-handling-1e2ce66e3810&quot;&gt;monadic error
handling&lt;/a&gt;.)&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;if&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; compile_posts(params)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;if&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; compile_home_page(params)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;if&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; compile_rss_feed(params)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      puts &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Success!&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;else
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      puts &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Error compiling RSS Feed&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;else
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    puts &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Error compiling home page&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;else
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  puts &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Error compiling a blog post&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;I didn’t have to write a series of if/else blocks. This would have been tedious
to write and tedious to read. And I probably would have forgotten (or been
too lazy) to check one of the return values.&lt;/p&gt;
&lt;p&gt;And I didn’t have to write this code either:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;compile_posts&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(params)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;raise &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;InvalidPostError&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;.&lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;new&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Failed compiling the posts&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;compile_home_page&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(params)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;raise &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;InvalidPostError&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;.&lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;new&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Failed compiling the home page&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;def &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;compile_rss_feed&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(params)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;raise &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;InvalidPostError&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;.&lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;new&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Failed compiling the RSS feed&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;begin
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  compile_posts(params)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  compile_home_page(params)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  compile_rss_feed(params)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  puts &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Success&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;rescue &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;InvalidPostError =&amp;gt; e
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  puts e.message
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;end&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Once again this is fragile: I might raise the wrong exception type or not raise
one at all. Or I might rescue the wrong type. Worse, there’s no indication at
the call site what might happen.&lt;/p&gt;
&lt;p&gt;To be honest, I probably won’t bother handling errors at all for a simple Ruby
script like this. If an exception happens someday while building my blog site,
then I’ll deal with it then. I’d probably just write this code:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;compile_posts(params)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;compile_home_page(params)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;compile_rss_feed(params)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;puts &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;Success&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;/pre&gt;

&lt;h2&gt;Rust Error Handling: Easy To Read, Hard To Write&lt;/h2&gt;
&lt;p&gt;Combining results together using &lt;span class=&quot;code&quot;&gt;and_then&lt;/span&gt; and other
&lt;span class=&quot;code&quot;&gt;Result&lt;/span&gt; functions enables me to write error checking
code in a natural, succinct way:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;Ok(params).and_then(compile_posts)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;          .and_then(compile_home_page)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;          .and_then(compile_rss_feed)&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;This is just as simple to read as the Ruby version above that doesn’t check for
any errors. While it’s harder to write, having the Rust compiler check my
thought process as I piece together different code paths is a huge help.
Learning to use and get along with the Rust compiler is worth it: You end up
with code that is both readable and correct.&lt;/p&gt;
</content></entry><entry><title>How Rust Makes Error Handling Part of the Language</title><link href="https://patshaughnessy.net/2019/10/3/how-rust-makes-error-handling-part-of-the-language" rel="alternate"></link><id href="https://patshaughnessy.net/2019/10/3/how-rust-makes-error-handling-part-of-the-language" rel="alternate"></id><published>2019-10-03T00:00:00Z</published><updated>2019-10-03T00:00:00Z</updated><category>Rust</category><author><name>Pat Shaughnessy</name></author><summary type="html">&lt;div style=&quot;float: left; padding: 8px 30px 20px 0px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2019/10/3/fingers-toes.png&quot;&gt;&lt;br/&gt;
&lt;i&gt;In Spanish these are all “dedos,” while in English&lt;br/&gt;we can distinguish between fingers and toes. &lt;/i&gt;
&lt;/div&gt;
&lt;p&gt;Learning a foreign language can be an incredible experience, not only because
you can talk to new people, </summary><content type="html">&lt;div style=&quot;float: left; padding: 8px 30px 20px 0px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2019/10/3/fingers-toes.png&quot;&gt;&lt;br/&gt;
&lt;i&gt;In Spanish these are all “dedos,” while in English&lt;br/&gt;we can distinguish between fingers and toes. &lt;/i&gt;
&lt;/div&gt;
&lt;p&gt;Learning a foreign language can be an incredible experience, not only because
you can talk to new people, visit new countries, read new books, etc. When you
learn the words someone from a different culture uses, you start to see things
from their perspective. You understand the way they think a bit more.&lt;/p&gt;
&lt;p&gt;The same is true for programming languages. Learning the syntax, keywords and
patterns of a new programming language enables you to think about problems from
a different perspective. You learn to solve problems in a different way.&lt;/p&gt;
&lt;p&gt;I’ve been studying &lt;a href=&quot;https://www.rust-lang.org&quot;&gt;Rust&lt;/a&gt; recently, a new
programming language for me. As a Ruby developer, I was curious to learn how
Rust developers approach solving problems. What do Rust programs look like?
What new words would I learn?&lt;/p&gt;
&lt;h2&gt;Why Rust Was Difficult For Me&lt;/h2&gt;
&lt;p&gt;I knew it would be a challenge to learn Rust. I had heard horror stories about
how difficult the Rust compiler can be to use, or about how confusing the
ownership memory model and the borrow checker can be. And I was right: Rust is
a very difficult language to learn. But not because of move semantics or memory
management.&lt;/p&gt;
&lt;p&gt;For me, the most challenging syntax in Rust had to do with simple error
handling. Let’s take an example: opening and reading a text file. In Ruby, this
is a one-liner and error handling is completely optional:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#000000;&quot;&gt;string &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;File&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;.read(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;foo.txt&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;In Ruby, &lt;span class=&quot;code&quot;&gt;File.read&lt;/span&gt; returns a simple string. Will this
ever return an error? Who knows. Maybe Ruby will raise an exception, maybe not.
I don’t have to worry about that at the call site when I’m writing the code. I
can focus on the happy path, but I end up with a program that can’t handle
errors.&lt;/p&gt;
&lt;p&gt;Golang, at least, returns an error value explicitly when I try to read a file:&lt;/p&gt;
&lt;pre&gt;b, err := ioutil.ReadFile(&quot;foo.txt&quot;)
if err != nil {
    fmt.Print(err)
} else {
    str := string(b)
}&lt;/pre&gt;
&lt;p&gt;Here the Golang &lt;span class=&quot;code&quot;&gt;ioutil.ReadFile&lt;/span&gt; function returns two
values: the file’s contents as a byte slice, which I convert to the string I
want, and also an error value. The Go compiler forces me to
think about errors that might occur, at least for a moment. But error handling
is still optional. I can simply choose to ignore the &lt;span
class=&quot;code&quot;&gt;err&lt;/span&gt; value entirely. Most C programs work in a similar
fashion, returning an error code in some manner.  And if I do choose to handle
the error, I end up with verbose, messy code that checks for error codes over
and over again.&lt;/p&gt;
&lt;p&gt;In Rust, error handling is mandatory. Let’s try to rewrite the same example
using Rust:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;let mut&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; file &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;File::open(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;foo.txt&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;);
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;let mut&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; contents &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;String::new();
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;file.read_to_string(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;mut&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; contents);&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Right away I run into trouble when I try to compile this:&lt;/p&gt;
&lt;pre class=&quot;console&quot;&gt;error[E0599]: no method named `read_to_string` found for type
`std::result::Result&amp;lt;std::fs::File, std::io::Error&gt;` in the current scope&lt;/pre&gt;
&lt;p&gt;What? What is the Rust compiler talking about? I can see there’s a &lt;span
class=&quot;code&quot;&gt;read_to_string&lt;/span&gt; method on the &lt;span class=&quot;code&quot;&gt;File&lt;/span&gt;
struct &lt;a href=&quot;https://doc.rust-lang.org/std/io/trait.Read.html#method.read_to_string&quot;&gt;right in the
documentation&lt;/a&gt;!
(Actually the method is on the &lt;span class=&quot;code&quot;&gt;Read&lt;/span&gt; trait which &lt;span
class=&quot;code&quot;&gt;File&lt;/span&gt; implements.) The problem is the &lt;span
class=&quot;code&quot;&gt;File::open&lt;/span&gt; function doesn’t return a file at all. It
returns a value of type &lt;span class=&quot;code&quot;&gt;io::Result&amp;lt;File&amp;gt;&lt;/span&gt;:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;pub fn &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;open&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;&amp;lt;P: AsRef&amp;lt;Path&amp;gt;&amp;gt;(path: P) -&amp;gt; io::Result&amp;lt;File&amp;gt;&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;How do I use this? What does &lt;span class=&quot;code&quot;&gt;io::Result&amp;lt;File&amp;gt;&lt;/span&gt; even
mean? When I try to write Rust code the way I write Ruby or Go code, I get
cryptic errors and it doesn’t work.&lt;/p&gt;
&lt;p&gt;The problem is I’m trying to speak Rust the same way I speak in Ruby. Rust is a
foreign language; I need to learn some vocabulary before I can try to talk to
someone. This is why Rust is difficult to learn. It’s a foreign language that
uses many words completely unfamiliar to most developers.&lt;/p&gt;
&lt;h2&gt;Types Are the Vocabulary of Programming Languages&lt;/h2&gt;
&lt;p&gt;My wife is Spanish, and lucky for me she’s had the patience and the endurance
to teach me and our kids Spanish over the years. As a native English speaker,
it always seemed curious and amusing to me that Spanish has only one word for
fingers and toes, &lt;em&gt;dedos&lt;/em&gt;. Don’t people in Spain or Latin America ever need to
talk about only fingers and not toes? Or vice-versa? And in Spain I invariably
end up saying silly things like &lt;em&gt;dedos altos&lt;/em&gt; (“upper fingers”), or &lt;em&gt;dedos
bajos&lt;/em&gt; (“lower fingers”). I always worry about which digits I’m talking about.
Somehow, though, the Spanish never have any trouble with this; where the
&lt;em&gt;dedos&lt;/em&gt; are located always seems obvious to them from the context.&lt;/p&gt;
&lt;p&gt;But I wonder: Do Spanish speakers have trouble learning English when it comes
to fingers vs. toes? Do they ever say finger when they mean toe? The problem is
not just learning a new word. You have to learn the meaning behind the word.
English has a concept, a distinction, that Spanish doesn’t.&lt;/p&gt;
&lt;p&gt;Back to computer programming, the “words” we use in programming languages
aren’t only syntax tokens like if, else, let, etc. They are the values that we
pass around in our programs. And those values have types, even for loosely,
dynamically typed languages like Ruby.&lt;/p&gt;
&lt;p&gt;Aside from whatever formal definition Computer Science has for types, I simply
think of a value’s type as its meaning or purpose. To understand what role a
value plays in your program, you need to understand the concept behind its
type. Just like the words finger and toe represent certain anatomical concepts
in English, types like &lt;span class=&quot;code&quot;&gt;Result&amp;lt;T, E&amp;gt;&lt;/span&gt; or &lt;span
class=&quot;code&quot;&gt;Option&amp;lt;T&amp;gt;&lt;/span&gt; represent programming concepts in Rust - concepts
that foreigners need to learn for the first time.&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;blockquote&gt;
Language shapes the way we think, and determines what we can think about.&lt;br/&gt;
-- Benjamin Lee Whorf
&lt;/blockquote&gt;
&lt;p&gt;In fact, some linguists take this to the extreme: That a language’s words
determine what people in that community are able to think and talk about, what
concepts they can understand. (However, most modern linguists, &lt;a href=&quot;https://en.wikipedia.org/wiki/Linguistic_relativity&quot;&gt;according to
Wikipedia&lt;/a&gt;, don’t believe
this is actually true.)&lt;/p&gt;
&lt;p&gt;Because Rust includes the &lt;span class=&quot;code&quot;&gt;Result&lt;/span&gt; type, Rust
programmers are empowered to talk about error handling in a very natural way.
It’s part of their daily vocabulary. Of course, native Spanish speakers, I’m
guessing, have no trouble understanding the distinction between fingers and
toes. But I certainly have trouble understanding the concept behind &lt;span
class=&quot;code&quot;&gt;Result&lt;/span&gt; in Rust.&lt;/p&gt;
&lt;h2&gt;If Rust is Spanish, then Haskell is Latin&lt;/h2&gt;
&lt;p&gt;So what does &lt;span class=&quot;code&quot;&gt;Result&amp;lt;T, E&amp;gt;&lt;/span&gt; mean? What is a value of
type &lt;span class=&quot;code&quot;&gt;Result&amp;lt;T, E&amp;gt;&lt;/span&gt;?&lt;/p&gt;
&lt;p&gt;Just as human languages borrow words from other languages (many Spanish words
are taken from Latin or Arabic, while English borrowed many words from French and
German), programming languages borrow words and concepts from other, older
programming languages.&lt;/p&gt;
&lt;p&gt;Rust borrowed the concept behind the &lt;span class=&quot;code&quot;&gt;Result&amp;lt;T, E&amp;gt;&lt;/span&gt;
type from Haskell, a strongly typed functional programming language. Haskell
includes a type called &lt;span class=&quot;code&quot;&gt;Either&lt;/span&gt;:&lt;/p&gt;
&lt;pre&gt;data Either a b = Left a | Right b&lt;/pre&gt;
&lt;p&gt;This syntax seems bizarre at first glance but in fact it’s simple. Haskell
makes it easy to create new types by combining other types together. This line
of code means the &lt;span class=&quot;code&quot;&gt;Either&lt;/span&gt; type is a combination of two
other types: &lt;span class=&quot;code&quot;&gt;a&lt;/span&gt; and &lt;span class=&quot;code&quot;&gt;b&lt;/span&gt;.
Drawing that type equation, this is how I visualize Haskell &lt;span
class=&quot;code&quot;&gt;Either&lt;/span&gt; values:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://patshaughnessy.net/assets/2019/10/3/left-or-right.png&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;A single &lt;span class=&quot;code&quot;&gt;Either&lt;/span&gt; value can only encapsulate &lt;em&gt;either&lt;/em&gt;
a value of type &lt;span class=&quot;code&quot;&gt;a&lt;/span&gt; or a value of type &lt;span
class=&quot;code&quot;&gt;b&lt;/span&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;If the &lt;span class=&quot;code&quot;&gt;Either&lt;/span&gt; value is &lt;span
  class=&quot;code&quot;&gt;Left&lt;/span&gt;, then it contains an inner value of type &lt;span
  class=&quot;code&quot;&gt;a&lt;/span&gt;. This is written: &lt;span class=&quot;code&quot;&gt;Left a&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;If the &lt;span class=&quot;code&quot;&gt;Either&lt;/span&gt; value is &lt;span
  class=&quot;code&quot;&gt;Right&lt;/span&gt;, then it contains an inner value of type &lt;span
  class=&quot;code&quot;&gt;b&lt;/span&gt;. This is written: &lt;span class=&quot;code&quot;&gt;Right b&lt;/span&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &lt;span class=&quot;code&quot;&gt;Either&lt;/span&gt; type is also a “monad,” because Haskell
provides certain functions that create and operate on &lt;span
class=&quot;code&quot;&gt;Either&lt;/span&gt; values. I won’t cover this concept here today, but
when I have time I'll discuss monads and how they can be applied to error
handling in a future post.&lt;/p&gt;
&lt;p&gt;In Haskell, the &lt;span class=&quot;code&quot;&gt;Either&lt;/span&gt; type is completely general,
and you can use it to represent any programming concept you would like.  Rust
uses the concept behind &lt;span class=&quot;code&quot;&gt;Either&lt;/span&gt; for a specific
purpose: to implement error handling. If Haskell is Latin, then Rust is
Spanish, a younger language that borrows some of the older language’s
vocabulary and grammar.&lt;/p&gt;
&lt;h2&gt;Result&amp;lt;T, E&amp;gt; in Rust&lt;/h2&gt;
&lt;p&gt;In Rust, the &lt;span class=&quot;code&quot;&gt;Result&lt;/span&gt; type encapsulates two other types
like &lt;span class=&quot;code&quot;&gt;Either&lt;/span&gt;. A single &lt;span
class=&quot;code&quot;&gt;Result&lt;/span&gt; value has either one of those types or the other:&lt;/p&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2019/10/3/ok-or-err.png&quot;&gt;
&lt;p&gt;Instead of &lt;span class=&quot;code&quot;&gt;Left a&lt;/span&gt; and &lt;span class=&quot;code&quot;&gt;Right
b&lt;/span&gt; like in Haskell, Rust uses the words &lt;span class=&quot;code&quot;&gt;Ok(T)&lt;/span&gt;
and &lt;span class=&quot;code&quot;&gt;Err(E)&lt;/span&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;If the &lt;span class=&quot;code&quot;&gt;Result&lt;/span&gt; value is &lt;span
  class=&quot;code&quot;&gt;Ok&lt;/span&gt;, then it contains an inner value of type &lt;span
  class=&quot;code&quot;&gt;T&lt;/span&gt;. This is written: &lt;span class=&quot;code&quot;&gt;Ok(T)&lt;/span&gt;.
&lt;span class=&quot;code&quot;&gt;Ok(T)&lt;/span&gt; means some operation was successful, and the
result of the operation is a value of type &lt;span class=&quot;code&quot;&gt;T&lt;/span&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;If the &lt;span class=&quot;code&quot;&gt;Result&lt;/span&gt; value is &lt;span
  class=&quot;code&quot;&gt;Err&lt;/span&gt;, then it contains an inner value of type &lt;span
  class=&quot;code&quot;&gt;E&lt;/span&gt;. This is written: &lt;span class=&quot;code&quot;&gt;Err(E)&lt;/span&gt;.
Similarly, this means the operation was a failure, and the result of the
operation is an error of type &lt;span class=&quot;code&quot;&gt;E&lt;/span&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
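&lt;p&gt;Here is a minimal sketch of both cases using a standard library call that returns a &lt;span class=&quot;code&quot;&gt;Result&lt;/span&gt;. The &lt;span class=&quot;code&quot;&gt;parse_port&lt;/span&gt; function is a hypothetical helper invented for this illustration; only &lt;span class=&quot;code&quot;&gt;str::parse&lt;/span&gt; and &lt;span class=&quot;code&quot;&gt;ParseIntError&lt;/span&gt; come from the standard library.&lt;/p&gt;

```rust
use std::num::ParseIntError;

// Hypothetical helper (not from the post): parse a port number from text.
// T is u16 and E is ParseIntError in this Result.
fn parse_port(text: &str) -> Result<u16, ParseIntError> {
    text.trim().parse::<u16>()
}

fn main() {
    for input in ["8080", "eighty"] {
        match parse_port(input) {
            // Success: the inner value is the parsed u16.
            Ok(port) => println!("Ok: port {}", port),
            // Failure: the inner value describes what went wrong.
            Err(e) => println!("Err: {}", e),
        }
    }
}
```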
&lt;p&gt;Back to my open file example, the proper way to open a file and read it using
Rust is to check the &lt;span class=&quot;code&quot;&gt;Result&lt;/span&gt; values returned by the
Rust standard library functions:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;fn &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;main&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;() {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;let&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; file &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;File::open(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;foo.txt&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;);
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;match&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; file {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        Ok(file) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;println!(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;I have a file: &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;{:?}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, file),
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        Err(e) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;println!(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;There was an error: &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;{}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, e)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;}&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;And if I want to actually read in the contents of that file, I would check that
return value also:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;fn &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;main&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;() {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;let&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; file &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;File::open(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;foo.txt&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;);
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;match&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; file {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        Ok(&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;mut&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; file) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;{
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;            &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;let mut&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; contents &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;String::new();
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;            &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;match&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; file.read_to_string(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;mut&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; contents) {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;                Ok(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;_&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;println!(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;The file&amp;#39;s contents are: &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;{}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, contents),
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;                Err(e) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;println!(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;There was an error: &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;{}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, e)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;            }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        Err(e) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;println!(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;There was an error: &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;{}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, e)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;}&lt;/span&gt;&lt;/pre&gt;

&lt;h2&gt;The ? Operator In Rust&lt;/h2&gt;
&lt;p&gt;That last code snippet is quite a mouthful - error checking with Rust is even
more tedious and verbose than it is using Go!&lt;/p&gt;
&lt;p&gt;Fortunately, Rust includes an operator that allows Rust programmers to
abbreviate all of this logic. By appending the &lt;span class=&quot;code&quot;&gt;?&lt;/span&gt;
character to the call site of a function that returns a &lt;span
class=&quot;code&quot;&gt;Result&amp;lt;T, E&amp;gt;&lt;/span&gt; value, Rust automatically generates code
that checks the &lt;span class=&quot;code&quot;&gt;Result&amp;lt;T, E&amp;gt;&lt;/span&gt; value, and returns
the underlying &lt;span class=&quot;code&quot;&gt;T&lt;/span&gt; value if the result is &lt;span
class=&quot;code&quot;&gt;Ok(T)&lt;/span&gt;:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;fn &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;main&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;() {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;let mut&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; file &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;File::open(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;foo.txt&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;?&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;let mut&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; contents &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;String::new();
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    file.read_to_string(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;mut&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; contents)&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;?&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;}&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Here, the use of &lt;span class=&quot;code&quot;&gt;?&lt;/span&gt; after &lt;span
class=&quot;code&quot;&gt;File::open(&amp;quot;foo.txt&amp;quot;)&lt;/span&gt; tells the Rust compiler to check the
return value of &lt;span class=&quot;code&quot;&gt;File::open&lt;/span&gt; for me automatically:&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://patshaughnessy.net/assets/2019/10/3/success-or-failure.png&quot;&gt;&lt;br/&gt;&lt;/p&gt;
&lt;p&gt;If the return value of &lt;span class=&quot;code&quot;&gt;File::open&lt;/span&gt; is &lt;span
class=&quot;code&quot;&gt;Ok(T)&lt;/span&gt;, then Rust assigns the inner &lt;span
class=&quot;code&quot;&gt;T&lt;/span&gt; value to &lt;span class=&quot;code&quot;&gt;file&lt;/span&gt;. If &lt;span
class=&quot;code&quot;&gt;File::open&lt;/span&gt; returns &lt;span class=&quot;code&quot;&gt;Err(E)&lt;/span&gt;, then
Rust jumps to the end of the &lt;span class=&quot;code&quot;&gt;main&lt;/span&gt; function
immediately and returns.&lt;/p&gt;
&lt;p&gt;The program above is much more concise and easy to understand. The only problem
is that it doesn’t work! When I try to compile this, I get:&lt;/p&gt;
&lt;pre class=&quot;console&quot;&gt;error[E0277]: the `?` operator can only be used in a function that returns `Result` or `Option`
(or another type that implements `std::ops::Try`)
 --&gt; src/main.rs:5:20
  |
5 |     let mut file = File::open(&quot;foo.txt&quot;)?;
  |                    ^^^^^^^^^^^^^^^^^^^^^^ cannot use the `?` operator in
  a function that returns `()`
  |
  = help: the trait `std::ops::Try` is not implemented for `()`
  = note: required by `std::ops::Try::from_error`&lt;/pre&gt;
&lt;h2&gt;Rust Programs Revolve Around Error Handling&lt;/h2&gt;
&lt;p&gt;As the error message says, the &lt;span class=&quot;code&quot;&gt;?&lt;/span&gt; operator
generates code that will jump to the end of the &lt;span class=&quot;code&quot;&gt;main&lt;/span&gt;
function and return the &lt;span class=&quot;code&quot;&gt;Err(E)&lt;/span&gt; value, where
&lt;span class=&quot;code&quot;&gt;E&lt;/span&gt; is of type &lt;span
class=&quot;code&quot;&gt;std::io::Error&lt;/span&gt;. But I haven’t declared a return type for
&lt;span class=&quot;code&quot;&gt;main&lt;/span&gt;, so it implicitly returns &lt;span
class=&quot;code&quot;&gt;()&lt;/span&gt;. Therefore the Rust compiler gives me an error:&lt;/p&gt;
&lt;pre class=&quot;console&quot;&gt;the `?` operator can only be used in a function that returns `Result` or
`Option` (or another type that implements `std::ops::Try`)&lt;/pre&gt;
&lt;p&gt;The function containing the use of the &lt;span class=&quot;code&quot;&gt;?&lt;/span&gt; operator has
to return a value of type &lt;span class=&quot;code&quot;&gt;Result&amp;lt;T, E&amp;gt;&lt;/span&gt; with a
matching &lt;span class=&quot;code&quot;&gt;E&lt;/span&gt; type in order for this to make sense. I
have to extract my &lt;span class=&quot;code&quot;&gt;File&lt;/span&gt; calls into a separate
function, like this:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;fn &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;read&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;() -&amp;gt; Result&amp;lt;String, std::io::Error&amp;gt; {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;let mut&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; file &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;File::open(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;foo.txt&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;?&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;let mut&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; contents &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;String::new();
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    file.read_to_string(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;mut&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; contents)&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;?&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    Ok(contents)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;}
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;fn &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;main&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;() {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;match &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;read() {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        Ok(&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;str&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;println!(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;{}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;str&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;),
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        Err(e) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&amp;gt; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;println!(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;{:?}&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, e)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;}&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Note the new &lt;span class=&quot;code&quot;&gt;read()&lt;/span&gt; function above returns a value of
type &lt;span class=&quot;code&quot;&gt;Result&amp;lt;String, std::io::Error&amp;gt;&lt;/span&gt;. This allows the
use of the &lt;span class=&quot;code&quot;&gt;?&lt;/span&gt; operator to compile properly. For the
happy path, if my code is able to find the “foo.txt” file and read it, then
&lt;span class=&quot;code&quot;&gt;read()&lt;/span&gt; returns &lt;span
class=&quot;code&quot;&gt;Ok(contents)&lt;/span&gt;. However, if there’s an error, &lt;span
class=&quot;code&quot;&gt;read()&lt;/span&gt; will return &lt;span class=&quot;code&quot;&gt;Err(e)&lt;/span&gt;, where
&lt;span class=&quot;code&quot;&gt;e&lt;/span&gt; is a value of type &lt;span
class=&quot;code&quot;&gt;std::io::Error&lt;/span&gt;. Note &lt;span class=&quot;code&quot;&gt;open&lt;/span&gt; returns
the same error type that &lt;span class=&quot;code&quot;&gt;read&lt;/span&gt; does:&lt;/p&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2019/10/3/error-types.png&quot;&gt;
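&lt;p&gt;To make the shortcut concrete, here is a rough sketch of the same function written twice: once with &lt;span class=&quot;code&quot;&gt;?&lt;/span&gt; and once by hand. The function names and the &lt;span class=&quot;code&quot;&gt;path&lt;/span&gt; parameter are assumptions added for this illustration, and the hand-written version is only an approximation of what the compiler generates.&lt;/p&gt;

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

// Uses the ? operator, like read() in the post, but takes the path as a
// parameter (an assumption made so the sketch is reusable).
fn read_with_question_mark(path: &Path) -> Result<String, io::Error> {
    let mut file = File::open(path)?;
    let mut contents = String::new();
    file.read_to_string(&mut contents)?;
    Ok(contents)
}

// Roughly what each ? stands for: unwrap Ok, return early on Err.
// The real expansion also passes the error through From::from, which
// changes nothing here because both error types are already io::Error.
fn read_by_hand(path: &Path) -> Result<String, io::Error> {
    let mut file = match File::open(path) {
        Ok(file) => file,
        Err(e) => return Err(e),
    };
    let mut contents = String::new();
    match file.read_to_string(&mut contents) {
        Ok(_) => {}
        Err(e) => return Err(e),
    }
    Ok(contents)
}

fn main() {
    let path = Path::new("foo.txt");
    match read_with_question_mark(path) {
        Ok(text) => println!("with ?: {}", text),
        Err(e) => println!("with ?: error: {}", e),
    }
    match read_by_hand(path) {
        Ok(text) => println!("by hand: {}", text),
        Err(e) => println!("by hand: error: {}", e),
    }
}
```

&lt;p&gt;Both versions should behave the same way for a missing or unreadable file; the &lt;span class=&quot;code&quot;&gt;?&lt;/span&gt; version just hides the early returns.&lt;/p&gt;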
&lt;p&gt;This is where Rust shines. It allows for concise and readable error handling
that is also thorough and correct. The Rust compiler checks for error handling
completeness at &lt;em&gt;compile time&lt;/em&gt;, before I ever run my program.&lt;/p&gt;
&lt;p&gt;Now that I’ve learned some vocabulary words, now that I can understand how
native Rust speakers use the word &lt;span class=&quot;code&quot;&gt;Result&amp;lt;T, E&amp;gt;&lt;/span&gt;, I
can have a Rust conversation about error handling. I can begin to think like
Rust developers think. I can start to see things from their perspective.&lt;/p&gt;
&lt;p&gt;And I begin to realize that Rust programs tend to be designed with error
handling in mind. Notice above how I had to extract a separate function that
returned a value of type &lt;span class=&quot;code&quot;&gt;Result&amp;lt;T, E&amp;gt;&lt;/span&gt;, just
because of the &lt;span class=&quot;code&quot;&gt;?&lt;/span&gt; operator. The overall structure of
my program is determined by error handling just as much as it’s determined by
the nature of the task I’m trying to accomplish. Rust programmers think about
errors and what might go wrong from the very beginning, from when they start
writing code. To be honest, I've often thought about errors and what might go
wrong as an afterthought, after I've written and deployed my code.&lt;/p&gt;
</content></entry><entry><title>Using Rust to Build a Blog Site</title><link href="https://patshaughnessy.net/2019/9/4/using-rust-to-build-a-blog-site" rel="alternate"></link><id href="https://patshaughnessy.net/2019/9/4/using-rust-to-build-a-blog-site" rel="alternate"></id><published>2019-09-04T00:00:00Z</published><updated>2019-09-04T00:00:00Z</updated><category>Rust</category><author><name>Pat Shaughnessy</name></author><summary type="html">&lt;div style=&quot;float: left; padding: 8px 30px 20px 0px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2019/9/4/batteries.jpg&quot;&gt;&lt;br/&gt;
&lt;i&gt; Rust comes with batteries included&lt;br/&gt;
    &lt;small&gt;(source: &lt;a href=&quot;https://commons.wikimedia.org/wiki/File:Neo-Geo-Pocket-Color-w-batteries.jpg&quot;&gt;Wikimedia Commons&lt;/a&gt;)&lt;/small&gt;&lt;/i&gt;
&lt;/div&gt;
&lt;p&gt;After “Hello World,” blog sites </summary><content type="html">&lt;div style=&quot;float: left; padding: 8px 30px 20px 0px; text-align: center; line-height:18px&quot;&gt;
  &lt;img src=&quot;https://patshaughnessy.net/assets/2019/9/4/batteries.jpg&quot;&gt;&lt;br/&gt;
&lt;i&gt; Rust comes with batteries included&lt;br/&gt;
    &lt;small&gt;(source: &lt;a href=&quot;https://commons.wikimedia.org/wiki/File:Neo-Geo-Pocket-Color-w-batteries.jpg&quot;&gt;Wikimedia Commons&lt;/a&gt;)&lt;/small&gt;&lt;/i&gt;
&lt;/div&gt;
&lt;p&gt;After “Hello World,” blog sites are the world’s second most unneeded
application. If you want to write a blog, use Medium, Wordpress or just
Twitter. The world doesn’t need another blog app.&lt;/p&gt;
&lt;p&gt;However, like Hello World, building a static site generator is a great way to
get your feet wet in a new programming language. Recently I rewrote &lt;a href=&quot;https://github.com/patshaughnessy/patshaughnessy.github.io/blob/master/src/lib.rs&quot;&gt;the script
I use to generate this web
site&lt;/a&gt;
using Rust: I needed to update and fix my script, but really I was looking for
an excuse to write Rust. Despite its reputation as a difficult-to-learn,
expert-level language, Rust turned out to be a great choice for the simple
task of generating a few HTML files, quickly and reliably. Why? Not because of
its sophisticated borrow checker or support for safe concurrency.&lt;/p&gt;
&lt;p&gt;Rust was a great choice for me because I didn’t have to write most of the code.
Rust’s dependency management and build tool,
&lt;a href=&quot;https://doc.rust-lang.org/book/ch01-03-hello-cargo.html&quot;&gt;Cargo&lt;/a&gt;, allowed me to
glue together open source Rust libraries called “crates” which do most of the
work. The Rust community’s crate registry, &lt;a href=&quot;https://crates.io&quot;&gt;crates.io&lt;/a&gt;, has
over 29,000 crates available.  Downloading, compiling and using them is dead
simple. And writing a blog site using Rust turned out to be simple too.&lt;/p&gt;
&lt;h2&gt;My Cargo.toml File&lt;/h2&gt;
&lt;p&gt;I needed a few important features to generate this web site. I wanted my script
to work like this for each blog post:&lt;/p&gt;
&lt;img src=&quot;https://patshaughnessy.net/assets/2019/9/4/flowchart.png&quot;/&gt;
&lt;p&gt;For each blog post, my new Rust script had to: parse the markdown source file
and convert it to HTML markup, highlight the syntax of my code snippets using
&amp;lt;style&amp;gt; tags and CSS, and use a template to insert the HTML for each post
into the surrounding web layout/design. Sounds like a lot of work, right?&lt;/p&gt;
&lt;p&gt;Wrong. Other Rust developers smarter than me had already implemented all of
this. All I had to do was find the crates I needed and add them to my
Cargo.toml file:&lt;/p&gt;
&lt;pre type=&quot;console&quot;&gt;[dependencies]
maud = &quot;*&quot;
pulldown-cmark = &quot;*&quot;
syntect = &quot;3.0&quot;&lt;/pre&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/raphlinus/pulldown-cmark&quot;&gt;Pulldown-cmark&lt;/a&gt; is a markdown
parser crate, &lt;a href=&quot;https://github.com/trishume/syntect&quot;&gt;Syntect&lt;/a&gt; is a color syntax
highlighting crate, and &lt;a href=&quot;https://github.com/lfairy/maud&quot;&gt;Maud&lt;/a&gt; is an HTML
template crate. Actually, to be honest I ended up adding a few other crates to
get my script to work:&lt;/p&gt;
&lt;pre type=&quot;console&quot;&gt;[dependencies]
maud = &quot;*&quot;
pulldown-cmark = &quot;*&quot;
regex = &quot;*&quot;
lazy_static = &quot;*&quot;
syntect = &quot;3.0&quot;
chrono = &quot;*&quot;
clap = &quot;*&quot;
ordinal = &quot;*&quot;&lt;/pre&gt;
&lt;p&gt;I’m not sure why, but the Rust standard library is very minimal. Features that
are included in other languages, like regular expressions or date/time parsing,
are handled by crates (e.g. regex and chrono).&lt;/p&gt;
&lt;p&gt;In any case, all I had to do was build my project and Cargo downloaded
everything I needed:&lt;/p&gt;
&lt;pre type=&quot;console&quot;&gt;$ cargo build --release
    Updating crates.io index
  Downloaded chrono v0.4.7
  Downloaded clap v2.33.0
  Downloaded maud v0.19.0
  Downloaded lazy_static v1.2.0
  Downloaded pulldown-cmark v0.2.0
  Downloaded ordinal v0.2.2
  Downloaded regex v1.1.0
  Downloaded syntect v3.0.2
  Downloaded libc v0.2.44
  Downloaded num-integer v0.1.41
  Downloaded num-traits v0.2.8
  Downloaded time v0.1.42

etc…
   Compiling syntect v3.0.2
   Compiling blogc v0.1.0 (/Users/pat/apps/patshaughnessy.github.io)
    Finished release [optimized] target(s) in 2m 27s&lt;/pre&gt;
&lt;p&gt;It couldn’t be easier! During the rest of this post, I’ll show you how I used
these three crates: Pulldown-cmark, Syntect and Maud.&lt;/p&gt;
&lt;h2&gt;Pulldown-cmark&lt;/h2&gt;
&lt;p&gt;Now that my blog app included the Pulldown-cmark crate, using it was just a
matter of pasting in a few lines of code from the &lt;a href=&quot;https://docs.rs/pulldown-cmark/0.5.3/pulldown_cmark/html/fn.push_html.html&quot;&gt;helpful example on
docs.rs&lt;/a&gt;:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;let&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; parser &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Parser::new(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;markdown);
&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;let mut&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; html &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;String::new();
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;html::push_html(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;mut&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; html, parser);&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;The first line created a &lt;span class=&quot;code&quot;&gt;Parser&lt;/span&gt; struct, passing in a
reference to my markdown string. Then I created an empty, mutable target
string, called &lt;span class=&quot;code&quot;&gt;html&lt;/span&gt;. Last, I called the &lt;span
class=&quot;code&quot;&gt;push_html&lt;/span&gt; function which parsed the markdown source,
converted it to HTML and saved it into &lt;span class=&quot;code&quot;&gt;html&lt;/span&gt;. I didn’t
have to do any work whatsoever.&lt;/p&gt;
&lt;p&gt;In fact, the only real work for me had to do with “header” strings present at
the top of each post source file. For example, the
&lt;a href=&quot;https://raw.githubusercontent.com/patshaughnessy/patshaughnessy.github.io/master/posts/2017-12-15-looking-inside-postgres-at-a-gist-index.markdown&quot;&gt;2017-12-15-looking-inside-postgres-at-a-gist-index.markdown&lt;/a&gt;
file starts with:&lt;/p&gt;
&lt;pre type=&quot;console&quot;&gt;title: &quot;Looking Inside Postgres at a GiST Index&quot;
date: 2017/12/15
tag: the Postgres LTREE Extension

etc…&lt;/pre&gt;
&lt;p&gt;Here the first three lines are metadata values about the post and not part of
the post content. So before calling Pulldown-cmark, my script parses and
removes these header lines:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;fn &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;other_lines&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(lines: &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Vec&amp;lt;String&amp;gt;) -&amp;gt; Vec&amp;lt;String&amp;gt; {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  lines.iter().skip_while(|l| is_header(l)).map(|l| l.to_string()).collect()
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;}&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Above, the &lt;span class=&quot;code&quot;&gt;lines&lt;/span&gt; parameter is an array of strings,
each a single line of text in the markdown source file. (More precisely, it’s a
reference to a &lt;span class=&quot;code&quot;&gt;Vec&amp;lt;String&amp;gt;&lt;/span&gt;, not an array.) The code
is fairly readable: &lt;span class=&quot;code&quot;&gt;other_lines&lt;/span&gt; creates an iterator
over the lines, skips the leading header lines, and then collects the
remaining lines into a new vector which the function returns.&lt;/p&gt;
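&lt;p&gt;The post doesn’t show &lt;span class=&quot;code&quot;&gt;is_header&lt;/span&gt;, so the version below is a guess made only to get a runnable sketch: it treats lines starting with &lt;span class=&quot;code&quot;&gt;title:&lt;/span&gt;, &lt;span class=&quot;code&quot;&gt;date:&lt;/span&gt; or &lt;span class=&quot;code&quot;&gt;tag:&lt;/span&gt; as headers. The sketch also takes a slice rather than a &lt;span class=&quot;code&quot;&gt;&amp;amp;Vec&amp;lt;String&amp;gt;&lt;/span&gt;, which is another assumption on my part.&lt;/p&gt;

```rust
// Hypothetical stand-in for the post's is_header helper, whose real
// implementation is not shown: treats "title:", "date:" and "tag:" lines
// as headers.
fn is_header(line: &str) -> bool {
    ["title:", "date:", "tag:"]
        .iter()
        .any(|prefix| line.starts_with(prefix))
}

// Same idea as other_lines in the post: drop the leading header lines and
// return everything after them as owned strings.
fn other_lines(lines: &[String]) -> Vec<String> {
    lines
        .iter()
        .skip_while(|l| is_header(l))
        .cloned()
        .collect()
}

fn main() {
    let lines: Vec<String> = vec![
        "title: \"Looking Inside Postgres at a GiST Index\"",
        "date: 2017/12/15",
        "tag: the Postgres LTREE Extension",
        "",
        "First paragraph of the post.",
    ]
    .into_iter()
    .map(String::from)
    .collect();

    for line in other_lines(&lines) {
        println!("{}", line);
    }
}
```

&lt;p&gt;Because &lt;span class=&quot;code&quot;&gt;skip_while&lt;/span&gt; stops skipping at the first line that isn’t a header, a line further down the post that happens to start with &lt;span class=&quot;code&quot;&gt;date:&lt;/span&gt; is kept as content.&lt;/p&gt;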
&lt;p&gt;Here’s the complete &lt;span class=&quot;code&quot;&gt;html_from_markdown&lt;/span&gt; function,
which calls &lt;span class=&quot;code&quot;&gt;other_lines&lt;/span&gt;, joins them together into a
single large string, and then passes that to Pulldown-cmark:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;fn &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;html_from_markdown&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(lines: &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Vec&amp;lt;String&amp;gt;) -&amp;gt; Result&amp;lt;String, InvalidPostError&amp;gt; {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;let mut&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; markdown &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;String::new();
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;for&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; line &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;in &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;other_lines(lines) {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    markdown.push_str(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;line);
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    markdown.push(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;#39;&lt;/span&gt;&lt;span style=&quot;color:#96b5b4;&quot;&gt;\n&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;#39;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;);
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;let&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; parser &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Parser::new(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;markdown);
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;let mut&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; html &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;String::new();
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  html::push_html(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;mut&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; html, parser);
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  Ok(with_delim_removed(with_highlighted_code_snippets(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;html)))
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;}&lt;/span&gt;&lt;/pre&gt;

&lt;h2&gt;Syntect&lt;/h2&gt;
&lt;p&gt;If you read the code above carefully, you’ll notice &lt;span
class=&quot;code&quot;&gt;html_from_markdown&lt;/span&gt; calls &lt;span
class=&quot;code&quot;&gt;with_highlighted_code_snippets&lt;/span&gt; before returning the HTML
for each post. This function performs color syntax highlighting.&lt;/p&gt;
&lt;p&gt;The code snippets in each of my blog posts appear inside of &amp;lt;pre&amp;gt;…&amp;lt;/pre&amp;gt;
tags.  And I use a “type” attribute to indicate the programming language of the
snippet. For example:&lt;/p&gt;
&lt;pre type=&quot;console&quot;&gt;&amp;lt;pre type=&quot;ruby&quot;&gt;
puts &quot;This is Ruby code I’m writing about…&quot;
&amp;lt;/pre&gt;&lt;/pre&gt;
&lt;p&gt;Like parsing markdown, syntax highlighting is a very complex task: The Syntect
crate has to parse the given code snippet, determine the semantic meaning of
each keyword in the snippet based on the provided programming language, and
then insert the proper color information. Thank goodness I didn’t have to write
that code!&lt;/p&gt;
&lt;p&gt;But using Syntect was easy:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;pub fn &lt;/span&gt;&lt;span style=&quot;color:#795da3;&quot;&gt;highlighted_html_for_language&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(snippet: &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;String, attributes: String) -&amp;gt; String {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  lazy_static! {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;static ref &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;SYNTAX_SET&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;: SyntaxSet &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;SyntaxSet::load_from_folder(syntax_path()).unwrap();
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;static ref &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;THEME&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;: Theme &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;= &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;ThemeSet::get_theme(theme_path().as_path()).unwrap();
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;static ref &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;RUBY_SYNTAX&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;: &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;&amp;#39;static&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; SyntaxReference &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;SYNTAX_SET&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;.find_syntax_by_extension(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;rb&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;).unwrap();
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;static ref &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;RUST_SYNTAX&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;: &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;&amp;#39;static&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; SyntaxReference &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      &lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;SYNTAX_SET&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;.find_syntax_by_extension(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;rs&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;).unwrap();
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;etc&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;...
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;if&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; attributes.contains(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;ruby&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    highlighted_html_for_string(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;snippet, &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;SYNTAX_SET&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;RUBY_SYNTAX&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;THEME&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  } &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;else if&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; attributes.contains(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;rust&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    highlighted_html_for_string(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;snippet, &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;SYNTAX_SET&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;RUST_SYNTAX&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;THEME&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;etc&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;...&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;First I used a &lt;span class=&quot;code&quot;&gt;lazy_static&lt;/span&gt; block to initialize a few
constant values.
(&lt;a href=&quot;https://github.com/rust-lang-nursery/lazy-static.rs&quot;&gt;lazy_static&lt;/a&gt; is another
crate I didn’t have to write!) Each value in this block is initialized the first
time it’s used and then never again. The values are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;span class=&quot;code&quot;&gt;SYNTAX_SET&lt;/span&gt;: These are the Sublime syntax files
describing each programming language I need to colorize. vim is my editor,
but I use Sublime for color syntax highlighting :) I just downloaded these
files for the languages I needed and checked them into my app.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;span class=&quot;code&quot;&gt;THEME&lt;/span&gt;: These are the Sublime “theme” files, which
select the colors to use for each type of code keyword. I found and adapted
one of these files. They play the role of a CSS file, but use XML syntax.
Weird, but not hard to figure out.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;span class=&quot;code&quot;&gt;RUBY_SYNTAX&lt;/span&gt;, &lt;span class=&quot;code&quot;&gt;RUST_SYNTAX&lt;/span&gt;,
etc. The lazy block also looks up the syntax language file for each language.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
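&lt;p&gt;To make the “initialize once, on first use” idea concrete, here’s a minimal
sketch that uses only the standard library’s &lt;span class=&quot;code&quot;&gt;std::sync::OnceLock&lt;/span&gt;
instead of the lazy_static crate. This isn’t the code from my app: the
&lt;span class=&quot;code&quot;&gt;languages&lt;/span&gt; function and the &lt;span class=&quot;code&quot;&gt;LOAD_COUNT&lt;/span&gt;
counter are made up for illustration.&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Counts how many times the initializer below actually runs.
static LOAD_COUNT: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for loading syntax files from disk.
fn languages() -&amp;gt; &amp;amp;&amp;#39;static Vec&amp;lt;String&amp;gt; {
  static LANGUAGES: OnceLock&amp;lt;Vec&amp;lt;String&amp;gt;&amp;gt; = OnceLock::new();
  LANGUAGES.get_or_init(|| {
    LOAD_COUNT.fetch_add(1, Ordering::SeqCst);
    vec![&amp;quot;ruby&amp;quot;.to_string(), &amp;quot;rust&amp;quot;.to_string()]
  })
}

fn main() {
  // The first call runs the initializer; the second reuses the stored value.
  println!(&amp;quot;{:?}&amp;quot;, languages());
  println!(&amp;quot;{:?}&amp;quot;, languages());
  println!(&amp;quot;initializer ran {} time(s)&amp;quot;, LOAD_COUNT.load(Ordering::SeqCst));
}&lt;/pre&gt;

&lt;p&gt;However many times &lt;span class=&quot;code&quot;&gt;languages&lt;/span&gt; is called, the closure
passed to &lt;span class=&quot;code&quot;&gt;get_or_init&lt;/span&gt; runs only once.&lt;/p&gt;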
&lt;p&gt;Later my &lt;span class=&quot;code&quot;&gt;highlighted_html_for_language&lt;/span&gt; function
checks which programming language my post displays, and calls &lt;span
class=&quot;code&quot;&gt;syntect::html::highlighted_html_for_string&lt;/span&gt; with the proper
values:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#a71d5d;&quot;&gt;if&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; attributes.contains(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;ruby&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    highlighted_html_for_string(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;snippet, &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;SYNTAX_SET&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;RUBY_SYNTAX&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;THEME&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  } &lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;else if&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; attributes.contains(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;rust&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;) {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    highlighted_html_for_string(&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;snippet, &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;SYNTAX_SET&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;RUST_SYNTAX&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;, &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span style=&quot;color:#d08770;&quot;&gt;THEME&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;etc&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;...&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;&lt;span class=&quot;code&quot;&gt;attributes&lt;/span&gt; is a string holding the HTML attributes from the
&amp;lt;pre&amp;gt; tag surrounding the code snippet in my post source. My app uses
regular expressions to find the &amp;lt;pre&amp;gt;…&amp;lt;/pre&amp;gt; HTML blocks, parses the
attributes and provides them to &lt;span
class=&quot;code&quot;&gt;highlighted_html_for_language&lt;/span&gt;.&lt;/p&gt;
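&lt;p&gt;As a rough sketch of that last step, here’s one way to pull the attributes and
the snippet out of the first &amp;lt;pre&amp;gt; block in a post. To keep the example
self-contained it uses plain string searching from the standard library rather
than regular expressions, and the &lt;span class=&quot;code&quot;&gt;first_pre_block&lt;/span&gt; helper
is a hypothetical name, not a function from my app.&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
// Hypothetical helper: returns the attributes and body of the first
// &amp;lt;pre ...&amp;gt;...&amp;lt;/pre&amp;gt; block, or None if no block is found.
fn first_pre_block(source: &amp;amp;str) -&amp;gt; Option&amp;lt;(&amp;amp;str, &amp;amp;str)&amp;gt; {
  let open = source.find(&amp;quot;&amp;lt;pre&amp;quot;)?;
  let after_open = &amp;amp;source[open + &amp;quot;&amp;lt;pre&amp;quot;.len()..];
  // Everything up to the closing bracket of the opening tag is attributes.
  let tag_end = after_open.find(&amp;#39;&amp;gt;&amp;#39;)?;
  let attributes = after_open[..tag_end].trim();
  let body_and_rest = &amp;amp;after_open[tag_end + 1..];
  let close = body_and_rest.find(&amp;quot;&amp;lt;/pre&amp;gt;&amp;quot;)?;
  Some((attributes, &amp;amp;body_and_rest[..close]))
}

fn main() {
  let post = &amp;quot;&amp;lt;p&amp;gt;Intro&amp;lt;/p&amp;gt;\n&amp;lt;pre type=\&amp;quot;ruby\&amp;quot;&amp;gt;\nputs 1\n&amp;lt;/pre&amp;gt;\n&amp;quot;;
  if let Some((attributes, snippet)) = first_pre_block(post) {
    println!(&amp;quot;attributes: {}&amp;quot;, attributes);
    println!(&amp;quot;is ruby: {}&amp;quot;, attributes.contains(&amp;quot;ruby&amp;quot;));
    println!(&amp;quot;snippet: {:?}&amp;quot;, snippet);
  }
}&lt;/pre&gt;

&lt;p&gt;A real post can contain many snippets, so an actual implementation would need
to loop over every block, not just the first one.&lt;/p&gt;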
&lt;h2&gt;Maud&lt;/h2&gt;
&lt;p&gt;Now my script had HTML for each blog post. All I had to do was save it in a
series of HTML files. But first I needed a template engine for Rust, like ERB
for Ruby or Mustache for Node.js.&lt;/p&gt;
&lt;p&gt;This turned out to be one of the most fun parts of this project. I rewrote &lt;a href=&quot;https://github.com/patshaughnessy/patshaughnessy.github.io/tree/master/src/layout&quot;&gt;my
HTML
markup&lt;/a&gt;
using Maud &lt;span class=&quot;code&quot;&gt;@&lt;/span&gt; directives, like this:&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
&lt;span style=&quot;color:#4f5b66;&quot;&gt;@&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;if let &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;Some(&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;ref&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; t) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; post.tag {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  div class&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;header&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;{
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;More on &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(t)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  div class&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;links&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;{
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    ul {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;@&lt;/span&gt;&lt;span style=&quot;color:#a71d5d;&quot;&gt;for &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(link_url, link_title) &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;in&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt; recent_links {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        li {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;          a href&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;=&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;{ &lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span style=&quot;color:#008080;&quot;&gt;/&lt;/span&gt;&lt;span style=&quot;color:#4f5b66;&quot;&gt;&amp;quot; &lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;(link_url) } {
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;            (link_title)
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;          }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;        }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;      }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;    }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;  }
&lt;/span&gt;&lt;span style=&quot;color:#000000;&quot;&gt;}&lt;/span&gt;&lt;/pre&gt;

&lt;p&gt;Maud doesn’t parse the layout code at runtime, like ERB does in Ruby. Instead,
the &lt;span class=&quot;code&quot;&gt;@if&lt;/span&gt; and &lt;span class=&quot;code&quot;&gt;@for&lt;/span&gt; directives
above are expanded at compile time by Maud’s &lt;span class=&quot;code&quot;&gt;html!&lt;/span&gt;
macro. In fact, all of the HTML elements that appear above, like
&lt;span class=&quot;code&quot;&gt;div&lt;/span&gt; and &lt;span class=&quot;code&quot;&gt;ul&lt;/span&gt;, are handled by
the same macro. This means my Maud layout code is actually Rust code! And that means the
Rust compiler will check it and make sure it’s valid before it ever runs.&lt;/p&gt;
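&lt;p&gt;To show the general idea of markup that the compiler checks, here’s a toy
example using a small &lt;span class=&quot;code&quot;&gt;macro_rules!&lt;/span&gt; macro. This is only
an illustration under my own assumptions: the &lt;span class=&quot;code&quot;&gt;tag!&lt;/span&gt; macro
is made up, it is not part of Maud, and Maud itself works differently and does
much more.&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
// Toy macro for illustration only; this is not how Maud is implemented.
macro_rules! tag {
  ($name:ident, $body:expr) =&amp;gt; {
    format!(&amp;quot;&amp;lt;{0}&amp;gt;{1}&amp;lt;/{0}&amp;gt;&amp;quot;, stringify!($name), $body)
  };
}

fn main() {
  let titles = [&amp;quot;First post&amp;quot;, &amp;quot;Second post&amp;quot;];
  // Each tag!(...) call is expanded into ordinary Rust at compile time.
  let items: String = titles.iter().map(|title| tag!(li, title)).collect();
  println!(&amp;quot;{}&amp;quot;, tag!(ul, items));
}&lt;/pre&gt;

&lt;p&gt;If I leave out an argument or misspell a variable inside
&lt;span class=&quot;code&quot;&gt;tag!&lt;/span&gt;, the program won’t compile, which is the same kind
of safety net the compiler gives my Maud layouts.&lt;/p&gt;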
&lt;p&gt;Converting my old ERB templates into Rust macros was a bit tedious, but it was
a great way to review and clean up my HTML. In fact, I found a number of
mistakes in my HTML code that had been there for 10 years or longer.
It was like showing my dirty laundry to the Rust compiler. By the time the
compiler was done and let me compile my layout, it was very clean!&lt;/p&gt;
&lt;h2&gt;Was It Worth It?&lt;/h2&gt;
&lt;p&gt;It took me several months of spare time, an hour here and an hour there,
to rewrite my blog in Rust. An experienced Rust developer working full time
could probably have done it in a day or two.&lt;/p&gt;
&lt;p&gt;What did I get for all this effort? Now I have a script that compiles all 146
of my markdown posts very quickly. My old Ruby script took 7.7 seconds to do
this, while the new Rust script only takes 0.28 seconds! That’s over 27 times
faster! In fact, the Rust code is so fast at parsing and compiling the markdown
files that I don’t check which files need to be recompiled by comparing
timestamps, i.e. what a Makefile would do during a build process. And I don’t
process the posts in parallel. Why bother? By the time I pressed ENTER and
looked up, Rust was almost done building all 146 files in sequence, one after
the other.&lt;/p&gt;
&lt;p&gt;But what else did I get? The biggest improvement to my blog script, actually,
wasn’t the performance. It was the error handling I added. I didn’t mention
this above, but using the Rust standard library required me to use the
&lt;span class=&quot;code&quot;&gt;Result&amp;lt;T, E&amp;gt;&lt;/span&gt; generic type. This, in turn, forced me to
think about what might go wrong and what to do when it did go wrong. I’ll cover
this in my next article.  I ended up with a script that was much more reliable
and resilient to silly mistakes in my source files, and that gave me helpful
error messages… all the while running 27 times faster.&lt;/p&gt;
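&lt;p&gt;To give a flavor of what that looks like, here’s a minimal sketch of a function
that returns a &lt;span class=&quot;code&quot;&gt;Result&lt;/span&gt;. It’s an illustration, not the
code from my app: the fields of &lt;span class=&quot;code&quot;&gt;InvalidPostError&lt;/span&gt;, the
&lt;span class=&quot;code&quot;&gt;title_from&lt;/span&gt; function and the “title:” rule are all
assumptions I made up for this example.&lt;/p&gt;
&lt;pre style=&quot;background-color:#ffffff;&quot;&gt;
use std::fmt;

// Hypothetical stand-in for an error type like InvalidPostError;
// the real definition is not shown in this post.
#[derive(Debug, PartialEq)]
struct InvalidPostError {
  message: String,
}

impl fmt::Display for InvalidPostError {
  fn fmt(&amp;amp;self, f: &amp;amp;mut fmt::Formatter) -&amp;gt; fmt::Result {
    write!(f, &amp;quot;Invalid post: {}&amp;quot;, self.message)
  }
}

// Hypothetical rule: the first line of a post must start with &amp;quot;title:&amp;quot;.
fn title_from(lines: &amp;amp;[String]) -&amp;gt; Result&amp;lt;String, InvalidPostError&amp;gt; {
  // An empty file is an error; the ? operator returns it to the caller.
  let first = lines.first().ok_or(InvalidPostError {
    message: &amp;quot;file is empty&amp;quot;.to_string(),
  })?;
  match first.strip_prefix(&amp;quot;title:&amp;quot;) {
    Some(title) =&amp;gt; Ok(title.trim().to_string()),
    None =&amp;gt; Err(InvalidPostError {
      message: format!(&amp;quot;expected a title line, found {:?}&amp;quot;, first),
    }),
  }
}

fn main() {
  let good = vec![&amp;quot;title: Hello&amp;quot;.to_string()];
  let bad = vec![&amp;quot;oops&amp;quot;.to_string()];
  for lines in [good, bad] {
    // The compiler makes the caller deal with both outcomes.
    match title_from(&amp;amp;lines) {
      Ok(title) =&amp;gt; println!(&amp;quot;title: {}&amp;quot;, title),
      Err(e) =&amp;gt; eprintln!(&amp;quot;{}&amp;quot;, e),
    }
  }
}&lt;/pre&gt;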
&lt;p&gt;However, the biggest benefit to rewriting my blog in Rust was that I clawed my
way up the Rust learning curve a bit. But that wouldn’t have been possible
without crates.io and Cargo. Using code from smarter, more seasoned Rust
developers gave me a chance to build a useful app, even as a beginner. Cargo
found, downloaded and compiled open source code from experts with just a few
simple commands.&lt;/p&gt;
</content></entry></feed>