~
~ why
Hello! My name is Arne Martin and I work as a web developer at a very small design agency called Kvitebjørn Designbyrå in Trondheim here in Norway. Kvitebjørn is an old Norwegian word for polar bear. We mostly design and develop web sites and simple web applications. I am the sole developer and I do everything from managing servers to writing CSS, but most of my time is spent with frontend development, i.e stuff that would be relevant for you, the audience. This talk, however, will probably be anything but relevant for most of y'all's day-to-day work.
~
~
~ Arne Martin Aurlien
~ [email protected] / @arnemart
~
~ Kvitebjørn Designbyrå / kvitebjorn.com
--
~
~ (•ᴥ•)
There are many programming languages in the world, but in the browser there's really only one, and outside (and indeed sometimes inside) the world of web/frontend developers -- people like us -- javascript is not always what you would call "beloved". I really like writing JS, it has its charms, but it certainly also has a few quirks, and fair share of
~
~ So what was this talk about again
~ wat
wats
but it's not the only alternative. There are several languages that use JS as a compilation target, allowing you to write code for the browser in whatever manner you prefer.
~
~ So what was this talk about again
The most common use case for compiling or transpiling to JS is probably Babel and similar things, that allow you to use javascript from the future in todays browsers. Babel converts cutting-edge language features to gross old-style JS like our ancestors used to write.
~
~
export default React.createClass({
render(): any {
return <body bgcolor="hotpink">
<bgsound src="canyon.mid" loop="infinite"/>
<marquee><font face="Comic Sans" size="40" color="teal">yolo</font></marquee>
</body>;
}
});
There's typescript from Microsoft, if you like writing JS but want some additional type safety
~
interface Shape {
name: string;
width: number;
height: number;
color?: string;
}
function area(shape : Shape) {
var area = shape.width * shape.height;
return "I'm " + shape.name + " with area " + area + " cm squared";
}
There's coffeescript, if you like whitespace and hate everything that's good and natural
~
upperCaseExpr = (textParts, expressions...) ->
textParts.reduce (text, textPart, i) ->
text + expressions[i - 1].toUpperCase() + textPart
greet = (name, adjective) ->
upperCaseExpr"""
Hi #{name}. You look #{adjective}!
"""
There's clojurescript, if you love wearing out your 8 and 9 keys
~
(defn set-html! [el content]
(set! (.-innerHTML el) content))
(defn main []
(let [content "Hello World from ClojureScript"
element (aget (js/document.getElementsByTagName "main") 0)]
(set-html! element content)))
If you like Haskell, there's Elm, a fun and friendly language despite its haskell roots
~
module Hello exposing (..)
import Html exposing (text)
main =
text "Hello"
And if you LOVE Haskell, there's PureScript Also there are several others.
~
render :: T.Render _ {} _ {}
render _ _ _ _ = R.div' [ R.h1' [ R.text "Hello world!" ] ]
spec :: T.Spec _ {} _ {}
spec = T.simpleSpec {} perfAction render
perfAction :: T.PerformAction _ {} _ {}
perfAction _ _ = T.modifyState (const {})
main :: ∀ eff. Eff (dom :: DOM.DOM | eff) R.ReactElement
main = body >>= R.render (R.createFactory (T.createClass spec) {})
where
body = do
win <- DOM.window
doc <- DOM.document win
elm <- fromJust <$> toMaybe <$> DOM.body doc
return $ DOM.htmlElementToElement elm
But how do all of these languages work?
~
~
~
~
~ how do they work
There's some code in a not-JS-language, something happens, and then there's some JS-code
~ how do they work
Let's talk about compilers. For the most part, this next bit applies to most compilers -- the "to JS"-part is almost incidental. This is how compilers work.
~
~
~
~
--
But before we start diving into that: Let be be the first to admit that I DON'T KNOW THIS STUFF I have never written or worked on a "real" compiler, and even though I have a degree in computer science I never took the compiler class -- i find this stuff fun and interesting but I have literally no idea what I'm doing.
~
~
~
~
~
~ warning
~
--
~ I have no idea what I'm doing
Okay, that's out of the way. What does a compiler do? Compiling usually happens in several steps, what steps there are and how they work varies from compiler to compiler, but here is a general overview of a common setup. I will go a bit deeper into each of these steps in a moment. Lexical analysis: Turns source code into a stream of tokens Syntactical analysis: Turns tokens into an abstract syntax tree AST transformation: Turns an AST into another AST Code generation: Generates target code
~
--
- Lexical analysis (using a tokenizer or lexer) --
- Syntactical analysis (into an Abstract Syntax Tree) --
- AST Transformation --
- Code generation
So what are tokens? During the first step, lexical analysis, the lexer or tokenizer reads the source code and generates tokens. This is a simple list of every part of the program, basically a stricter representation of the source code of the program. The tokens then go into syntactical analasys to produce the AST.
--
~ A structured representation of the text of the program
--
console.log(greeting + " web rebels");
--
[
{ type: "identifier", value: "console" },
{ type: "dot", value: "." },
{ type: "identifier", value: "log" },
{ type: "paren", value: "(" },
{ type: "identifier", value: "greeting" },
{ type: "operator", value: "+" },
{ type: "string", value: " web rebels" },
{ type: "paren", value: ")" },
{ type: "semicolon", value: ";" }
]
The AST is a hierarchical data structure that represents every part of the program. The list of tokens is parsed and turned into a recursive data structure that is syntactically valid. If there are syntax errors in the code, like unbalanced parens and stuff, this is usually where they are discovered. The AST contains all relevant info about a program, this example is simplified -- usually stuff like where something appears in the source code is included, for better error messages, also other stuff.
--
~ A data structure representing the entire program
--
console.log(greeting + " web rebels");
--
{
type: "Program",
body: [
{
type: "CallExpression",
callee: {
type: "MemberExpression",
object: {
type: "Identifier",
name: "console"
},
property: {
type: "Identifier",
name: "log"
}
},
arguments: [
{
type: "BinaryExpression",
operator: "+",
arguments: [
{ type: "Variable", name: "greeting" },
{ type: "StringLiteral", value: " web rebels" }
]
}
]
}
]
}
The next step is AST transformation. When compiling to JS, this usually means converting the AST for the language you are compiling into an AST that is JS-compatible. This can be done by "walking the tree" using a recursive function that starts at the root node and visits each node in the tree depth-first, and creates a new tree based on what it encounters.
~ AST ==> another AST
~
--
Finally the transformed AST is walked to generate the final code of the compiled program. For us, that's JS code, for "serious" or "real" compilers it would be some sort of bytecode or machine code. In this example, let's say you want to compile JS to some sort of lisp for some reason.
~ Transformed AST ==> target code
--
console.log(greeting + " web rebels");
--
||
\/
--
((:log console) (+ greeting, " web rebels"))
Just a brief aside into a topic that is actually very important and relevant: Such things as parser generators exist. This is basically a compiler compiler, where you define the grammar of your language by specifying how tokens look and how they are allowed to fit together, and have the parser, generator generate the parser for you. For javascript there is Jison, which is based off of Gnu Bison, and generates a parser based on a context-free BNF grammar. I won't be using a parser generator for the demo code as I want to show you the basics.
~
~
~ and grammars and stuff
--
~
- Jison --
- Is a thing that exists in the world
Okay, over to the practical part. Let's write a compiler!
~
~
--
~ quote unquote
I have created a programming language. It is exceedingly simple and can only be used for one thing: Rolling dice. Anyone ever play dungeons and dragons? I haven't but I know that dice are used for this game. This programming language rolls dice, in a syntax that should maybe probably be familiar for D&D-players. 1d6 means roll one 6-sided die, 2d8 + 5 means roll two 8-sided dice and add 5 to the result. The JS code we want to generate for this looks similar to this: Basically a bunch of math.randoms.
~
--
~
~ Ever play D&D?
--
~ 1d6 = roll one six-sided die
--
~
1d6 + 2 => 3,4,5,6,7,8
2d8 + 5 => 7...21
--
1d6 + 2 => Math.floor(Math.random() * 6) + 1 + 2
2d8 + 5 => Math.floor(Math.random() * 8) + 1 + Math.floor(Math.random() * 8) + 1 + 2
We will implement a tokenizer for lexical analysis, a parser for syntactical analysis, and a code generator to generate the final JS code. The tokens should look like this:
~
~
Tokens:
2d6 + 3
[
{ type: "dice", value: "2d6" },
{ type: "operator", value: "+" },
{ type: "number", value: "3" }
]
And the AST should look like this:
~
~
AST:
2d6 + 3
{
type: "program",
body: {
type: "operatorExpression",
operator: "+",
lhs: {
type: "diceExpression",
lhs: { type: "number", value: "2" },
rhs: { type: "number", value: "6" }
},
rhs: { type: "number", value: "3" }
}
}
That's the basic idea. Now let's make it.
~
~
~ demo time
That just about what everything I have to say to you today. I hope this whet your apetite for stuff like this, it can be really fun once you get into it. If any of you want to talk about emacs later, find me and we'll have a beer.
~
~
~ the end
~
~ github.com/arnemart/webreb2k17 / @arnemart