Maximiliano edited this page Jun 9, 2021 · 5 revisions

The DIME Analytics Coding Guide

Taken from The DIME Analytics Data Handbook

Style rules

The style rules used throughout the stata_linter package are the following:

  1. Replace #delimit with three forward slashes (///). Avoid the #delimit command. The linter removes #delimit and adds three forward slashes (///) at the appropriate places to continue lines.

  2. Replace hard tabs with soft tabs (spaces). Avoid hard tabs. The linter replaces each hard tab with spaces, usually 2 or 4.

  3. Indent code inside the brackets of for-loops, while-loops, and if/else conditions. For better readability, indent the code inside the curly brackets of for-loops, while-loops, and if/else statements. Where the indentation is missing, the linter adds whitespace.

  4. Break lines that are too long. Avoid overly long lines. When a line is too long, the linter splits it across multiple lines using line continuations (///).

  5. Add a whitespace before an opening curly bracket. The linter adds a whitespace before the curly bracket that opens a for-loop, while-loop, or if/else statement.

  6. Remove blank lines before closing curly brackets. The linter removes blank lines that come directly before the closing curly bracket of a for-loop, while-loop, or if/else statement.

  7. Remove duplicated blank lines. The linter removes repeated blank lines, so that consecutive blank lines are reduced to one.
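To make a few of these rules concrete, here is a minimal Python sketch of rules 2, 5, 6, and 7 applied to the text of a do-file. It is an illustration only and not the stata_linter implementation: the function names, the default tab width of 4, and the sample snippet are all assumptions made for this example, and the regular expressions ignore edge cases such as brackets inside strings or comments.

```python
import re


def replace_hard_tabs(text: str, width: int = 4) -> str:
    """Rule 2 (sketch): replace every hard tab with `width` spaces."""
    return text.replace("\t", " " * width)


def add_space_before_brace(text: str) -> str:
    """Rule 5 (sketch): put one space before a `{` that ends a line."""
    # Only matches when the character right before `{` is not whitespace.
    return re.sub(r"(\S)\{[ \t]*$", r"\1 {", text, flags=re.MULTILINE)


def remove_blank_lines_before_closing_brace(text: str) -> str:
    """Rule 6 (sketch): drop blank lines directly above a closing `}`."""
    return re.sub(r"\n(?:[ \t]*\n)+([ \t]*\})", r"\n\1", text)


def collapse_blank_lines(text: str) -> str:
    """Rule 7 (sketch): reduce runs of blank lines to a single blank line."""
    return re.sub(r"\n(?:[ \t]*\n){2,}", "\n\n", text)


def tidy(text: str) -> str:
    """Apply the four sketched rules in a fixed (assumed) order."""
    text = replace_hard_tabs(text)
    text = add_space_before_brace(text)
    text = remove_blank_lines_before_closing_brace(text)
    return collapse_blank_lines(text)


# Hypothetical do-file snippet: hard tabs, no space before `{`,
# duplicated blank lines, and a blank line before the closing `}`.
MESSY = (
    "foreach var in price mpg{\n"
    "\tsummarize `var'\n"
    "\n"
    "\n"
    "\tdisplay \"done\"\n"
    "\n"
    "}\n"
)

if __name__ == "__main__":
    print(tidy(MESSY))
```

Running `tidy` on the sample leaves a loop whose opening bracket is preceded by a space, whose body is indented with four spaces, and which has a single blank line between the two commands and none before the closing bracket.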

Writing good code

“Good” code has two elements: (1) it is correct, in that it doesn’t produce any errors and its outputs are the objects intended, and (2) it is useful and comprehensible to someone who hasn’t seen it before (or even someone who has, weeks, months, or years later). Many researchers have only been trained to code correctly. But we believe that when your code runs on your computer and you get the desired results, you are only half-done writing good code.

Therefore, good code:

  • is easy to read and replicate, making it easier to spot mistakes;
  • reduces sampling, randomization, and cleaning errors;
  • can easily be reviewed by others before it’s published and can be re-used afterwards.

We always tell people to “code as if a stranger is reading it”.

You should think of good code in terms of three major elements:

  1. structure,
  2. syntax,
  3. and style.

The structure is the environment and file organization your code lives in: good structure means that it is easy to find individual pieces of code, within and across files, that correspond to specific tasks and outputs. It also means that functional code blocks are sufficiently independent from each other such that they can be shuffled around, repurposed, and even deleted without affecting the execution of other portions.

The syntax is the literal language of your code. Good syntax means that your code is readable in terms of how its mechanics implement ideas – it should not require arcane reverse-engineering to figure out what a code chunk is trying to do. It should use common commands in a generally accepted way so others can easily follow and reconstruct your intentions.

Finally, style is the way that the non-functional elements of your code convey its purpose. Elements like spacing, indentation, and naming conventions (or lack thereof) can make your code much more (or much less) accessible to someone who is reading it for the first time and needs to understand it quickly and accurately.
