Open Source Guide #2: Code Quality and Static Analysis

This is the second post in a series on best practices and tools for
open-sourcing a project, based on our experience building Etcher and various other projects at resin.io. We'll be covering topics including documentation, code quality, community building, error tracking, and so forth. Let us know if there is anything specific you would like to see! And check out the first post in the series on creating great documentation.

Quality code matters

A clean code base is a prerequisite for any successful open source project. Development time is expensive, and thinking about code quality early on will prevent you from wasting time fighting with the system, so you can focus on what makes your open source project unique.

This post will cover some of the tools we use to keep our open source code clean and readable.

Why use static analysis?

Static analysis is a process whereby a set of tools analyze the program's source code without actually running it. We rely heavily on static analysis tools for the following reasons:

  • Avoid subtle programming issues

Static analysis tools are great at reliably catching variable re-definitions, unused functions, and unreachable return statements, drastically increasing code base quality with little human intervention. Stricter rules, like disallowing unused variables or magic numbers, force developers to think through their code and keep it straightforward and well documented.
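To make this concrete, here's a hypothetical snippet that trips several of those common rules at once (the rule names in the comments are standard ESLint core rules):

```javascript
// A hypothetical snippet that a linter would flag in several places.
const getTotal = (prices) => {
  const unused = 0; // "no-unused-vars": declared but never read

  let total = 0;
  for (const price of prices) {
    total += price * 1.2; // "no-magic-numbers": what does 1.2 mean?
  }

  return total;
  console.log(total); // "no-unreachable": code after a return statement
};

console.log(getTotal([10, 20]));
```

The code runs fine, which is exactly the point: none of these issues would be caught at runtime, but each one hides a potential bug or maintenance headache.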

  • Code style

We believe that good code bases appear consistent enough to have been written by a single contributor. Some companies maintain coding guidelines that are enforced during code reviews by senior developers who know them by heart, but we maintain living style guides in the form of linters. We aim to have an extensive set of linter configurations so that software, not humans, settles code styling decisions.

  • Automation

Human-driven tasks are error-prone and time consuming. Our developers deserve to spend more time building awesome features and fixing tricky bugs rather than conducting intensive code reviews. The vision is that automatic tooling, including static analysis, is so efficient that code reviews are all about architectural discussions, not nitty-gritty trivialities, leading to shorter and more productive reviews.

Continuous Integration

If you have to rely on every developer to remember to run the static analysis tools during development or review, you lose some of the benefits outlined above. Continuous integration services are the bad cops that ensure code is only merged if certain conditions are fulfilled, one of them being, as you can guess, no warnings from the tools.

There are a variety of tools we run in CI services, targeted at different programming languages, or different aspects of a code base. Let's go through the ones we enjoy the most.

ESLint

I think no one can deny that ESLint is the most advanced and flexible JavaScript linter out there today. It supports hundreds of core rules, and has a vibrant community that publishes plugins bringing even more rules to cover almost anything you can wish for. In case you didn't know, the popular JSCS code style linter is now part of ESLint, which means ESLint has a decent set of rules particularly related to code style.

To start using ESLint, first install it as a development dependency in your project:

npm install --save-dev eslint  

We suggest adopting a popular base ESLint configuration, and adding any custom rules on top of that. Otherwise, going through all the supported rules from scratch is a daunting task (trust us!). For that reason, let's bring the standardjs base configuration into our project as well:

npm install --save-dev eslint-config-standard  

Then, we should create an .eslintrc.yml file in the root of our project that looks like this:

env:  
  commonjs: true
  es6: true
  node: true
  mocha: true

extends: 'standard'

rules:  
  arrow-parens:
    - error
    - always

The env object contains a list of the environments the code files are designed to run in. Our configuration covers a typical Node.js application, and since we're not setting browser: true, we will get some nice errors if we try to use any DOM API, for example.

The extends keyword loads a base configuration ruleset. The name of the standardjs configuration module is eslint-config-standard, so we put standard here.

Finally, the rules object contains the custom rules we want to apply on top of the configuration we're inheriting from.
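As an example of what our custom rule changes, `arrow-parens` set to `always` requires parentheses around arrow function parameters even when there is only one:

```javascript
// With "arrow-parens: [error, always]", arrow function parameters
// must always be wrapped in parentheses, even a single one.

const double = x => x * 2;   // flagged by the rule
const triple = (x) => x * 3; // passes

console.log(triple(7)); // 21
```

Both lines are valid JavaScript; the rule exists purely to keep the style consistent across the code base.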

I mentioned that ESLint supports user-contributed plugins, so let's also take a look at those. A lot of projects rely on lodash, so wouldn't it be cool if we could suggest that a developer use _.flatMap instead of a consecutive _.map and _.flatten call? Turns out you can. Install eslint-plugin-lodash, and add the following to .eslintrc.yml:

plugins:  
  - lodash

rules:  
  lodash/prefer-flat-map:
    - error

You can run ESLint by creating a nice lint npm script in package.json:

{
  "scripts": {
    "lint": "eslint lib tests"
  }
}

I encourage you to check out the massive supported rules list, or if you need some inspiration, Etcher's .eslintrc.yml. Don't forget to also do casual npm searches for ESLint plugins for the modules you use!

Codespell

Resin.io encourages its developers to ship a lot of documentation alongside the code we produce, in the form of comments, JSDoc annotations, or accompanying Markdown documents.

Making typos is easy, and it turns out that writing well-formed English documentation is not so straightforward for a company distributed across nearly 20 countries. It's crucial, therefore, to lint documentation the same way we lint code.

Spell-checking standalone Markdown documents is trivial, but what about code comments? Enter codespell, a tool that can check for common misspellings in both text files and source code.

You can install it like this:

pip install codespell  

The program ships with a large dictionary, but the great thing about it is that you can extend it with your own common misspellings, by creating a file called dictionary.txt in the root of the project that looks like this:

boolen->boolean  
aknowledge->acknowledge  
seleted->selected  
reming->remind  
locl->local  

Then, you can execute the program with the following options:

codespell --dictionary - --dictionary dictionary.txt <files>  

The first --dictionary - option loads the default dictionary, so we don't override it with our own.
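To see how this kind of dictionary drives the checker, here's a rough JavaScript sketch of the idea, parsing `wrong->right` pairs and reporting occurrences. (This is an illustration of the concept, not how codespell is actually implemented.)

```javascript
// Parse "wrong->right" lines into a lookup table.
const parseDictionary = (text) =>
  new Map(
    text
      .trim()
      .split('\n')
      .map((line) => line.split('->'))
  );

// Report every word in the text that appears in the dictionary.
const findMisspellings = (text, dictionary) =>
  (text.match(/[a-z]+/gi) || [])
    .filter((word) => dictionary.has(word.toLowerCase()))
    .map((word) => `${word} -> ${dictionary.get(word.toLowerCase())}`);

const dictionary = parseDictionary('boolen->boolean\nseleted->selected');
console.log(findMisspellings('Returns a boolen flag', dictionary));
// [ 'boolen -> boolean' ]
```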

Custom scripts

We use a variety of other linters across our projects, such as sass-lint, cpplint, tslint, and html-angular-validate, but there are always project-specific things we want to check for, like ensuring that the versions of certain packages in package.json stay in sync (e.g. Angular.js modules), with no readily available tool to address our needs.

For that reason, we write an army of custom scripts and run them with our continuous integration services. These can be anything from Node.js, Python, or shell scripts, as long as they are easy to maintain, and get the job done.

One particular example I like is a script in the Etcher code base to ensure all source code files consist only of ASCII characters. We have non-English-speaking developers who switch back and forth between their native keyboard layouts, and sometimes Unicode characters slip in!
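Here's a minimal sketch of what such a check could look like (this is an illustration, not Etcher's actual script), reporting the line and column of every non-ASCII character:

```javascript
// Return the position of every non-ASCII character in a source string.
const findNonAscii = (source) => {
  const violations = [];
  source.split('\n').forEach((line, lineNumber) => {
    for (let column = 0; column < line.length; column += 1) {
      if (line.charCodeAt(column) > 127) {
        violations.push({
          line: lineNumber + 1,
          column: column + 1,
          character: line[column]
        });
      }
    }
  });
  return violations;
};

console.log(findNonAscii('const café = true;'));
// [ { line: 1, column: 10, character: 'é' } ]
```

A real version of the script would walk the source tree with fs, print the violations, and exit with a non-zero status so the CI build fails.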

Isn't this a lot of extra work?

There are a few major challenges that come with relying heavily on static analysis, but they're worth it for the benefits of having a cleaner code base.

  • Setting everything up for the first time, and maintaining the system

Sure, you need to put a decent amount of work into finding or building linters, thoroughly configuring them to your project's needs, and making sure they run on every commit. But in the long run, they can save a lot of time in code reviews, and prevent bugs that could have been caught automatically.

  • It's sometimes hard to please the linters

Depending on how far you want to go, the linters can become hard to please. I believe that if you choose your rules well, however, the extra friction results in better code practices, especially in dynamic languages such as JavaScript.

  • Developers don't like code style strictness

Each developer has their own coding style, and forcing them to adhere to a different style can be annoying. For this reason, I think it's a great idea to build upon community conventions such as established standards, which most developers can agree on.


Static analysis is all about minimizing wasted time and increasing code quality. I hope this post helped you understand the benefits of this practice, and how we apply them at resin.io. If you don't use these tools yet, we encourage you to check out what's available for your languages of choice, and if you already do, then actively think about how you can leverage them to save even more developer time, and build more amazing things!
