Schedule rides in advance

NEAL, Uber’s Open Source Language-Agnostic Linting Platform

February 1, 2018 / Global

Share
Facebook
Twitter
Linkedin
Envelope
Every day, Uber’s engineers write hundreds of commits to our mobile codebase, with each new patch necessitating a thorough code review to spot bugs that might affect the user experience.
To make code reviews easier, we built Not Exactly a Linter (NEAL), an open source language-agnostic tool that allows engineers to write custom syntax-based rules, thereby automating sections of the code review process. In this article, we discuss the benefits of NEAL and walk through how you can use it for your own code reviews.
Introducing NEAL
For reviewers of more critical parts of Uber’s mobile codebase, the number of new diffs can be overwhelming, slowing down throughput and developer productivity. These roadblocks are exacerbated by our geographical diversity; with engineering teams in 12 locations worldwide, blocked reviews can result in lengthy turnarounds, making it more difficult to fix issues.
Beyond these day-to-day considerations, there are a lot of design patterns that we want to enforce or avoid in our codebase as a whole. Some of these are covered by existing linters, but many other patterns are specific to Uber’s RIB architecture. By early 2017, we realized that we needed to invest in a more automated code review solution that would remove the hurdles associated with reliably and efficiently growing a multi-language codebase at scale.
After experimenting with existing open source tools, we quickly realized that most options did not meet our extensibility and flexibility requirements. While some were built for specific languages, many others only allowed custom regular expressions or required engineers to write programs when they wanted to create new rules. We needed a solution that enabled engineers to both quickly review code—regardless of language—and generate new rules. So, we decided to start from scratch and build our own.
NEAL lets engineers write their own set of rules to cover any part of the codebase, taking the code review process one step closer to full automation. In fact, to make this process easier, NEAL uses its own domain-specific language.
At Uber, we use NEAL extensively to:
Improve the reliability of our tests by checking for patterns, such as concurrency primitives and asynchronous tests
Control the growth of our binary size by avoiding language features that result in large amounts of machine code
Enforce high-level restrictions on executable code, such as what code can be executed as part of Uber’s core flow and ensuring that newly incorporated plugins can be safely turned off
And the most exciting part? Thanks to NEAL, the rules mentioned above were written by engineers in our mobile codebase.
How does it work?
NEAL is designed to be completely extensible across three dimensions:
Language: Abstract Syntax Tree (AST) providers add support for languages as either a parser or by wrapping an existing parser.
Formatting: Reporters easily add new formats for outputting violations.
Rule creation: With NEAL, rules define actions to be taken when specific patterns are encountered during code reviews.
We decided to use AST-based linting (instead of traditional expression-based linting) because it facilitates the expression of more complex patterns, such as deeply nested structures and recursive patterns. This method is also less error-prone, since users do not have to account for meaningless text in the source code, like whitespaces, comments, and item order (e.g., `public static` vs `static public`).
By default, NEAL ships with Swift and Python AST providers, command line-friendly and arc-friendly formats, and no rules. However, you can find some of the NEAL-enforced rules we use at Uber in on our GitHub page.  
Using NEAL
NEAL does not ship with any rules, so let us look at how an engineer would go about adding one to an app’s codebase.
Suppose we have a function that performs a very expensive computation (let us call it expensiveComputation), and we want to make sure that it will not be called from any class’ initialization, which would cause the app to become unresponsive unexpectedly.
An offending program might look like:
[cc lang=”swift” noborder=”true” nowrap=”false”]

// test.swift

func expensiveComputation() { /* … */ }
class App {

init() {

expensiveComputation()

}

}

[/cc]
First, we use NEAL to extract the AST to identify which node names to use in our rules:
[cc lang=”bash” lines=”1″ nowrap=”false” noborder=”true”]

neal –print-ast test.swift

[/cc]
[cc lang=”js” noborder=”true” nowrap=”false”]

[ FunctionDeclaration {

FunctionName = Identifier { Value = “expensiveComputation” }

},

ClassDeclaration {

ClassName = Identifier { Value = “Application” }

ClassBody = [

InitializerDeclaration {

InitializerBody = [

CallExpression {

Callee = Identifier { Value = “expensiveComputation” }

Arguments = []

}] }] }]
[/cc]
(the AST was abbreviated here for brevity)
Then, we write a rule to match this example, as follows:
[cc lang=”cpp” noborder=”true” nowrap=”false”]

// test.rules
// First we have to give a name to this rule

rule NoExpensiveInitializer {
// Then we have to specify which language are we targeting, followed by

// the first node we want to match, in this case ClassDeclaration

Swift::ClassDeclaration {

// node matchers can be nested to go deeper into the AST

InitializerDeclaration {

// additionally, matchers can also have predicates

CallExpression where Callee == “expensiveComputation” {

// once we found the pattern we wanted, we can take an action

fail(“`expensiveComputation` should not be called from class initializers”)

}

}

}

}

[/cc]
Now, if we run NEAL on this code, we should see something like:
[cc lang=”bash” noborder=”true” nowrap=”false”]

$ neal –rules test.rules test.swift

[/cc]
[cc lang=”text” noborder=”true” nowrap=”false”]

[1 of 1]: Analysing test.swift

On file test.swift: (NoExpensiveInitializer)
3 | class Application {

4 | init() {

5 | expensiveComputation()

~ | ^

6 | }

7 | }
error: `expensiveComputation` should not be called from class initializers

[/cc]
Alternatively, we can create a minimal configuration file to streamline the process, as follows:
[cc lang=”js” noborder=”true” nowrap=”false”]

{

“rules”: [ “test.rules” ]

}

[/cc]
Once shipped, we can achieve the same result by simply running:
[cc lang=”bash” noborder=”true” nowrap=”false”]

$ neal .

[/cc]
Since NEAL-enforced rules are based on syntax, they can only match what is written and not what the program actually does. For instance, in our example above, NEAL will only catch the bug if expensiveComputation is called directly from the initializer, but not if it is called transitively, i.e., the initializer calls x and x calls expensiveComputation. This can be worked around with conventions and user-provided annotations, but this functionality is not supported out-of-the-box.
Next steps
To give others the freedom and extensibility of setting the parameters of their own automated code review processes, we gave NEAL back to the open source community.
If NEAL piques your interest, you can learn more by visiting our GitHub repo and reading through our documentation. Please reach out to Uber Engineering via GitHub Issues if you need any help getting started or, better yet, would like to contribute other languages to the project!
Photo Header Credit, “Giraffes on road,” by Conor Myhrvold, South Luangwa National Park, Zambia, 2009.