Introducing Static Analysis Results Baseliner (SARB)

I'm delighted to announce the beta version of SARB.

If you've tried to introduce advanced static analysis tools (e.g. Psalm and PHPStan) to legacy projects, the tools have probably reported thousands of problems. It's unrealistic to fix more than the most critical ones before continuing development.

SARB is used to create a baseline of these results. As work on the project progresses, SARB takes the latest static analysis results, removes the issues that are in the baseline and reports the issues raised since the baseline. SARB does this, in conjunction with git, by tracking lines of code between commits.
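
To make the idea of tracking lines between commits concrete, here is a minimal shell sketch. It is not SARB's actual implementation: map_line is a hypothetical helper that reads the hunk headers produced by git diff -U0 and works out where a line from the baseline commit now sits, or reports that the line itself was changed.

```shell
# Hypothetical helper, not part of SARB: map a line number in the baseline
# commit to its line number in the current code.
# Usage: git diff -U0 <baseline-commit> -- <file> | map_line <old-line>
map_line() {
  awk -v line="$1" '
    /^@@ / {
      # Hunk header: @@ -a[,b] +c[,d] @@  (b and d default to 1 when omitted)
      n = split(substr($2, 2), oldp, ",")
      a = oldp[1] + 0
      b = (n > 1) ? oldp[2] + 0 : 1
      m = split(substr($3, 2), newp, ",")
      d = (m > 1) ? newp[2] + 0 : 1

      if (b > 0 && line >= a && line <= a + b - 1) {
        changed = 1          # the line itself was edited or deleted
      } else if ((b > 0 && a + b - 1 < line) || (b == 0 && a < line)) {
        offset += d - b      # hunk lies above the line: shift by its net growth
      }
    }
    END {
      if (changed) print "changed"
      else print line + offset
    }
  '
}
```

An issue reported at the mapped line of the current code can then be matched against the baseline issue at the old line. A real tool also has to handle renamed files and multiple commits, which this sketch ignores.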

SARB is written in PHP; however, it can be used to baseline results for any language and any static analysis tool.

SARB demo

Instructions for running SARB are available in the project's README. Below is a demo of using SARB. You'll need PHP >= 7.1, composer and git.

First of all, clone the SARB demo project and then run composer:

git clone https://github.com/DaveLiddament/sarb-demo.git
cd sarb-demo
composer install

This sets up a project with the static analysers Psalm and PHPStan. It also installs SARB. Finally there is a single PHP file src/Person.php.

Running the static analysers on this project will report errors.

To run Psalm use:

vendor/bin/psalm

You should get an output similar to this:

Scanning files...
Analyzing files...

ERROR: InvalidNullableReturnType - src/Person.php:17:29 - The declared return type 'string' for Demo\Person::getName is not nullable, but 'string|null' contains null
    public function getName(): string
    {
        return $this->name;
    }


ERROR: NullableReturnStatement - src/Person.php:19:3 - The declared return type 'string' for Demo\Person::getName is not nullable, but the function returns 'string|null'
        return $this->name;


------------------------------
2 errors found
------------------------------

Checks took 0.13 seconds and used 18.972MB of memory
Psalm was able to infer types for 100.000% of analyzed code (1 file)

To run PHPStan use:

vendor/bin/phpstan analyse

You should see an output like this:

Note: Using configuration file /vagrant/sarb-demo/phpstan.neon.

 ------ -----------------------------------------------------------------------------
  Line   Person.php
 ------ -----------------------------------------------------------------------------
  19     Method Demo\Person::getName() should return string but returns string|null.
 ------ -----------------------------------------------------------------------------


 [ERROR] Found 1 error

Both tools are reporting the same issue: Demo\Person::getName could return a string or null, but the type hint says it will only return a string.

NOTE: This example shows how to use both static analysers currently supported by SARB. If you're just starting to add static analysis, consider introducing one tool at a time. The majority of issues are found by both tools; however, each tool also finds issues that the other does not.

On non-trivial legacy projects both Psalm and PHPStan would probably find hundreds if not thousands of issues. It is not practical to fix all of them. After fixing the most critical issues you would then use SARB to create a baseline.

Creating a baseline with SARB

NOTE: SARB records the git commit at which the baseline was made. It is essential that all code is committed before using SARB to create the baseline.
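
One way to guard against creating a baseline from uncommitted code is to check git's status first. The sketch below is only a suggestion: is_committed is a hypothetical helper, not a SARB command. It is limited to the paths you pass in, so that generated report files outside those paths do not count as uncommitted changes.

```shell
# Hypothetical helper, not part of SARB: succeeds only if git reports no
# modified, staged or untracked files under the given paths.
is_committed() {
  # --porcelain prints one line per changed or untracked path; empty means clean
  [ -z "$(git status --porcelain -- "$@")" ]
}

# Example use before creating a baseline (src is the demo project's code directory):
#   is_committed src || { echo "Commit your changes first" >&2; exit 1; }
```

Called with no arguments, the helper checks the whole working tree, including any untracked report files.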

Psalm example

First run Psalm and dump the results in JSON format:

vendor/bin/psalm --report=reports/psalm/psalm-results.json

We'll use this set of results as the baseline. Next we create a SARB baseline:

vendor/bin/sarb create-baseline reports/psalm/psalm-results.json reports/sarb/psalm-baseline.json psalm-json

The create-baseline command requires 3 arguments:

  1. The static analysis output file to create the baseline from (in this case reports/psalm/psalm-results.json)
  2. The name of the file to store the SARB baseline in (in this case reports/sarb/psalm-baseline.json)
  3. The name of the ResultsParser. In this case we used Psalm's JSON format, hence psalm-json. To get a full list of supported tools and formats use the command: vendor/bin/sarb list-static-analysis-tools

The output should be like this:

Baseline created
Errors in baseline 2

PHPStan example

Again create the PHPStan output in JSON format:

vendor/bin/phpstan analyse --error-format=json > reports/phpstan/phpstan-output.json

We'll use this set of results as the baseline. Next we create a SARB baseline:

vendor/bin/sarb create-baseline reports/phpstan/phpstan-output.json reports/sarb/phpstan-baseline.json phpstan-json-tmp

The output should look like this:

[phpstan-json-tmp] guesses the classification of violations.This means results might not be 100% accurate. See https://github.com/DaveLiddament/sarb/blob/master/docs/ViolationTypeClassificationGuessing.md for more details.
Baseline created
Errors in baseline 1

Don't worry too much about the warning for now.

Updating the code

Now that we have our baseline, we can update the code.

Let's add the member data $age:

class Person
{
    /**
     * @var int|null
     */
    private $age;

    ... rest of class as before ...

We'll also add a getter and setter for age between the $name member data and the getters and setters for name:

     private $name;

     public function setAge(?int $age): void
     {
         $this->age = $age;
     }

     public function getAge(): int
     {
         return $this->age;
     }

    ... rest of class as before ...

The updated code should look like this.

We've introduced code that the static analysers will raise issues on.

If we run either of the static analysis tools, they'll report both the original issue(s) and the new one(s).

E.g. Psalm will now report 4 issues:

vendor/bin/psalm

And PHPStan 2 issues:

vendor/bin/phpstan analyse

Removing baseline results

We can use SARB to strip out the baseline issues and only report the new issues.

Psalm example

Run Psalm and dump latest results in JSON format:

vendor/bin/psalm --report=reports/psalm/psalm-results.json

Use SARB to remove baseline results:

vendor/bin/sarb remove-baseline-results reports/psalm/psalm-results.json reports/sarb/psalm-baseline.json reports/psalm/baseline-removed.json

The remove-baseline-results command requires 3 arguments:

  1. The static analysis output for the current state of the code (in this case reports/psalm/psalm-results.json)
  2. The name of the file containing the SARB baseline (in this case reports/sarb/psalm-baseline.json)
  3. The name of an output file. This will be in the same format as the Psalm report, but will only include the issues introduced since the baseline (in this case reports/psalm/baseline-removed.json)

The output should be like this:

Baseline uses ResultsParser [psalm-json] and HistoryAnalyser [git]
Errors before baseline 4
Errors in baseline 2
Errors introduced since baseline 2

FILE: src/Person.php
+------+-------------------------------------------------------------------------------------------------------------+
| Line | Description                                                                                                 |
+------+-------------------------------------------------------------------------------------------------------------+
| 22   | The declared return type 'int' for Demo\Person::getAge is not nullable, but 'int|null' contains null        |
| 24   | The declared return type 'int' for Demo\Person::getAge is not nullable, but the function returns 'int|null' |
+------+-------------------------------------------------------------------------------------------------------------+

It is only showing us the issues introduced since the baseline.

PHPStan example

Again create the PHPStan output in JSON format:

vendor/bin/phpstan analyse --error-format=json > reports/phpstan/phpstan-output.json

Next use SARB to remove the baseline results:

vendor/bin/sarb remove-baseline-results reports/phpstan/phpstan-output.json reports/sarb/phpstan-baseline.json reports/phpstan/baseline-removed.json

The output should look like this:

Baseline uses ResultsParser [phpstan-json-tmp] and HistoryAnalyser [git]
Errors before baseline 2
Errors in baseline 1
Errors introduced since baseline 1

FILE: src/Person.php
+------+----------------------------------------------------------------------+
| Line | Description                                                          |
+------+----------------------------------------------------------------------+
| 24   | Method Demo\Person::getAge() should return int but returns int|null. |
+------+----------------------------------------------------------------------+
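
If you want to act on these results in a script (for example to fail a build when new issues appear), one option is to read the count from the summary line. The sketch below assumes the console output has exactly the form shown above and that SARB writes it to standard output. Neither is guaranteed while SARB is in beta, and new_issue_count is a hypothetical helper, not part of SARB.

```shell
# Hypothetical helper, not part of SARB: reads SARB's console output on stdin
# and prints the number from the "Errors introduced since baseline" line.
# Exits non-zero if no such line is present.
new_issue_count() {
  awk '
    /^Errors introduced since baseline/ { print $NF; found = 1 }
    END { if (!found) exit 1 }
  '
}

# Example use (assumes the summary is written to standard output):
#   count=$(vendor/bin/sarb remove-baseline-results \
#       reports/psalm/psalm-results.json \
#       reports/sarb/psalm-baseline.json \
#       reports/psalm/baseline-removed.json | new_issue_count) || exit 1
#   [ "$count" -eq 0 ] || { echo "$count new issue(s) since baseline" >&2; exit 1; }
```

Parsing console text is fragile. The baseline-removed.json output file is likely a more stable thing to inspect once you know its structure.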

Moving files

SARB is clever enough to cope with files being renamed.

E.g. rename Person.php to Employee.php:

git mv src/Person.php src/Employee.php

NOTE: You must use git mv, otherwise SARB doesn't know that the file has been renamed.

Also update the class name from Person to Employee. The file should look like this.

Now rerun the analyser and SARB and you should still see only the differences introduced since the baseline. E.g.:

vendor/bin/psalm --report=reports/psalm/psalm-results.json
vendor/bin/sarb remove-baseline-results reports/psalm/psalm-results.json reports/sarb/psalm-baseline.json reports/psalm/baseline-removed.json

Next steps

Find out more from the README, learn how SARB works and write a ResultsParser for the static analysis tool of your choice.

NOTE: Whilst I was working on SARB, Psalm introduced its own baseline functionality. It works slightly differently to SARB. If you're using Psalm you might want to try this first.

Warning

This is still in beta and might change or have bugs. Please report any bugs you find.