December 14th 2020
Update: See proposal on Github.
Thanks to static analysis, generics in PHP are already a reality. Many developers are reaping the considerable rewards of using static analysers and generics on their PHP codebases. A major blocker to increased uptake is the lack of a standard for generics. A standard would give tools (such as IDEs) and libraries clear guidelines for implementing and supporting generics.
There is already an unofficial standard for generics, see documentation from Psalm and PHPStan.
The scope of this article is to propose how PHP code should be annotated to include the extra information required for generics. It proposes that the existing informal standard becomes the "official" standard, and that support is expanded using attributes.
Example using docblock (for PHP7 code):
/** @template T of object */
class Queue
{
    /** @var array<int,T> */
    private array $queue = [];

    /** @param T $item */
    public function add(
        $item,
    ): void {
        // Implementation
    }

    /** @return T */
    public function next()
    {
        // Implementation
    }

    /** @return array<int,T> */
    public function asArray(): array
    {
        // Implementation
    }
}
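To make the docblock version concrete, here is a minimal sketch with the method bodies filled in and a short usage example. The method bodies and the `Person` class are assumptions added for illustration; they are not part of the proposal. At run time PHP ignores the `@template` information entirely. Only a static analyser that understands it would be expected to complain about the commented-out line.

```php
<?php
declare(strict_types=1);

/** @template T of object */
final class Queue
{
    /** @var array<int,T> */
    private array $queue = [];

    /** @param T $item */
    public function add($item): void
    {
        $this->queue[] = $item;
    }

    /** @return T */
    public function next()
    {
        if ($this->queue === []) {
            throw new UnderflowException('Queue is empty');
        }

        return array_shift($this->queue);
    }

    /** @return array<int,T> */
    public function asArray(): array
    {
        return $this->queue;
    }
}

// Hypothetical class, used only to have something to put in the queue.
final class Person
{
    public string $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }
}

/** @var Queue<Person> $people */
$people = new Queue();
$people->add(new Person('Ada'));

// A static analyser that understands @template would be expected to report this line:
// $people->add(new DateTime());

echo $people->next()->name, PHP_EOL; // prints "Ada"
```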
Example using attributes (for code that is only compatible with PHP8):
use StaticAnalysis\Generics\v1\Template;
use StaticAnalysis\Generics\v1\Type;

#[Template("T", "object")]
class Queue
{
    #[Type("array<int,T>")]
    private array $queue = [];

    public function add(
        #[Type("T")] $item,
    ): void {
        // Implementation
    }

    #[Type("T")] // This is return type
    public function next()
    {
        // Implementation
    }

    #[Type("array<int,T>")]
    public function asArray(): array
    {
        // Implementation
    }
}
This article investigates the possible methods of annotating PHP code with the additional information required by generics. It then argues why the two formats above should be the ones chosen.
It is assumed that the reader is familiar with the concept of generics and attributes. The following articles and videos give further background:
There is the existing informal standard for generics. The key element is the @template docblock. See documentation from Psalm and PHPStan.
In the context of generics, both Psalm and PHPStan pretty much follow the same standard.
Currently, this is not formally documented as a standard (e.g. as a PSR).
PSRs 5 and 19 focus on PHPDoc blocks and tags more generally; neither mentions @template.
The lack of a standard is the reason cited in PHPStorm's 2020.3 release announcement for not fully supporting @template:
We believe that support for generics is an advanced feature that lacks a proper specification and has many edge cases. Yet, we have decided to implement basic support for the @template construct based on the Psalm syntax, to see how it goes.
The next step in advancing this as a standard is to formalise it, probably as a PSR.
PHP 8 has a new feature, attributes. Attributes could be used instead of docblocks for providing additional information for generics.
Instead of the @template docblock, use the attribute #[Template].
For additional type information (that appears after @var, @param or @return) use the attribute #[Type].
Using the @param docblock:
/**
 * @param T $item
 */
public function add(
    $item,
): void { ... }
Using an attribute instead of @param:
public function add(
    #[Type("T")] $item,
): void { ... }
Using the @return docblock:
/**
 * @return T
 */
public function next() { ... }
Using an attribute instead of @return (NOTE: it's not possible to attach an attribute to a return type. However, as a function or method can only have one return type, an attribute attached to the function or method itself is used to describe the return type):
#[Type("T")]
public function next() { ... }
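As a rough illustration of how a tool could tell the two placements apart, the sketch below reads `#[Type]` from a parameter and from a method using PHP 8's reflection API. The global `Type` class is a simplified stand-in for the proposed attribute, and `declaredType()` is a hypothetical helper, not part of any existing library.

```php
<?php
declare(strict_types=1);

// Simplified stand-in for the proposed attribute (no namespace, for brevity).
#[Attribute(Attribute::TARGET_FUNCTION | Attribute::TARGET_METHOD | Attribute::TARGET_PARAMETER | Attribute::TARGET_PROPERTY)]
final class Type
{
    public function __construct(public string $name) {}
}

final class Queue
{
    private array $queue = [];

    public function add(#[Type('T')] $item): void
    {
        $this->queue[] = $item;
    }

    #[Type('T')] // attached to the method, describes the return type
    public function next()
    {
        return array_shift($this->queue);
    }
}

/** Hypothetical helper: the first #[Type] string on a method or parameter, or null. */
function declaredType(ReflectionMethod|ReflectionParameter $target): ?string
{
    $attributes = $target->getAttributes(Type::class);

    return $attributes === [] ? null : $attributes[0]->newInstance()->name;
}

$add = new ReflectionMethod(Queue::class, 'add');
$next = new ReflectionMethod(Queue::class, 'next');

$paramType = declaredType($add->getParameters()[0]); // "T": the type of $item
$returnType = declaredType($next);                    // "T": the return type of next()
$addReturn = declaredType($add);                      // null: add() has no #[Type] on the method
```

Because an attribute on a parameter and an attribute on the method are reflected separately, reusing `#[Type]` for the return type does not clash with its use on parameters.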
Using the @template docblock:
/** @template T */
class Queue { ... }
Same information as an attribute:
#[Template("T")]
class Queue { ... }
To delve a bit deeper...
It is possible to restrict templates to be of a certain type. E.g.
/** @template T of Animal */
interface AnimalGame { /* some code */ }
The #[Template] attribute takes an optional 2nd argument: the restriction on the template.
The docblock example would become:
#[Template("T", "Animal")]
interface AnimalGame { /* some code */ }
Template
namespace StaticAnalysis\Generics\v1;

use Attribute;

#[Attribute(Attribute::TARGET_CLASS|Attribute::TARGET_FUNCTION|Attribute::TARGET_METHOD|Attribute::IS_REPEATABLE)]
class Template
{
    public function __construct(
        public string $name,
        public ?string $of = null,
    ) {}
}
Type
namespace StaticAnalysis\Generics\v1;

use Attribute;

#[Attribute(Attribute::TARGET_FUNCTION|Attribute::TARGET_METHOD|Attribute::TARGET_PARAMETER|Attribute::TARGET_PROPERTY)]
class Type
{
    public function __construct(
        public string $name,
    ) {}
}
Other attributes are also required. These include #[Extends] and #[Implements].
NOTE: The namespace StaticAnalysis\Generics\v1 is a placeholder. Assuming this ends up as a PSR, it could be Psr\Generics\v1.
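The `IS_REPEATABLE` flag on `#[Template]` matters for classes with more than one template. The sketch below shows two templates on one class and converts them back to the equivalent `@template` lines via reflection. It assumes a simplified, global copy of the `Template` class; `AnimalMap` and `templateTags()` are hypothetical names used only for this example.

```php
<?php
declare(strict_types=1);

// Simplified copy of the proposed attribute (no namespace, for brevity).
#[Attribute(Attribute::TARGET_CLASS | Attribute::TARGET_FUNCTION | Attribute::TARGET_METHOD | Attribute::IS_REPEATABLE)]
final class Template
{
    public function __construct(
        public string $name,
        public ?string $of = null,
    ) {}
}

// Hypothetical class with two templates; the first is restricted to "Animal".
#[Template('K', 'Animal')]
#[Template('V')]
final class AnimalMap {}

/**
 * Hypothetical helper: the @template lines equivalent to a class's #[Template] attributes.
 *
 * @return array<int,string>
 */
function templateTags(string $className): array
{
    $tags = [];
    foreach ((new ReflectionClass($className))->getAttributes(Template::class) as $attribute) {
        $template = $attribute->newInstance();
        $tags[] = '@template ' . $template->name
            . ($template->of !== null ? ' of ' . $template->of : '');
    }

    return $tags;
}

print_r(templateTags(AnimalMap::class));
```

This round trip suggests the two notations carry the same information, so a tool could support both from a single internal model.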
One objection a developer might raise is that, because the information is in a string, autocompletion (e.g. in IDEs) will not be possible. This assumption is incorrect. IDEs could still understand the context of the string to provide autocompletion and validation. In fact this already happens: PHPStorm provides autocompletion for docblocks, and it also provides autocompletion based on the information documented in docblocks. So it should be no problem to extend this to a string within an attribute.
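To give a feel for why a string is not an obstacle for tooling, here is a small sketch of a parser that turns a type string into a nested structure. It is an illustration only: `TypeStringParser` is a hypothetical class, it handles only names, generic arguments and unions, and array shapes (`array{...}`) are deliberately left out to keep it short. Real analysers have their own, far more complete parsers.

```php
<?php
declare(strict_types=1);

/** Hypothetical sketch: parses strings such as "array<int,T>" or "T|int|null". */
final class TypeStringParser
{
    private int $pos = 0;

    private function __construct(private string $input) {}

    public static function parse(string $type): array
    {
        $parser = new self(str_replace(' ', '', $type));
        $result = $parser->union();
        if ($parser->pos !== strlen($parser->input)) {
            throw new InvalidArgumentException("Unexpected input at offset {$parser->pos}");
        }

        return $result;
    }

    /** union := atom ('|' atom)* */
    private function union(): array
    {
        $members = [$this->atom()];
        while ($this->accept('|')) {
            $members[] = $this->atom();
        }

        return count($members) === 1 ? $members[0] : ['union' => $members];
    }

    /** atom := name ('<' union (',' union)* '>')? */
    private function atom(): array
    {
        $pattern = '/\G[A-Za-z_\\\\][A-Za-z0-9_\\\\-]*/';
        if (preg_match($pattern, $this->input, $match, 0, $this->pos) !== 1) {
            throw new InvalidArgumentException("Expected a type name at offset {$this->pos}");
        }
        $this->pos += strlen($match[0]);

        $arguments = [];
        if ($this->accept('<')) {
            do {
                $arguments[] = $this->union();
            } while ($this->accept(','));
            if (!$this->accept('>')) {
                throw new InvalidArgumentException("Expected '>' at offset {$this->pos}");
            }
        }

        return ['name' => $match[0], 'arguments' => $arguments];
    }

    /** Consumes $char if it is the next character. */
    private function accept(string $char): bool
    {
        if (($this->input[$this->pos] ?? '') === $char) {
            $this->pos++;
            return true;
        }

        return false;
    }
}

print_r(TypeStringParser::parse('array<int,Queue<T>|null>'));
```

Once a tool has a structure like this, the same autocompletion and validation logic can be applied whether the string came from a docblock or from an attribute argument.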
One might suggest that, instead of using strings like this:
#[Template("T")]
#[Type("array<int,T>")]
function asArray(
    #[Type("T")] $value,
): array {
    return [$value];
}
the type information could be added directly, with no strings. Unfortunately the following is not valid PHP 8.0 code:
#[Template(T)]
#[Type(array<int,T>)]
function asArray(
    #[Type(T)] $value,
): array {
    return [$value];
}
From php.net:
Arguments to attributes can only be literal values or constant expressions.
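To show what that rule does allow, the sketch below attaches three attributes whose arguments are all constant expressions: a string literal, an array built from a constant and `::class`, and a concatenation. The `Type` class here is a simplified stand-in that accepts a string or an array, and `KEY_TYPE`, `Person` and `example()` are hypothetical names. Something like `#[Type(array<int,T>)]` cannot be included because it is a parse error.

```php
<?php
declare(strict_types=1);

// Simplified stand-in; repeatable so that several variants fit on one function.
#[Attribute(Attribute::TARGET_ALL | Attribute::IS_REPEATABLE)]
final class Type
{
    public function __construct(public string|array $definition) {}
}

final class Person {}

const KEY_TYPE = 'int';

#[Type('array<int,T>')]                // string literal
#[Type([KEY_TYPE => Person::class])]   // array built from a constant and ::class
#[Type('array<' . KEY_TYPE . ',T>')]   // concatenation of constant expressions
function example(): void {}

// Read back the raw arguments without instantiating the attribute classes.
$arguments = array_map(
    static fn (ReflectionAttribute $attribute): mixed => $attribute->getArguments()[0],
    (new ReflectionFunction('example'))->getAttributes(Type::class),
);

print_r($arguments);
```

Strings, arrays, constants and `::class` are therefore the building blocks available to any alternative notation.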
Before suggesting other notation let's consider examples of the type information that needs supporting by the #[Type] attribute:
T|int|null
array<int,string>
array{int: int, name: string}
array{0: int, 1: T}
class-string<T>
Queue<T>
ArrayCollection<K,V>
Inspired by PHPStorm's ArrayShape attribute, it might be possible to encode information as an array.
So let's start with the first example T|int|null, a union:
#[Type(['T', 'int', null])]
The second example array<int,string> defines the key and value of an array:
#[Type(['int' => 'string'])]
Let's consider the 3rd example, an array shape: array{int: int, name: string}.
Following PHPStorm's example (NOTE: the string int is a valid array key):
#[Type(['int' => 'int', 'name' => 'string'])]
⚠️ There are problems with this notation. The second example could be misinterpreted as a single element array shape (with a key of int).
Consider the first example T|int|null, which is the same as:
#[Type([0 => 'T', 1 => 'int', 2 => null])]
This too might be misinterpreted as an array shape.
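The root of the problem is that PHP has a single array type, so a list and an integer-keyed map are the same value. A short sketch, with variable names chosen purely for illustration:

```php
<?php
declare(strict_types=1);

$union = ['T', 'int', null];                // intended as the union T|int|null
$shape = [0 => 'T', 1 => 'int', 2 => null]; // could equally be an array shape with keys 0, 1 and 2

var_dump($union === $shape); // bool(true): identical keys, values and order

// The same holds for string keys: nothing in the value records the intent.
$keyAndValue = ['int' => 'string']; // intended as array<int,string>
$singleField = ['int' => 'string']; // intended as an array shape with one field named "int"

var_dump($keyAndValue === $singleField); // bool(true)
```

Any tool receiving these arrays from an attribute has no way to recover which meaning the author intended.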
There are already many problems with this method and there are still several more example cases to consider. Something more advanced is needed...
Remember the constraint that arguments to attributes can only be literal values or constant expressions. From attempt 1 we can see that there needs to be a way of distinguishing between unions, array shapes and normal arrays.
Unions could be expressed like this: ['union' => [<type 1>, <type 2>, ...]].
Array shapes could be expressed like this: ['shape' => <array shape>].
Back to our examples. For a union, example 1 T|int|null would become:
#[Type(['union' => ['T', 'int', null]])]
For now the second example array<int,string> remains as before:
#[Type(['int' => 'string'])]
The third, array{int: int, name: string}, becomes:
#[Type(['shape' => ['int' => 'int', 'name' => 'string']])]
Example 4, array{0: int, 1: T}:
#[Type(['shape' => [0 => 'int', 1 => 'T']])]
This is progress. There is no ambiguity. Unfortunately the examples are more verbose.
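One way to check that the keyed notation is unambiguous for these four examples is to translate it back into the string notation. The `render()` function below is a hypothetical helper written for this article's examples only: it understands `'union'`, `'shape'` and a plain key => value pair, and nothing else.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: turns the array notation back into the string notation.
 * Handles only 'union', 'shape' and a single key type => value type pair.
 */
function render(array|string|null $type): string
{
    if ($type === null) {
        return 'null';
    }
    if (is_string($type)) {
        return $type;
    }
    if (array_key_exists('union', $type)) {
        return implode('|', array_map('render', $type['union']));
    }
    if (array_key_exists('shape', $type)) {
        $fields = [];
        foreach ($type['shape'] as $key => $fieldType) {
            $fields[] = $key . ': ' . render($fieldType);
        }

        return 'array{' . implode(', ', $fields) . '}';
    }

    // Anything else is read as a single key type => value type pair.
    $keyType = array_key_first($type);

    return 'array<' . $keyType . ',' . render($type[$keyType]) . '>';
}

echo render(['union' => ['T', 'int', null]]), PHP_EOL;                  // T|int|null
echo render(['int' => 'string']), PHP_EOL;                              // array<int,string>
echo render(['shape' => ['int' => 'int', 'name' => 'string']]), PHP_EOL; // array{int: int, name: string}
echo render(['shape' => [0 => 'int', 1 => 'T']]), PHP_EOL;              // array{0: int, 1: T}
```

Note that the helper has to treat `union` and `shape` as special keys, which is exactly the kind of reserved word that can cause trouble with more complex types.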
Let's continue with example 5 class-string<T>:
#[Type(['class-string' => 'T'])]
As for example 6 Queue<T>, this could work:
#[Type([Queue::class => 'T'])]
How about example 7 ArrayCollection<K,V>:
#[Type([ArrayCollection::class => ['K', 'V']])]
Contrast this with example 2. These are similar in intent; they define the type for key and value, however they look very different:
#[Type(['K' => 'V'])]
#[Type([ArrayCollection::class => ['K', 'V']])]
Perhaps it could be argued that ['K' => 'V'] is a shortcut for ['array' => ['K', 'V']]?
Let's compare a few examples of using strings with the array notation.
String version:
#[Type("ArrayCollection<K,V>")]
Becomes:
#[Type([ArrayCollection::class => ['K', 'V']])]
Consider a complex, and somewhat contrived, example:
#[Type("array{0: int, employees: array<string,Person::class>, type: class-string<T>, data: array<int,T>}|null")]
Becomes:
#[Type(['union' => [['shape' => [0 => 'int', 'employees' => ['string' => Person::class], 'type' => ['class-string' => 'T'], 'data' => ['int' => 'T']]], null]])]
Even with this system there are still ambiguities. shape and union have become reserved words.
#[Type(['shape' => ['shape' => ['string' => 'string'], 'area' => 'int']])]
Does this mean:
array{shape: array<string,string>, area: int}
Or
array{0: array{string: string}, area: int}
Of course, you can pick a different name. Using array-shape instead of shape will probably result in less chance of a name collision, but the fundamental problem still exists.
This is probably one of many issues. It's safe to say this is not a viable solution.
Instead of just the attribute #[Type], have others too, e.g. #[Union], #[ArrayShape] and no doubt others.
This is a non-starter. It would be impossible to describe nested types such as an array of array shapes, e.g. array<int, array{name:string, age:int}>.
A more drastic measure is to create an RFC to allow more scope for what can be used as arguments for attributes. There are many disadvantages to this. Firstly, the earliest this could happen is PHP 8.1, which at the time of writing is a year away. Secondly, if the main use case is to support generics, then I think it would be better to add the notation to the language itself, even if it is only used by static analysis and not at run time. E.g.
class Queue<T>
{
public function add(T $item): void { /* implementation */ }
public function next(): T { /* implementation */ }
}
It's a controversial suggestion. I made it at PHP-UK conference in Feb 2020, and a couple of people thought it wasn't wise. I think I agree with them!
Let's disregard this option now.
Given the constraints placed on what is a legal argument for an attribute, any attempt at documenting the information required by generics in a form other than a string is likely to be complicated and unintuitive.
The only 2 sensible methods for documenting are:
The notation used for expressing generics, array shapes and unions is intuitive to those who have experience of other programming languages that use generics. It would be sensible to have a standard that follows other languages rather than inventing something entirely new for PHP.
The benefits of the attributes over docblocks include:
Given that docblocks are the de facto standard, both should be supported going forward.
Drop me a DM on twitter.