Using AI to convert PHP engine from C to Rust

April 1st 2023

UPDATE: This was published on April 1st. I'm afraid it is an April Fool.

Ask PHP developers what they most want next for the language and you'll get many answers. A PHP runtime written in Rust is an increasingly popular dream; the runtime is currently written in C. If PHP were written today, it would probably use Rust, not C. This article describes how I am converting the runtime from C to Rust with the help of AI.

Before we get into the details, here is a quick reminder of what happens under the hood when you run PHP code. There are several steps between PHP source code and code that is executed. The code is first split into tokens. The tokens are converted to an Abstract Syntax Tree (AST). The AST is then converted into opcodes. Finally, these opcodes are executed by the PHP virtual machine.
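To make those stages concrete, here is a minimal sketch of the first three (tokens, AST, opcodes) for a toy arithmetic language. It is only an illustration under stated assumptions: the `Token`, `Ast` and `Op` types and every function name are made up for this example, tokens must be separated by whitespace, and none of it mirrors how the real PHP engine is organised.

```rust
// Toy pipeline: source text -> tokens -> AST -> opcodes.
// Every name below is hypothetical and none of it mirrors the real PHP engine.

#[derive(Debug, PartialEq)]
enum Token { Num(i64), Plus, Star }

#[derive(Debug, PartialEq)]
enum Ast {
    Num(i64),
    Add(Box<Ast>, Box<Ast>),
    Mul(Box<Ast>, Box<Ast>),
}

#[derive(Debug, PartialEq)]
enum Op { Push(i64), Add, Mul }

/// Step 1: split the source into tokens (assumes whitespace-separated input).
fn tokenize(src: &str) -> Result<Vec<Token>, String> {
    src.split_whitespace()
        .map(|word| match word {
            "+" => Ok(Token::Plus),
            "*" => Ok(Token::Star),
            _ => word
                .parse::<i64>()
                .map(Token::Num)
                .map_err(|_| format!("unexpected token: {}", word)),
        })
        .collect()
}

/// Step 2: build an AST, giving `*` higher precedence than `+`.
fn parse(tokens: &[Token]) -> Result<Ast, String> {
    let mut pos = 0;
    let ast = parse_sum(tokens, &mut pos)?;
    if pos != tokens.len() {
        return Err("unexpected trailing tokens".to_string());
    }
    Ok(ast)
}

fn parse_sum(tokens: &[Token], pos: &mut usize) -> Result<Ast, String> {
    let mut left = parse_product(tokens, pos)?;
    while tokens.get(*pos) == Some(&Token::Plus) {
        *pos += 1;
        let right = parse_product(tokens, pos)?;
        left = Ast::Add(Box::new(left), Box::new(right));
    }
    Ok(left)
}

fn parse_product(tokens: &[Token], pos: &mut usize) -> Result<Ast, String> {
    let mut left = parse_number(tokens, pos)?;
    while tokens.get(*pos) == Some(&Token::Star) {
        *pos += 1;
        let right = parse_number(tokens, pos)?;
        left = Ast::Mul(Box::new(left), Box::new(right));
    }
    Ok(left)
}

fn parse_number(tokens: &[Token], pos: &mut usize) -> Result<Ast, String> {
    match tokens.get(*pos) {
        Some(Token::Num(n)) => {
            *pos += 1;
            Ok(Ast::Num(*n))
        }
        other => Err(format!("expected a number, found {:?}", other)),
    }
}

/// Step 3: flatten the AST into a linear opcode list (post-order walk).
fn compile(ast: &Ast, out: &mut Vec<Op>) {
    match ast {
        Ast::Num(n) => out.push(Op::Push(*n)),
        Ast::Add(left, right) => {
            compile(left, out);
            compile(right, out);
            out.push(Op::Add);
        }
        Ast::Mul(left, right) => {
            compile(left, out);
            compile(right, out);
            out.push(Op::Mul);
        }
    }
}

/// Runs all three steps in order.
fn source_to_opcodes(src: &str) -> Result<Vec<Op>, String> {
    let tokens = tokenize(src)?;
    let ast = parse(&tokens)?;
    let mut ops = Vec::new();
    compile(&ast, &mut ops);
    Ok(ops)
}
```

Calling `source_to_opcodes("1 + 2 * 3")` yields the opcodes for pushing 1, 2 and 3, then a multiply, then an add. The fourth stage, executing the opcodes, is the job of the virtual machine.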

Rewriting the C runtime in another language is quite an undertaking. Fortunately, PHP has a rich suite of tests in the form of phpt files.

PHP also has a huge number of applications and libraries with high-coverage test suites. As part of this project we have used 982 of the most popular PHP projects (by download count on Packagist) that have a high-coverage PHPUnit test suite.

The goal of this project is for every test from both of the above sources to pass with the Rust engine.

Prior to this project, the last time I looked at any C code was 18 years ago. I'd hoped that I'd simply be able to read the C code and write the corresponding code in Rust. Jumping into PHP's C code was quite a shock. It was certainly beyond my comprehension.

It's AI time

I'd heard of people using ChatGPT to convert one language to another. My next step was to pass the C code in chunks to the ChatGPT API and ask it for the Rust equivalent. This didn't work at all. ChatGPT can convert small snippets of code, but it is no good for whole codebases, and there was no easy way to split up the C code into small snippets.

However, using ChatGPT to turn the PHP Parser code from PHP into Rust, one class at a time, was more fruitful. The first few classes took quite a bit of manual work. However, feeding the results back to ChatGPT meant it did seem to "learn". It was pretty impressive, to be honest.

The next stage was converting the AST to opcodes. There are a few ways to dump out opcodes. The first task was to generate a test suite of mappings between PHP code, its AST representation and the expected opcodes. This dataset was generated from the PHP test code and the 982 applications identified above. The AST and expected opcodes were converted to a standard JSON format.

The following steps were automated. It is basically the standard TDD red/green/refactor flow.

Inputs and expected outputs were fed in turn to ChatGPT, asking it to generate Rust code to make the conversion.
After each iteration, the input and expected output were added to the Rust test suite, as was the generated code.
The Rust test suite was run after each iteration. Any failures were passed back to ChatGPT with a request to fix them.
The whole Rust codebase was passed to ChatGPT with a request to refactor it.
Tests were run again and any failures were passed back to ChatGPT to fix.
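The automated steps above can be sketched roughly as the loop below. This is a guess at the shape of the automation, not the actual scripts: the `Assistant` trait is a hypothetical stand-in for the ChatGPT calls, the `Fixture` field names are invented, the codebase is modelled as a plain string, and `FakeAssistant` and `fake_run_tests` exist only so the loop can be exercised without any network access.

```rust
// One test case: PHP source, its AST, and the opcodes the engine should emit.
// Field names are hypothetical; the format of the real dataset is not shown here.
#[derive(Debug, Clone)]
struct Fixture {
    php_source: String,
    ast_json: String,
    expected_opcodes_json: String,
}

// Stand-in for the AI calls. Each method returns a new version of the codebase.
trait Assistant {
    fn implement(&mut self, codebase: &str, fixture: &Fixture) -> String;
    fn fix(&mut self, codebase: &str, failures: &[String]) -> String;
    fn refactor(&mut self, codebase: &str) -> String;
}

#[derive(Debug, PartialEq)]
enum Outcome {
    AllGreen(String),
    // The point at which a human would be alerted to step in.
    NeedsHuman { fixture_index: usize, failures: Vec<String> },
}

type TestRunner<'a> = &'a dyn Fn(&str, &[Fixture]) -> Vec<String>;

// Green step: feed failures back until the suite passes or attempts run out.
fn fix_until_green<A: Assistant>(
    assistant: &mut A,
    run_tests: TestRunner,
    mut codebase: String,
    suite: &[Fixture],
    max_fix_attempts: usize,
) -> Result<String, Vec<String>> {
    for _ in 0..max_fix_attempts {
        let failures = run_tests(&codebase, suite);
        if failures.is_empty() {
            return Ok(codebase);
        }
        codebase = assistant.fix(&codebase, &failures);
    }
    let failures = run_tests(&codebase, suite);
    if failures.is_empty() { Ok(codebase) } else { Err(failures) }
}

// Red/green/refactor for each fixture in turn.
fn drive<A: Assistant>(
    assistant: &mut A,
    run_tests: TestRunner,
    fixtures: &[Fixture],
    max_fix_attempts: usize,
) -> Outcome {
    let mut codebase = String::new();
    let mut suite: Vec<Fixture> = Vec::new();
    for (fixture_index, fixture) in fixtures.iter().enumerate() {
        // Red: add the new case and ask for code that handles it.
        suite.push(fixture.clone());
        codebase = assistant.implement(&codebase, fixture);
        match fix_until_green(assistant, run_tests, codebase, &suite, max_fix_attempts) {
            Ok(fixed) => codebase = fixed,
            Err(failures) => return Outcome::NeedsHuman { fixture_index, failures },
        }
        // Refactor, then check that nothing broke.
        let refactored = assistant.refactor(&codebase);
        match fix_until_green(assistant, run_tests, refactored, &suite, max_fix_attempts) {
            Ok(fixed) => codebase = fixed,
            Err(failures) => return Outcome::NeedsHuman { fixture_index, failures },
        }
    }
    Outcome::AllGreen(codebase)
}

// Fake assistant: its first attempt changes nothing, and a "fix" appends one
// line per failure message to the codebase.
struct FakeAssistant;

impl Assistant for FakeAssistant {
    fn implement(&mut self, codebase: &str, _fixture: &Fixture) -> String {
        codebase.to_string()
    }
    fn fix(&mut self, codebase: &str, failures: &[String]) -> String {
        let mut next = codebase.to_string();
        for failure in failures {
            next.push_str(failure);
            next.push('\n');
        }
        next
    }
    fn refactor(&mut self, codebase: &str) -> String {
        codebase.to_string()
    }
}

// Fake test runner: a fixture passes if its source appears as a codebase line.
fn fake_run_tests(codebase: &str, suite: &[Fixture]) -> Vec<String> {
    suite
        .iter()
        .filter(|f| !codebase.lines().any(|line| line == f.php_source))
        .map(|f| f.php_source.clone())
        .collect()
}
```

The `max_fix_attempts` cap is a design choice of this sketch: without some limit, a failure the assistant cannot fix would loop forever instead of being handed to a human.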

If any of the above steps failed, I was notified via a Sentry alert. I'd then have to make manual fixes before continuing the process.

Currently, about 75% of the tests are completed. The remaining 25% are proving to be pretty challenging. This work is still ongoing.

Running in parallel is the work of writing a VM to run the generated opcodes. There are about 200 opcodes. Most are quite simple to implement. In fact, this seemed to be a case where writing code directly was easier than trying to convert existing code. GitHub Copilot and ChatGPT still helped speed up the process. There are still a few complex bits, mainly around arrays, that need finishing off.
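To give a flavour of why most opcodes are simple to implement, here is a minimal sketch of an opcode interpreter loop. It makes several assumptions: the six opcodes are invented for the example and are not real PHP opcodes, values are plain integers, and a stack machine is used purely for brevity rather than as a description of how the real PHP VM passes operands.

```rust
use std::collections::HashMap;

// A deliberately tiny instruction set. The names are made up for this sketch
// and are not real PHP opcodes.
#[derive(Debug, Clone)]
enum Op {
    Push(i64),     // push a constant onto the stack
    Add,           // pop two values, push their sum
    Store(String), // pop a value into a named variable
    Load(String),  // push the value of a named variable
    Echo,          // pop a value and append it to the output
    Return,        // stop executing
}

/// Executes the program and returns everything it echoed.
fn run(program: &[Op]) -> Result<String, String> {
    let mut stack: Vec<i64> = Vec::new();
    let mut vars: HashMap<String, i64> = HashMap::new();
    let mut output = String::new();
    // Index of the next opcode; kept explicit so jump opcodes could be added.
    let mut pc = 0;

    while let Some(op) = program.get(pc) {
        pc += 1;
        match op {
            Op::Push(n) => stack.push(*n),
            Op::Add => {
                let b = stack.pop().ok_or("stack underflow")?;
                let a = stack.pop().ok_or("stack underflow")?;
                // This sketch reports overflow as an error instead of widening.
                stack.push(a.checked_add(b).ok_or("integer overflow")?);
            }
            Op::Store(name) => {
                let value = stack.pop().ok_or("stack underflow")?;
                vars.insert(name.clone(), value);
            }
            Op::Load(name) => {
                let value = vars
                    .get(name)
                    .ok_or_else(|| format!("undefined variable: {}", name))?;
                stack.push(*value);
            }
            Op::Echo => {
                let value = stack.pop().ok_or("stack underflow")?;
                output.push_str(&value.to_string());
            }
            Op::Return => break,
        }
    }
    Ok(output)
}
```

Each opcode is one short `match` arm, which is the sense in which most of them are easy. The harder cases, such as arrays, need a much richer value type than the bare `i64` assumed here.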

Incredibly, this whole process works! We're currently well beyond the hello world application. 27% of all tests pass!

Conclusions

This project is still not complete, but it looks like it is possible. Most of the code was either written entirely by AI or written with heavy help from it. The generated code, even with ChatGPT refactoring, is not the most readable. But does this matter? As long as the test suite is comprehensive and all tests pass, is this OK? The future could well be developers writing expectations and AI doing the rest.

Once I've finished, I'll release the code on GitHub.

Watch this space.