State of zend_execute* for Instrumentation

Benjamin Außenhofer

Last updated February 24, 2019 3:22pm

Historically, hooking into zend_execute_ex and zend_execute_internal has been the way to go for PHP extensions that want to implement profiling, tracing, monitoring and error/exception tracking functionality. I want to categorize all of these use-cases as "instrumenting" or "instrumentation" of PHP code.

There are four use cases I want to highlight (there are probably more):

Capture all function/method calls and record their call count and duration. Extensions in this category include xdebug, xhprof, php-spx, forp and many more in the open-source space.
Instrument only specific functions, but also capture their arguments and return values as metadata. Examples: SQL query profiling, HTTP call profiling, framework-specific profiling. Symfony WebDebug Toolbar and PHP Debugbar are examples of userland implementations that work when your application provides or calls specific hooks manually. At the PHP extension level this can be implemented without code changes to a PHP application. This is a primary feature of all Application Performance Management (APM) tools, which use zend_execute_ex to detect these specific function calls from a whitelist of instrumented ones.
Instrument only a few functions to collect metadata about a request for monitoring purposes. If you want to measure the duration of a request for your monitoring / time-series / event logging database, then instrumenting certain calls to classify the requests is necessary, for example extracting controller and route names during calls to specific framework-internal functions.
Instrument specific functions to detect error/exception conditions. The PHP global error handler is not enough to find out about error conditions in applications, because all frameworks and probably most custom applications have a central exception-catching function that turns exceptions into a nice-looking 500 page. Error tracking tools use zend_execute_ex to look for function calls to framework exception handlers, send a log message about the errors and notify developers.

We can boil all these use cases down to two categories of instrumentation:

Generic instrumentation of every userland and internal function without prior knowledge of the specific function or method names.
Specific instrumentation of individual userland and internal functions by knowing their name.

Requirements for Instrumentation

Each instrumentation extension has one or more of the following low-level requirements. I use the term "instrumentation target" here to describe the executed "scope", which can be a function, a class+method or a file.

Access to function, class+method or file names of the instrumentation target
Access to arguments before the actual instrumentation target is called
Access to return value after the actual instrumentation target is called
Access to exception if thrown by the actual instrumentation target
Access to $this if the instrumentation target is called on an object
Access to Exit + Entry to allow computation of the "duration" or other resources spent (cpu cycles, network in/out, ...)
100% Guarantee that registered instrumentation is called for an instrumentation target (currently possible because everything goes through zend_execute*)
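To make these requirements concrete, here is a minimal, self-contained C model of the hook pattern that instrumentation extensions rely on today: remember the engine's execute function pointer, install a wrapper, and observe entry and exit around the original. Every name in it (frame_t, engine_execute, traced_execute, install_tracer, call_function) is a hypothetical stand-in and not part of the Zend API; the sketch only illustrates access to the target name, to arguments before the call, to the return value after it, and to entry/exit for timing.

```c
#include <assert.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-in for an engine call frame (NOT zend_execute_data). */
typedef struct frame {
    const char *function_name; /* name of the instrumentation target */
    int         args[2];       /* arguments, readable before the call */
    int         return_value;  /* filled in by the target */
} frame_t;

/* The model engine's default executor: "runs" the target by adding its arguments. */
static void default_execute(frame_t *frame) {
    frame->return_value = frame->args[0] + frame->args[1];
}

/* Overridable global hook, in the spirit of an execute function pointer. */
static void (*engine_execute)(frame_t *) = default_execute;

/* What the tracer recorded for the most recent call. */
typedef struct trace_record {
    char    name[64];
    int     first_arg;
    int     return_value;
    clock_t entered, left;
    int     calls;
} trace_record_t;

static trace_record_t last_trace;
static void (*original_execute)(frame_t *);

static void traced_execute(frame_t *frame) {
    /* Entry: name and arguments are available before the target runs. */
    strncpy(last_trace.name, frame->function_name, sizeof(last_trace.name) - 1);
    last_trace.name[sizeof(last_trace.name) - 1] = '\0';
    last_trace.first_arg = frame->args[0];
    last_trace.entered = clock();

    original_execute(frame); /* run the actual target */

    /* Exit: the return value and the elapsed CPU time are available now. */
    last_trace.left = clock();
    last_trace.return_value = frame->return_value;
    last_trace.calls++;
}

/* Install the wrapper once, remembering the previous executor. */
static void install_tracer(void) {
    assert(engine_execute != NULL);
    if (engine_execute == traced_execute) {
        return; /* already installed; avoid wrapping ourselves */
    }
    original_execute = engine_execute;
    engine_execute = traced_execute;
}

/* Every call in the model goes through the hook, which gives the 100% guarantee. */
static int call_function(const char *name, int a, int b) {
    frame_t frame = { name, { a, b }, 0 };
    engine_execute(&frame);
    return frame.return_value;
}
```

After install_tracer(), a call such as call_function("foo", 2, 3) still returns the sum, while last_trace holds the name, the first argument, the return value and the entry/exit timestamps. Exceptions and $this are left out of the model to keep it short.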

How PHP 7 already affects these use-cases

With PHP 7, the engine can execute in a "stackless" mode, where it doesn't recursively call execute_ex, but jumps around with goto inside the execute_ex call.

If you overwrite zend_execute_ex though, this is not possible anymore and the engine falls back to the recursive mode.

The downsides of the recursive mode are slightly lower performance and the risk of stack overflows, where infinite recursion leads to a crash.
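The difference between the two modes can be sketched with a toy executor. This is a simplified model under my own assumptions, not engine code: execute_hook, default_execute, passthrough_hook and run_script are hypothetical names, and a "script" is reduced to a chain of nested calls. The point it illustrates is that the default executor can only stay inside its own loop while the hook still points at itself; once the hook is replaced, every nested call has to go through the hook and consumes a native stack frame.

```c
#include <assert.h>

static int native_depth;     /* current depth of native (C) recursion */
static int max_native_depth; /* deepest native recursion observed */
static int hook_calls;       /* how often the replacement hook ran */

static void default_execute(int remaining);

/* Overridable global hook; initially points at the default executor. */
static void (*execute_hook)(int) = default_execute;

/* Runs a function that performs `remaining` further nested calls. */
static void default_execute(int remaining) {
    assert(remaining >= 0);
    native_depth++;
    if (native_depth > max_native_depth) {
        max_native_depth = native_depth;
    }
    while (remaining > 0) {
        remaining--; /* the current function calls the next one */
        if (execute_hook == default_execute) {
            /* "Stackless": enter the callee by staying in this loop. */
            continue;
        }
        /* Hook replaced: the callee must be entered through the hook,
         * which costs one native stack frame per nested call. */
        execute_hook(remaining);
        break;
    }
    native_depth--;
}

/* A do-nothing instrumentation hook that only counts and forwards. */
static void passthrough_hook(int remaining) {
    hook_calls++;
    default_execute(remaining);
}

/* Entry point: run a top-level function with `nested_calls` nested calls. */
static void run_script(int nested_calls) {
    execute_hook(nested_calls);
}
```

In this model, run_script(3) with the default hook never exceeds a native depth of 1, while the same script with passthrough_hook installed reaches a depth of 4 (one frame per function activation), which is why deep or infinite userland recursion can exhaust the native stack in recursive mode.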

How the JIT affects these use-cases

With the JIT proposal, a second "execution mode" is introduced (JITted code) that circumvents the zend_execute_ex and zend_execute_internal callbacks. This makes all four instrumentation use cases risky, because as soon as a method or function gets JITted, its instrumentation is not called anymore.

The JIT is a production level feature, but so are many profiling and APM extensions. Hooking into the zend execution process can be done with an acceptable overhead of a few percent, making production profiling, tracing and monitoring a widespread use-case. But users of these extensions still expect to profit from the JIT in a meaningful way.

This change to the execution model requires the instrumentation extensions to find a new hook to get the 100% guarantee for being called when an instrumentation target is executed in either "zend_execute" or "JIT" mode.

Current Instrumentation API alternatives

There are some alternatives right now to hook into the Zend/PHP Engine to instrument, but none of them is an acceptable alternative:

At least for internal functions (Core + Extensions) you can overwrite the function pointer of an individual function directly. This provides an alternative for overwriting zend_execute_internal for the "Specific Instrumentation" use-case, but not for the "generic" one. I have tested this works flawlessly with the JIT.
zend_extensions can provide hooks for two "tracing" opcodes, FCALL_BEGIN and FCALL_END. Unfortunately this requires extensions to promote themselves to a zend_extension, the hooks have more overhead, and because they are injected before and after the function call opcode, it is hard to access all required metadata.
Overwriting the FCALL, FCALL_BY_NAME and ICALL opcodes allows adding instrumentation on the opcode level. However, this approach automatically disables the JIT, which means it is not future-proof.
Hooking into zend_ast_process could allow an instrumentation extension to inject "begin" and "end" tracing calls into the body of a userland function. This is a very complex way to hook into userland functions, especially when it comes to finding all exit branches. In addition, the "end" tracing call would not be called when an exception gets thrown in a child function unless you wrap the function body in a new try { } finally { } block (this might change the behavior of a function in edge cases).
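The first alternative in this list, replacing the handler of one known internal function, can be modeled in a few lines. The sketch below is a self-contained toy, not the Zend API: function_table, find_function, instrument_double and the two "built-in" functions are hypothetical. It shows the idea of looking a function up by name, saving its original handler and swapping in a wrapper, and it also shows why this only serves the "specific" use case: functions nobody looked up by name stay untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*handler_t)(int arg);

/* Hypothetical stand-in for an entry in the engine's function table. */
typedef struct function_entry {
    const char *name;
    handler_t   handler;
} function_entry_t;

static int builtin_double(int x) { return x * 2; }
static int builtin_negate(int x) { return -x; }

static function_entry_t function_table[] = {
    { "double", builtin_double },
    { "negate", builtin_negate },
};

#define FUNCTION_COUNT (sizeof(function_table) / sizeof(function_table[0]))

/* Look up a function by its exact name; returns NULL when unknown. */
static function_entry_t *find_function(const char *name) {
    for (size_t i = 0; i < FUNCTION_COUNT; i++) {
        if (strcmp(function_table[i].name, name) == 0) {
            return &function_table[i];
        }
    }
    return NULL;
}

static handler_t original_double; /* saved handler of the wrapped function */
static int double_calls;          /* number of observed calls */
static int double_last_arg;       /* argument of the most recent call */

/* Wrapper: record metadata, then delegate to the original handler. */
static int traced_double(int x) {
    double_calls++;
    double_last_arg = x;
    return original_double(x);
}

/* Swap the handler of "double" once; returns 1 if it was installed now. */
static int instrument_double(void) {
    function_entry_t *fn = find_function("double");
    if (fn == NULL || fn->handler == traced_double) {
        return 0;
    }
    original_double = fn->handler;
    fn->handler = traced_double;
    return 1;
}

/* The model engine dispatches every call through the table. */
static int call_by_name(const char *name, int arg) {
    function_entry_t *fn = find_function(name);
    assert(fn != NULL && "unknown function");
    return fn->handler(arg);
}
```

Once instrument_double() has run, call_by_name("double", 21) still returns 42 but also updates double_calls and double_last_arg, whereas calls to "negate" are not observed at all.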

None of these alternatives covers the full spectrum of instrumentation requirements listed above.

As a consequence, the JIT breaks an implicit public API of the Zend Engine towards extensions with regard to instrumentation, and we need to address this with a new instrumentation API.

At runtime the API needs to provide means for all the requirements listed above and cater to both instrumentation use-cases: Generic and Specific. As such there should be two ways to instrument a function:

An extension should be able to specify to the compiler or runtime context that it wants to get notified globally about all userland and internal calls.
An extension should be able to specify individual functions, classes+methods or files that it knows by name to the compiler or runtime context that it wants to get notified about when called.

Approach 1: Flexibility in Disabling JIT for Extensions

The simplest way to approach both would be to allow disabling the JIT in a more flexible way for extensions:

Allow an extension to set an executor/runtime flag stating that the JIT is not to be triggered during this request, so that every call automatically goes through zend_execute*. This way code would generally run with the JIT, but when an extension wants to profile something generically (for example triggered by a developer or by a random sample rate), the JIT would be skipped for that request.
Allow an extension to register for a global hook that the JIT calls to ask whether a function may be JITted. This would cater to the specific instrumentation use-case, where the JIT would call a global hook with an API like zend_should_be_jitted(string $functionOrMethodName, ?string $className = null): bool. Extensions could then take part in the JIT decision. Since specific instrumentors can already hook into internal function pointers, this would only be needed for userland instrumentation. (Patch against jit-dynasm-7.4 branch)
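Both options can be sketched together in a small C model. This is only an illustration of the proposed control flow under my own assumptions, not the patch mentioned above: should_be_jitted_hook, jit_disabled_for_request, jit_may_compile and the target names are hypothetical, and names are compared case-sensitively for brevity although PHP function and class names are case-insensitive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical C shape of the proposed zend_should_be_jitted() hook. */
typedef bool (*should_be_jitted_fn)(const char *function_name,
                                    const char *class_name);

static should_be_jitted_fn should_be_jitted_hook = NULL;

/* Option 1: a per-request flag that switches the JIT off entirely. */
static bool jit_disabled_for_request = false;

/* Made-up list of userland targets an APM extension wants to observe. */
typedef struct target {
    const char *class_name; /* NULL for plain functions */
    const char *function_name;
} target_t;

static const target_t instrumented_targets[] = {
    { "Kernel", "handleException" },
    { NULL,     "render_template" },
};

/* Two names match if both are NULL or both are equal strings. */
static bool same_name(const char *a, const char *b) {
    if (a == NULL || b == NULL) {
        return a == b;
    }
    return strcmp(a, b) == 0;
}

/* Option 2: the extension vetoes JIT compilation of its own targets. */
static bool apm_should_be_jitted(const char *function_name,
                                 const char *class_name) {
    size_t count = sizeof(instrumented_targets) / sizeof(instrumented_targets[0]);
    for (size_t i = 0; i < count; i++) {
        const target_t *t = &instrumented_targets[i];
        if (same_name(t->class_name, class_name) &&
            same_name(t->function_name, function_name)) {
            return false; /* keep it interpreted so the hook still fires */
        }
    }
    return true;
}

/* Model of the decision the JIT would make before compiling a function. */
static bool jit_may_compile(const char *function_name, const char *class_name) {
    assert(function_name != NULL);
    if (jit_disabled_for_request) {
        return false;
    }
    if (should_be_jitted_hook != NULL) {
        return should_be_jitted_hook(function_name, class_name);
    }
    return true;
}
```

With should_be_jitted_hook set to apm_should_be_jitted, only the listed targets stay interpreted and everything else can still be JITted; setting jit_disabled_for_request covers the generic profiling case for a single request.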

The problem with this approach is that it keeps encouraging extensions to overwrite zend_execute_ex.

Approach 2: Modify VM to call Enter/Exit Callbacks

This approach is for handling generic instrumentation extensions that need access to any function call.

When zend_execute_ex is overwritten, the VM cannot run in a stackless mode anymore where it jumps around "execute_ex" instead of recursively calling itself over and over again.

Given the specific requirements of instrumentation APIs, the VM could provide a mechanism to allow instrumentation that is a 100% replacement to zend_execute based instrumentation.

The tricky bit is to get this working reliably in the VM. In Stackless mode ZEND_VM_ENTER() and a jump to the label "zend_leave_helper_SPEC_LABEL" mark the start and end of a function call. But the leave helper sees execute_data being cleaned up a bit already. That means we need to find all the right spots to call end, which are quite a lot.

The next question is how to implement this so that an extension can:

instrument all userland and internal functions
instrument selected userland functions

We could add a new flag to a zend_function that marks it as "instrumented" and allow extensions to:

Automatically mark all userland or internal functions as instrumented with three new compiler flags, for example CG(compiler_options) |= ZEND_COMPILE_INSTRUMENT_ALL; where ZEND_COMPILE_INSTRUMENT_ALL = ZEND_COMPILE_INSTRUMENT_USER | ZEND_COMPILE_INSTRUMENT_INTERNAL;
Set the flag on a userland zend_function when overwriting zend_compile_file, to select individual functions
Set the flag in MINIT/RINIT or at runtime for internal functions
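A rough model of how such a flag could drive enter/exit callbacks is sketched below. It is self-contained C with hypothetical names throughout: the COMPILE_INSTRUMENT_* constants mirror the proposed flags but their bit values are made up, and FN_INSTRUMENTED, function_t, compile_function and vm_call do not exist in the engine. Uninstrumented functions take a fast path with a single flag check, which is the property that would keep the overhead low.

```c
#include <assert.h>
#include <stddef.h>

/* Made-up bit values for the proposed compiler options. */
#define COMPILE_INSTRUMENT_USER     (1u << 0)
#define COMPILE_INSTRUMENT_INTERNAL (1u << 1)
#define COMPILE_INSTRUMENT_ALL \
    (COMPILE_INSTRUMENT_USER | COMPILE_INSTRUMENT_INTERNAL)

/* Hypothetical per-function flag meaning "call begin/end callbacks". */
#define FN_INSTRUMENTED (1u << 0)

typedef enum { FN_USER, FN_INTERNAL } fn_kind_t;

/* Hypothetical stand-in for a function known to the engine. */
typedef struct function {
    const char *name;
    fn_kind_t   kind;
    unsigned    flags;
    int       (*body)(int);
} function_t;

static unsigned compiler_options = 0;

/* At compile/registration time, apply the global options to one function. */
static void compile_function(function_t *fn) {
    unsigned wanted = (fn->kind == FN_USER) ? COMPILE_INSTRUMENT_USER
                                            : COMPILE_INSTRUMENT_INTERNAL;
    if (compiler_options & wanted) {
        fn->flags |= FN_INSTRUMENTED;
    }
}

static int begin_calls, end_calls, last_return_value;

static void on_begin(const function_t *fn) {
    (void)fn; /* a real callback would read the name and arguments here */
    begin_calls++;
}

static void on_end(const function_t *fn, int return_value) {
    (void)fn;
    last_return_value = return_value;
    end_calls++;
}

/* Model of the VM calling a function: one flag check on the fast path. */
static int vm_call(function_t *fn, int arg) {
    assert(fn->body != NULL);
    if (!(fn->flags & FN_INSTRUMENTED)) {
        return fn->body(arg);
    }
    on_begin(fn);
    int return_value = fn->body(arg);
    on_end(fn, return_value);
    return return_value;
}

/* Tiny example body shared by the demo functions. */
static int add_one(int x) { return x + 1; }
```

Setting compiler_options |= COMPILE_INSTRUMENT_USER before compile_function() marks only userland functions, while setting FN_INSTRUMENTED directly on a single function_t models the "specific" case from the list above.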

If we assume this kind of profiler doesn't need arguments or return values, then maybe it's possible to do this using the extended debug opcodes FCALL_BEGIN + FCALL_END.

A prototype has not been built yet.

Approach 3: Allow Registering Overloads for Functions

For the specific instrumentation case it is possible to use the zend_ast_process hook to inject additional code into a userland function that is supposed to be traced. The idea is to modify a function such as:

function foo($a, $b) {
    return $a + $b;
}

To become:

function foo($a, $b) {
    trace_start();
    try {
        return $a + $b;
    } finally {
        trace_end();
    }
}

I built a prototype showing this functionality.

Instrumentation in other JITted-Languages

Java provides APIs to hook into the AST/JVM opcode generation to add instrumentation code. This code is part of the generated intermediate structure, and as such will be taken into account when the JIT tries to optimize it.
Ruby JIT currently does not trigger when tracing is enabled.

Next Steps

My goal is to collect more approaches and work on prototypes for each of them in order to come up with a recommended implementation.

In addition I request that maintainers of open-source and closed-source extensions that overwrite zend_execute specify if they have additional requirements with respect to the information that should be available.