The Extension System, OPcache, and JIT: How PHP is Extended and Optimized

Advanced

Prerequisites

  • Articles 1-4: Complete engine understanding (architecture, types, compilation, VM)
  • Understanding of shared memory and inter-process communication
  • Basic knowledge of JIT compilation concepts

Over the previous four articles, we've traced PHP from its SAPI entry points through the type system, the compilation pipeline, and the virtual machine. But the Zend Engine alone isn't enough to build a web application. The extension system is what transforms a language engine into a complete runtime — connecting PHP to databases, HTTP, JSON, cryptography, file systems, and everything else developers need.

In this final article, we'll examine the extension API that 72+ bundled extensions use, then focus on the three most architecturally significant subsystems: OPcache (which keeps PHP fast by caching compiled code in shared memory), the JIT compiler (which takes it further by generating native machine code), and Fibers (which bring cooperative concurrency to PHP). We'll close with the streams I/O abstraction and TSRM thread safety layer.

The Extension API

Every PHP extension — whether bundled in ext/ or installed via PECL — registers itself with a zend_module_entry struct defined in Zend/zend_modules.h:

classDiagram
    class zend_module_entry {
        +char *name
        +zend_function_entry *functions
        +module_startup_func MINIT
        +module_shutdown_func MSHUTDOWN
        +request_startup_func RINIT
        +request_shutdown_func RSHUTDOWN
        +info_func MINFO
        +char *version
        +globals_size
        +globals_ctor GINIT
        +globals_dtor GDTOR
        +post_deactivate_func
        +deps: zend_module_dep*
    }

The JSON extension in ext/json/json.c is an excellent minimal example. Its module entry registers a MINIT function, a function table, and an MINFO function for phpinfo() output. It's simple enough to read in five minutes, yet it demonstrates the complete registration pattern.

The functions field points to an array of zend_function_entry structs, each mapping a PHP function name to a C handler function and argument info. Since PHP 8.0, argument info is generated from .stub.php files — PHP-syntax function declarations that a build tool converts to _arginfo.h headers. For example, ext/json/json_arginfo.h is generated from ext/json/json.stub.php.

Tip: If you're writing a new extension, start by copying a simple one like ext/json and modifying it. The stub file system (*.stub.php → *_arginfo.h) eliminates the most error-prone part of extension development: manually writing argument info structs.
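
The registration pattern can be illustrated without the real headers. The following is a minimal sketch, not the php-src definitions: every `sketch_` and `demo_` name is hypothetical, the structs keep only a few of the fields shown in the diagram above, and the handler signature is simplified to a single int.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-ins for the php-src types. */
typedef int (*sketch_hook_fn)(void);
typedef int (*sketch_handler_fn)(int arg);

typedef struct {
    const char       *fname;    /* name visible to PHP code */
    sketch_handler_fn handler;  /* C function that implements it */
} sketch_function_entry;

typedef struct {
    const char                  *name;
    const sketch_function_entry *functions;  /* ends with a NULL fname */
    sketch_hook_fn               minit;
    const char                  *version;
} sketch_module_entry;

static int demo_minit(void) { return 0; }   /* 0 means success in this sketch */
static int demo_double(int arg) { return arg * 2; }

static const sketch_function_entry demo_functions[] = {
    { "demo_double", demo_double },
    { NULL, NULL }                           /* end-of-table sentinel */
};

static const sketch_module_entry demo_module = {
    "demo", demo_functions, demo_minit, "0.1.0"
};

/* Linear lookup for illustration only; the real engine copies the
 * entries into a hash table once, when the module is registered. */
sketch_handler_fn sketch_find_function(const sketch_module_entry *m,
                                       const char *fname)
{
    for (const sketch_function_entry *f = m->functions; f->fname != NULL; f++) {
        if (strcmp(f->fname, fname) == 0) {
            return f->handler;
        }
    }
    return NULL;
}
```

Calling `sketch_find_function(&demo_module, "demo_double")` yields the handler, which can then be invoked like any function pointer. The point of the layout is that a module is pure data: one struct plus one sentinel-terminated table.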

Extension Lifecycle Hooks

As we saw in Article 1, the PHP lifecycle has module-level and request-level phases. Extensions hook into these phases through their zend_module_entry callbacks:

flowchart TD
    START["Process Start"] --> GINIT["GINIT<br/>Initialize extension globals struct<br/>(called once per thread in ZTS)"]
    GINIT --> MINIT["MINIT<br/>Module initialization:<br/>register classes, constants,<br/>INI entries, resources"]
    
    MINIT --> LOOP["Request Loop"]
    LOOP --> RINIT["RINIT<br/>Per-request setup:<br/>reset counters, open connections"]
    RINIT --> EXEC["Script Execution"]
    EXEC --> RSHUTDOWN["RSHUTDOWN<br/>Per-request teardown:<br/>close connections, flush buffers"]
    RSHUTDOWN --> POST["post_deactivate_func<br/>Late cleanup after output sent"]
    POST --> LOOP
    
    LOOP -->|"Process exit"| MSHUTDOWN["MSHUTDOWN<br/>Module teardown:<br/>unregister, free persistent memory"]
    MSHUTDOWN --> GDTOR["GDTOR<br/>Destroy extension globals struct"]
    GDTOR --> END["Process End"]

The separation matters for correctness:

  • GINIT/GDTOR: Initialize and destroy the extension's globals struct. In ZTS builds, this runs once per thread. The globals struct holds per-module state (not per-request state).
  • MINIT: Register everything that persists across requests — classes, functions, constants, INI entries. Use pemalloc() (persistent allocation) here, not emalloc().
  • RINIT: Set up per-request state. This is where the session extension starts a session automatically when session.auto_start is enabled, for example.
  • RSHUTDOWN: Clean up per-request state. Called during request shutdown, after the script's shutdown functions and destructors have run.
  • MSHUTDOWN: Clean up module-level resources. Free persistent allocations, unregister handlers.
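
The ordering and the split between module-level and request-level state can be simulated in a few lines. This is a toy model under stated assumptions: `sketch_globals`, `sketch_run_process`, and the hook functions are hypothetical names, and the "script" is just a counter increment.

```c
#include <stdio.h>
#include <string.h>

typedef struct {
    int  requests_served;     /* module-level state, survives across requests */
    int  calls_this_request;  /* request-level state, reset by RINIT */
    char order[128];          /* hook names in the order they ran */
} sketch_globals;

/* Append a hook name to the order log, space separated. */
static void note(sketch_globals *g, const char *hook)
{
    size_t used = strlen(g->order);
    snprintf(g->order + used, sizeof(g->order) - used, "%s%s",
             used ? " " : "", hook);
}

static void ginit(sketch_globals *g)     { memset(g, 0, sizeof(*g)); note(g, "GINIT"); }
static void minit(sketch_globals *g)     { note(g, "MINIT"); }
static void rinit(sketch_globals *g)     { g->calls_this_request = 0; note(g, "RINIT"); }
static void rshutdown(sketch_globals *g) { g->requests_served++; note(g, "RSHUTDOWN"); }
static void mshutdown(sketch_globals *g) { note(g, "MSHUTDOWN"); }
static void gdtor(sketch_globals *g)     { note(g, "GDTOR"); }

/* Simulate one worker process that serves `requests` requests,
 * each making `calls` calls into the extension. */
void sketch_run_process(sketch_globals *g, int requests, int calls)
{
    ginit(g);
    minit(g);
    for (int r = 0; r < requests; r++) {
        rinit(g);
        for (int c = 0; c < calls; c++) {
            g->calls_this_request++;   /* stands in for script execution */
        }
        rshutdown(g);
    }
    mshutdown(g);
    gdtor(g);
}
```

After `sketch_run_process(&g, 2, 3)`, `requests_served` has accumulated across both requests while `calls_this_request` only reflects the last one, which is the distinction the hooks exist to enforce.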

Registering Internal Functions

Every internal (C-implemented) PHP function receives its arguments through the INTERNAL_FUNCTION_PARAMETERS macro, defined in Zend/zend.h:

#define INTERNAL_FUNCTION_PARAMETERS zend_execute_data *execute_data, zval *return_value

Every internal function therefore has the signature void fn_name(zend_execute_data *execute_data, zval *return_value). The function parses its arguments from execute_data using zend_parse_parameters() or the faster ZEND_PARSE_PARAMETERS_* macros, and writes its return value into return_value.

flowchart LR
    STUB["json.stub.php<br/>PHP-syntax declarations:<br/>function json_encode(mixed $value, int $flags = 0): string|false"]
    GEN["gen_stub.php<br/>Build-time generator"]
    ARGINFO["json_arginfo.h<br/>Generated zend_function_entry[]<br/>+ ZEND_ARG_INFO structs"]
    IMPL["json.c<br/>PHP_FUNCTION(json_encode)<br/>{ ... C implementation ... }"]
    
    STUB --> GEN
    GEN --> ARGINFO
    ARGINFO --> |"Linked at compile"| IMPL

At the zend_function union level (as we saw in Article 3), internal functions use zend_internal_function instead of zend_op_array. The common header is identical, so the VM can handle both uniformly. When it dispatches an internal function, however, it does not enter the opcode interpreter: it calls the C handler directly, or routes the call through the zend_execute_internal hook when an extension has installed one.
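
The "shared header, different body" idea can be sketched as a C union whose variants all start with the same members. Everything here is a hypothetical miniature: `sketch_function`, the three toy opcodes, and the two-argument frame are assumptions for illustration, not the engine's layout.

```c
#include <stdint.h>

enum { SKETCH_INTERNAL_FUNCTION = 1, SKETCH_USER_FUNCTION = 2 };
enum { OP_LOAD_ARG0, OP_ADD_ARG1, OP_RETURN };   /* toy opcode set */

typedef struct { long value; } sketch_val;       /* stand-in for zval */
typedef struct { long args[2]; } sketch_frame;   /* stand-in for zend_execute_data */

typedef void (*sketch_handler)(sketch_frame *execute_data, sketch_val *return_value);

/* Every variant begins with the same members, so `type` can be read
 * before the caller knows which variant it is holding. */
typedef union {
    uint8_t type;
    struct { uint8_t type; const char *name; } common;
    struct { uint8_t type; const char *name; sketch_handler handler; } internal;
    struct { uint8_t type; const char *name; const int *opcodes; int last; } user;
} sketch_function;

/* An "internal function": reads arguments from the frame, writes the result. */
static void sketch_add(sketch_frame *execute_data, sketch_val *return_value)
{
    return_value->value = execute_data->args[0] + execute_data->args[1];
}

/* A toy interpreter loop for "user functions". */
static void sketch_interpret(const sketch_function *fn, sketch_frame *ed, sketch_val *rv)
{
    long acc = 0;
    rv->value = 0;
    for (int i = 0; i < fn->user.last; i++) {
        switch (fn->user.opcodes[i]) {
        case OP_LOAD_ARG0: acc = ed->args[0];  break;
        case OP_ADD_ARG1:  acc += ed->args[1]; break;
        case OP_RETURN:    rv->value = acc;    return;
        }
    }
}

/* Uniform call site: branch on the common header, then dispatch. */
void sketch_call(const sketch_function *fn, sketch_frame *ed, sketch_val *rv)
{
    if (fn->type == SKETCH_INTERNAL_FUNCTION) {
        fn->internal.handler(ed, rv);      /* straight into C */
    } else {
        sketch_interpret(fn, ed, rv);      /* opcode by opcode */
    }
}
```

The caller of `sketch_call` never needs to know which kind of function it holds, and the internal path costs one indirect call.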

OPcache: Shared Memory Opcode Cache

OPcache is PHP's most important performance extension. Without it, every request recompiles every PHP file from source. With it, compiled op_arrays are stored in shared memory and reused across all worker processes.

The core of OPcache lives in ext/opcache/ZendAccelerator.c. During startup, it replaces the zend_compile_file function pointer (the hookable pointer pattern from Article 4) with its own persistent_compile_file. The interception logic:

  1. Request comes in, PHP calls zend_compile_file("script.php")
  2. OPcache's replacement function checks: is this file in the shared memory cache?
  3. Cache hit: Return a pointer to the cached zend_op_array. No compilation at all.
  4. Cache miss: Call the original compile_file, then persist the result to shared memory.

flowchart TD
    subgraph shm["Shared Memory (SHM)"]
        direction TB
        ALLOC["SHM Allocator<br/>(mmap / shm / posix)"]
        STRINGS["Interned Strings Table<br/>(shared across processes)"]
        CACHE["Opcode Cache<br/>filename → cached_script"]
        CACHED["Cached Script:<br/>op_array, class_table,<br/>function_table"]
    end
    
    subgraph workers["FPM Worker Processes"]
        W1["Worker 1"]
        W2["Worker 2"]
        W3["Worker 3"]
    end
    
    W1 -->|"Read-only access"| CACHE
    W2 -->|"Read-only access"| CACHE
    W3 -->|"Read-only access"| CACHE
    
    W1 -->|"First compile"| CACHED
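
The hit-or-miss interception described in the numbered steps can be sketched with one hookable function pointer. This is a single-process toy under stated assumptions: all `sketch_` names are hypothetical, the "cache" is a small array rather than shared memory, and there is no invalidation or locking.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_SLOTS 8

typedef struct { const char *filename; } sketch_op_array;
typedef sketch_op_array *(*sketch_compile_fn)(const char *filename);

static sketch_op_array pool[SKETCH_SLOTS];
int sketch_compile_count = 0;            /* how many real compilations ran */

/* Stand-in for the engine's compiler. */
static sketch_op_array *sketch_real_compile(const char *filename)
{
    if (sketch_compile_count >= SKETCH_SLOTS) {
        return NULL;
    }
    sketch_op_array *op = &pool[sketch_compile_count++];
    op->filename = filename;
    return op;
}

/* The hookable pointer: every caller compiles through this. */
sketch_compile_fn sketch_compile_file = sketch_real_compile;

static sketch_compile_fn original_compile_file;
static sketch_op_array  *cache[SKETCH_SLOTS];
static int               cache_used;

static sketch_op_array *sketch_cached_compile(const char *filename)
{
    for (int i = 0; i < cache_used; i++) {
        if (strcmp(cache[i]->filename, filename) == 0) {
            return cache[i];                         /* hit: no compilation */
        }
    }
    sketch_op_array *fresh = original_compile_file(filename);   /* miss */
    if (fresh != NULL && cache_used < SKETCH_SLOTS) {
        cache[cache_used++] = fresh;                 /* remember for next time */
    }
    return fresh;
}

/* Install the hook, keeping the previous pointer for cache misses. */
void sketch_cache_startup(void)
{
    original_compile_file = sketch_compile_file;
    sketch_compile_file   = sketch_cached_compile;
}
```

After `sketch_cache_startup()`, two calls to `sketch_compile_file("script.php")` return the same pointer and bump `sketch_compile_count` only once. Saving the previous pointer rather than hard-coding the compiler is what lets several hooks chain.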

The persistence layer in ext/opcache/zend_persist.c is the tricky part. Op_arrays contain pointers — to strings, literals, class entries, other op_arrays. When copying to shared memory, all these pointers must be adjusted to point to the shared memory versions. Strings are interned into the shared interned string table. Nested structures (function tables within class entries) are recursively persisted.
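
Why the pointers need adjusting is easiest to see on a small structure. The sketch below is an assumption-laden miniature, not the zend_persist.c algorithm: `sketch_script`, `sketch_arena`, and `sketch_persist` are hypothetical, and the "shared memory" is any caller-supplied, suitably aligned buffer.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *name;          /* points at a separately allocated string */
    long       *literals;      /* points at a separately allocated array  */
    size_t      num_literals;
} sketch_script;

/* A bump allocator over one contiguous region. */
typedef struct {
    unsigned char *base;
    size_t         used;
    size_t         size;
} sketch_arena;

/* Copy `len` bytes into the arena at the next 8-byte boundary. */
static void *arena_copy(sketch_arena *a, const void *src, size_t len)
{
    size_t aligned = (a->used + 7u) & ~(size_t)7u;
    if (aligned + len > a->size) {
        return NULL;                       /* region is full */
    }
    void *dst = a->base + aligned;
    memcpy(dst, src, len);
    a->used = aligned + len;
    return dst;
}

/* Deep-copy a script into the arena and repoint its members at the copies. */
sketch_script *sketch_persist(sketch_arena *a, const sketch_script *src)
{
    sketch_script *copy = arena_copy(a, src, sizeof(*src));
    if (copy == NULL) {
        return NULL;
    }
    /* The shallow copy still points at process-local memory; fix that. */
    copy->name     = arena_copy(a, src->name, strlen(src->name) + 1);
    copy->literals = arena_copy(a, src->literals,
                                src->num_literals * sizeof(long));
    if (copy->name == NULL || copy->literals == NULL) {
        return NULL;
    }
    return copy;
}
```

Once persisted, the copy no longer depends on the original allocations, so the originals can be freed or modified without affecting it. The real persistence code has to do this for every pointer reachable from the script, recursively.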

Shared memory allocation is handled by ext/opcache/zend_shared_alloc.c, which supports multiple backends: mmap (the default on Linux), shm (System V shared memory), and posix (POSIX shared memory). The allocated region is sized by opcache.memory_consumption (default: 128MB).

OPcache also supports preloading (opcache.preload): a PHP script that runs once at server startup and loads classes/functions into shared memory permanently. Preloaded code is never invalidated and never recompiled, providing the fastest possible access.

The JIT Compiler

PHP 8.0 introduced a JIT (Just-In-Time) compiler that translates hot opcodes to native machine code. The JIT sits inside OPcache and builds on the SSA-based optimizer's type inference results (from Article 4).

The JIT entry point is ext/opcache/jit/zend_jit.c. It operates in two modes:

Function JIT compiles entire functions to native code. When a function's execution count exceeds a threshold, the JIT compiles it and patches the op_array's handler pointers to jump directly to native code.

Tracing JIT (in ext/opcache/jit/zend_jit_trace.c) is more sophisticated. It records execution traces — linear sequences of opcodes through hot loops — and compiles those traces to native code. Traces can span function boundaries, capturing the actual hot path through inlined function calls.

flowchart TD
    OPCODE["Opcode execution"] --> COUNT{"Execution count<br/>> threshold?"}
    COUNT -->|"No"| INTERP["Continue interpreting"]
    COUNT -->|"Yes"| MODE{"JIT mode?"}
    
    MODE -->|"Function JIT"| FJIT["Compile entire function<br/>to native code"]
    MODE -->|"Tracing JIT"| RECORD["Record trace<br/>(linear opcode path)"]
    RECORD --> TRACE_END{"Loop back or<br/>return?"}
    TRACE_END -->|"Loop"| COMPILE["Compile trace"]
    TRACE_END -->|"Side exit"| LINK["Link to other traces<br/>or back to interpreter"]
    
    FJIT --> PATCH["Patch handler<br/>to native entry point"]
    COMPILE --> PATCH
    
    PATCH --> NATIVE["Execute native code<br/>Type guards inline"]
    NATIVE --> DEOPT{"Type guard<br/>fails?"}
    DEOPT -->|"Yes"| INTERP
    DEOPT -->|"No"| NATIVE
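
The counter-then-patch half of this flow can be modelled with a function pointer that starts at the interpreter and is swapped once the function is hot. This is only a simulation: no code is generated, `run_native` is an ordinary C function standing in for JIT output, and the threshold and all `sketch_` names are made up for the example.

```c
typedef struct sketch_func sketch_func;
typedef long (*sketch_entry)(sketch_func *fn, long n);

struct sketch_func {
    sketch_entry entry;     /* where calls currently land */
    int          calls;     /* executions counted by the interpreter path */
    int          compiled;  /* set once the entry point has been patched */
};

#define SKETCH_HOT_THRESHOLD 3

/* Stand-in for generated machine code: same result, no loop. */
static long run_native(sketch_func *fn, long n)
{
    (void)fn;
    return n * (n + 1) / 2;
}

/* Interpreter path: counts executions and patches the entry when hot. */
static long run_interpreted(sketch_func *fn, long n)
{
    if (++fn->calls > SKETCH_HOT_THRESHOLD) {
        fn->entry    = run_native;   /* later calls skip the interpreter */
        fn->compiled = 1;
    }
    long sum = 0;
    for (long i = 1; i <= n; i++) {
        sum += i;
    }
    return sum;
}

void sketch_func_init(sketch_func *fn)
{
    fn->entry    = run_interpreted;
    fn->calls    = 0;
    fn->compiled = 0;
}

long sketch_invoke(sketch_func *fn, long n)
{
    return fn->entry(fn, n);
}
```

Callers always go through `sketch_invoke`, so they never notice the switch. Once patched, the counter stops moving because the interpreter path is no longer entered.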

The native code is generated using the IR framework in ext/opcache/jit/ir/. IR (Intermediate Representation) is a custom compiler framework that provides SSA-based IR construction, optimization passes (constant folding, copy propagation, register allocation), and code emission for x86_64 and AArch64.

The JIT IR pipeline in ext/opcache/jit/zend_jit_ir.c translates Zend opcodes to IR instructions. Type guards are inserted based on the SSA type inference from the optimizer — if a variable was inferred as IS_LONG, the JIT emits a type check at trace entry and integer-only arithmetic in the trace body. If the type guard fails at runtime, the trace deoptimizes back to the interpreter.
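
A type guard is, at its core, a cheap check in front of a specialised fast path, with the generic path as the fallback. The following sketch assumes a two-type value model (`SK_LONG`, `SK_DOUBLE`) and hypothetical function names; integer overflow handling, which a real PHP addition must deal with, is deliberately left out.

```c
typedef enum { SK_LONG, SK_DOUBLE } sketch_type;

typedef struct {
    sketch_type type;
    union { long lval; double dval; } u;
} sketch_val;

int sketch_deopt_count = 0;   /* how often the guard failed */

static double as_double(sketch_val v)
{
    return v.type == SK_LONG ? (double)v.u.lval : v.u.dval;
}

/* Generic path: handles every type combination, as an interpreter would. */
static sketch_val interp_add(sketch_val a, sketch_val b)
{
    sketch_val r;
    if (a.type == SK_LONG && b.type == SK_LONG) {
        r.type   = SK_LONG;
        r.u.lval = a.u.lval + b.u.lval;
    } else {
        r.type   = SK_DOUBLE;
        r.u.dval = as_double(a) + as_double(b);
    }
    return r;
}

/* Specialised path: compiled on the assumption that both operands are
 * integers. The guard verifies that assumption on every entry. */
sketch_val traced_add(sketch_val a, sketch_val b)
{
    if (a.type != SK_LONG || b.type != SK_LONG) {
        sketch_deopt_count++;          /* guard failed: leave the fast path */
        return interp_add(a, b);
    }
    sketch_val r;
    r.type   = SK_LONG;
    r.u.lval = a.u.lval + b.u.lval;    /* integer-only arithmetic */
    return r;
}
```

As long as the inferred type holds, the guard is the only overhead. When it fails, the result is still correct because the generic path takes over.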

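JIT configuration is controlled by opcache.jit and opcache.jit_buffer_size. opcache.jit accepts either a named mode (disable, off, function, tracing) or a four-digit CRTO value in which each digit selects a separate option, so it is not a bitmask. As of PHP 8.4 the defaults are opcache.jit=disable and opcache.jit_buffer_size=64M, so the JIT stays off until you opt in, for example with opcache.jit=tracing.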

Tip: The JIT provides the biggest speedups for CPU-bound code (mathematical computation, data processing). For typical I/O-bound web applications, OPcache alone provides most of the performance benefit. Profile before enabling the JIT — it uses additional memory and can increase startup time.

Fibers: Cooperative Concurrency

PHP 8.1 introduced Fibers for cooperative multitasking. A Fiber is a lightweight execution context with its own call stack that can be suspended and resumed. The implementation is in Zend/zend_fibers.h and Zend/zend_fibers.c.

sequenceDiagram
    participant Main as Main Context
    participant Fiber as Fiber Context
    
    Main->>Fiber: $fiber->start()
    Note over Fiber: Execute callback<br/>on Fiber's own C stack
    
    Fiber->>Main: Fiber::suspend($value)
    Note over Main: Context switch back<br/>$fiber->start() returns $value
    
    Main->>Fiber: $fiber->resume($sent)
    Note over Fiber: Fiber::suspend() returns $sent
    
    Fiber->>Main: return $result
    Note over Main: $fiber->getReturn() → $result

Each Fiber has its own C stack, allocated by zend_fiber_stack_allocate using mmap with guard pages on POSIX systems. Its size comes from the fiber.stack_size INI setting, which defaults to 2 MiB on 64-bit builds. The context switch saves and restores CPU registers and switches stack pointers using platform-specific assembly (boost.context-derived code for x86_64, ARM64, etc.).

Fibers integrate with the VM's execute_data chain. When a Fiber suspends, its execute_data chain is preserved — all the VM frames, CV slots, and temporaries remain on the Fiber's stack. When it resumes, the VM continues exactly where it left off.

The key design choice is that Fibers are cooperative, not preemptive. A Fiber runs until it explicitly calls Fiber::suspend(). This eliminates the need for locks or atomic operations — only one Fiber executes at a time. Libraries like ReactPHP and Revolt use Fibers under the hood to provide async/await patterns for I/O-bound operations.
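
The start, suspend, resume, return sequence can be reproduced in plain C with the POSIX ucontext functions, which are available on Linux with glibc. This is a swapped-in technique for illustration: PHP uses its own assembly switch, not ucontext, and the `sketch_fiber` API, the 64 KiB malloc'd stack, and the single `long` transfer slot are all assumptions of this sketch.

```c
#include <stdlib.h>
#include <ucontext.h>

typedef struct sketch_fiber {
    ucontext_t ctx;        /* saved registers and stack pointer of the fiber */
    ucontext_t caller;     /* where to continue on suspend or finish */
    void      *stack;      /* the fiber's own C stack */
    long       transfer;   /* value handed across each switch */
    int        finished;
    void     (*body)(struct sketch_fiber *);
} sketch_fiber;

static sketch_fiber *current_fiber;   /* the fiber that is running, if any */

static void fiber_trampoline(void)
{
    sketch_fiber *f = current_fiber;
    f->body(f);
    f->finished = 1;
    /* Returning here switches to uc_link, i.e. back to the caller. */
}

int sketch_fiber_init(sketch_fiber *f, void (*body)(sketch_fiber *))
{
    size_t size = 64 * 1024;
    f->stack = malloc(size);
    if (f->stack == NULL) {
        return -1;
    }
    if (getcontext(&f->ctx) != 0) {
        free(f->stack);
        return -1;
    }
    f->ctx.uc_stack.ss_sp   = f->stack;
    f->ctx.uc_stack.ss_size = size;
    f->ctx.uc_link          = &f->caller;
    f->transfer = 0;
    f->finished = 0;
    f->body     = body;
    makecontext(&f->ctx, fiber_trampoline, 0);
    return 0;
}

/* Start or resume the fiber; returns what it suspended or finished with. */
long sketch_fiber_resume(sketch_fiber *f, long sent)
{
    sketch_fiber *previous = current_fiber;
    f->transfer   = sent;
    current_fiber = f;
    swapcontext(&f->caller, &f->ctx);   /* run the fiber until it yields */
    current_fiber = previous;
    return f->transfer;
}

/* Called from inside a fiber: hand `value` to the caller and pause. */
long sketch_fiber_suspend(long value)
{
    sketch_fiber *f = current_fiber;
    f->transfer = value;
    swapcontext(&f->ctx, &f->caller);   /* back to whoever resumed us */
    return f->transfer;                 /* the value passed to resume */
}

/* Example body: yields 1, then finishes with ten times what it was sent. */
static void demo_body(sketch_fiber *f)
{
    long sent = sketch_fiber_suspend(1);
    f->transfer = sent * 10;
}
```

With `demo_body`, the first `sketch_fiber_resume(&f, 0)` returns 1 and the second `sketch_fiber_resume(&f, 4)` returns 40 with `finished` set. The local variable `sent` lives on the fiber's own stack across the suspension, which is the property the paragraph above describes for VM frames.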

Streams: Unified I/O Abstraction

PHP's streams layer, rooted in main/streams/streams.c, provides a unified interface for all I/O operations. When you call fopen(), file_get_contents(), or fread(), you're going through the streams layer.

classDiagram
    class php_stream {
        +ops: php_stream_ops*
        +readbuf: char*
        +readbuflen: size_t
        +wrapper: php_stream_wrapper*
        +context: php_stream_context*
    }

    class php_stream_ops {
        +write(stream, buf, count) ssize_t
        +read(stream, buf, count) ssize_t
        +close(stream) int
        +flush(stream) int
        +seek(stream, offset, whence) int
        +cast(stream, castas, ret) int
        +stat(stream, ssb) int
        +label: char*
    }

    class php_stream_wrapper {
        +wops: php_stream_wrapper_ops*
        +abstract: void*
        +is_url: int
    }

    class php_stream_wrapper_ops {
        +stream_opener()
        +stream_closer()
        +url_stat()
        +dir_opener()
        +unlink()
        +rename()
        +stream_mkdir()
        +label: char*
    }

    php_stream --> php_stream_ops : ops
    php_stream --> php_stream_wrapper : wrapper
    php_stream_wrapper --> php_stream_wrapper_ops : wops
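
The ops table is a classic C vtable: generic entry points call through function pointers, and each backend supplies its own implementations. A minimal sketch, assuming a fixed-size in-memory backend; the `sketch_` types are hypothetical and carry only two of the operations listed in the diagram.

```c
#include <stddef.h>
#include <string.h>

typedef struct sketch_stream sketch_stream;

/* The vtable: one function pointer per operation. */
typedef struct {
    long (*read)(sketch_stream *s, char *buf, size_t count);
    long (*write)(sketch_stream *s, const char *buf, size_t count);
    const char *label;
} sketch_stream_ops;

struct sketch_stream {
    const sketch_stream_ops *ops;
    void                    *backend_state;   /* private to the backend */
};

/* Backend: a small in-memory buffer. */
typedef struct {
    char   data[64];
    size_t len;   /* bytes written so far */
    size_t pos;   /* read position */
} sketch_membuf;

static long mem_read(sketch_stream *s, char *buf, size_t count)
{
    sketch_membuf *m = s->backend_state;
    size_t avail = m->len - m->pos;
    if (count > avail) {
        count = avail;                 /* short read at end of data */
    }
    memcpy(buf, m->data + m->pos, count);
    m->pos += count;
    return (long)count;
}

static long mem_write(sketch_stream *s, const char *buf, size_t count)
{
    sketch_membuf *m = s->backend_state;
    size_t room = sizeof(m->data) - m->len;
    if (count > room) {
        count = room;                  /* short write when the buffer is full */
    }
    memcpy(m->data + m->len, buf, count);
    m->len += count;
    return (long)count;
}

static const sketch_stream_ops sketch_memory_ops = { mem_read, mem_write, "MEMORY" };

/* Generic API: callers never see which backend is behind the stream. */
long sketch_stream_read(sketch_stream *s, char *buf, size_t count)
{
    return s->ops->read(s, buf, count);
}

long sketch_stream_write(sketch_stream *s, const char *buf, size_t count)
{
    return s->ops->write(s, buf, count);
}
```

Adding a file or socket backend would mean writing another pair of functions and another ops table; `sketch_stream_read` and `sketch_stream_write` would not change.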

The wrapper system is what makes fopen("http://example.com/file.txt") work. Each URL scheme is handled by a registered wrapper:

| Wrapper | Scheme | Implementation |
| --- | --- | --- |
| Plain file | file:// | main/streams/plain_wrapper.c |
| HTTP | http://, https:// | ext/standard/http_fopen_wrapper.c |
| FTP | ftp:// | ext/standard/ftp_fopen_wrapper.c |
| PHP | php://stdin, php://memory, etc. | ext/standard/php_fopen_wrapper.c |
| Compression | compress.zlib:// | ext/zlib/zlib_fopen_wrapper.c |
| User-space | Custom schemes | Registered via stream_wrapper_register() |

The transport layer (main/streams/xp_socket.c) handles socket I/O for TCP, UDP, and Unix domain sockets; SSL/TLS transports are layered on top by the openssl extension (ext/openssl/xp_ssl.c). The stream_socket_client() and stream_socket_server() functions go through this layer.
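
Wrappers and transports are both selected by the scheme prefix of the string passed in. A minimal sketch of that lookup, assuming a tiny array-based registry; `sketch_register_wrapper` and `sketch_locate_wrapper` are hypothetical names, and the rule that a path without a scheme falls back to "file" mirrors how plain paths are treated.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *scheme;   /* e.g. "http" */
    const char *label;    /* human-readable name of the handler */
} sketch_wrapper;

#define SKETCH_MAX_WRAPPERS 8

static sketch_wrapper registry[SKETCH_MAX_WRAPPERS];
static size_t         registered;

/* Returns 0 on success, -1 when the registry is full. */
int sketch_register_wrapper(const char *scheme, const char *label)
{
    if (registered >= SKETCH_MAX_WRAPPERS) {
        return -1;
    }
    registry[registered].scheme = scheme;
    registry[registered].label  = label;
    registered++;
    return 0;
}

/* Resolve "scheme://rest" to its wrapper. Paths with no "://" are
 * treated as plain files. Returns NULL for an unknown scheme. */
const sketch_wrapper *sketch_locate_wrapper(const char *path)
{
    const char *sep    = strstr(path, "://");
    const char *scheme = "file";
    size_t      len    = 4;

    if (sep != NULL) {
        scheme = path;
        len    = (size_t)(sep - path);
    }
    for (size_t i = 0; i < registered; i++) {
        if (strlen(registry[i].scheme) == len &&
            strncmp(registry[i].scheme, scheme, len) == 0) {
            return &registry[i];
        }
    }
    return NULL;
}
```

With "file" and "http" registered, `sketch_locate_wrapper("http://example.com/file.txt")` resolves to the HTTP entry and `sketch_locate_wrapper("/etc/hosts")` to the plain-file entry. A user-space wrapper is just one more registration.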

TSRM: Thread Safety

The Thread Safe Resource Manager in TSRM/TSRM.h and TSRM/TSRM.c solves a fundamental problem: PHP's engine uses extensive global state (compiler globals, executor globals, SAPI globals, per-extension globals), but in ZTS (Zend Thread Safety) builds, multiple threads may execute PHP simultaneously.

TSRM provides thread-local storage for all global state. Each module allocates a "resource ID" via ts_allocate_id(), which returns an integer index. At runtime, the global accessor macros resolve through TSRM:

| Build | EG(current_execute_data) expands to | Access |
| --- | --- | --- |
| Non-ZTS | executor_globals.current_execute_data | direct struct access |
| ZTS | ((zend_executor_globals*)tsrm_get_ls_cache())->current_execute_data | thread-local lookup |

In non-ZTS builds (the common case for FPM), TSRM is compiled out entirely. The macros become direct struct access with zero overhead. This is why PHP distributions ship separate ZTS and non-ZTS builds — the non-ZTS build is measurably faster.
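
The two expansions can be reproduced with one macro and a compile-time switch. This sketch assumes a hypothetical `SKETCH_ZTS` define and uses C11 `_Thread_local` as a stand-in for TSRM's storage; the real ZTS macro also involves a per-module offset that is omitted here.

```c
#include <stddef.h>

typedef struct {
    void *current_execute_data;
    int   exit_status;
} sketch_executor_globals;

#ifdef SKETCH_ZTS
/* Threaded build: one globals struct per thread, reached through a
 * thread-local pointer that is resolved on first use. */
static _Thread_local sketch_executor_globals sketch_thread_globals;
static _Thread_local void *sketch_ls_cache;

static void *sketch_get_ls_cache(void)
{
    if (sketch_ls_cache == NULL) {
        sketch_ls_cache = &sketch_thread_globals;
    }
    return sketch_ls_cache;
}
# define SKETCH_EG(field) \
    (((sketch_executor_globals *)sketch_get_ls_cache())->field)
#else
/* Non-threaded build: a single static struct, accessed directly. */
static sketch_executor_globals sketch_executor_globals_instance;
# define SKETCH_EG(field) (sketch_executor_globals_instance.field)
#endif

/* Code written against the macro is identical in both builds. */
void sketch_set_exit_status(int status)
{
    SKETCH_EG(exit_status) = status;
}

int sketch_get_exit_status(void)
{
    return SKETCH_EG(exit_status);
}
```

Compiled without `SKETCH_ZTS`, the accessor is a plain load from a static address. Compiled with `-DSKETCH_ZTS` under C11, the same source goes through the thread-local pointer instead, which is the indirection the non-ZTS build avoids.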

ZTS builds are required for:

  • Threaded web servers that load PHP as a module, such as Apache with a threaded MPM (including mpm_winnt on Windows)
  • Event-driven SAPIs that use threading
  • The parallel extension for true multithreading

Most production PHP deployments use non-ZTS FPM, where process-level isolation provides "thread safety" by giving each worker its own address space.

Series Conclusion

Across these five articles, we've traced PHP from its top-level directory structure through every major subsystem:

  1. Architecture and Lifecycle — the four layers, the SAPI contract, the request lifecycle
  2. The zval and Memory Model — 16-byte value representation, reference counting, COW, the allocator, and GC
  3. The Compilation Pipeline — lexer, parser, AST, compiler, opcodes, and the flag system
  4. The Virtual Machine — code generation, dispatch modes, register pinning, the optimizer
  5. Extensions, OPcache, and JIT — the extension API, shared memory caching, native code generation, Fibers, streams, and TSRM

The php-src codebase is large, but it's well-structured. The architectural boundaries are clear, the naming conventions are consistent, and the design patterns (hookable function pointers, lifecycle hooks, vtables of function pointers) repeat throughout. Once you understand the patterns, navigating even unfamiliar corners of the codebase becomes straightforward.

Tip: The best way to deepen your understanding is to pick a PHP function you use daily — like array_map() or json_decode() — and read its implementation end-to-end. You now have the context to understand every line.