Navigating php-src: Architecture, Layers, and the Request Lifecycle
Prerequisites
- Basic C programming (structs, pointers, function pointers)
- General familiarity with how interpreted languages work
- Understanding of process lifecycle concepts
PHP powers roughly 77% of all sites with a known server-side language. Yet most PHP developers never look at the engine beneath their code. The php-src repository is over two million lines of C, a number that intimidates even experienced systems programmers. This article gives you the mental map you need to navigate it. We'll break the codebase into its four architectural layers, examine the contract that lets PHP behave identically whether it runs behind Apache or Nginx or from a CLI terminal, and trace the complete lifecycle of a PHP request from the first main() call to the final shutdown.
By the end of this series, you'll be able to open any file in php-src and understand where it fits in the bigger picture.
Top-Level Directory Map
Before diving into architecture, let's orient ourselves with the repository's top-level layout. Each directory has a clear responsibility:
| Directory | Purpose |
|---|---|
| Zend/ | The Zend Engine: lexer, parser, compiler, VM, memory allocator, GC, type system |
| main/ | PHP runtime glue: lifecycle orchestration, INI system, streams, SAPI bridge |
| sapi/ | Server API entry points: CLI, FPM, CGI, Apache module, Embed, phpdbg |
| ext/ | 72+ bundled extensions: standard, JSON, OPcache, PDO, cURL, etc. |
| TSRM/ | Thread Safe Resource Manager: thread-local storage abstraction |
| build/ | Build tool scripts (autoconf, libtool helpers) |
| win32/ | Windows-specific build configuration and compatibility layer |
| tests/ | .phpt test files for the engine and extensions |
| Zend/Optimizer/ | SSA-based optimizer (lives inside Zend/ but is a distinct subsystem) |
graph TD
subgraph "php-src repository"
SAPI["sapi/<br/>CLI, FPM, CGI, Apache, Embed"]
MAIN["main/<br/>Runtime, INI, Streams, SAPI bridge"]
EXT["ext/<br/>72+ bundled extensions"]
ZEND["Zend/<br/>Engine: lexer, parser, compiler, VM, GC"]
OPT["Zend/Optimizer/<br/>SSA optimizer"]
TSRM_DIR["TSRM/<br/>Thread safety"]
BUILD["build/, win32/<br/>Build system"]
TESTS["tests/<br/>.phpt test suite"]
end
SAPI --> MAIN
MAIN --> ZEND
EXT --> ZEND
OPT --> ZEND
ZEND --> TSRM_DIR
Tip: When you're hunting for a feature, start with ext/ for PHP-facing functions, Zend/ for language semantics, main/ for runtime behavior, and sapi/ for host-environment integration.
The Four Architectural Layers
php-src is organized into four stacked layers. Each layer depends only on the layers below it, and each has a distinct responsibility:
flowchart TB
subgraph L4["Layer 4: SAPIs"]
CLI["CLI"]
FPM["FPM"]
CGI["CGI"]
APACHE["Apache"]
EMBED["Embed"]
end
subgraph L3["Layer 3: PHP Runtime (main/)"]
LIFECYCLE["Lifecycle Orchestration"]
INI["INI System"]
STREAMS["Streams I/O"]
SAPI_BRIDGE["SAPI Bridge"]
end
subgraph L2["Layer 2: Zend Engine"]
COMPILER["Compiler"]
VM["Virtual Machine"]
MM["Memory Manager"]
GC["Garbage Collector"]
TYPES["Type System"]
end
subgraph L1["Layer 1: TSRM"]
TLS["Thread-Local Storage"]
end
L4 --> L3
L3 --> L2
L2 --> L1
Layer 1 — TSRM sits at the bottom, providing thread-safe access to global state. In the common non-ZTS (non-thread-safe) builds, TSRM is compiled out, and global accessor macros like PG(), EG(), CG(), and SG() resolve directly to struct field access. In ZTS builds (needed when PHP is loaded into a multithreaded host, such as Apache with a threaded MPM), the same macros go through thread-local storage lookups.
Layer 2 — The Zend Engine is the language core. It contains the lexer, parser, AST, compiler, virtual machine, memory allocator, garbage collector, and the fundamental type system (zval, HashTable, zend_string, zend_object). The engine knows nothing about HTTP, file I/O, or configuration files — it only knows how to compile and execute PHP opcodes.
Layer 3 — PHP Runtime (main/) bridges the engine to the outside world. It orchestrates the startup/shutdown lifecycle, parses INI files, manages the streams I/O abstraction, and provides the SAPI bridge that decouples the engine from its host environment.
Layer 4 — SAPIs are the entry points. Each SAPI implements a contract (a vtable of function pointers) that tells the runtime how to read input, write output, send headers, and log errors for a specific host environment.
The global state accessor macros deserve special attention. Each layer has its own globals struct, accessed through a dedicated macro:
| Macro | Struct | Layer | Contents |
|---|---|---|---|
| PG() | php_core_globals | Runtime | INI settings, error handling, file upload state |
| EG() | zend_executor_globals | Engine | Current execute_data, symbol tables, exception state |
| CG() | zend_compiler_globals | Engine | Active op_array, AST, compilation state |
| SG() | sapi_globals_struct | Runtime | Request info, headers, current SAPI module |
These macros are everywhere in php-src. Recognizing them instantly is key to reading the code.
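To make the accessor pattern concrete, here is a minimal sketch of how a single macro can hide both build modes from calling code. Every name in it (demo_globals, DEMO_G, DEMO_ZTS, demo_globals_ptr) is a hypothetical stand-in chosen for illustration; the real TSRM implementation is considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a globals struct such as php_core_globals. */
typedef struct {
    int display_errors;
    long memory_limit;
} demo_globals;

#ifdef DEMO_ZTS
/* Thread-safe flavor: each thread owns a private copy, reached via a pointer. */
static _Thread_local demo_globals demo_tls_storage;
static _Thread_local demo_globals *demo_tls_globals = NULL;

static demo_globals *demo_globals_ptr(void)
{
    if (demo_tls_globals == NULL) {
        demo_tls_globals = &demo_tls_storage;
    }
    return demo_tls_globals;
}
# define DEMO_G(field) (demo_globals_ptr()->field)
#else
/* Non-thread-safe flavor: one process-wide struct, plain field access. */
static demo_globals demo_globals_instance;
# define DEMO_G(field) (demo_globals_instance.field)
#endif

/* Calling code is written once and compiles unchanged in both builds. */
static void demo_set_memory_limit(long bytes)
{
    assert(bytes > 0);
    DEMO_G(memory_limit) = bytes;
}

static long demo_get_memory_limit(void)
{
    return DEMO_G(memory_limit);
}
```

Compiling with -DDEMO_ZTS switches the storage strategy without touching demo_set_memory_limit() or demo_get_memory_limit(), which is the property that lets the same engine source serve both build types.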
The SAPI Contract
The SAPI (Server API) contract is one of PHP's most elegant design decisions. It's a single struct — sapi_module_struct — containing roughly 30 function pointers that abstract every interaction between PHP and its host environment.
You can find the definition in main/SAPI.h. The key callbacks include:
classDiagram
class sapi_module_struct {
+char *name
+char *pretty_name
+startup(sapi_module_struct*) int
+shutdown(sapi_module_struct*) int
+activate() int
+deactivate() int
+ub_write(char*, size_t) size_t
+flush(void*) void
+header_handler(sapi_header_struct*, ...) int
+send_headers(sapi_headers_struct*) int
+send_header(sapi_header_struct*, void*) void
+read_post(char*, size_t) size_t
+read_cookies() char*
+register_server_variables(zval*) void
+log_message(char*, int) void
+get_fd(int*) int
+ini_defaults(HashTable*) void
}
The CLI SAPI provides a concrete example. In sapi/cli/php_cli.c, the module definition wires up CLI-specific implementations:
- ub_write → writes to stdout via fwrite()
- read_post → returns nothing (CLI has no POST body)
- read_cookies → returns NULL (no cookies in CLI)
- register_server_variables → populates $_SERVER with argv, argc, and SCRIPT_FILENAME
- log_message → writes to stderr
This contract means the Zend Engine never calls write() or fwrite() directly. It always goes through sapi_module.ub_write(), which does the right thing whether PHP is running as an Apache module, a FastCGI worker, or an embedded scripting engine.
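The idea can be sketched with a heavily reduced vtable. The demo_sapi struct and all demo_ functions below are hypothetical; they only borrow the shape of two sapi_module_struct callbacks (ub_write and read_cookies) to show how the "engine" side stays ignorant of where output lands.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, heavily reduced cousin of sapi_module_struct. */
typedef struct demo_sapi {
    const char *name;
    size_t (*ub_write)(const char *str, size_t len);
    char *(*read_cookies)(void);
} demo_sapi;

/* "CLI-like" host: output goes to stdout, and there are no cookies. */
static size_t demo_cli_ub_write(const char *str, size_t len)
{
    return fwrite(str, 1, len, stdout);
}

static char *demo_no_cookies(void)
{
    return NULL;
}

/* "Embedded" host: output is captured in a fixed-size memory buffer. */
static char demo_capture[256];
static size_t demo_capture_len = 0;

static size_t demo_buffer_ub_write(const char *str, size_t len)
{
    size_t room = sizeof(demo_capture) - 1 - demo_capture_len;
    if (len > room) {
        len = room; /* truncate rather than overflow the buffer */
    }
    memcpy(demo_capture + demo_capture_len, str, len);
    demo_capture_len += len;
    demo_capture[demo_capture_len] = '\0';
    return len;
}

static const demo_sapi demo_cli_sapi = {
    "demo-cli", demo_cli_ub_write, demo_no_cookies
};
static const demo_sapi demo_buffer_sapi = {
    "demo-buffer", demo_buffer_ub_write, demo_no_cookies
};

/* The "engine" side only ever sees whichever vtable is active. */
static const demo_sapi *demo_active_sapi = &demo_cli_sapi;

static size_t demo_engine_echo(const char *str)
{
    assert(demo_active_sapi != NULL && demo_active_sapi->ub_write != NULL);
    return demo_active_sapi->ub_write(str, strlen(str));
}
```

Pointing demo_active_sapi at demo_buffer_sapi redirects every demo_engine_echo() call into memory without changing the caller, which mirrors how a host swaps in its own callbacks.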
SAPI Entry Points Compared
Each SAPI ships its own main() function, but they all converge on the same lifecycle calls. Here's how the major SAPIs differ:
| SAPI | Entry File | Process Model | Request Loop |
|---|---|---|---|
| CLI | sapi/cli/php_cli.c | Single process, single request | Execute script and exit |
| FPM | sapi/fpm/fpm/fpm_main.c | Master + worker pool | accept() loop in each worker |
| CGI | sapi/cgi/cgi_main.c | Spawned per-request by web server | Single request, then exit |
| Apache | sapi/apache2handler/sapi_apache2.c | Loaded as .so module | Called by Apache's request handler |
| Embed | sapi/embed/php_embed.c | Embedded in host application | Host controls lifecycle |
The CLI SAPI is the simplest: its main() parses command-line arguments, calls php_module_startup(), runs a single request, and shuts down. FPM is the most complex: it forks worker processes, manages pools with configurable sizing, and each worker loops through accept() → php_request_startup() → execute → php_request_shutdown().
Despite these differences, every SAPI eventually calls the same four lifecycle functions from main/main.c. This is the convergence point.
The Request Lifecycle
The lifecycle is the backbone of PHP's execution model. Every PHP process, whether CLI, FPM, or Apache, follows the same five-phase pattern:
sequenceDiagram
participant SAPI as SAPI main()
participant Runtime as main/main.c
participant Zend as Zend Engine
participant Ext as Extensions
Note over SAPI,Ext: Phase 1: Module Startup (once per process)
SAPI->>Runtime: php_module_startup()
Runtime->>Zend: zend_startup()
Zend->>Zend: Init memory manager, scanner, compiler, VM
Runtime->>Runtime: Parse php.ini
Runtime->>Ext: Call each extension's MINIT()
Note over SAPI,Ext: Phase 2: Request Startup (once per request)
SAPI->>Runtime: php_request_startup()
Runtime->>Zend: zend_activate()
Zend->>Zend: Reset memory arena, init symbol tables
Runtime->>Ext: Call each extension's RINIT()
Note over SAPI,Ext: Phase 3: Execution
SAPI->>Zend: zend_execute_scripts()
Zend->>Zend: Compile source → opcodes
Zend->>Zend: Execute opcodes in VM
Note over SAPI,Ext: Phase 4: Request Shutdown
SAPI->>Runtime: php_request_shutdown()
Runtime->>Ext: Call each extension's RSHUTDOWN()
Runtime->>Zend: zend_deactivate()
Zend->>Zend: Free request memory, destroy symbol tables
Note over SAPI,Ext: Phase 5: Module Shutdown (once per process)
SAPI->>Runtime: php_module_shutdown()
Runtime->>Ext: Call each extension's MSHUTDOWN()
Runtime->>Zend: zend_shutdown()
Phase 1: Module Startup happens once when the process starts (or once when the Apache module loads). The key function is php_module_startup() in main/main.c. It calls zend_startup() to initialize the engine — memory manager, scanner, compiler, executor, and built-in functions. Then it parses php.ini, registers core INI settings, and walks the extension list calling each extension's MINIT (Module Init) hook. This is where extensions register their classes, constants, and internal functions.
Phase 2: Request Startup runs before each request. php_request_startup() in main/main.c calls zend_activate() to reset the per-request memory arena, re-initialize symbol tables, and clear the executor state. Then it calls each extension's RINIT (Request Init) hook, where extensions set up their per-request state. The session extension, for example, resets its request globals here and starts a session automatically if session.auto_start is enabled.
Phase 3: Execution is where your PHP code actually runs. The SAPI calls zend_execute_scripts(), typically through the php_execute_script() wrapper in main/main.c rather than directly. The engine compiles the source file to an op_array (or retrieves a cached one from OPcache) and feeds it to the VM.
Phase 4: Request Shutdown mirrors startup. php_request_shutdown() calls each extension's RSHUTDOWN, then zend_deactivate() destroys all per-request data. In FPM and Apache, the process loops back to Phase 2 for the next request.
Phase 5: Module Shutdown runs when the process exits. Extensions get their MSHUTDOWN call, and zend_shutdown() tears down the engine.
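The ordering of the five phases can be modeled in a few lines. This is a toy sketch, not php-src code: the demo_extension struct, the demo_ functions, and the one-letter trace are all hypothetical, and the real functions take arguments and report failures that are omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical extension record with the four hook slots described above. */
typedef struct {
    const char *name;
    void (*minit)(void);
    void (*rinit)(void);
    void (*rshutdown)(void);
    void (*mshutdown)(void);
} demo_extension;

/* Trace buffer: one letter per event, so the call order is easy to inspect. */
static char demo_trace[128];
static size_t demo_trace_len = 0;

static void demo_record(char event)
{
    assert(demo_trace_len + 1 < sizeof(demo_trace));
    demo_trace[demo_trace_len++] = event;
    demo_trace[demo_trace_len] = '\0';
}

static void demo_minit(void)     { demo_record('M'); }
static void demo_rinit(void)     { demo_record('r'); }
static void demo_rshutdown(void) { demo_record('s'); }
static void demo_mshutdown(void) { demo_record('S'); }

static const demo_extension demo_extensions[] = {
    { "demo_ext", demo_minit, demo_rinit, demo_rshutdown, demo_mshutdown },
};
#define DEMO_EXT_COUNT (sizeof(demo_extensions) / sizeof(demo_extensions[0]))

static void demo_module_startup(void)
{
    for (size_t i = 0; i < DEMO_EXT_COUNT; i++) demo_extensions[i].minit();
}

static void demo_request_startup(void)
{
    for (size_t i = 0; i < DEMO_EXT_COUNT; i++) demo_extensions[i].rinit();
}

static void demo_execute(void)
{
    demo_record('x'); /* stands in for compiling and running a script */
}

static void demo_request_shutdown(void)
{
    for (size_t i = 0; i < DEMO_EXT_COUNT; i++) demo_extensions[i].rshutdown();
}

static void demo_module_shutdown(void)
{
    for (size_t i = 0; i < DEMO_EXT_COUNT; i++) demo_extensions[i].mshutdown();
}

/* Worker-style driver: one module startup, N requests, one module shutdown. */
static const char *demo_run_worker(int requests)
{
    demo_trace_len = 0;
    demo_trace[0] = '\0';
    demo_module_startup();
    for (int i = 0; i < requests; i++) {
        demo_request_startup();
        demo_execute();
        demo_request_shutdown();
    }
    demo_module_shutdown();
    return demo_trace;
}
```

Calling demo_run_worker(1) models a CLI-style run and yields the trace "MrxsS"; demo_run_worker(2) models a long-lived worker and yields "MrxsrxsS", with the module hooks still firing only once.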
Tip: The clean separation between module startup (once) and request startup (per-request) is why PHP's "shared nothing" architecture works so well. Each request starts with a clean slate — no leaked state from previous requests. This is also why PHP never needs to be "restarted" to pick up code changes (unless OPcache is caching).
Configuration: The INI System
PHP's configuration system is intimately tied to the lifecycle. INI files are parsed during module startup, and the change-mode system controls which settings can be modified at which lifecycle phase.
flowchart TD
A["Process Start"] --> B["Scan for php.ini"]
B --> C["Parse php.ini directives"]
C --> D["Apply PHP_INI_SYSTEM settings"]
D --> E["Extensions register INI entries in MINIT"]
E --> F["Per-request: scan .user.ini"]
F --> G["Apply PHP_INI_PERDIR settings"]
G --> H["Runtime: ini_set() calls"]
H --> I["Apply PHP_INI_USER / PHP_INI_ALL settings"]
Every INI directive has a change mode that determines when it can be modified:
| Mode | Value | Where it can be set |
|---|---|---|
| PHP_INI_SYSTEM | 4 | php.ini or httpd.conf; not changeable from scripts |
| PHP_INI_PERDIR | 2 | php.ini, .htaccess, httpd.conf, or .user.ini |
| PHP_INI_USER | 1 | User scripts via ini_set(), or .user.ini |
| PHP_INI_ALL | 7 | Anywhere (the bitwise OR of the three modes above) |
The INI entry machinery is defined in Zend/zend_ini.h. Each extension declares its entries with macros like STD_PHP_INI_ENTRY and registers them during MINIT with REGISTER_INI_ENTRIES(). The actual parsing happens inside php_module_startup(), where php_init_config() locates and parses the INI file.
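Because the change modes are bit flags, the permission check reduces to a single bitwise AND. The sketch below assumes the flag values from Zend/zend_ini.h (user = 1, per-directory = 2, system = 4); the DEMO_ names, the demo_ini_entry struct, and demo_ini_can_modify() are hypothetical and much simpler than the engine's own entry type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values mirror the ZEND_INI_* bit flags; the DEMO_ names are local to this sketch. */
enum {
    DEMO_INI_USER   = 1 << 0, /* 1 */
    DEMO_INI_PERDIR = 1 << 1, /* 2 */
    DEMO_INI_SYSTEM = 1 << 2, /* 4 */
    DEMO_INI_ALL    = DEMO_INI_USER | DEMO_INI_PERDIR | DEMO_INI_SYSTEM /* 7 */
};

/* Hypothetical, reduced INI entry: a name plus its allowed change contexts. */
typedef struct {
    const char *name;
    int modifiable; /* bitmask of contexts allowed to change this entry */
} demo_ini_entry;

/* A change is allowed only if the requesting context's bit is set on the entry. */
static bool demo_ini_can_modify(const demo_ini_entry *entry, int context)
{
    assert(entry != NULL);
    return (entry->modifiable & context) != 0;
}
```

An entry declared with DEMO_INI_SYSTEM rejects a request made in the DEMO_INI_USER context, while one declared with DEMO_INI_ALL accepts requests from any of the three contexts.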
The .user.ini feature (controlled by user_ini.filename) allows per-directory overrides in non-CLI SAPIs. These are scanned during request startup with a configurable cache TTL (user_ini.cache_ttl), so they don't impose per-request file system overhead.
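The TTL behavior amounts to comparing the request time against a stored expiry. Here is a minimal sketch of that check; demo_user_ini_cache and demo_user_ini_lookup() are hypothetical names, and the real bookkeeping in php-src tracks more than a single timestamp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical per-directory cache record. */
typedef struct {
    time_t expires;  /* moment after which the directory is rescanned */
    int scan_count;  /* how many times the file system was actually touched */
} demo_user_ini_cache;

/* Returns true when a rescan happened, false when the cached result was reused. */
static bool demo_user_ini_lookup(demo_user_ini_cache *cache, time_t now, long ttl_seconds)
{
    assert(cache != NULL && ttl_seconds >= 0);
    if (cache->scan_count > 0 && now <= cache->expires) {
        return false; /* still fresh: skip the file system entirely */
    }
    cache->scan_count++; /* stands in for scanning and parsing .user.ini */
    cache->expires = now + ttl_seconds;
    return true;
}
```

With a TTL of 300 seconds (the documented default for user_ini.cache_ttl), a scan at t = 1000 serves every request up to and including t = 1300 from the cache, and the first request after that triggers a rescan.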
What's Next
We now have the map. We know the four layers, the SAPI contract, and the lifecycle that governs every PHP request. In the next article, we'll zoom into the Zend Engine's core data structures — the 16-byte zval that represents every PHP value, the dual-mode HashTable that powers PHP arrays, and the memory allocator that makes PHP's "allocate everything, free it all at once" model remarkably fast. Understanding these structures is essential for reading any part of the engine code.