Dify Architecture Overview: Navigating a 6,000-File LLM Platform
Prerequisites
- Basic familiarity with Flask and Python web applications
- Docker Compose fundamentals
- General understanding of LLM application concepts
Dify is an open-source platform for building LLM applications — chatbots, RAG pipelines, multi-step workflows, and agent-driven tools — through a visual interface and a comprehensive API. At over 6,000 files spanning Python and TypeScript, the codebase can feel overwhelming. This article provides the orientation you need to navigate it confidently, covering the monorepo layout, deployment topology, boot sequence, routing structure, and configuration system.
Repository Structure and Domain-Driven Layout
The Dify monorepo is organized into three top-level domains, each with a dedicated AGENTS.md that codifies conventions:
| Directory | Language | Purpose | Approx. Files |
|---|---|---|---|
| `api/` | Python (Flask) | Backend API server and Celery workers | ~650 core files |
| `web/` | TypeScript (Next.js) | Console frontend and embeddable chat UI | ~5,100 files |
| `docker/` | YAML / Shell | Docker Compose orchestration and SSRF proxy config | ~20 files |
The project's AGENTS.md lays out the philosophy explicitly: the backend follows Domain-Driven Design and Clean Architecture principles. Async work flows through Celery with Redis as the broker. Dependencies are injected through constructors, and errors are handled with domain-specific exceptions at the correct layer.
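As a hedged illustration of those conventions, here is a minimal sketch of a service with constructor-injected dependencies and a domain-specific exception. The class names are invented for the example and do not appear in the Dify codebase:

```python
# Illustrative only: the names below are invented, not taken from Dify.

class AccountNotFoundError(Exception):
    """Domain-specific exception raised at the service layer."""

class AccountService:
    def __init__(self, account_repository):
        # Dependency injected through the constructor, never created inline
        self._repo = account_repository

    def get_account(self, account_id):
        account = self._repo.get(account_id)
        if account is None:
            raise AccountNotFoundError(f"account {account_id} not found")
        return account

class InMemoryAccountRepository:
    """Test double standing in for a SQLAlchemy-backed repository."""
    def __init__(self, rows):
        self._rows = rows

    def get(self, account_id):
        return self._rows.get(account_id)

repo = InMemoryAccountRepository({"a1": {"name": "Ada"}})
service = AccountService(repo)
```

Because the repository arrives through the constructor, tests can swap in an in-memory double, and the service raises a typed error instead of leaking a `None` up the stack.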
Inside api/, the directory structure reflects DDD boundaries:
```text
api/
├── controllers/   # HTTP layer: blueprints, request validation
├── services/      # Application services: orchestration logic
├── core/          # Domain core: workflow engine, RAG, model management
│   ├── app/       # App execution pipeline
│   ├── rag/       # Retrieval-augmented generation
│   ├── workflow/  # Workflow engine and node factory
│   ├── tools/     # Tool abstractions and runtime
│   └── plugin/    # Plugin daemon integration
├── models/        # SQLAlchemy ORM models
├── configs/       # Pydantic configuration system
├── extensions/    # Flask extension initializers
└── tasks/         # Celery task definitions
```
Tip: When exploring Dify for the first time, start with `api/app.py` → `api/app_factory.py` → `api/extensions/`. This boot chain explains how every subsystem gets wired together.
Docker Compose Service Topology
A production Dify deployment involves roughly ten cooperating services orchestrated by Docker Compose. The architecture follows a hub-and-spoke pattern with the API server at the center.
```mermaid
flowchart TD
    subgraph External
        Client[Browser / API Client]
    end
    subgraph DockerNet["Docker Network"]
        Nginx[nginx reverse proxy]
        API[api - Gunicorn/Flask]
        Worker[worker - Celery]
        Beat[worker_beat - Celery Beat]
        Web[web - Next.js]
        PluginDaemon[plugin_daemon]
        Sandbox[sandbox - Code Execution]
        SSRFProxy[ssrf_proxy - Squid]
        subgraph Stores["Data Stores"]
            Postgres[(PostgreSQL)]
            Redis[(Redis)]
        end
    end
    Client --> Nginx
    Nginx --> API
    Nginx --> Web
    API --> Postgres
    API --> Redis
    API --> PluginDaemon
    API --> SSRFProxy
    Worker --> Postgres
    Worker --> Redis
    Beat --> Redis
    Sandbox --> SSRFProxy
    PluginDaemon --> Postgres
    PluginDaemon -->|backwards invocation| API
```
The docker/docker-compose.yaml defines the service topology. Both the api and worker services use the same Docker image (langgenius/dify-api), distinguished only by the MODE environment variable — api for the HTTP server, worker for Celery.
The SSRF proxy (Squid) is a deliberate security boundary. When workflow nodes make outbound HTTP requests — fetching URLs, calling APIs, executing webhooks — those requests are routed through this proxy. This prevents Server-Side Request Forgery attacks where a malicious workflow could probe internal network services. The sandbox code execution environment is similarly isolated, connecting only to ssrf_proxy on its dedicated network.
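The idea can be sketched in a few lines of stdlib Python: route every outbound request through a forward proxy so the application code never opens a direct connection. The proxy hostname and port here are illustrative assumptions, not Dify's actual configuration values:

```python
import urllib.request

def build_proxied_opener(proxy_url="http://ssrf_proxy:3128"):
    """Build an opener that forces HTTP(S) traffic through a forward proxy.

    The default proxy_url is an assumption for illustration; in a real
    deployment it would come from environment configuration.
    """
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)

opener = build_proxied_opener()
# opener.open("http://example.com") would now go via the proxy,
# which can refuse requests aimed at internal network ranges.
```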
The plugin daemon is a separate Go service that runs untrusted plugin code in isolated processes. It communicates back to the Dify API through a "backwards invocation" channel — an internal API authenticated with INNER_API_KEY_FOR_PLUGIN. We'll explore this architecture in depth in Part 5.
The Two-Process Architecture: API Server and Celery Worker
Dify's backend runs as two distinct process types from a single codebase. The entry point at api/app.py handles both:
```mermaid
flowchart TD
    Start[app.py] --> IsDB{is_db_command?}
    IsDB -->|Yes| MigApp[create_migrations_app<br/>minimal Flask + DB + Migrate]
    IsDB -->|No| FullApp[create_app<br/>full Flask + 28 extensions]
    FullApp --> FlaskApp[app = DifyApp]
    FullApp --> CeleryApp[celery = app.extensions.celery]
    FlaskApp --> Gunicorn[Gunicorn serves HTTP]
    CeleryApp --> CeleryWorker[Celery processes tasks]
```
The file is remarkably compact. When the process is a Gunicorn worker, app is the Flask WSGI application. When it's a Celery worker, the same app is created, and the celery object is extracted from Flask's extensions registry. This shared-codebase pattern means both processes have identical access to models, services, and configuration.
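The dispatch at the top of that boot chain can be sketched as follows. This is a simplified stand-in for the real logic in `api/app.py`, and the exact argv check is an assumption for illustration:

```python
# Hedged sketch of the entry-point dispatch; the real code lives in api/app.py.

def is_db_command(argv):
    """True when the process was started as `flask db <subcommand>` (assumed check)."""
    return len(argv) > 1 and argv[0].endswith("flask") and argv[1] == "db"

def select_factory(argv):
    # Migrations only need Flask + SQLAlchemy + Flask-Migrate;
    # every other invocation gets the full 28-extension application.
    if is_db_command(argv):
        return "create_migrations_app"
    return "create_app"
```

A `flask db upgrade` invocation would take the minimal path, while Gunicorn and Celery workers both build the full app.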
The gevent monkey-patching strategy is carefully orchestrated across both paths. Gunicorn's gunicorn.conf.py uses a post_patch event subscriber that fires after gevent has patched built-in modules. This timing is critical — patching gRPC or psycopg2 before gevent patches the stdlib causes deadlocks:
```python
from gevent import events as gevent_events
from grpc.experimental import gevent as grpc_gevent
from psycogreen import gevent as psycogreen_gevent

def post_patch(event):
    if not isinstance(event, gevent_events.GeventDidPatchBuiltinModulesEvent):
        return
    # Only safe to patch gRPC and psycopg2 once gevent has patched the stdlib
    grpc_gevent.init_gevent()
    psycogreen_gevent.patch_psycopg()
```
The Celery entry point at api/celery_entrypoint.py takes a simpler approach — it patches immediately at import time because Celery's gevent pool handles monkey-patching before this module loads.
Tip: If you're debugging a mysterious hang or deadlock in development, check whether the gevent patching order is correct. The comment in `gunicorn.conf.py` referencing issue #26689 documents a real-world example of this.
Boot Sequence: create_app() and the 28 Extensions
The heart of the startup sequence is app_factory.py. The create_app() function creates a Flask app with configuration, then initializes exactly 28 extensions in a carefully ordered list.
The ordering is not arbitrary — it reflects dependency chains:
```mermaid
flowchart LR
    subgraph P1["Phase 1: Foundation"]
        TZ[ext_timezone]
        LOG[ext_logging]
        WARN[ext_warnings]
        IMP[ext_import_modules]
    end
    subgraph P2["Phase 2: Core Infrastructure"]
        DB[ext_database]
        REDIS[ext_redis]
        STOR[ext_storage]
        CEL[ext_celery]
    end
    subgraph P3["Phase 3: Application Layer"]
        BP[ext_blueprints]
        CMD[ext_commands]
        OTEL[ext_otel]
        SESS[ext_session_factory]
    end
    TZ --> LOG --> WARN --> IMP --> DB --> REDIS --> STOR --> CEL --> BP --> CMD --> OTEL --> SESS
```
Each extension module follows a simple contract: expose an init_app(app) function, and optionally an is_enabled() guard. The initialization loop in app_factory.py#L171-L213 checks is_enabled() before calling init_app(), and in debug mode logs the time each extension takes to initialize:
```python
for ext in extensions:
    short_name = ext.__name__.split(".")[-1]
    is_enabled = ext.is_enabled() if hasattr(ext, "is_enabled") else True
    if not is_enabled:
        if dify_config.DEBUG:
            logger.info("Skipped %s", short_name)
        continue

    start_time = time.perf_counter()
    ext.init_app(app)
    if dify_config.DEBUG:
        duration_ms = (time.perf_counter() - start_time) * 1000
        logger.info("Loaded %s (%.2f ms)", short_name, duration_ms)
```
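The contract is small enough to demonstrate with stand-ins. Real extensions receive a Flask app object; a plain dict suffices to show the mechanics:

```python
# Minimal stand-ins for the extension contract: init_app(app) plus an
# optional is_enabled() guard. Names are invented for illustration.

class ExtAlpha:
    @staticmethod
    def init_app(app):
        app["alpha"] = "ready"

class ExtBeta:
    @staticmethod
    def is_enabled():
        return False  # the loop will skip this extension entirely

    @staticmethod
    def init_app(app):
        app["beta"] = "ready"

def init_extensions(app, extensions):
    for ext in extensions:
        enabled = ext.is_enabled() if hasattr(ext, "is_enabled") else True
        if enabled:
            ext.init_app(app)

app = {}
init_extensions(app, [ExtAlpha, ExtBeta])
```

After the loop runs, only the enabled extension has touched the app, which is exactly how optional subsystems (OpenTelemetry, Sentry, and so on) stay dormant when unconfigured.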
Notable ordering constraints include:
- `ext_database` before `ext_migrate` — migrations need the database connection
- `ext_storage` before `ext_logstore` — log storage depends on the storage backend
- `ext_logstore` before `ext_celery` — Celery tasks may need log access during initialization
- `ext_blueprints` near the end — route registration depends on nearly everything else
There's also a separate `create_migrations_app()` that initializes only `ext_database` and `ext_migrate`, used when running `flask db` commands.
Blueprint Registration and Route Groups
Dify's HTTP API is organized into seven blueprint groups, registered in ext_blueprints.py:
| Blueprint | URL Prefix | CORS Policy | Purpose |
|---|---|---|---|
| `console_app_bp` | `/console/api` | Origin-restricted, credentials | Admin console API |
| `service_api_bp` | `/v1` | Open headers with `Authorization` | External developer API |
| `web_bp` | `/api` | Origin-restricted, per-route config | Web app (end-user facing) |
| `files_bp` | `/files` | Open with CSRF token | File upload/download |
| `inner_api_bp` | `/inner/api` | None (internal only) | Plugin daemon callbacks |
| `mcp_bp` | `/mcp` | None | Model Context Protocol endpoints |
| `trigger_bp` | `/trigger` | Open with webhook headers | Webhook trigger endpoints |
The CORS configuration is layered with a helper function _apply_cors_once() that prevents double-application when blueprints are reused across test instances. The web_bp blueprint has the most complex CORS setup — embedded bot endpoints (/chat-messages) use relaxed cross-origin policies without credentials, while authenticated endpoints require stricter origin checks.
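The "apply once" guard is a simple idempotency pattern. A hedged sketch, with the actual flask-cors wiring replaced by a stand-in attribute:

```python
# Sketch of an idempotent CORS-application guard. The real helper
# configures flask-cors; here a plain attribute stands in for that setup.

def apply_cors_once(blueprint, **cors_options):
    if getattr(blueprint, "_cors_applied", False):
        return False  # already configured — a second call is a no-op
    blueprint._cors_applied = True
    blueprint.cors_options = cors_options  # stand-in for real CORS setup
    return True

class FakeBlueprint:
    """Bare object standing in for a Flask blueprint."""

bp = FakeBlueprint()
first = apply_cors_once(bp, origins=["https://console.example.com"])
second = apply_cors_once(bp, origins=["https://other.example.com"])
```

The second call returns without touching the blueprint, so reusing a blueprint across test app instances cannot stack duplicate CORS headers.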
```mermaid
flowchart TD
    Request[HTTP Request] --> PathMatch{Path prefix?}
    PathMatch -->|/console/api| Console[Console Blueprint<br/>Cookie auth + CSRF]
    PathMatch -->|/v1| ServiceAPI[Service API Blueprint<br/>Bearer token auth]
    PathMatch -->|/api| WebApp[Web Blueprint<br/>Passport token auth]
    PathMatch -->|/files| Files[Files Blueprint<br/>Signed URL auth]
    PathMatch -->|/inner/api| Inner[Inner API Blueprint<br/>API key auth]
    PathMatch -->|/mcp| MCP[MCP Blueprint]
    PathMatch -->|/trigger| Trigger[Trigger Blueprint<br/>Webhook auth]
```
Configuration System: Pydantic Multi-Inheritance with Remote Sources
Dify's configuration is powered by a Pydantic BaseSettings class that uses multiple inheritance to compose nine configuration groups into a single DifyConfig class:
```mermaid
classDiagram
    class DifyConfig {
        +model_config: SettingsConfigDict
        +settings_customise_sources()
    }
    class PackagingInfo
    class DeploymentConfig
    class FeatureConfig
    class MiddlewareConfig
    class ExtraServiceConfig
    class ObservabilityConfig
    class RemoteSettingsSourceConfig
    class EnterpriseFeatureConfig
    class EnterpriseTelemetryConfig
    DifyConfig --|> PackagingInfo
    DifyConfig --|> DeploymentConfig
    DifyConfig --|> FeatureConfig
    DifyConfig --|> MiddlewareConfig
    DifyConfig --|> ExtraServiceConfig
    DifyConfig --|> ObservabilityConfig
    DifyConfig --|> RemoteSettingsSourceConfig
    DifyConfig --|> EnterpriseFeatureConfig
    DifyConfig --|> EnterpriseTelemetryConfig
```
The configuration resolution follows a six-layer priority chain, defined in the settings_customise_sources() classmethod:
1. Init settings — programmatic overrides
2. Environment variables — `os.environ`
3. Remote settings — Apollo or Nacos config servers (via `RemoteSettingsSourceFactory`)
4. Dotenv file — `.env`
5. File secrets — Docker secrets
6. TOML file — `pyproject.toml` defaults
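The resolution rule behind that chain is simply "the first layer that defines a key wins." A minimal sketch, with invented layer contents for illustration:

```python
# Sketch of first-match-wins resolution across ordered settings layers.
# Layer contents below are invented examples, not real Dify defaults.

def resolve(key, layers):
    for layer in layers:
        if key in layer:
            return layer[key]
    return None

layers = [
    {},                                                # 1. init settings
    {"LOG_LEVEL": "DEBUG"},                            # 2. environment variables
    {},                                                # 3. remote settings
    {"LOG_LEVEL": "INFO", "EDITION": "SELF_HOSTED"},   # 4. .env file
    {},                                                # 5. file secrets
    {"EDITION": "CLOUD"},                              # 6. pyproject.toml defaults
]
```

Here an environment variable shadows the `.env` value for `LOG_LEVEL`, while `EDITION` falls through to the first layer that defines it.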
The RemoteSettingsSourceFactory is particularly interesting. It uses a match statement to dispatch to either ApolloSettingsSource or NacosSettingsSource based on the REMOTE_SETTINGS_SOURCE_NAME value. This allows teams to centralize configuration in enterprise config management systems while keeping the same codebase.
Tip: The `MiddlewareConfig` group alone contains configuration for 30+ vector databases, multiple cache backends, and storage providers. When troubleshooting a missing config value, search the parent config classes — they're spread across `api/configs/middleware/`, `api/configs/feature/`, and others.
What's Next
With the architectural foundation in place, we're ready to trace what happens when an actual request hits the system. In Part 2, we'll follow an HTTP request from the controller layer through the AppGenerateService mode dispatcher, down into the Generator → Runner → QueueManager → TaskPipeline four-stage execution pattern that powers every LLM interaction in Dify.