
Dify Architecture Overview: Navigating a 6,000-File LLM Platform

Intermediate

Prerequisites

  • Basic familiarity with Flask and Python web applications
  • Docker Compose fundamentals
  • General understanding of LLM application concepts


Dify is an open-source platform for building LLM applications — chatbots, RAG pipelines, multi-step workflows, and agent-driven tools — through a visual interface and a comprehensive API. At over 6,000 files spanning Python and TypeScript, the codebase can feel overwhelming. This article provides the orientation you need to navigate it confidently, covering the monorepo layout, deployment topology, boot sequence, routing structure, and configuration system.

Repository Structure and Domain-Driven Layout

The Dify monorepo is organized into three top-level domains, each with a dedicated AGENTS.md that codifies conventions:

| Directory | Language | Purpose | Approx. Files |
|-----------|----------|---------|---------------|
| api/ | Python (Flask) | Backend API server and Celery workers | ~650 core files |
| web/ | TypeScript (Next.js) | Console frontend and embeddable chat UI | ~5,100 files |
| docker/ | YAML / Shell | Docker Compose orchestration and SSRF proxy config | ~20 files |

The project's AGENTS.md lays out the philosophy explicitly: the backend follows Domain-Driven Design and Clean Architecture principles. Async work flows through Celery with Redis as the broker. Dependencies are injected through constructors, and errors are handled with domain-specific exceptions at the correct layer.
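These conventions can be made concrete with a small sketch. The class and method names below are illustrative, not actual Dify code; the point is the shape: a service receives its repository through the constructor and raises a domain-specific exception at the service layer rather than leaking a generic error upward.

```python
class AppNotFoundError(Exception):
    """Domain exception raised when an app id cannot be resolved."""


class AppService:
    def __init__(self, app_repository):
        # Dependency injected via the constructor, never instantiated inline.
        self._repo = app_repository

    def get_app(self, app_id: str):
        app = self._repo.find_by_id(app_id)
        if app is None:
            # Domain error raised at the service layer, not the HTTP layer.
            raise AppNotFoundError(f"app {app_id} not found")
        return app


class InMemoryAppRepository:
    """Test double standing in for a SQLAlchemy-backed repository."""

    def __init__(self, apps):
        self._apps = apps

    def find_by_id(self, app_id):
        return self._apps.get(app_id)


service = AppService(InMemoryAppRepository({"a1": {"name": "demo"}}))
assert service.get_app("a1")["name"] == "demo"
```

Because the dependency arrives through the constructor, tests can swap in an in-memory repository without touching the database.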

Inside api/, the directory structure reflects DDD boundaries:

api/
├── controllers/     # HTTP layer: blueprints, request validation
├── services/        # Application services: orchestration logic
├── core/            # Domain core: workflow engine, RAG, model management
│   ├── app/         # App execution pipeline
│   ├── rag/         # Retrieval-augmented generation
│   ├── workflow/    # Workflow engine and node factory
│   ├── tools/       # Tool abstractions and runtime
│   └── plugin/      # Plugin daemon integration
├── models/          # SQLAlchemy ORM models
├── configs/         # Pydantic configuration system
├── extensions/      # Flask extension initializers
└── tasks/           # Celery task definitions

Tip: When exploring Dify for the first time, start with api/app.py → api/app_factory.py → api/extensions/. This boot chain explains how every subsystem gets wired together.

Docker Compose Service Topology

A production Dify deployment involves eight application services (nginx, api, worker, worker_beat, web, plugin_daemon, sandbox, and ssrf_proxy) plus PostgreSQL and Redis, orchestrated by Docker Compose. The architecture follows a hub-and-spoke pattern with the API server at the center.

flowchart TD
    subgraph External
        Client[Browser / API Client]
    end

    subgraph Docker Network
        Nginx[nginx reverse proxy]
        API[api - Gunicorn/Flask]
        Worker[worker - Celery]
        Beat[worker_beat - Celery Beat]
        Web[web - Next.js]
        PluginDaemon[plugin_daemon]
        Sandbox[sandbox - Code Execution]
        SSRFProxy[ssrf_proxy - Squid]

        subgraph Data Stores
            Postgres[(PostgreSQL)]
            Redis[(Redis)]
        end
    end

    Client --> Nginx
    Nginx --> API
    Nginx --> Web
    API --> Postgres
    API --> Redis
    API --> PluginDaemon
    API --> SSRFProxy
    Worker --> Postgres
    Worker --> Redis
    Beat --> Redis
    Sandbox --> SSRFProxy
    PluginDaemon --> Postgres
    PluginDaemon -->|backwards invocation| API

The docker/docker-compose.yaml defines the service topology. Both the api and worker services use the same Docker image (langgenius/dify-api), distinguished only by the MODE environment variable — api for the HTTP server, worker for Celery.
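The MODE switch can be pictured as a tiny dispatcher (a sketch; in the real image this decision is made by the container's shell entrypoint script, and the exact command-line flags here are illustrative):

```python
import os


def build_command(mode: str) -> list[str]:
    """Pick the process role for the shared dify-api image."""
    if mode == "worker":
        # Celery consumer: processes async tasks from the Redis broker.
        return ["celery", "-A", "app.celery", "worker"]
    # Default role: Gunicorn serving the Flask HTTP API.
    return ["gunicorn", "--config", "gunicorn.conf.py", "app:app"]


command = build_command(os.environ.get("MODE", "api"))
```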

The SSRF proxy (Squid) is a deliberate security boundary. When workflow nodes make outbound HTTP requests — fetching URLs, calling APIs, executing webhooks — those requests are routed through this proxy. This prevents Server-Side Request Forgery attacks where a malicious workflow could probe internal network services. The sandbox code execution environment is similarly isolated, connecting only to ssrf_proxy on its dedicated network.
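In practice, routing through the proxy means every outbound request carries a proxies configuration instead of dialing the target directly. The sketch below assumes an environment variable named after Dify's SSRF proxy settings and the compose service address; both are illustrative:

```python
import os


def proxy_config() -> dict[str, str]:
    # The env var name mirrors Dify's SSRF proxy setting; the default is
    # the Squid service address on the Docker network. Both are
    # assumptions for illustration.
    proxy = os.environ.get("SSRF_PROXY_HTTP_URL", "http://ssrf_proxy:3128")
    return {"http": proxy, "https": proxy}


def ssrf_safe_get(url: str, timeout: float = 10.0):
    import requests  # third-party HTTP client (pip install requests)

    # All outbound traffic goes via the proxy, never directly to the
    # target, so requests to internal addresses can be denied centrally.
    return requests.get(url, proxies=proxy_config(), timeout=timeout)
```

With this shape, the proxy's Squid ACLs, not the application code, decide which destinations are reachable.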

The plugin daemon is a separate Go service that runs untrusted plugin code in isolated processes. It communicates back to the Dify API through a "backwards invocation" channel — an internal API authenticated with INNER_API_KEY_FOR_PLUGIN. We'll explore this architecture in depth in Part 5.

The Two-Process Architecture: API Server and Celery Worker

Dify's backend runs as two distinct process types from a single codebase. The entry point at api/app.py handles both:

flowchart TD
    Start[app.py] --> IsDB{is_db_command?}
    IsDB -->|Yes| MigApp[create_migrations_app<br/>minimal Flask + DB + Migrate]
    IsDB -->|No| FullApp[create_app<br/>full Flask + 28 extensions]
    FullApp --> FlaskApp[app = DifyApp]
    FullApp --> CeleryApp[celery = app.extensions.celery]
    FlaskApp --> Gunicorn[Gunicorn serves HTTP]
    CeleryApp --> CeleryWorker[Celery processes tasks]

The file is remarkably compact. When the process is a Gunicorn worker, app is the Flask WSGI application. When it's a Celery worker, the same app is created, and the celery object is extracted from Flask's extensions registry. This shared-codebase pattern means both processes have identical access to models, services, and configuration.
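Condensed to its essentials, the dual-role pattern looks like this (a sketch using a stand-in for Flask, since the point is the extensions registry rather than the framework itself):

```python
class DifyApp:
    """Stand-in for the Flask subclass; real Flask apps carry the same
    `extensions` dict that ext_celery populates during create_app()."""

    def __init__(self):
        self.extensions = {}


def create_app() -> DifyApp:
    app = DifyApp()
    # In the real factory, ext_celery.init_app(app) builds a Celery
    # instance bound to this app's config and registers it here.
    app.extensions["celery"] = "celery-app-placeholder"
    return app


# app.py exposes both objects from the same module:
app = create_app()                   # Gunicorn serves this as the WSGI app
celery = app.extensions["celery"]    # the Celery worker consumes this
```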

The gevent monkey-patching strategy is carefully orchestrated across both paths. Gunicorn's gunicorn.conf.py uses a post_patch event subscriber that fires after gevent has patched built-in modules. This timing is critical — patching gRPC or psycopg2 before gevent patches the stdlib causes deadlocks:

from gevent import events as gevent_events
import grpc.experimental.gevent as grpc_gevent
from psycogreen import gevent as psycogreen_gevent

def post_patch(event):
    # Only react once gevent has finished patching the stdlib.
    if not isinstance(event, gevent_events.GeventDidPatchBuiltinModulesEvent):
        return
    grpc_gevent.init_gevent()
    psycogreen_gevent.patch_psycopg()

The Celery entry point at api/celery_entrypoint.py takes a simpler approach — it patches immediately at import time because Celery's gevent pool handles monkey-patching before this module loads.

Tip: If you're debugging a mysterious hang or deadlock in development, check whether gevent patching order is correct. The comment in gunicorn.conf.py referencing issue #26689 documents a real-world example of this.

Boot Sequence: create_app() and the 28 Extensions

The heart of the startup sequence is app_factory.py. The create_app() function creates a Flask app with configuration, then initializes exactly 28 extensions in a carefully ordered list.

The ordering is not arbitrary — it reflects dependency chains:

flowchart LR
    subgraph Phase 1: Foundation
        TZ[ext_timezone]
        LOG[ext_logging]
        WARN[ext_warnings]
        IMP[ext_import_modules]
    end
    subgraph Phase 2: Core Infrastructure
        DB[ext_database]
        REDIS[ext_redis]
        STOR[ext_storage]
        CEL[ext_celery]
    end
    subgraph Phase 3: Application Layer
        BP[ext_blueprints]
        CMD[ext_commands]
        OTEL[ext_otel]
        SESS[ext_session_factory]
    end
    TZ --> LOG --> WARN --> IMP --> DB --> REDIS --> STOR --> CEL --> BP --> CMD --> OTEL --> SESS

Each extension module follows a simple contract: expose an init_app(app) function, and optionally an is_enabled() guard. The initialization loop in app_factory.py#L171-L213 checks is_enabled() before calling init_app(), and in debug mode logs the time each extension takes to initialize:

for ext in extensions:
    short_name = ext.__name__.split(".")[-1]
    is_enabled = ext.is_enabled() if hasattr(ext, "is_enabled") else True
    if not is_enabled:
        if dify_config.DEBUG:
            logger.info("Skipped %s", short_name)
        continue
    start_time = time.perf_counter()
    ext.init_app(app)
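An extension module satisfying this contract needs only two functions. The sketch below is illustrative, not an actual Dify extension:

```python
import logging

logger = logging.getLogger(__name__)


def is_enabled() -> bool:
    # Optional guard; extensions without it are always initialized.
    return True


def init_app(app) -> None:
    # Wire the subsystem into the Flask app. Here we just register a
    # placeholder under the app's extensions dict.
    app.extensions["my_ext"] = logger
```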

Notable ordering constraints include:

  • ext_database before ext_migrate — migrations need the database connection
  • ext_storage before ext_logstore — log storage depends on the storage backend
  • ext_logstore before ext_celery — Celery tasks may need log access during initialization
  • ext_blueprints near the end — route registration depends on nearly everything else

There's also a separate create_migrations_app() that initializes only ext_database and ext_migrate, used when running flask db commands.

Blueprint Registration and Route Groups

Dify's HTTP API is organized into seven blueprint groups, registered in ext_blueprints.py:

| Blueprint | URL Prefix | CORS Policy | Purpose |
|-----------|------------|-------------|---------|
| console_app_bp | /console/api | Origin-restricted, credentials | Admin console API |
| service_api_bp | /v1 | Open headers with Authorization | External developer API |
| web_bp | /api | Origin-restricted, per-route config | Web app (end-user facing) |
| files_bp | /files | Open with CSRF token | File upload/download |
| inner_api_bp | /inner/api | None (internal only) | Plugin daemon callbacks |
| mcp_bp | /mcp | None | Model Context Protocol endpoints |
| trigger_bp | /trigger | Open with webhook headers | Webhook trigger endpoints |

The CORS configuration is layered with a helper function _apply_cors_once() that prevents double-application when blueprints are reused across test instances. The web_bp blueprint has the most complex CORS setup — embedded bot endpoints (/chat-messages) use relaxed cross-origin policies without credentials, while authenticated endpoints require stricter origin checks.

flowchart TD
    Request[HTTP Request] --> PathMatch{Path prefix?}
    PathMatch -->|/console/api| Console[Console Blueprint<br/>Cookie auth + CSRF]
    PathMatch -->|/v1| ServiceAPI[Service API Blueprint<br/>Bearer token auth]
    PathMatch -->|/api| WebApp[Web Blueprint<br/>Passport token auth]
    PathMatch -->|/files| Files[Files Blueprint<br/>Signed URL auth]
    PathMatch -->|/inner/api| Inner[Inner API Blueprint<br/>API key auth]
    PathMatch -->|/mcp| MCP[MCP Blueprint]
    PathMatch -->|/trigger| Trigger[Trigger Blueprint<br/>Webhook auth]

Configuration System: Pydantic Multi-Inheritance with Remote Sources

Dify's configuration is powered by a Pydantic BaseSettings class that uses multiple inheritance to compose nine configuration groups into a single DifyConfig class:

classDiagram
    class DifyConfig {
        +model_config: SettingsConfigDict
        +settings_customise_sources()
    }
    class PackagingInfo
    class DeploymentConfig
    class FeatureConfig
    class MiddlewareConfig
    class ExtraServiceConfig
    class ObservabilityConfig
    class RemoteSettingsSourceConfig
    class EnterpriseFeatureConfig
    class EnterpriseTelemetryConfig

    DifyConfig --|> PackagingInfo
    DifyConfig --|> DeploymentConfig
    DifyConfig --|> FeatureConfig
    DifyConfig --|> MiddlewareConfig
    DifyConfig --|> ExtraServiceConfig
    DifyConfig --|> ObservabilityConfig
    DifyConfig --|> RemoteSettingsSourceConfig
    DifyConfig --|> EnterpriseFeatureConfig
    DifyConfig --|> EnterpriseTelemetryConfig
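Stripped of the Pydantic machinery, the composition pattern is plain multiple inheritance: each parent contributes one group of fields, and the final class sees them all. A minimal sketch without pydantic-settings (field names and defaults are illustrative):

```python
class DeploymentConfig:
    # Deployment-level fields; defaults shown are placeholders.
    APPLICATION_NAME: str = "langgenius/dify"
    DEBUG: bool = False


class FeatureConfig:
    # Feature-level fields.
    FILES_URL: str = ""


class DifyConfig(DeploymentConfig, FeatureConfig):
    # With pydantic-settings, inheriting from BaseSettings would also
    # resolve every inherited field from env vars, .env, and remote
    # sources according to the priority chain described below.
    pass


cfg = DifyConfig()
assert cfg.DEBUG is False and cfg.FILES_URL == ""
```

The payoff is organizational: each group lives in its own module under api/configs/, yet consumers import a single dify_config object.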

The configuration resolution follows a six-layer priority chain, defined in the settings_customise_sources() classmethod:

  1. Init settings — programmatic overrides
  2. Environment variables — os.environ
  3. Remote settings — Apollo or Nacos config servers (via RemoteSettingsSourceFactory)
  4. Dotenv file — .env
  5. File secrets — Docker secrets
  6. TOML file — pyproject.toml defaults

The RemoteSettingsSourceFactory is particularly interesting. It uses a match statement to dispatch to either ApolloSettingsSource or NacosSettingsSource based on the REMOTE_SETTINGS_SOURCE_NAME value. This allows teams to centralize configuration in enterprise config management systems while keeping the same codebase.

Tip: The MiddlewareConfig group alone contains configuration for 30+ vector databases, multiple cache backends, and storage providers. When troubleshooting a missing config value, search the parent config classes — they're spread across api/configs/middleware/, api/configs/feature/, and others.

What's Next

With the architectural foundation in place, we're ready to trace what happens when an actual request hits the system. In Part 2, we'll follow an HTTP request from the controller layer through the AppGenerateService mode dispatcher, down into the Generator → Runner → QueueManager → TaskPipeline four-stage execution pattern that powers every LLM interaction in Dify.