Files
lijiaoqiao/llm-gateway-competitors/litellm-wheel-src/litellm/proxy/pass_through_endpoints/architecture.md
2026-03-26 20:06:14 +08:00

2.7 KiB

Pass-Through Endpoints Architecture

Why Pass-Through Endpoints Transform Requests

Even "pass-through" endpoints must perform essential transformations. The request body passes through unchanged, but:

sequenceDiagram
    participant Client
    participant Proxy as LiteLLM Proxy
    participant Provider as LLM Provider

    Client->>Proxy: POST /vertex_ai/v1/projects/.../generateContent
    Note over Client,Proxy: Headers: Authorization: Bearer sk-litellm-key
    Note over Client,Proxy: Body: { "contents": [...] }
    
    rect rgb(240, 240, 240)
        Note over Proxy: 1. URL Construction
        Note over Proxy: Build regional/provider-specific URL
    end
    
    rect rgb(240, 240, 240)
        Note over Proxy: 2. Auth Header Replacement
        Note over Proxy: LiteLLM key → provider credentials
    end

    rect rgb(240, 240, 240)
        Note over Proxy: 3. Extra Operations
        Note over Proxy: • x-pass-* headers (strip prefix, forward)
        Note over Proxy: • x-litellm-tags → metadata
        Note over Proxy: • Guardrails (opt-in)
        Note over Proxy: • Multipart form reconstruction
    end
    
    Proxy->>Provider: POST https://us-central1-aiplatform.googleapis.com/...
    Note over Proxy,Provider: Headers: Authorization: Bearer ya29.google-oauth...
    Note over Proxy,Provider: Body: { "contents": [...] } ← UNCHANGED
    
    Provider-->>Proxy: Response (streaming or non-streaming)

    rect rgb(240, 240, 240)
        Note over Proxy: 4. Response Handling (async)
        Note over Proxy: • Collect streaming chunks for logging
        Note over Proxy: • Cost injection (if enabled)
        Note over Proxy: • Parse response → calculate cost → log
    end
    
    Proxy-->>Client: Response (unchanged)

Essential Transformations

  • URL Construction - Build correct provider URL (e.g., regional endpoints for Vertex AI, Bedrock)
  • Auth Header Replacement - Swap LiteLLM virtual key for actual provider credentials

Extra Operations

Operation Description
x-pass-* headers Strip prefix and forward (e.g., x-pass-anthropic-betaanthropic-beta)
x-litellm-tags header Extract tags and add to request metadata for logging
Streaming chunk collection Collect chunks async for logging after stream completes
Multipart form handling Reconstruct multipart/form-data requests for file uploads
Guardrails (opt-in) Run content filtering when explicitly configured
Cost injection Inject cost into streaming chunks when include_cost_in_streaming_usage enabled

What Does NOT Change

  • Request body
  • Response body
  • Provider-specific parameters