Debye Documentation
Comprehensive reference for the Debye LLM Governance Proxy. Covers installation, CLI usage, policy configuration, API reference, architecture, security model, and operations.
1. Quick Start
Prerequisites
- An active Debye tenant provisioned by your organization (you need your tenant-id)
- Access to your organization's Identity Provider (Entra ID, Okta, Google Workspace)
- Network connectivity to the Debye proxy endpoint (provided by your IT team)
- An LLM tool that supports ANTHROPIC_BASE_URL (Claude Code, Cursor, Windsurf, etc.)
Installation
The Debye CLI is a single static binary. Choose your platform:
macOS (Homebrew)
brew tap fusionminds/debye
brew install debye
macOS (signed PKG)
# Download from your IT team's distribution portal
sudo installer -pkg debye-macos-arm64.pkg -target /
Linux (DEB)
sudo dpkg -i debye_1.0.0_amd64.deb
Linux (RPM)
sudo rpm -i debye-1.0.0.x86_64.rpm
Windows (MSI)
# Run the signed MSI installer distributed by your IT team
# Or via command line:
msiexec /i debye-windows-amd64.msi /quiet
Curl one-liner (evaluation)
curl -fsSL https://get.debye.dev | sh
First login
debye login --org <tenant-id>
This opens your browser to your organization's Identity Provider. After authenticating, the CLI exchanges the IdP token for a proxy-issued JWT and stores it securely in your OS keychain (macOS Keychain / Windows Credential Manager) or ~/.debye/credentials on Linux. Your tenant ID is saved to ~/.debye/config for subsequent use.
First proxied request
debye wrap claude
This validates your JWT, refreshes it if needed, sets ANTHROPIC_BASE_URL and ANTHROPIC_API_KEY in the subprocess environment, then execs claude. Your tool connects to the Debye proxy transparently. All requests are inspected against your organization's policies before being forwarded to the Anthropic API.
Verifying it works
# Check your authentication status
debye status
# Expected output:
# User: alice@example.com
# Tenant: acme-corp
# Proxy: https://proxy.acme.debye.dev
# Token: valid (expires in 23h 42m)
# Auth: keychain
If debye status shows a valid token and the correct tenant, your setup is complete. Use debye wrap before any LLM tool invocation from this point forward.
2. CLI Commands
The Debye CLI handles authentication, token lifecycle, and environment setup. It is a simple set-env-vars-and-exec wrapper with no persistent processes or local proxy.
debye login
debye login --org <tenant-id>
OIDC browser login for the specified tenant. Uses authorization code flow with PKCE. Exchanges the IdP ID token at the proxy /auth/token endpoint for a proxy-issued JWT (24h lifetime) and refresh token (30 days, configurable per tenant).
Flags
| Flag | Description |
|---|---|
| --org <id> | Tenant ID (required on first login, saved for subsequent use) |
Example
$ debye login --org acme-corp
Opening browser for authentication...
Waiting for IdP callback...
Authenticated as alice@example.com
Token stored in keychain
Tenant: acme-corp
Multi-tenant note: Running debye login --org <other-tenant> switches your active tenant. Only one tenant is active at a time (same pattern as gcloud config or aws --profile).
debye status
debye status
Show current authentication state: user identity, active tenant, token expiry, proxy endpoint, and credential storage location.
Example output
$ debye status
User: alice@example.com
Tenant: acme-corp
Proxy: https://proxy.acme.debye.dev
Token: valid (expires in 23h 42m)
Auth: keychain
debye env
debye env [--shell bash|zsh|fish|powershell]
Print shell exports for ANTHROPIC_BASE_URL and ANTHROPIC_API_KEY. Useful for scripting or manual use. Auto-detects shell if --shell is omitted.
Flags
| Flag | Description |
|---|---|
| --shell <type> | Output format: bash, zsh, fish, or powershell (auto-detected if omitted) |
Example (bash/zsh)
$ debye env
export ANTHROPIC_BASE_URL="https://proxy.acme.debye.dev"
export ANTHROPIC_API_KEY="eyJhbGciOiJSUz..."
# Usage: eval $(debye env)
Example (PowerShell)
PS> debye env --shell powershell
$env:ANTHROPIC_BASE_URL = "https://proxy.acme.debye.dev"
$env:ANTHROPIC_API_KEY = "eyJhbGciOiJSUz..."
Warning: The token exported by debye env expires after 24 hours. Use debye wrap for automatic refresh, or re-run debye login to obtain a fresh token.
debye wrap
debye wrap <command> [args...]
The primary way to use Debye. Checks the stored JWT, refreshes it if it is within 1 hour of expiry, sets ANTHROPIC_BASE_URL and ANTHROPIC_API_KEY in the subprocess environment, then execs the command. The subprocess connects directly to the remote proxy.
Examples
# Wrap Claude Code
debye wrap claude
# Wrap with arguments
debye wrap claude --model claude-sonnet-4-20250514
# Wrap a custom script
debye wrap python my_llm_script.py
# Wrap Cursor (if it reads ANTHROPIC_BASE_URL)
debye wrap cursor
Token lifetime note: JWTs have a 24-hour lifetime. If a session exceeds 24 hours, the subprocess will receive a 401 error. Re-run debye wrap to refresh automatically.
debye logout
debye logout
Clear stored credentials from the OS keychain (or ~/.debye/credentials on Linux). Removes saved tenant configuration. After logout, debye wrap will require a new debye login.
Example
$ debye logout
Credentials cleared from keychain
Logged out of acme-corp
3. Policy Rule Types
Rules are evaluated against extracted text content, not raw JSON. The policy engine recursively extracts string values from JSON structures. JSON keys and structural tokens are excluded from scanning to prevent false positives.
Secrets / Credentials
Known-format pattern detection. Specific regex for common credential formats. No generic high-entropy detection in MVP (excluded due to false-positive rate on base64 content).
| Pattern | Regex | Caught | Not caught (false positive guidance) |
|---|---|---|---|
| AWS Access Key | AKIA[0-9A-Z]{16} | AKIAIOSFODNN7EXAMPLE | Strings starting with AKIA but < 20 chars |
| GitHub Token | gh[ps]_[A-Za-z0-9_]{36,} | ghp_ABCdef123456..., ghs_... | Old-format GitHub tokens without prefix |
| Anthropic Key | sk-ant-[a-zA-Z0-9_-]{20,} | sk-ant-api03-abc... | References to "sk-ant" in prose without full key |
| OpenAI Key | sk-[a-zA-Z0-9]{20,} | sk-abc123... | Short strings starting with "sk-" |
| Private Key | -----BEGIN (RSA\|EC\|OPENSSH) PRIVATE KEY----- | PEM-formatted private key blocks | Public key headers, certificate headers |
| Connection String | [a-z]+://[^:]+:[^@]+@[^\s]+ | postgres://user:pass@host/db | URLs without embedded credentials |
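As a rough illustration of known-format detection, the sketch below compiles four of the patterns from the table and reports where they match. It uses Python's `re` module as a stand-in for the RE2 engine, and the names `SECRET_PATTERNS` and `find_secrets` are hypothetical, not part of Debye.

```python
import re

# Patterns copied from the table above; Python's `re` stands in for RE2 here.
SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "github_token": re.compile(r"gh[ps]_[A-Za-z0-9_]{36,}"),
    "private_key": re.compile(r"-----BEGIN (RSA|EC|OPENSSH) PRIVATE KEY-----"),
    "connection_string": re.compile(r"[a-z]+://[^:]+:[^@]+@[^\s]+"),
}


def find_secrets(text: str) -> list:
    """Return (pattern_name, char_offset) for every match in text."""
    hits = []
    for name, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((name, match.start()))
    return hits


if __name__ == "__main__":
    print(find_secrets("key: AKIAIOSFODNN7EXAMPLE"))
    print(find_secrets("see https://example.com/path"))
```

Running candidate patterns against known-good and known-bad samples like this is one way to check the "Caught" and "Not caught" columns before deploying a rule.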
PII Detection
Pattern-based, region-aware. Covers MENA national ID formats plus global patterns. All patterns use RE2-compatible regex for guaranteed linear-time matching.
| Pattern | Format | Caught | Not caught |
|---|---|---|---|
| Saudi Iqama | [12]\d{9} | 10-digit numbers starting with 1 or 2 | Random 10-digit numbers in other contexts |
| UAE Emirates ID | 784-\d{4}-\d{7}-\d | 784-1990-1234567-8 | Partial matches without full format |
| Email | [a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,} | Standard email formats | Example domains (configurable exclusion) |
| Phone Number | E.164 and regional formats | +971501234567, +966512345678 | Short numeric sequences, port numbers |
| Passport | Country-specific formats | Standard passport number patterns | Arbitrary alphanumeric strings |
Proprietary Code / IP
File-path rules and keyword matching, customer-configured. Use for detecting references to internal systems, proprietary packages, or classified project names.
Example configurations
# Block internal package references
Pattern: "github\.com/acme-internal/.*"
Action: block
Description: "Internal packages must not be shared with LLM providers"
# Flag classified project names
Pattern: "project-(phoenix|atlas|titan)"
Action: flag
Description: "Classified project name detected — review context"
# Block proprietary headers
Pattern: "// PROPRIETARY AND CONFIDENTIAL"
Action: block
Description: "File marked proprietary — do not send to external services"
Keyword / Topic Blocking
Configurable keyword and phrase lists with exact match and basic glob support. Use for competitor names, restricted terms, or internal codenames.
Example configurations
# Block competitor name references
Keywords: ["CompetitorCo", "CompetitorProduct"]
Match: exact (case-insensitive)
Action: block
# Flag internal codenames
Keywords: ["project-*", "codename-*"]
Match: glob
Action: flag
4. Policy Actions
| Action | Behavior | Request outcome |
|---|---|---|
| block | Request rejected with structured HTTP 403 error. Violation match context logged. Webhook dispatched. | Rejected. Engineer sees error with rule description. |
| flag | Request forwarded to upstream. Match logged as warning. Webhook dispatched. | Allowed. Admin notified via webhook. |
Evaluation order
Rules are evaluated in the order defined by the admin (configurable via drag-and-drop in the Dashboard or the PUT /admin/v1/policies/reorder endpoint). The first block verdict wins and stops evaluation. Flag verdicts accumulate: if multiple rules flag the same request, all are recorded in matched_rules. A block after flags records both the block and all preceding flags.
Soft-rollout workflow
Best practice for deploying new policy rules without disrupting engineers:
- Deploy as flag: Create the rule with action flag. All matching requests are forwarded but logged as warnings.
- Monitor for 1-2 weeks: Use the Audit Explorer to review flagged requests. Check for false positives. Tune patterns as needed.
- Review with your team: Share flagged violations with engineering leads. Ensure the rule descriptions clearly explain why the rule exists and how to remediate.
- Switch to block: Once the false positive rate is acceptable, change the action to block via the Dashboard or API (PUT /admin/v1/policies/{id}).
- Communicate: Notify engineers before enforcement goes live. Include the rule description and remediation guidance.
Both flag and block actions trigger webhook notifications with identical payload structure (differing only in policy_verdict), so your alerting pipeline works identically during soft rollout and enforcement.
5. Error Codes
All error responses use a consistent JSON structure. Engineers see error.type, error.message, and error.request_id. Policy violations include matched_rules with admin-configured descriptions.
{
"error": {
"type": "<error_type>",
"message": "Human-readable description",
"request_id": "uuid",
"matched_rules": [ ... ] // only for policy_violation
}
}
| Error Type | HTTP | Description | Troubleshooting |
|---|---|---|---|
| policy_violation | 403 | Request blocked by a policy rule. | Check matched_rules for the rule ID and description. The description explains the rule and remediation steps. Remove the flagged content and retry. |
| model_not_allowed | 403 | Requested model not in tenant allowlist. | Check your tenant's allowed models via the Dashboard. Contact your admin to update the model allowlist. |
| auth_error | 401 / 403 | Invalid, expired, or revoked JWT. Missing admin role for admin endpoints. | Run debye status to check token validity. If expired, run debye login. If revoked (e.g., by admin), contact your administrator. |
| validation_error | 400 | Malformed request body, invalid parameters. | Check the request body for JSON syntax errors. Verify required fields are present. Check the error.message for specifics. |
| unknown_path | 403 | Unrecognized write endpoint blocked (fail-closed). | The proxy blocks unknown POST/PUT/PATCH/DELETE paths. If Anthropic added a new endpoint, wait for Debye to classify it. |
| upstream_error | 502 / 504 | Upstream Anthropic API returned an error or timed out (120s). | Check Anthropic's status page. 502 is typically a transient upstream error. 504 means the request timed out after 120s. Retry the request. |
| internal_error | 500 / 503 | Internal proxy error. Policy engine failure or Redis unreachable. | 503 typically means Redis is down (proxy fails closed). Check /readyz. Contact your platform admin. The proxy will recover automatically when Redis reconnects. |
| rate_limited | 429 | Tenant concurrency cap exceeded. | Too many concurrent requests from your organization. Retry with exponential backoff. Your admin can increase the concurrency cap. |
| payload_too_large | 413 | Request body exceeds tenant size limit (default 25MB). | Reduce the request payload. Consider splitting large content across multiple requests. Admin can adjust the limit. |
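The troubleshooting column recommends retrying rate_limited and upstream_error responses. A minimal client-side sketch of that advice follows; the `send` callable, the delay values, and the choice to treat 429, 502, and 504 as retryable are assumptions for illustration rather than documented Debye behavior.

```python
import random
import time
from typing import Callable, Tuple

# Assumption: statuses the table above describes as worth retrying.
RETRYABLE = {429, 502, 504}


def call_with_backoff(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send() until it returns a non-retryable status or attempts run out."""
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE or attempt == max_attempts - 1:
            return status, body
        # Exponential backoff: 0.5s, 1s, 2s, ... plus up to 100 ms of jitter.
        sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    raise ValueError("max_attempts must be at least 1")
```

Policy and auth errors (403, 401) are returned immediately, since retrying the same content or token would not change the outcome.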
6. Webhook Payload Format
On block or flag, an HTTP POST is sent to each configured webhook endpoint. Delivery is best-effort with a 5-second timeout and a single retry after 10 seconds. Use delivery_id for idempotency.
Full payload example
{
"delivery_id": "d4e5f6a7-b8c9-0d1e-2f3a-4b5c6d7e8f90",
"request_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"timestamp": "2026-04-11T14:32:07Z",
"tenant_id": "acme-corp",
"user_id": "alice@acme.com",
"model": "claude-sonnet-4-20250514",
"policy_verdict": "blocked",
"matched_rules": [
{
"rule_id": "secrets-aws-keys",
"rule_type": "secrets",
"action": "block",
"description": "AWS access keys must be stored in Key Vault, not in code.",
"match_location": {
"message_index": 2,
"role": "user",
"content_block_index": 0,
"char_offset": 142
}
},
{
"rule_id": "pii-email",
"rule_type": "pii",
"action": "flag",
"description": "Email address detected in prompt content.",
"match_location": {
"message_index": 1,
"role": "user",
"content_block_index": 0,
"char_offset": 87
}
}
]
}
Field reference
| Field | Type | Description |
|---|---|---|
| delivery_id | UUID | Unique per webhook delivery attempt. Use for idempotency deduplication. |
| request_id | UUID | Unique per proxy request. Same value across retries and multiple webhooks for the same event. |
| timestamp | ISO 8601 | Time the policy decision was made. |
| tenant_id | String | The tenant this event belongs to. |
| user_id | String | Email of the engineer whose request triggered the policy. |
| model | String | The Anthropic model requested (e.g., claude-sonnet-4-20250514). |
| policy_verdict | String | blocked or flagged. No webhook for allowed requests. |
| matched_rules | Array | All rules that matched. Each entry includes rule_id, rule_type, action, description, and match_location. |
| matched_rules[].match_location | Object | Location within the request: message_index, role, content_block_index, char_offset. |
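A receiver might use the fields above roughly as follows: deduplicate on delivery_id, then build an alert from policy_verdict and matched_rules. This is a sketch only; the `WebhookReceiver` class and its in-memory set are hypothetical, and a real receiver would verify the signature first and keep seen IDs in a durable store.

```python
import json


class WebhookReceiver:
    """Deduplicates deliveries by delivery_id and summarizes each event."""

    def __init__(self) -> None:
        self.seen = set()    # assumption: in-memory only, for illustration
        self.alerts = []

    def handle(self, raw_body: bytes) -> str:
        event = json.loads(raw_body)
        delivery_id = event["delivery_id"]
        if delivery_id in self.seen:
            return "duplicate"
        self.seen.add(delivery_id)
        rule_ids = [rule["rule_id"] for rule in event["matched_rules"]]
        summary = "{}: {} matched {}".format(
            event["policy_verdict"], event["user_id"], ", ".join(rule_ids)
        )
        self.alerts.append(summary)
        return "processed"
```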
Signature verification
Each webhook request includes an X-Debye-Signature header with an HMAC-SHA256 signature of the raw request body using the per-webhook secret (displayed once when the webhook is created via POST /admin/v1/webhooks).
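# Verify in Python
import hashlib
import hmac

def verify_webhook(body: bytes, signature: str, secret: str) -> bool:
    # Recompute the HMAC-SHA256 of the raw body with the per-webhook secret.
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time and accepts unequal lengths.
    return hmac.compare_digest(f"sha256={expected}".encode(), signature.encode())

// Verify in Node.js
const crypto = require('crypto');

function verifyWebhook(body, signature, secret) {
  const expected = crypto
    .createHmac('sha256', secret)
    .update(body)
    .digest('hex');
  const expectedBuf = Buffer.from(`sha256=${expected}`);
  const signatureBuf = Buffer.from(signature);
  // timingSafeEqual throws on unequal lengths, so compare lengths first.
  if (expectedBuf.length !== signatureBuf.length) return false;
  return crypto.timingSafeEqual(expectedBuf, signatureBuf);
}
7. API Reference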
All admin endpoints are under /admin/v1. Requests require a valid JWT with the admin role in the Authorization: Bearer <token> header.
Policies
| Method | Endpoint | Description |
|---|---|---|
| GET | /policies | List all policies (ordered by sort_order) |
| POST | /policies | Create a new policy rule |
| PUT | /policies/{id} | Update an existing policy |
| DELETE | /policies/{id} | Delete a policy |
| PUT | /policies/reorder | Reorder policies (first block wins) |
| PATCH | /policies/{id}/toggle | Enable or disable a policy |
| POST | /policies/test | Test policies against a sample payload |
Create policy — request example
POST /admin/v1/policies
Authorization: Bearer <jwt>
Content-Type: application/json
{
"rule_id": "secrets-aws-keys",
"rule_type": "secrets",
"action": "block",
"enabled": true,
"scope": ["user", "tool_result"],
"match_config": {
"pattern": "AKIA[0-9A-Z]{16}"
},
"description": "AWS access keys must be stored in Key Vault, not in code. See #security."
}
Create policy — response example
HTTP/1.1 201 Created
{
"id": "pol_01H8X3Y4Z5",
"rule_id": "secrets-aws-keys",
"rule_type": "secrets",
"action": "block",
"enabled": true,
"scope": ["user", "tool_result"],
"sort_order": 5,
"match_config": {
"pattern": "AKIA[0-9A-Z]{16}"
},
"description": "AWS access keys must be stored in Key Vault, not in code. See #security.",
"created_at": "2026-04-11T10:00:00Z",
"updated_at": "2026-04-11T10:00:00Z"
}
Test policy — request example
POST /admin/v1/policies/test
Authorization: Bearer <jwt>
Content-Type: application/json
{
"messages": [
{ "role": "user", "content": "Here is my key: AKIAIOSFODNN7EXAMPLE" }
]
}
Test policy — response example
HTTP/1.1 200 OK
{
"verdict": "blocked",
"matched_rules": [
{
"rule_id": "secrets-aws-keys",
"action": "block",
"match_location": { "message_index": 0, "role": "user", "char_offset": 16 }
}
]
}
Audit Logs
| Method | Endpoint | Description |
|---|---|---|
| GET | /audit-logs | Query with filters: user, verdict, model, time range, page/limit |
| GET | /audit-logs/{request_id} | Get single audit record by request ID |
| GET | /audit-logs/export | Export audit logs as CSV (with same filters) |
Query audit logs — example
GET /admin/v1/audit-logs?verdict=blocked&user=alice@acme.com&from=2026-04-01&to=2026-04-11&limit=50
Authorization: Bearer <jwt>
Violations
| Method | Endpoint | Description |
|---|---|---|
| GET | /violations | List violation match context records (admin only) |
| GET | /violations/{request_id} | Violation details with redacted match context for a specific request |
Users
| Method | Endpoint | Description |
|---|---|---|
| GET | /users | List all users seen via JWT claims |
| GET | /users/{user_id}/activity | User activity: total requests, violations, last active, model usage |
Tenant Configuration
| Method | Endpoint | Description |
|---|---|---|
| GET | /tenant/models | Get the model allowlist |
| PUT | /tenant/models | Update the model allowlist (empty = all allowed) |
Webhooks
| Method | Endpoint | Description |
|---|---|---|
| GET | /webhooks | List configured webhooks |
| POST | /webhooks | Create a webhook (returns signing secret once) |
| DELETE | /webhooks/{id} | Delete a webhook |
| POST | /webhooks/{id}/rotate-secret | Rotate the webhook signing secret |
Token Revocation
| Method | Endpoint | Description |
|---|---|---|
| POST | /revocations | Revoke tokens by JTI or user ID |
Revoke by user — example
POST /admin/v1/revocations
Authorization: Bearer <jwt>
Content-Type: application/json
{
"type": "user",
"user_id": "alice@acme.com",
"reason": "Employee offboarded"
}
8. Architecture Overview
Debye operates as a transparent HTTP reverse proxy between engineering tools and the Anthropic API. The proxy authenticates, inspects, and forwards requests. A separate Admin API and Dashboard provide management capabilities.
Request flow diagram
Engineer Workstation                    Debye Infrastructure                      Upstream
====================                    ======================                    ========
+------------------+     HTTPS/JWT      +---------------------+       HTTPS       +-----------+
| Claude Code /    | -----------------> | Debye Proxy         | ----------------> | Anthropic |
| Cursor /         |                    |                     |                   | API       |
| Windsurf /       | <-- SSE stream --- | 1. TLS termination  | <-- SSE stream -- |           |
| Custom tool      |                    | 2. JWT validation   |                   +-----------+
+------------------+                    | 3. Policy eval      |
         |                              | 4. Upstream forward |
         | debye wrap                   | 5. Usage extract    |
         v                              | 6. Audit log write  |
+------------------+                    +---------------------+
| Debye CLI        |                          |         |
| - OIDC login     |                          |         |
| - JWT mgmt       |                    +-----+         +------+
| - env setup      |                    |                      |
+------------------+                    v                      v
                                  +------------+        +-------------+
                                  | PostgreSQL |        | Redis       |
                                  | - Tenants  |        | - Revoke    |
                                  | - Policies |        | - PubSub    |
                                  | - Audit    |        | - Streams   |
                                  +------------+        +-------------+
Admin Browser                           |
=============                           |
+------------------+  HTTPS/JWT   +-----+-------+
| Admin Dashboard  | <----------> | Admin API   |
| (React/Next.js)  |              | (Go)        |
+------------------+              +-------------+
Request lifecycle (10 steps)
- Engineer's tool sends an HTTPS POST to the proxy endpoint (JWT used as API key via debye wrap)
- Proxy terminates TLS and validates the JWT (signature, expiry, kid to tenant key match, Redis revocation check)
- Extracts tenant context (customer ID, user identity, policy set) from JWT claims
- Checks requested path against route table (scanned, pass-through, or blocked)
- For scanned paths: buffers full request body, parses as JSON
- Policy engine evaluates against all applicable rules (each message block and tool result scanned)
- Block: reject with structured error, log violation with match context, dispatch webhook
- Flag: forward request, log match as warning, dispatch webhook
- Allow: inject upstream Anthropic API key, forward to api.anthropic.com, stream SSE response back
- On completion/error: extract token usage from SSE stream, write audit record to PostgreSQL
Path routing
| Category | Paths | Behavior |
|---|---|---|
| Scanned | POST /v1/messages, POST /v1/messages/count_tokens | Full body buffered, parsed, policy engine evaluated |
| Pass-through | GET /v1/models, other read-only endpoints | Forwarded with JWT-to-API-key swap, no body inspection |
| Blocked | Unknown write endpoints (POST/PUT/PATCH/DELETE to unrecognized paths) | Rejected with HTTP 403. Fail-closed for writes. |
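The routing table can be read as a small classification function. The sketch below is an approximation under one stated assumption: any request whose method is not POST, PUT, PATCH, or DELETE is treated as read-only pass-through. The `classify` name is hypothetical.

```python
# Scanned routes, taken from the table above.
SCANNED = {("POST", "/v1/messages"), ("POST", "/v1/messages/count_tokens")}
WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def classify(method: str, path: str) -> str:
    """Return 'scanned', 'pass-through', or 'blocked' for a request."""
    method = method.upper()
    if (method, path) in SCANNED:
        return "scanned"
    if method in WRITE_METHODS:
        # Fail-closed: an unrecognized write endpoint is rejected with HTTP 403.
        return "blocked"
    # Read-only requests are forwarded without body inspection.
    return "pass-through"
```

Checking the scanned set before the write-method test is what keeps known write endpoints from being caught by the fail-closed branch.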
9. Authentication Flow
Debye uses OIDC-based SSO for all human users. Any OIDC-compliant IdP (Entra ID, Okta, Google Workspace) is supported via per-tenant configuration. A single auth infrastructure serves both the proxy and the Admin API.
End-to-end OIDC flow
Engineer CLI Browser/IdP Proxy
======== === =========== =====
debye login --org acme
|
+---> Resolve tenant
| config (proxy URL,
| IdP issuer)
|
+---> Start local callback
| server (127.0.0.1)
|
+---> Open browser ------> IdP login page
| (Entra ID / Okta)
| |
| User authenticates
| (MFA if configured)
| |
| Authorization code
| <--- callback ----------- + PKCE verifier
|
+---> Exchange code for
| ID token (with IdP)
|
+---> POST /auth/token -----------------------------------------> Validate ID token
| { id_token, org } Check org match
| Map groups -> roles
| Issue JWT (24h) +
| <---------------------------------------------------------- refresh token (30d)
|
+---> Store JWT + refresh
| in OS keychain
|
+---> Done. Ready for
debye wrap.
Token lifecycle
| Token | Lifetime | Storage | Refresh behavior |
|---|---|---|---|
| JWT (access) | 24 hours | OS keychain / ~/.debye/credentials | CLI auto-refreshes if within 1 hour of expiry before each debye wrap |
| Refresh token | 30 days (configurable) | Same as JWT | Exchanged at /auth/refresh for new JWT. If expired, full re-login required. |
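The refresh behavior in the table reduces to a three-way decision made before each wrapped command. The following sketch models that decision; the function name `next_action` and its return labels are illustrative, not CLI internals.

```python
from datetime import datetime, timedelta, timezone

# The CLI refreshes when the JWT is within 1 hour of expiry.
REFRESH_WINDOW = timedelta(hours=1)


def next_action(jwt_exp: datetime, refresh_exp: datetime, now: datetime) -> str:
    """Decide what would happen before launching a wrapped command."""
    if jwt_exp - now > REFRESH_WINDOW:
        return "use_jwt"
    if refresh_exp > now:
        return "refresh"   # exchange the refresh token at /auth/refresh
    return "login"         # refresh token expired: full re-login required


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(next_action(now + timedelta(minutes=30), now + timedelta(days=10), now))
```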
When re-login is needed
- Refresh token has expired (more than 30 days since last login)
- Token has been revoked by an admin (via POST /admin/v1/revocations)
- Tenant has been deactivated
- After running debye logout
- Switching to a different tenant (debye login --org <other-tenant>)
JWT claims
| Claim | Type | Purpose |
|---|---|---|
| sub | String (email) | User identity |
| org | String | Tenant ID — determines policies, keys, audit scope |
| groups | String[] | IdP group memberships for RBAC |
| roles | String[] | Proxy-determined: [admin], [engineer], or both |
| jti | UUID | Token ID for revocation |
| kid | String | JWT header — identifies tenant signing key. Must match org during validation. |
| exp / iat | Unix timestamp | Expiry (24h from issuance) and issuance time |
| aud | String | Audience — set to proxy endpoint URL. Prevents token replay across environments. |
| scope | String[] | Permitted actions (e.g., admin API access) |
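To show how the kid, org, aud, and exp fields interact, here is a sketch of the claim checks applied to an already-decoded token. Signature verification and the revocation lookup are deliberately left out, and the `KEY_OWNERS` registry and key IDs are hypothetical.

```python
import time
from typing import Optional

# Hypothetical registry mapping signing-key IDs to the tenant that owns them.
KEY_OWNERS = {"acme-2026-04": "acme-corp", "globex-2026-03": "globex"}


def validate_claims(
    header: dict, claims: dict, expected_aud: str, now: Optional[float] = None
) -> Optional[str]:
    """Return None if the token passes these checks, else a reason string."""
    now = time.time() if now is None else now
    owner = KEY_OWNERS.get(header.get("kid"))
    if owner is None:
        return "unknown kid"
    if owner != claims.get("org"):
        # A token signed with tenant A's key but carrying tenant B's org.
        return "kid does not belong to org"
    if claims.get("aud") != expected_aud:
        return "audience mismatch"
    if claims.get("exp", 0) <= now:
        return "token expired"
    return None
```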
10. Policy Engine Deep Dive
Content block scanning
The policy engine scans the following content types from the Anthropic Messages API request body:
| Content type | Source | What is scanned |
|---|---|---|
| System prompt | system field | Full text content |
| User messages | messages[].role == "user" | All text content blocks |
| Tool results | tool_result content blocks | Text returned from local tool execution |
| Assistant tool use | Previous assistant turns | Tool-use input parameters (replayed in conversation history) |
Scope filtering
Each rule specifies which message roles it applies to via the scope field:
"scope": ["user", "tool_result"] // Only scan user messages and tool results
"scope": ["system"] // Only scan system prompts
"scope": ["all"] // Scan everything (default)
This allows fine-grained control. For example, you might want to block AWS keys in user messages and tool results but allow them in system prompts (where an admin might configure instructions mentioning key formats).
Rule evaluation order
Rules are evaluated sequentially in the order defined by the admin:
- Each rule is tested against all applicable content blocks (filtered by scope)
- A block verdict stops evaluation immediately. The request is rejected.
- Flag verdicts accumulate. Evaluation continues to find all flags.
- If a block follows flags, all preceding flags are also recorded in matched_rules.
- If no rules match, the request is allowed.
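The evaluation rules above, combined with scope filtering, can be sketched in a few lines. This is a simplified model rather than the engine itself: `Rule`, `evaluate`, and the (role, text) block representation are assumptions, and Python's `re` stands in for RE2.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Rule:
    rule_id: str
    action: str                      # "block" or "flag"
    pattern: str
    scope: list = field(default_factory=lambda: ["all"])


def evaluate(rules: list, blocks: list) -> dict:
    """rules: ordered list of Rule; blocks: list of (role, text) tuples."""
    matched = []
    for rule in rules:
        # Only content blocks whose role is in the rule's scope are tested.
        in_scope = [text for role, text in blocks
                    if "all" in rule.scope or role in rule.scope]
        if not any(re.search(rule.pattern, text) for text in in_scope):
            continue
        matched.append({"rule_id": rule.rule_id, "action": rule.action})
        if rule.action == "block":
            # First block wins; flags recorded so far are kept.
            return {"verdict": "blocked", "matched_rules": matched}
    verdict = "flagged" if matched else "allowed"
    return {"verdict": verdict, "matched_rules": matched}
```

Because evaluation stops at the first block, rules ordered after it are never tested, which is why rule order matters when several rules could match the same request.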
Unknown content block handling
When the Anthropic API introduces new content block types, the proxy must decide how to handle them. Two modes are available, configurable per-tenant:
| Mode | Behavior | Trade-off |
|---|---|---|
| Block (default) | Requests containing unrecognized content block types are rejected with a clear error identifying the unknown type. | Maximum security. Brief disruption when Anthropic introduces new types. |
| Skip with warning | Unrecognized types are not scanned. Request forwarded. Audit log records which types were skipped. Webhook dispatched. | Higher availability. Potential for data to bypass scanning via novel block types. |
JSON value extraction
When scanning tool_result blocks and tool-use input parameters that contain JSON, the engine recursively extracts string values from the JSON structure. JSON keys, structural tokens, and content block metadata are excluded from scanning. This prevents false positives where a keyword like "password" matches a JSON key name rather than actual sensitive content.
Scanning depth is bounded to prevent abuse:
- Maximum nesting depth: 10 levels
- Maximum nodes per content block: 10,000
- Blocks exceeding these limits are treated as unknown content blocks (see above)
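A bounded extraction along these lines might look like the sketch below. It walks the structure with an explicit stack instead of recursion, and it assumes the outermost container counts as depth 1; how the engine actually counts depth and nodes is not specified here. `extract_strings` and `LimitExceeded` are hypothetical names.

```python
MAX_DEPTH = 10
MAX_NODES = 10_000


class LimitExceeded(Exception):
    """Raised when a structure exceeds the depth or node bounds."""


def extract_strings(value, max_depth=MAX_DEPTH, max_nodes=MAX_NODES):
    """Collect string values (never keys) from parsed JSON, in document order."""
    strings = []
    nodes = 0
    stack = [(value, 1)]
    while stack:
        node, depth = stack.pop()
        nodes += 1
        if nodes > max_nodes:
            raise LimitExceeded("too many nodes")
        if isinstance(node, str):
            strings.append(node)
        elif isinstance(node, (dict, list)):
            if depth > max_depth:
                raise LimitExceeded("nesting too deep")
            children = list(node.values()) if isinstance(node, dict) else node
            # Push in reverse so that popping preserves document order.
            stack.extend((child, depth + 1) for child in reversed(children))
    return strings
```

A caller would treat `LimitExceeded` the same way as an unknown content block, following the tenant's block or skip-with-warning setting.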
Performance characteristics
- Target: < 5ms policy evaluation (common case, no match)
- Execution: In-process, no RPC overhead
- Regex engine: Go regexp package (RE2 only). Guarantees linear-time matching. Immune to ReDoS.
- Compilation: All regex patterns pre-compiled at config load time
- Cache: Policy set cached in-memory per tenant. Updated via atomic swap (no partial updates).
- Budget rationale: Proxy is on the critical path with ~200ms base network RTT. Policy eval must be negligible.
- Rule limits: Maximum 200 rules per tenant. Custom regex max 1KB pattern length.
11. Audit Log Schema
Every request through the proxy generates an audit record written to PostgreSQL before the response is sent to the client. The audit log is the authoritative compliance record. Request and response bodies are never stored.
| Field | Type | Description |
|---|---|---|
| request_id | UUID | Unique identifier for this request. Primary key. |
| session_id | UUID / null | Heuristic session grouping. Requests within 5 minutes from the same user are grouped. Explicitly heuristic, not authoritative for forensics. |
| tenant_id | String | From JWT org claim. All queries are tenant-scoped. |
| user_id | String | From JWT sub claim (user email). |
| timestamp_start | ISO 8601 | When the request was received by the proxy. |
| timestamp_end | ISO 8601 | When the response completed or an error occurred. |
| model | String | Requested model identifier (e.g., claude-sonnet-4-20250514). |
| message_count | Integer | Number of messages in the conversation history. |
| role_sequence | String[] | Ordered list of message roles in the request (e.g., ["user", "assistant", "user"]). |
| tool_names | String[] | Names of tools referenced in the request. |
| file_paths | String[] | File paths or URLs found in tool calls or results. |
| input_tokens | Integer / null | Input token count extracted from SSE usage. Null if request was blocked. |
| output_tokens | Integer / null | Output token count from SSE usage. Null if blocked. |
| policy_verdict | Enum | allowed, blocked, or flagged. |
| matched_rules | JSON[] | Array of matched rule objects with rule ID, match type, and location (offset, message index). |
| skipped_block_types | String[] | Unrecognized content block types that were skipped during scanning (only when tenant uses "skip with warning" mode). |
| upstream_status | Integer / null | HTTP status code from Anthropic. Null if request was blocked before forwarding. |
| response_bytes | Integer | Total bytes in the upstream response. |
| ttfb_ms | Integer / null | Time to first byte from upstream (milliseconds). Null if blocked. |
| duration_ms | Integer | Total request-response duration in milliseconds. |
| error | String / null | Error description if the request failed. Null on success. |
Violation match context (separate record)
On block, the proxy stores match context alongside the audit record. Full request bodies are never stored. This prevents creating a searchable archive of the sensitive data the product is designed to protect.
| Field | Description |
|---|---|
| request_id | Links to the audit record |
| rule_id | The rule that triggered the violation |
| rule_type | Category (secrets, pii, ip, keyword) |
| message_index | Which message in the conversation array |
| role | Role of the message (user, system, tool_result, assistant) |
| char_offset | Character offset of the match within the content block |
| redacted_context | 20 chars before and after the match, with match replaced by [REDACTED] |
| match_hmac | HMAC-SHA256 of the matched content (per-tenant key). For cross-violation correlation, not recovery. |
Match context records have a 30-day retention (configurable per tenant), separate from the audit log retention (default 1 year). This shorter retention reflects the sensitivity of the data referenced in violation context.
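The redacted_context and match_hmac fields could be derived roughly as shown below. This is a sketch under assumptions: the function name, the UTF-8 encoding of the match, and the hex digest format are illustrative choices, not documented behavior.

```python
import hashlib
import hmac

CONTEXT_CHARS = 20  # characters kept on each side of the match


def build_match_context(text: str, start: int, end: int, tenant_key: bytes) -> dict:
    """Build a violation record for the match text[start:end]."""
    before = text[max(0, start - CONTEXT_CHARS):start]
    after = text[end:end + CONTEXT_CHARS]
    matched = text[start:end]
    return {
        "char_offset": start,
        # The matched content itself never appears in the stored record.
        "redacted_context": f"{before}[REDACTED]{after}",
        # Keyed hash: identical secrets correlate within a tenant, but the
        # value cannot be confirmed offline without the tenant key.
        "match_hmac": hmac.new(
            tenant_key, matched.encode("utf-8"), hashlib.sha256
        ).hexdigest(),
    }
```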
12. Security Model
Network enforcement
Debye uses mandatory network-level enforcement, not opt-in:
- Firewall rules: Block all outbound to api.anthropic.com on ports 443/80
- DNS/env config: Tools point to the proxy via the wrapper CLI or group policy
- TLS termination: Proxy terminates TLS with its own certificate (does not impersonate Anthropic)
- Bypass prevention: Engineers who set ANTHROPIC_BASE_URL manually still go through the proxy (firewall prevents direct upstream access) and still need a valid JWT
JWT security
- Per-tenant signing keys: Stored in Azure Key Vault. A shared key across tenants would mean a single compromised key forges JWTs for all tenants.
- Key rotation: Two active keys per tenant (old + new). New tokens issued with new key. Old key removed after all tokens expire (24h max).
- kid to org binding: A JWT's kid must map to a signing key belonging to the tenant in the org claim. A token signed with tenant A's key but carrying tenant B's org is rejected.
- Audience validation: aud claim set to proxy endpoint URL. Prevents token replay across environments (staging vs production).
Token revocation
- Redis denylist: Revoked jti values stored in Redis, checked on every request. Entries auto-expire when the JWT would have expired.
- Fail-closed on Redis outage: If Redis is unreachable, the proxy cannot verify the revocation denylist and rejects ALL requests with HTTP 503. This prevents revoked tokens (e.g., terminated employee) from being used during an outage.
- Recovery: Azure Cache for Redis Premium tier provides 99.95% SLA. /readyz returns 503 when Redis is down, so the container orchestrator stops routing traffic to affected instances.
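The fail-closed behavior can be illustrated with a small sketch. The `is_revoked` callable stands in for the Redis denylist lookup and is assumed to raise `ConnectionError` when Redis is unreachable; the function name and return labels are hypothetical.

```python
from typing import Callable


def check_revocation(jti: str, is_revoked: Callable[[str], bool]) -> str:
    """Return the outcome of the revocation check for one request."""
    try:
        revoked = is_revoked(jti)
    except ConnectionError:
        # Denylist unreachable: fail closed, which maps to HTTP 503.
        return "service_unavailable"
    # A revoked token maps to the auth_error type in the error table.
    return "auth_error" if revoked else "ok"
```

The important property is that an unreachable denylist never falls through to "ok": an outage rejects requests instead of letting a revoked token pass.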
Tenant isolation
| Layer | Isolation mechanism |
|---|---|
| Authentication | JWT org claim determines tenant. No valid org = rejected. |
| Policy engine | Policies loaded per-tenant, cached in-memory. Tenant-scoped read/write. |
| Upstream keys | Each tenant has own key(s) in Key Vault. No sharing. |
| Audit log | All records tagged with tenant ID. Queries filtered. No cross-tenant view. |
| Admin API | Admin users scoped to their tenant via JWT org. No cross-tenant access. |
| IdP config | Each tenant configures own OIDC issuer, client ID, audiences. |
Match context redaction
The proxy never stores full blocked request bodies. Storing complete bodies would create a curated archive of exactly the sensitive data Debye is designed to protect. Instead, on block, only redacted match context is stored: 20 characters before and after the match, with the matched content replaced by [REDACTED] and an HMAC-SHA256 of the matched content using a per-tenant server-side key (for correlation, not recovery).
Plain SHA-256 would allow offline precomputation to confirm known secrets. HMAC prevents this.
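A minimal sketch of building such a record, using Python's standard hmac module. The redact_match function and the record field names are hypothetical; the 20-character window, the [REDACTED] marker, and the keyed HMAC-SHA256 follow the description above.

```python
import hashlib
import hmac

CONTEXT_CHARS = 20  # characters kept on each side of the match


def redact_match(text, start, end, tenant_key):
    """Build a redacted context record for the match text[start:end]."""
    matched = text[start:end]
    before = text[max(0, start - CONTEXT_CHARS):start]
    after = text[end:end + CONTEXT_CHARS]
    # Keyed digest: allows correlating repeat leaks of the same value
    # without letting anyone who lacks the key confirm a guessed secret.
    digest = hmac.new(tenant_key, matched.encode("utf-8"), hashlib.sha256).hexdigest()
    return {
        "context": f"{before}[REDACTED]{after}",
        "match_hmac": digest,
    }
```

Two violations involving the same secret within one tenant produce the same match_hmac, while the same secret in another tenant (different key) produces an unrelated value.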
Webhook security
- HMAC signing: Each webhook includes X-Debye-Signature with HMAC-SHA256 of the body using a per-webhook secret.
- SSRF protection: Webhook URLs must be HTTPS, must resolve to a public IP. RFC 1918, link-local, and loopback addresses are rejected. DNS resolution performed at both configuration time and delivery time.
- Secret rotation: Webhook secrets can be rotated via POST /admin/v1/webhooks/{id}/rotate-secret. Old secret invalidated immediately.
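The SSRF rule can be approximated with the standard ipaddress and socket modules. This is a sketch under assumptions: validate_webhook_url and its injectable resolve parameter are hypothetical names, and the extra rejection of multicast, reserved, and unspecified addresses goes beyond the ranges listed above.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_public_ip(addr):
    ip = ipaddress.ip_address(addr)
    # is_private covers RFC 1918; the other flags catch loopback,
    # link-local, and a few additional non-routable ranges.
    return not (ip.is_private or ip.is_loopback or ip.is_link_local
                or ip.is_multicast or ip.is_reserved or ip.is_unspecified)


def validate_webhook_url(url, resolve=None):
    """Return True if url is HTTPS and every resolved address is public."""
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.hostname:
        return False
    if resolve is None:
        def resolve(host):
            infos = socket.getaddrinfo(host, parts.port or 443,
                                       proto=socket.IPPROTO_TCP)
            return [info[4][0] for info in infos]
    try:
        addresses = resolve(parts.hostname)
    except OSError:  # includes socket.gaierror (resolution failure)
        return False
    return bool(addresses) and all(is_public_ip(a) for a in addresses)
```

Running the same check at configuration time and again at delivery time guards against a hostname that resolved to a public address when registered and is later repointed at an internal one.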
13. Deployment Guide
Azure infrastructure requirements
| Component | Azure service | Purpose |
|---|---|---|
| Proxy runtime | Azure Container Apps or AKS | Horizontally scalable proxy instances |
| Database | Azure Database for PostgreSQL | Tenant config, policies, audit log |
| Cache / revocation | Azure Cache for Redis (Premium) | JWT revocation denylist, policy cache pub/sub, real-time streams |
| Secret store | Azure Key Vault | Upstream API keys, JWT signing keys |
| Audit archive | Azure Blob Storage (GRS) | Compressed audit records beyond retention period |
| TLS termination | Azure App Gateway or proxy-native | Customer-facing HTTPS |
Region: Azure MENA regions (UAE North or Qatar Central) to align with customers' regional data residency requirements.
Docker deployment
# Pull the proxy image
docker pull ghcr.io/fusionminds/debye-proxy:latest
# Run with required environment variables
docker run -d \
  --name debye-proxy \
  -p 443:8443 \
  -e DEBYE_DATABASE_URL="postgres://user:pass@host:5432/debye" \
  -e DEBYE_REDIS_URL="redis://host:6379" \
  -e DEBYE_KEY_VAULT_URL="https://vault.azure.net" \
  -e DEBYE_TLS_CERT_PATH="/certs/tls.crt" \
  -e DEBYE_TLS_KEY_PATH="/certs/tls.key" \
  -v /path/to/certs:/certs:ro \
  ghcr.io/fusionminds/debye-proxy:latest
Environment variables reference
| Variable | Required | Description |
|---|---|---|
| DEBYE_DATABASE_URL | Yes | PostgreSQL connection string for tenant config, policies, and audit log |
| DEBYE_REDIS_URL | Yes | Redis connection string for revocation denylist, pub/sub, and streams |
| DEBYE_KEY_VAULT_URL | Yes | Azure Key Vault URL for signing keys and upstream API keys |
| DEBYE_TLS_CERT_PATH | Yes | Path to TLS certificate file |
| DEBYE_TLS_KEY_PATH | Yes | Path to TLS private key file |
| DEBYE_LISTEN_ADDR | No | Listen address and port (default: :8443) |
| DEBYE_UPSTREAM_URL | No | Anthropic API base URL (default: https://api.anthropic.com) |
| DEBYE_UPSTREAM_TIMEOUT | No | Upstream request timeout (default: 120s) |
| DEBYE_MAX_BODY_SIZE | No | Maximum request body size (default: 25MB) |
| DEBYE_LOG_LEVEL | No | Logging level: debug, info, warn, error (default: info) |
| DEBYE_LOG_FORMAT | No | Log format: json, text (default: json) |
| DEBYE_METRICS_ADDR | No | Prometheus metrics listen address (default: :9090) |
| DEBYE_KEY_VAULT_REFRESH | No | Key Vault refresh interval (default: 5m) |
| DEBYE_POLICY_CACHE_POLL | No | Policy cache polling fallback interval (default: 30s) |
| DEBYE_CORS_ORIGIN | No | Allowed CORS origin for Admin Dashboard (no wildcard) |
| DEBYE_STREAM_TRIM_INTERVAL | No | Redis Streams trim interval for real-time feed (default: 1h) |
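A pre-flight check for a deployment script might look like the sketch below. It is not part of the proxy; load_config is a hypothetical helper that mirrors the table above, failing on missing required variables and filling in the documented defaults.

```python
import os

REQUIRED = [
    "DEBYE_DATABASE_URL",
    "DEBYE_REDIS_URL",
    "DEBYE_KEY_VAULT_URL",
    "DEBYE_TLS_CERT_PATH",
    "DEBYE_TLS_KEY_PATH",
]

# Defaults copied from the reference table; DEBYE_CORS_ORIGIN has none.
DEFAULTS = {
    "DEBYE_LISTEN_ADDR": ":8443",
    "DEBYE_UPSTREAM_URL": "https://api.anthropic.com",
    "DEBYE_UPSTREAM_TIMEOUT": "120s",
    "DEBYE_MAX_BODY_SIZE": "25MB",
    "DEBYE_LOG_LEVEL": "info",
    "DEBYE_LOG_FORMAT": "json",
    "DEBYE_METRICS_ADDR": ":9090",
    "DEBYE_KEY_VAULT_REFRESH": "5m",
    "DEBYE_POLICY_CACHE_POLL": "30s",
    "DEBYE_STREAM_TRIM_INTERVAL": "1h",
}


def load_config(env=None):
    """Return the effective settings, or raise if a required one is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    config = dict(DEFAULTS)
    for name, value in env.items():
        if name.startswith("DEBYE_"):
            config[name] = value  # explicit values override defaults
    return config
```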
Health check endpoints
| Endpoint | Purpose | Checks |
|---|---|---|
| /healthz | Liveness probe | HTTP 200 if process is running. No dependency checks. For container restart decisions. |
| /readyz | Readiness probe | HTTP 200 if Redis reachable, Key Vault loaded, policy cache warm. HTTP 503 otherwise. |
| /metrics | Prometheus metrics | Request latency, policy eval duration, error rates, active connections, memory, upstream pool stats, Redis health. |
All health and metrics endpoints are unauthenticated and not exposed through the public load balancer. They are accessible only from the container orchestrator health check network.
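The readiness decision described in the table reduces to "ready only if every dependency check passes". The sketch below illustrates that logic; the function name and the response body shape are assumptions, not the proxy's actual output format.

```python
def readiness(redis_reachable, key_vault_loaded, policy_cache_warm):
    """Return (http_status, body) for a /readyz-style probe."""
    checks = {
        "redis": redis_reachable,
        "key_vault": key_vault_loaded,
        "policy_cache": policy_cache_warm,
    }
    failing = [name for name, ok in checks.items() if not ok]
    status = 503 if failing else 200
    # Listing the failing checks gives operators the specific reason.
    return status, {"ready": not failing, "failing": failing}
```

By contrast, a liveness probe such as /healthz would return 200 unconditionally while the process is running, so a Redis outage removes instances from rotation without triggering restarts.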
Graceful shutdown
On SIGTERM: the proxy stops accepting new connections immediately. In-flight streaming requests are terminated (no drain timeout). This is a deliberate choice: LLM streaming can run for minutes, and drain would make deployments unpredictably slow. Engineers see a connection error and retry. Rolling updates via Azure Container Apps limit the number of instances restarting at once.
14. FAQ / Troubleshooting
"My request was blocked but I don't know why"
The error response includes a matched_rules array with each rule's description field. This description is written by your admin and should explain why the rule exists and how to remediate. Check the error message first.
If you need more detail, ask your admin to look up the request_id in the Audit Explorer. The violation details show which message index and role triggered the match, along with redacted context around the match location.
# The error response includes:
{
  "error": {
    "type": "policy_violation",
    "request_id": "abc-123",
    "matched_rules": [
      {
        "rule_id": "secrets-aws-keys",
        "description": "AWS access keys must be stored in Key Vault. See #security."
      }
    ]
  }
}
"debye login opens browser but nothing happens"
This typically means the OIDC callback is not reaching the CLI. Check the following:
- Verify your IdP configuration has the correct redirect URI for the Debye CLI callback
- Check that no firewall or proxy is blocking the localhost callback (127.0.0.1)
- Ensure the tenant ID is correct: debye login --org <correct-tenant-id>
- Check if your IdP requires MFA and whether your browser session has an active MFA challenge
- Try closing all browser tabs and running debye login again
If the issue persists, check your IdP admin console for failed authentication attempts.
"Getting 503 errors on every request"
HTTP 503 means the proxy is not ready. The most common cause is Redis being unreachable. Debye fails closed: if it cannot check the revocation denylist, it rejects all requests rather than risk allowing revoked tokens.
- Check the /readyz endpoint for the specific failure reason
- Verify Redis connectivity from the proxy instances
- Check Azure Cache for Redis health in the Azure portal
- If Redis was recently restarted, the proxy should auto-recover within seconds
The proxy will automatically resume normal operation when Redis becomes available.
"Token expired mid-session (401 error)"
JWTs have a 24-hour lifetime. If a coding session exceeds 24 hours, the next request will receive a 401 error. This is expected behavior.
- Simply re-run debye wrap <command> to get a fresh token
- The CLI auto-refreshes if the token is within 1 hour of expiry
- If the refresh token has also expired (30 days), run debye login
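The three cases above amount to a small decision rule. The sketch below is an illustration of that rule only; next_action and its return values are hypothetical and do not describe the CLI's internals.

```python
from datetime import datetime, timedelta, timezone

REFRESH_WINDOW = timedelta(hours=1)  # refresh when this close to expiry


def next_action(now, access_expires_at, refresh_expires_at):
    """Return 'use', 'refresh', or 'login' for the stored credentials."""
    if access_expires_at - now > REFRESH_WINDOW:
        return "use"      # JWT still comfortably valid
    if refresh_expires_at > now:
        return "refresh"  # JWT expired or nearly so; refresh token still valid
    return "login"        # both expired: interactive login required


# Example: a JWT issued 23.5 hours ago with a 24-hour lifetime.
now = datetime(2025, 1, 2, 12, 0, tzinfo=timezone.utc)
issued = now - timedelta(hours=23, minutes=30)
action = next_action(now, issued + timedelta(hours=24), issued + timedelta(days=30))
```

In the example, only 30 minutes of validity remain, which is inside the one-hour window, so action is "refresh".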
"New Anthropic feature broke my workflow"
When Anthropic introduces new content block types, the proxy must decide how to handle them. The default behavior is block: requests containing unrecognized content block types are rejected.
- Check if your tenant uses the "block" or "skip with warning" mode for unknown block types
- Ask your admin to switch to "skip with warning" if availability is more important than strict scanning
- Fusionminds will add native support for new block types; check for proxy updates
- The error response will identify exactly which content block type is unrecognized
"Webhook not firing"
Webhook delivery is best-effort with a 5-second timeout and one retry after 10 seconds. If webhooks are not reaching your endpoint:
- SSRF validation: Webhook URLs must be HTTPS and resolve to a public IP. RFC 1918, link-local, and loopback addresses are rejected. Internal endpoints will not work.
- DNS resolution: DNS is checked at both configuration time and delivery time. Verify your domain resolves correctly.
- Signature verification: If your endpoint is rejecting requests, check that you are verifying the X-Debye-Signature header correctly (HMAC-SHA256 of the raw body).
- Timeout: Your endpoint must respond within 5 seconds. Long-running processing should be done asynchronously.
- Idempotency: Use delivery_id to deduplicate. The retry may deliver the same event twice.
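A receiving endpoint could combine the signature and idempotency checks roughly as follows. This sketch assumes the X-Debye-Signature header carries the hex-encoded digest and that delivery_id is a top-level field of the JSON body; neither detail is confirmed above, so check your actual payloads. The WebhookReceiver class is hypothetical.

```python
import hashlib
import hmac
import json


def verify_signature(raw_body, header_value, secret):
    """Compare the header against HMAC-SHA256 of the raw, unparsed body."""
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking digest prefixes via timing.
    return hmac.compare_digest(expected.encode(), header_value.encode())


class WebhookReceiver:
    def __init__(self, secret):
        self.secret = secret
        self.seen = set()      # in production, use a store with expiry
        self.processed = []

    def handle(self, raw_body, signature):
        """Return the HTTP status to send back to the webhook sender."""
        if not verify_signature(raw_body, signature, self.secret):
            return 401
        event = json.loads(raw_body)
        delivery_id = event["delivery_id"]  # assumed field location
        if delivery_id in self.seen:
            return 200  # duplicate from the retry: acknowledge, do nothing
        self.seen.add(delivery_id)
        self.processed.append(event)  # hand off to async processing here
        return 200
```

Verifying against the raw bytes matters: re-serializing the parsed JSON before hashing can change whitespace or key order and cause spurious mismatches.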
"Request rejected with 413 (payload too large)"
The proxy enforces a per-tenant request body size limit (default 25MB). This accommodates large requests with base64-encoded images while preventing unbounded memory consumption.
- Check the size of your request payload, especially base64-encoded content
- Consider splitting large content across multiple requests
- Your admin can increase the limit via the Admin API
"Getting 429 (rate limited) errors"
Debye enforces a per-tenant concurrency cap (default 50 concurrent requests). If your organization is hitting this limit:
- Retry with exponential backoff (your LLM tool's SDK should handle this)
- If 429 errors are frequent, ask your admin to increase the concurrency cap
- Note: Debye also passes through upstream Anthropic 429 responses unchanged
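For clients that do not retry automatically, exponential backoff with jitter can be sketched as below. The call_with_backoff helper, its parameters, and the choice of "full jitter" are illustrative assumptions; send is any callable that performs the request and returns the HTTP status code.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out."""
    for attempt in range(max_attempts):
        status = send()
        if status != 429:
            return status
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        # Full jitter: sleep a random amount up to the capped exponential
        # delay so that many clients do not retry in lockstep.
        delay = min(max_delay, base_delay * (2 ** attempt))
        sleep(random.uniform(0, delay))
    return 429
```

Because upstream Anthropic 429 responses are passed through unchanged, the same retry loop covers both the Debye concurrency cap and upstream rate limits.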