proxx

OpenAI-to-Anthropic API proxy that converts OpenAI-compatible requests to Anthropic format and forwards them to a configurable upstream endpoint.

Features

  • OpenAI API compatibility — Exposes /v1/chat/completions and /v1/models endpoints
  • Automatic format conversion — Translates OpenAI request/response to/from Anthropic format
  • Streaming support — Converts non-streaming upstream responses to OpenAI SSE format for clients
  • Security headers — Blocks sensitive headers (Referer, Cookie, X-Forwarded-*) from reaching upstream
  • Claude-code mimicry — Sets headers to mimic the claude-code CLI tool
  • Single binary — No runtime dependencies, just one executable

Use Cases

  • Use OpenAI-compatible tools with Anthropic/ZAI API
  • Replace Claude API endpoint without modifying client code
  • Proxy requests to different Anthropic-compatible providers

Configuration

Create a config.yaml file in the working directory:

port: 8080
upstream_url: "https://api.z.ai/api/anthropic"

# Optional: override the temperature for all requests
# temperature: 0.7

# Retry configuration
max_retries: 3
retry_base_delay_ms: 1000

  • port: Port to listen on (default: 8080)
  • upstream_url: Base URL for the Anthropic-compatible upstream API
  • temperature (optional): Override temperature for all requests. If set, this value is used instead of client-specified temperatures. Remove this line to respect client temperatures.
  • max_retries: Maximum retry attempts for transient errors (429, 503). Default: 3. Set to 0 to disable retries.
  • retry_base_delay_ms: Base delay in milliseconds for exponential backoff. Default: 1000. Delay formula: base_delay_ms * 2^(attempt-1) with ±50% jitter.

Building

Standard Go build

go build -o proxx

Requires Go 1.21+ with only gopkg.in/yaml.v3 as an external dependency.

Nix build (reproducible)

nix build

Or run directly from the flake:

nix run .

This uses a reproducible build environment with pinned dependencies.

Nix development shell

nix develop

Provides Go toolchain with gopls and other development tools.

Running

./proxx

The server will start and log:

Starting proxx on :8080, upstream: https://api.z.ai/api/anthropic

API Endpoints

GET /v1/models

Returns a list of available models in OpenAI format.

Example response:

{
  "object": "list",
  "data": [
    {"id": "glm-4.7", "object": "model", "created": 1234567890, "owned_by": "zhipu"},
    {"id": "glm-4.6", "object": "model", "created": 1234567890, "owned_by": "zhipu"}
  ]
}

POST /v1/chat/completions

Accepts OpenAI chat completion format and converts to Anthropic format.

Request headers:

  • Authorization: Bearer <api-key> — Required. Extracted and forwarded as x-api-key
  • Content-Type: application/json — Required

Request body:

{
  "model": "glm-4.7",
  "messages": [
    {"role": "user", "content": "Hello!"}
  ],
  "stream": false
}

Non-streaming response:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "glm-4.7",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 15,
    "total_tokens": 25
  }
}

Streaming response: Returns Server-Sent Events (SSE) in OpenAI format:

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1234567890,"model":"glm-4.7","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1234567890,"model":"glm-4.7","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: [DONE]

Field Mapping

OpenAI → Anthropic

OpenAI Field Anthropic Field Notes
model model Direct mapping
messages[] messages[] Converted to Anthropic format
system message system (top-level) Extracted from messages
temperature temperature Direct mapping
max_tokens max_tokens Defaults to 32000 if omitted
top_p top_p Direct mapping
stop stop_sequences Array mapping
tools[] tools[] Function tools converted
tool_choice tool_choice Enum mapping
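The system-message extraction above can be sketched as follows. This is illustrative only, assuming string content; converter.go may handle richer content types:

```go
package main

import "fmt"

type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// splitSystem pulls system messages out of an OpenAI messages array
// and returns them as a single top-level Anthropic system string,
// joining multiple system messages with newlines (an assumption).
func splitSystem(msgs []message) (system string, rest []message) {
	for _, m := range msgs {
		if m.Role == "system" {
			if system != "" {
				system += "\n"
			}
			system += m.Content
			continue
		}
		rest = append(rest, m)
	}
	return system, rest
}

func main() {
	sys, rest := splitSystem([]message{
		{Role: "system", Content: "You are terse."},
		{Role: "user", Content: "Hi"},
	})
	fmt.Println(sys, len(rest))
}
```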

Anthropic → OpenAI

Anthropic Field OpenAI Field Notes
content[0].text choices[0].message.content Text content
content[].tool_use choices[0].message.tool_calls[] Tool calls
usage.input_tokens usage.prompt_tokens Input tokens
usage.output_tokens usage.completion_tokens Output tokens
stop_reason finish_reason end_turn → stop, tool_use → tool_calls, max_tokens → length
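The stop_reason mapping is a simple lookup; a sketch (the fallback for unknown values is an assumption, not confirmed by converter.go):

```go
package main

import "fmt"

// mapStopReason converts an Anthropic stop_reason to an OpenAI
// finish_reason per the table above. Unknown values fall back
// to "stop" (assumed behavior).
func mapStopReason(r string) string {
	switch r {
	case "end_turn":
		return "stop"
	case "tool_use":
		return "tool_calls"
	case "max_tokens":
		return "length"
	default:
		return "stop"
	}
}

func main() {
	fmt.Println(mapStopReason("max_tokens")) // length
}
```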

Claude-Code Mimicry

The proxy sets these headers on all upstream requests to mimic the claude-code CLI:

Header Value
User-Agent claude-cli/1.0.18 (pro, cli)
x-api-key From incoming Authorization: Bearer
x-app cli
anthropic-version 2023-06-01
anthropic-beta interleaved-thinking-2025-05-14,prompt-caching-scope-2026-01-05,context-management-2025-06-27
X-Claude-Code-Session-Id Random UUID generated at startup
content-type application/json

Retry Behavior

When the upstream returns a retryable error (HTTP 429 or 503), proxx automatically retries with exponential backoff:

  • Exponential backoff: Delay doubles on each retry (1s, 2s, 4s, ...)
  • Jitter: ±50% random variation added to each delay to avoid thundering herd
  • Retryable statuses: 429 (rate limit), 503 (service unavailable)
  • Logged: All retry attempts are logged with attempt number, delay, and jitter

This improves resilience against temporary upstream issues without client intervention.

Security

Blocked Headers

The proxy blocks these headers from being forwarded to upstream:

  • Referer — Prevents leaking internal URLs
  • Cookie — Prevents leaking session cookies
  • Authorization — Already extracted and forwarded as x-api-key
  • X-Forwarded-For — Prevents leaking client IP
  • X-Real-Ip — Prevents leaking client IP
  • X-Forwarded-Host — Prevents leaking internal hostnames

Project Structure

proxx/
├── README.md          # This file
├── main.go            # Entry point, config loading, HTTP server setup
├── handler.go         # HTTP handlers for /v1/chat/completions and /v1/models
├── converter.go       # OpenAI <-> Anthropic format conversion logic
├── streaming.go       # Streaming response conversion to OpenAI SSE format
├── types.go           # Request/response struct types
├── config.yaml        # Default configuration file
├── flake.nix          # Nix flake for reproducible builds
├── LICENSE
├── go.mod
└── go.sum

License

See LICENSE file for details.