proxx

OpenAI-to-Anthropic API proxy that converts OpenAI-compatible requests to Anthropic format and forwards them to a configurable upstream endpoint.

Features

  • OpenAI API compatibility — Exposes /v1/chat/completions and /v1/models endpoints
  • Automatic format conversion — Translates OpenAI request/response to/from Anthropic format
  • Streaming support — Converts non-streaming upstream responses to OpenAI SSE format for clients
  • Security headers — Blocks sensitive headers (Referer, Cookie, X-Forwarded-*) from reaching upstream
  • Claude-code mimicry — Sets headers to mimic the claude-code CLI tool
  • Single binary — No runtime dependencies, just one executable

Use Cases

  • Use OpenAI-compatible tools with Anthropic/ZAI API
  • Replace Claude API endpoint without modifying client code
  • Proxy requests to different Anthropic-compatible providers

Configuration

Create a config.yaml file in the working directory:

port: 8080
upstream_url: "https://api.z.ai/api/anthropic"

# Optional: override the temperature for all requests
# temperature: 0.7

# Retry configuration
max_retries: 3
retry_base_delay_ms: 1000

  • port: Port to listen on (default: 8080)
  • upstream_url: Base URL for the Anthropic-compatible upstream API
  • temperature (optional): Override temperature for all requests. If set, this value is used instead of client-specified temperatures. Remove this line to respect client temperatures.
  • max_retries: Maximum retry attempts for transient errors (429, 503). Default: 3. Set to 0 to disable retries.
  • retry_base_delay_ms: Base delay in milliseconds for exponential backoff. Default: 1000. Delay formula: base_delay_ms * 2^(attempt-1) with ±50% jitter.

Building

Standard Go build

go build -o proxx

Requires Go 1.21+ with only gopkg.in/yaml.v3 as an external dependency.

Nix build (reproducible)

nix build

Or run directly from the flake:

nix run .

This uses a reproducible build environment with pinned dependencies.

Nix development shell

nix develop

Provides Go toolchain with gopls and other development tools.

Running

./proxx

The server will start and log:

Starting proxx on :8080, upstream: https://api.z.ai/api/anthropic

API Endpoints

GET /v1/models

Returns a list of available models in OpenAI format.

Example response:

{
  "object": "list",
  "data": [
    {"id": "glm-4.7", "object": "model", "created": 1234567890, "owned_by": "zhipu"},
    {"id": "glm-4.6", "object": "model", "created": 1234567890, "owned_by": "zhipu"}
  ]
}

POST /v1/chat/completions

Accepts OpenAI chat completion format and converts to Anthropic format.

Request headers:

  • Authorization: Bearer <api-key> — Required. Extracted and forwarded as x-api-key
  • Content-Type: application/json — Required

Request body:

{
  "model": "glm-4.7",
  "messages": [
    {"role": "user", "content": "Hello!"}
  ],
  "stream": false
}

Non-streaming response:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "glm-4.7",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 15,
    "total_tokens": 25
  }
}

Streaming response: Returns Server-Sent Events (SSE) in OpenAI format:

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1234567890,"model":"glm-4.7","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1234567890,"model":"glm-4.7","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: [DONE]

Field Mapping

OpenAI → Anthropic

OpenAI Field Anthropic Field Notes
model model Direct mapping
messages[] messages[] Converted to Anthropic format
system message system (top-level) Extracted from messages
temperature temperature Direct mapping
max_tokens max_tokens Defaults to 32000 if omitted
top_p top_p Direct mapping
stop stop_sequences Array mapping
tools[] tools[] Function tools converted
tool_choice tool_choice Enum mapping
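The system-message extraction above can be sketched as follows. This is illustrative only, assuming string content; converter.go may handle richer content types:

```go
package main

import "fmt"

type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// splitSystem pulls system messages out of an OpenAI messages array
// and returns them as a single top-level Anthropic system string,
// joining multiple system messages with newlines (an assumption).
func splitSystem(msgs []message) (system string, rest []message) {
	for _, m := range msgs {
		if m.Role == "system" {
			if system != "" {
				system += "\n"
			}
			system += m.Content
			continue
		}
		rest = append(rest, m)
	}
	return system, rest
}

func main() {
	sys, rest := splitSystem([]message{
		{Role: "system", Content: "You are terse."},
		{Role: "user", Content: "Hi"},
	})
	fmt.Println(sys, len(rest))
}
```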

Anthropic → OpenAI

Anthropic Field OpenAI Field Notes
content[0].text choices[0].message.content Text content
content[].tool_use choices[0].message.tool_calls[] Tool calls
usage.input_tokens usage.prompt_tokens Input tokens
usage.output_tokens usage.completion_tokens Output tokens
stop_reason finish_reason end_turn → stop, tool_use → tool_calls, max_tokens → length
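The stop_reason mapping is a simple lookup; a sketch (the fallback for unknown values is an assumption, not confirmed by converter.go):

```go
package main

import "fmt"

// mapStopReason converts an Anthropic stop_reason to an OpenAI
// finish_reason per the table above. Unknown values fall back
// to "stop" (assumed behavior).
func mapStopReason(r string) string {
	switch r {
	case "end_turn":
		return "stop"
	case "tool_use":
		return "tool_calls"
	case "max_tokens":
		return "length"
	default:
		return "stop"
	}
}

func main() {
	fmt.Println(mapStopReason("max_tokens")) // length
}
```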

Claude-Code Mimicry

The proxy sets these headers on all upstream requests to mimic the claude-code CLI:

Header Value
User-Agent claude-cli/1.0.18 (pro, cli)
x-api-key From incoming Authorization: Bearer
x-app cli
anthropic-version 2023-06-01
anthropic-beta interleaved-thinking-2025-05-14,prompt-caching-scope-2026-01-05,context-management-2025-06-27
X-Claude-Code-Session-Id Random UUID generated at startup
content-type application/json

Retry Behavior

When the upstream returns a retryable error (HTTP 429 or 503), proxx automatically retries with exponential backoff:

  • Exponential backoff: Delay doubles on each retry (1s, 2s, 4s, ...)
  • Jitter: ±50% random variation added to each delay to avoid thundering herd
  • Retryable statuses: 429 (rate limit), 503 (service unavailable)
  • Logged: All retry attempts are logged with attempt number, delay, and jitter

This improves resilience against temporary upstream issues without client intervention.

Security

Blocked Headers

The proxy blocks these headers from being forwarded to upstream:

  • Referer — Prevents leaking internal URLs
  • Cookie — Prevents leaking session cookies
  • Authorization — Already extracted and forwarded as x-api-key
  • X-Forwarded-For — Prevents leaking client IP
  • X-Real-Ip — Prevents leaking client IP
  • X-Forwarded-Host — Prevents leaking internal hostnames

Project Structure

proxx/
├── README.md          # This file
├── main.go            # Entry point, config loading, HTTP server setup
├── handler.go         # HTTP handlers for /v1/chat/completions and /v1/models
├── converter.go       # OpenAI <-> Anthropic format conversion logic
├── streaming.go       # Streaming response conversion to OpenAI SSE format
├── types.go           # Request/response struct types
├── config.yaml        # Default configuration file
├── flake.nix          # Nix flake for reproducible builds
├── LICENSE
├── go.mod
└── go.sum

License

See LICENSE file for details.