proxx/docs/superpowers/specs/2026-04-15-openai-anthropic-proxy-design.md

# OpenAI-to-Anthropic Proxy Design (proxx)

## Purpose

A Go proxy that exposes OpenAI-compatible API endpoints, converts requests to Anthropic format, and forwards them to a configurable upstream Anthropic endpoint. The proxy mimics claude-code (the CLI tool) so upstream sees requests as originating from claude-code.

## Architecture

```
Client (OpenAI format) --> proxx (Go :8080) --> Upstream Anthropic endpoint
                                                     e.g. https://api.z.ai/api/anthropic
```

Single binary using Go standard library `net/http`. Only external dependency: `gopkg.in/yaml.v3` for config parsing.

## Configuration

Config loaded from `config.yaml` in the working directory:

```yaml
port: 8080
upstream_url: "https://api.z.ai/api/anthropic"
```

## Endpoints

### POST /v1/chat/completions

Accepts requests in OpenAI chat completion format, converts them to an Anthropic `POST /v1/messages` request, and converts the upstream response back to OpenAI format. Supports both streaming and non-streaming modes.

### GET /v1/models

Returns a static list of Claude models in OpenAI `/v1/models` response format. No upstream call needed.
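A minimal sketch of how the static list could be built and served. The struct names, the `staticModels` helper, and the `example-model-id` placeholder are assumptions for illustration; the real model IDs are not specified in this document.

```go
package main

import (
	"encoding/json"
	"net/http"
	"os"
)

// Model and ModelList mirror the OpenAI /v1/models response shape.
type Model struct {
	ID      string `json:"id"`
	Object  string `json:"object"`
	Created int64  `json:"created"`
	OwnedBy string `json:"owned_by"`
}

type ModelList struct {
	Object string  `json:"object"`
	Data   []Model `json:"data"`
}

// staticModels builds the response from hard-coded IDs (hypothetical helper).
func staticModels(ids ...string) ModelList {
	list := ModelList{Object: "list", Data: make([]Model, 0, len(ids))}
	for _, id := range ids {
		list.Data = append(list.Data, Model{ID: id, Object: "model", OwnedBy: "anthropic"})
	}
	return list
}

// handleModels serves the static list; it never contacts the upstream.
func handleModels(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodGet {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(staticModels("example-model-id"))
}

func main() {
	// Print the payload instead of starting a server so the sketch terminates.
	_ = json.NewEncoder(os.Stdout).Encode(staticModels("example-model-id"))
}
```

In the server, `handleModels` would be registered with `http.HandleFunc("/v1/models", handleModels)`.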

## Authentication

The API key is extracted from the incoming request's `Authorization: Bearer <key>` header and forwarded to the upstream as the `x-api-key` header.
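The extraction step could look like the sketch below. `bearerToken` is a hypothetical helper name, and accepting the `Bearer` scheme case-insensitively is an assumption, not something this document specifies.

```go
package main

import (
	"fmt"
	"strings"
)

// bearerToken returns the key from an "Authorization: Bearer <key>" header
// value. The boolean is false when the header is missing or malformed.
func bearerToken(authorization string) (string, bool) {
	const prefix = "Bearer "
	if len(authorization) <= len(prefix) || !strings.EqualFold(authorization[:len(prefix)], prefix) {
		return "", false
	}
	token := strings.TrimSpace(authorization[len(prefix):])
	return token, token != ""
}

func main() {
	key, ok := bearerToken("Bearer sk-example")
	fmt.Println(key, ok)
}
```

A handler would call `bearerToken(r.Header.Get("Authorization"))` and set the result as `x-api-key` on the upstream request. What to return when the header is missing is not specified here; an error in the OpenAI error format is one option.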

## Claude-Code Mimicry

The proxy sets these headers on all upstream requests:

| Header | Value |
|---|---|
| `User-Agent` | `claude-cli/1.0.18 (pro, cli)` |
| `x-api-key` | From incoming Bearer token |
| `x-app` | `cli` |
| `anthropic-version` | `2023-06-01` |
| `anthropic-beta` | `interleaved-thinking-2025-05-14,prompt-caching-scope-2026-01-05,context-management-2025-06-27` |
| `X-Claude-Code-Session-Id` | Random UUID generated at startup |
| `content-type` | `application/json` |
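The table above can be sketched as a header map plus a session ID generated once at startup. The function names are hypothetical, the UUID is built by hand from `crypto/rand` to stay within the standard library, and joining `upstream_url` with `/v1/messages` is an assumption about how the target URL is formed.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"net/http"
)

// newSessionID returns a random version 4 UUID string.
func newSessionID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // version 4
	b[8] = (b[8] & 0x3f) | 0x80 // RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// upstreamHeaders returns the header set from the table for one request.
func upstreamHeaders(apiKey, sessionID string) map[string]string {
	return map[string]string{
		"User-Agent":               "claude-cli/1.0.18 (pro, cli)",
		"x-api-key":                apiKey,
		"x-app":                    "cli",
		"anthropic-version":        "2023-06-01",
		"anthropic-beta":           "interleaved-thinking-2025-05-14,prompt-caching-scope-2026-01-05,context-management-2025-06-27",
		"X-Claude-Code-Session-Id": sessionID,
		"content-type":             "application/json",
	}
}

// applyHeaders copies the header set onto an outgoing request.
func applyHeaders(req *http.Request, apiKey, sessionID string) {
	for name, value := range upstreamHeaders(apiKey, sessionID) {
		req.Header.Set(name, value)
	}
}

func main() {
	sessionID, err := newSessionID() // generated once at startup
	if err != nil {
		panic(err)
	}
	// NewRequest only builds the request; nothing is sent here.
	req, err := http.NewRequest(http.MethodPost, "https://api.z.ai/api/anthropic/v1/messages", nil)
	if err != nil {
		panic(err)
	}
	applyHeaders(req, "sk-example", sessionID)
	fmt.Println(req.Header.Get("x-app"), req.Header.Get("X-Claude-Code-Session-Id"))
}
```

Note that `Header.Set` canonicalizes names (for example `x-api-key` becomes `X-Api-Key`). HTTP header names are case-insensitive, so this should not matter to a compliant upstream.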

## Request Conversion (OpenAI -> Anthropic)

### Fields mapped directly
- `model` -> `model`
- `temperature` -> `temperature`
- `max_tokens` -> `max_tokens` (default 8192 if omitted)
- `stream` -> `stream`
- `top_p` -> `top_p`
- `stop` -> `stop_sequences` (OpenAI accepts a string or an array; a single string is wrapped in a one-element array)
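A sketch of the direct field mapping, under the assumption that `stop` may arrive as either a string or an array of strings (both are valid in the OpenAI format) and must become an array. The struct and function names are illustrative only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const defaultMaxTokens = 8192

// OpenAIRequest holds only the scalar fields covered by this sketch.
type OpenAIRequest struct {
	Model       string          `json:"model"`
	Temperature *float64        `json:"temperature,omitempty"`
	TopP        *float64        `json:"top_p,omitempty"`
	MaxTokens   *int            `json:"max_tokens,omitempty"`
	Stream      bool            `json:"stream,omitempty"`
	Stop        json.RawMessage `json:"stop,omitempty"`
}

type AnthropicRequest struct {
	Model         string   `json:"model"`
	MaxTokens     int      `json:"max_tokens"`
	Temperature   *float64 `json:"temperature,omitempty"`
	TopP          *float64 `json:"top_p,omitempty"`
	Stream        bool     `json:"stream,omitempty"`
	StopSequences []string `json:"stop_sequences,omitempty"`
}

// normalizeStop accepts a JSON string or array of strings and returns a slice.
func normalizeStop(raw json.RawMessage) ([]string, error) {
	if len(raw) == 0 || string(raw) == "null" {
		return nil, nil
	}
	var one string
	if err := json.Unmarshal(raw, &one); err == nil {
		return []string{one}, nil
	}
	var many []string
	if err := json.Unmarshal(raw, &many); err != nil {
		return nil, fmt.Errorf("stop must be a string or an array of strings: %w", err)
	}
	return many, nil
}

// mapScalarFields copies the directly mapped fields and applies the default.
func mapScalarFields(in OpenAIRequest) (AnthropicRequest, error) {
	stops, err := normalizeStop(in.Stop)
	if err != nil {
		return AnthropicRequest{}, err
	}
	out := AnthropicRequest{
		Model:         in.Model,
		MaxTokens:     defaultMaxTokens,
		Temperature:   in.Temperature,
		TopP:          in.TopP,
		Stream:        in.Stream,
		StopSequences: stops,
	}
	if in.MaxTokens != nil {
		out.MaxTokens = *in.MaxTokens
	}
	return out, nil
}

// convertBody decodes an OpenAI body and encodes the mapped Anthropic body.
func convertBody(body []byte) ([]byte, error) {
	var in OpenAIRequest
	if err := json.Unmarshal(body, &in); err != nil {
		return nil, err
	}
	out, err := mapScalarFields(in)
	if err != nil {
		return nil, err
	}
	return json.Marshal(out)
}

func main() {
	out, err := convertBody([]byte(`{"model":"example-model-id","stop":"END","temperature":0.2}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Pointer fields keep "omitted" distinct from an explicit zero, so `temperature: 0` is forwarded while an absent `temperature` is left out.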

### Messages
- OpenAI `messages[]` with `role` and `content` mapped to Anthropic `messages[]`
- `system` message (role="system") extracted to top-level `system` field
- String content mapped directly; array content (text+image parts) mapped to Anthropic content block format
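The message rules above can be sketched as follows. This is an illustration under several assumptions: string content is wrapped in a single text block (equivalent to passing the string through), multiple system messages are joined with blank lines, and only base64 data URLs are handled for images. Tool-result and assistant tool-call messages are left out. All type and function names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// OpenAIMessage keeps content raw because it may be a string or an array of parts.
type OpenAIMessage struct {
	Role    string          `json:"role"`
	Content json.RawMessage `json:"content"`
}

type openAIPart struct {
	Type     string `json:"type"`
	Text     string `json:"text"`
	ImageURL struct {
		URL string `json:"url"`
	} `json:"image_url"`
}

type ImageSource struct {
	Type      string `json:"type"`
	MediaType string `json:"media_type"`
	Data      string `json:"data"`
}

type Block struct {
	Type   string       `json:"type"`
	Text   string       `json:"text,omitempty"`
	Source *ImageSource `json:"source,omitempty"`
}

type AnthropicMessage struct {
	Role    string  `json:"role"`
	Content []Block `json:"content"`
}

// toBlocks converts string or array content into Anthropic content blocks.
func toBlocks(raw json.RawMessage) ([]Block, error) {
	var text string
	if err := json.Unmarshal(raw, &text); err == nil {
		return []Block{{Type: "text", Text: text}}, nil
	}
	var parts []openAIPart
	if err := json.Unmarshal(raw, &parts); err != nil {
		return nil, fmt.Errorf("content must be a string or an array of parts: %w", err)
	}
	blocks := make([]Block, 0, len(parts))
	for _, p := range parts {
		switch p.Type {
		case "text":
			blocks = append(blocks, Block{Type: "text", Text: p.Text})
		case "image_url":
			mediaType, data, ok := strings.Cut(strings.TrimPrefix(p.ImageURL.URL, "data:"), ";base64,")
			if !ok || !strings.HasPrefix(p.ImageURL.URL, "data:") {
				return nil, fmt.Errorf("unsupported image URL: only base64 data URLs are handled")
			}
			blocks = append(blocks, Block{Type: "image", Source: &ImageSource{Type: "base64", MediaType: mediaType, Data: data}})
		default:
			return nil, fmt.Errorf("unsupported content part type %q", p.Type)
		}
	}
	return blocks, nil
}

// convertMessages pulls system text out and converts the remaining messages.
func convertMessages(in []OpenAIMessage) (string, []AnthropicMessage, error) {
	var system []string
	out := make([]AnthropicMessage, 0, len(in))
	for _, m := range in {
		blocks, err := toBlocks(m.Content)
		if err != nil {
			return "", nil, err
		}
		if m.Role != "system" {
			out = append(out, AnthropicMessage{Role: m.Role, Content: blocks})
			continue
		}
		for _, b := range blocks {
			if b.Type == "text" {
				system = append(system, b.Text)
			}
		}
	}
	return strings.Join(system, "\n\n"), out, nil
}

func main() {
	system, out, err := convertMessages([]OpenAIMessage{
		{Role: "system", Content: json.RawMessage(`"Be brief."`)},
		{Role: "user", Content: json.RawMessage(`[{"type":"text","text":"Hi"}]`)},
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("system=%q messages=%+v\n", system, out)
}
```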

### Tools
- OpenAI function tools converted to Anthropic tool format (`name`, `description`, `input_schema`)
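- `tool_choice` mapped: `auto` -> `{"type": "auto"}`, `none` -> `{"type": "none"}`, `required` -> `{"type": "any"}`, specific function -> `{"type": "tool", "name": "<function name>"}`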

## Response Conversion (Anthropic -> OpenAI)

### Non-streaming
- `content[0].text` -> `choices[0].message.content`
- `content[].type == "tool_use"` -> `choices[0].message.tool_calls[]`
- `usage.input_tokens` -> `usage.prompt_tokens`
- `usage.output_tokens` -> `usage.completion_tokens`
- `stop_reason` mapped: `end_turn` -> `stop`, `stop_sequence` -> `stop`, `tool_use` -> `tool_calls`, `max_tokens` -> `length`
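The non-streaming mapping could be sketched as below. Two choices here go beyond the list above and are assumptions: the sketch concatenates every `text` block instead of reading only `content[0]` (identical when there is one leading text block), and any stop reason not listed, including `stop_sequence`, falls back to `stop`. The `created` field is omitted for brevity.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

type AnthropicResponse struct {
	ID      string `json:"id"`
	Model   string `json:"model"`
	Content []struct {
		Type  string          `json:"type"`
		Text  string          `json:"text"`
		ID    string          `json:"id"`
		Name  string          `json:"name"`
		Input json.RawMessage `json:"input"`
	} `json:"content"`
	StopReason string `json:"stop_reason"`
	Usage      struct {
		InputTokens  int `json:"input_tokens"`
		OutputTokens int `json:"output_tokens"`
	} `json:"usage"`
}

type FunctionCall struct {
	Name      string `json:"name"`
	Arguments string `json:"arguments"` // JSON text carried inside a string
}

type ToolCall struct {
	ID       string       `json:"id"`
	Type     string       `json:"type"`
	Function FunctionCall `json:"function"`
}

type Message struct {
	Role      string     `json:"role"`
	Content   string     `json:"content"`
	ToolCalls []ToolCall `json:"tool_calls,omitempty"`
}

type Choice struct {
	Index        int     `json:"index"`
	Message      Message `json:"message"`
	FinishReason string  `json:"finish_reason"`
}

type Usage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}

type OpenAIResponse struct {
	ID      string   `json:"id"`
	Object  string   `json:"object"`
	Model   string   `json:"model"`
	Choices []Choice `json:"choices"`
	Usage   Usage    `json:"usage"`
}

var finishReasons = map[string]string{"end_turn": "stop", "tool_use": "tool_calls", "max_tokens": "length"}

// convertResponse maps a non-streaming Anthropic body to the OpenAI shape.
func convertResponse(body []byte) (OpenAIResponse, error) {
	var in AnthropicResponse
	if err := json.Unmarshal(body, &in); err != nil {
		return OpenAIResponse{}, err
	}
	msg := Message{Role: "assistant"}
	var text strings.Builder
	for _, b := range in.Content {
		switch b.Type {
		case "text":
			text.WriteString(b.Text)
		case "tool_use":
			call := FunctionCall{Name: b.Name, Arguments: string(b.Input)}
			msg.ToolCalls = append(msg.ToolCalls, ToolCall{ID: b.ID, Type: "function", Function: call})
		}
	}
	msg.Content = text.String()
	finish, ok := finishReasons[in.StopReason]
	if !ok {
		finish = "stop" // assumption: unlisted reasons are reported as "stop"
	}
	return OpenAIResponse{
		ID: in.ID, Object: "chat.completion", Model: in.Model,
		Choices: []Choice{{Index: 0, Message: msg, FinishReason: finish}},
		Usage:   Usage{in.Usage.InputTokens, in.Usage.OutputTokens, in.Usage.InputTokens + in.Usage.OutputTokens},
	}, nil
}

func main() {
	out, err := convertResponse([]byte(`{"id":"msg_1","model":"m","content":[{"type":"text","text":"Hi"}],"stop_reason":"end_turn"}`))
	fmt.Printf("%+v %v\n", out, err)
}
```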

### Streaming
- Anthropic SSE events (`message_start`, `content_block_start`, `content_block_delta`, `content_block_stop`, `message_delta`, `message_stop`) converted to OpenAI SSE format (`role`, `content` deltas, `tool_calls` deltas, `finish_reason`)
- Final `data: [DONE]` sent after stream ends
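A sketch of the streaming translation, assuming the upstream sends standard SSE with `event:` and `data:` lines and that only the `data:` payloads need to be read. The names `convertEvent` and `translate` are hypothetical. Tool-call indexes are counted separately from content-block indexes because OpenAI numbers entries within `tool_calls` only. The `created` field is omitted for brevity.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"os"
	"strings"
)

// FuncDelta, ToolCallDelta and Delta form the "delta" object of an OpenAI chunk.
type FuncDelta struct {
	Name      string `json:"name,omitempty"`
	Arguments string `json:"arguments"`
}
type ToolCallDelta struct {
	Index    int       `json:"index"`
	ID       string    `json:"id,omitempty"`
	Type     string    `json:"type,omitempty"`
	Function FuncDelta `json:"function"`
}
type Delta struct {
	Role      string          `json:"role,omitempty"`
	Content   string          `json:"content,omitempty"`
	ToolCalls []ToolCallDelta `json:"tool_calls,omitempty"`
}

// event holds only the Anthropic payload fields this sketch reads.
type event struct {
	Type         string `json:"type"`
	ContentBlock struct {
		Type, ID, Name string // matched case-insensitively by encoding/json
	} `json:"content_block"`
	Delta struct {
		Type, Text  string
		PartialJSON string `json:"partial_json"`
		StopReason  string `json:"stop_reason"`
	} `json:"delta"`
}

// convertEvent maps one payload to a delta; nil delta means "emit nothing".
func convertEvent(data []byte, tools *int) (*Delta, any, error) {
	var ev event
	if err := json.Unmarshal(data, &ev); err != nil {
		return nil, nil, err
	}
	switch {
	case ev.Type == "message_start":
		return &Delta{Role: "assistant"}, nil, nil
	case ev.Type == "content_block_start" && ev.ContentBlock.Type == "tool_use":
		*tools++
		call := ToolCallDelta{Index: *tools - 1, ID: ev.ContentBlock.ID, Type: "function", Function: FuncDelta{Name: ev.ContentBlock.Name}}
		return &Delta{ToolCalls: []ToolCallDelta{call}}, nil, nil
	case ev.Type == "content_block_delta" && ev.Delta.Type == "text_delta":
		return &Delta{Content: ev.Delta.Text}, nil, nil
	case ev.Type == "content_block_delta" && ev.Delta.Type == "input_json_delta":
		call := ToolCallDelta{Index: *tools - 1, Function: FuncDelta{Arguments: ev.Delta.PartialJSON}}
		return &Delta{ToolCalls: []ToolCallDelta{call}}, nil, nil
	case ev.Type == "message_delta":
		reasons := map[string]string{"tool_use": "tool_calls", "max_tokens": "length"}
		if r, ok := reasons[ev.Delta.StopReason]; ok {
			return &Delta{}, r, nil
		}
		return &Delta{}, "stop", nil // end_turn and anything unrecognized
	}
	return nil, nil, nil // content_block_stop, message_stop, ping, ...
}

// translate rewrites an Anthropic SSE stream from r as OpenAI SSE chunks on w.
func translate(r io.Reader, w io.Writer, id, model string) error {
	sc, tools := bufio.NewScanner(r), 0
	for sc.Scan() {
		if !strings.HasPrefix(sc.Text(), "data:") {
			continue // "event:" lines and blank separators are not needed
		}
		delta, finish, err := convertEvent([]byte(sc.Text()[len("data:"):]), &tools)
		if err != nil {
			return err
		}
		if delta == nil {
			continue
		}
		choice := map[string]any{"index": 0, "delta": delta, "finish_reason": finish}
		chunk, err := json.Marshal(map[string]any{"id": id, "object": "chat.completion.chunk", "model": model, "choices": []any{choice}})
		if err != nil {
			return err
		}
		fmt.Fprintf(w, "data: %s\n\n", chunk)
	}
	if err := sc.Err(); err != nil {
		return err
	}
	_, err := io.WriteString(w, "data: [DONE]\n\n")
	return err
}

func main() {
	in := "data: {\"type\":\"message_start\"}\n\ndata: {\"type\":\"message_delta\",\"delta\":{\"stop_reason\":\"end_turn\"}}\n\n"
	if err := translate(strings.NewReader(in), os.Stdout, "chatcmpl-example", "example-model-id"); err != nil {
		panic(err)
	}
}
```

In an HTTP handler, each chunk would also need to be flushed (via `http.Flusher`) so the client receives deltas as they arrive. The default `bufio.Scanner` line limit may be too small for very large `data:` lines and could be raised with `Scanner.Buffer`.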

### Error Responses
- Upstream errors converted to OpenAI error format: `{"error": {"message": "...", "type": "...", "code": "..."}}`
- Connection failures: HTTP 502
- Malformed requests: HTTP 400
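A sketch of the error conversion, assuming the upstream returns Anthropic-style error bodies of the form `{"type":"error","error":{"type":"...","message":"..."}}`. The `upstream_error` fallback type and the use of the HTTP status as `code` are assumptions made for illustration, as are the function names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"strconv"
)

// ErrorBody is the OpenAI error envelope described above.
type ErrorBody struct {
	Error ErrorDetail `json:"error"`
}

type ErrorDetail struct {
	Message string `json:"message"`
	Type    string `json:"type"`
	Code    string `json:"code"`
}

// fromUpstream converts an upstream error body. If the body cannot be parsed,
// it falls back to the status text and a generic type.
func fromUpstream(status int, body []byte) ErrorBody {
	var up struct {
		Error struct {
			Type    string `json:"type"`
			Message string `json:"message"`
		} `json:"error"`
	}
	code := strconv.Itoa(status)
	if err := json.Unmarshal(body, &up); err != nil || up.Error.Message == "" {
		return ErrorBody{Error: ErrorDetail{Message: http.StatusText(status), Type: "upstream_error", Code: code}}
	}
	return ErrorBody{Error: ErrorDetail{Message: up.Error.Message, Type: up.Error.Type, Code: code}}
}

// writeError sends an error in OpenAI format with the given HTTP status,
// e.g. 400 for malformed requests or 502 for connection failures.
func writeError(w http.ResponseWriter, status int, e ErrorBody) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	_ = json.NewEncoder(w).Encode(e)
}

func main() {
	e := fromUpstream(429, []byte(`{"type":"error","error":{"type":"rate_limit_error","message":"slow down"}}`))
	out, err := json.Marshal(e)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```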

## Project Structure

```
proxx/
├── main.go           # Entry point, config loading, HTTP server setup
├── handler.go        # HTTP handlers for /v1/chat/completions and /v1/models
├── converter.go      # OpenAI <-> Anthropic format conversion logic
├── types.go          # All request/response struct types
├── streaming.go      # SSE streaming conversion (Anthropic -> OpenAI)
├── config.yaml       # Default configuration file
├── go.mod
└── go.sum
```

## Testing

- Unit tests for the converter functions (pure logic, no HTTP)
- Integration test with a mock upstream server to verify end-to-end flow
- Streaming test to verify SSE event conversion
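The integration test could use `net/http/httptest` as the mock upstream. The sketch below is written as a standalone program so it runs on its own; in the repository it would more naturally be a `func TestForward(t *testing.T)` in a `_test.go` file. `forward` is a hypothetical stand-in for the proxy's upstream call, not the actual handler.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// forward is a stand-in for the proxy's upstream call (hypothetical helper).
func forward(upstreamURL, apiKey, body string) (int, string, error) {
	req, err := http.NewRequest(http.MethodPost, upstreamURL+"/v1/messages", strings.NewReader(body))
	if err != nil {
		return 0, "", err
	}
	req.Header.Set("x-api-key", apiKey)
	req.Header.Set("content-type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	out, err := io.ReadAll(resp.Body)
	return resp.StatusCode, string(out), err
}

// checkForward starts a mock upstream, sends one request through forward,
// and reports what the mock observed.
func checkForward() (path, key string, status int, err error) {
	mock := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		path, key = r.URL.Path, r.Header.Get("x-api-key")
		w.Header().Set("Content-Type", "application/json")
		io.WriteString(w, `{"type":"message","content":[]}`)
	}))
	defer mock.Close()
	status, _, err = forward(mock.URL, "sk-test", `{}`)
	return path, key, status, err
}

func main() {
	path, key, status, err := checkForward()
	fmt.Println(path, key, status, err)
}
```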