kafka/CLAUDE.md
ashisgreat22 6001979d7f docs: add CLAUDE.md for Claude Code onboarding
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 23:52:23 +01:00

# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
kafka is a privacy-respecting metasearch engine written in Go. It provides a SearXNG-compatible `/search` API and an HTML frontend (HTMX + Go templates). Nine engines are implemented natively in Go; engines without a native implementation can be proxied to an upstream SearXNG instance. Results from all queried engines are merged into a single JSON/CSV/RSS/HTML response.
## Build & Run Commands
```bash
# Enter Nix dev shell (provides Go 1.24 toolchain + curl)
nix develop
# Run all tests
go test ./...
# Run a single test
go test -run TestWikipedia ./internal/engines/
# Run tests in a specific package with verbose output
go test -v ./internal/engines/
# Run the server (requires config.toml)
go run ./cmd/searxng-go -config config.toml
```
There is no Makefile. There is no linter configured.
## Architecture
**Request flow:** HTTP request -> middleware chain (global rate limit -> burst rate limit -> per-IP rate limit -> CORS) -> HTTP handler -> `search.Service` (cache check) -> `engines.Planner` (splits into local vs upstream) -> parallel local engine execution + upstream proxy -> `MergeResponses` -> cache write -> serialize (JSON/CSV/RSS/HTML).
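The ordering of the middleware chain can be illustrated with a minimal sketch. It assumes a conventional `func(http.Handler) http.Handler` middleware shape; `Middleware`, `chain`, `tag`, and `visitOrder` are hypothetical names for illustration, not the repository's actual API, and `tag` only records the order instead of limiting anything.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Middleware wraps a handler with extra behavior (assumed shape).
type Middleware func(http.Handler) http.Handler

// chain applies middlewares so that the first one listed sees the request first.
func chain(h http.Handler, mws ...Middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// tag is a hypothetical stand-in for a real rate limiter or CORS layer:
// it only records the order in which layers see the request.
func tag(name string, order *[]string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*order = append(*order, name)
			next.ServeHTTP(w, r)
		})
	}
}

// visitOrder sends one request through the chain and returns the layer order.
func visitOrder() []string {
	var order []string
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		order = append(order, "handler")
	})
	h := chain(handler,
		tag("global", &order),
		tag("burst", &order),
		tag("per-ip", &order),
		tag("cors", &order),
	)
	req := httptest.NewRequest(http.MethodGet, "/search?q=go", nil)
	h.ServeHTTP(httptest.NewRecorder(), req)
	return order
}

func main() {
	fmt.Println(visitOrder()) // [global burst per-ip cors handler]
}
```

Wrapping from the last middleware to the first keeps the argument list in the same order as the request flow, which makes the wiring easier to compare against the description above.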
**Key packages:**
- `internal/contracts` — Shared types: `SearchRequest`, `SearchResponse`, `MainResult`, `OutputFormat`. `MainResult` preserves unknown JSON keys from upstream via a `raw map[string]any` field and round-trips them faithfully.
- `internal/config` — TOML-based configuration with env var fallbacks. `Load(path)` reads `config.toml`; env vars override zero-value fields. See `config.example.toml` for all settings.
- `internal/engines` — `Engine` interface and all 9 Go-native implementations. `factory.go` registers engines via `NewDefaultPortedEngines()`. `planner.go` routes engines to local or upstream based on `LOCAL_PORTED_ENGINES` env var.
- `internal/search` — `Service` orchestrates the pipeline: cache check, planning, parallel engine execution via goroutines/WaitGroup, upstream proxying, response merging. Individual engine failures are reported as `unresponsive_engines` rather than aborting the search. Qwant has fallback logic to upstream on empty results.
- `internal/httpapi` — HTTP handlers for `/`, `/search`, `/healthz`, `/opensearch.xml`. Detects HTMX requests via `HX-Request` header to return fragments instead of full pages.
- `internal/upstream` — Client that proxies requests to an upstream SearXNG instance via POST.
- `internal/cache` — Valkey/Redis-backed cache with SHA-256 cache keys. No-op if unconfigured.
- `internal/middleware` — Three rate limiters (per-IP sliding window, burst+sustained, global) and CORS. All disabled by default.
- `internal/views` — HTML templates and static files embedded via `//go:embed`. Renders full pages or HTMX fragments. Templates: `base.html`, `index.html`, `results.html`, `results_inner.html`, `result_item.html`.
- `cmd/searxng-go` — Entry point. Loads TOML config, seeds env vars for engine code, wires up middleware chain, starts HTTP server.
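The parallel execution described for `internal/search` can be sketched roughly as follows. This is an illustration under assumptions, not the actual `Service` code: the `Engine` interface here is simplified to return plain strings, and `fakeEngine`, `runAll`, and `demo` are hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sort"
	"sync"
)

// Engine is a simplified, hypothetical version of the repository's interface:
// it returns plain strings instead of contracts.SearchResponse.
type Engine interface {
	Name() string
	Search(ctx context.Context, query string) ([]string, error)
}

// fakeEngine returns canned results or a canned error.
type fakeEngine struct {
	name    string
	results []string
	err     error
}

func (f fakeEngine) Name() string { return f.name }

func (f fakeEngine) Search(ctx context.Context, query string) ([]string, error) {
	return f.results, f.err
}

// runAll queries every engine concurrently. A failing engine is recorded
// in unresponsive instead of aborting the whole search.
func runAll(ctx context.Context, engines []Engine, query string) (results, unresponsive []string) {
	var (
		wg sync.WaitGroup
		mu sync.Mutex // guards both result slices
	)
	for _, e := range engines {
		wg.Add(1)
		go func(e Engine) {
			defer wg.Done()
			res, err := e.Search(ctx, query)
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				unresponsive = append(unresponsive, e.Name())
				return
			}
			results = append(results, res...)
		}(e)
	}
	wg.Wait()
	// Sort for deterministic output; goroutine completion order is not fixed.
	sort.Strings(results)
	sort.Strings(unresponsive)
	return results, unresponsive
}

// demo runs two healthy engines and one failing engine.
func demo() (results, unresponsive []string) {
	engines := []Engine{
		fakeEngine{name: "wikipedia", results: []string{"https://en.wikipedia.org/wiki/Go"}},
		fakeEngine{name: "duckduckgo", results: []string{"https://go.dev"}},
		fakeEngine{name: "qwant", err: errors.New("timeout")},
	}
	return runAll(context.Background(), engines, "golang")
}

func main() {
	results, unresponsive := demo()
	fmt.Println(results, unresponsive)
}
```

The mutex is needed because several goroutines append to the same slices; collecting errors by engine name is what lets a partial result set still be returned.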
**Engine interface** (`internal/engines/engine.go`):
```go
type Engine interface {
	Name() string
	Search(ctx context.Context, req contracts.SearchRequest) (contracts.SearchResponse, error)
}
```
**Adding a new engine:**
1. Create a new struct implementing the `Engine` interface in `internal/engines/` (single file, e.g., `newengine.go`)
2. Add a test file alongside it (use `roundTripperFunc` and `httpResponse` helpers in `http_mock_test.go` for mocking HTTP)
3. Register it in `NewDefaultPortedEngines()` in `factory.go`
4. Add its name to `defaultPortedEngines` in `planner.go`
5. Add category mappings in `inferFromCategories()` if applicable
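Steps 1 and 2 above might look roughly like the sketch below. Everything in it is an assumption made for illustration: the `SearchRequest`, `SearchResponse`, and `MainResult` types are reduced stand-ins for those in `internal/contracts`, `ExampleEngine` and its JSON endpoint are made up, and `roundTripperFunc` is redefined here because the real helper's definition in `http_mock_test.go` is not shown in this file.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strings"
)

// Reduced stand-ins for the types in internal/contracts (assumed shapes).
type SearchRequest struct{ Query string }

type MainResult struct {
	Title  string `json:"title"`
	URL    string `json:"url"`
	Engine string `json:"engine"`
}

type SearchResponse struct{ Results []MainResult }

// ExampleEngine is a hypothetical engine backed by a made-up JSON API.
type ExampleEngine struct {
	client  *http.Client
	baseURL string
}

func (e *ExampleEngine) Name() string { return "example" }

// Search fetches a JSON array of results and tags each with the engine name.
func (e *ExampleEngine) Search(ctx context.Context, req SearchRequest) (SearchResponse, error) {
	u := e.baseURL + "?q=" + url.QueryEscape(req.Query)
	httpReq, err := http.NewRequestWithContext(ctx, http.MethodGet, u, nil)
	if err != nil {
		return SearchResponse{}, err
	}
	resp, err := e.client.Do(httpReq)
	if err != nil {
		return SearchResponse{}, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return SearchResponse{}, fmt.Errorf("example: unexpected status %d", resp.StatusCode)
	}
	var items []MainResult
	if err := json.NewDecoder(resp.Body).Decode(&items); err != nil {
		return SearchResponse{}, err
	}
	for i := range items {
		items[i].Engine = e.Name()
	}
	return SearchResponse{Results: items}, nil
}

// roundTripperFunc adapts a function to http.RoundTripper (assumed to
// resemble the helper of the same name in http_mock_test.go).
type roundTripperFunc func(*http.Request) (*http.Response, error)

func (f roundTripperFunc) RoundTrip(r *http.Request) (*http.Response, error) { return f(r) }

// searchWithMock runs the engine against a canned response; no network is used.
func searchWithMock(query string) (SearchResponse, error) {
	client := &http.Client{Transport: roundTripperFunc(func(r *http.Request) (*http.Response, error) {
		body := `[{"title":"Go","url":"https://go.dev"}]`
		return &http.Response{
			StatusCode: http.StatusOK,
			Header:     make(http.Header),
			Body:       io.NopCloser(strings.NewReader(body)),
			Request:    r,
		}, nil
	})}
	e := &ExampleEngine{client: client, baseURL: "https://api.example.invalid/search"}
	return e.Search(context.Background(), SearchRequest{Query: query})
}

func main() {
	resp, err := searchWithMock("golang")
	fmt.Println(resp.Results, err)
}
```

Injecting the `*http.Client` is the design choice that makes the engine testable: a test swaps the transport for a function and never touches the network.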
## Configuration
Config is loaded from `config.toml` (see `config.example.toml`). Every field has an environment-variable fallback, which applies only when the corresponding TOML field is left at its zero value. Key sections: `[server]`, `[upstream]`, `[engines]`, `[cache]`, `[cors]`, `[rate_limit]`, `[global_rate_limit]`, `[burst_rate_limit]`.
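The zero-value rule can be sketched as follows, assuming env vars only fill fields that the TOML file left unset. The `ServerConfig` struct, the `PORT` and `BASE_URL` variable names, and the helper functions are hypothetical; the real field and variable names are in `config.example.toml` and `internal/config`.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// ServerConfig is a hypothetical subset of the real configuration.
type ServerConfig struct {
	Port    int
	BaseURL string
}

// applyEnvFallbacks fills only the fields still at their zero value.
// getenv is injected so the logic can be exercised without touching
// the process environment.
func applyEnvFallbacks(cfg *ServerConfig, getenv func(string) string) {
	if cfg.Port == 0 {
		if v := getenv("PORT"); v != "" { // hypothetical variable name
			if n, err := strconv.Atoi(v); err == nil {
				cfg.Port = n
			}
		}
	}
	if cfg.BaseURL == "" {
		cfg.BaseURL = getenv("BASE_URL") // hypothetical variable name
	}
}

// demo simulates a TOML file that sets Port but leaves BaseURL empty.
func demo() ServerConfig {
	cfg := ServerConfig{Port: 8080}
	env := map[string]string{
		"PORT":     "9090",
		"BASE_URL": "https://search.example.invalid",
	}
	applyEnvFallbacks(&cfg, func(k string) string { return env[k] })
	return cfg
}

func main() {
	cfg := ServerConfig{}
	applyEnvFallbacks(&cfg, os.Getenv)
	fmt.Printf("%+v\n", cfg)
	fmt.Printf("%+v\n", demo())
}
```

In `demo`, `Port` stays at 8080 because the simulated TOML value is non-zero, while `BaseURL` is taken from the environment. One consequence of this rule is that a field cannot be explicitly set to its zero value in TOML while an env var for it is present.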
## Conventions
- Module path: `github.com/metamorphosis-dev/kafka`
- Tests use shared mock helpers in `internal/engines/http_mock_test.go` (`roundTripperFunc`, `httpResponse`)
- Engine implementations are single files under `internal/engines/` (e.g., `wikipedia.go`, `duckduckgo.go`)
- Response merging de-duplicates by `engine|title|url` key; suggestions/corrections are merged as sets
- `MainResult` uses custom `UnmarshalJSON`/`MarshalJSON` to preserve unknown upstream JSON keys
- HTML templates and static files are embedded at build time via `//go:embed` in `internal/views/`
- Structured logging via `log/slog` with JSON handler
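The key-preserving JSON convention for `MainResult` can be illustrated with a reduced sketch. `Result` and `roundTrip` are hypothetical names and the two typed fields are an assumption; the real type has more fields, but the idea of keeping a `raw` map alongside typed fields follows the description above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Result is a hypothetical, reduced stand-in for contracts.MainResult.
type Result struct {
	Title string
	URL   string
	raw   map[string]any // every key seen on input, known or unknown
}

// UnmarshalJSON keeps the full decoded object and copies out known keys.
func (r *Result) UnmarshalJSON(data []byte) error {
	raw := map[string]any{}
	if err := json.Unmarshal(data, &raw); err != nil {
		return err
	}
	r.raw = raw
	r.Title, _ = raw["title"].(string)
	r.URL, _ = raw["url"].(string)
	return nil
}

// MarshalJSON writes the preserved keys first, then lets the typed
// fields overwrite their entries so in-memory edits are not lost.
func (r Result) MarshalJSON() ([]byte, error) {
	out := make(map[string]any, len(r.raw)+2)
	for k, v := range r.raw {
		out[k] = v
	}
	out["title"] = r.Title
	out["url"] = r.URL
	return json.Marshal(out)
}

// roundTrip decodes and re-encodes a single result.
func roundTrip(in string) (string, error) {
	var r Result
	if err := json.Unmarshal([]byte(in), &r); err != nil {
		return "", err
	}
	b, err := json.Marshal(r)
	return string(b), err
}

func main() {
	out, err := roundTrip(`{"title":"Go","url":"https://go.dev","thumbnail":"t.png","score":1.5}`)
	fmt.Println(out, err)
	// {"score":1.5,"thumbnail":"t.png","title":"Go","url":"https://go.dev"} <nil>
}
```

The unknown keys `thumbnail` and `score` survive the round trip. Key order changes because `encoding/json` sorts map keys when marshaling, which is harmless for JSON consumers.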