From be7ba66a09191849827e5178da0ea434661d505e Mon Sep 17 00:00:00 2001
From: Franz Kafka
Date: Sat, 21 Mar 2026 18:51:14 +0000
Subject: [PATCH] docs: complete README rewrite

- Updated to reflect all current features (9 engines, HTMX frontend, Valkey cache, 3-layer rate limiting, CORS, OpenSearch)
- Added quick start for binary, Docker Compose, and NixOS
- Documented all endpoints, API parameters, and response format
- Configuration reference with environment variable table
- Engine table with source and notes
- ASCII architecture diagram
- Docker and NixOS deployment sections
---
 README.md | 252 +++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 200 insertions(+), 52 deletions(-)

diff --git a/README.md b/README.md
index c9e421c..b6c5e36 100644
--- a/README.md
+++ b/README.md
@@ -1,76 +1,224 @@
-## gosearch (SearXNG rewrite in Go)
+# gosearch
 
-This repository contains a standalone Go HTTP service that implements a SearXNG-compatible
-API-first `/search` endpoint and proxies unported engines to an upstream SearXNG instance.
+A privacy-respecting, open metasearch engine written in Go. SearXNG-compatible API with an HTML frontend, designed to be fast, lightweight, and deployable anywhere.
 
-### Endpoints
+**9 engines. No JavaScript. No tracking. One binary.**
 
-- `GET /healthz` -> `OK`
-- `GET|POST /search`
-  - Required form/body parameter: `q`
-  - Optional: `format` (`json` | `csv` | `rss`; default: `json`)
+## Features
 
-### Supported `format=...`
+- **SearXNG-compatible API** — drop-in replacement for existing integrations
+- **9 search engines** — Wikipedia, arXiv, Crossref, Brave, Qwant, DuckDuckGo, GitHub, Reddit, Bing
+- **HTML frontend** — HTMX + Go templates with instant search, dark mode, responsive design
+- **Valkey cache** — optional Redis-compatible caching with configurable TTL
+- **Rate limiting** — three layers: per-IP, burst, and global (all disabled by default)
+- **CORS** — configurable origins for browser-based clients
+- **OpenSearch** — browsers can add gosearch as a search engine from the address bar
+- **Graceful degradation** — individual engine failures don't kill the whole search
+- **Docker** — multi-stage build, ~20MB runtime image
+- **NixOS** — native NixOS module with systemd service
 
-- `json`: SearXNG-style JSON response (`query`, `number_of_results`, `results`, `answers`, `corrections`, `infoboxes`, `suggestions`, `unresponsive_engines`)
-- `csv`: CSV with header `title,url,content,host,engine,score,type`
-- `rss`: RSS 2.0 feed based on the `opensearch_response_rss.xml` template fields
+## Quick Start
 
-### Request parameters
+### Binary
 
-The server accepts SearXNG form parameters (both `GET` query string and `POST` form-encoded):
+```bash
+git clone https://git.ashisgreat.xyz/penal-colony/gosearch.git
+cd gosearch
+go build ./cmd/searxng-go
+./searxng-go -config config.toml
+```
 
-- `q` (required): search query
-- `format` (optional): `json`/`csv`/`rss`
-- `pageno` (optional, default `1`): positive integer
-- `safesearch` (optional, default `0`): integer `0..2`
-- `time_range` (optional): `day|week|month|year` (or omitted/`None`)
-- `timeout_limit` (optional): float, seconds (or omitted/`None`)
-- `language` (optional, default `auto`): `auto` or a BCP-47-ish language code
-- `engines` (optional): comma-separated engine names (e.g. `wikipedia,arxiv`)
-- `categories` / `category_` (optional): used for selecting the initial ported subset
-- `engine_data--=` (optional): per-engine custom parameters
+### Docker Compose
 
-### Environment variables
+```bash
+cp config.example.toml config.toml
+# Edit config.toml — set your Brave API key, etc.
+docker compose up -d
+```
 
-- `PORT` (optional, default `8080`)
-- `UPSTREAM_SEARXNG_URL` (optional for now, but required if you expect unported engines)
-  - When set, unported engines are proxied to `${UPSTREAM_SEARXNG_URL}/search` with `format=json`.
-- `LOCAL_PORTED_ENGINES` (optional, default `wikipedia,arxiv,crossref,braveapi,qwant`)
-  - Controls which engine names are executed locally (Go-native adapters).
-- `HTTP_TIMEOUT` (optional, default `10s`)
-  - Timeout for both local engine API calls and upstream proxy calls.
-- Brave Search API:
-  - `BRAVE_API_KEY` (optional): enables the `braveapi` engine when set
-  - `BRAVE_ACCESS_TOKEN` (optional): if set, requests must include a token
-    (header `Authorization: Bearer `, `X-Search-Token`, `X-Brave-Access-Token`, or form field `token`)
+### NixOS
 
-### Ported vs proxied strategy
+Add to your flake inputs:
 
-1. The service plans which engines should run locally vs upstream using `LOCAL_PORTED_ENGINES`.
-2. It executes local ported engines using Go-native adapters:
-   - `wikipedia`, `arxiv`, `crossref`
-3. Any remaining requested engines are proxied to upstream SearXNG (`format=json`).
-4. Responses are merged:
-   - `results` are de-duplicated by `engine|title|url`
-   - `suggestions`/`corrections` are treated as sets
-   - other arrays are concatenated
+```nix
+inputs.gosearch.url = "git+https://git.ashisgreat.xyz/penal-colony/gosearch.git";
+```
 
-### Running with Nix
+Enable in your configuration:
 
-This repo uses `flake.nix` to provide the Go toolchain.
+```nix
+imports = [ inputs.gosearch.nixosModules.default ];
+
+services.gosearch = {
+  enable = true;
+  openFirewall = true;
+  baseUrl = "https://search.example.com";
+  # config = "/etc/gosearch/config.toml"; # default
+};
+```
+
+Write your config:
+
+```bash
+sudo mkdir -p /etc/gosearch
+sudo cp config.example.toml /etc/gosearch/config.toml
+sudo $EDITOR /etc/gosearch/config.toml
+```
+
+Deploy:
+
+```bash
+sudo nixos-rebuild switch --flake .#
+```
+
+### Nix Development Shell
 
 ```bash
 nix develop
 go test ./...
-go run ./cmd/searxng-go
+go run ./cmd/searxng-go -config config.toml
 ```
 
-Example:
+## Endpoints
+
+| Endpoint | Description |
+|---|---|
+| `GET /` | HTML search page |
+| `GET /search?q=…&format=html` | HTML results (full page or HTMX fragment) |
+| `GET/POST /search` | JSON/CSV/RSS results |
+| `GET /opensearch.xml` | OpenSearch description XML |
+| `GET /healthz` | Health check |
+| `GET /static/*` | Embedded CSS, images, favicon |
+
+## Search API
+
+### Parameters
+
+| Parameter | Default | Description |
+|---|---|---|
+| `q` | — | Search query (required) |
+| `format` | `json` | `json`, `csv`, `rss`, `html` |
+| `pageno` | `1` | Page number |
+| `safesearch` | `0` | Safe search level (0–2) |
+| `time_range` | — | `day`, `week`, `month`, `year` |
+| `language` | `auto` | BCP-47 language code |
+| `engines` | all | Comma-separated engine names |
+
+### Example
 
 ```bash
-export UPSTREAM_SEARXNG_URL="http://127.0.0.1:8888"
-export PORT="8080"
-nix develop -c go run ./cmd/searxng-go
+curl "http://localhost:8080/search?q=golang&format=json&engines=github,duckduckgo"
 ```
 
+### Response (JSON)
+
+```json
+{
+  "query": "golang",
+  "number_of_results": 14768,
+  "results": [
+    {
+      "title": "The Go Programming Language",
+      "url": "https://go.dev/",
+      "content": "Go is an open source programming language...",
+      "engine": "duckduckgo",
+      "score": 1.0,
+      "type": "result"
+    }
+  ],
+  "suggestions": ["golang tutorial", "golang vs rust"],
+  "unresponsive_engines": []
+}
+```
+
+## Configuration
+
+Copy `config.example.toml` to `config.toml` and edit. All settings can also be overridden via environment variables (listed in the example file).
+
+### Key Sections
+
+- **`[server]`** — port, timeout, public base URL for OpenSearch
+- **`[upstream]`** — optional upstream SearXNG proxy for unported engines
+- **`[engines]`** — which engines run locally, engine-specific settings
+- **`[cache]`** — Valkey/Redis address, password, TTL
+- **`[cors]`** — allowed origins and methods
+- **`[rate_limit]`** — per-IP sliding window (30 req/min default)
+- **`[global_rate_limit]`** — server-wide limit (disabled by default)
+- **`[burst_rate_limit]`** — per-IP burst + sustained windows (disabled by default)
+
+### Environment Variables
+
+| Variable | Description |
+|---|---|
+| `PORT` | Listen port (default: 8080) |
+| `BASE_URL` | Public URL for OpenSearch XML |
+| `UPSTREAM_SEARXNG_URL` | Upstream SearXNG instance URL |
+| `LOCAL_PORTED_ENGINES` | Comma-separated local engine list |
+| `HTTP_TIMEOUT` | Upstream request timeout |
+| `BRAVE_API_KEY` | Brave Search API key |
+| `BRAVE_ACCESS_TOKEN` | Gate requests with token |
+| `VALKEY_ADDRESS` | Valkey/Redis address |
+| `VALKEY_PASSWORD` | Valkey/Redis password |
+| `VALKEY_CACHE_TTL` | Cache TTL |
+
+See `config.example.toml` for the full list including rate limiting and CORS variables.
+
+## Engines
+
+| Engine | Source | Notes |
+|---|---|---|
+| Wikipedia | MediaWiki API | General knowledge |
+| arXiv | arXiv API | Academic papers |
+| Crossref | Crossref API | Academic metadata |
+| Brave | Brave Search API | General web (requires API key) |
+| Qwant | Qwant Lite HTML | General web |
+| DuckDuckGo | DDG Lite HTML | General web |
+| GitHub | GitHub Search API v3 | Code and repositories |
+| Reddit | Reddit JSON API | Discussions |
+| Bing | Bing RSS | General web |
+
+Engines not listed in `engines.local_ported` are proxied to an upstream SearXNG instance if `upstream.url` is configured.
+
+## Architecture
+
+```
+┌─────────────────────────────────────┐
+│            HTTP Handler             │
+│  /search / /opensearch.xml          │
+├─────────────────────────────────────┤
+│          Middleware Chain           │
+│  Global → Burst → Per-IP → CORS     │
+├─────────────────────────────────────┤
+│           Search Service            │
+│  Parallel engine execution          │
+│  WaitGroup + graceful degradation   │
+├─────────────────────────────────────┤
+│             Cache Layer             │
+│  Valkey/Redis (optional, no-op if   │
+│  unconfigured)                      │
+├─────────────────────────────────────┤
+│            Engines (×9)             │
+│  Each runs in its own goroutine     │
+│  Failures → unresponsive_engines    │
+└─────────────────────────────────────┘
+```
+
+## Docker
+
+The Dockerfile uses a multi-stage build:
+
+```dockerfile
+# Build stage: golang:1.24-alpine
+# Runtime stage: alpine:3.21 (~20MB)
+# CGO_ENABLED=0 — static binary
+```
+
+```bash
+docker compose up -d
+```
+
+Includes Valkey 8 with health checks out of the box.
+
+## License
+
+MIT