docs: complete README rewrite

- Updated to reflect all current features (9 engines, HTMX frontend, Valkey cache, 3-layer rate limiting, CORS, OpenSearch)
- Added quick start for binary, Docker Compose, and NixOS
- Documented all endpoints, API parameters, and response format
- Configuration reference with environment variable table
- Engine table with source and notes
- ASCII architecture diagram
- Docker and NixOS deployment sections
Franz Kafka 2026-03-21 18:51:14 +00:00
parent 13040268d6
commit be7ba66a09

252
README.md

@@ -1,76 +1,224 @@
## gosearch (SearXNG rewrite in Go)
# gosearch
This repository contains a standalone Go HTTP service that implements a SearXNG-compatible
API-first `/search` endpoint and proxies unported engines to an upstream SearXNG instance.
A privacy-respecting, open metasearch engine written in Go. SearXNG-compatible API with an HTML frontend, designed to be fast, lightweight, and deployable anywhere.
### Endpoints
**9 engines. No JavaScript. No tracking. One binary.**
- `GET /healthz` -> `OK`
- `GET|POST /search`
- Required form/body parameter: `q`
- Optional: `format` (`json` | `csv` | `rss`; default: `json`)
## Features
### Supported `format=...`
- **SearXNG-compatible API** — drop-in replacement for existing integrations
- **9 search engines** — Wikipedia, arXiv, Crossref, Brave, Qwant, DuckDuckGo, GitHub, Reddit, Bing
- **HTML frontend** — HTMX + Go templates with instant search, dark mode, responsive design
- **Valkey cache** — optional Redis-compatible caching with configurable TTL
- **Rate limiting** — three layers: per-IP, burst, and global (all disabled by default)
- **CORS** — configurable origins for browser-based clients
- **OpenSearch** — browsers can add gosearch as a search engine from the address bar
- **Graceful degradation** — individual engine failures don't kill the whole search
- **Docker** — multi-stage build, ~20MB runtime image
- **NixOS** — native NixOS module with systemd service
- `json`: SearXNG-style JSON response (`query`, `number_of_results`, `results`, `answers`, `corrections`, `infoboxes`, `suggestions`, `unresponsive_engines`)
- `csv`: CSV with header `title,url,content,host,engine,score,type`
- `rss`: RSS 2.0 feed based on the `opensearch_response_rss.xml` template fields
## Quick Start
### Request parameters
### Binary
The server accepts SearXNG form parameters (both `GET` query string and `POST` form-encoded):
```bash
git clone https://git.ashisgreat.xyz/penal-colony/gosearch.git
cd gosearch
go build ./cmd/searxng-go
./searxng-go -config config.toml
```
- `q` (required): search query
- `format` (optional): `json`/`csv`/`rss`
- `pageno` (optional, default `1`): positive integer
- `safesearch` (optional, default `0`): integer `0..2`
- `time_range` (optional): `day|week|month|year` (or omitted/`None`)
- `timeout_limit` (optional): float, seconds (or omitted/`None`)
- `language` (optional, default `auto`): `auto` or a BCP-47-ish language code
- `engines` (optional): comma-separated engine names (e.g. `wikipedia,arxiv`)
- `categories` / `category_<name>` (optional): used for selecting the initial ported subset
- `engine_data-<engine>-<key>=<value>` (optional): per-engine custom parameters
### Docker Compose
### Environment variables
```bash
cp config.example.toml config.toml
# Edit config.toml — set your Brave API key, etc.
docker compose up -d
```
- `PORT` (optional, default `8080`)
- `UPSTREAM_SEARXNG_URL` (optional for now, but required if you expect unported engines)
- When set, unported engines are proxied to `${UPSTREAM_SEARXNG_URL}/search` with `format=json`.
- `LOCAL_PORTED_ENGINES` (optional, default `wikipedia,arxiv,crossref,braveapi,qwant`)
- Controls which engine names are executed locally (Go-native adapters).
- `HTTP_TIMEOUT` (optional, default `10s`)
- Timeout for both local engine API calls and upstream proxy calls.
- Brave Search API:
- `BRAVE_API_KEY` (optional): enables the `braveapi` engine when set
- `BRAVE_ACCESS_TOKEN` (optional): if set, requests must include a token
(header `Authorization: Bearer <token>`, `X-Search-Token`, `X-Brave-Access-Token`, or form field `token`)
### NixOS
### Ported vs proxied strategy
Add to your flake inputs:
1. The service plans which engines should run locally vs upstream using `LOCAL_PORTED_ENGINES`.
2. It executes local ported engines using Go-native adapters:
- `wikipedia`, `arxiv`, `crossref`
3. Any remaining requested engines are proxied to upstream SearXNG (`format=json`).
4. Responses are merged:
- `results` are de-duplicated by `engine|title|url`
- `suggestions`/`corrections` are treated as sets
- other arrays are concatenated
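
The merge rules in step 4 can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Result` type and the `dedupeResults` and `unionStrings` helpers are hypothetical names, not taken from the gosearch source, and the real implementation may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// Result is a hypothetical, trimmed-down result type used only for this sketch.
type Result struct {
	Engine string
	Title  string
	URL    string
}

// dedupeResults keeps the first result seen for each "engine|title|url" key,
// preserving the order in which batches and results were supplied.
func dedupeResults(batches ...[]Result) []Result {
	seen := make(map[string]struct{})
	var merged []Result
	for _, batch := range batches {
		for _, r := range batch {
			key := r.Engine + "|" + r.Title + "|" + r.URL
			if _, dup := seen[key]; dup {
				continue
			}
			seen[key] = struct{}{}
			merged = append(merged, r)
		}
	}
	return merged
}

// unionStrings treats suggestion or correction lists as sets.
// The result is sorted only to make the output deterministic (an assumption).
func unionStrings(lists ...[]string) []string {
	set := make(map[string]struct{})
	for _, list := range lists {
		for _, s := range list {
			set[s] = struct{}{}
		}
	}
	out := make([]string, 0, len(set))
	for s := range set {
		out = append(out, s)
	}
	sort.Strings(out)
	return out
}

func main() {
	local := []Result{{Engine: "wikipedia", Title: "Go", URL: "https://example.org/go"}}
	upstream := []Result{
		{Engine: "wikipedia", Title: "Go", URL: "https://example.org/go"}, // same key, dropped
		{Engine: "bing", Title: "Go", URL: "https://example.org/go"},      // different engine, kept
	}
	fmt.Println(len(dedupeResults(local, upstream)))
	fmt.Println(unionStrings([]string{"golang tutorial"}, []string{"golang tutorial", "golang vs rust"}))
}
```

Because the key includes the engine name, the same URL returned by two different engines is kept twice, which matches the `engine|title|url` rule stated above.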
```nix
inputs.gosearch.url = "git+https://git.ashisgreat.xyz/penal-colony/gosearch.git";
```
### Running with Nix
Enable in your configuration:
This repo uses `flake.nix` to provide the Go toolchain.
```nix
imports = [ inputs.gosearch.nixosModules.default ];
services.gosearch = {
enable = true;
openFirewall = true;
baseUrl = "https://search.example.com";
# config = "/etc/gosearch/config.toml"; # default
};
```
Write your config:
```bash
sudo mkdir -p /etc/gosearch
sudo cp config.example.toml /etc/gosearch/config.toml
sudo $EDITOR /etc/gosearch/config.toml
```
Deploy:
```bash
sudo nixos-rebuild switch --flake .#
```
### Nix Development Shell
```bash
nix develop
go test ./...
go run ./cmd/searxng-go
go run ./cmd/searxng-go -config config.toml
```
Example:
## Endpoints
| Endpoint | Description |
|---|---|
| `GET /` | HTML search page |
| `GET /search?q=…&format=html` | HTML results (full page or HTMX fragment) |
| `GET/POST /search` | JSON/CSV/RSS results |
| `GET /opensearch.xml` | OpenSearch description XML |
| `GET /healthz` | Health check |
| `GET /static/*` | Embedded CSS, images, favicon |
## Search API
### Parameters
| Parameter | Default | Description |
|---|---|---|
| `q` | — | Search query (required) |
| `format` | `json` | `json`, `csv`, `rss`, `html` |
| `pageno` | `1` | Page number |
| `safesearch` | `0` | Safe search level (0 to 2) |
| `time_range` | — | `day`, `week`, `month`, `year` |
| `language` | `auto` | BCP-47 language code |
| `engines` | all | Comma-separated engine names |
### Example
```bash
export UPSTREAM_SEARXNG_URL="http://127.0.0.1:8888"
export PORT="8080"
nix develop -c go run ./cmd/searxng-go
curl "http://localhost:8080/search?q=golang&format=json&engines=github,duckduckgo"
```
### Response (JSON)
```json
{
"query": "golang",
"number_of_results": 14768,
"results": [
{
"title": "The Go Programming Language",
"url": "https://go.dev/",
"content": "Go is an open source programming language...",
"engine": "duckduckgo",
"score": 1.0,
"type": "result"
}
],
"suggestions": ["golang tutorial", "golang vs rust"],
"unresponsive_engines": []
}
```
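
A client might decode this response along the following lines. The struct names are illustrative assumptions rather than types from the gosearch codebase; the JSON field names are the ones shown in the sample above. The element shape of `unresponsive_engines` is not shown in the sample, so the sketch keeps it as raw JSON instead of guessing.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SearchResult mirrors one entry of the "results" array in the sample response.
type SearchResult struct {
	Title   string  `json:"title"`
	URL     string  `json:"url"`
	Content string  `json:"content"`
	Engine  string  `json:"engine"`
	Score   float64 `json:"score"`
	Type    string  `json:"type"`
}

// SearchResponse covers only the top-level fields shown in the sample.
type SearchResponse struct {
	Query           string         `json:"query"`
	NumberOfResults int            `json:"number_of_results"`
	Results         []SearchResult `json:"results"`
	Suggestions     []string       `json:"suggestions"`
	// Kept undecoded because the element format is not documented here.
	UnresponsiveEngines json.RawMessage `json:"unresponsive_engines"`
}

// parseSearchResponse decodes a JSON search response body.
func parseSearchResponse(data []byte) (SearchResponse, error) {
	var resp SearchResponse
	err := json.Unmarshal(data, &resp)
	return resp, err
}

func main() {
	sample := []byte(`{
		"query": "golang",
		"number_of_results": 14768,
		"results": [{"title": "The Go Programming Language", "url": "https://go.dev/",
			"engine": "duckduckgo", "score": 1.0, "type": "result"}],
		"suggestions": ["golang tutorial", "golang vs rust"],
		"unresponsive_engines": []
	}`)

	resp, err := parseSearchResponse(sample)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	for _, r := range resp.Results {
		fmt.Printf("%s (%s) via %s\n", r.Title, r.URL, r.Engine)
	}
}
```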
## Configuration
Copy `config.example.toml` to `config.toml` and edit. All settings can also be overridden via environment variables (listed in the example file).
### Key Sections
- **`[server]`** — port, timeout, public base URL for OpenSearch
- **`[upstream]`** — optional upstream SearXNG proxy for unported engines
- **`[engines]`** — which engines run locally, engine-specific settings
- **`[cache]`** — Valkey/Redis address, password, TTL
- **`[cors]`** — allowed origins and methods
- **`[rate_limit]`** — per-IP sliding window (30 req/min default)
- **`[global_rate_limit]`** — server-wide limit (disabled by default)
- **`[burst_rate_limit]`** — per-IP burst + sustained windows (disabled by default)
### Environment Variables
| Variable | Description |
|---|---|
| `PORT` | Listen port (default: 8080) |
| `BASE_URL` | Public URL for OpenSearch XML |
| `UPSTREAM_SEARXNG_URL` | Upstream SearXNG instance URL |
| `LOCAL_PORTED_ENGINES` | Comma-separated local engine list |
| `HTTP_TIMEOUT` | Upstream request timeout |
| `BRAVE_API_KEY` | Brave Search API key |
| `BRAVE_ACCESS_TOKEN` | Gate requests with token |
| `VALKEY_ADDRESS` | Valkey/Redis address |
| `VALKEY_PASSWORD` | Valkey/Redis password |
| `VALKEY_CACHE_TTL` | Cache TTL |
See `config.example.toml` for the full list including rate limiting and CORS variables.
## Engines
| Engine | Source | Notes |
|---|---|---|
| Wikipedia | MediaWiki API | General knowledge |
| arXiv | arXiv API | Academic papers |
| Crossref | Crossref API | Academic metadata |
| Brave | Brave Search API | General web (requires API key) |
| Qwant | Qwant Lite HTML | General web |
| DuckDuckGo | DDG Lite HTML | General web |
| GitHub | GitHub Search API v3 | Code and repositories |
| Reddit | Reddit JSON API | Discussions |
| Bing | Bing RSS | General web |
Engines not listed in `engines.local_ported` are proxied to an upstream SearXNG instance if `upstream.url` is configured.
## Architecture
```
┌─────────────────────────────────────┐
│ HTTP Handler │
│ /search / /opensearch.xml │
├─────────────────────────────────────┤
│ Middleware Chain │
│ Global → Burst → Per-IP → CORS │
├─────────────────────────────────────┤
│ Search Service │
│ Parallel engine execution │
│ WaitGroup + graceful degradation │
├─────────────────────────────────────┤
│ Cache Layer │
│ Valkey/Redis (optional, no-op if │
│ unconfigured) │
├─────────────────────────────────────┤
│ Engines (×9) │
│ Each runs in its own goroutine │
│ Failures → unresponsive_engines │
└─────────────────────────────────────┘
```
## Docker
The Dockerfile uses a multi-stage build:
```dockerfile
# Build stage: golang:1.24-alpine
# Runtime stage: alpine:3.21 (~20MB)
# CGO_ENABLED=0 — static binary
```
```bash
docker compose up -d
```
The Compose setup includes Valkey 8 with health checks out of the box.
## License
MIT