kafka/README.md
Franz Kafka be7ba66a09 docs: complete README rewrite
- Updated to reflect all current features (9 engines, HTMX frontend, Valkey cache, 3-layer rate limiting, CORS, OpenSearch)
- Added quick start for binary, Docker Compose, and NixOS
- Documented all endpoints, API parameters, and response format
- Configuration reference with environment variable table
- Engine table with source and notes
- ASCII architecture diagram
- Docker and NixOS deployment sections
2026-03-21 18:51:14 +00:00

# gosearch
A privacy-respecting, open metasearch engine written in Go. It provides a SearXNG-compatible API and an HTML frontend, and is designed to be fast, lightweight, and deployable as a single binary, a Docker image, or a NixOS module.
**9 engines. No JavaScript. No tracking. One binary.**
## Features
- **SearXNG-compatible API** — drop-in replacement for existing integrations
- **9 search engines** — Wikipedia, arXiv, Crossref, Brave, Qwant, DuckDuckGo, GitHub, Reddit, Bing
- **HTML frontend** — HTMX + Go templates with instant search, dark mode, responsive design
- **Valkey cache** — optional Redis-compatible caching with configurable TTL
- **Rate limiting** — three layers: per-IP, burst, and global (all disabled by default)
- **CORS** — configurable origins for browser-based clients
- **OpenSearch** — browsers can add gosearch as a search engine from the address bar
- **Graceful degradation** — individual engine failures don't kill the whole search
- **Docker** — multi-stage build, ~20MB runtime image
- **NixOS** — native NixOS module with systemd service
## Quick Start
### Binary
```bash
git clone https://git.ashisgreat.xyz/penal-colony/gosearch.git
cd gosearch
go build ./cmd/searxng-go
cp config.example.toml config.toml   # then edit as needed
./searxng-go -config config.toml
```
### Docker Compose
```bash
cp config.example.toml config.toml
# Edit config.toml — set your Brave API key, etc.
docker compose up -d
```
### NixOS
Add to your flake inputs:
```nix
inputs.gosearch.url = "git+https://git.ashisgreat.xyz/penal-colony/gosearch.git";
```
Enable in your configuration:
```nix
imports = [ inputs.gosearch.nixosModules.default ];
services.gosearch = {
  enable = true;
  openFirewall = true;
  baseUrl = "https://search.example.com";
  # config = "/etc/gosearch/config.toml";  # default
};
```
Write your config:
```bash
sudo mkdir -p /etc/gosearch
sudo cp config.example.toml /etc/gosearch/config.toml
sudo $EDITOR /etc/gosearch/config.toml
```
Deploy:
```bash
sudo nixos-rebuild switch --flake .#
```
### Nix Development Shell
```bash
nix develop
go test ./...
go run ./cmd/searxng-go -config config.toml
```
## Endpoints
| Endpoint | Description |
|---|---|
| `GET /` | HTML search page |
| `GET /search?q=…&format=html` | HTML results (full page or HTMX fragment) |
| `GET/POST /search` | JSON/CSV/RSS results |
| `GET /opensearch.xml` | OpenSearch description XML |
| `GET /healthz` | Health check |
| `GET /static/*` | Embedded CSS, images, favicon |
## Search API
### Parameters
| Parameter | Default | Description |
|---|---|---|
| `q` | — | Search query (required) |
| `format` | `json` | `json`, `csv`, `rss`, `html` |
| `pageno` | `1` | Page number |
| `safesearch` | `0` | Safe search level (0-2) |
| `time_range` | — | `day`, `week`, `month`, `year` |
| `language` | `auto` | BCP-47 language code |
| `engines` | all | Comma-separated engine names |
### Example
```bash
curl "http://localhost:8080/search?q=golang&format=json&engines=github,duckduckgo"
```
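The same request can be assembled from Go. This is a minimal sketch, not code from the repository: `buildSearchURL` is a hypothetical helper, and it only covers the `q`, `format`, and `engines` parameters from the table above.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// buildSearchURL is a hypothetical helper (not part of gosearch) that
// assembles a /search URL from the documented query parameters.
// base is the instance address; any path already present on it is replaced.
func buildSearchURL(base, query string, engines []string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/search"

	params := url.Values{}
	params.Set("q", query)
	params.Set("format", "json")
	if len(engines) > 0 {
		// The engines parameter is a comma-separated list of names.
		params.Set("engines", strings.Join(engines, ","))
	}
	// Encode sorts keys and percent-encodes values (the comma becomes %2C).
	u.RawQuery = params.Encode()
	return u.String(), nil
}

func main() {
	s, err := buildSearchURL("http://localhost:8080", "golang", []string{"github", "duckduckgo"})
	if err != nil {
		panic(err)
	}
	// prints http://localhost:8080/search?engines=github%2Cduckduckgo&format=json&q=golang
	fmt.Println(s)
}
```

Using `url.Values` rather than string concatenation keeps queries with spaces or reserved characters correctly escaped.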
### Response (JSON)
```json
{
  "query": "golang",
  "number_of_results": 14768,
  "results": [
    {
      "title": "The Go Programming Language",
      "url": "https://go.dev/",
      "content": "Go is an open source programming language...",
      "engine": "duckduckgo",
      "score": 1.0,
      "type": "result"
    }
  ],
  "suggestions": ["golang tutorial", "golang vs rust"],
  "unresponsive_engines": []
}
```
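A client can decode this response into Go structs. The sketch below mirrors only the fields shown in the example above; the type names are hypothetical, and because the example does not show what an `unresponsive_engines` entry looks like, those entries are kept as raw JSON rather than guessed at.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// searchResult and searchResponse are hypothetical client-side types that
// mirror the fields in the example response; they are not gosearch's own.
type searchResult struct {
	Title   string  `json:"title"`
	URL     string  `json:"url"`
	Content string  `json:"content"`
	Engine  string  `json:"engine"`
	Score   float64 `json:"score"`
	Type    string  `json:"type"`
}

type searchResponse struct {
	Query           string         `json:"query"`
	NumberOfResults int            `json:"number_of_results"`
	Results         []searchResult `json:"results"`
	Suggestions     []string       `json:"suggestions"`
	// Entry shape is not shown in the example, so keep entries undecoded.
	UnresponsiveEngines []json.RawMessage `json:"unresponsive_engines"`
}

// decodeResponse parses a JSON search response body.
func decodeResponse(body []byte) (searchResponse, error) {
	var resp searchResponse
	err := json.Unmarshal(body, &resp)
	return resp, err
}

// sample is a shortened copy of the example response above.
const sample = `{
  "query": "golang",
  "number_of_results": 14768,
  "results": [
    {"title": "The Go Programming Language", "url": "https://go.dev/",
     "content": "Go is an open source programming language...",
     "engine": "duckduckgo", "score": 1.0, "type": "result"}
  ],
  "suggestions": ["golang tutorial", "golang vs rust"],
  "unresponsive_engines": []
}`

func main() {
	resp, err := decodeResponse([]byte(sample))
	if err != nil {
		panic(err)
	}
	for _, r := range resp.Results {
		fmt.Printf("%s <%s> via %s\n", r.Title, r.URL, r.Engine)
	}
}
```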
## Configuration
Copy `config.example.toml` to `config.toml` and edit. All settings can also be overridden via environment variables (listed in the example file).
### Key Sections
- **`[server]`** — port, timeout, public base URL for OpenSearch
- **`[upstream]`** — optional upstream SearXNG proxy for unported engines
- **`[engines]`** — which engines run locally, engine-specific settings
- **`[cache]`** — Valkey/Redis address, password, TTL
- **`[cors]`** — allowed origins and methods
- **`[rate_limit]`** — per-IP sliding window (30 req/min default)
- **`[global_rate_limit]`** — server-wide limit (disabled by default)
- **`[burst_rate_limit]`** — per-IP burst + sustained windows (disabled by default)
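To illustrate what a per-IP sliding window means in practice, here is a minimal sketch. It is not gosearch's implementation: the `slidingWindow` type, its methods, and the `at` helper are hypothetical, and only the 30 requests per minute figure comes from the list above.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Defaults taken from the 30 req/min figure quoted for [rate_limit].
const (
	defaultLimit  = 30
	defaultWindow = time.Minute
)

// slidingWindow is a hypothetical per-IP limiter: it remembers the
// timestamps of accepted requests and counts those inside the window.
type slidingWindow struct {
	mu     sync.Mutex
	limit  int
	window time.Duration
	hits   map[string][]time.Time
}

func newSlidingWindow(limit int, window time.Duration) *slidingWindow {
	return &slidingWindow{limit: limit, window: window, hits: make(map[string][]time.Time)}
}

// allow reports whether a request from ip at time now fits in the window,
// and records it if it does.
func (s *slidingWindow) allow(ip string, now time.Time) bool {
	s.mu.Lock()
	defer s.mu.Unlock()

	cutoff := now.Add(-s.window)
	kept := s.hits[ip][:0] // reuse the backing array while dropping old hits
	for _, t := range s.hits[ip] {
		if t.After(cutoff) {
			kept = append(kept, t)
		}
	}
	if len(kept) >= s.limit {
		s.hits[ip] = kept
		return false
	}
	s.hits[ip] = append(kept, now)
	return true
}

// at builds a timestamp n seconds after the Unix epoch so the demo is
// deterministic instead of depending on the wall clock.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

func main() {
	rl := newSlidingWindow(2, defaultWindow) // limit of 2 to keep the demo short
	fmt.Println(rl.allow("203.0.113.7", at(0)))  // true
	fmt.Println(rl.allow("203.0.113.7", at(10))) // true
	fmt.Println(rl.allow("203.0.113.7", at(20))) // false: two hits in the last minute
	fmt.Println(rl.allow("203.0.113.7", at(61))) // true: the hit at t=0 has expired
}
```

Passing the timestamp in explicitly keeps the limiter easy to test; a real middleware would call it with `time.Now()` and the client address.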
### Environment Variables
| Variable | Description |
|---|---|
| `PORT` | Listen port (default: 8080) |
| `BASE_URL` | Public URL for OpenSearch XML |
| `UPSTREAM_SEARXNG_URL` | Upstream SearXNG instance URL |
| `LOCAL_PORTED_ENGINES` | Comma-separated local engine list |
| `HTTP_TIMEOUT` | Upstream request timeout |
| `BRAVE_API_KEY` | Brave Search API key |
| `BRAVE_ACCESS_TOKEN` | Gate requests with token |
| `VALKEY_ADDRESS` | Valkey/Redis address |
| `VALKEY_PASSWORD` | Valkey/Redis password |
| `VALKEY_CACHE_TTL` | Cache TTL |
See `config.example.toml` for the full list including rate limiting and CORS variables.
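The override behaviour can be pictured as a simple precedence rule: environment variable first, then the config file, then the built-in default. The sketch below shows that rule for `PORT` only. `resolvePort` is a hypothetical function, and the exact precedence and error handling in gosearch are assumptions here.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// defaultPort matches the default listed in the table above.
const defaultPort = 8080

// resolvePort is a hypothetical helper. filePort is the value read from
// config.toml (0 if unset); lookup has the signature of os.LookupEnv so a
// fake environment can be supplied in tests.
func resolvePort(filePort int, lookup func(string) (string, bool)) (int, error) {
	if raw, ok := lookup("PORT"); ok && raw != "" {
		port, err := strconv.Atoi(raw)
		if err != nil {
			return 0, fmt.Errorf("invalid PORT %q: %w", raw, err)
		}
		return port, nil // the environment wins over the file
	}
	if filePort != 0 {
		return filePort, nil
	}
	return defaultPort, nil
}

func main() {
	port, err := resolvePort(0, os.LookupEnv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("listening on port", port)
}
```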
## Engines
| Engine | Source | Notes |
|---|---|---|
| Wikipedia | MediaWiki API | General knowledge |
| arXiv | arXiv API | Academic papers |
| Crossref | Crossref API | Academic metadata |
| Brave | Brave Search API | General web (requires API key) |
| Qwant | Qwant Lite HTML | General web |
| DuckDuckGo | DDG Lite HTML | General web |
| GitHub | GitHub Search API v3 | Code and repositories |
| Reddit | Reddit JSON API | Discussions |
| Bing | Bing RSS | General web |
Engines not listed in `engines.local_ported` are proxied to an upstream SearXNG instance if `upstream.url` is configured.
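The routing rule above can be sketched as a small partition function. `routeEngines` is hypothetical, as is the `stackoverflow` engine name used as an example of an unported engine. What happens to unported engines when no upstream is configured is not stated here, so the sketch simply reports them as skipped.

```go
package main

import "fmt"

// routeEngines is a hypothetical helper that splits the requested engines
// into those run locally, those proxied upstream, and those skipped because
// they are not ported and no upstream URL is set (the last case is an
// assumption, not documented behaviour).
func routeEngines(requested, localPorted []string, upstreamURL string) (local, upstream, skipped []string) {
	ported := make(map[string]bool, len(localPorted))
	for _, name := range localPorted {
		ported[name] = true
	}
	for _, name := range requested {
		switch {
		case ported[name]:
			local = append(local, name)
		case upstreamURL != "":
			upstream = append(upstream, name)
		default:
			skipped = append(skipped, name)
		}
	}
	return local, upstream, skipped
}

func main() {
	local, upstream, skipped := routeEngines(
		[]string{"wikipedia", "stackoverflow"}, // requested via ?engines=
		[]string{"wikipedia", "github"},        // engines.local_ported
		"https://searx.example.org",            // upstream.url (placeholder)
	)
	fmt.Println("local:", local)
	fmt.Println("upstream:", upstream)
	fmt.Println("skipped:", skipped)
}
```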
## Architecture
```
┌─────────────────────────────────────┐
│  HTTP Handler                       │
│  /search / /opensearch.xml          │
├─────────────────────────────────────┤
│  Middleware Chain                   │
│  Global → Burst → Per-IP → CORS     │
├─────────────────────────────────────┤
│  Search Service                     │
│  Parallel engine execution          │
│  WaitGroup + graceful degradation   │
├─────────────────────────────────────┤
│  Cache Layer                        │
│  Valkey/Redis (optional, no-op if   │
│  unconfigured)                      │
├─────────────────────────────────────┤
│  Engines (×9)                       │
│  Each runs in its own goroutine     │
│  Failures → unresponsive_engines    │
└─────────────────────────────────────┘
```
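The search service layer in the diagram (one goroutine per engine, a `WaitGroup`, failures collected instead of aborting) can be sketched as follows. This is an illustration under assumptions, not the project's code: the `engine` interface, `fakeEngine`, `outcome`, and `searchAll` are hypothetical names, results are reduced to URL strings, and timeouts and caching are left out.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"sync"
)

// engine is a hypothetical minimal interface for a search backend.
type engine interface {
	Name() string
	Search(query string) ([]string, error)
}

// fakeEngine returns canned results or a canned error, for demonstration.
type fakeEngine struct {
	name    string
	results []string
	err     error
}

func (f fakeEngine) Name() string                      { return f.name }
func (f fakeEngine) Search(string) ([]string, error) { return f.results, f.err }

var errUnavailable = errors.New("engine unavailable")

// outcome collects merged results and the names of engines that failed.
type outcome struct {
	Results      []string
	Unresponsive []string
}

// searchAll queries every engine in its own goroutine. A failing engine is
// recorded as unresponsive; it does not cancel or fail the overall search.
func searchAll(query string, engines []engine) outcome {
	var (
		wg  sync.WaitGroup
		mu  sync.Mutex // guards out
		out outcome
	)
	for _, e := range engines {
		wg.Add(1)
		go func(e engine) {
			defer wg.Done()
			res, err := e.Search(query)
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				out.Unresponsive = append(out.Unresponsive, e.Name())
				return
			}
			out.Results = append(out.Results, res...)
		}(e)
	}
	wg.Wait()
	// Goroutines finish in arbitrary order; sort for stable output.
	sort.Strings(out.Results)
	sort.Strings(out.Unresponsive)
	return out
}

func main() {
	out := searchAll("golang", []engine{
		fakeEngine{name: "duckduckgo", results: []string{"https://go.dev/"}},
		fakeEngine{name: "bing", err: errUnavailable},
	})
	fmt.Println("results:", out.Results)
	fmt.Println("unresponsive_engines:", out.Unresponsive)
}
```

The mutex keeps the shared slices safe while goroutines append concurrently; a channel-based collector would work equally well.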
## Docker
The Dockerfile uses a multi-stage build:
```dockerfile
# Build stage: golang:1.24-alpine
# Runtime stage: alpine:3.21 (~20MB)
# CGO_ENABLED=0 — static binary
```
```bash
docker compose up -d
```
The Compose setup includes Valkey 8 with health checks out of the box.
## License
MIT