samsa/README.md
Franz Kafka aaac1f8f4b
docs: fix ASCII architecture diagram alignment
2026-03-23 14:17:41 +00:00

samsa

samsa — named for Gregor Samsa, who woke to find himself transformed. You wanted results; you got a metasearch engine.

A privacy-respecting, open metasearch engine written in Go. SearXNG-compatible API with an HTML frontend, designed to be fast, lightweight, and deployable anywhere.

12 engines. No JavaScript required. No tracking. One binary.

Features

  • SearXNG-compatible API — drop-in replacement for existing integrations
  • 12 search engines — Wikipedia, arXiv, Crossref, Brave Search API, Brave (scraping), Qwant, DuckDuckGo, GitHub, Reddit, Bing, Google, YouTube
  • Stack Overflow — bonus engine, not enabled by default
  • HTML frontend — Go templates + HTMX with instant search, dark mode, responsive design
  • Valkey cache — optional Redis-compatible caching with configurable TTL
  • Rate limiting — three layers: per-IP, burst, and global (all disabled by default)
  • CORS — configurable origins for browser-based clients
  • OpenSearch — browsers can add samsa as a search engine from the address bar
  • Graceful degradation — individual engine failures don't kill the whole search
  • Docker — multi-stage build, static binary, ~20MB runtime image
  • NixOS — native NixOS module with systemd service

Quick Start

Binary

git clone https://git.ashisgreat.xyz/penal-colony/samsa.git
cd samsa
go build ./cmd/samsa
./samsa -config config.toml

Docker Compose

cp config.example.toml config.toml
# Edit config.toml — set your Brave API key, YouTube API key, etc.
docker compose up -d

NixOS

Add to your flake inputs:

inputs.samsa.url = "git+https://git.ashisgreat.xyz/penal-colony/samsa.git";

Enable in your configuration:

imports = [ inputs.samsa.nixosModules.default ];

services.samsa = {
  enable = true;
  openFirewall = true;
  baseUrl = "https://search.example.com";
  # config = "/etc/samsa/config.toml";  # default
};

Write your config:

sudo mkdir -p /etc/samsa
sudo cp config.example.toml /etc/samsa/config.toml
sudo $EDITOR /etc/samsa/config.toml

Deploy:

sudo nixos-rebuild switch --flake .#

Nix Development Shell

nix develop
go test ./...
go run ./cmd/samsa -config config.toml

Endpoints

| Endpoint | Description |
| --- | --- |
| GET / | HTML search page |
| GET /search?q=…&format=html | HTML results (full page or HTMX fragment) |
| GET/POST /search | JSON/CSV/RSS results |
| GET /opensearch.xml | OpenSearch description XML |
| GET /healthz | Health check |
| GET /static/* | Embedded CSS, images, favicon |

Search API

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| q | | Search query (required) |
| format | json | json, csv, rss, html |
| pageno | 1 | Page number |
| safesearch | 0 | Safe search level (0-2) |
| time_range | | day, week, month, year |
| language | auto | BCP-47 language code |
| engines | all | Comma-separated engine names |

Example

curl "http://localhost:8080/search?q=golang&format=json&engines=github,duckduckgo"

Response (JSON)

{
  "query": "golang",
  "number_of_results": 14768,
  "results": [
    {
      "title": "The Go Programming Language",
      "url": "https://go.dev/",
      "content": "Go is an open source programming language...",
      "engine": "duckduckgo",
      "score": 1.0,
      "type": "result"
    }
  ],
  "suggestions": ["golang tutorial", "golang vs rust"],
  "unresponsive_engines": []
}
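
For Go clients, the request and response above can be handled along these lines. This is a minimal sketch, not part of samsa itself: the struct and function names (SearchResponse, Result, buildSearchURL, decodeResponse) are hypothetical, and the structs cover only the fields shown in the example response. The element shape of unresponsive_engines is not documented here, so the sketch keeps it as raw JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"strings"
)

// Result mirrors one entry of "results" in the example response (hypothetical type).
type Result struct {
	Title   string  `json:"title"`
	URL     string  `json:"url"`
	Content string  `json:"content"`
	Engine  string  `json:"engine"`
	Score   float64 `json:"score"`
	Type    string  `json:"type"`
}

// SearchResponse mirrors only the top-level fields shown in the example response.
type SearchResponse struct {
	Query               string            `json:"query"`
	NumberOfResults     int               `json:"number_of_results"`
	Results             []Result          `json:"results"`
	Suggestions         []string          `json:"suggestions"`
	UnresponsiveEngines []json.RawMessage `json:"unresponsive_engines"` // element shape left undecoded
}

// buildSearchURL assembles a /search URL using the documented q, format, and engines parameters.
func buildSearchURL(base, query string, engines []string) string {
	v := url.Values{}
	v.Set("q", query)
	v.Set("format", "json")
	if len(engines) > 0 {
		v.Set("engines", strings.Join(engines, ","))
	}
	return strings.TrimRight(base, "/") + "/search?" + v.Encode()
}

// decodeResponse parses a JSON response body into SearchResponse.
func decodeResponse(body []byte) (SearchResponse, error) {
	var resp SearchResponse
	err := json.Unmarshal(body, &resp)
	return resp, err
}

func main() {
	fmt.Println(buildSearchURL("http://localhost:8080", "golang", []string{"github", "duckduckgo"}))

	// A trimmed copy of the example response, decoded without a running server.
	sample := []byte(`{"query":"golang","number_of_results":14768,
		"results":[{"title":"The Go Programming Language","url":"https://go.dev/",
		"engine":"duckduckgo","score":1.0,"type":"result"}],
		"suggestions":["golang tutorial"],"unresponsive_engines":[]}`)
	resp, err := decodeResponse(sample)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(resp.Query, len(resp.Results), resp.Results[0].Engine)
}
```

In a real client, the body passed to decodeResponse would come from an http.Get on the URL returned by buildSearchURL. Note that url.Values percent-encodes the comma in the engines list, which decodes to the same value on the server side.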

Configuration

Copy config.example.toml to config.toml and edit. All settings can also be overridden via environment variables (listed in the example file).

Key Sections

  • [server] — port, timeout, public base URL for OpenSearch
  • [upstream] — optional upstream metasearch proxy for unported engines
  • [engines] — which engines run locally, engine-specific settings
  • [engines.brave] — Brave Search API key
  • [engines.youtube] — YouTube Data API v3 key
  • [cache] — Valkey/Redis address, password, TTL
  • [cors] — allowed origins and methods
  • [rate_limit] — per-IP sliding window (30 req/min default)
  • [global_rate_limit] — server-wide limit (disabled by default)
  • [burst_rate_limit] — per-IP burst + sustained windows (disabled by default)

Environment Variables

| Variable | Description |
| --- | --- |
| PORT | Listen port (default: 8080) |
| BASE_URL | Public URL for OpenSearch XML |
| UPSTREAM_SEARXNG_URL | Upstream instance URL |
| LOCAL_PORTED_ENGINES | Comma-separated local engine list |
| HTTP_TIMEOUT | Upstream request timeout |
| BRAVE_API_KEY | Brave Search API key |
| BRAVE_ACCESS_TOKEN | Gate requests with token |
| YOUTUBE_API_KEY | YouTube Data API v3 key |
| VALKEY_ADDRESS | Valkey/Redis address |
| VALKEY_PASSWORD | Valkey/Redis password |
| VALKEY_CACHE_TTL | Cache TTL |

See config.example.toml for the full list including rate limiting and CORS variables.
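
The override order described above (built-in default, then config file, then environment variable) can be sketched as follows for PORT. This is an illustrative sketch only: resolvePort and its parameters are hypothetical, and how samsa handles an invalid value is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

const defaultPort = 8080 // documented default for PORT

// resolvePort picks the listen port: the environment variable wins over the
// config-file value, which wins over the default. lookup has the same
// signature as os.LookupEnv so it can be replaced in tests.
func resolvePort(lookup func(string) (string, bool), filePort int) (int, error) {
	port := defaultPort
	if filePort != 0 {
		port = filePort
	}
	if raw, ok := lookup("PORT"); ok && raw != "" {
		p, err := strconv.Atoi(raw)
		if err != nil || p < 1 || p > 65535 {
			// Assumption: reject bad values instead of silently falling back.
			return 0, fmt.Errorf("invalid PORT %q", raw)
		}
		port = p
	}
	return port, nil
}

func main() {
	port, err := resolvePort(os.LookupEnv, 0)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println("listening on port", port)
}
```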

Engines

| Engine | Source | Notes |
| --- | --- | --- |
| Wikipedia | MediaWiki API | General knowledge |
| arXiv | arXiv API | Academic papers |
| Crossref | Crossref API | Academic metadata |
| Brave Search API | Brave API | General web (requires API key) |
| Brave | Brave Lite HTML | General web (no key needed) |
| Qwant | Qwant Lite HTML | General web |
| DuckDuckGo | DDG Lite HTML | General web |
| GitHub | GitHub Search API v3 | Code and repositories |
| Reddit | Reddit JSON API | Discussions |
| Bing | Bing RSS | General web |
| Google | GSA User-Agent scraping | General web (no API key) |
| YouTube | YouTube Data API v3 | Videos (requires API key) |
| Stack Overflow | Stack Exchange API | Q&A (registered, not enabled by default) |

Engines not listed in engines.local_ported are proxied to an upstream metasearch instance if upstream.url is configured.

API Keys

Brave Search API and YouTube Data API require keys. If omitted, those engines are silently skipped. Brave Lite (scraping) and Google (GSA UA scraping) work without keys.

Architecture

┌───────────────────────────────────────┐
│             HTTP Handler              │
│      /search  /  /opensearch.xml      │
├───────────────────────────────────────┤
│            Middleware Chain           │
│   Global → Burst → Per-IP → CORS      │
├───────────────────────────────────────┤
│            Search Service             │
│     Parallel engine execution         │
│   WaitGroup + graceful degradation    │
├───────────────────────────────────────┤
│             Cache Layer               │
│  Valkey/Redis (optional; no-op if     │
│              unconfigured)            │
├───────────────────────────────────────┤
│        Engines (×12 default)          │
│    Each runs in its own goroutine     │
│   Failures → unresponsive_engines     │
└───────────────────────────────────────┘

Docker

The Dockerfile uses a multi-stage build that produces a static Go binary running on Alpine Linux:

# Build: golang:1.24-alpine
# Runtime: alpine:3.21 (~20MB)
# CGO_ENABLED=0 — fully static
docker compose up -d

Includes Valkey 8 with health checks out of the box.

Contributing

See docs/CONTRIBUTING.md for a walkthrough of adding a new engine. The interface is two methods: Name() and Search(context, request).

License

AGPLv3