Commit graph

8 commits

Author SHA1 Message Date
a7f594b7fa feat: add YouTube engine with config file and env support
YouTube Data API v3 engine:
- Add YouTubeConfig to EnginesConfig with api_key field
- Add YOUTUBE_API_KEY env override
- Thread *config.Config through search service to factory
- Factory falls back to env vars if config fields are empty
- Update config.example.toml with youtube section

Also update default local_ported to include google and youtube.
2026-03-22 01:57:13 +00:00
e5295fa69d chore: rename project from gosearch to kafka
A search engine named after a man who proved answers don't exist.

Renamed everywhere user-facing:
- Brand name, UI titles, OpenSearch description, CSS filename
- Docker service name, NixOS module (services.kafka)
- Cache key prefix (kafka:), User-Agent strings (kafka/0.1)
- README, config.example.toml, flake.nix descriptions

Kept unchanged (internal):
- Go module path: github.com/ashie/gosearch
- Git repository URL: git.ashisgreat.xyz/penal-colony/gosearch
- Binary entrypoint: cmd/searxng-go
2026-03-21 19:20:47 +00:00
13040268d6 feat: add global and burst rate limiters
Three layers of rate limiting, all disabled by default, opt-in via config:

1. Per-IP (existing): 30 req/min per IP
2. Global: server-wide limit across all IPs
   - Lock-free atomic counter for minimal overhead
   - Returns 503 when exceeded
   - Prevents pool exhaustion from distributed attacks
3. Burst: per-IP burst + sustained windows
   - Blocks rapid-fire abuse within seconds
   - Returns 429 with X-RateLimit-Reason header
   - Example: 5 req/5s burst, 60 req/min sustained

Config:
[global_rate_limit]
requests = 0  # disabled by default
window = "1m"

[burst_rate_limit]
burst = 0  # disabled by default
burst_window = "5s"
sustained = 0
sustained_window = "1m"

Env overrides: GLOBAL_RATE_LIMIT_REQUESTS, GLOBAL_RATE_LIMIT_WINDOW,
BURST_RATE_LIMIT_BURST, BURST_RATE_LIMIT_BURST_WINDOW,
BURST_RATE_LIMIT_SUSTAINED, BURST_RATE_LIMIT_SUSTAINED_WINDOW

Full test coverage: concurrent lock-free test, window expiry, disabled states,
IP isolation, burst vs sustained distinction.
2026-03-21 18:35:31 +00:00
4ec600f6c0 feat: add OpenSearch XML endpoint
- Serve /opensearch.xml with configurable base URL
- Browsers can now add gosearch as a search engine from the address bar
- Configurable via [server] base_url or BASE_URL env var
- XML template embedded in the binary via go:embed
- Added base_url to config.example.toml
2026-03-21 17:40:05 +00:00
df8fe9474b feat: add DuckDuckGo, GitHub, Reddit, and Bing engines
- DuckDuckGo: scrapes Lite HTML endpoint for results
  - Language-aware region mapping (de→de-de, ja→jp-jp, etc.)
  - HTML parser extracts result links and snippets from DDG Lite markup
  - Shared html_helpers.go with extractAttr, stripHTML, htmlUnescape

- GitHub: uses public Search API (repos, sorted by stars)
  - No auth required (10 req/min unauthenticated)
  - Shows stars, language, topics, last updated date
  - Paginated via GitHub's page parameter

- Reddit: uses public JSON search API
  - Respects safesearch (skips over_18 posts)
  - Shows subreddit, score, comment count
  - Links self-posts to the thread URL

- Bing: scrapes web search HTML (b_algo containers)
  - Extracts titles, URLs, and snippets from Bing's result markup
  - Handles Bing's tracking URL encoding

- Updated factory, config defaults, and config.example.toml
- Full test suite: unit tests for all engines, HTML parsing tests,
  region mapping tests, live request tests (skipped in short mode)

9 engines total: wikipedia, arxiv, crossref, braveapi, qwant,
duckduckgo, github, reddit, bing
2026-03-21 16:52:11 +00:00
ebeaeeef21 feat: add CORS and rate limiting middleware
CORS:
- Configurable allowed origins (wildcard "*" or specific domains)
- Handles OPTIONS preflight with configurable methods, headers, max-age
- Exposed headers support for browser API access
- Env override: CORS_ALLOWED_ORIGINS

Rate Limiting:
- In-memory per-IP sliding window counter
- Configurable request limit and time window
- Background goroutine cleans up stale IP entries
- HTTP 429 with Retry-After header when exceeded
- Extracts real IP from X-Forwarded-For and X-Real-IP (proxy-aware)
- Env overrides: RATE_LIMIT_REQUESTS, RATE_LIMIT_WINDOW, RATE_LIMIT_CLEANUP_INTERVAL
- Set requests=0 in config to disable

Both wired into main.go as middleware chain: rate_limit → cors → handler.
Config example updated with [cors] and [rate_limit] sections.
Full test coverage for both middleware packages.
2026-03-21 15:54:52 +00:00
94322ceff4 feat: Valkey cache for search results
- Add internal/cache package using go-redis/v9 (Valkey-compatible)
- Cache keys are deterministic SHA-256 hashes of search parameters
- Cache wraps the Search() method: check cache → miss → execute → store
- Gracefully disabled if Valkey is unreachable or unconfigured
- Configurable TTL (default 5m), address, password, and DB index
- Environment variable overrides: VALKEY_ADDRESS, VALKEY_PASSWORD,
  VALKEY_DB, VALKEY_CACHE_TTL
- Structured JSON logging via slog throughout cache layer
- Refactored service.go: extract executeSearch() from Search() for clarity
- Update config.example.toml with [cache] section
- Add cache package tests (key generation, nop behavior)
2026-03-21 15:43:47 +00:00
8649864971 feat: migrate from env vars to config.toml
- Add internal/config package with TOML parsing (BurntSushi/toml)
- Create config.example.toml documenting all settings
- Update main.go to load config via -config flag (default: config.toml)
- Environment variables remain as fallback overrides for backward compat
- Config file values are used as defaults; env vars override when set
- Add comprehensive tests for file loading, defaults, and env overrides
- Add config.toml to .gitignore (secrets stay local)
2026-03-21 14:56:00 +00:00