Commit graph

14 commits

Author SHA1 Message Date
9e95ce7b53 perf: shared http.Transport with tuned connection pooling
Add internal/httpclient package as a singleton RoundTripper used by
all outbound engine requests (search, engines, autocomplete, upstream).

Key Transport settings:
- MaxIdleConnsPerHost = 20  (up from Go default of 2)
- MaxIdleConns = 100
- IdleConnTimeout = 90s
- DialContext timeout = 5s

Previously, the default transport limited each host to 2 idle connections,
forcing a new TCP+TLS handshake on every search for each engine. With
12 engines hitting the same upstream hosts in parallel, connections
were constantly recycled. Now warm connections are reused across all
goroutines and requests.
2026-03-23 14:26:26 +00:00
015f8b357a fix: rename remaining kafka references to samsa
Some checks failed
Build and Push Docker Image / build-and-push (push) Failing after 7s
Mirror to GitHub / mirror (push) Failing after 5s
Tests / test (push) Successful in 26s
- OpenSearch description: 'Search results for "..." - kafka' → samsa
- Test error message: 'kafka trial' → samsa trial
2026-03-23 07:25:30 +00:00
8e9aae062b rename: kafka → samsa
Some checks failed
Build and Push Docker Image / build-and-push (push) Failing after 11s
Mirror to GitHub / mirror (push) Failing after 5s
Tests / test (push) Successful in 42s
Full project rename from kafka to samsa (after Gregor Samsa, who
woke one morning from uneasy dreams to find himself transformed).

- Module: github.com/metamorphosis-dev/kafka → samsa
- Binary: cmd/kafka/ → cmd/samsa/
- CSS: kafka.css → samsa.css
- UI: all 'kafka' product names, titles, localStorage keys → samsa
- localStorage keys: kafka-theme → samsa-theme, kafka-engines → samsa-engines
- OpenSearch: ShortName, LongName, description, URLs updated
- AGPL headers: 'kafka' → 'samsa'
- Docs, configs, examples updated
- Cache key prefix: kafka: → samsa:
2026-03-22 23:44:55 +00:00
2b072e4de3 feat: add image search with Bing, DuckDuckGo, and Qwant engines
Some checks failed
Build and Push Docker Image / build-and-push (push) Failing after 6s
Mirror to GitHub / mirror (push) Failing after 3s
Tests / test (push) Successful in 25s
Three new image search engines:
- bing_images: Bing Images via RSS endpoint
- ddg_images: DuckDuckGo Images via VQD API
- qwant_images: Qwant Images via v3 search API

Frontend:
- Image grid layout with responsive columns
- image_item template with thumbnail, title, and source metadata
- Hover animations and lazy loading
- Grid activates automatically when category=images

Backend:
- category=images routes to image engines via planner
- Image engines registered in factory and engine allowlist
- extractImgSrc helper for parsing thumbnail URLs from HTML
- IsImageSearch flag on PageData for template layout switching
2026-03-22 16:49:24 +00:00
da367a1bfd security: harden against SAST findings (criticals through mediums)
Critical:
- Validate baseURL/sourceURL/upstreamURL at config load time
  (prevents XML injection, XSS, SSRF via config/env manipulation)
- Use xml.Escape for OpenSearch XML template interpolation

High:
- Add security headers middleware (CSP, X-Frame-Options, HSTS, etc.)
- Sanitize result URLs to reject javascript:/data: schemes
- Sanitize infobox img_src against dangerous URL schemes
- Default CORS to deny-all (was wildcard *)

Medium:
- Rate limiter: X-Forwarded-For only trusted from configured proxies
- Validate engine names against known registry allowlist
- Add 1024-char max query length
- Sanitize upstream error messages (strip raw response bodies)
- Upstream client validates URL scheme (http/https only)

Test updates:
- Update extractIP tests for new trusted proxy behavior
2026-03-22 16:22:27 +00:00
5b942a5fd6 refactor: clean up verbose and redundant comments
Some checks failed
Build and Push Docker Image / build-and-push (push) Failing after 7s
Mirror to GitHub / mirror (push) Failing after 3s
Tests / test (push) Successful in 25s
Trim or remove comments that:
- State the obvious (function names already convey purpose)
- Repeat what the code clearly shows
- Are excessively long without adding value

Keep comments that explain *why*, not *what*.
2026-03-22 11:10:50 +00:00
7be03b4017 license: change from MIT to AGPLv3
Update LICENSE file and add AGPL header to all source files.

AGPLv3 ensures that if someone runs Kafka as a network service and
modifies it, they must release their source code under the same license.
2026-03-22 08:27:23 +00:00
a7f594b7fa feat: add YouTube engine with config file and env support
YouTube Data API v3 engine:
- Add YouTubeConfig to EnginesConfig with api_key field
- Add YOUTUBE_API_KEY env override
- Thread *config.Config through search service to factory
- Factory falls back to env vars if config fields are empty
- Update config.example.toml with youtube section

Also update default local_ported to include google and youtube.
2026-03-22 01:57:13 +00:00
fcd9be16df refactor: remove SearXNG references and rename binary to kafka
- Rename cmd/searxng-go to cmd/kafka
- Remove all SearXNG references from source comments while keeping
  "SearXNG-compatible API" in user-facing docs
- Update binary paths in README, CLAUDE.md, and Dockerfile
- Update log message to "kafka starting"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 01:47:03 +01:00
6346fb7155 chore: update Go module path to github.com/metamorphosis-dev/kafka
Module path now matches the GitHub mirror location.
All internal imports updated across 35+ files.
2026-03-21 19:42:01 +00:00
28b61ff251 feat: HTMX + Go Templates HTML frontend
- Add internal/views/ package with embedded templates and static files
- Go html/template with SearXNG-compatible CSS class names
- Dark mode via prefers-color-scheme, responsive layout, print styles
- HTMX integration:
  - Debounced instant search (500ms) on the search input
  - Form submission targets #results via hx-post
  - Pagination buttons are HTMX-powered (swap results div only)
  - HX-Request header detection for fragment vs full page rendering
- Template structure:
  - base.html: full page layout with HTMX script, favicon, CSS
  - index.html: homepage with centered search box
  - results.html: full results page (wraps base + results_inner)
  - results_inner.html: results fragment (HTMX partial + sidebar + pagination)
  - result_item.html: reusable result article partial
- Smart format detection: browser requests (Accept: text/html) default to HTML,
  API clients default to JSON
- Static files served at /static/ from embedded FS (CSS, favicon SVG)
- Index route at GET /
- Empty query on HTML format redirects to homepage
- Custom CSS (gosearch.css): clean, minimal, privacy-respecting aesthetic
  with light/dark mode, responsive breakpoints, print stylesheet
- Add views package tests
2026-03-21 16:10:42 +00:00
94322ceff4 feat: Valkey cache for search results
- Add internal/cache package using go-redis/v9 (Valkey-compatible)
- Cache keys are deterministic SHA-256 hashes of search parameters
- Cache wraps the Search() method: check cache → miss → execute → store
- Gracefully disabled if Valkey is unreachable or unconfigured
- Configurable TTL (default 5m), address, password, and DB index
- Environment variable overrides: VALKEY_ADDRESS, VALKEY_PASSWORD,
  VALKEY_DB, VALKEY_CACHE_TTL
- Structured JSON logging via slog throughout cache layer
- Refactored service.go: extract executeSearch() from Search() for clarity
- Update config.example.toml with [cache] section
- Add cache package tests (key generation, nop behavior)
2026-03-21 15:43:47 +00:00
385a7acab7 feat: concurrent engine execution with graceful degradation
- Run all local engines in parallel using goroutines + sync.WaitGroup
- Individual engine failures are captured as unresponsive_engines entries
  instead of aborting the entire search request
- Context cancellation is respected: cancelled engines report as unresponsive
- Upstream proxy failure is also gracefully handled (single unresponsive entry)
- Extract unresponsiveResponse() and emptyResponse() helpers for consistency
- Add comprehensive tests:
  - ConcurrentEngines: verifies parallelism (2x100ms engines complete in ~100ms)
  - GracefulDegradation: one engine fails, one succeeds, both represented
  - AllEnginesFail: no error returned, all engines in unresponsive_engines
  - ContextCancellation: engine respects context timeout, reports unresponsive
2026-03-21 15:39:00 +00:00
dc44837219 feat: build Go-based SearXNG-compatible search service
Implement an API-first Go rewrite with local engine adapters, upstream fallback, and Nix-based tooling so searches can run without matching the original UI while preserving response compatibility.

Made-with: Cursor
2026-03-20 20:34:08 +01:00