Commit graph

9 commits

Author SHA1 Message Date
b3e3123612 security: fix build errors, add honest Google UA, sanitize error msgs
- Fix config validation: upstream URLs allow private IPs (self-hosted)
- Fix util.SafeURLScheme to return parsed URL
- Replace spoofed GSA User-Agent with honest Kafka UA
- Sanitize all engine error messages (strip response bodies)
- Replace unused body reads with io.Copy(io.Discard, ...) for reuse
- Fix pre-existing braveapi_test using wrong struct type
- Fix ratelimit test reference to limiter variable
- Update ratelimit tests for new trusted proxy behavior
2026-03-22 16:27:49 +00:00
da367a1bfd security: harden against SAST findings (criticals through mediums)
Critical:
- Validate baseURL/sourceURL/upstreamURL at config load time
  (prevents XML injection, XSS, SSRF via config/env manipulation)
- Use xml.Escape for OpenSearch XML template interpolation

High:
- Add security headers middleware (CSP, X-Frame-Options, HSTS, etc.)
- Sanitize result URLs to reject javascript:/data: schemes
- Sanitize infobox img_src against dangerous URL schemes
- Default CORS to deny-all (was wildcard *)

Medium:
- Rate limiter: X-Forwarded-For only trusted from configured proxies
- Validate engine names against known registry allowlist
- Add 1024-char max query length
- Sanitize upstream error messages (strip raw response bodies)
- Upstream client validates URL scheme (http/https only)

Test updates:
- Update extractIP tests for new trusted proxy behavior
2026-03-22 16:22:27 +00:00
2d22a8cdbb feat: add Brave web search scraper engine
New brave.go: scrapes https://search.brave.com directly.
Extracts title, URL, snippet, and favicon from Brave's HTML.
No API key required.

Rename existing BraveAPIEngine (was BraveEngine) to avoid collision
with the new scraper. API engine stays as 'braveapi', scraper as 'brave'.
2026-03-22 16:01:49 +00:00
8c6d056f52 fix(engines): cap Brave API offset to 9 to avoid 422 error
Brave API only supports offset values 0-9. When pageno > 1 with
resultsPerPage=20, offset exceeded this limit causing 422 errors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 12:41:42 +01:00
5b942a5fd6 refactor: clean up verbose and redundant comments
Some checks failed
Build and Push Docker Image / build-and-push (push) Failing after 7s
Mirror to GitHub / mirror (push) Failing after 3s
Tests / test (push) Successful in 25s
Trim or remove comments that:
- State the obvious (function names already convey purpose)
- Repeat what the code clearly shows
- Are excessively long without adding value

Keep comments that explain *why*, not *what*.
2026-03-22 11:10:50 +00:00
7be03b4017 license: change from MIT to AGPLv3
Update LICENSE file and add AGPL header to all source files.

AGPLv3 ensures that if someone runs Kafka as a network service and
modifies it, they must release their source code under the same license.
2026-03-22 08:27:23 +00:00
fcd9be16df refactor: remove SearXNG references and rename binary to kafka
- Rename cmd/searxng-go to cmd/kafka
- Remove all SearXNG references from source comments while keeping
  "SearXNG-compatible API" in user-facing docs
- Update binary paths in README, CLAUDE.md, and Dockerfile
- Update log message to "kafka starting"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 01:47:03 +01:00
6346fb7155 chore: update Go module path to github.com/metamorphosis-dev/kafka
Module path now matches the GitHub mirror location.
All internal imports updated across 35+ files.
2026-03-21 19:42:01 +00:00
dc44837219 feat: build Go-based SearXNG-compatible search service
Implement an API-first Go rewrite with local engine adapters, upstream fallback, and Nix-based tooling so searches can run without matching the original UI while preserving response compatibility.

Made-with: Cursor
2026-03-20 20:34:08 +01:00