YouTube Data API v3 engine:
- Add YouTubeConfig to EnginesConfig with api_key field
- Add YOUTUBE_API_KEY env override
- Thread *config.Config through search service to factory
- Factory falls back to env vars if config fields are empty
- Update config.example.toml with youtube section
Also update default local_ported to include google and youtube.
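
The config-then-environment fallback described above can be sketched
roughly as follows. This is a minimal illustration, not the actual factory
code: the APIKey field name, the toml tag, and the resolveYouTubeKey helper
are assumptions; only the api_key field and YOUTUBE_API_KEY come from this
change.

```go
package main

import (
	"fmt"
	"os"
)

// YouTubeConfig mirrors the youtube section of the config file.
// The Go field name and the toml tag are assumptions for illustration.
type YouTubeConfig struct {
	APIKey string `toml:"api_key"`
}

// resolveYouTubeKey (hypothetical helper) returns the configured key and
// falls back to the YOUTUBE_API_KEY environment variable when the config
// field is empty. getenv is injected so the lookup can be tested.
func resolveYouTubeKey(cfg YouTubeConfig, getenv func(string) string) string {
	if cfg.APIKey != "" {
		return cfg.APIKey
	}
	return getenv("YOUTUBE_API_KEY")
}

func main() {
	key := resolveYouTubeKey(YouTubeConfig{}, os.Getenv)
	if key == "" {
		fmt.Println("youtube: no API key configured, engine disabled")
		return
	}
	fmt.Println("youtube: API key found")
}
```

Passing the lookup function in, rather than calling os.Getenv directly,
keeps the fallback order testable without touching the process environment.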
Uses the official YouTube Data API v3. Requires YOUTUBE_API_KEY
environment variable (free from Google Cloud Console).
Returns video results with title, description, channel, publish
date, and thumbnail URL. Falls back gracefully if no API key.
Google engine, following the SearXNG approach: send a User-Agent from a
pool of Google Search App (GSA) strings, the identifiers sent by Google's
iOS app, which Google appears to accept where desktop strings get blocked.
Key techniques:
- GSA User-Agent (iPhone OS + GSA/ version) instead of Chrome desktop
- CONSENT=YES+ cookie to bypass EU consent wall
- Parse /url?q= redirector URLs (unquote + strip &sa= params)
- div.MjjYud class for result containers (SearXNG selector)
- data-sncf divs for snippets
- detect sorry.google.com blocks
- Suggestions from ouy7Mc class cards
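
A minimal sketch of the /url?q= redirector handling listed above (unquote
and cut at &sa=). The function name unwrapGoogleURL is hypothetical, and
the real parser may handle more cases than this.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// unwrapGoogleURL (hypothetical name) extracts the target from a Google
// "/url?q=" redirector href. Other hrefs are returned unchanged.
func unwrapGoogleURL(href string) string {
	const prefix = "/url?q="
	if !strings.HasPrefix(href, prefix) {
		return href
	}
	rest := href[len(prefix):]
	// Cut at "&sa=", which also drops every parameter that follows it.
	if i := strings.Index(rest, "&sa="); i >= 0 {
		rest = rest[:i]
	}
	target, err := url.QueryUnescape(rest)
	if err != nil {
		return href // keep the raw href rather than a half-decoded URL
	}
	return target
}

func main() {
	fmt.Println(unwrapGoogleURL("/url?q=https://example.com/a%3Fb%3D1&sa=U&ved=xyz"))
}
```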
- Rename cmd/searxng-go to cmd/kafka
- Remove all SearXNG references from source comments while keeping
  "SearXNG-compatible API" in user-facing docs
- Update binary paths in README, CLAUDE.md, and Dockerfile
- Update log message to "kafka starting"
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
duckduckgo, github, reddit, and bing were registered in factory.go
and config.go but missing from planner.go, so they were silently
skipped when LOCAL_PORTED_ENGINES was not set.
DuckDuckGo:
- Fixed parser to handle single-quoted class attributes (class='result-link')
- Decode DDG tracking URLs (uddg= parameter) to extract real URLs
- Match snippet extraction to actual DDG Lite HTML structure (</td> terminator)
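
The uddg= decoding above can be sketched like this. It assumes the tracking
link is a URL carrying the real target in a uddg query parameter; the
function name decodeDDGURL and the sample link are illustrative only.

```go
package main

import (
	"fmt"
	"net/url"
)

// decodeDDGURL (hypothetical name) returns the real target of a DDG
// tracking link by reading its uddg query parameter. Links without a
// uddg parameter, or links that fail to parse, are returned unchanged.
func decodeDDGURL(href string) string {
	u, err := url.Parse(href)
	if err != nil {
		return href
	}
	// Query() percent-decodes the value, so no extra unescape is needed.
	if target := u.Query().Get("uddg"); target != "" {
		return target
	}
	return href
}

func main() {
	fmt.Println(decodeDDGURL("//duckduckgo.com/l/?uddg=https%3A%2F%2Fexample.com%2Fpage&rut=abc"))
}
```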
Bing:
- Switched from HTML scraping (blocked by JS detection) to RSS endpoint
  (?format=rss), which returns parseable XML
- Added JSON API response parsing as fallback
- Returns graceful unresponsive_engines entry when blocked
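
A rough sketch of parsing the RSS response with encoding/xml. It assumes
the feed follows the usual RSS 2.0 layout (channel/item with title, link,
description); the Result type and parseBingRSS name are hypothetical and
not taken from the engine code.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Result is a hypothetical, simplified search result.
type Result struct {
	Title, URL, Snippet string
}

// rssFeed maps the assumed RSS 2.0 structure: rss > channel > item.
type rssFeed struct {
	Channel struct {
		Items []struct {
			Title       string `xml:"title"`
			Link        string `xml:"link"`
			Description string `xml:"description"`
		} `xml:"item"`
	} `xml:"channel"`
}

// parseBingRSS decodes the feed and skips items that have no link.
func parseBingRSS(data []byte) ([]Result, error) {
	var feed rssFeed
	if err := xml.Unmarshal(data, &feed); err != nil {
		return nil, err
	}
	results := make([]Result, 0, len(feed.Channel.Items))
	for _, it := range feed.Channel.Items {
		if it.Link == "" {
			continue
		}
		results = append(results, Result{Title: it.Title, URL: it.Link, Snippet: it.Description})
	}
	return results, nil
}

func main() {
	sample := []byte(`<rss version="2.0"><channel><item>
<title>Go</title><link>https://go.dev/</link><description>The Go language</description>
</item></channel></rss>`)
	results, err := parseBingRSS(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, r := range results {
		fmt.Println(r.Title, r.URL)
	}
}
```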
Live test results:
- DuckDuckGo: 9 results ✅
- GitHub: 10 results (14,768 total) ✅
- Bing: 10 results via RSS ✅
- Reddit: skipped (403 from sandbox, needs browser-like context)
- DuckDuckGo: scrapes Lite HTML endpoint for results
  - Language-aware region mapping (de→de-de, ja→jp-jp, etc.)
  - HTML parser extracts result links and snippets from DDG Lite markup
  - Shared html_helpers.go with extractAttr, stripHTML, htmlUnescape
- GitHub: uses public Search API (repos, sorted by stars)
  - No auth required (10 req/min unauthenticated)
  - Shows stars, language, topics, last updated date
  - Paginated via GitHub's page parameter
- Reddit: uses public JSON search API
  - Respects safesearch (skips over_18 posts)
  - Shows subreddit, score, comment count
  - Links self-posts to the thread URL
- Bing: scrapes web search HTML (b_algo containers)
  - Extracts titles, URLs, and snippets from Bing's result markup
  - Handles Bing's tracking URL encoding
- Updated factory, config defaults, and config.example.toml
- Full test suite: unit tests for all engines, HTML parsing tests,
region mapping tests, live request tests (skipped in short mode)
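
The DuckDuckGo region mapping mentioned above might look roughly like the
sketch below. Only the de→de-de and ja→jp-jp pairs come from this change;
the ddgRegion name, the subtag trimming, and the "wt-wt" (no region)
fallback are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// ddgRegions maps a primary language subtag to a DuckDuckGo region code.
// Only the two entries named in the commit message are listed here.
var ddgRegions = map[string]string{
	"de": "de-de",
	"ja": "jp-jp",
}

// ddgRegion (hypothetical name) resolves a language tag to a region code.
// Unknown languages fall back to "wt-wt", assumed here to mean no region.
func ddgRegion(lang string) string {
	lang = strings.ToLower(strings.TrimSpace(lang))
	// Reduce tags like "de-AT" or "ja_JP" to the primary subtag.
	if i := strings.IndexAny(lang, "-_"); i >= 0 {
		lang = lang[:i]
	}
	if region, ok := ddgRegions[lang]; ok {
		return region
	}
	return "wt-wt"
}

func main() {
	for _, lang := range []string{"de", "ja-JP", "xx"} {
		fmt.Println(lang, ddgRegion(lang))
	}
}
```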
9 engines total: wikipedia, arxiv, crossref, braveapi, qwant,
duckduckgo, github, reddit, bing
Implement an API-first Go rewrite with local engine adapters, upstream fallback, and Nix-based tooling. Searches run without reproducing the original UI, while response formats stay compatible.
Made-with: Cursor