kafka

Author	SHA1	Message	Date
Franz Kafka	a7f594b7fa	feat: add YouTube engine with config file and env support YouTube Data API v3 engine: - Add YouTubeConfig to EnginesConfig with api_key field - Add YOUTUBE_API_KEY env override - Thread *config.Config through search service to factory - Factory falls back to env vars if config fields are empty - Update config.example.toml with youtube section Also update default local_ported to include google and youtube.	2026-03-22 01:57:13 +00:00
Franz Kafka	1689cab9bd	feat: add YouTube engine via Data API v3 Uses the official YouTube Data API v3. Requires YOUTUBE_API_KEY environment variable (free from Google Cloud Console). Returns video results with title, description, channel, publish date, and thumbnail URL. Falls back gracefully if no API key.	2026-03-22 01:53:19 +00:00
Franz Kafka	31fdd5e06f	Merge branch 'feat/google-engine', remote-tracking branch 'origin/main'	2026-03-22 01:35:20 +00:00
Franz Kafka	4be9cf2725	feat: add Google engine using GSA User-Agent scraping SearXNG approach: use Google Search Appliance (GSA) User-Agent pool — these are whitelisted enterprise identifiers Google trusts. Key techniques: - GSA User-Agent (iPhone OS + GSA/ version) instead of Chrome desktop - CONSENT=YES+ cookie to bypass EU consent wall - Parse /url?q= redirector URLs (unquote + strip &sa= params) - div.MjjYud class for result containers (SearXNG selector) - data-sncf divs for snippets - detect sorry.google.com blocks - Suggestions from ouy7Mc class cards	2026-03-22 01:29:46 +00:00
ashisgreat22	fcd9be16df	refactor: remove SearXNG references and rename binary to kafka - Rename cmd/searxng-go to cmd/kafka - Remove all SearXNG references from source comments while keeping "SearXNG-compatible API" in user-facing docs - Update binary paths in README, CLAUDE.md, and Dockerfile - Update log message to "kafka starting" Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-22 01:47:03 +01:00
ashisgreat22	0d3f3c19d7	fix: add missing engines to defaultPortedEngines duckduckgo, github, reddit, and bing were registered in factory.go and config.go but missing from planner.go, so they were silently skipped when LOCAL_PORTED_ENGINES was not set. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-22 00:13:57 +01:00
Franz Kafka	6346fb7155	chore: update Go module path to github.com/metamorphosis-dev/kafka Module path now matches the GitHub mirror location. All internal imports updated across 35+ files.	2026-03-21 19:42:01 +00:00
Franz Kafka	e5295fa69d	chore: rename project from gosearch to kafka A search engine named after a man who proved answers don't exist. Renamed everywhere user-facing: - Brand name, UI titles, OpenSearch description, CSS filename - Docker service name, NixOS module (services.kafka) - Cache key prefix (kafka:), User-Agent strings (kafka/0.1) - README, config.example.toml, flake.nix descriptions Kept unchanged (internal): - Go module path: github.com/ashie/gosearch - Git repository URL: git.ashisgreat.xyz/penal-colony/gosearch - Binary entrypoint: cmd/searxng-go	2026-03-21 19:20:47 +00:00
Franz Kafka	a8ab29b23a	fix: fix DDG and Bing parsers — verified with live tests DuckDuckGo: - Fixed parser to handle single-quoted class attributes (class='result-link') - Decode DDG tracking URLs (uddg= parameter) to extract real URLs - Match snippet extraction to actual DDG Lite HTML structure (</td> terminator) Bing: - Switched from HTML scraping (blocked by JS detection) to RSS endpoint (?format=rss) which returns parseable XML - Added JSON API response parsing as fallback - Returns graceful unresponsive_engines entry when blocked Live test results: - DuckDuckGo: 9 results ✅ - GitHub: 10 results (14,768 total) ✅ - Bing: 10 results via RSS ✅ - Reddit: skipped (403 from sandbox, needs browser-like context)	2026-03-21 16:57:02 +00:00
Franz Kafka	df8fe9474b	feat: add DuckDuckGo, GitHub, Reddit, and Bing engines - DuckDuckGo: scrapes Lite HTML endpoint for results - Language-aware region mapping (de→de-de, ja→jp-jp, etc.) - HTML parser extracts result links and snippets from DDG Lite markup - Shared html_helpers.go with extractAttr, stripHTML, htmlUnescape - GitHub: uses public Search API (repos, sorted by stars) - No auth required (10 req/min unauthenticated) - Shows stars, language, topics, last updated date - Paginated via GitHub's page parameter - Reddit: uses public JSON search API - Respects safesearch (skips over_18 posts) - Shows subreddit, score, comment count - Links self-posts to the thread URL - Bing: scrapes web search HTML (b_algo containers) - Extracts titles, URLs, and snippets from Bing's result markup - Handles Bing's tracking URL encoding - Updated factory, config defaults, and config.example.toml - Full test suite: unit tests for all engines, HTML parsing tests, region mapping tests, live request tests (skipped in short mode) 9 engines total: wikipedia, arxiv, crossref, braveapi, qwant, duckduckgo, github, reddit, bing	2026-03-21 16:52:11 +00:00
Franz Kafka	dc44837219	feat: build Go-based SearXNG-compatible search service Implement an API-first Go rewrite with local engine adapters, upstream fallback, and Nix-based tooling so searches can run without matching the original UI while preserving response compatibility. Made-with: Cursor	2026-03-20 20:34:08 +01:00
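
Commits a8ab29b23a and 4be9cf2725 both describe recovering the real destination from a redirector link (DuckDuckGo's uddg= parameter, Google's /url?q= form with trailing &sa= parameters). The Go sketch below shows one way that unwrapping could look. It is not taken from the repository: the function name unwrapRedirect and the sample URLs are hypothetical, and only the standard net/url package is used.

```go
package main

import (
	"fmt"
	"net/url"
)

// unwrapRedirect is a hypothetical helper (not from the kafka repository).
// It returns the destination carried in the query parameter `param` of a
// redirector link, or the input unchanged when the link cannot be parsed
// or does not carry that parameter.
func unwrapRedirect(raw, param string) string {
	u, err := url.Parse(raw)
	if err != nil {
		return raw
	}
	// Query() percent-decodes values, so the result is already unquoted.
	// Sibling parameters such as &sa= or &rut= are dropped because only
	// the named parameter is read.
	target := u.Query().Get(param)
	if target == "" {
		return raw
	}
	return target
}

func main() {
	// Sample inputs are made up for illustration.
	fmt.Println(unwrapRedirect("//duckduckgo.com/l/?uddg=https%3A%2F%2Fexample.org%2Fdocs&rut=abc", "uddg"))
	fmt.Println(unwrapRedirect("/url?q=https://example.com/page&sa=U&ved=xyz", "q"))
	fmt.Println(unwrapRedirect("https://example.net/direct", "q"))
}
```

Returning the input unchanged on failure keeps direct (non-redirector) links usable, which seems a reasonable default for a result parser; the actual engines may handle this differently.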

11 commits
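
Commit a7f594b7fa states that the engine factory falls back to environment variables when config fields are empty, and 1689cab9bd says the YouTube engine falls back gracefully without an API key. A minimal Go sketch of that lookup order follows. The names YouTubeConfig and YOUTUBE_API_KEY come from the commit messages; the Go field name APIKey, the helper resolveAPIKey, and the injected getenv function are assumptions made for illustration and testability.

```go
package main

import (
	"fmt"
	"os"
)

// YouTubeConfig mirrors the config section named in the commit message.
// The Go field name APIKey is an assumption; the log only mentions an
// api_key field.
type YouTubeConfig struct {
	APIKey string
}

// resolveAPIKey is a hypothetical helper: it prefers the value from the
// config file and falls back to the YOUTUBE_API_KEY environment variable
// when the config field is empty. getenv is injected so the lookup can be
// exercised without touching the real environment.
func resolveAPIKey(cfg YouTubeConfig, getenv func(string) string) string {
	if cfg.APIKey != "" {
		return cfg.APIKey
	}
	return getenv("YOUTUBE_API_KEY")
}

func main() {
	key := resolveAPIKey(YouTubeConfig{}, os.Getenv)
	if key == "" {
		// One possible graceful fallback: leave the engine unregistered.
		fmt.Println("youtube engine disabled: no API key configured")
		return
	}
	fmt.Println("youtube engine enabled")
}
```

Passing os.Getenv as a parameter is a design choice of this sketch, not something the log confirms; it simply keeps the precedence rule easy to unit test.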