# Per-Engine TTL Cache — Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Replace the merged-response cache with per-engine response caching, enabling tier-based TTLs and stale-while-revalidate semantics.

**Architecture:** Each engine's raw response is cached independently with its tier-based TTL. On stale hits, return cached data immediately and refresh in the background. The query hash is computed from shared params (query, pageno, safesearch, language, time_range) and prefixed with the engine name to form the cache key.

**Tech Stack:** Go 1.24, Valkey/Redis (go-redis/v9), existing samsa contracts
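
To make the key scheme concrete, here is a standalone sketch. The `samsa:resp:` prefix and the 16-character truncated SHA-256 follow the design above; the `queryHash` helper here is an illustrative approximation, not the plan's actual code.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// queryHash approximates the plan's QueryHash: a SHA-256 over the shared
// request parameters, truncated to 16 hex characters.
func queryHash(query string, pageno, safesearch int, language, timeRange string) string {
	h := sha256.New()
	fmt.Fprintf(h, "q=%s|pageno=%d|safesearch=%d|lang=%s|", query, pageno, safesearch, language)
	if timeRange != "" {
		fmt.Fprintf(h, "tr=%s|", timeRange)
	}
	return hex.EncodeToString(h.Sum(nil))[:16]
}

func main() {
	hash := queryHash("golang", 1, 0, "en", "")
	// All engines share the same query hash; the engine-name prefix is what
	// lets each engine expire and refresh independently.
	fmt.Println("samsa:resp:wikipedia:" + hash)
	fmt.Println("samsa:resp:google:" + hash)
}
```

Because only the suffix is shared, a stale `google` entry can be refreshed without touching a still-fresh `wikipedia` entry for the same query.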

---

## File Map

| Action | File | Responsibility |
|--------|------|----------------|
| Create | `internal/cache/tiers.go` | Tier definitions, `EngineTier()` function |
| Create | `internal/cache/tiers_test.go` | Tests for `EngineTier()` |
| Create | `internal/cache/engine_cache.go` | `EngineCache` struct with tier-aware Get/Set |
| Create | `internal/cache/engine_cache_test.go` | Tests for `EngineCache` |
| Modify | `internal/cache/cache.go` | Add `QueryHash()`, add `CachedEngineResponse` type |
| Modify | `internal/cache/cache_test.go` | Add tests for `QueryHash()` |
| Modify | `internal/config/config.go` | Add `TTLOverrides` to `CacheConfig` |
| Modify | `internal/search/service.go` | Use `EngineCache`, parallel lookups, background refresh |

---

## Task 1: Add QueryHash and CachedEngineResponse to cache.go

**Files:**
- Modify: `internal/cache/cache.go`
- Modify: `internal/cache/cache_test.go`

- [ ] **Step 1: Write failing test for QueryHash()**

```go
// In cache_test.go, add:

func TestQueryHash(t *testing.T) {
	// Same params should produce the same hash
	hash1 := QueryHash("golang", 1, 0, "en", "")
	hash2 := QueryHash("golang", 1, 0, "en", "")
	if hash1 != hash2 {
		t.Errorf("QueryHash: same params should produce same hash, got %s != %s", hash1, hash2)
	}

	// A different query should produce a different hash
	hash3 := QueryHash("rust", 1, 0, "en", "")
	if hash1 == hash3 {
		t.Errorf("QueryHash: different queries should produce different hash")
	}

	// A different pageno should produce a different hash
	hash4 := QueryHash("golang", 2, 0, "en", "")
	if hash1 == hash4 {
		t.Errorf("QueryHash: different pageno should produce different hash")
	}

	// time_range should affect the hash
	hash5 := QueryHash("golang", 1, 0, "en", "day")
	if hash1 == hash5 {
		t.Errorf("QueryHash: different time_range should produce different hash")
	}

	// The hash should be 16 characters (truncated SHA-256)
	if len(hash1) != 16 {
		t.Errorf("QueryHash: expected 16 char hash, got %d", len(hash1))
	}
}
```

- [ ] **Step 2: Run test to verify it fails**

Run: `nix develop --command bash -c "go test -run TestQueryHash ./internal/cache/ -v"`
Expected: FAIL — "QueryHash not defined"

- [ ] **Step 3: Implement QueryHash() and CachedEngineResponse in cache.go**

Add to `cache.go`. The imports `crypto/sha256` and `encoding/hex` are already present from the existing `Key()` function; make sure `fmt` and `time` are imported as well (needed for `fmt.Fprintf` and `CachedEngineResponse.StoredAt`):

```go
// QueryHash computes a deterministic hash from shared request parameters
// (query, pageno, safesearch, language, time_range) for use as a cache key suffix.
// The hash is a truncated SHA-256 (16 hex chars).
func QueryHash(query string, pageno int, safesearch int, language, timeRange string) string {
	h := sha256.New()
	fmt.Fprintf(h, "q=%s|", query)
	fmt.Fprintf(h, "pageno=%d|", pageno)
	fmt.Fprintf(h, "safesearch=%d|", safesearch)
	fmt.Fprintf(h, "lang=%s|", language)
	if timeRange != "" {
		fmt.Fprintf(h, "tr=%s|", timeRange)
	}
	return hex.EncodeToString(h.Sum(nil))[:16]
}

// CachedEngineResponse wraps an engine's cached response with metadata.
type CachedEngineResponse struct {
	Engine   string
	Response []byte
	StoredAt time.Time
}
```

- [ ] **Step 4: Run test to verify it passes**

Run: `nix develop --command bash -c "go test -run TestQueryHash ./internal/cache/ -v"`
Expected: PASS

- [ ] **Step 5: Commit**

```bash
git add internal/cache/cache.go internal/cache/cache_test.go
git commit -m "cache: add QueryHash and CachedEngineResponse type"
```

---

## Task 2: Create tiers.go with tier definitions

**Files:**
- Create: `internal/cache/tiers.go`
- Create: `internal/cache/tiers_test.go`

- [ ] **Step 1: Create tiers.go with tier definitions and EngineTier function**

```go
package cache

import "time"

// TTLTier represents a cache TTL tier with a name and duration.
type TTLTier struct {
	Name     string
	Duration time.Duration
}

// defaultTiers maps engine names to their default TTL tiers.
var defaultTiers = map[string]TTLTier{
	// Static knowledge engines — rarely change
	"wikipedia":     {Name: "static", Duration: 24 * time.Hour},
	"wikidata":      {Name: "static", Duration: 24 * time.Hour},
	"arxiv":         {Name: "static", Duration: 24 * time.Hour},
	"crossref":      {Name: "static", Duration: 24 * time.Hour},
	"stackoverflow": {Name: "static", Duration: 24 * time.Hour},
	"github":        {Name: "static", Duration: 24 * time.Hour},

	// API-based general search — fresher data
	"braveapi": {Name: "api_general", Duration: 1 * time.Hour},
	"youtube":  {Name: "api_general", Duration: 1 * time.Hour},

	// Scraped general search — moderately stable
	"google":     {Name: "scraped_general", Duration: 2 * time.Hour},
	"bing":       {Name: "scraped_general", Duration: 2 * time.Hour},
	"duckduckgo": {Name: "scraped_general", Duration: 2 * time.Hour},
	"qwant":      {Name: "scraped_general", Duration: 2 * time.Hour},
	"brave":      {Name: "scraped_general", Duration: 2 * time.Hour},

	// News/social — changes frequently
	"reddit": {Name: "news_social", Duration: 30 * time.Minute},

	// Image search
	"bing_images":  {Name: "images", Duration: 1 * time.Hour},
	"ddg_images":   {Name: "images", Duration: 1 * time.Hour},
	"qwant_images": {Name: "images", Duration: 1 * time.Hour},
}

// EngineTier returns the TTL tier for an engine, applying overrides if provided.
// If the engine has no defined tier, returns a default of 1 hour.
func EngineTier(engineName string, overrides map[string]time.Duration) TTLTier {
	// Check override first — the override tier name is just the engine name
	if override, ok := overrides[engineName]; ok && override > 0 {
		return TTLTier{Name: engineName, Duration: override}
	}

	// Fall back to the default tier
	if tier, ok := defaultTiers[engineName]; ok {
		return tier
	}

	// Unknown engines get a sensible default
	return TTLTier{Name: "unknown", Duration: 1 * time.Hour}
}
```

- [ ] **Step 2: Run go vet to verify it compiles**

Run: `nix develop --command bash -c "go vet ./internal/cache/tiers.go"`
Expected: no output (success)

- [ ] **Step 3: Write a basic test for EngineTier**

```go
// In internal/cache/tiers_test.go:

package cache

import (
	"testing"
	"time"
)

func TestEngineTier(t *testing.T) {
	// Test default static tier
	tier := EngineTier("wikipedia", nil)
	if tier.Name != "static" || tier.Duration != 24*time.Hour {
		t.Errorf("wikipedia: expected static/24h, got %s/%v", tier.Name, tier.Duration)
	}

	// Test default api_general tier
	tier = EngineTier("braveapi", nil)
	if tier.Name != "api_general" || tier.Duration != 1*time.Hour {
		t.Errorf("braveapi: expected api_general/1h, got %s/%v", tier.Name, tier.Duration)
	}

	// Test override takes precedence — the override tier name is just the engine name
	override := 48 * time.Hour
	tier = EngineTier("wikipedia", map[string]time.Duration{"wikipedia": override})
	if tier.Name != "wikipedia" || tier.Duration != 48*time.Hour {
		t.Errorf("wikipedia override: expected wikipedia/48h, got %s/%v", tier.Name, tier.Duration)
	}

	// Test unknown engine gets the default
	tier = EngineTier("unknown_engine", nil)
	if tier.Name != "unknown" || tier.Duration != 1*time.Hour {
		t.Errorf("unknown engine: expected unknown/1h, got %s/%v", tier.Name, tier.Duration)
	}
}
```

- [ ] **Step 4: Run test to verify it passes**

Run: `nix develop --command bash -c "go test -run TestEngineTier ./internal/cache/ -v"`
Expected: PASS

- [ ] **Step 5: Commit**

```bash
git add internal/cache/tiers.go internal/cache/tiers_test.go
git commit -m "cache: add tier definitions and EngineTier function"
```

---

## Task 3: Create EngineCache in engine_cache.go

**Files:**
- Create: `internal/cache/engine_cache.go`
- Create: `internal/cache/engine_cache_test.go`

**Note:** The existing `Key()` function in `cache.go` is still used for favicon caching. The new `QueryHash()` and `EngineCache` are separate and used only for per-engine search response caching.

- [ ] **Step 1: Write failing test for EngineCache.Get/Set**

```go
package cache

import (
	"context"
	"log/slog"
	"testing"
	"time"
)

func TestEngineCacheGetSet(t *testing.T) {
	// Create a disabled cache for unit testing (nil client)
	c := &Cache{logger: slog.Default()}
	ec := NewEngineCache(c, nil)

	ctx := context.Background()
	cached, ok := ec.Get(ctx, "wikipedia", "abc123")
	if ok {
		t.Errorf("Get on disabled cache: expected false, got %v", ok)
	}
	_ = cached // unused when ok=false
}

func TestEngineCacheKeyFormat(t *testing.T) {
	key := engineCacheKey("wikipedia", "abc123")
	if key != "samsa:resp:wikipedia:abc123" {
		t.Errorf("engineCacheKey: expected samsa:resp:wikipedia:abc123, got %s", key)
	}
}

func TestEngineCacheIsStale(t *testing.T) {
	c := &Cache{logger: slog.Default()}
	ec := NewEngineCache(c, nil)

	// Fresh response (stored 1 minute ago; wikipedia has a 24h TTL)
	fresh := CachedEngineResponse{
		Engine:   "wikipedia",
		Response: []byte(`{}`),
		StoredAt: time.Now().Add(-1 * time.Minute),
	}
	if ec.IsStale(fresh, "wikipedia") {
		t.Errorf("IsStale: 1-minute-old wikipedia should NOT be stale")
	}

	// Stale response (stored 25 hours ago)
	stale := CachedEngineResponse{
		Engine:   "wikipedia",
		Response: []byte(`{}`),
		StoredAt: time.Now().Add(-25 * time.Hour),
	}
	if !ec.IsStale(stale, "wikipedia") {
		t.Errorf("IsStale: 25-hour-old wikipedia SHOULD be stale (24h TTL)")
	}

	// Override: 30 minute TTL for reddit
	overrides := map[string]time.Duration{"reddit": 30 * time.Minute}
	ec2 := NewEngineCache(c, overrides)

	// 20 minutes old with a 30m override should NOT be stale
	redditFresh := CachedEngineResponse{
		Engine:   "reddit",
		Response: []byte(`{}`),
		StoredAt: time.Now().Add(-20 * time.Minute),
	}
	if ec2.IsStale(redditFresh, "reddit") {
		t.Errorf("IsStale: 20-min reddit with 30m override should NOT be stale")
	}

	// 45 minutes old with a 30m override SHOULD be stale
	redditStale := CachedEngineResponse{
		Engine:   "reddit",
		Response: []byte(`{}`),
		StoredAt: time.Now().Add(-45 * time.Minute),
	}
	if !ec2.IsStale(redditStale, "reddit") {
		t.Errorf("IsStale: 45-min reddit with 30m override SHOULD be stale")
	}
}
```

- [ ] **Step 2: Run test to verify it fails**

Run: `nix develop --command bash -c "go test -run TestEngineCache ./internal/cache/ -v"`
Expected: FAIL — compile errors: `NewEngineCache` and `engineCacheKey` are not yet defined

- [ ] **Step 3: Implement EngineCache using GetBytes/SetBytes**

The `EngineCache` uses the existing `GetBytes`/`SetBytes` public methods on `Cache` (the `client` field is unexported, so we must go through those methods).

```go
package cache

import (
	"context"
	"encoding/json"
	"log/slog"
	"time"

	"github.com/metamorphosis-dev/samsa/internal/contracts"
)

// EngineCache wraps Cache with per-engine tier-aware Get/Set operations.
type EngineCache struct {
	cache     *Cache
	overrides map[string]time.Duration
}

// NewEngineCache creates a new EngineCache with optional TTL overrides.
// If overrides is nil, default tier durations are used.
func NewEngineCache(cache *Cache, overrides map[string]time.Duration) *EngineCache {
	return &EngineCache{
		cache:     cache,
		overrides: overrides,
	}
}

// Get retrieves a cached engine response. Returns (zero value, false) if not
// found or if the cache is disabled.
func (ec *EngineCache) Get(ctx context.Context, engine, queryHash string) (CachedEngineResponse, bool) {
	key := engineCacheKey(engine, queryHash)

	data, ok := ec.cache.GetBytes(ctx, key)
	if !ok {
		return CachedEngineResponse{}, false
	}

	var cached CachedEngineResponse
	if err := json.Unmarshal(data, &cached); err != nil {
		ec.cache.logger.Warn("engine cache hit but unmarshal failed", "key", key, "error", err)
		return CachedEngineResponse{}, false
	}

	ec.cache.logger.Debug("engine cache hit", "key", key, "engine", engine)
	return cached, true
}

// Set stores an engine response in the cache with the engine's tier TTL.
func (ec *EngineCache) Set(ctx context.Context, engine, queryHash string, resp contracts.SearchResponse) {
	if !ec.cache.Enabled() {
		return
	}

	data, err := json.Marshal(resp)
	if err != nil {
		ec.cache.logger.Warn("engine cache set: marshal failed", "engine", engine, "error", err)
		return
	}

	tier := EngineTier(engine, ec.overrides)
	key := engineCacheKey(engine, queryHash)

	cached := CachedEngineResponse{
		Engine:   engine,
		Response: data,
		StoredAt: time.Now(),
	}

	cachedData, err := json.Marshal(cached)
	if err != nil {
		ec.cache.logger.Warn("engine cache set: wrap marshal failed", "key", key, "error", err)
		return
	}

	ec.cache.SetBytes(ctx, key, cachedData, tier.Duration)
}

// IsStale returns true if the cached response is older than the tier's TTL.
func (ec *EngineCache) IsStale(cached CachedEngineResponse, engine string) bool {
	tier := EngineTier(engine, ec.overrides)
	return time.Since(cached.StoredAt) > tier.Duration
}

// Logger returns the logger for background refresh logging.
func (ec *EngineCache) Logger() *slog.Logger {
	return ec.cache.logger
}

// engineCacheKey builds the cache key for an engine+query combination.
func engineCacheKey(engine, queryHash string) string {
	return "samsa:resp:" + engine + ":" + queryHash
}
```

- [ ] **Step 4: Run tests to verify they pass**

Run: `nix develop --command bash -c "go test -run TestEngineCache ./internal/cache/ -v"`
Expected: PASS

- [ ] **Step 5: Commit**

```bash
git add internal/cache/engine_cache.go internal/cache/engine_cache_test.go
git commit -m "cache: add EngineCache with tier-aware Get/Set"
```

---

## Task 4: Add TTLOverrides to config

**Files:**
- Modify: `internal/config/config.go`

- [ ] **Step 1: Add TTLOverrides to CacheConfig**

In the `CacheConfig` struct, add the `TTLOverrides` field:

```go
type CacheConfig struct {
	Address      string            `toml:"address"`
	Password     string            `toml:"password"`
	DB           int               `toml:"db"`
	DefaultTTL   string            `toml:"default_ttl"`
	TTLOverrides map[string]string `toml:"ttl_overrides"` // engine -> duration string
}
```

- [ ] **Step 2: Add CacheTTLOverrides() method to Config**

Add after `CacheTTL()`:

```go
// CacheTTLOverrides returns the parsed TTL overrides from config.
// Invalid or non-positive durations are silently skipped.
func (c *Config) CacheTTLOverrides() map[string]time.Duration {
	if len(c.Cache.TTLOverrides) == 0 {
		return nil
	}
	out := make(map[string]time.Duration, len(c.Cache.TTLOverrides))
	for engine, durStr := range c.Cache.TTLOverrides {
		if d, err := time.ParseDuration(durStr); err == nil && d > 0 {
			out[engine] = d
		}
	}
	return out
}
```

- [ ] **Step 3: Run tests to verify nothing breaks**

Run: `nix develop --command bash -c "go test ./internal/config/ -v"`
Expected: PASS

- [ ] **Step 4: Commit**

```bash
git add internal/config/config.go
git commit -m "config: add TTLOverrides to CacheConfig"
```

---

## Task 5: Wire EngineCache into search service

**Files:**
- Modify: `internal/search/service.go`

- [ ] **Step 1: Read the current service.go to understand wiring**

The service currently takes `*Cache` in `ServiceConfig`. Keep that field, and construct an `*EngineCache` from it (plus the TTL overrides) inside `NewService`.

- [ ] **Step 2: Modify Service struct and NewService to use EngineCache**

Change `Service`:

```go
type Service struct {
	upstreamClient *upstream.Client
	planner        *engines.Planner
	localEngines   map[string]engines.Engine
	engineCache    *cache.EngineCache
}
```

Change `NewService`:

```go
func NewService(cfg ServiceConfig) *Service {
	timeout := cfg.HTTPTimeout
	if timeout <= 0 {
		timeout = 10 * time.Second
	}

	httpClient := httpclient.NewClient(timeout)

	var up *upstream.Client
	if cfg.UpstreamURL != "" {
		c, err := upstream.NewClient(cfg.UpstreamURL, timeout)
		if err == nil {
			up = c
		}
	}

	var engineCache *cache.EngineCache
	if cfg.Cache != nil {
		engineCache = cache.NewEngineCache(cfg.Cache, cfg.CacheTTLOverrides)
	}

	return &Service{
		upstreamClient: up,
		planner:        engines.NewPlannerFromEnv(),
		localEngines:   engines.NewDefaultPortedEngines(httpClient, cfg.EnginesConfig),
		engineCache:    engineCache,
	}
}
```

Add `CacheTTLOverrides` to `ServiceConfig`:

```go
type ServiceConfig struct {
	UpstreamURL       string
	HTTPTimeout       time.Duration
	Cache             *cache.Cache
	CacheTTLOverrides map[string]time.Duration
	EnginesConfig     *config.Config
}
```

- [ ] **Step 3: Rewrite Search() with correct stale-while-revalidate logic**

The stale-while-revalidate flow:

1. **Cache lookup (Phase 1)**: Check the cache for each engine in parallel. Classify each as:
   - Fresh hit: cache has data AND it is not stale → deserialize, mark as `fresh`
   - Stale hit: cache has data AND it is stale → keep in `cached`, no `fresh` yet
   - Miss: cache has no data → `hit=false`, no `cached` or `fresh`

2. **Fetch (Phase 2)**: For each engine:
   - Fresh hit: use the cached data, no fetch needed
   - Stale hit: use the stale data immediately, refresh in the background (the request does not wait)
   - Miss: fetch fresh synchronously, cache the result

3. **Collect (Phase 3)**: Collect all responses for the merge.

```go
// Search executes the request against local engines (in parallel) and
// optionally the upstream instance for unported engines.
func (s *Service) Search(ctx context.Context, req SearchRequest) (SearchResponse, error) {
	queryHash := cache.QueryHash(
		req.Query,
		int(req.Pageno),
		int(req.Safesearch),
		req.Language,
		derefString(req.TimeRange),
	)

	localEngineNames, upstreamEngineNames, _ := s.planner.Plan(req)

	// Phase 1: Parallel cache lookups — classify each engine as fresh/stale/miss
	type cacheResult struct {
		engine       string
		cached       cache.CachedEngineResponse
		hit          bool
		fresh        contracts.SearchResponse
		fetchErr     error
		unmarshalErr bool // true if hit but unmarshal failed (treated as a miss)
	}

	cacheResults := make([]cacheResult, len(localEngineNames))

	var lookupWg sync.WaitGroup
	for i, name := range localEngineNames {
		lookupWg.Add(1)
		go func(i int, name string) {
			defer lookupWg.Done()

			result := cacheResult{engine: name}

			if s.engineCache != nil {
				cached, ok := s.engineCache.Get(ctx, name, queryHash)
				if ok {
					result.hit = true
					result.cached = cached
					if !s.engineCache.IsStale(cached, name) {
						// Fresh cache hit — deserialize and use directly
						var resp contracts.SearchResponse
						if err := json.Unmarshal(cached.Response, &resp); err == nil {
							result.fresh = resp
						} else {
							// Unmarshal failed — treat as a cache miss (fetch fresh synchronously)
							result.unmarshalErr = true
							result.hit = false
						}
					}
					// If stale: result.fresh stays zero, result.cached holds the stale data
				}
			}

			cacheResults[i] = result
		}(i, name)
	}
	lookupWg.Wait()

	// Phase 2: Fetch fresh for misses; start background refreshes for stale entries
	var fetchWg sync.WaitGroup
	for i, name := range localEngineNames {
		cr := cacheResults[i]

		// Fresh hit — nothing to do in phase 2
		if cr.hit && cr.fresh.Response != nil {
			continue
		}

		// Stale hit — serve the stale data, refresh in the background.
		// The refresh is deliberately NOT added to fetchWg, so the request
		// does not wait for it; context.WithoutCancel keeps the refresh
		// alive after the request context is canceled.
		if cr.hit && cr.cached.Response != nil && s.engineCache != nil && s.engineCache.IsStale(cr.cached, name) {
			bgCtx := context.WithoutCancel(ctx)
			go func(name string) {
				eng, ok := s.localEngines[name]
				if !ok {
					return
				}
				freshResp, err := eng.Search(bgCtx, req)
				if err != nil {
					s.engineCache.Logger().Debug("background refresh failed", "engine", name, "error", err)
					return
				}
				s.engineCache.Set(bgCtx, name, queryHash, freshResp)
			}(name)
			continue
		}

		// Cache miss — fetch fresh; the request waits for these via fetchWg
		if !cr.hit {
			fetchWg.Add(1)
			go func(i int, name string) {
				defer fetchWg.Done()

				eng, ok := s.localEngines[name]
				if !ok {
					cacheResults[i] = cacheResult{
						engine:   name,
						fetchErr: fmt.Errorf("engine not registered: %s", name),
					}
					return
				}

				freshResp, err := eng.Search(ctx, req)
				if err != nil {
					cacheResults[i] = cacheResult{
						engine:   name,
						fetchErr: err,
					}
					return
				}

				// Cache the fresh response
				if s.engineCache != nil {
					s.engineCache.Set(ctx, name, queryHash, freshResp)
				}

				cacheResults[i] = cacheResult{
					engine: name,
					fresh:  freshResp,
					hit:    false,
				}
			}(i, name)
		}
	}
	fetchWg.Wait()

	// Phase 3: Collect responses for the merge
	responses := make([]contracts.SearchResponse, 0, len(cacheResults))

	for _, cr := range cacheResults {
		if cr.fetchErr != nil {
			responses = append(responses, unresponsiveResponse(req.Query, cr.engine, cr.fetchErr.Error()))
			continue
		}
		// Use fresh data if available (fresh hit or freshly fetched), otherwise fall back to stale cached data
		if cr.fresh.Response != nil {
			responses = append(responses, cr.fresh)
		} else if cr.hit && cr.cached.Response != nil {
			var resp contracts.SearchResponse
			if err := json.Unmarshal(cr.cached.Response, &resp); err == nil {
				responses = append(responses, resp)
			}
		}
	}

	// ... rest of upstream proxy and merge logic (unchanged) ...
}
```

Note: The imports need `encoding/json` and `fmt` added. The existing imports in service.go already include `context`, `sync`, and `time`. `context.WithoutCancel` requires Go 1.21+ (we target Go 1.24).

- [ ] **Step 4: Build to verify compilation**

Run: `nix develop --command bash -c "go build ./internal/search/"`
Expected: no output (success)

- [ ] **Step 5: Run full test suite**

Run: `nix develop --command bash -c "go test ./..."`
Expected: All pass

- [ ] **Step 6: Commit**

```bash
git add internal/search/service.go
git commit -m "search: wire per-engine cache with tier-aware TTLs"
```

---

## Task 6: Update config.example.toml

**Files:**
- Modify: `config.example.toml`

- [ ] **Step 1: Add TTL overrides section to config.example.toml**

Add after the `[cache]` section:

```toml
[cache.ttl_overrides]
# Per-engine TTL overrides (uncomment to use):
# wikipedia = "48h"
# reddit = "15m"
# braveapi = "2h"
```

- [ ] **Step 2: Commit**

```bash
git add config.example.toml
git commit -m "config: add cache.ttl_overrides example"
```

---

## Verification

After all tasks complete, run:

```bash
nix develop --command bash -c "go test ./... -v 2>&1 | tail -50"
```

All tests should pass. The search service should now cache each engine's response independently with tier-based TTLs.