samsa/docs/superpowers/plans/2026-03-24-per-engine-ttl-cache.md
ashisgreat22 bfc4f8d657 docs: fix per-engine TTL cache plan bugs
- Fix stale-while-revalidate condition (was inverted for stale vs fresh)
- Add unmarshalErr tracking for cache corruption edge case
- Rename CachedResponse to CachedEngineResponse throughout
- Fix override tier name (engineName not engineName_override)
- Add EngineCache.Logger() method
- Add tiers_test.go to file map

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 01:04:32 +01:00


# Per-Engine TTL Cache — Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Replace the merged-response cache with per-engine response caching, enabling tier-based TTLs and stale-while-revalidate semantics.

**Architecture:** Each engine's raw response is cached independently with its tier-based TTL. On stale hits, return cached data immediately and refresh in background. Query hash is computed from shared params (query, pageno, safesearch, language, time_range) and prefixed with engine name for the cache key.

**Tech Stack:** Go 1.24, Valkey/Redis (go-redis/v9), existing samsa contracts

---
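The key scheme described above can be sketched as a standalone snippet. The hash construction and `samsa:resp:<engine>:<hash>` format mirror what Tasks 1 and 3 specify; `queryHash` here is a local illustration, not the final implementation.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// queryHash mirrors the planned QueryHash: a truncated SHA-256 over the
// shared request parameters, 16 hex characters.
func queryHash(query string, pageno, safesearch int, language, timeRange string) string {
	h := sha256.New()
	fmt.Fprintf(h, "q=%s|", query)
	fmt.Fprintf(h, "pageno=%d|", pageno)
	fmt.Fprintf(h, "safesearch=%d|", safesearch)
	fmt.Fprintf(h, "lang=%s|", language)
	if timeRange != "" {
		fmt.Fprintf(h, "tr=%s|", timeRange)
	}
	return hex.EncodeToString(h.Sum(nil))[:16]
}

func main() {
	hash := queryHash("golang", 1, 0, "en", "")
	// The same query hash is shared across engines; only the engine
	// segment of the key differs, so each engine's entry gets its own TTL.
	fmt.Println("samsa:resp:wikipedia:" + hash)
	fmt.Println("samsa:resp:google:" + hash)
}
```

Because the hash covers only shared params, a single request produces one hash and N keys, one per engine.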
## File Map
| Action | File | Responsibility |
|--------|------|----------------|
| Create | `internal/cache/tiers.go` | Tier definitions, `EngineTier()` function |
| Create | `internal/cache/tiers_test.go` | Tests for EngineTier |
| Create | `internal/cache/engine_cache.go` | `EngineCache` struct with tier-aware Get/Set |
| Create | `internal/cache/engine_cache_test.go` | Tests for EngineCache |
| Modify | `internal/cache/cache.go` | Add `QueryHash()`, add `CachedEngineResponse` type |
| Modify | `internal/cache/cache_test.go` | Add tests for `QueryHash()` |
| Modify | `internal/config/config.go` | Add `TTLOverrides` to `CacheConfig` |
| Modify | `internal/search/service.go` | Use `EngineCache`, parallel lookups, background refresh |
---
## Task 1: Add QueryHash and CachedEngineResponse to cache.go
**Files:**
- Modify: `internal/cache/cache.go`
- Modify: `internal/cache/cache_test.go`
- [ ] **Step 1: Write failing test for QueryHash()**
```go
// In cache_test.go, add:
func TestQueryHash(t *testing.T) {
	// Same params should produce the same hash.
	hash1 := QueryHash("golang", 1, 0, "en", "")
	hash2 := QueryHash("golang", 1, 0, "en", "")
	if hash1 != hash2 {
		t.Errorf("QueryHash: same params should produce same hash, got %s != %s", hash1, hash2)
	}

	// Different query should produce a different hash.
	hash3 := QueryHash("rust", 1, 0, "en", "")
	if hash1 == hash3 {
		t.Errorf("QueryHash: different queries should produce different hash")
	}

	// Different pageno should produce a different hash.
	hash4 := QueryHash("golang", 2, 0, "en", "")
	if hash1 == hash4 {
		t.Errorf("QueryHash: different pageno should produce different hash")
	}

	// time_range should affect the hash.
	hash5 := QueryHash("golang", 1, 0, "en", "day")
	if hash1 == hash5 {
		t.Errorf("QueryHash: different time_range should produce different hash")
	}

	// Hash should be 16 characters (truncated SHA-256).
	if len(hash1) != 16 {
		t.Errorf("QueryHash: expected 16 char hash, got %d", len(hash1))
	}
}
```
- [ ] **Step 2: Run test to verify it fails**
Run: `nix develop --command bash -c "go test -run TestQueryHash ./internal/cache/ -v"`
Expected: FAIL (build error: `undefined: QueryHash`)
- [ ] **Step 3: Implement QueryHash() and CachedEngineResponse in cache.go**
Add to `cache.go` (the imports `crypto/sha256` and `encoding/hex` are already present from the existing `Key()` function; add `fmt` and `time` if they are not already imported):
```go
// QueryHash computes a deterministic hash from shared request parameters
// (query, pageno, safesearch, language, time_range) for use as a cache key suffix.
// The hash is a truncated SHA-256 (16 hex chars).
func QueryHash(query string, pageno int, safesearch int, language, timeRange string) string {
	h := sha256.New()
	fmt.Fprintf(h, "q=%s|", query)
	fmt.Fprintf(h, "pageno=%d|", pageno)
	fmt.Fprintf(h, "safesearch=%d|", safesearch)
	fmt.Fprintf(h, "lang=%s|", language)
	if timeRange != "" {
		fmt.Fprintf(h, "tr=%s|", timeRange)
	}
	return hex.EncodeToString(h.Sum(nil))[:16]
}

// CachedEngineResponse wraps an engine's cached response with metadata.
type CachedEngineResponse struct {
	Engine   string
	Response []byte
	StoredAt time.Time
}
```
- [ ] **Step 4: Run test to verify it passes**
Run: `nix develop --command bash -c "go test -run TestQueryHash ./internal/cache/ -v"`
Expected: PASS
- [ ] **Step 5: Commit**
```bash
git add internal/cache/cache.go internal/cache/cache_test.go
git commit -m "cache: add QueryHash and CachedEngineResponse type"
```
---
## Task 2: Create tiers.go with tier definitions
**Files:**
- Create: `internal/cache/tiers.go`
- [ ] **Step 1: Create tiers.go with tier definitions and EngineTier function**
```go
package cache

import "time"

// TTLTier represents a cache TTL tier with a name and duration.
type TTLTier struct {
	Name     string
	Duration time.Duration
}

// defaultTiers maps engine names to their default TTL tiers.
var defaultTiers = map[string]TTLTier{
	// Static knowledge engines — rarely change.
	"wikipedia":     {Name: "static", Duration: 24 * time.Hour},
	"wikidata":      {Name: "static", Duration: 24 * time.Hour},
	"arxiv":         {Name: "static", Duration: 24 * time.Hour},
	"crossref":      {Name: "static", Duration: 24 * time.Hour},
	"stackoverflow": {Name: "static", Duration: 24 * time.Hour},
	"github":        {Name: "static", Duration: 24 * time.Hour},

	// API-based general search — fresher data.
	"braveapi": {Name: "api_general", Duration: 1 * time.Hour},
	"youtube":  {Name: "api_general", Duration: 1 * time.Hour},

	// Scraped general search — moderately stable.
	"google":     {Name: "scraped_general", Duration: 2 * time.Hour},
	"bing":       {Name: "scraped_general", Duration: 2 * time.Hour},
	"duckduckgo": {Name: "scraped_general", Duration: 2 * time.Hour},
	"qwant":      {Name: "scraped_general", Duration: 2 * time.Hour},
	"brave":      {Name: "scraped_general", Duration: 2 * time.Hour},

	// News/social — changes frequently.
	"reddit": {Name: "news_social", Duration: 30 * time.Minute},

	// Image search.
	"bing_images":  {Name: "images", Duration: 1 * time.Hour},
	"ddg_images":   {Name: "images", Duration: 1 * time.Hour},
	"qwant_images": {Name: "images", Duration: 1 * time.Hour},
}

// EngineTier returns the TTL tier for an engine, applying overrides if provided.
// If the engine has no defined tier, it returns a 1-hour default.
func EngineTier(engineName string, overrides map[string]time.Duration) TTLTier {
	// Check overrides first — an override's tier name is just the engine name.
	if override, ok := overrides[engineName]; ok && override > 0 {
		return TTLTier{Name: engineName, Duration: override}
	}
	// Fall back to the default tier.
	if tier, ok := defaultTiers[engineName]; ok {
		return tier
	}
	// Unknown engines get a sensible default.
	return TTLTier{Name: "unknown", Duration: 1 * time.Hour}
}
```
- [ ] **Step 2: Run go vet to verify it compiles**
Run: `nix develop --command bash -c "go vet ./internal/cache/"`
Expected: no output (success)
- [ ] **Step 3: Write a basic test for EngineTier**
```go
// In internal/cache/tiers_test.go:
package cache

import (
	"testing"
	"time"
)

func TestEngineTier(t *testing.T) {
	// Default static tier.
	tier := EngineTier("wikipedia", nil)
	if tier.Name != "static" || tier.Duration != 24*time.Hour {
		t.Errorf("wikipedia: expected static/24h, got %s/%v", tier.Name, tier.Duration)
	}

	// Default api_general tier.
	tier = EngineTier("braveapi", nil)
	if tier.Name != "api_general" || tier.Duration != 1*time.Hour {
		t.Errorf("braveapi: expected api_general/1h, got %s/%v", tier.Name, tier.Duration)
	}

	// Override takes precedence — the override tier name is just the engine name.
	override := 48 * time.Hour
	tier = EngineTier("wikipedia", map[string]time.Duration{"wikipedia": override})
	if tier.Name != "wikipedia" || tier.Duration != 48*time.Hour {
		t.Errorf("wikipedia override: expected wikipedia/48h, got %s/%v", tier.Name, tier.Duration)
	}

	// Unknown engines get the default.
	tier = EngineTier("unknown_engine", nil)
	if tier.Name != "unknown" || tier.Duration != 1*time.Hour {
		t.Errorf("unknown engine: expected unknown/1h, got %s/%v", tier.Name, tier.Duration)
	}
}
```
- [ ] **Step 4: Run test to verify it passes**
Run: `nix develop --command bash -c "go test -run TestEngineTier ./internal/cache/ -v"`
Expected: PASS
- [ ] **Step 5: Commit**
```bash
git add internal/cache/tiers.go internal/cache/tiers_test.go
git commit -m "cache: add tier definitions and EngineTier function"
```
---
## Task 3: Create EngineCache in engine_cache.go
**Files:**
- Create: `internal/cache/engine_cache.go`
- Create: `internal/cache/engine_cache_test.go`
**Note:** The existing `Key()` function in `cache.go` is still used for favicon caching. The new `QueryHash()` and `EngineCache` are separate and only for per-engine search response caching.
- [ ] **Step 1: Write failing test for EngineCache.Get/Set**
```go
package cache

import (
	"context"
	"log/slog"
	"testing"
	"time"
)

func TestEngineCacheGetSet(t *testing.T) {
	// Create a disabled cache for unit testing (nil client).
	c := &Cache{logger: slog.Default()}
	ec := NewEngineCache(c, nil)
	ctx := context.Background()
	cached, ok := ec.Get(ctx, "wikipedia", "abc123")
	if ok {
		t.Errorf("Get on disabled cache: expected false, got %v", ok)
	}
	_ = cached // unused when ok=false
}

func TestEngineCacheKeyFormat(t *testing.T) {
	key := engineCacheKey("wikipedia", "abc123")
	if key != "samsa:resp:wikipedia:abc123" {
		t.Errorf("engineCacheKey: expected samsa:resp:wikipedia:abc123, got %s", key)
	}
}

func TestEngineCacheIsStale(t *testing.T) {
	c := &Cache{logger: slog.Default()}
	ec := NewEngineCache(c, nil)

	// Fresh response (stored 1 minute ago; wikipedia has a 24h TTL).
	fresh := CachedEngineResponse{
		Engine:   "wikipedia",
		Response: []byte(`{}`),
		StoredAt: time.Now().Add(-1 * time.Minute),
	}
	if ec.IsStale(fresh, "wikipedia") {
		t.Errorf("IsStale: 1-minute-old wikipedia should NOT be stale")
	}

	// Stale response (stored 25 hours ago).
	stale := CachedEngineResponse{
		Engine:   "wikipedia",
		Response: []byte(`{}`),
		StoredAt: time.Now().Add(-25 * time.Hour),
	}
	if !ec.IsStale(stale, "wikipedia") {
		t.Errorf("IsStale: 25-hour-old wikipedia SHOULD be stale (24h TTL)")
	}

	// Override: 30-minute TTL for reddit.
	overrides := map[string]time.Duration{"reddit": 30 * time.Minute}
	ec2 := NewEngineCache(c, overrides)

	// 20 minutes old with a 30m override should NOT be stale.
	redditFresh := CachedEngineResponse{
		Engine:   "reddit",
		Response: []byte(`{}`),
		StoredAt: time.Now().Add(-20 * time.Minute),
	}
	if ec2.IsStale(redditFresh, "reddit") {
		t.Errorf("IsStale: 20-min reddit with 30m override should NOT be stale")
	}

	// 45 minutes old with a 30m override SHOULD be stale.
	redditStale := CachedEngineResponse{
		Engine:   "reddit",
		Response: []byte(`{}`),
		StoredAt: time.Now().Add(-45 * time.Minute),
	}
	if !ec2.IsStale(redditStale, "reddit") {
		t.Errorf("IsStale: 45-min reddit with 30m override SHOULD be stale")
	}
}
```
- [ ] **Step 2: Run test to verify it fails**
Run: `nix develop --command bash -c "go test -run TestEngineCache ./internal/cache/ -v"`
Expected: FAIL (build error: `undefined: NewEngineCache` / `undefined: engineCacheKey`)
- [ ] **Step 3: Implement EngineCache using GetBytes/SetBytes**
The `EngineCache` uses the existing `GetBytes`/`SetBytes` public methods on `Cache` (the `client` field is unexported so we must use those methods).
```go
package cache

import (
	"context"
	"encoding/json"
	"log/slog"
	"time"

	"github.com/metamorphosis-dev/samsa/internal/contracts"
)

// EngineCache wraps Cache with per-engine tier-aware Get/Set operations.
type EngineCache struct {
	cache     *Cache
	overrides map[string]time.Duration
}

// NewEngineCache creates a new EngineCache with optional TTL overrides.
// If overrides is nil, default tier durations are used.
func NewEngineCache(cache *Cache, overrides map[string]time.Duration) *EngineCache {
	return &EngineCache{
		cache:     cache,
		overrides: overrides,
	}
}

// Get retrieves a cached engine response. It returns (zero value, false) if
// the entry is not found or the cache is disabled.
func (ec *EngineCache) Get(ctx context.Context, engine, queryHash string) (CachedEngineResponse, bool) {
	key := engineCacheKey(engine, queryHash)
	data, ok := ec.cache.GetBytes(ctx, key)
	if !ok {
		return CachedEngineResponse{}, false
	}
	var cached CachedEngineResponse
	if err := json.Unmarshal(data, &cached); err != nil {
		ec.cache.logger.Warn("engine cache hit but unmarshal failed", "key", key, "error", err)
		return CachedEngineResponse{}, false
	}
	ec.cache.logger.Debug("engine cache hit", "key", key, "engine", engine)
	return cached, true
}

// Set stores an engine response in the cache with the engine's tier TTL.
func (ec *EngineCache) Set(ctx context.Context, engine, queryHash string, resp contracts.SearchResponse) {
	if !ec.cache.Enabled() {
		return
	}
	data, err := json.Marshal(resp)
	if err != nil {
		ec.cache.logger.Warn("engine cache set: marshal failed", "engine", engine, "error", err)
		return
	}
	tier := EngineTier(engine, ec.overrides)
	key := engineCacheKey(engine, queryHash)
	cached := CachedEngineResponse{
		Engine:   engine,
		Response: data,
		StoredAt: time.Now(),
	}
	cachedData, err := json.Marshal(cached)
	if err != nil {
		ec.cache.logger.Warn("engine cache set: wrap marshal failed", "key", key, "error", err)
		return
	}
	ec.cache.SetBytes(ctx, key, cachedData, tier.Duration)
}

// IsStale reports whether the cached response is older than its tier's TTL.
func (ec *EngineCache) IsStale(cached CachedEngineResponse, engine string) bool {
	tier := EngineTier(engine, ec.overrides)
	return time.Since(cached.StoredAt) > tier.Duration
}

// Logger returns the logger for background refresh logging.
func (ec *EngineCache) Logger() *slog.Logger {
	return ec.cache.logger
}

// engineCacheKey builds the cache key for an engine+query combination.
func engineCacheKey(engine, queryHash string) string {
	return "samsa:resp:" + engine + ":" + queryHash
}
```
- [ ] **Step 4: Run tests to verify they pass**
Run: `nix develop --command bash -c "go test -run TestEngineCache ./internal/cache/ -v"`
Expected: PASS
- [ ] **Step 5: Commit**
```bash
git add internal/cache/engine_cache.go internal/cache/engine_cache_test.go
git commit -m "cache: add EngineCache with tier-aware Get/Set"
```
---
## Task 4: Add TTLOverrides to config
**Files:**
- Modify: `internal/config/config.go`
- [ ] **Step 1: Add TTLOverrides to CacheConfig**
In `CacheConfig` struct, add:
```go
type CacheConfig struct {
	Address      string            `toml:"address"`
	Password     string            `toml:"password"`
	DB           int               `toml:"db"`
	DefaultTTL   string            `toml:"default_ttl"`
	TTLOverrides map[string]string `toml:"ttl_overrides"` // engine -> duration string
}
```
- [ ] **Step 2: Add CacheTTLOverrides() method to Config**
Add after `CacheTTL()`:
```go
// CacheTTLOverrides returns parsed TTL overrides from config.
func (c *Config) CacheTTLOverrides() map[string]time.Duration {
	if len(c.Cache.TTLOverrides) == 0 {
		return nil
	}
	out := make(map[string]time.Duration, len(c.Cache.TTLOverrides))
	for engine, durStr := range c.Cache.TTLOverrides {
		if d, err := time.ParseDuration(durStr); err == nil && d > 0 {
			out[engine] = d
		}
	}
	return out
}
```
- [ ] **Step 3: Run tests to verify nothing breaks**
Run: `nix develop --command bash -c "go test ./internal/config/ -v"`
Expected: PASS
- [ ] **Step 4: Commit**
```bash
git add internal/config/config.go
git commit -m "config: add TTLOverrides to CacheConfig"
```
---
## Task 5: Wire EngineCache into search service
**Files:**
- Modify: `internal/search/service.go`
- [ ] **Step 1: Read the current service.go to understand wiring**
The service currently takes `*Cache` in `ServiceConfig`. We need to change it to take `*EngineCache` or change the field type.
- [ ] **Step 2: Modify Service struct and NewService to use EngineCache**
Change `Service`:
```go
type Service struct {
	upstreamClient *upstream.Client
	planner        *engines.Planner
	localEngines   map[string]engines.Engine
	engineCache    *cache.EngineCache
}
```
Change `NewService`:
```go
func NewService(cfg ServiceConfig) *Service {
	timeout := cfg.HTTPTimeout
	if timeout <= 0 {
		timeout = 10 * time.Second
	}
	httpClient := httpclient.NewClient(timeout)

	var up *upstream.Client
	if cfg.UpstreamURL != "" {
		c, err := upstream.NewClient(cfg.UpstreamURL, timeout)
		if err == nil {
			up = c
		}
	}

	var engineCache *cache.EngineCache
	if cfg.Cache != nil {
		engineCache = cache.NewEngineCache(cfg.Cache, cfg.CacheTTLOverrides)
	}

	return &Service{
		upstreamClient: up,
		planner:        engines.NewPlannerFromEnv(),
		localEngines:   engines.NewDefaultPortedEngines(httpClient, cfg.EnginesConfig),
		engineCache:    engineCache,
	}
}
```
Add `CacheTTLOverrides` to `ServiceConfig`:
```go
type ServiceConfig struct {
	UpstreamURL       string
	HTTPTimeout       time.Duration
	Cache             *cache.Cache
	CacheTTLOverrides map[string]time.Duration
	EnginesConfig     *config.Config
}
```
- [ ] **Step 3: Rewrite Search() with correct stale-while-revalidate logic**
The stale-while-revalidate flow:
1. **Cache lookup (Phase 1)**: Check cache for each engine in parallel. Classify each as:
- Fresh hit: cache has data AND not stale → deserialize, mark as `fresh`
- Stale hit: cache has data AND stale → keep in `cached`, no `fresh` yet
- Miss: cache has no data → `hit=false`, no `cached` or `fresh`
2. **Fetch (Phase 2)**: For each engine:
- Fresh hit: return immediately, no fetch needed
- Stale hit: return stale data immediately, fetch fresh in background
- Miss: fetch fresh synchronously, cache result
3. **Collect (Phase 3)**: Collect all responses for merge.
```go
// Search executes the request against local engines (in parallel) and
// optionally the upstream instance for unported engines.
func (s *Service) Search(ctx context.Context, req SearchRequest) (SearchResponse, error) {
	queryHash := cache.QueryHash(
		req.Query,
		int(req.Pageno),
		int(req.Safesearch),
		req.Language,
		derefString(req.TimeRange),
	)
	localEngineNames, upstreamEngineNames, _ := s.planner.Plan(req)

	// Phase 1: Parallel cache lookups — classify each engine as fresh/stale/miss.
	type cacheResult struct {
		engine       string
		cached       cache.CachedEngineResponse
		hit          bool
		fresh        contracts.SearchResponse
		fetchErr     error
		unmarshalErr bool // true if hit but unmarshal failed (treated as a miss)
	}
	cacheResults := make([]cacheResult, len(localEngineNames))
	var lookupWg sync.WaitGroup
	for i, name := range localEngineNames {
		lookupWg.Add(1)
		go func(i int, name string) {
			defer lookupWg.Done()
			result := cacheResult{engine: name}
			if s.engineCache != nil {
				cached, ok := s.engineCache.Get(ctx, name, queryHash)
				if ok {
					result.hit = true
					result.cached = cached
					if !s.engineCache.IsStale(cached, name) {
						// Fresh cache hit — deserialize and use directly.
						var resp contracts.SearchResponse
						if err := json.Unmarshal(cached.Response, &resp); err == nil {
							result.fresh = resp
						} else {
							// Unmarshal failed — treat as a cache miss so we
							// fetch fresh synchronously below.
							result.unmarshalErr = true
							result.hit = false
						}
					}
					// If stale: result.fresh stays zero, result.cached holds the stale data.
				}
			}
			cacheResults[i] = result
		}(i, name)
	}
	lookupWg.Wait()

	// Phase 2: Fetch fresh for misses; kick off background refreshes for stale hits.
	var fetchWg sync.WaitGroup
	for i, name := range localEngineNames {
		cr := cacheResults[i]
		// Fresh hit — nothing to do in phase 2.
		if cr.hit && cr.fresh.Response != nil {
			continue
		}
		// Stale hit — serve the stale data (phase 3) and refresh in the
		// background. The refresh goroutine is deliberately NOT added to
		// fetchWg (it must not block this request), and it runs with a
		// detached context because the request context may be canceled as
		// soon as this handler returns.
		if cr.hit && cr.cached.Response != nil && s.engineCache != nil && s.engineCache.IsStale(cr.cached, name) {
			go func(name string) {
				// 30s is an arbitrary cap for the refresh; tune as needed.
				refreshCtx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
				defer cancel()
				eng, ok := s.localEngines[name]
				if !ok {
					return
				}
				freshResp, err := eng.Search(refreshCtx, req)
				if err != nil {
					s.engineCache.Logger().Debug("background refresh failed", "engine", name, "error", err)
					return
				}
				s.engineCache.Set(refreshCtx, name, queryHash, freshResp)
			}(name)
			continue
		}
		// Cache miss — fetch fresh synchronously.
		if !cr.hit {
			fetchWg.Add(1)
			go func(i int, name string) {
				defer fetchWg.Done()
				eng, ok := s.localEngines[name]
				if !ok {
					cacheResults[i] = cacheResult{
						engine:   name,
						fetchErr: fmt.Errorf("engine not registered: %s", name),
					}
					return
				}
				freshResp, err := eng.Search(ctx, req)
				if err != nil {
					cacheResults[i] = cacheResult{
						engine:   name,
						fetchErr: err,
					}
					return
				}
				// Cache the fresh response.
				if s.engineCache != nil {
					s.engineCache.Set(ctx, name, queryHash, freshResp)
				}
				cacheResults[i] = cacheResult{
					engine: name,
					fresh:  freshResp,
				}
			}(i, name)
		}
	}
	fetchWg.Wait()

	// Phase 3: Collect responses for merge.
	responses := make([]contracts.SearchResponse, 0, len(cacheResults))
	for _, cr := range cacheResults {
		if cr.fetchErr != nil {
			responses = append(responses, unresponsiveResponse(req.Query, cr.engine, cr.fetchErr.Error()))
			continue
		}
		// Use fresh data if available (fresh hit or freshly fetched),
		// otherwise fall back to the stale cached copy.
		if cr.fresh.Response != nil {
			responses = append(responses, cr.fresh)
		} else if cr.hit && cr.cached.Response != nil {
			var resp contracts.SearchResponse
			if err := json.Unmarshal(cr.cached.Response, &resp); err == nil {
				responses = append(responses, resp)
			}
		}
	}
	// ... rest of upstream proxy and merge logic (unchanged) ...
}
```
Note: The imports need `encoding/json` and `fmt` added. The existing imports in service.go already include `sync` and `time`.
- [ ] **Step 4: Run tests to verify compilation**
Run: `nix develop --command bash -c "go build ./internal/search/"`
Expected: no output (success)
- [ ] **Step 5: Run full test suite**
Run: `nix develop --command bash -c "go test ./..."`
Expected: All pass
- [ ] **Step 6: Commit**
```bash
git add internal/search/service.go
git commit -m "search: wire per-engine cache with tier-aware TTLs"
```
---
## Task 6: Update config.example.toml
**Files:**
- Modify: `config.example.toml`
- [ ] **Step 1: Add TTL overrides section to config.example.toml**
Add after the `[cache]` section:
```toml
[cache.ttl_overrides]
# Per-engine TTL overrides (uncomment to use):
# wikipedia = "48h"
# reddit = "15m"
# braveapi = "2h"
```
- [ ] **Step 2: Commit**
```bash
git add config.example.toml
git commit -m "config: add cache.ttl_overrides example"
```
---
## Verification
After all tasks complete, run:
```bash
nix develop --command bash -c "go test ./... -v 2>&1 | tail -50"
```
All tests should pass. The search service should now cache each engine's response independently with tier-based TTLs.