61 lines
2.6 KiB
Markdown
61 lines
2.6 KiB
Markdown
# Synthetic Training Data Generator
|
|
|
|
This tool generates high-quality synthetic training data for fine-tuning LLMs using an OpenAI-compatible API. Designed for roleplay data with a strict style: **Obtuse, Passionate, Absurd** (includes mature themes).
|
|
|
|
## Current Status (2024-12-14)
|
|
|
|
**ISSUE**: The script is getting intermittent HTTP 400 and 429 errors from the API.
|
|
|
|
- **429 errors**: Quota exhausted on rotating keys (handled by key rotation)
|
|
- **400 errors**: Need to add retry logic to handle transient failures
|
|
|
|
**TODO for next session**:
|
|
1. Add retry logic with exponential backoff to `generate_training_data.py`
|
|
2. Detect when error messages are returned as successful content (the proxy sometimes returns errors inside 200 responses)
|
|
3. Consider filtering out responses that start with `错误:` (Chinese for "Error:")
|
|
|
|
## Structure
|
|
|
|
- `generate_training_data.py`: Main script that processes character cards and generates multi-turn conversations
|
|
- `.env`: API configuration (API_KEY, MODEL_NAME, BASE_URL)
|
|
- `chars/`: Directory containing character definition files (chara_card_v2 JSON format)
|
|
- `training_data.json`: Output file with generated conversations
|
|
- `GEMINI.md`: Session memory file with full context history
|
|
|
|
## Setup
|
|
|
|
1. **Configure API** - Edit `.env`:
|
|
```ini
|
|
API_KEY=your_api_key
|
|
MODEL_NAME=claude-opus-4-5-thinking
|
|
BASE_URL=http://127.0.0.1:8045/v1
|
|
```
|
|
|
|
2. **Run on NixOS**:
|
|
```bash
|
|
nix-shell -p python3Packages.python-dotenv python3Packages.requests python3Packages.openai --run "python generate_training_data.py"
|
|
```
|
|
|
|
## How It Works
|
|
|
|
1. Loads character cards from `chars/*.json`
|
|
2. Uses an enforced "GameMaster" system prompt (see `ENFORCED_SYSTEM_PROMPT` in script)
|
|
3. For each character:
|
|
- Uses the character's `first_mes` as the initial assistant message
|
|
- Generates 5 turns of User ↔ Character interaction
|
|
- User responses are generated by a "User Simulator" prompt
|
|
- Character responses use the full system prompt + character description
|
|
4. Saves incrementally to `training_data.json`
|
|
|
|
## Key Code Sections
|
|
|
|
- **Lines 137-197**: The `ENFORCED_SYSTEM_PROMPT` - detailed roleplay instructions
|
|
- **Lines 38-82**: `generate_user_response()` - simulates user input
|
|
- **Lines 84-107**: `generate_character_response()` - generates character replies
|
|
- **Error handling**: Currently catches `APIStatusError` but needs retry logic
|
|
|
|
## API Notes
|
|
|
|
- The local endpoint at `127.0.0.1:8045` is a proxy with rotating API keys
|
|
- Thinking models (`claude-*-thinking`) may have special requirements
|
|
- Error responses sometimes come back as 200 with error text in content
|