Priority: TOP — This informs schema design, skill prompts, and render pipeline.
Last Updated: 2025-12-28
Status: ✅ REFERENCES VALIDATED (cloud + on-device) — synthesis still pending
Problem Identified (resolved for references/): Early drafts were created from web searches and aggregator sources and required systematic validation against official vendor documentation.
What exists:
- 50 cloud model reference docs
- 3 on-device model compilation docs (55+ models)
- Synthesis documents in `planning/synthesis/*` (not yet revalidated)
- Model ID audit (complete for models already documented; new models may appear over time)
What is now done (2025-12-28):
- All existing `references/**` docs have been validated against official vendor documentation (for cloud) or primary upstream sources (HuggingFace/GitHub) for on-device.
- Gaps (models that should exist in the library but do not yet have dedicated docs) are tracked in:
  - `references/GAPS.md`
  - `references/MODEL-INVENTORY.md`
Remaining risk: Any earlier inaccuracies may still exist in `planning/synthesis/*` until those documents are revalidated against the now-canonical reference docs.
Each document requires an agent to:
- Fetch official vendor documentation
- Compare EVERY claim in the reference doc
- Verify prompting vocabulary matches official guidance
- Verify capabilities (resolution, duration, formats)
- Verify pricing (cross-check with MODEL-AUDIT.md)
- Verify API parameters and endpoints
- Update the reference doc with corrections
- Note any new features/capabilities not captured
On-device model docs require similar validation against HuggingFace model cards and GitHub repos.
After reference docs are validated, verify synthesis docs reflect corrected information.
The canonical “what’s covered vs missing” list lives in:
- `references/MODEL-INVENTORY.md`
- `references/GAPS.md`
The tables below are a convenience snapshot for this plan doc.
| Document | Model | Provider | Primary Source | Status |
|---|---|---|---|---|
| references/video/veo-3.md | Veo 3.1 | Google | cloud.google.com/vertex-ai/generative-ai/docs | COVERED |
| references/video/sora-2.md | Sora 2 | OpenAI | platform.openai.com/docs | COVERED |
| references/video/runway-gen4.5.md | Gen-4/4.5 | Runway | docs.dev.runwayml.com | COVERED |
| references/video/kling-2.1.md | Kling 2.1 | Kuaishou | klingai.com/global/dev | COVERED |
| references/video/luma-ray3.md | Ray2/Ray3 | Luma AI | docs.lumalabs.ai | COVERED |
| references/video/hailuo-02.md | Hailuo 02 | MiniMax | platform.minimaxi.com/docs/api-reference/video-generation-intro | COVERED |
| references/video/midjourney-video.md | Midjourney Video | Midjourney | docs.midjourney.com/docs/video | COVERED |
| references/video/seedance-1.5-pro.md | Seedance 1.5 Pro / 1.0 family | ByteDance (Volcengine Ark) | volcengine.com/docs/82379 | COVERED |
| references/video/pika-2.md | Pika 2.2 (via fal.ai) | Pika | fal.ai/models | COVERED |
| references/video/pixverse.md | PixVerse (v5.5) | PixVerse | docs.platform.pixverse.ai | COVERED |
| references/video/haiper-2.x.md | Haiper Video 2.x | Haiper | docs.haiper.ai/api-reference | COVERED |
| references/video/vidu.md | Vidu (viduq1 / 2.0 / 1.5) | Vidu | docs.platform.vidu.com | COVERED |
| references/video/firefly-video.md | Firefly Video (Generate Video API) | Adobe | developer.adobe.com/firefly-services/docs | COVERED |
| references/video/nova-reel.md | Nova Reel | AWS (Amazon Bedrock) | docs.aws.amazon.com/nova/latest/userguide | COVERED |
| references/video/alibaba-wan.md | Wan (Wan2.x / Wanx2.1 + VACE editing) | Alibaba Cloud (Model Studio / DashScope) | alibabacloud.com/help | COVERED |
| Document | Model | Provider | Primary Source | Status |
|---|---|---|---|---|
| references/image/nano-banana-pro.md | Nano Banana / Nano Banana Pro | Google | ai.google.dev/gemini-api/docs/image-generation | COVERED |
| references/image/imagen-4.md | Imagen 4 | Google | ai.google.dev/gemini-api/docs/imagen | COVERED |
| references/image/flux-2.md | FLUX.2 | Black Forest Labs | docs.bfl.ai | COVERED |
| references/image/flux-kontext.md | FLUX.1 Kontext | Black Forest Labs | docs.bfl.ai/kontext | COVERED |
| references/image/gpt-image.md | GPT Image 1.5 | OpenAI | platform.openai.com/docs/guides/image-generation | COVERED |
| references/image/midjourney.md | Midjourney V7 | Midjourney | docs.midjourney.com | COVERED |
| references/image/ideogram-3.md | Ideogram 3.0 | Ideogram | developer.ideogram.ai | COVERED |
| references/image/seedream-4.md | Seedream 4.5 | ByteDance | docs.byteplus.com | COVERED |
| references/image/firefly-image.md | Firefly Image (API) | Adobe | developer.adobe.com/firefly-services/docs | COVERED |
| references/image/stability-image.md | Stable Image + SD 3.5 (API) | Stability AI | api.stability.ai/v2alpha/openapi | COVERED |
| references/image/nova-canvas.md | Nova Canvas | AWS (Amazon Bedrock) | docs.aws.amazon.com/nova/latest/userguide | COVERED |
| references/image/minimax-image.md | MiniMax Image Generation (image-01, image-01-live) | MiniMax | platform.minimaxi.com/docs/api-reference/image-generation-intro | COVERED |
| references/image/recraft.md | Recraft (Recraft API) | Recraft | recraft.ai/docs/api-reference | COVERED |
| references/image/leonardo.md | Leonardo (Image API) | Leonardo AI | docs.leonardo.ai/reference | COVERED |
| references/image/reve-image.md | Reve Image API (Create/Edit/Remix) | Reve | api.reve.com | COVERED |
| references/image/krea.md | Krea (Image/Video API) | Krea | docs.krea.ai/api-reference | COVERED |
| references/image/freepik-mystic.md | Freepik Mystic | Freepik | docs.freepik.com/api-reference | COVERED |
| Document | Model | Provider | Primary Source | Status |
|---|---|---|---|---|
| references/audio/elevenlabs.md | ElevenLabs TTS | ElevenLabs | elevenlabs.io/docs | COVERED |
| references/audio/eleven-music.md | Eleven Music | ElevenLabs | elevenlabs.io/docs | COVERED |
| references/audio/minimax-music.md | MiniMax Music 2.0 (music-2.0) | MiniMax | platform.minimaxi.com/docs/api-reference/music-intro | COVERED |
| references/audio/suno-v5.md | Suno v5 | Suno | help.suno.com | COVERED |
| references/audio/udio.md | Udio v1.5 | Udio | help.udio.com | COVERED |
| references/audio/openai-tts.md | OpenAI TTS | OpenAI | platform.openai.com/docs/guides/text-to-speech | COVERED |
| references/audio/fish-audio-openaudio-s1.md | OpenAudio S1 | Fish Audio | docs.fish.audio | COVERED |
| references/audio/cartesia-sonic.md | Sonic 3 | Cartesia | docs.cartesia.ai | COVERED |
| references/audio/playht.md | PlayHT | PlayHT | docs.play.ht | COVERED |
| references/audio/gemini-tts.md | Gemini Preview TTS | Google (Gemini API) | ai.google.dev/gemini-api/docs/speech-generation | COVERED |
| references/audio/minimax-speech.md | MiniMax Speech (T2A + Async + Voice Design/Cloning) | MiniMax | platform.minimaxi.com/docs/api-reference/speech-t2a-intro | COVERED |
| references/audio/google-cloud-tts.md | Google Cloud TTS | Google Cloud | cloud.google.com/text-to-speech | COVERED |
| references/audio/azure-tts.md | Azure TTS | Microsoft | learn.microsoft.com/azure/ai-services/speech-service | COVERED |
| references/audio/amazon-polly.md | Amazon Polly | AWS | docs.aws.amazon.com/polly | COVERED |
| references/audio/respeecher.md | Respeecher | Respeecher | docs.respeecher.com | COVERED |
| references/audio/stable-audio.md | Stable Audio 2 / 2.5 | Stability AI | api.stability.ai/v2alpha/openapi | COVERED |
| references/audio/lyria-2.md | Lyria 2 | Google | docs.cloud.google.com/vertex-ai/generative-ai/docs | COVERED |
| references/audio/lyria-realtime.md | Lyria RealTime | Google (Gemini API) | ai.google.dev/gemini-api/docs/music-generation | COVERED |
| Document | Models | Primary Sources | Status |
|---|---|---|---|
| references/video/on-device-models.md | compilation doc | HuggingFace model cards, GitHub | COVERED |
| references/image/on-device-models.md | compilation doc | HuggingFace model cards, GitHub | COVERED |
| references/audio/on-device-models.md | compilation doc | HuggingFace model cards | COVERED |
**Task**: Validate the reference document for [MODEL] against official [VENDOR] documentation.
**Reference Document**: `references/[category]/[file].md`
**Primary Source**: [OFFICIAL_DOCS_URL]
**Secondary Sources**: [AGGREGATOR_URLS]
**Validation Checklist**:
1. **Model Identity**
- [ ] Correct model name/version
- [ ] Correct API model_id (cross-check MODEL-AUDIT.md)
- [ ] Correct provider attribution
2. **Capabilities**
- [ ] Resolution limits verified
- [ ] Duration limits verified
- [ ] Supported formats verified
- [ ] Feature claims verified (audio support, text rendering, etc.)
3. **Pricing**
- [ ] Current pricing verified
- [ ] Pricing tiers/variants verified
- [ ] Credit system (if applicable) verified
4. **API Documentation**
- [ ] Endpoint format verified
- [ ] Authentication method verified
- [ ] Required parameters verified
- [ ] Optional parameters verified
- [ ] Response format verified
5. **Prompting Guide**
- [ ] Camera movement vocabulary verified (video)
- [ ] Style/aesthetic terminology verified (image)
- [ ] Voice/emotion controls verified (audio)
- [ ] Best practices match official guidance
- [ ] Example prompts verified
6. **Limitations**
- [ ] Known limitations documented
- [ ] Rate limits documented
- [ ] Content restrictions documented
**Output**:
- List of CONFIRMED items (with evidence links)
- List of CORRECTIONS needed (with correct information and evidence)
- List of ADDITIONS (new features/capabilities not in current doc)
- Updated reference document content
**Quality Bar**:
- Every claim must have evidence from official source
- No "seems" or "probably" - use UNKNOWN if unverifiable
- Preserve document structure, only update content
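Where it helps the Phase 3 merge, the Output above can also be captured in a machine-readable shape. A minimal sketch, assuming hypothetical `Finding`/`ValidationReport` names (nothing here is prescribed by the plan):

```python
from dataclasses import dataclass, field
from typing import Literal

@dataclass
class Finding:
    claim: str                            # the claim as stated in the reference doc
    status: Literal["CONFIRMED", "CORRECTION", "ADDITION", "UNKNOWN"]
    evidence_url: str                     # official source backing the finding
    corrected_value: str | None = None    # populated for CORRECTION/ADDITION only

@dataclass
class ValidationReport:
    document: str                         # e.g. "references/video/veo-3.md"
    primary_source: str                   # official docs URL used
    findings: list[Finding] = field(default_factory=list)

    def corrections(self) -> list[Finding]:
        """Items the merge phase must apply back to the reference doc."""
        return [f for f in self.findings if f.status == "CORRECTION"]
```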
Validate `references/video/veo-3.md` against:
- https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/veo-video-generation
- https://cloud.google.com/vertex-ai/generative-ai/docs/models/veo/3-1-generate
- https://ai.google.dev/gemini-api/docs
Focus areas:
- Timestamp prompting format (is [00:00-00:03] correct?)
- Audio generation capabilities
- Camera movement vocabulary (what terms does Google recommend?)
- Resolution/duration limits
- Pricing per second
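For orientation while validating, a minimal sketch of a Veo call via the google-genai SDK. The model ID and config fields are assumptions to confirm against the sources above:

```python
import time
from google import genai
from google.genai import types

client = genai.Client()  # reads GEMINI_API_KEY from the environment
operation = client.models.generate_videos(
    model="veo-3.1-generate-preview",  # assumed ID; confirm during validation
    prompt="Slow dolly-in on a lighthouse at dusk, waves crashing",
    config=types.GenerateVideosConfig(aspect_ratio="16:9"),
)
while not operation.done:              # Veo jobs are long-running operations
    time.sleep(10)
    operation = client.operations.get(operation)
video = operation.response.generated_videos[0]
```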
Validate `references/video/sora-2.md` against:
- https://platform.openai.com/docs/models/sora-2
- https://platform.openai.com/docs/models/sora-2-pro
- https://platform.openai.com/docs/api-reference/videos
Focus areas:
- Multi-scene capabilities
- Duration limits (sora-2 vs sora-2-pro)
- Resolution options
- Prompt structure recommendations
- Credit/pricing system
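A hedged sketch of a request against the Videos API cited above; the `seconds` and `size` parameter names are assumptions the validation should confirm:

```python
import os
import requests

resp = requests.post(
    "https://api.openai.com/v1/videos",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "sora-2",   # vs "sora-2-pro": duration/resolution limits differ
        "prompt": "A paper boat drifting down a rainy gutter, macro lens",
        "seconds": "8",      # assumed duration parameter
        "size": "1280x720",  # assumed resolution parameter
    },
)
resp.raise_for_status()
job = resp.json()            # returns a job object to poll by id
print(job["id"], job["status"])
```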
Validate `references/video/runway-gen4.5.md` against:
- https://docs.dev.runwayml.com/guides/models/
- https://docs.dev.runwayml.com/guides/pricing/
Focus areas:
- Gen-4 vs Gen-4.5 availability (Gen-4.5 API not yet available per audit)
- Motion Brush documentation
- Camera control parameters
- Credit system
Validate `references/video/kling-2.1.md` against:
- https://klingai.com/global/dev
- https://app.klingai.com/global/dev/document-api
Focus areas:
- Model tiers (standard/pro/master)
- Lip-sync capabilities
- Camera movement vocabulary
- Duration limits per tier
- Pricing structure
Validate `references/video/luma-ray3.md` against:
- https://docs.lumalabs.ai/docs/api
- https://lumalabs.ai/learning-hub
Focus areas:
- Ray2 vs Ray3 availability (Ray3 API not yet available per audit)
- HDR capabilities
- Draft mode documentation
- Credit system
Validate `references/video/hailuo-02.md` against:
- https://platform.minimaxi.com/docs/api-reference/video-generation-intro
Focus areas:
- Model variants (02 vs 2.3 vs 2.3-Fast)
- Resolution/duration options
- Pricing per resolution tier
Validate `references/image/nano-banana-pro.md` against:
- https://ai.google.dev/gemini-api/docs/nanobanana
- https://ai.google.dev/gemini-api/docs/image-generation
- https://ai.google.dev/gemini-api/docs/pricing
Focus areas:
- Correct model IDs (`gemini-2.5-flash-image`, `gemini-3-pro-image-preview`)
- Token/pricing tables and image-size token costs
- 4K output + “Thinking” + thought signatures behavior (Pro)
- Prompting vocabulary + official prompt templates
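A minimal sketch of an image call with the first model ID above, using the google-genai SDK; exact response handling should be confirmed during validation:

```python
from google import genai

client = genai.Client()  # reads GEMINI_API_KEY from the environment
response = client.models.generate_content(
    model="gemini-2.5-flash-image",
    contents="A photorealistic nano banana on a velvet cushion, studio lighting",
)
for part in response.candidates[0].content.parts:
    if part.inline_data:  # image bytes come back as inline data parts
        with open("banana.png", "wb") as f:
            f.write(part.inline_data.data)
```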
Validate `references/image/imagen-4.md` against:
- https://ai.google.dev/gemini-api/docs/imagen
- https://cloud.google.com/vertex-ai/generative-ai/docs/models/imagen/4-0-generate
- https://cloud.google.com/vertex-ai/generative-ai/pricing
Focus areas:
- Model variants (fast/standard/ultra) and IDs
- Pricing (Gemini API vs Vertex AI pricing surfaces)
- Aspect ratio + output size constraints
- Prompting guidance (official)
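A companion sketch for the Gemini API surface, assuming one of the variant IDs to be verified above:

```python
from google import genai
from google.genai import types

client = genai.Client()
resp = client.models.generate_images(
    model="imagen-4.0-generate-001",  # assumed variant ID; verify fast/standard/ultra
    prompt="Isometric papercraft diorama of a harbor town",
    config=types.GenerateImagesConfig(number_of_images=1, aspect_ratio="16:9"),
)
with open("harbor.png", "wb") as f:
    f.write(resp.generated_images[0].image.image_bytes)
```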
Validate `references/image/flux-2.md` against:
- https://docs.bfl.ai/quick_start/generating_images
- https://docs.bfl.ai/flux_2/flux2_overview
- https://bfl.ai/pricing
Focus areas:
- All FLUX.2 variants (pro/max/flex/dev)
- Endpoint-based API (not model_id based)
- Text rendering capabilities
- Pricing per megapixel
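To make the endpoint-based distinction concrete: the model is selected by URL path rather than a `model_id` body field. A sketch under that assumption; the `flux-2-pro` path name is hypothetical and must be confirmed in docs.bfl.ai:

```python
import os
import time
import requests

base = "https://api.bfl.ai/v1"
headers = {"x-key": os.environ["BFL_API_KEY"]}

task = requests.post(
    f"{base}/flux-2-pro",  # hypothetical endpoint name; confirm the exact path
    headers=headers,
    json={"prompt": "Neon sign reading 'OPEN 24H', wet asphalt reflections"},
).json()

while True:  # results come back asynchronously, keyed by task id
    result = requests.get(f"{base}/get_result", headers=headers,
                          params={"id": task["id"]}).json()
    if result["status"] == "Ready":
        break
    time.sleep(2)
print(result["result"]["sample"])  # assumed field for the output image URL
```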
Validate `references/image/gpt-image.md` against:
- https://platform.openai.com/docs/models/gpt-image-1.5
- https://platform.openai.com/docs/guides/image-generation
Focus areas:
- Model versions (1.5 vs 1 vs 1-mini)
- Token-based pricing
- Quality tiers
- Text rendering accuracy
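A minimal sketch via the official SDK; the 1.5/mini model IDs above are the thing to verify, so the ID below is an assumption:

```python
import base64
from openai import OpenAI

client = OpenAI()
result = client.images.generate(
    model="gpt-image-1",  # "1.5"/"mini" IDs are assumptions to verify
    prompt="A storefront sign reading 'GPT IMAGE', photorealistic",
    size="1024x1024",
    quality="medium",     # quality tiers per the focus note above
)
with open("sign.png", "wb") as f:
    f.write(base64.b64decode(result.data[0].b64_json))
```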
Validate `references/image/midjourney.md` against:
- https://docs.midjourney.com
Focus areas:
- V7 capabilities
- API availability (still no public API?)
- Parameter syntax (--ar, --stylize, etc.)
- Style reference system
Validate `references/image/ideogram-3.md` against:
- https://developer.ideogram.ai/api-reference/api-reference/generate-v3
- https://ideogram.ai/features/3.0
Focus areas:
- Version 3.0 features
- Text rendering accuracy claims
- Style Codes feature
- API endpoint format
Validate `references/image/seedream-4.md` against:
- https://docs.byteplus.com/en/docs/ModelArk
- https://seed.bytedance.com/en/seedream4_5
Focus areas:
- API availability (via BytePlus ModelArk)
- Multi-reference fusion capabilities
- Speed benchmarks
- Pricing
Validate `references/audio/elevenlabs.md` against:
- https://elevenlabs.io/docs/overview/models
- https://elevenlabs.io/docs/api-reference
Focus areas:
- Model IDs (eleven_v3, eleven_multilingual_v2, etc. - use underscores!)
- Voice cloning requirements
- Stability/similarity controls
- Pricing per character
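A minimal sketch of a TTS call illustrating the underscore model IDs and stability/similarity controls; the voice ID is a placeholder:

```python
import os
import requests

voice_id = "JBFqnCBsd6RMkjVDRZzb"  # placeholder voice ID
resp = requests.post(
    f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}",
    headers={"xi-api-key": os.environ["ELEVENLABS_API_KEY"]},
    json={
        "text": "Validation complete.",
        "model_id": "eleven_multilingual_v2",  # underscores, not hyphens
        "voice_settings": {"stability": 0.5, "similarity_boost": 0.75},
    },
)
resp.raise_for_status()
with open("out.mp3", "wb") as f:
    f.write(resp.content)
```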
Validate `references/audio/suno-v5.md` against:
- https://help.suno.com
- https://suno.com
Focus areas:
- v5 capabilities vs v4
- NO official API (only third-party wrappers)
- Song duration limits
- Lyric formatting
Validate `references/audio/udio.md` against:
- https://help.udio.com
- https://www.udio.com/blog
Focus areas:
- v1.5 and v1.5 Allegro differences
- NO official API (Udio explicitly states this)
- Stem separation features
- Key control
Validate `references/audio/openai-tts.md` against:
- https://platform.openai.com/docs/guides/text-to-speech
- https://platform.openai.com/docs/api-reference/audio
Focus areas:
- Model IDs (tts-1, tts-1-hd, gpt-4o-mini-tts)
- Voice options
- Instructions support (gpt-4o-mini-tts only)
- Pricing structure
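A minimal sketch using the official SDK; per the note above, `instructions` is assumed to be honored only by gpt-4o-mini-tts:

```python
from openai import OpenAI

client = OpenAI()
with client.audio.speech.with_streaming_response.create(
    model="gpt-4o-mini-tts",
    voice="alloy",
    input="All reference docs validated.",
    instructions="Speak in a calm, even tone.",  # gpt-4o-mini-tts only
) as response:
    response.stream_to_file("speech.mp3")
```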
Validate `references/audio/fish-audio-openaudio-s1.md` against:
- https://docs.fish.audio/api-reference/endpoint/openapi-v1/text-to-speech
- https://docs.fish.audio/developer-guide/models-pricing
Focus areas:
- Model ID is just "s1" in API
- Pricing per UTF-8 bytes
- Emotion control capabilities
- Voice cloning
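Per-UTF-8-byte pricing means multibyte scripts bill higher than their character counts suggest; a quick estimator:

```python
def billable_bytes(text: str) -> int:
    """Estimate billable size under per-UTF-8-byte pricing."""
    return len(text.encode("utf-8"))

print(billable_bytes("hello"))      # 5 bytes (1 byte per ASCII char)
print(billable_bytes("こんにちは"))   # 15 bytes (3 bytes per char)
```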
Validate `references/audio/cartesia-sonic.md` against:
- https://docs.cartesia.ai/build-with-cartesia/tts-models
- https://cartesia.ai/pricing
Focus areas:
- Sonic-3 vs Sonic-2 vs Sonic-turbo
- Date-stamped version snapshots
- State Space Models claims
- Latency benchmarks
**Task**: Validate on-device model [MODEL] against HuggingFace/GitHub.
**Sources**:
- HuggingFace model card: [HF_URL]
- GitHub repo: [GITHUB_URL]
**MANDATORY Validation Checklist**:
1. **Hardware Requirements**
- [ ] Minimum VRAM verified
- [ ] Recommended VRAM verified
- [ ] RAM requirements verified
2. **Mac Compatibility** (CRITICAL - user uses MacBook)
- [ ] MPS (Metal) support: YES/NO/PARTIAL
- [ ] Apple Silicon (M1/M2/M3/M4) tested: YES/NO/UNKNOWN
- [ ] Mac-specific installation steps documented
- [ ] Mac performance benchmarks if available
- [ ] Known Mac limitations or issues
3. **License**
- [ ] License type verified
- [ ] Commercial use allowed: YES/NO/CONDITIONAL
- [ ] Revenue limits (if any)
4. **Model Specs**
- [ ] Parameter count verified
- [ ] Current version/release date
- [ ] Output specs (resolution, duration, quality)
5. **Quality Claims**
- [ ] Benchmark scores verified with source
- [ ] Comparison claims verified
**Output**: Corrections + Mac compatibility assessment
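A small probe agents could run for the Mac checklist above, using only stock PyTorch APIs (note that MPS being available does not guarantee every op a given model needs is supported):

```python
import os
import torch

def mps_report() -> dict:
    """Summarize Metal (MPS) support on the current machine."""
    return {
        "mps_built": torch.backends.mps.is_built(),          # PyTorch compiled with MPS
        "mps_available": torch.backends.mps.is_available(),  # Metal device usable now
        "cpu_fallback": os.environ.get("PYTORCH_ENABLE_MPS_FALLBACK") == "1",
    }

device = "mps" if torch.backends.mps.is_available() else "cpu"
```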
| # | Model | HuggingFace/GitHub | Focus |
|---|---|---|---|
| 1 | HunyuanVideo 1.5 | tencent/HunyuanVideo-1.5 | GGUF options, VRAM, SSTA claims |
| 2 | Wan2.1/2.2 | Wan-AI/Wan2.1-T2V-14B, Wan-AI/Wan2.2-TI2V-5B | MoE architecture, Apache 2.0 |
| 3 | LTX-Video | Lightricks/LTX-Video | MPS support, speed claims |
| 4 | CogVideoX | THUDM/CogVideoX-5b, THUDM/CogVideoX-2b | Quantization, Mac support |
| 5 | Mochi 1 | genmo/mochi-1-preview | VRAM requirements, ComfyUI |
| 6 | Stable Video Diffusion | stabilityai/stable-video-diffusion-img2vid-xt | License, optimizations |
| 7 | Open-Sora 2.0 | hpcaitech/Open-Sora | VRAM, output specs |
| 8 | Open-Sora Plan | PKU-YuanGroup/Open-Sora-Plan | v1.5 capabilities |
| 9 | AnimateDiff | guoyww/AnimateDiff | VRAM by config, SDXL support |
| 10 | SkyReels V1 | SkyworkAI/SkyReels-V1 | Human-centric features, VBench |
| 11 | Pyramid Flow | rain1011/pyramid-flow-sd3 | MIT license, Mac support |
| 12 | Kandinsky 5.0 | kandinskylab/Kandinsky-5.0-T2V-Lite | 10s video, attention engines |
| 13 | Step-Video | stepfun-ai/Step-Video-T2V | 30B params, multi-GPU |
| # | Model | HuggingFace | Focus |
|---|---|---|---|
| 14 | SD 1.5 | runwayml/stable-diffusion-v1-5 | License, ecosystem |
| 15 | SDXL | stabilityai/stable-diffusion-xl-base-1.0 | License terms, refiner |
| 16 | SDXL Turbo | stabilityai/sdxl-turbo | Steps, resolution limits |
| 17 | SDXL Lightning | ByteDance | 2-8 step quality |
| 18 | SD 3.5 Medium | stabilityai/stable-diffusion-3.5-medium | License (<$1M), VRAM |
| 19 | SD 3.5 Large | stabilityai/stable-diffusion-3.5-large | Quantization options |
| 20 | FLUX.1 Schnell | black-forest-labs/FLUX.1-schnell | Apache 2.0, NF4 options |
| 21 | FLUX.1 Dev | black-forest-labs/FLUX.1-dev | Non-commercial terms |
| 22 | FLUX.2 Dev | black-forest-labs/FLUX.2-dev | 32B params, consumer viability |
| 23 | Stable Cascade | stabilityai/stable-cascade | 3-stage architecture |
| 24 | PixArt-Sigma | PixArt-alpha/PixArt-Sigma-XL-2-1024-MS | DiT architecture, 4K |
| 25 | HiDream-I1 | HiDream.ai | 17B params, GGUF variants |
| 26 | Z-Image Turbo | Tongyi-MAI/Z-Image-Turbo | #1 leaderboard, bilingual |
| 27 | Kolors | Kwai-Kolors/Kolors | Commercial registration |
| 28 | Playground v2.5 | playgroundai/playground-v2.5-1024px-aesthetic | Open vs v3 closed |
| 29 | HunyuanDiT | Tencent | OpenVINO, Chinese |
| 30 | DeepFloyd IF | DeepFloyd/IF-I-XL-v1.0 | Text rendering, VRAM |
| 31 | Kandinsky 5.0 Lite | kandinskylab/kandinsky-5.0-image-lite | Multi-modal family |
| # | Model | HuggingFace/GitHub | Focus |
|---|---|---|---|
| 32 | Chatterbox | ResembleAI/chatterbox | MIT, emotion control, 63.8% pref |
| 33 | Fish Speech/OpenAudio S1 | fishaudio/fish-speech | CC-BY-NC, #1 TTS-Arena |
| 34 | CosyVoice2 | FunAudioLLM/CosyVoice2-0.5B | Apache 2.0, streaming |
| 35 | Kokoro-82M | hexgrad/Kokoro-82M | Apache 2.0, 82M params |
| 36 | F5-TTS | SWivid/F5-TTS | CC-BY-NC weights |
| 37 | IndexTTS-2 | index-tts/index-tts | Duration control |
| 38 | XTTS v2 | coqui/XTTS-v2 | Coqui license, 17 langs |
| 39 | StyleTTS2 | yl4579/StyleTTS2 | MIT, human-level |
| 40 | GPT-SoVITS | RVC-Boss/GPT-SoVITS | MIT, singing support |
| 41 | Bark | suno/bark | MIT, sound effects |
| 42 | OpenVoice v2 | myshell-ai/OpenVoiceV2 | MIT, lightweight |
| 43 | Piper | rhasspy/piper | MIT, CPU-only |
| 44 | Tortoise TTS | neonbjb/tortoise-tts | Apache 2.0, slow |
| 45 | WhisperSpeech | WhisperSpeech/WhisperSpeech | Apache 2.0/MIT |
| 46 | MaskGCT | Amphion | ICLR 2025, 6 langs |
| 47 | OuteTTS | edwko/OuteTTS | MIT, llama.cpp |
| 48 | Spark-TTS | SparkAudio/Spark-TTS-0.5B | CC-BY-NC-SA |
| # | Model | HuggingFace/GitHub | Focus |
|---|---|---|---|
| 49 | ACE-Step | ACE-Step/ACE-Step-v1-3.5B | Apache 2.0, 4min songs |
| 50 | YuE | multimodal-art-projection/YuE | Apache 2.0, 5min |
| 51 | DiffRhythm | ASLP-lab/DiffRhythm | Apache 2.0, 4m45s |
| 52 | MusicGen | facebook/musicgen-large | CC-BY-NC, variants |
| 53 | Stable Audio Open | stabilityai/stable-audio-open-1.0 | <$1M license |
| 54 | Riffusion | riffusion/riffusion-model-v1 | MIT, spectrograms |
| 55 | Magenta RT | | Open weights, real-time |
Phase 1: Cloud Models (18 Opus agents, parallel)
- 6 video model agents
- 6 image model agents
- 6 audio model agents
- Each validates against official vendor docs
- Returns: corrections, updated content, evidence links
Phase 2: On-Device Models (55 agents, parallel batches)
- 13 video model agents
- 18 image model agents
- 17 TTS model agents
- 7 music model agents
- Each validates against HuggingFace + GitHub
- CRITICAL: Mac compatibility verification for each model
Phase 3: Merge & Update
- Merge all corrections into reference docs
- Update 3 on-device compilation docs with per-model corrections
- Cross-check against MODEL-AUDIT.md
Phase 4: Synthesis Update
- Update PROMPT-VOCABULARY.md with verified terminology
- Update comparison docs with verified capabilities
- Update COST-OPTIMIZATION.md with verified pricing
Phase 5: Finalize
- Mark all documents as validated
- Update CONTINUITY.md
- Layer 2 truly complete
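A sketch of the fan-out in Phases 1-2, assuming a hypothetical `run_validation_agent()` helper; batching keeps the 55 on-device agents within rate limits:

```python
import asyncio

async def run_validation_agent(task: dict) -> dict:
    # Hypothetical helper: dispatch one agent with the template above and
    # collect its CONFIRMED/CORRECTIONS/ADDITIONS output.
    raise NotImplementedError

async def run_phase(tasks: list[dict], batch_size: int = 10) -> list[dict]:
    results: list[dict] = []
    for i in range(0, len(tasks), batch_size):  # bounded parallelism per batch
        batch = tasks[i:i + batch_size]
        results += await asyncio.gather(*(run_validation_agent(t) for t in batch))
    return results
```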
| Category | Cloud Agents | On-Device Agents | Total |
|---|---|---|---|
| Video | 7 | 13 | 20 |
| Image | 7 | 18 | 25 |
| Audio (TTS) | 4 | 17 | 21 |
| Audio (Music) | 2 | 7 | 9 |
| Total | 20 | 55 | 75 |
- All 20 cloud model docs validated against official sources
- All 55 on-device models validated against HuggingFace/GitHub
- Mac compatibility verified for every on-device model
- Every prompting guide verified against vendor recommendations
- Every capability claim has evidence link
- MODEL-AUDIT.md corrections applied to reference docs
- Synthesis docs updated to reflect corrected information
- CONTINUITY.md updated with completion status
references/
├── README.md # Library index (needs status update)
├── GLOSSARY.md # Terms and conventions
├── GAPS.md # Known gaps
├── VALIDATION-REPORT.md # Accuracy verification (needs update)
├── video/
│ ├── README.md
│ ├── veo-3.md # NEEDS REVIEW
│ ├── sora-2.md # NEEDS REVIEW
│ ├── runway-gen4.5.md # NEEDS REVIEW
│ ├── kling-2.1.md # NEEDS REVIEW
│ ├── luma-ray3.md # NEEDS REVIEW
│ ├── hailuo-02.md # NEEDS REVIEW
│ ├── midjourney-video.md # NEEDS REVIEW
│ └── on-device-models.md # NEEDS REVIEW
├── image/
│ ├── README.md
│ ├── nano-banana-pro.md # NEEDS REVIEW
│ ├── imagen-4.md # NEEDS REVIEW
│ ├── flux-2.md # NEEDS REVIEW
│ ├── gpt-image.md # NEEDS REVIEW
│ ├── midjourney.md # NEEDS REVIEW
│ ├── ideogram-3.md # NEEDS REVIEW
│ ├── seedream-4.md # NEEDS REVIEW
│ └── on-device-models.md # NEEDS REVIEW
└── audio/
├── README.md
├── elevenlabs.md # NEEDS REVIEW
├── suno-v5.md # NEEDS REVIEW
├── udio.md # NEEDS REVIEW
├── openai-tts.md # NEEDS REVIEW
├── fish-audio-openaudio-s1.md # NEEDS REVIEW
├── cartesia-sonic.md # NEEDS REVIEW
└── on-device-models.md # NEEDS REVIEW
planning/synthesis/
├── MODEL-AUDIT.md # COMPLETE (model IDs verified)
├── VIDEO-COMPARISON.md # NEEDS UPDATE after validation
├── IMAGE-COMPARISON.md # NEEDS UPDATE after validation
├── AUDIO-COMPARISON.md # NEEDS UPDATE after validation
├── PROMPT-VOCABULARY.md # NEEDS UPDATE after validation
├── COST-OPTIMIZATION.md # NEEDS UPDATE after validation
├── SCHEMA-RECOMMENDATIONS.md # NEEDS UPDATE after validation
└── INTEGRATION-PATTERNS.md # NEEDS UPDATE after validation
This research plan was updated 2025-12-27 to require full library re-validation before Layer 2 can be considered complete.