Free AI inference built into the platform
SAM runs on Cloudflare Workers, and Workers AI provides AI inference at the edge with a free tier. SAM uses Workers AI for task title generation, text-to-speech (TTS), and context summarization, so these features come with the platform at no extra cost.
Why use Workers AI with SAM
Zero Extra Cost
Workers AI inference is included with SAM. Built-in features like task titles and TTS need no API keys.
Edge Inference
Models run at Cloudflare's edge — low latency for platform features, no cold starts.
Multiple Models
Llama 4, Gemma 3, Qwen 3, and Deepgram for different tasks — SAM picks the right model automatically.
AI Gateway
All inference is routed through Cloudflare AI Gateway for rate limiting, analytics, and reliability.
Get started in four steps
Already Built In
Workers AI is part of SAM's Cloudflare infrastructure — no setup required.
Automatic Model Selection
SAM uses the right model for each task: Gemma for titles, Deepgram for TTS, Llama for summarization.
AI Proxy (Optional)
Enable the AI proxy to route custom inference through Workers AI with rate limiting and token budgets.
Monitor Usage
Track AI usage, costs, and model distribution via the admin analytics dashboard.
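As a rough illustration of automatic model selection, the task-to-model mapping described above can be sketched as a lookup table. This is a minimal sketch, not SAM's actual routing code: the `PlatformTask` names, the `MODEL_FOR_TASK` table, and the `pickModel` helper are hypothetical, and the strings are model family labels rather than real Workers AI model identifiers.

```typescript
// Hypothetical task names; SAM's internal names may differ.
type PlatformTask = "task-title" | "text-to-speech" | "summarization";

// Model family labels taken from this page. These are NOT real
// Workers AI model identifiers, only placeholders for illustration.
const MODEL_FOR_TASK: Record<PlatformTask, string> = {
  "task-title": "Gemma",
  "text-to-speech": "Deepgram",
  "summarization": "Llama",
};

// Return the model family assigned to a platform task.
function pickModel(task: PlatformTask): string {
  return MODEL_FOR_TASK[task];
}
```

Typing the table as a `Record` over the task union means the compiler flags any task that has no model assigned.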
What you can build
Zero-config AI features
Task title generation, text-to-speech for messages, and context summarization work out of the box.
AI proxy gateway
Route LLM requests through SAM's AI proxy for rate limiting, token budgets, and centralized analytics.
Self-hosted AI platform
When you self-host SAM on Cloudflare's free tier, Workers AI provides the built-in AI capabilities without separate API keys or third-party provider fees.
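To make the rate limiting and token budgets of the AI proxy more concrete, here is a minimal in-memory sketch of how such a check could work. It is an illustration under stated assumptions, not SAM's implementation: `ProxyGuard`, `ProxyLimits`, and every field name are hypothetical, the rate limiter is a simple fixed window, and a real deployment would persist state rather than keep it in a `Map`.

```typescript
// Hypothetical limits; the names and shape are assumptions for this sketch.
interface ProxyLimits {
  maxRequestsPerWindow: number; // requests allowed per rate-limit window
  windowMs: number;             // length of the fixed window in milliseconds
  tokenBudget: number;          // total tokens a user may consume
}

interface UsageState {
  windowStart: number;
  requestsInWindow: number;
  tokensUsed: number;
}

type Decision =
  | { allowed: true }
  | { allowed: false; reason: "rate_limited" | "budget_exceeded" };

class ProxyGuard {
  private readonly limits: ProxyLimits;
  private readonly usage = new Map<string, UsageState>();

  constructor(limits: ProxyLimits) {
    this.limits = limits;
  }

  // Decide whether a request may proceed, and record it if so.
  // `now` is passed in (milliseconds) so the logic stays deterministic.
  check(userId: string, estimatedTokens: number, now: number): Decision {
    const state: UsageState = this.usage.get(userId) ?? {
      windowStart: now,
      requestsInWindow: 0,
      tokensUsed: 0,
    };

    // Start a new fixed window once the previous one has elapsed.
    if (now - state.windowStart >= this.limits.windowMs) {
      state.windowStart = now;
      state.requestsInWindow = 0;
    }
    this.usage.set(userId, state);

    if (state.requestsInWindow >= this.limits.maxRequestsPerWindow) {
      return { allowed: false, reason: "rate_limited" };
    }
    if (state.tokensUsed + estimatedTokens > this.limits.tokenBudget) {
      return { allowed: false, reason: "budget_exceeded" };
    }

    state.requestsInWindow += 1;
    state.tokensUsed += estimatedTokens;
    return { allowed: true };
  }
}
```

Rejected requests are not counted against either limit in this sketch. Whether denied calls should consume quota is a design choice that depends on how strictly abuse needs to be throttled.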
Frequently asked questions
Is Workers AI really free?
Yes, within limits. Cloudflare provides a free tier for Workers AI inference, and SAM's built-in features (task titles, TTS, summarization) run on it. Usage beyond the free allocation is subject to Cloudflare's limits and pricing.
Can I use Workers AI for my coding agents?
Workers AI powers platform features, not the coding agents themselves. Agents use their own models (Claude, GPT, Gemini, Devstral) via their respective API keys.
What models does SAM use?
Llama 4 Scout 17B for general inference, Gemma 3 12B for task titles, Qwen 3 30B for complex tasks, and Deepgram Aura 2 for text-to-speech.
Start running SAM with Workers AI on your infrastructure
Self-host on Cloudflare's free tier. Bring your own cloud. Your agents, your code.