AI Models

Free AI inference built into the platform

SAM runs on Cloudflare Workers, and Workers AI provides AI inference at the edge with a free tier. SAM uses Workers AI for task title generation, text-to-speech (TTS), and context summarization, all included with the platform at no extra cost.

Why use Workers AI with SAM

Zero Extra Cost

Workers AI inference is included with SAM — no API keys needed for built-in features like task titles and TTS.

Edge Inference

Models run at Cloudflare's edge — low latency for platform features, no cold starts.

Multiple Models

Llama 4, Gemma 3, Qwen 3, and Deepgram for different tasks — SAM picks the right model automatically.

AI Gateway

All inference is routed through Cloudflare AI Gateway for rate limiting, analytics, and reliability.
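
As a rough sketch of what routing a call through AI Gateway can look like from a Worker: the Workers AI binding accepts a `gateway` option on `run`. The `AiBinding` interface is a trimmed stand-in for the real binding type, and the gateway ID `sam-gateway` and the helper `runViaGateway` are illustrative assumptions, not SAM's actual configuration.

```typescript
// Minimal shape of the Workers AI binding used here (a stand-in for the
// `Ai` type from @cloudflare/workers-types, so the sketch is self-contained).
interface AiBinding {
  run(
    model: string,
    inputs: Record<string, unknown>,
    options?: { gateway?: { id: string; skipCache?: boolean } },
  ): Promise<unknown>;
}

// Hypothetical gateway ID; SAM's real gateway name is not given on this page.
const GATEWAY_ID = "sam-gateway";

// Run a model through AI Gateway so the request appears in gateway
// analytics and is subject to the gateway's rate limits.
function runViaGateway(
  ai: AiBinding,
  model: string,
  inputs: Record<string, unknown>,
): Promise<unknown> {
  return ai.run(model, inputs, { gateway: { id: GATEWAY_ID } });
}
```

Inside a Worker, `ai` would typically be the `env.AI` binding; here it is passed in explicitly so the helper can be exercised with a fake.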


Get started in four steps

Step 1

Already Built In

Workers AI is part of SAM's Cloudflare infrastructure — no setup required.

Step 2

Automatic Model Selection

SAM uses the right model for each task: Gemma for titles, Deepgram for TTS, Llama for summarization.
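
The mapping above can be pictured as a simple lookup table. This is a minimal sketch, not SAM's actual code: the task names, `MODEL_FOR_TASK`, and `selectModel` are hypothetical, and the values are the model families named on this page rather than real Workers AI model identifiers.

```typescript
// Platform tasks named on this page.
type PlatformTask = "task-title" | "text-to-speech" | "summarization";

// Model families as listed on this page; real Workers AI model
// identifiers are intentionally not guessed here.
const MODEL_FOR_TASK: Record<PlatformTask, string> = {
  "task-title": "Gemma",
  "text-to-speech": "Deepgram",
  "summarization": "Llama",
};

// Look up the model family for a given platform task.
function selectModel(task: PlatformTask): string {
  return MODEL_FOR_TASK[task];
}

console.log(selectModel("task-title")); // prints "Gemma"
```

Typing the table as `Record<PlatformTask, string>` means the compiler flags any task that is added without a model assigned to it.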

Step 3

AI Proxy (Optional)

Enable the AI proxy to route custom inference through Workers AI with rate limiting and token budgets.
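
To illustrate the idea of a token budget, here is a minimal fixed-window sketch. The `TokenBudget` class, its limits, and the window length are assumptions made for illustration; this page does not describe how SAM's proxy actually enforces budgets.

```typescript
// Hypothetical per-user token budget over a fixed time window.
class TokenBudget {
  private readonly limit: number; // max tokens per window
  private readonly windowMs: number; // window length in milliseconds
  private windowStart: number;
  private used = 0;

  constructor(limit: number, windowMs: number, now: number = Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.windowStart = now;
  }

  // Try to reserve `tokens`. Returns false when the request would exceed
  // the budget for the current window.
  tryConsume(tokens: number, now: number = Date.now()): boolean {
    if (now - this.windowStart >= this.windowMs) {
      // A new window has started: reset the counter.
      this.windowStart = now;
      this.used = 0;
    }
    if (this.used + tokens > this.limit) return false;
    this.used += tokens;
    return true;
  }

  // Tokens still available in the current window.
  remaining(): number {
    return this.limit - this.used;
  }
}
```

A proxy built this way would call `tryConsume` before forwarding a request and reject it (for example with HTTP 429) when the call returns false.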

Step 4

Monitor Usage

Track AI usage, costs, and model distribution via the admin analytics dashboard.
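
A model-distribution view like the one described above boils down to grouping logged calls by model. The sketch below assumes a hypothetical `UsageRecord` shape and `summarizeUsage` helper; SAM's real analytics schema is not described on this page.

```typescript
// Hypothetical shape of one logged inference call.
interface UsageRecord {
  model: string;
  inputTokens: number;
  outputTokens: number;
}

interface ModelUsage {
  requests: number;
  tokens: number;
  share: number; // fraction of all requests, between 0 and 1
}

// Group usage records by model and compute each model's share of requests.
function summarizeUsage(records: UsageRecord[]): Map<string, ModelUsage> {
  const summary = new Map<string, ModelUsage>();
  for (const r of records) {
    const entry = summary.get(r.model) ?? { requests: 0, tokens: 0, share: 0 };
    entry.requests += 1;
    entry.tokens += r.inputTokens + r.outputTokens;
    summary.set(r.model, entry);
  }
  // With no records the map is empty, so this never divides by zero.
  summary.forEach((entry) => {
    entry.share = entry.requests / records.length;
  });
  return summary;
}
```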


What you can build

Zero-config AI features

Task title generation, text-to-speech for messages, and context summarization work out of the box.

AI proxy gateway

Route LLM requests through SAM's AI proxy for rate limiting, token budgets, and centralized analytics.

Self-hosted AI platform

When self-hosting SAM on Cloudflare's free tier, Workers AI provides AI capabilities without any API key costs.


Frequently asked questions

Is Workers AI really free?

Yes, within the limits of Cloudflare's free tier for Workers AI inference. SAM's built-in features (task titles, TTS, summarization) run on this free tier.

Can I use Workers AI for my coding agents?

Workers AI powers platform features, not the coding agents themselves. Agents use their own models (Claude, GPT, Gemini, Devstral) via their respective API keys.

What models does SAM use?

Llama 4 Scout 17B for general inference, Gemma 3 12B for task titles, Qwen 3 30B for complex tasks, and Deepgram Aura 2 for text-to-speech.


Start running SAM with Workers AI on your own infrastructure

Self-host on Cloudflare's free tier. Bring your own cloud. Your agents, your code.