▶️ Watch the video tutorial (15 minutes)
We've built a full AI platform across this series. Agents, tools, a knowledge base, streaming chat, failover, middleware. Now comes the question nobody likes to deal with: how do you test code that calls an AI? You can't hit OpenAI in your test suite. It's slow, it costs money on every run, and the responses change each time so your assertions never hold.
In this episode we test the whole thing with Pest and zero API calls, using the SDK's faking system. It works exactly like faking mail, notifications, or queues in Laravel. You define the responses up front and assert that the right prompts went out.
I start with a quick cleanup, moving the middleware into an AI/middleware namespace, then write tests that check the agent gets prompted correctly, handles conversation context, and returns deterministic responses. I use preventStrayPrompts to catch accidental AI calls in flows that shouldn't trigger the agent at all, like ticket creation.
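To make the faking idea concrete, here is a minimal plain-PHP sketch of the pattern: canned responses defined up front, prompts recorded for later assertions, and a strict mode that fails on any prompt nobody planned for. The FakeAgent class and its constructor are hypothetical and written only for illustration. This is not the SDK's implementation, and the method names simply mirror the ideas mentioned above.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for an SDK fake. Illustration only, not the real SDK API.
final class FakeAgent
{
    /** @var list<string> Every prompt received, in order. */
    private array $prompts = [];

    private bool $preventStray = false;

    /** @param list<string> $responses Canned replies, handed out first in, first out. */
    public function __construct(private array $responses = [])
    {
    }

    /** After this call, a prompt with no canned reply left throws instead of passing silently. */
    public function preventStrayPrompts(): static
    {
        $this->preventStray = true;

        return $this;
    }

    /** Record the prompt and return the next canned reply. No network call happens. */
    public function prompt(string $text): string
    {
        $this->prompts[] = $text;

        if ($this->responses === []) {
            if ($this->preventStray) {
                throw new RuntimeException("Stray prompt: {$text}");
            }

            return 'fake response';
        }

        return array_shift($this->responses);
    }

    /** Pass if any recorded prompt contains the given text. */
    public function assertPrompted(string $needle): void
    {
        foreach ($this->prompts as $prompt) {
            if (str_contains($prompt, $needle)) {
                return;
            }
        }

        throw new RuntimeException("No prompt contained: {$needle}");
    }
}

// Deterministic reply: the test decides what the "AI" says.
$agent = new FakeAgent(['Your refund is on its way.']);
$reply = $agent->prompt('Where is my refund for order 1001?');
$agent->assertPrompted('refund');

// Strict mode: a flow that should never reach the agent fails loudly if it does.
$strict = (new FakeAgent())->preventStrayPrompts();
$caught = false;

try {
    $strict->prompt('This call was not expected');
} catch (RuntimeException $e) {
    $caught = true;
}
```

The design choice worth noticing is that the fake records first and answers second, so the same object can both feed the code under test and be questioned afterwards about what it was asked.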
Then queued prompts. I assert the TicketClassifier gets queued on a valid submission and stays out of the queue on an invalid one. Custom tools get tested directly against real database records, and a failing test catches a factory mismatch that I fix on camera. That's the kind of bug tests are supposed to catch.
I feature-test the chat endpoint for authentication, validation, and correct agent interaction, then test the vector store setup and knowledge base seeding by faking store creation, document uploads, and embedding generation.
By the end, all 12 AI tests pass, run fast, and never touch an external API. You can run them on every commit without worrying about cost or flakiness.
Next episode we get into security. We protect the agent against prompt injection and other attacks, including screening every message with a local LLM before it ever reaches OpenAI, so the messages that get blocked cost nothing in API fees.