Ship AI with Laravel: Failover, Queues, and Middleware for AI Agents
Published by Harris Raftopoulos
▶️ Watch the video tutorial (10 minutes)
The support platform works great in development. Then you ship it. OpenAI has an outage at 2am and every customer gets an error. A hundred people hit the chat at once and requests pile up. A response comes back wrong, and three weeks later you have no way to figure out why. This episode is about the infrastructure that handles all of that.
We start with provider failover, because a provider outage is the scariest of the three problems. Right now the agent is locked to a single provider. I show how to pass an array of providers instead, so if OpenAI fails the SDK automatically retries with Anthropic, then Gemini. Same agent, same tools, same instructions, just a different provider behind the scenes. The customer never knows. I set it up at the call site first, then move the whole failover chain into config so you can change it without touching code. I also add a listener for the AgentFailedOver event so you get alerted the moment a provider drops.
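To make the failover idea concrete, here is a minimal sketch in plain PHP. It is not the SDK's API: promptWithFailover, the provider closures, and the alert callback are hypothetical stand-ins that only illustrate trying each provider in order and announcing each switch, roughly what the provider array and the AgentFailedOver listener do for you.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: try each provider in order, return the first success.
// $providers maps a provider name to a callable that takes the prompt text.
function promptWithFailover(array $providers, string $prompt, ?callable $onFailover = null): array
{
    $lastError = null;
    $names = array_keys($providers);

    foreach ($names as $i => $name) {
        try {
            return ['provider' => $name, 'text' => $providers[$name]($prompt)];
        } catch (RuntimeException $e) {
            $lastError = $e;
            $next = $names[$i + 1] ?? null;
            // Announce the switch, similar in spirit to an event listener.
            if ($next !== null && $onFailover !== null) {
                $onFailover($name, $next, $e);
            }
        }
    }

    throw new RuntimeException('All providers failed', 0, $lastError);
}

// Fake providers for illustration: the first one simulates an outage.
$providers = [
    'openai' => function (string $p): string {
        throw new RuntimeException('503 from provider');
    },
    'anthropic' => fn (string $p): string => "reply to: $p",
    'gemini' => fn (string $p): string => "reply to: $p",
];

$alerts = [];
$result = promptWithFailover(
    $providers,
    'Where is my order?',
    function (string $from, string $to, RuntimeException $e) use (&$alerts): void {
        $alerts[] = "$from -> $to";
    }
);
```

In this sketch the order of the array keys is the failover order, which is why keeping that list in config makes the chain easy to change.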
Next, queues. Not every AI call needs to block a request. I take the TicketClassifier from Episode 2 and swap the prompt call for queue, so classification runs in the background through Laravel's queue system. The customer gets an instant confirmation and the AI work happens behind the scenes, with failover still in place on the queued job.
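The shape of that change can be sketched without Laravel at all. The example below is an assumption-heavy illustration, not the SDK or Laravel's queue: an in-memory SplQueue stands in for the real queue, and classify is a hypothetical placeholder for the AI call. It shows only the ordering that matters, which is that the confirmation is returned before any classification happens.

```php
<?php
declare(strict_types=1);

// In-memory stand-ins for the queue and the tickets table.
$queue = new SplQueue();
$tickets = [42 => ['body' => 'I was charged twice', 'category' => null]];

// Request side: enqueue the work and answer immediately.
function submitTicket(SplQueue $queue, int $ticketId): string
{
    $queue->enqueue($ticketId);

    return "Ticket #$ticketId received";
}

// Hypothetical placeholder for the AI classification call.
function classify(string $body): string
{
    return str_contains($body, 'charged') ? 'billing' : 'general';
}

// Worker side: drain the queue later, outside the request.
function work(SplQueue $queue, array &$tickets): void
{
    while (!$queue->isEmpty()) {
        $id = $queue->dequeue();
        $tickets[$id]['category'] = classify($tickets[$id]['body']);
    }
}

$confirmation = submitTicket($queue, 42);
$categoryBeforeWorker = $tickets[42]['category']; // still unclassified here
work($queue, $tickets);
```

With a real queue the worker runs in a separate process, so a slow or failing provider delays the classification but never the customer's confirmation.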
Then middleware. The SDK lets you intercept every prompt and response flowing through an agent, exactly like HTTP middleware. I build three. One logs each prompt along with response metadata like token counts, provider, and duration. One handles per-user rate limiting at ten prompts a minute. One tracks cost using token counts so you can calculate spend per user, per agent, per day. Then I attach all three to the agent in a deliberate order, because rate limiting should run before you waste time logging or tracking a request you're about to reject.
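Why the order matters is easier to see in a small pipeline. The sketch below is plain PHP under stated assumptions, not the SDK's middleware interface: the pipeline helper, the array-shaped prompt and response, the fixed one-minute window, and the token prices are all hypothetical. The rate limiter is listed first, so a rejected prompt never reaches the logger, the cost tracker, or the agent.

```php
<?php
declare(strict_types=1);

$hits = [];   // prompts per user per minute window
$log = [];    // one entry per completed prompt
$spend = [];  // running cost per user, in USD

// Rejects a user's 11th prompt inside the same one-minute window.
$rateLimit = function (array $prompt, callable $next) use (&$hits) {
    $key = $prompt['user'] . ':' . intdiv($prompt['at'], 60);
    $hits[$key] = ($hits[$key] ?? 0) + 1;
    if ($hits[$key] > 10) {
        throw new RuntimeException('Rate limit exceeded');
    }

    return $next($prompt);
};

// Records provider, token count, and duration after the response returns.
$logPrompts = function (array $prompt, callable $next) use (&$log) {
    $start = microtime(true);
    $response = $next($prompt);
    $log[] = [
        'user' => $prompt['user'],
        'provider' => $response['provider'],
        'tokens' => $response['input_tokens'] + $response['output_tokens'],
        'ms' => (microtime(true) - $start) * 1000,
    ];

    return $response;
};

// Hypothetical prices: $1 per million input tokens, $2 per million output.
$trackCost = function (array $prompt, callable $next) use (&$spend) {
    $response = $next($prompt);
    $cost = ($response['input_tokens'] * 1.0 + $response['output_tokens'] * 2.0) / 1_000_000;
    $spend[$prompt['user']] = ($spend[$prompt['user']] ?? 0.0) + $cost;

    return $response;
};

// Wraps the agent so the first middleware in the list runs first.
function pipeline(array $middleware, callable $agent): callable
{
    return array_reduce(
        array_reverse($middleware),
        fn (callable $next, callable $mw) => fn (array $prompt) => $mw($prompt, $next),
        $agent
    );
}

// Fake agent returning fixed token counts.
$agent = fn (array $prompt): array => [
    'provider' => 'openai', 'input_tokens' => 100, 'output_tokens' => 50, 'text' => 'ok',
];

$run = pipeline([$rateLimit, $logPrompts, $trackCost], $agent);

$rejected = 0;
for ($i = 0; $i < 11; $i++) {
    try {
        $run(['user' => 7, 'text' => 'hi', 'at' => 1_700_000_000]);
    } catch (RuntimeException $e) {
        $rejected++;
    }
}
```

Eleven prompts go in, ten are logged and costed, and the eleventh is rejected before any logging or cost tracking runs. Swapping the order so logging comes first would record the rejected attempt as well, which may be what you want for auditing but is wasted work otherwise.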
By the end, the agent survives provider outages, handles background work, and logs everything you need to debug and budget in production.
Next episode we test the entire system with Pest, faking agents, asserting prompts, and testing tools so nothing breaks even when you ship on a Friday afternoon.