Growing the Cloudflare AI team with talent from Ensemble AI
2026-06-15
Cloudflare is deepening our investment in AI with the addition of team members from Ensemble AI, focusing on machine learning infrastructure and efficiency....
2026-06-15
Cloudflare is deepening our investment in AI with the addition of team members from Ensemble AI, focusing on machine learning infrastructure and efficiency....
2026-04-16
We're building AI Gateway into a unified inference layer for AI, letting developers call models from 14+ providers. New features include Workers AI binding integration and an expanded catalog with multimodal models. ...
2026-04-16
We built a custom technology stack to run fast large language models on Cloudflare’s infrastructure. This post explores the engineering trade-offs and technical optimizations required to make high-performance AI inference accessible....
2026-03-19
Kimi K2.5 is now on Workers AI, helping you power agents entirely on Cloudflare’s Developer Platform. Learn how we optimized our inference stack and reduced inference costs for internal agent use cases. ...
2025-11-25
FLUX.2 [dev] by Black Forest Labs is now on Workers AI! This advanced open-weight image model offers superior photorealism, multi-reference inputs, and granular control with JSON prompting....
2025-09-25
Championing AI sovereignty through choice: diverse tools, data control, and no vendor lock-in. We're enabling this in India, Japan, and Southeast Asia, offering local, open-source models on Workers AI...
2025-08-27
AI Gateway now gives you access to your favorite AI models, dynamic routing and more — through just one endpoint....
2025-08-27
We're expanding Workers AI with new partner models from Leonardo.Ai and Deepgram. Start using state-of-the-art image generation models from Leonardo and real-time TTS and STT models from Deepgram. ...
2025-08-05
OpenAI’s newest open-source models are now available on Cloudflare Workers AI on Day 0, with support for Responses API, Code Interpreter and Web Search (coming soon)....
2025-04-11
We just made Workers AI inference faster with speculative decoding & prefix caching. Use our new batch inference for handling large request volumes seamlessly....
2025-04-06
Llama 4 Scout 17B Instruct is now available on Workers AI: use this multimodal, Mixture of Experts AI model on Cloudflare's serverless AI platform to build next-gen AI applications....
2024-09-26
Cloudflare helps you build AI applications with fast inference at the edge, optimized AI workflows, and vector database-powered RAG solutions....
2024-07-23
Cloudflare is excited to be a launch partner with Meta to introduce Workers AI support for Llama 3.1...
2024-06-27
Introducing a new way to do function calling in Workers AI by running function code alongside your inference. Plus, a new @cloudflare/ai-utils package to make getting started as simple as possible...
2024-05-22
AI Gateway is an AI ops platform that provides speed, reliability, and observability for your AI applications. With a single line of code, you can unlock powerful features including rate limiting, custom caching, real-time logs, and aggregated analytics across multiple providers...
2024-04-18
We are thrilled to give developers around the world the ability to build AI applications with Meta Llama 3 using Workers AI. We are proud to be a launch partner with Meta for their newest 8B Llama 3 model...
2024-04-02
Today, we’re excited to make a series of announcements, including Workers AI, Cloudflare’s inference platform becoming GA and support for fine-tuned models with LoRAs and one-click deploys from HuggingFace. Cloudflare Workers now supports the Python programming language, and more...
2024-04-02
Workers AI now supports fine-tuned models using LoRAs. But what is a LoRA and how does it work? In this post, we dive into fine-tuning, LoRAs and even some math to share the details of how it all works under the hood...
2024-03-14
The Workers AI and AI Gateway team recently collaborated closely with security researchers at Ben Gurion University regarding a report submitted through our Public Bug Bounty program. Through this process, we discovered and fully patched a vulnerability affecting all LLM providers. Here’s the story...
2024-02-28
In February 2024 we added 8 models for text generation, classification, and code generation use cases. Today, we’re back with 17 more models, focused on enabling new types of tasks and use cases ...