Sandboxes and Containers now support running Docker for “Docker-in-Docker” setups. This is particularly useful when your end users or agents want to run a full sandboxed development environment.
This allows you to:
Develop containerized applications with your Sandbox
Run isolated test environments for images
Build container images as part of CI/CD workflows
Deploy arbitrary images supplied at runtime within a container
We are updating naming related to some of our Networking products to better clarify their place in the Zero Trust and Secure Access Service Edge (SASE) journey.
We are retiring some older brand names in favor of names that describe exactly what the products do within your network. We are doing this to help customers build better, clearer mental models for comprehensive SASE architecture delivered on Cloudflare.
What’s changing
Magic WAN → Cloudflare WAN
Magic WAN IPsec → Cloudflare IPsec
Magic WAN GRE → Cloudflare GRE
Magic WAN Connector → Cloudflare One Appliance
Magic Firewall → Cloudflare Network Firewall
Magic Network Monitoring → Network Flow
No action is required by you — all functionality, existing configurations, and billing will remain exactly the same.
The latest release of the Agents SDK adds built-in retry utilities, per-connection protocol message control, and a fully rewritten @cloudflare/ai-chat with data parts, tool approval persistence, and zero breaking changes.
Retry utilities
A new this.retry() method lets you retry any async operation with exponential backoff and jitter. You can pass an optional shouldRetry predicate to bail early on non-retryable errors.
Retry options are validated eagerly at enqueue/schedule time, and invalid values throw immediately. Internal retries have also been added for workflow operations (terminateWorkflow, pauseWorkflow, and others) with Durable Object-aware error detection.
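To make the backoff behavior concrete, here is a minimal standalone sketch of retrying with exponential backoff, jitter, a shouldRetry predicate, and eager option validation. It is not the implementation behind this.retry(). The names retryWithBackoff, backoffDelay, validateRetryOptions, maxAttempts, baseDelayMs, and maxDelayMs are assumptions made for illustration.

```typescript
// Standalone sketch; option names are hypothetical, not the Agents SDK signature.
interface RetryOptions {
  maxAttempts: number; // total tries, including the first one
  baseDelayMs: number; // delay ceiling before the first retry
  maxDelayMs: number; // upper bound on any single delay
  shouldRetry?: (error: unknown, attempt: number) => boolean;
}

// The delay ceiling doubles with each attempt and is capped at maxDelayMs.
// Jitter picks a random point below that ceiling. `random` is injectable
// so the calculation can be tested deterministically.
function backoffDelay(
  attempt: number,
  opts: RetryOptions,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(opts.maxDelayMs, opts.baseDelayMs * Math.pow(2, attempt - 1));
  return Math.floor(random() * ceiling);
}

// Reject invalid options up front, before any work is attempted.
function validateRetryOptions(opts: RetryOptions): void {
  if (!Number.isInteger(opts.maxAttempts) || opts.maxAttempts < 1) {
    throw new RangeError("maxAttempts must be a positive integer");
  }
  if (opts.baseDelayMs < 0 || opts.maxDelayMs < opts.baseDelayMs) {
    throw new RangeError("delays must satisfy 0 <= baseDelayMs <= maxDelayMs");
  }
}

async function retryWithBackoff<T>(
  fn: (attempt: number) => Promise<T>,
  opts: RetryOptions,
): Promise<T> {
  validateRetryOptions(opts);
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn(attempt);
    } catch (error) {
      const outOfAttempts = attempt >= opts.maxAttempts;
      const retryable = opts.shouldRetry ? opts.shouldRetry(error, attempt) : true;
      // Give up when attempts are exhausted or the predicate says "do not retry".
      if (outOfAttempts || !retryable) throw error;
      const delay = backoffDelay(attempt, opts);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

A predicate such as `(error) => !(error instanceof TypeError)` would stop retrying as soon as a non-retryable error appears, which is the "bail early" behavior described above.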
Per-connection protocol message control
Agents automatically send JSON text frames (identity, state, MCP server lists) to every WebSocket connection. You can now suppress these per-connection for clients that cannot handle them — binary-only devices, MQTT clients, or lightweight embedded systems.
Connections with protocol messages disabled still fully participate in RPC and regular messaging. Use isConnectionProtocolEnabled(connection) to check a connection’s status at any time. The flag persists across Durable Object hibernation.
The first stable release of @cloudflare/ai-chat ships alongside this release with a major refactor of AIChatAgent internals — new ResumableStream class, WebSocket ChatTransport, and simplified SSE parsing — with zero breaking changes. Existing code using AIChatAgent and useAgentChat works as-is.
Key new features:
Data parts — Attach typed JSON blobs (data-*) to messages alongside text. Supports reconciliation (type+id updates in-place), append, and transient parts (ephemeral via onData callback). See Data parts.
Tool approval persistence — The needsApproval approval UI now survives page refresh and DO hibernation. The streaming message is persisted to SQLite when a tool enters approval-requested state.
maxPersistedMessages — Cap SQLite message storage with automatic oldest-message deletion.
body option on useAgentChat — Send custom data with every request (static or dynamic).
Incremental persistence — Hash-based cache to skip redundant SQL writes.
Row size guard — Automatic two-pass compaction when messages approach the SQLite 2 MB limit.
autoContinueAfterToolResult defaults to true — Client-side tool results and tool approvals now automatically trigger a server continuation, matching server-executed tool behavior. Set autoContinueAfterToolResult: false in useAgentChat to restore the previous behavior.
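Two of the persistence items above, the hash-based write cache and the maxPersistedMessages cap, can be illustrated with a small in-memory model. This is only a sketch of the general technique: the MessageStore class, the ChatMessage shape, and the FNV-1a hash are assumptions, and a Map stands in for SQLite.

```typescript
// Hypothetical message shape used only for this sketch.
interface ChatMessage {
  id: string;
  text: string;
}

// FNV-1a over the serialized message; any stable hash would serve here.
function hashMessage(message: ChatMessage): number {
  const input = JSON.stringify(message);
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

class MessageStore {
  writes = 0; // counts simulated storage writes
  private maxPersistedMessages: number;
  private rows = new Map<string, ChatMessage>(); // Map keeps insertion order
  private hashes = new Map<string, number>();

  constructor(maxPersistedMessages: number) {
    this.maxPersistedMessages = maxPersistedMessages;
  }

  persist(message: ChatMessage): void {
    const hash = hashMessage(message);
    // Unchanged content hashes to the same value, so the write is skipped.
    if (this.hashes.get(message.id) === hash) return;
    this.rows.set(message.id, message);
    this.hashes.set(message.id, hash);
    this.writes++;
    // Enforce the cap by deleting the oldest inserted messages first.
    while (this.rows.size > this.maxPersistedMessages) {
      const oldest = this.rows.keys().next().value as string;
      this.rows.delete(oldest);
      this.hashes.delete(oldest);
    }
  }

  ids(): string[] {
    return Array.from(this.rows.keys());
  }
}
```

In this model, re-persisting an identical message costs one hash computation and no write, which is the saving a hash-based cache is meant to provide during streaming, when the same message may be saved repeatedly.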
Notable bug fixes:
Resolved stream resumption race conditions
Resolved an issue where setMessages functional updater sent empty arrays
Resolved an issue where client tool schemas were lost after DO hibernation
Resolved InvalidPromptError after tool approval (approval.id was dropped)
Resolved an issue where message metadata was not propagated on broadcast/resume paths
Resolved an issue where clearAll() did not clear in-memory chunk buffers
Resolved an issue where reasoning-delta silently dropped data when reasoning-start was missed during stream resumption
Synchronous queue and schedule getters
getQueue(), getQueues(), getSchedule(), dequeue(), dequeueAll(), and dequeueAllByCallback() were unnecessarily async despite only performing synchronous SQL operations. They now return values directly instead of wrapping them in Promises. This is backward compatible — existing code using await on these methods will continue to work.
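The backward compatibility follows from how await treats non-Promise values: it resolves to the value itself. The sketch below shows this with a stand-in getter. The getQueue body, the QueueItem shape, and the queueRows data are hypothetical stubs, not the SDK's code.

```typescript
// Hypothetical row shape and data standing in for the SQL-backed queue.
interface QueueItem {
  id: string;
  callback: string;
}
const queueRows: QueueItem[] = [{ id: "q1", callback: "sendEmail" }];

// A synchronous getter: returns the value directly, not a Promise.
function getQueue(id: string): QueueItem | undefined {
  return queueRows.find((row) => row.id === id);
}

// New style: use the returned value immediately.
const direct = getQueue("q1");

// Old style: `await` on a plain value still yields that value.
async function legacyCaller(): Promise<QueueItem | undefined> {
  return await getQueue("q1");
}
```

One caveat that follows from plain JavaScript semantics: code that chained .then() directly on the return value, rather than using await, would no longer find a Promise to call it on.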
Other improvements
Fix TypeScript “excessively deep” error — A depth counter on CanSerialize and IsSerializableParam types bails out to true after 10 levels of recursion, preventing the “Type instantiation is excessively deep” error with deeply nested types like AI SDK CoreMessage[].
POST SSE keepalive — The POST SSE handler now sends event: ping every 30 seconds to keep the connection alive, matching the existing GET SSE handler behavior. This prevents POST response streams from being silently dropped by proxies during long-running tool calls.
Widened peer dependency ranges — Peer dependency ranges across packages have been widened to prevent cascading major bumps during 0.x minor releases. @cloudflare/ai-chat and @cloudflare/codemode are now marked as optional peer dependencies.
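The POST SSE keepalive described above amounts to writing a small event frame on a timer. A minimal sketch, assuming a hypothetical send callback that writes text to the response stream; pingFrame, startKeepalive, and the empty data payload are illustrative choices rather than the SDK's actual handler.

```typescript
const PING_INTERVAL_MS = 30000; // 30 seconds, as stated above

// An SSE event is a block of "field: value" lines ended by a blank line.
function pingFrame(): string {
  return "event: ping\ndata: \n\n";
}

// Starts the periodic ping and returns a function that stops it.
// `send` is a hypothetical callback that writes to the open response.
function startKeepalive(
  send: (frame: string) => void,
  intervalMs: number = PING_INTERVAL_MS,
): () => void {
  const timer = setInterval(() => send(pingFrame()), intervalMs);
  return () => clearInterval(timer);
}
```

The returned stop function should be called once the response finishes, so that the timer does not outlive the stream.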
This week’s release introduces new detections for CVE-2025-68645 and CVE-2025-31125.
Key Findings
CVE-2025-68645: A Local File Inclusion (LFI) vulnerability in the Webmail Classic UI of Zimbra Collaboration Suite (ZCS) 10.0 and 10.1 allows unauthenticated remote attackers to craft requests to the /h/rest endpoint, improperly influence internal dispatching, and include arbitrary files from the WebRoot directory.
CVE-2025-31125: Vite, the JavaScript frontend tooling framework, exposes content of non-allowed files via ?inline&import when its development server is network-exposed, enabling unauthorized attackers to read arbitrary files and potentially leak sensitive information.
Ruleset | Rule ID | Legacy Rule ID | Description | Previous Action | New Action | Comments
--- | --- | --- | --- | --- | --- | ---
Cloudflare Managed Ruleset | 695d76ff756844d384cab548833761f7 | N/A | Zimbra – Local File Inclusion – CVE:CVE-2025-68645 | Log | Block | This is a new detection.
Cloudflare Managed Ruleset | 38fff9f3deba46a2abc10a8f950ed8c8 | N/A | Vite – WASM Import Path Traversal – CVE:CVE-2025-31125 | | |
When AI systems request pages from any website that uses Cloudflare and has Markdown for Agents enabled, they can express a preference for text/markdown in the request. Where possible, our network then converts the HTML to Markdown automatically, on the fly.
This release adds the following improvements:
The origin response limit was raised from 1 MB to 2 MB (2,097,152 bytes).
We no longer require the origin to send the content-length header.
We now support content-encoded responses from the origin.
If you haven’t enabled automatic Markdown conversion yet, visit the AI Crawl Control section of the Cloudflare dashboard and enable Markdown for Agents.
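From the client side, stating the preference could look like the sketch below. It assumes the preference is carried in the standard HTTP Accept header. The helper names, the q-value fallback to HTML, and the content-type check are illustrative assumptions, not a documented client API.

```typescript
// Origin response limit quoted above: 2 MB.
const MAX_ORIGIN_BYTES = 2 * 1024 * 1024;

// Headers a client might send: prefer Markdown, fall back to HTML.
function markdownAcceptHeaders(): Record<string, string> {
  return { Accept: "text/markdown, text/html;q=0.8" };
}

// True when the media type is text/markdown, ignoring parameters
// such as charset.
function isMarkdown(contentType: string | null): boolean {
  if (contentType === null) return false;
  return contentType.split(";")[0].trim().toLowerCase() === "text/markdown";
}

// Performs a real network request, so it is defined but not called here.
async function fetchPreferringMarkdown(
  url: string,
): Promise<{ markdown: boolean; body: string }> {
  const response = await fetch(url, { headers: markdownAcceptHeaders() });
  return {
    markdown: isMarkdown(response.headers.get("content-type")),
    body: await response.text(),
  };
}
```

Checking the response content type matters because conversion happens only when possible, so a client should be prepared to receive HTML.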
Workers VPC now supports Cloudflare Origin CA certificates when connecting to your private services over HTTPS. Previously, Workers VPC only trusted certificates issued by publicly trusted certificate authorities (for example, Let’s Encrypt, DigiCert).
With this change, you can use free Cloudflare Origin CA certificates on your origin servers within private networks and connect to them from Workers VPC using the https scheme. This is useful for encrypting traffic between the tunnel and your service without needing to provision certificates from a public CA.
We’re excited to announce GLM-4.7-Flash on Workers AI, a fast and efficient text generation model optimized for multilingual dialogue and instruction-following tasks, along with the brand-new @cloudflare/tanstack-ai package and workers-ai-provider v3.1.1.
You can now run AI agents entirely on Cloudflare. With GLM-4.7-Flash’s multi-turn tool calling support, plus full compatibility with TanStack AI and the Vercel AI SDK, you have everything you need to build agentic applications that run completely at the edge.
GLM-4.7-Flash — Multilingual Text Generation Model
@cf/zai-org/glm-4.7-flash is a multilingual model with a 131,072 token context window, making it ideal for long-form content generation, complex reasoning tasks, and multilingual applications.
Key Features and Use Cases:
Multi-turn Tool Calling for Agents: Build AI agents that can call functions and tools across multiple conversation turns
Multilingual Support: Built to handle content generation in multiple languages effectively
Large Context Window: 131,072 tokens for long-form writing, complex reasoning, and processing long documents
Fast Inference: Optimized for low-latency responses in chatbots and virtual assistants
Instruction Following: Excellent at following complex instructions for code generation and structured tasks
@cloudflare/tanstack-ai v0.1.1 — TanStack AI adapters for Workers AI and AI Gateway
We’ve released @cloudflare/tanstack-ai, a new package that brings Workers AI and AI Gateway support to TanStack AI. This provides a framework-agnostic alternative for developers who prefer TanStack’s approach to building AI applications.
Workers AI adapters support four configuration modes — plain binding (env.AI), plain REST, AI Gateway binding (env.AI.gateway(id)), and AI Gateway REST — across all capabilities:
Chat (createWorkersAiChat) — Streaming chat completions with tool calling, structured output, and reasoning text streaming.
Summarization (createWorkersAiSummarize) — Text summarization.
AI Gateway adapters route requests from third-party providers — OpenAI, Anthropic, Gemini, Grok, and OpenRouter — through Cloudflare AI Gateway for caching, rate limiting, and unified billing.
To get started:
npm install @cloudflare/tanstack-ai @tanstack/ai
workers-ai-provider v3.1.1 — transcription, speech, reranking, and reliability
The Workers AI provider for the Vercel AI SDK now supports three new capabilities beyond chat and image generation:
Transcription (provider.transcription(model)) — Speech-to-text with automatic handling of model-specific input formats across binding and REST paths.
Text-to-speech (provider.speech(model)) — Audio generation with support for voice and speed options.
Reranking (provider.reranking(model)) — Document reranking for RAG pipelines and search result ordering.
import { createWorkersAI } from "workers-ai-provider";
// ...
  documents: ["ML is a branch of AI.", "The weather is sunny."],
});
This release also includes a comprehensive reliability overhaul (v3.0.5):
Fixed streaming — Responses now stream token-by-token instead of buffering all chunks, using a proper TransformStream pipeline with backpressure.
Fixed tool calling — Resolved issues with tool call ID sanitization, conversation history preservation, and a heuristic that silently fell back to non-streaming mode when tools were defined.
Premature stream termination detection — Streams that end unexpectedly now report finishReason: "error" instead of silently reporting "stop".
AI Search support — Added createAISearch as the canonical export (renamed from AutoRAG). createAutoRAG still works with a deprecation warning.
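The premature termination check in the list above can be pictured as tracking whether an explicit end marker arrived before the stream closed. This is a simplified sketch: the StreamChunk shape, the "done" marker, and collectStream are assumptions for illustration and do not reflect the provider's internal types.

```typescript
type FinishReason = "stop" | "error";

// Hypothetical chunk shape: text deltas followed by an explicit done marker.
type StreamChunk = { type: "text"; value: string } | { type: "done" };

function collectStream(
  chunks: Iterable<StreamChunk>,
): { text: string; finishReason: FinishReason } {
  let text = "";
  let sawDone = false;
  for (const chunk of chunks) {
    if (chunk.type === "done") {
      sawDone = true;
      break;
    }
    text += chunk.value;
  }
  // A stream that ends without the marker was cut short, so report "error"
  // rather than pretending it stopped normally.
  return { text, finishReason: sawDone ? "stop" : "error" };
}
```

Any text received before the cut is still returned, so a caller can decide whether to show the partial output or retry.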