Workers VPC now supports Cloudflare Origin CA certificates when connecting to your private services over HTTPS. Previously, Workers VPC only trusted certificates issued by publicly trusted certificate authorities (for example, Let’s Encrypt, DigiCert).
With this change, you can use free Cloudflare Origin CA certificates on your origin servers within private networks and connect to them from Workers VPC using the https scheme. This is useful for encrypting traffic between the tunnel and your service without needing to provision certificates from a public CA.
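As a rough sketch of what this looks like from a Worker, the example below calls a private service over https through a VPC service binding. The binding name `PRIVATE_API`, the hostname, and the minimal `VpcServiceBinding` interface are assumptions made for illustration; they are not taken from the announcement.

```typescript
// Minimal structural type for a Workers VPC service binding.
// Assumption: the binding exposes a fetch() method like other Worker bindings.
interface VpcServiceBinding {
  fetch(input: string): Promise<Response>;
}

// PRIVATE_API is a hypothetical binding name configured for this Worker.
interface Env {
  PRIVATE_API: VpcServiceBinding;
}

// Calls a private service using the https scheme.
// The hostname is a placeholder for a service presenting an Origin CA certificate.
async function getInternalStatus(env: Env): Promise<string> {
  const response = await env.PRIVATE_API.fetch(
    "https://internal-api.example.local/status",
  );
  if (!response.ok) {
    throw new Error(`Private service returned ${response.status}`);
  }
  return response.text();
}
```

Typing the binding structurally keeps the function easy to exercise with a stub in unit tests.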
We’re excited to announce GLM-4.7-Flash on Workers AI, a fast and efficient text generation model optimized for multilingual dialogue and instruction-following tasks, along with the brand-new @cloudflare/tanstack-ai package and workers-ai-provider v3.1.1.
You can now run AI agents entirely on Cloudflare. With GLM-4.7-Flash’s multi-turn tool calling support, plus full compatibility with TanStack AI and the Vercel AI SDK, you have everything you need to build agentic applications that run completely at the edge.
GLM-4.7-Flash — Multilingual Text Generation Model
@cf/zai-org/glm-4.7-flash is a multilingual model with a 131,072 token context window, making it ideal for long-form content generation, complex reasoning tasks, and multilingual applications.
Key Features and Use Cases:
Multi-turn Tool Calling for Agents: Build AI agents that can call functions and tools across multiple conversation turns
Multilingual Support: Built to handle content generation in multiple languages effectively
Large Context Window: 131,072 tokens for long-form writing, complex reasoning, and processing long documents
Fast Inference: Optimized for low-latency responses in chatbots and virtual assistants
Instruction Following: Excellent at following complex instructions for code generation and structured tasks
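To make the multi-turn tool calling idea concrete, here is a minimal, model-agnostic sketch of an agent loop. The message and tool-call shapes are simplified assumptions rather than the exact Workers AI schema, and `callModel` is a hypothetical wrapper that you would implement around your model invocation (for example, a call to `env.AI.run` with `@cf/zai-org/glm-4.7-flash`).

```typescript
// Simplified shapes, assumed for illustration only.
type ToolCall = { name: string; arguments: Record<string, unknown> };
type Message = { role: "user" | "assistant" | "tool"; content: string };
type ModelReply = { text: string; toolCalls: ToolCall[] };

// Hypothetical wrapper around the actual model call.
type CallModel = (messages: Message[]) => Promise<ModelReply>;
type ToolHandler = (args: Record<string, unknown>) => Promise<string> | string;

// Runs the model repeatedly, executing requested tools between turns,
// until the model answers without asking for another tool.
async function runAgent(
  callModel: CallModel,
  tools: Record<string, ToolHandler>,
  userPrompt: string,
  maxTurns = 5,
): Promise<string> {
  const messages: Message[] = [{ role: "user", content: userPrompt }];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await callModel(messages);
    if (reply.toolCalls.length === 0) {
      return reply.text; // No tool requested: this is the final answer.
    }
    messages.push({ role: "assistant", content: reply.text });
    for (const call of reply.toolCalls) {
      const handler = tools[call.name];
      const result = handler
        ? await handler(call.arguments)
        : `Unknown tool: ${call.name}`;
      // Feed the tool result back so the next turn can use it.
      messages.push({
        role: "tool",
        content: JSON.stringify({ tool: call.name, result }),
      });
    }
  }
  throw new Error(`No final answer after ${maxTurns} turns`);
}
```

The `maxTurns` cap is a design choice to stop a model that keeps requesting tools from looping indefinitely.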
@cloudflare/tanstack-ai v0.1.1 — TanStack AI adapters for Workers AI and AI Gateway
We’ve released @cloudflare/tanstack-ai, a new package that brings Workers AI and AI Gateway support to TanStack AI. This provides a framework-agnostic alternative for developers who prefer TanStack’s approach to building AI applications.
Workers AI adapters support four configuration modes — plain binding (env.AI), plain REST, AI Gateway binding (env.AI.gateway(id)), and AI Gateway REST — across all capabilities:
Chat (createWorkersAiChat) — Streaming chat completions with tool calling, structured output, and reasoning text streaming.
Summarization (createWorkersAiSummarize) — Text summarization.
AI Gateway adapters route requests from third-party providers — OpenAI, Anthropic, Gemini, Grok, and OpenRouter — through Cloudflare AI Gateway for caching, rate limiting, and unified billing.
To get started:
npm install @cloudflare/tanstack-ai @tanstack/ai
workers-ai-provider v3.1.1 — transcription, speech, reranking, and reliability
The Workers AI provider for the Vercel AI SDK now supports three new capabilities beyond chat and image generation:
Transcription (provider.transcription(model)) — Speech-to-text with automatic handling of model-specific input formats across binding and REST paths.
Text-to-speech (provider.speech(model)) — Audio generation with support for voice and speed options.
Reranking (provider.reranking(model)) — Document reranking for RAG pipelines and search result ordering.
import { createWorkersAI } from "workers-ai-provider";
  documents: ["ML is a branch of AI.", "The weather is sunny."],
});
This release also includes a comprehensive reliability overhaul (v3.0.5):
Fixed streaming — Responses now stream token-by-token instead of buffering all chunks, using a proper TransformStream pipeline with backpressure.
Fixed tool calling — Resolved issues with tool call ID sanitization, conversation history preservation, and a heuristic that silently fell back to non-streaming mode when tools were defined.
Premature stream termination detection — Streams that end unexpectedly now report finishReason: "error" instead of silently reporting "stop".
AI Search support — Added createAISearch as the canonical export (renamed from AutoRAG). createAutoRAG still works with a deprecation warning.
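The streaming and premature-termination fixes can be illustrated with a small TransformStream sketch. This is not the provider's actual implementation: the `StreamEvent` shape and the helper names are assumptions chosen for the example. Each event is forwarded as soon as it arrives, and if the source closes without a finish event, the stream emits one with `finishReason: "error"`.

```typescript
// Simplified event shape, assumed for illustration only.
type StreamEvent =
  | { type: "text-delta"; text: string }
  | { type: "finish"; finishReason: "stop" | "error" };

// Forwards events one by one and guarantees a terminal finish event.
function withTerminationCheck(): TransformStream<StreamEvent, StreamEvent> {
  let sawFinish = false;
  return new TransformStream<StreamEvent, StreamEvent>({
    transform(event, controller) {
      if (event.type === "finish") sawFinish = true;
      controller.enqueue(event); // No buffering: pass each chunk through.
    },
    flush(controller) {
      // The source closed without a finish event: report an error
      // instead of letting the consumer assume a normal stop.
      if (!sawFinish) {
        controller.enqueue({ type: "finish", finishReason: "error" });
      }
    },
  });
}

// Helper that pipes a fixed list of events through the check and collects the output.
async function collect(events: StreamEvent[]): Promise<StreamEvent[]> {
  const source = new ReadableStream<StreamEvent>({
    start(controller) {
      for (const event of events) controller.enqueue(event);
      controller.close();
    },
  });
  const reader = source.pipeThrough(withTerminationCheck()).getReader();
  const out: StreamEvent[] = [];
  for (;;) {
    const result = await reader.read();
    if (result.done) break;
    out.push(result.value);
  }
  return out;
}
```

Because the check lives in `flush`, it only runs when the upstream closes, so a normally finished stream passes through unchanged.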
Cloudflare’s network now supports real-time content conversion at the source for enabled zones, using content negotiation headers. When AI systems request pages from any website that uses Cloudflare and has Markdown for Agents enabled, they can express a preference for text/markdown in the request. Our network then automatically converts the HTML to Markdown on the fly, when possible.
Here is a curl example with the Accept negotiation header requesting this page from our developer documentation:
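The same negotiation can be sketched in TypeScript using the standard fetch API. The function names are hypothetical, and the sketch assumes that a converted response declares a text/markdown content type. If the zone does not have Markdown for Agents enabled, the server is expected to return its usual HTML.

```typescript
// Builds a request that states a preference for Markdown via the Accept header.
function buildMarkdownRequest(url: string): Request {
  return new Request(url, {
    headers: { Accept: "text/markdown" },
  });
}

// Fetches a page and reports whether Markdown was actually returned.
// Assumption: converted responses carry a text/markdown Content-Type.
async function fetchAsMarkdown(
  url: string,
): Promise<{ isMarkdown: boolean; body: string }> {
  const response = await fetch(buildMarkdownRequest(url));
  const contentType = response.headers.get("content-type") ?? "";
  return {
    isMarkdown: contentType.includes("text/markdown"),
    body: await response.text(),
  };
}
```

Checking the response content type lets a client fall back to its own HTML handling when conversion was not applied.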
Radar now includes content type insights for AI bot and crawler traffic. The new content_type dimension and filter show the distribution of content types returned to AI crawlers, grouped by MIME type category.
The content type dimension and filter are available via the following API endpoints:
Serialization – Binary API formats (application/protobuf, application/grpc, application/msgpack)
Other – All other content types
Additionally, individual bot information pages now display content type distribution for AI crawlers that exist in both the Verified Bots and AI Bots datasets.