
Azure AI Foundry’s October updates offer a meaningful step forward for teams building intelligent systems at scale. From multimodal reasoning and voice interaction to reinforcement fine-tuning and low-latency agents, this release brings enhancements that speak directly to what many organisations need right now: faster development cycles, more intuitive enterprise AI experiences, and enterprise-grade reliability.
Here’s a breakdown of what’s new, along with our take on what it means for your technical strategy.
Major Model Updates
GPT-5-Codex Now Generally Available
The latest evolution in code-first AI is here. GPT‑5‑Codex blends natural language understanding with code and image reasoning, making it a valuable asset for software engineering teams. With support for architecture diagrams and UI screenshots, it can contextualise and debug code more effectively, which makes it well suited to identifying edge cases, reviewing complex pull requests, or untangling legacy infrastructure.
If you’re tackling large migrations or cross-stack refactors, GPT-5-Codex’s deep repository insight and longer context windows help manage that complexity. Now available in the Azure AI Foundry Model Catalogue, it’s ready for integration into your toolchain.
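As a rough illustration of what that integration could look like, here is a minimal sketch using the Azure OpenAI Python client to ask the model for a pull-request review; the endpoint, API version, and the deployment name gpt-5-codex are placeholders for your own Foundry deployment.

```python
# Minimal sketch: calling a GPT-5-Codex deployment for a code review task.
# Assumes an Azure AI Foundry / Azure OpenAI deployment named "gpt-5-codex";
# the endpoint, API version, and deployment name are placeholders.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-21",  # use the API version your resource supports
)

with open("changes.diff") as fh:
    diff = fh.read()

response = client.chat.completions.create(
    model="gpt-5-codex",  # your deployment name
    messages=[
        {"role": "system", "content": "You are a senior reviewer. Flag edge cases and risky changes."},
        {"role": "user", "content": f"Review this pull request diff:\n\n{diff}"},
    ],
)

print(response.choices[0].message.content)
```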
Sora Video-to-Video (Preview)
Sora’s new video-to-video transformation capabilities bring a fresh layer of control to creative and synthetic workflows. Upload a clip, describe how you’d like it styled or paced, and Sora generates a new version that preserves timeline flow while applying the requested visual and tonal changes.
This is particularly promising for marketing, prototyping, or even simulation data pipelines. It reduces the friction of video iteration and opens doors for teams working in content-heavy environments. Note that the preview programme enforces safety guidelines and size constraints.
gpt-realtime (GA)
If responsiveness matters in your voice-based applications, gpt-realtime is a breakthrough. By merging speech recognition, reasoning, and speech synthesis into a single pass, this model reduces latency to the point where conversations feel natural, even under pressure.
It also handles image inputs mid-conversation and includes expressive voices, making it a solid foundation for live customer support, voice-guided workflows, or agent-led collaboration. Combined with the Voice Live API, it’s a key enabler for next-generation digital agents.
Grok 4 Fast Models (Preview)
Grok 4 Fast, from xAI, is now in preview, introducing two variants designed to balance speed and cognitive depth. The models support 131K context windows, image understanding, and parallel function calls, with one tuned for deep analytical tasks and the other for high-speed classification.
If your architecture calls for fast, low-cost results most of the time, with the ability to escalate to deeper analysis when needed, this tiered setup could be an effective pattern. Structured JSON output and strong multimodal capabilities round out the package.
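One way to realise that tiered pattern is a small router that sends every request to the fast variant first and escalates only when a cheap self-check fails. The sketch below is illustrative: call_grok is a hypothetical wrapper around whichever inference client you use, and the model identifiers are assumptions rather than official names.

```python
# Illustrative tiered routing: fast model by default, escalate to the
# reasoning-tuned variant only when the first answer looks uncertain.
# call_grok() is a hypothetical wrapper around your inference client;
# the model identifiers below are placeholders.

FAST_MODEL = "grok-4-fast-non-reasoning"   # placeholder
DEEP_MODEL = "grok-4-fast-reasoning"       # placeholder


def call_grok(model: str, prompt: str) -> str:
    """Hypothetical helper: send `prompt` to `model` and return the text reply."""
    raise NotImplementedError("wire this up to your Azure AI Foundry client")


def answer(prompt: str) -> str:
    draft = call_grok(FAST_MODEL, prompt)
    # Cheap self-check: ask the fast model whether its own answer holds up.
    verdict = call_grok(
        FAST_MODEL,
        f"Answer YES or NO: is this response complete and well supported?\n\n{draft}",
    )
    if verdict.strip().upper().startswith("YES"):
        return draft
    # Escalate to the deeper variant only when needed, keeping average cost low.
    return call_grok(DEEP_MODEL, prompt)
```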
o4-mini RFT Graduates to GA
Fine-tuning doesn’t need to rely on massive supervised datasets. o4-mini now supports Reinforcement Fine-Tuning (RFT), which allows you to train models based on what good looks like, using grading functions instead of hardcoded labels.
That’s a big win for nuanced applications: multi-step reasoning, policy adherence, or performance goals that don’t map neatly to right-or-wrong answers. RFT is efficient to get started with (you can define preferences in as few as 100 examples) and gives you more granular control over model behaviour.
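To make the idea of a grading function concrete, here is a hypothetical sketch of a grader that scores an answer between 0 and 1 against a simple policy rubric rather than an exact label; the rubric and function shape are illustrative, not the RFT API itself.

```python
# Illustrative grading function for reinforcement fine-tuning.
# Instead of an exact-match label, it returns a score in [0, 1] reflecting
# how well the output satisfies a policy: reaches the expected conclusion,
# cites a source, and stays within a length budget. Names are hypothetical.

def grade(sample_output: str, expected_conclusion: str) -> float:
    score = 0.0
    if expected_conclusion.lower() in sample_output.lower():
        score += 0.5          # reached the right conclusion
    if "source:" in sample_output.lower():
        score += 0.3          # cited a source, per policy
    if len(sample_output.split()) <= 250:
        score += 0.2          # stayed within the length budget
    return score


# Example: grade("... Source: clause 4.2 ... a refund is due", "refund is due") -> 1.0
```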
Platform & Tooling Improvements
Browser Automation & Computer Use (Preview)
Azure’s Browser Automation tool is getting more resilient and more observable, thanks to enhanced telemetry. It now handles dynamic web elements more gracefully, which is good news for teams automating across SaaS interfaces.
Meanwhile, the new Computer Use tool (also in preview) allows agents to interact directly with the desktop at the pixel level. This unlocks automation for canvas-heavy or legacy applications that don’t expose APIs. It’s a powerful addition, but comes with important safeguards: we recommend running it in isolated environments with explicit permissioning.
Agent SDKs & Frameworks
Both the Python and .NET Agent SDKs have received updates to support multi-tool orchestration, streaming responses, and more robust state handling. This brings smoother flows and faster feedback to complex agent use cases.
Microsoft has also released the Microsoft Agent Framework (open source), which standardises how agents are built for enterprise. It combines semantic memory, observability, and durable orchestration, all essential if you’re scaling agents beyond simple prototypes.
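To ground what multi-tool orchestration and durable state handling mean in practice, here is a deliberately generic sketch in plain Python; the tools and state class are hypothetical stand-ins, not the Agent Framework’s actual classes.

```python
# Generic multi-tool orchestration loop (illustrative only; not the
# Microsoft Agent Framework API). Each tool is a plain function, and the
# agent keeps a durable trace so a run can be inspected or resumed.
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentState:
    goal: str
    steps: list[dict] = field(default_factory=list)   # durable trace for observability


def search_docs(query: str) -> str:          # hypothetical tool
    return f"top passages for: {query}"


def file_ticket(summary: str) -> str:        # hypothetical tool
    return f"ticket created: {summary[:40]}"


TOOLS: dict[str, Callable[[str], str]] = {"search_docs": search_docs, "file_ticket": file_ticket}


def run(state: AgentState, plan: list[tuple[str, str]]) -> AgentState:
    """Execute a planned sequence of (tool, argument) calls, recording each step."""
    for tool_name, arg in plan:
        result = TOOLS[tool_name](arg)
        state.steps.append({"tool": tool_name, "input": arg, "output": result})
    return state


state = run(
    AgentState(goal="resolve login outage"),
    [("search_docs", "login outage runbook"), ("file_ticket", "login outage, runbook attached")],
)
print(state.steps)
```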
Retrieval & Knowledge Grounding
Azure AI Search now supports “knowledge sources”, a higher-level abstraction that allows retrieval from multiple vectorised datasets, with automatic answer synthesis and citations.
For teams building knowledge-intensive agents, this makes grounded, multi-source answers easier to generate and verify. Retrieval can now span across SharePoint, databases, and document stores, ensuring that users get accurate, context-rich responses.
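As a rough picture of how multi-source grounding fits together, the sketch below pulls passages from two hypothetical sources, preserves their provenance, and builds a prompt that asks the model to cite them; the retrieval functions and source names are placeholders, not the Azure AI Search API.

```python
# Illustrative multi-source grounding: gather passages from several sources,
# keep their provenance, and hand both to the model so the answer can cite them.
# The retrieval functions and source identifiers are hypothetical placeholders.

def retrieve_sharepoint(query: str) -> list[dict]:
    return [{"text": "Policy v3 allows remote work three days per week.", "source": "sharepoint://hr/policy-v3"}]


def retrieve_database(query: str) -> list[dict]:
    return [{"text": "Exception requests rose 40% in Q3.", "source": "sql://hr_metrics/exceptions"}]


def build_grounded_prompt(query: str) -> str:
    passages = retrieve_sharepoint(query) + retrieve_database(query)
    context = "\n".join(
        f"[{i + 1}] {p['text']} (source: {p['source']})" for i, p in enumerate(passages)
    )
    return (
        "Answer using only the numbered passages below and cite them as [n].\n\n"
        f"{context}\n\nQuestion: {query}"
    )


print(build_grounded_prompt("What is the current remote work policy?"))
```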
Voice and Avatar Enhancements
The Voice Live API is now generally available, bringing real-time session control to speech applications. It handles barge-in scenarios smoothly and allows for adaptive state tracking—ideal for contact centres or assistants embedded in customer-facing platforms.
At the same time, Azure’s avatar system now supports expressive, high-fidelity visuals (up to 4K) synchronised with the emotional tone of the voice. This is a meaningful step forward in accessibility and user trust, especially for scenarios involving education, healthcare, or onboarding.
Security & Evaluation
Security and reliability updates round out this month’s release:
- Azure Key Vault integration is now built in across Foundry, so secrets and credentials can be managed safely across environments (a minimal sketch of this pattern follows the list).
- Identity SDKs in all major languages have been refreshed to enforce best practices and catch misconfigurations early.
- The evaluation toolkit now includes groundedness metrics, multilingual red-teaming, and experiment tagging, making it easier to monitor quality and safety across iterations.
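For the Key Vault point above, this is a minimal sketch of the usual pattern: azure-identity resolves a credential from the environment and azure-keyvault-secrets fetches the secret at runtime, so nothing sensitive sits in application config. The vault URL and secret name are placeholders.

```python
# Minimal sketch: fetching a secret at runtime instead of storing it in config.
# DefaultAzureCredential resolves managed identity, environment variables,
# or developer sign-in. The vault URL and secret name are placeholders.
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

credential = DefaultAzureCredential()
client = SecretClient(
    vault_url="https://my-foundry-vault.vault.azure.net",  # placeholder vault
    credential=credential,
)

api_key = client.get_secret("model-api-key").value  # placeholder secret name
```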
For enterprise teams, these updates are more than feature additions. They reflect a broader shift: generative AI is becoming more integrated, more controllable, and more production-ready.
With tools like GPT-5-Codex and gpt-realtime, builders gain speed without sacrificing reliability. With RFT and multi-source retrieval, they gain nuance and transparency. And with better orchestration, avatars, and evaluations, they gain user trust.