Builder-critical change radar

Latest LLM updates that can break agent stacks

Filterable stream of model/API/SDK/deprecation/pricing/latency updates with explicit action hints and one-click Checks entry points.

Latest updates

32 updates

model release | high | NVIDIA

NVIDIA Launches Nemotron 3 Nano Omni Model, Unifying Vision, Audio and Language for up to 9x More Efficient AI Agents

AI agent systems today juggle separate models for vision, speech and language — losing time and context as they pass data from one model to the other. Unveiled today, NVIDIA Nemotron 3 Nano Omni is an open multimodal model that brings these

Action: Benchmark candidate model behavior before adopting in production.

model release | high | OpenAI

Available today: GPT-5.5 Instant in Microsoft 365 Copilot

Microsoft is publishing a model or research update that may shift capability, evaluation, or architecture choices for builders.

Action: Benchmark candidate model behavior before adopting in production.

model release | high | Microsoft

Microsoft 2026 Work Trend Index: How frontier firms are rebuilding the operating model for the age of AI

Microsoft is publishing a model or research update that may shift capability, evaluation, or architecture choices for builders.

Action: Benchmark candidate model behavior before adopting in production.

latency update | medium | Google

New ways to balance cost and reliability in the Gemini API

Google is introducing two new inference tiers to the Gemini API, Flex and Priority, to balance cost and latency.

Action: Re-run latency/cost checks and adjust timeout budgets.
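
One way to turn a re-run latency check into a concrete timeout budget is to take a high percentile of the measured latencies and multiply it by a safety factor. The sketch below assumes a nearest-rank percentile and a 1.5x headroom; the function names, the sample values, and the headroom are illustrative assumptions, not part of the Gemini API.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Nearest-rank method: smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def timeout_budget(latencies_s, pct=95, headroom=1.5):
    """Suggest a request timeout: the chosen percentile times a safety factor."""
    return percentile(latencies_s, pct) * headroom


# Hypothetical measurements (seconds) collected from one inference tier.
measured = [0.8, 0.9, 1.1, 1.0, 1.3, 0.7, 2.4, 1.2, 0.9, 1.0]
print(f"suggested timeout: {timeout_budget(measured):.2f}s")
```

Running the same calculation per tier makes it easier to see whether a cheaper tier needs a longer timeout than the one currently configured.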

latency update | medium | Microsoft

Red Hat Summit 2026: Platform modernization and AI on Microsoft Azure Red Hat OpenShift

Microsoft is outlining infrastructure and inference changes that can affect serving cost, latency, and deployment architecture for builders.

Action: Re-run latency/cost checks and adjust timeout budgets.

api update | high | OpenAI

From model to agent: Equipping the Responses API with a computer environment

How OpenAI built an agent runtime using the Responses API, shell tool, and hosted containers to run secure, scalable agents with files, tools, and state.

Action: Validate API compatibility and update integration tests.
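
A compatibility check can be as small as asserting that a recorded response still contains the fields your agent reads. The sketch below is a generic shape check; the field names in the example payload are hypothetical placeholders and are not taken from the Responses API schema.

```python
def missing_fields(payload, required_paths):
    """Return the dotted paths from required_paths that are absent in payload."""
    missing = []
    for path in required_paths:
        node = payload
        for key in path.split("."):
            if isinstance(node, dict) and key in node:
                node = node[key]
            else:
                missing.append(path)
                break
    return missing


# Hypothetical recorded response and the fields an integration depends on.
recorded = {"id": "resp_1", "output": {"text": "hi"}, "usage": {"total_tokens": 12}}
required = ["id", "output.text", "usage.total_tokens"]
print(missing_fields(recorded, required))
```

In an integration test, a non-empty result is the signal to fail the build and review the provider changelog.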

api update | high | OpenAI

Unrolling the Codex agent loop

A technical deep dive into the Codex agent loop, explaining how Codex CLI orchestrates models, tools, prompts, and performance using the Responses API.

Action: Validate API compatibility and update integration tests.

model release | high | OpenAI

What Parameter Golf taught us about AI-assisted research

Parameter Golf brought together 1,000+ participants and 2,000+ submissions to explore AI-assisted machine learning research, coding agents, quantization, and novel model design under strict constraints.

Action: Benchmark candidate model behavior before adopting in production.

latency update | medium | Google

Reduce friction and latency for long-running jobs with Webhooks in Gemini API

Event-Driven Webhooks are a push-based notification system that eliminates the need for inefficient polling.

Action: Re-run latency/cost checks and adjust timeout budgets.
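
To see why push notifications help long-running jobs, it can be useful to estimate what polling costs. The sketch below assumes a client that polls at a fixed interval starting one interval after submission; it counts the status requests sent and the delay before completion is noticed. A webhook replaces all of those requests with a single inbound notification. The function names are illustrative and nothing here calls the Gemini API.

```python
import math


def polling_requests(job_duration_s, poll_interval_s):
    """Status requests issued before a fixed-interval poller first sees the finished job."""
    if poll_interval_s <= 0:
        raise ValueError("poll_interval_s must be positive")
    return max(1, math.ceil(job_duration_s / poll_interval_s))


def polling_detection_delay(job_duration_s, poll_interval_s):
    """Seconds between job completion and the poll that notices it."""
    polls = polling_requests(job_duration_s, poll_interval_s)
    return polls * poll_interval_s - job_duration_s


# Hypothetical job: finishes after 95 seconds, polled every 10 seconds.
print(polling_requests(95, 10), polling_detection_delay(95, 10))
```

Shortening the interval reduces the detection delay but raises the request count, which is the trade-off a push-based design removes.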

latency update | medium | Meta

SAM 3.1: Faster and More Accessible Real-Time Video Detection and Tracking With Multiplexing and Global Reasoning

Computer Vision

Action: Re-run latency/cost checks and adjust timeout budgets.

latency update | medium | OpenAI

Speeding up agentic workflows with WebSockets in the Responses API

A deep dive into the Codex agent loop, showing how WebSockets and connection-scoped caching reduced API overhead and improved model latency.

Action: Re-run latency/cost checks and adjust timeout budgets.

latency update | medium | OpenAI

Introducing GPT-5.1 for developers

GPT-5.1 is now available in the API, bringing faster adaptive reasoning, extended prompt caching, improved coding performance, and new apply_patch and shell tools.

Action: Re-run latency/cost checks and adjust timeout budgets.

sdk update | medium | OpenAI

The next evolution of the Agents SDK

OpenAI updates the Agents SDK with native sandbox execution and a model-native harness, helping developers build secure, long-running agents across files and tools.

Action: Review SDK changelog and update integration pins/tests.
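
A pin review can be partly automated by comparing the versions installed in an environment against the versions a project expects. The sketch below uses the standard-library `importlib.metadata` lookup; the package name in the comment is a hypothetical placeholder, and exact string matching is an assumption (a real project may prefer version ranges).

```python
from importlib import metadata


def check_pins(pins, get_version=metadata.version):
    """Map each package whose installed version differs from its pin to (pinned, installed)."""
    drift = {}
    for package, pinned in pins.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None  # Package is not installed in this environment.
        if installed != pinned:
            drift[package] = (pinned, installed)
    return drift


# Example call with a hypothetical package name:
# check_pins({"example-agents-sdk": "1.2.0"})
```

The `get_version` parameter exists so the check can be exercised in tests without installing anything.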

model release | high | Google

The next generation of Android Auto has new visuals that look great on any car screen, premium entertainment and a more helpful Gemini. #TheAndroidShow https://t.co/F4xWtChtMl

Action: Retest your production agent flow before rollout.

model release | high | Google

Introducing Googlebook, the first laptop designed for Gemini Intelligence. It’s crafted for heavyweight performance, built with Gemini at the core and perfectly synced with your Android phone. Coming this fall. 💻✨ #TheAndroidShow https://t.co/rn4pztApmp

Action: Retest your production agent flow before rollout.

model release | high | Google

Today, we introduced Gemini Intelligence, which brings the best of Gemini to our most advanced devices. Gemini Intelligence integrates premium hardware and innovative software to help you stay a step ahead and work proactively to get things done throughout your day. https://t.co/NY30mNUXyy

Action: Retest your production agent flow before rollout.

model release | high | Google

Learn more about Gemini Intelligence on @Android → https://t.co/YE2PVrSF8G #TheAndroidShow

Action: Retest your production agent flow before rollout.

model release | high | Google

We’re reimagining a 50-year-old interface - the mouse pointer - with AI. 🖱️ These experimental demos show how people can intuitively direct Gemini on their screens using motion, speech, and natural shorthand to get things done 🧵 https://t.co/p6fhgNcopz

Action: Retest your production agent flow before rollout.

model release | high | Google

With Gemini Intelligence on @Android, you’ll be able to: ✨ Automate multi-step tasks across your apps, like finding your class syllabus in Gmail and putting the books you need in your cart ✨ Fill out forms in a single tap thanks to Gemini Personal Intelligence ✨ Turn spoken

Action: Retest your production agent flow before rollout.

model release | high | Anthropic

Fast mode for Claude Opus 4.7 is now available in Cursor! It's 2.5x the speed at 6x the cost. For most tasks, we recommend using the standard speed.

Action: Retest your production agent flow before rollout.
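
The "2.5x the speed at 6x the cost" figures can be turned into a quick break-even check: how much extra spend buys each second saved. The sketch below takes the two multipliers from the announcement as defaults; the baseline duration and cost in the example are hypothetical.

```python
def fast_mode_tradeoff(base_seconds, base_cost, speedup=2.5, cost_multiplier=6.0):
    """Return (seconds_saved, extra_cost, extra_cost_per_second_saved)."""
    seconds_saved = base_seconds - base_seconds / speedup
    extra_cost = base_cost * cost_multiplier - base_cost
    return seconds_saved, extra_cost, extra_cost / seconds_saved


# Hypothetical task: 100 seconds and 1.00 cost unit at standard speed.
saved, extra, per_second = fast_mode_tradeoff(100, 1.0)
print(f"saves {saved:.0f}s for {extra:.2f} extra ({per_second:.4f} per second saved)")
```

Fast mode pays off only when a second of waiting is worth more than the per-second figure, which is consistent with the recommendation to use standard speed for most tasks.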

pricing change | high | Perplexity

NVIDIA remains the strongest platform for large-model inference at scale. Prefill/decode disaggregation, Blackwell-native quantization, custom kernels, and rack-scale NVLink turn GB200 into faster answers and lower serving cost. Read the full paper here

Action: Retest your production agent flow before rollout.

pricing change | high | Sam Altman

i get some anxiety not using the smartest-available model/settings. but sometimes i dont mind if it's really slow. i wonder if we should focus more on a price/speed tradeoff relative to a price/intelligence tradeoff.

Action: Retest your production agent flow before rollout.