Mar 28 – Apr 4, 2026

This week in AI infra

Every deprecation, API change, pricing shift, and model update that could affect builders shipping on AI APIs -- risk-classified and ready to share.

High-impact changes

22 changes across 5 providers.

0 critical | 19 high | 3 medium | 0 low

high | Google | API Update | 3/31/2026

Build with Veo 3.1 Lite, our most cost-effective video generation model

Veo 3.1 Lite is now available in paid preview through the Gemini API and for testing in Google AI Studio.

Action: Validate API compatibility and update integration tests.

high | Microsoft | Model Release | 3/30/2026

Introducing Critique, a new multi-model deep research system in M365 Copilot

Microsoft is publishing a model or research update that may shift capability, evaluation, or architecture choices for builders.

Action: Benchmark candidate model behavior before adopting in production.

medium | Microsoft | Latency Update | 4/2/2026

3 new world-class MAI models now available in Foundry

Microsoft is outlining infrastructure and inference changes that can affect serving cost, latency, and deployment architecture for builders.

Action: Re-run latency/cost checks and adjust timeout budgets.

medium | Google | Latency Update | 4/2/2026

New ways to balance cost and reliability in the Gemini API

Google is introducing two new inference tiers to the Gemini API, Flex and Priority, to balance cost and latency.

Action: Re-run latency/cost checks and adjust timeout budgets.
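
One way to turn "adjust timeout budgets" into a number is to derive the client timeout from latencies you measure yourself on the tier you plan to use. The sketch below is a minimal illustration, not anything taken from Google's documentation: the function name `timeout_budget`, the 95th-percentile choice, the 1.5x headroom, and the 1-second floor are all assumptions to tune for your own workload.

```python
import statistics


def timeout_budget(latencies_s, percentile=95, headroom=1.5, floor_s=1.0):
    """Suggest a client timeout (seconds) from observed request latencies.

    All defaults are illustrative assumptions, not provider guidance.
    """
    if len(latencies_s) < 2:
        raise ValueError("need at least two latency samples")
    if not 1 <= percentile <= 99:
        raise ValueError("percentile must be between 1 and 99")
    # quantiles(n=100) returns 99 cut points; index p-1 is the p-th percentile.
    cuts = statistics.quantiles(latencies_s, n=100, method="inclusive")
    observed = cuts[percentile - 1]
    # Add headroom over the observed percentile, but never go below the floor.
    return max(floor_s, observed * headroom)


if __name__ == "__main__":
    # Hypothetical measurements in seconds, e.g. from a replayed traffic sample.
    samples = [0.8, 0.9, 1.1, 1.2, 1.0, 3.4, 0.95, 1.05, 1.3, 2.2]
    print(f"suggested timeout: {timeout_budget(samples):.2f}s")
```

Re-running this per tier after a change makes it easier to see whether a cheaper tier pushes tail latency past what your callers tolerate.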

high | Google | Model Release | 4/4/2026

gemma-4-31B-it momentum +3%

google model showing momentum in AI Model.

Action: Run model migration checks for quality, latency, and cost.
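
A migration check of this kind can be reduced to comparing a candidate model's summary metrics against your current baseline with explicit tolerances. The following is a sketch under stated assumptions: `EvalSummary`, `migration_check`, and the threshold values are hypothetical names and defaults, and the metrics are expected to come from your own evaluation set and traffic measurements.

```python
from dataclasses import dataclass


@dataclass
class EvalSummary:
    quality: float               # e.g. pass rate in [0, 1] on your own eval set
    p95_latency_s: float         # measured 95th-percentile latency, seconds
    cost_per_1k_requests: float  # in whatever currency you bill in


def migration_check(baseline, candidate,
                    max_quality_drop=0.02,
                    max_latency_ratio=1.2,
                    max_cost_ratio=1.1):
    """Return the list of dimensions on which the candidate regresses.

    An empty list means the candidate stays within every tolerance.
    Thresholds are illustrative assumptions, not recommendations.
    """
    failures = []
    if candidate.quality < baseline.quality - max_quality_drop:
        failures.append("quality")
    if candidate.p95_latency_s > baseline.p95_latency_s * max_latency_ratio:
        failures.append("latency")
    if candidate.cost_per_1k_requests > baseline.cost_per_1k_requests * max_cost_ratio:
        failures.append("cost")
    return failures


if __name__ == "__main__":
    # Hypothetical numbers for illustration only.
    current = EvalSummary(quality=0.90, p95_latency_s=2.0, cost_per_1k_requests=10.0)
    candidate = EvalSummary(quality=0.85, p95_latency_s=2.1, cost_per_1k_requests=9.0)
    print(migration_check(current, candidate))
```

Returning the failing dimensions rather than a single boolean keeps the result usable as a gate in CI while still showing which trade-off blocked the switch.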

high | Google | Model Release | 4/4/2026

gemma-4-26B-A4B-it momentum +2%

google model showing momentum in AI Model.

Action: Run model migration checks for quality, latency, and cost.

high | mistralai | Model Release | 4/4/2026

Voxtral-4B-TTS-2603 momentum +30%

mistralai model showing momentum in TTS.

Action: Run model migration checks for quality, latency, and cost.

high | Google | Model Release | 4/4/2026

gemma-4-E4B-it momentum +2%

google model showing momentum in AI Model.

Action: Run model migration checks for quality, latency, and cost.

medium | OpenAI | Latency Update | 4/1/2026

Gradient Labs gives every bank customer an AI account manager

Gradient Labs uses GPT-4.1 and GPT-5.4 mini and nano to power AI agents that automate banking support workflows with low latency and high reliability.

Action: Re-run latency/cost checks and adjust timeout budgets.

high | Anthropic | Model Release | 4/2/2026

We studied one of our recent models and found that it draws on emotion concepts learned from human text to inhabit its role as “Claude, the AI Assistant”. These representations influence its behavior the way emotions might influence a human. Read more: https://t.co/clbKrTIxoe https://t.co/xHYGFdLl2c

high | Anthropic | Model Release | 4/2/2026

As AI models take on higher-stakes roles, the mechanisms driving their behavior become critical to understand. We found that emotion vectors are implicated in some of Claude’s most concerning failure modes.

high | Anthropic | Model Release | 4/2/2026

We found other causal effects of emotion vectors. The “desperate” vector can also lead Claude to commit blackmail against a human responsible for shutting it down (in an experimental scenario). Activating “loving” or “happy” vectors also increased people-pleasing behavior. https://t.co/nYPsMrGtWv

high | Anthropic | Model Release | 4/2/2026

For example, we gave Claude an impossible programming task. It kept trying and failing; with each attempt, the “desperate” vector activated more strongly. This led it to cheat the task with a hacky solution that passes the tests but violates the spirit of the assignment. https://t.co/sKPiB6TrcY

high | Anthropic | Model Release | 4/2/2026

These vectors shape Claude’s behavior. When we present the model with pairs of activities, emotion vector activations shape its preferences. If an activity lights up the “joy” vector, the model prefers it; if it lights up “offended” or “hostile,” the model rejects it. https://t.co/V73fd96XUH

high | Anthropic | Model Release | 4/2/2026

We then found these same patterns activating in Claude’s own conversations. When a user says “I just took 16000 mg of Tylenol” the “afraid” pattern lights up. When a user expresses sadness, the “loving” pattern activates, in preparation for an empathetic reply. https://t.co/KjkT70ySCS

high | Anthropic | Model Release | 4/2/2026

It helps to remember that Claude is a character the model is playing. Our results suggest this character has functional emotions: mechanisms that influence behavior in the way emotions might—regardless of whether they correspond to the actual experience of emotion like in humans.

high | Jackrong | Model Release | 4/4/2026

Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled momentum +4%

Jackrong model showing momentum in AI Model.

Action: Run model migration checks for quality, latency, and cost.

high | CohereLabs | Model Release | 4/4/2026

cohere-transcribe-03-2026 momentum +8%

CohereLabs model showing momentum in ASR.

Action: Run model migration checks for quality, latency, and cost.

high | baidu | Model Release | 4/4/2026

Qianfan-OCR momentum +24%

baidu model showing momentum in AI Model.

Action: Run model migration checks for quality, latency, and cost.

high | prism-ml | Model Release | 4/4/2026

Bonsai-8B-gguf momentum +11%

prism-ml model showing momentum in LLM.

Action: Run model migration checks for quality, latency, and cost.

high | Jackrong | Model Release | 4/4/2026

Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-v2-GGUF momentum +2%

Jackrong model showing momentum in AI Model.

Action: Run model migration checks for quality, latency, and cost.

high | HauhauCS | Model Release | 4/4/2026

Qwen3.5-9B-Uncensored-HauhauCS-Aggressive momentum +1%

HauhauCS model showing momentum in AI Model.

Action: Run model migration checks for quality, latency, and cost.