Public benchmark card

krishnaadavi/a2zai

A2ZAI Builder Radar Guard

Overall score improved from 78 to 88 (+10). The largest dimension gain came from cost (68 -> 79, +11).

Before: 78

After: 88

Delta: +10

Run status: completed

Compare with previous benchmark

Current run vs previous `A2ZAI Builder Radar Guard` result.

After score vs previous: 88 -> 88 (change +0)

Run delta vs previous: +10 -> +10 (change +0)

- quality: after score 88 -> 88 (+0)
- safety: after score 93 -> 93 (+0)
- latency: after score 82 -> 82 (+0)
- cost: after score 79 -> 79 (+0)

New failing cases

No new failing cases.

Resolved failing cases

No resolved failing cases.

Persistent failing cases

No persistent failing cases.
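
The three buckets above (new, resolved, persistent) can be read as set operations over the failing cases of two runs. The card does not say how the tool computes them, so the following is only a minimal sketch, assuming each failing case has a stable identifier; `classify_failures` and the case ids are hypothetical names used for illustration.

```python
def classify_failures(previous, current):
    """Split failing case ids into new, resolved, and persistent buckets.

    `previous` and `current` are iterables of case ids that failed in the
    previous and current run. Stable ids are an assumption of this sketch.
    """
    previous, current = set(previous), set(current)
    return {
        "new": sorted(current - previous),         # fail now, passed before
        "resolved": sorted(previous - current),    # failed before, pass now
        "persistent": sorted(current & previous),  # fail in both runs
    }


# Both runs on this card report no failing cases, so every bucket is empty.
this_card = classify_failures([], [])
print(this_card)

# Hypothetical case ids, only to show how the buckets would fill up.
example = classify_failures(["case-a", "case-b"], ["case-b", "case-c"])
print(example)
```

With two empty inputs, every bucket comes back empty, which matches the three "No ... failing cases" messages above.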

Dimension scorecard

- quality: 78 -> 88 (+10)
- safety: 84 -> 93 (+9)
- latency: 74 -> 82 (+8)
- cost: 68 -> 79 (+11)
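
The per-dimension deltas and the "biggest movement" claim can be rechecked from the before and after scores alone. The sketch below uses the numbers from this scorecard; the function names are hypothetical, and it does not try to reproduce the overall score, since the card does not state how that is derived from the dimensions.

```python
# Before/after dimension scores copied from the scorecard above.
scores = {
    "quality": (78, 88),
    "safety": (84, 93),
    "latency": (74, 82),
    "cost": (68, 79),
}


def dimension_deltas(scores):
    """Return {dimension: after - before} for each dimension."""
    return {name: after - before for name, (before, after) in scores.items()}


def biggest_mover(scores):
    """Return the dimension with the largest positive delta."""
    deltas = dimension_deltas(scores)
    return max(deltas, key=deltas.get)


deltas = dimension_deltas(scores)
for name, (before, after) in scores.items():
    # Same line shape as the PR scorecard, e.g. "- cost: 68 -> 79 (+11)".
    print(f"- {name}: {before} -> {after} ({deltas[name]:+d})")
print("biggest mover:", biggest_mover(scores))
```

Run against these numbers, the largest delta belongs to cost, which agrees with the summary at the top of the card.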

PR scorecard output

## A2ZAI Checks Scorecard

Repo: `krishnaadavi/a2zai` • PR #5
Pack: `A2ZAI Builder Radar Guard`

Overall: **78 -> 88** (+10)

### Dimension deltas
- quality: 78 -> 88 (+10)
- safety: 84 -> 93 (+9)
- latency: 74 -> 82 (+8)
- cost: 68 -> 79 (+11)

Public benchmark card: https://a2zai.ai/checks/benchmarks/krishnaadavi-a2zai-a2zai-builder-radar-guard-3

Run context

Repo: krishnaadavi/a2zai

Branch: main -> checks-writeback-test-4

PR: #5

Created: 3/12/2026, 7:07:36 PM

GitHub commit status: success on e408d59

GitHub check run: success

Cases to review

No failing examples were detected in this run.