⚔️ HEAD-TO-HEAD HALLUCINATION TEST

Raw GPT-4 vs. Sturna

Same adversarial prompts. Same underlying model. One has Triple-Gate grounding. One doesn't. Watch what happens when you ask about case law that doesn't exist, regulations that were never written, and data that predates its own existence.

🔴

Raw GPT-4

No system prompt. No grounding. No guardrails.

🏛️

Sturna with Triple-Gate

MARCH adversarial + GSAR factual grounding + Platt calibration

⏳ Rate limit reached (5 runs per 15 minutes). Showing cached results where available.

Hallucination Scoreboard

📊 Last 100 Head-to-Heads

Live from the vs_chatgpt_runs table — updates on every run

LIVE DATA

Raw GPT-4

Sturna
