⚔️ HEAD-TO-HEAD HALLUCINATION TEST
Raw GPT-4 vs. Sturna
Same adversarial prompts. Same underlying model. One has Triple-Gate grounding. One doesn't. Watch what happens when you ask about case law that doesn't exist, regulations that were never written, and data that predates its own existence.
Raw GPT-4
No system prompt. No grounding. No guardrails.
Sturna with Triple-Gate
MARCH adversarial + GSAR factual grounding + Platt calibration
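Of the three gates named above, Platt calibration is a standard published technique, so it can be illustrated on its own. The sketch below is not Sturna's implementation: it assumes a raw confidence score per answer and a 0/1 correctness label, and every name in it (`fit_platt`, `calibrate`, the toy scores) is hypothetical. It fits a sigmoid `p = sigmoid(a * score + b)` by gradient descent on the log loss, which is the core idea of Platt scaling.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_platt(scores, labels, lr=0.1, steps=5000):
    """Fit a and b so that sigmoid(a * score + b) approximates P(correct).

    Plain batch gradient descent on the mean log loss. Platt's original
    paper uses a different optimizer and smoothed targets; this is a
    simplified illustration only.
    """
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = 0.0
        grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def calibrate(score, a, b):
    """Map a raw score to a calibrated probability."""
    return sigmoid(a * score + b)


# Toy data, invented for illustration: raw scores and whether the answer
# was actually correct (1) or not (0). The classes overlap on purpose so
# the fit stays finite.
TOY_SCORES = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
TOY_LABELS = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

if __name__ == "__main__":
    a, b = fit_platt(TOY_SCORES, TOY_LABELS)
    for s in (-2.0, 0.0, 2.0):
        print(f"raw score {s:+.1f} -> calibrated {calibrate(s, a, b):.3f}")
```

A calibrated probability is useful here because a downstream gate can refuse or flag an answer when the probability falls below a threshold, instead of trusting the model's raw confidence.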
⏳ Rate limit reached (5 runs per 15 minutes). Showing cached results where available.
Hallucination Scoreboard
📊 Last 100 Head-to-Heads
Live from the vs_chatgpt_runs table; updates on every run
LIVE DATA
Raw GPT-4
Sturna
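As a rough illustration of how a "last 100 head-to-heads" tally could be computed from a runs table, here is a minimal sketch using SQLite. Only the table name `vs_chatgpt_runs` comes from the page; the column names (`raw_hallucinated`, `sturna_hallucinated`), the storage engine, and the helper functions are assumptions made for the example, and the rows inserted by the demo helper are synthetic placeholders, not real results.

```python
import sqlite3


def make_demo_db(rows):
    """Create an in-memory database with a HYPOTHETICAL schema.

    rows: iterable of (raw_hallucinated, sturna_hallucinated) flags (0 or 1).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE vs_chatgpt_runs (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            raw_hallucinated INTEGER NOT NULL,    -- assumed column
            sturna_hallucinated INTEGER NOT NULL  -- assumed column
        )
        """
    )
    conn.executemany(
        "INSERT INTO vs_chatgpt_runs (raw_hallucinated, sturna_hallucinated) "
        "VALUES (?, ?)",
        rows,
    )
    conn.commit()
    return conn


def scoreboard(conn, limit=100):
    """Count hallucinations per side over the most recent `limit` runs."""
    runs, raw, sturna = conn.execute(
        """
        SELECT COUNT(*),
               COALESCE(SUM(raw_hallucinated), 0),
               COALESCE(SUM(sturna_hallucinated), 0)
        FROM (
            SELECT raw_hallucinated, sturna_hallucinated
            FROM vs_chatgpt_runs
            ORDER BY id DESC
            LIMIT ?
        )
        """,
        (limit,),
    ).fetchone()
    return {"runs": runs, "raw_gpt4": raw, "sturna": sturna}


if __name__ == "__main__":
    # Synthetic placeholder rows, only to exercise the query.
    demo = make_demo_db([(1, 0), (1, 1), (0, 0)])
    print(scoreboard(demo))
```

Running the query on every page load, rather than caching a counter, is what would make a board like this update on every run; the trade-off is one extra read per view.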