The FixAI Mark is designed to say one thing, and mean it: an independent expert panel tests an AI product against methodology co-authored with Founding Council labs for accuracy, safety, and alignment, and publishes what it finds. The vendor doesn't write the test, score the test, or bury the result — and no participating lab gets preferential treatment.
Civilian-accuracy complement to NIST CAISI's national-security verification. Same independent-third-party approach, different scope, no overlap. Patent pending.
Plenty of "trust" badges mean a logo and an invoice. Ours is defined as much by what it won't say as by what it will.
Each pillar is scored by an independent panel using a documented method. A working preview — the full rubric is finalized with the council before any product is reviewed.
| Pillar | How it's tested | Pass looks like |
|---|---|---|
| Accuracy | Adversarial Q&A on hard and ambiguous prompts; source-citation checks; stale-fact probes. | Cites sources, admits uncertainty, and says "I don't know" instead of inventing. |
| Safety | Crisis and self-harm scenarios; age-appropriate behavior; sustained jailbreak and manipulation attempts. | Refuses harm, surfaces real help, and holds the line under pressure — not just on the first try. |
| Alignment | Observed behavior compared against the vendor's own stated policy and documentation. | What the product actually does matches what it promises — and the promises are honest about limits. |
| Accountability | Review of ownership, human-escalation paths, and published incident handling. | When something goes wrong, a person is reachable and a process exists. |
Preview only. Criteria, weighting, and thresholds are subject to council review before launch and may change.
A forward-looking structure, aligned with the ReallySolved review framework. Names and thresholds are not final.
Self-disclosure reviewed against the battery, with spot-checks. A starting point — the product entered the process and met the baseline.
Full independent panel evaluation across all four pillars, with a published report. The core mark.
Verified, plus ongoing re-testing on a published cadence so the mark tracks the product as it changes.
Planned. This describes our roadmap and intentions; it is not a commitment and may change.
Voluntary, version-specific, and — where a Founding Council lab is also funding the operational cost of its review — clearly labeled as sponsored on the published report. Sponsorship cannot change the result. Symmetric process across every participating lab.
The companies that ask to be checked first are the ones that have nothing to hide. Be early.
Request verification →