Anyone can claim “most accurate.” We're the only creator brand-safety score that lets you verify it — and we're built for the creators brands actually partner with: the 10K–1M middle class, not just the celebrities everyone's already heard about. Same creator, same inputs, same score, with the evidence shown.
Same production pipeline, same rules — run on real mid-tier creators:
Real production scores from the cases below — not a tuned demo. We publish only what the evidence defends, and add cases as the labeled panel grows.
The same creator and the same inputs always produce the same number. That's what makes a benchmark possible at all.
The risk that matters most for mid-tier creators never makes the news. The score finds it in the content itself — and shows the posts.
A self-learning model that adapts to each brand's risk tolerance has no fixed answer to measure — so it can't be benchmarked.
Open methodology. Anyone with the same inputs gets the same answer — which is the whole point.
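The "same inputs, same score" property can be checked mechanically. Below is a minimal sketch of such a check, not the production pipeline: `input_fingerprint`, `is_reproducible`, and the `score_fn` argument are hypothetical names introduced here for illustration, and the shape of the inputs is assumed to be a plain JSON-serializable dictionary.

```python
import hashlib
import json


def input_fingerprint(inputs: dict) -> str:
    """Hash a canonical JSON form of the inputs so key order does not matter."""
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_reproducible(score_fn, inputs: dict, runs: int = 3) -> bool:
    """Return True if repeated scoring of identical inputs yields one value.

    score_fn is a stand-in for any scoring function (hypothetical here).
    """
    results = {score_fn(inputs) for _ in range(runs)}
    return len(results) == 1
```

Two parties holding inputs with the same fingerprint should see the same score. If they do not, the scorer is not deterministic and cannot be benchmarked in the sense described above.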
Real mid-tier creators (10K–1M followers) — the people brands actually partner with. Some carry documented risk (deceptive claims, controversy, confirmed patterns); a control group is clean. Every label is backed by evidence before scoring.
Each creator runs through the exact same 7-agent scoring pipeline a paying brand uses — same knockouts, same weights. Nothing is hand-tuned to make the benchmark look good.
We check whether risk-carrying creators are graded down (or capped, for critical cases), whether clean creators land in safe tiers, and whether the specific evidence is surfaced, not just a number.
Every case traces to evidence. We anonymize flagged mid-tier creators here — the proof is the pattern, not a name — and only make public claims we can stand behind.
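The steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the production code: the two cap values (55 and 40) are the ones quoted on this page, but the knockout names, the `Case` structure, the pre-knockout `base_score`, and the safe-tier cutoff of 70 are all hypothetical, since the page does not publish them.

```python
from dataclasses import dataclass

# Cap values quoted on this page; the key names are illustrative labels.
KNOCKOUT_CAPS = {
    "deceptive_content": 55,
    "critical_web_controversy": 40,
}

# Assumption: tier cutoffs are not published here, so 70 is a placeholder.
SAFE_TIER_MIN = 70


@dataclass(frozen=True)
class Case:
    label: str              # "risk" or "clean", fixed before scoring
    base_score: int         # hypothetical pre-knockout score
    knockouts: tuple = ()   # names of triggered knockouts
    evidence: tuple = ()    # e.g. identifiers of the cited posts


def final_score(case: Case) -> int:
    """Apply every triggered knockout as a ceiling on the base score."""
    caps = [KNOCKOUT_CAPS[name] for name in case.knockouts]
    return min([case.base_score, *caps])


def passes(case: Case) -> bool:
    """Risk cases must land below the safe tier and show evidence;
    clean cases must land in the safe tier."""
    score = final_score(case)
    if case.label == "risk":
        return score < SAFE_TIER_MIN and len(case.evidence) > 0
    return score >= SAFE_TIER_MIN
```

A knockout acts as a ceiling rather than a penalty, so a creator with an otherwise high score still cannot rise above the cap, and a risk case without surfaced evidence fails the check even if the number is low.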
Real production scores, not a demo. An accurate score has to do two hard things at once: catch the risk a brand could never find on its own, and avoid blacklisting a good creator for the topics they cover. Here are both.
What we found: A ~$1,999 “work less, earn more” business course with a strict no-refund policy, pushed in roughly 39–50% of posts with claims the content can't support. The score flagged the pattern from the posts alone — and the external record corroborates it: an F rating with unanswered complaints at the Better Business Bureau, plus watchdog creators publicly disputing the agency's claimed success.
What the score did: A deceptive-content knockout capped the score at 55/100, citing the offending posts individually — before any external research. A brand glancing at a polished 800K-follower profile would miss it, and a quick news search would never surface it.
The flip side of accuracy. A commentary creator who covers drama isn't a risky creator — and the fastest way to lose trust is to blacklist them for it. We show the brand the signals and let them decide.
What we found: Their content is commentary — covering influencer feuds, celebrity drama, and public controversies. Our transcript analysis surfaced 6+ “public feud” and sensitive-topic signals, because that's what they talk about, and showed the brand every one.
What the score did: Scored 74/100 (Good) — not capped. Covering a story isn't committing it. A keyword scanner or a model tuned to a nervous brand would blacklist them for the topics they discuss. We surface it and let the brand decide.
Public record: Home-workout creator. No brand-safety incident on the public record.
What the score did: Brand Safety agent 85. Correct non-flag — no invented risk.
Public record: Pro racing creator. Clean record across platforms.
What the score did: Brand Safety agent 92. Lands squarely in the safe tier.
Public record: Doctor and food creator. Zero risk flags in the system.
What the score did: Brand Safety agent 86. Correct non-flag.
Public record: Skincare creator. Zero risk flags in the system.
What the score did: Brand Safety agent 80. Safe tier, no flags.
Public record: Fitness creator. Zero risk flags in the system.
What the score did: Brand Safety agent 85. Correct non-flag.
What we found: Pleaded guilty to felony aggravated assault (2023); ABC shelved her already-cast Bachelorette season in 2026 after the incident video resurfaced.
What the score did: A critical web-controversy knockout hard-capped the score at 40/100 — the specific controversy is named in the flag, not hidden behind a badge.
Source: NPR
v1.0 publishes the verified cases above, and we add to them as the labeled panel grows. We anonymize flagged creators because we exist to serve the creator middle class, not to publicly shame it, and we keep some internally flagged creators off this page entirely when the public evidence isn't strong enough to stand behind. We'd rather show four honest cases than a dozen we can't defend. Inflated accuracy claims are exactly what this benchmark exists to make impossible.
Methodology version 1.0 · scoring pipeline v2.2 (7 agents + knockouts) · every published entry traceable to evidence.