G-059: Gamified contribution & the experiment program — OLN Register

Operationalize the "gamify quality, not volume" decision (Entry 031): the gold-salting + reliability-weighting spine that decouples fun from authority, the component catalog (each as proximate-win + quality guardrail), the newcomer practice ramp, and the measurement methodology.

Why this matters

Entry 031 decides the principles; this entry drafts the mechanism that makes "gamify quality, not volume" real, plus the components to A/B test.

The spine: decouple fun from authority

Borrowed from reCAPTCHA / professional crowd-labeling. Seed the queue (G-060) with gold-standard cards whose answer is already known — gold-true (from aged, high-tier, never-contested Canon) and gold-false (from the retcon log / corrupted Facts) — indistinguishable from live cards and self-replenishing from the graph itself. A reviewer's hit-rate on gold becomes a reliability score, which both weights their votes on live cards and gates their rewards.

This dissolves Goodhart: fun is unlimited, authority is earned. Spam-confirming just tanks your gold accuracy, so your votes stop counting and rewards dry up — you cannot degrade the graph by playing fast, because fast-and-wrong de-weights itself. Four rules hang off the spine:

Blind voting — no visibility of others' votes/tally before you commit (kills bandwagon cascades).
Reward matched-consensus, never participation — you earn when your call matches the eventual graduated outcome; rubber-stamping the obvious pays ~nothing.
Vesting on graduation — rewards vest only when the Fact reaches Canon and survives a challenge window.
Asymmetric stakes — a wrong call on a gold card costs more reliability than a Skip. Confidence is taxed; humility is free.

Component catalog

Each ships behind a G-064 flag with a proximate metric and a quality guardrail (Entry 031):

Component	Mechanic	Buys (quality behavior)	Guardrail	Metric pair (proximate ▸ guardrail)
Confirmation Swipe	Tinder-for-Facts card stack	Provisional→Canonical throughput	gold salting + reliability weighting	confirms/session ▸ gold-accuracy & 30-day revert
Reliability score	a trust meter, not points	accuracy becomes the currency	decays if accuracy slips	median reliability ▸ graph revert rate
Confidence-weighted bounties	queue surfaces "worth more" cards	effort to where quality is weakest	bounty cards need higher N	contested-Fact resolution ▸ accuracy on them
Canon Detective	contradiction treasure-hunt	surfaces latent defects	false reports cost reliability	confirmed contradictions ▸ false-positive rate
Completion meter / gap quests	"64% sourced"; graph emits its holes	completeness	counts only confirmed Facts	gap-closure ▸ revert rate of gap-fills
First-Finder credit	permanent attribution (Layer 9)	net-new valid edges, recognition	vests on graduation + survival	net-new edges ▸ spurious rate
Calibrated streak	streak counts only above-accuracy days	sustained engagement, no farming	breaks on accuracy drop, not missed volume	return rate ▸ accuracy maintenance
Impact echo	"your confirms now power 14 pages; 1 cited"	the intrinsic capstone	none — celebrates real value	7/30-day retention ▸ —

Newcomer ramp (practice mode)

Reliability scoring is punitive to newcomers unless they can earn a first score safely. A practice mode of already-settled cards with instant right/wrong feedback and no live consequence is both the onboarding tutorial (closes the unspecced onboarding ramp, cf. G-007) and where first reliability is earned.

Open decisions

Gold-card mint/rotation cadence (anti-memorization); the reliability→weight curve.
Reward currency: relation to Credits/Power (G-037) vs. pure standing (G-051).
Measurement: 30–60 day horizons, holdout, and a pure-intrinsic arm to detect overjustification crowd-out (Entry 031), all via G-064.

Entry 031 — deciding principles · Entry 030 — citability the quality protects
G-057 — authoring loop · G-060 — the review queue this rides on
G-026 — data quality (guardrail) · G-051 — recognition · G-036 — reactions
G-037 — credits (extrinsic layer) · G-058 — where impact-echo's "cited" lands
G-064 — experiment harness · G-007 — onboarding (practice ramp)

Why this matters

The spine: decouple fun from authority

Component catalog

Newcomer ramp (practice mode)

Open decisions

Related