ruvn — the AI research assistant that grades its sources and hands you a cited dossier

01

Meet Dr. Priya — the “oh, that’s what it’s for” story

One real person, one real before → after.

Start here

Dr. Priya runs a small wellness studio. She’s heard clients ask about “40 Hz light-and-sound therapy” for focus and sleep, and she wants to know whether there’s real evidence before she says anything to anyone. She is not a scientist and she does not have time to read forty browser tabs. She already uses Claude Code — mostly for her booking spreadsheets — so the AI is right there on her laptop.

Before ruvn. She asks her normal AI chatbot, “Does 40 Hz light therapy actually help with sleep?” It hands her one smooth, confident paragraph. It sounds authoritative. But she has no idea whether that confidence came from a 2024 clinical trial or from a supplement-seller’s blog — they’re blended together with no labels. She can’t tell what to trust, can’t cite anything to a client, and can’t tell what’s hype. She’s stuck with a paragraph she can’t stand behind.

After ruvn. She installs it once (npm i -g @ruvnet/ruvn, then ruvn init) and asks Claude Code to run the ruvn research pipeline on the same question. This time the pipeline splits her question into precise sub-questions, finds sources, and grades each one A, B, C, or D. It writes the findings using only the A’s and B’s, then tries to prove each statement wrong and deletes anything it can’t back up. Finally it hands her a dossier in which every claim has a numbered citation and every source shows its grade: a TL;DR, a cited body, and a graded bibliography. She can see that the strong claims rest on a grade-A 2024 paper and that the weak ones were thrown out. Now she can honestly tell a client, “here’s what the good evidence says, and here’s what it doesn’t.”

The line that makes it click: it’s the difference between an AI that hands you a confident paragraph you can’t check and one that hands you a graded, cited dossier you can.

Technical view — the same question, two very different paths

The whole idea in one diagram: a plain chat blends every source into one unlabeled paragraph (left); ruvn forces the same question through six checkpoints (scout, search, grade, write from the good sources only, fact-check, cite) and returns a dossier you can audit (right).

Friendly view — what it feels like for Priya

A relieved wellness-studio owner at a cozy desk holding a tidy cited report with letter-grade tabs, while a chaotic pile of browser tabs and sticky-note question marks fades away behind her. — Before: forty tabs, sticky notes, and a nagging “can I actually trust this?” After: one tidy report she can hand to a client, with grades she can point to. Same AI, same laptop — ruvn just made it show its work.

Honest framing: ruvn was built for exactly this kind of question. It was made for the ruv-neural gamma-entrainment (40 Hz) research project, but the repo is explicit that “it works for any research question.” The medical framing is the origin story, not a limit. And ruvn is a research tool: it grades and cites evidence; it does not give medical advice or make efficacy claims.

02

What problem it solves — the “confident blend” trap

What does it actually do? Why do I care?

When you ask a normal AI chatbot to “research” something, it blends good sources and bad sources into one confident paragraph — and you can’t tell which parts came from a peer-reviewed paper and which came from a random forum post. It sounds sure of itself either way. That’s the trap. The repo puts it bluntly: “Most ‘ask an AI’ research blends good and bad sources into one confident answer. ruvn refuses to.”

So, plainly:

What does it actually do? It does disciplined research for you — finds sources, grades them, writes only from the trustworthy ones, fact-checks itself, and gives you a cited report instead of a confident guess.
Why do I care? Because a confident-but-wrong answer is the most expensive kind. ruvn makes the AI show its work — you can see which claim came from which source, and how good that source is, at a glance. No more “trust me.”
Why do I need it? Because the AI you already use doesn’t grade its sources by default — it just answers. ruvn bolts a grading-and-verification discipline onto the AI you already have, so “research” stops meaning “plausible paragraph” and starts meaning “evidence you can check.”
Why is it important? Because the output is auditable. Every claim has a citation; every source has a grade. You — or a colleague, or a skeptic — can re-check it. That’s the difference between a vibe and a dossier.

Technical view — why a single blended answer hides the problem

The trap, drawn out: a plain AI flattens a grade-A paper, a grade-C summary, and a grade-D forum post into one paragraph where the weak source secretly counts as much as the strong one. ruvn keeps them separate, grades each, and only lets the good ones into the answer — with a citation.

Friendly view — from “trust me” to a checkable case file

On the left, a single grey blurry paragraph with a big question mark and the words trust me. On the right, the same content reorganized into a neat case file with numbered citation tags and green A and B grade stamps, joined by an arrow labelled transformation, verified. — Left: a confident grey blur you’re asked to trust. Right: the same content as a tidy case file — numbered citations, graded sources, checkable. That arrow in the middle is the whole product.

03

The grading rubric — how a source earns its way into your answer

A, B, C, D — and only A and B are allowed in.

Here is the rule the whole tool turns on. The source-grader opens each source and stamps it A, B, C, or D for how much you can trust it. The synthesizer is then only allowed to write from the A and B sources — the C’s and D’s are not allowed in. And the fact-checker holds the bar higher still: a claim must be backed by one grade-A source or two grade-B sources, or it gets stripped out.

Grade	What it means	Allowed in the answer?
A	Primary source (a real paper, an official doc), under ~2 years old, on-topic	✓ Yes
B	Reputable secondary source (major outlet, named expert), under ~5 years	✓ Yes
C	Tertiary (Wikipedia, a summary) — background only, not evidence	✗ No
D	Discarded (forum post, unsourced claim, dead link)	✗ No
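The table reads like a small decision procedure, and the TypeScript sketch below is one way to write it down. It is not the repo's code: in ruvn the grade is assigned by an AI agent following a prompt. The `SourceInfo` fields, the `gradeSource` function, and the choices marked "Assumption" (what happens to off-topic or older sources) are all invented here for illustration.

```typescript
// Illustrative only: in ruvn the grade comes from an AI agent following a prompt.
type Grade = "A" | "B" | "C" | "D";
type SourceKind = "primary" | "secondary" | "tertiary" | "unsourced";

// Hypothetical description of a source; every field name here is an assumption.
interface SourceInfo {
  kind: SourceKind;
  ageYears: number;   // age of the source in years
  onTopic: boolean;   // does it address the sub-question?
  reachable: boolean; // false for a dead link
}

function gradeSource(s: SourceInfo): Grade {
  // D: dead links and unsourced claims are discarded.
  // Assumption: off-topic material is discarded too.
  if (!s.reachable || !s.onTopic || s.kind === "unsourced") return "D";
  // C: tertiary sources are background only.
  if (s.kind === "tertiary") return "C";
  // A: primary source under ~2 years old.
  if (s.kind === "primary" && s.ageYears < 2) return "A";
  // B: reputable secondary source under ~5 years old.
  // Assumption: an older primary source inside that window also lands here.
  if (s.ageYears < 5) return "B";
  // Assumption: anything older than ~5 years is treated as background.
  return "C";
}

// Only A and B are allowed into the answer.
function allowedInAnswer(g: Grade): boolean {
  return g === "A" || g === "B";
}
```

The point of the sketch is the ordering: discard first, then background, then the two grades that count as evidence.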

The whole point in one line: the synthesizer is only allowed to use A and B. The fact-checker then keeps a claim only if it has one A or two B’s behind it. Everything else is background or gone.

Technical view — the grading gate and the evidence threshold

The grading gate, exactly as the agents enforce it: Gate 1 (the grader) only lets A and B sources reach the writer; Gate 2 (the fact-checker) keeps a claim only if it has one grade-A or two grade-B sources behind it, then a citation is attached. Two separate checks, so weak evidence can’t sneak through on either.
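The two gates can be sketched as two filters. This is a minimal illustration, assuming claims and sources are linked by numeric ids; the names `GradedSource`, `DraftClaim`, `gateOne`, and `gateTwo` are hypothetical and do not come from the repo.

```typescript
type Grade = "A" | "B" | "C" | "D";

// Hypothetical shapes: a graded source and a claim that points at source ids.
interface GradedSource { id: number; grade: Grade; }
interface DraftClaim { text: string; sourceIds: number[]; }

// Gate 1: only A and B sources reach the writer.
function gateOne(sources: GradedSource[]): GradedSource[] {
  return sources.filter((s) => s.grade === "A" || s.grade === "B");
}

// Gate 2: keep a claim only if one grade-A or two grade-B sources stand behind it.
function gateTwo(claims: DraftClaim[], admitted: GradedSource[]): DraftClaim[] {
  return claims.filter((c) => {
    let countA = 0;
    let countB = 0;
    for (const s of admitted) {
      if (c.sourceIds.indexOf(s.id) === -1) continue; // source not cited by this claim
      if (s.grade === "A") countA++;
      if (s.grade === "B") countB++;
    }
    return countA >= 1 || countB >= 2;
  });
}

const gateSources: GradedSource[] = [
  { id: 1, grade: "A" }, { id: 2, grade: "B" }, { id: 3, grade: "B" }, { id: 4, grade: "D" },
];
const gateClaims: DraftClaim[] = [
  { text: "backed by one A", sourceIds: [1] },
  { text: "backed by two B", sourceIds: [2, 3] },
  { text: "backed by one B", sourceIds: [2] },
  { text: "backed by a forum post", sourceIds: [4] },
];
const keptClaims = gateTwo(gateClaims, gateOne(gateSources));
console.log(keptClaims.map((c) => c.text));
```

In this toy run the first two claims survive; the single-B claim and the forum-post claim are stripped.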

Friendly view — the four grade stamps

Four rubber-stamp letter grades on cream paper: A and B stamped in green with checkmarks and the words used in answer; C and D stamped in oxblood red with crosses and the words not used. — Green stamps get used; red stamps don’t. A and B are the evidence your answer is built from; C is background only; D is thrown out. That’s the entire rubric on one piece of paper.

04

How it works — six specialist helpers, in a line

A research process with checkpoints, not a single answer.

ruvn runs six specialist helpers in a line, and (this is the clever part) each helper sees only what the one before it produced, never the raw mess. So information is forced through a grading gate and a fact-checking gate before it ever reaches you. In plain words:

scout — breaks your big question into 3–7 precise little questions. “What exactly do we need to find out?”
web-searcher — goes and finds raw sources for each little question; doesn’t judge yet, just collects. “Go find the sources.”
source-grader — opens each source and stamps it A, B, C, or D for how much you can trust it. “Which of these can we actually trust?”
synthesizer — writes the findings using only the A and B sources. The C’s and D’s are not allowed in. “Summarize — but only from the good stuff.”
fact-checker — goes back and tries to disprove every claim; anything it can’t support gets deleted. “Try to prove each statement wrong, and cut what fails.”
citer — final pass: attaches a numbered citation to every claim and builds the bibliography with the grades shown. “No claim ships without a receipt.”

What you get back: a Markdown dossier — a TL;DR up top, a body where every sentence is cited, and a bibliography with a letter grade next to each source.
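The hand-off rule (each helper sees only the previous helper's output) can be expressed with types. The sketch below is an assumption-heavy illustration, not ruvn's source: the interface names and field shapes are hypothetical, and real agents would be model calls rather than plain functions.

```typescript
type Grade = "A" | "B" | "C" | "D";

// Hypothetical structured outputs: each is the only input of the next helper.
interface CitedSource { url: string; grade: Grade; }
interface SubQuestions { questions: string[]; }
interface RawSources { urls: string[]; }
interface GradedSources { graded: CitedSource[]; }
interface Findings { claims: { text: string; sources: CitedSource[] }[]; }
interface CheckedFindings { claims: { text: string; sources: CitedSource[] }[]; }
interface Dossier { markdown: string; }

// One function per helper; none of them receives the raw web or an earlier stage's input.
interface Helpers {
  scout(question: string): SubQuestions;
  webSearcher(input: SubQuestions): RawSources;
  sourceGrader(input: RawSources): GradedSources;
  synthesizer(input: GradedSources): Findings;
  factChecker(input: Findings): CheckedFindings;
  citer(input: CheckedFindings): Dossier;
}

// The fixed order from the list above, wired so each call gets only the previous result.
function runPipeline(h: Helpers, question: string): Dossier {
  const subQuestions = h.scout(question);
  const raw = h.webSearcher(subQuestions);
  const graded = h.sourceGrader(raw);
  const findings = h.synthesizer(graded);
  const checked = h.factChecker(findings);
  return h.citer(checked);
}
```

Because the synthesizer's signature accepts only `GradedSources`, an ungraded source has no path into the findings in this sketch.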

Technical view — the pipeline, its hand-off contract, and the model tier of each agent

The real pipeline, with the detail a README buries: six named agents in a fixed order; the strict hand-off contract (each only sees the previous agent’s structured output, never the raw web); the two forced gates (grade, then verify); the model tier each agent runs on; and the exact shape of the dossier that comes out.

Friendly view — six little helpers passing the folder down the line

Six friendly little paper-character helpers in a line on a worktable, passing a manila folder hand to hand: a scout with a spyglass, a searcher with a net of papers, a grader with a stamp, a synthesizer with a pen, a fact-checker with a red reject stamp, and a citer attaching numbered tags. — Picture six little specialists at a worktable, passing one folder down the line. Each only ever sees the folder the helper before them handed over — so by the time it reaches you, it has been searched, graded, written, fact-checked, and tagged.

05

What “solved” looks like — the dossier on your screen

The exact thing you can hand a skeptic.

“Done” isn’t a chat reply — it’s a file. ruvn returns a ready-to-paste Markdown dossier with three parts you can actually point at: a TL;DR at the top, a cited body where every sentence carries a numbered receipt, and a graded bibliography where every source shows its A/B/C tag. The citer’s own rule is blunt: “the dossier must NOT contain any claim without a citation.”

Technical view — the anatomy of the dossier ruvn writes

The solved state, concretely: a TL;DR, a body where every sentence ends in a numbered citation, and a bibliography where every source wears its grade. The grade-D forum post is listed but was never allowed into the answer — you can see exactly what was used and what was thrown out.
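To make the three-part anatomy concrete, here is a small sketch that renders a dossier as Markdown. The heading text, the `[n]` citation style, and the names `BibEntry`, `CitedClaim`, and `renderDossier` are assumptions for illustration; the real citer's output format may differ.

```typescript
type Grade = "A" | "B" | "C" | "D";

// Hypothetical shapes for a bibliography entry and a cited claim.
interface BibEntry { title: string; grade: Grade; }
interface CitedClaim { text: string; cites: number[]; } // 1-based bibliography numbers

function renderDossier(tldr: string, claims: CitedClaim[], bib: BibEntry[]): string {
  const lines: string[] = [];
  lines.push("## TL;DR", "", tldr, "", "## Findings", "");
  for (const c of claims) {
    // Mirrors the citer's rule quoted above: no claim without a citation.
    if (c.cites.length === 0) throw new Error("uncited claim: " + c.text);
    lines.push(c.text + " " + c.cites.map((n) => "[" + n + "]").join(""));
  }
  lines.push("", "## Bibliography", "");
  // Every source wears its grade next to its number.
  bib.forEach((b, i) => lines.push((i + 1) + ". [" + b.grade + "] " + b.title));
  return lines.join("\n");
}

console.log(
  renderDossier(
    "Short answer goes here.",
    [{ text: "Example claim.", cites: [1] }],
    [{ title: "Example primary paper", grade: "A" }],
  ),
);
```

Throwing on an uncited claim is a design choice of this sketch: it makes the "no receipt, no claim" rule fail loudly instead of silently dropping text.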

Friendly view — the finished case file on the desk

An open evidence dossier on a warm desk with source clippings clipped in, each wearing a letter grade, and a magnifying glass on top, representing the finished, checkable research result. — This is what “solved” feels like: a tidy case file you can open, where the strong sources are clipped in and graded, and the magnifying glass is right there if you want to check any one of them yourself.

06

“I already have Claude Code — why do I need ruvn too?”

The discipline layer, not another brain.

You almost certainly already have the AI host — Claude Code, Codex, Copilot. So let’s answer this head-on.

Claude Code is the brilliant generalist brain. Out of the box, when you say “research X,” it gives you one confident blended answer — it does not, by default, grade each source, refuse to use the weak ones, adversarially fact-check itself, and attach a citation to every claim. ruvn is the discipline layer that makes it do all of that, every time, in a fixed order.

And it doesn’t replace your AI — it rides inside it. ruvn ships no model of its own; the kernel “makes no model calls — your host provides the model.” You keep your same AI, your same login, your same bill. ruvn just adds the six-agent research pipeline and the grading rubric on top.

Before → after on your own question

	Plain AI chat	With ruvn
Sources	Blended, unlabeled	Each graded A/B/C/D (`source-grader`)
What the answer is built from	Whatever it found	A & B sources only (`synthesizer`)
Self-checking	None by default	Adversarial fact-check; unsupported claims deleted (`fact-checker`)
Citations	Sometimes, inconsistently	Every claim cited or it doesn’t ship (`citer`)
What you can hand a skeptic	“Trust me”	An auditable dossier

Technical view — ruvn is a thin discipline layer on the host you already run

Why it’s not redundant with the AI you already pay for: ruvn ships no model. Your host still provides the brain, the web search, and your files. ruvn is a thin configuration layer — six agent prompts plus the grading rubric — that calls down into your host and forces the grade-verify-cite discipline every time.
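One way to picture "ships no model" is dependency injection: the layer holds prompts and delegates every model call to whatever the host supplies. The sketch below assumes a hypothetical `HostModel` interface and a stand-in `echoHost`; ruvn's real host adapter interface is not shown on this page and may look different.

```typescript
// Hypothetical shape of what a host provides; ruvn's real adapter interface may differ.
interface HostModel {
  complete(systemPrompt: string, userInput: string): string;
}

// An agent here is nothing but an id and a prompt.
interface AgentSpec {
  id: string;
  prompt: string;
}

// The layer holds prompts only; the single model call is delegated to the host.
function runAgent(host: HostModel, agent: AgentSpec, input: string): string {
  return host.complete(agent.prompt, input);
}

// A stand-in host for demonstration; a real host would call its own model.
const echoHost: HostModel = {
  complete: (systemPrompt, userInput) => "[" + systemPrompt + "] " + userInput,
};

const scoutSpec: AgentSpec = { id: "scout", prompt: "Split the question into sub-questions." };
const echoed = runAgent(echoHost, scoutSpec, "Does 40 Hz light therapy help sleep?");
console.log(echoed);
```

Swapping `echoHost` for another implementation changes the brain without touching the agent definitions, which is the sense in which the layer is thin.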

Friendly view — same content, before and after the discipline

A grey blurry trust-me paragraph on the left transforming via an arrow into a tidy verified case file with numbered citations and green grade stamps on the right. — The same AI, the same question — just run through ruvn’s discipline. The left is what you get for free; the right is what ruvn turns it into.

07

Use-case gallery — six real ways people use it

Open each card for its own picture, command, and result.

Every card below runs the same six-agent pipeline (scout → web-searcher → source-grader → synthesizer → fact-checker → citer). The variety is in the question and the audience — not the machinery. Each opens to its own diagram.

1. Check a health or wellness claim before you repeat it

Situation: A client or friend asks “does 40 Hz light-and-sound therapy help sleep or focus?” and you don’t want to parrot hype.

Command: ruvn init, then in Claude Code ask it to run the ruvn dossier pipeline on the question.

What it does: Scouts sub-questions → finds sources → grades each A–D → writes only from A/B → fact-checks → cites.

You get: A TL;DR + cited body + graded bibliography you can stand behind, with the honest reminder that this is research, not medical advice.

A wellness claim in, a graded-and-cited dossier out — with the “research, not medical advice” line kept honestly on the page.

2. Decide between two options with real evidence (compare A vs B)

Situation: You’re weighing option A vs option B, the “compare modalities/dosing, responder profiles, safety evidence” job ruvn was built for.

Command: Ask a comparative question: “X vs Y for outcome Z, and what does the safety evidence say?”

What it does: scout splits it into precise comparison sub-questions; grading keeps only trustworthy sources; the fact-checker flags contradictions and strips unsupported claims.

You get: A side-by-side, cited dossier where contradictions are flagged explicitly (the synthesizer is told to “Flag contradictions explicitly”).

The scout splits the comparison into per-option sub-questions; the fact-checker flags where the evidence disagrees instead of hiding it. You get an honest side-by-side, not a false tidy verdict.

3. Sanity-check a viral or scary claim (“does X cure Y?”)

Situation: A sensational headline is going around and you want the honest status.

Command: Feed ruvn the claim verbatim: “Claim: ‘<the headline>.’ Verify it against current evidence.”

What it does: The fact-checker is adversarial by design: it marks each claim CONFIRMED / DISPUTED / UNSUPPORTED and strips the UNSUPPORTED ones. (The repo’s own validation even runs this on “40 Hz light flicker cures Alzheimer’s.”)

You get: A verdict backed by graded sources, with the hype removed rather than softened into a polite hedge.

The fact-checker labels each claim CONFIRMED, DISPUTED or UNSUPPORTED and deletes the unsupported ones — so the hype is removed rather than softened into a hedge.
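The three labels can be sketched as a small function. The page does not say how the fact-checker separates CONFIRMED from DISPUTED, so the rule used here (any A or B source that contradicts the claim makes it DISPUTED) is an assumption, as are the names `ClaimEvidence`, `judge`, and `stripUnsupported`.

```typescript
type Grade = "A" | "B" | "C" | "D";
type Verdict = "CONFIRMED" | "DISPUTED" | "UNSUPPORTED";

// Hypothetical input: grades of the sources that support or contradict one claim.
interface ClaimEvidence {
  text: string;
  supporting: Grade[];
  contradicting: Grade[];
}

function countGrade(grades: Grade[], wanted: Grade): number {
  return grades.filter((g) => g === wanted).length;
}

function judge(c: ClaimEvidence): Verdict {
  // Threshold from the rubric: one grade-A or two grade-B sources.
  const supported =
    countGrade(c.supporting, "A") >= 1 || countGrade(c.supporting, "B") >= 2;
  if (!supported) return "UNSUPPORTED";
  // Assumption: any A or B source that contradicts the claim makes it DISPUTED.
  const contested =
    countGrade(c.contradicting, "A") + countGrade(c.contradicting, "B") > 0;
  return contested ? "DISPUTED" : "CONFIRMED";
}

// UNSUPPORTED claims are stripped; DISPUTED ones stay, carrying their label.
function stripUnsupported(claims: ClaimEvidence[]): { text: string; verdict: Verdict }[] {
  return claims
    .map((c) => ({ text: c.text, verdict: judge(c) }))
    .filter((r) => r.verdict !== "UNSUPPORTED");
}
```

Keeping DISPUTED claims visible, rather than deleting them, matches the card's promise that disagreement is flagged instead of hidden.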

4. Build a cited brief for a report, post, or decision

Situation: You need a short, defensible write-up that a manager or audience can trust.

Command: Run the dossier pipeline on your topic.

What it does: The citer’s job is literally “every claim must cite a graded source… the dossier must NOT contain any claim without a citation.”

You get: A ready-to-paste Markdown dossier: TL;DR, fully-cited body, bibliography with [A]/[B]/[C] tags.

The citer guarantees a paste-ready brief: every sentence carries a numbered receipt and the bibliography shows each source’s grade — defensible the moment a manager opens it.

5. Wire research into your existing tools, whichever AI you use

Situation: Your team isn’t all on Claude Code; some use Codex, Copilot, OpenCode, or CI.

Command: Point the host at the shipped config. ruvn ships adapters for 9 hosts (Claude Code, Codex, GitHub Copilot, OpenCode, GitHub Actions, pi-dev, Hermes, OpenClaw, RVM).

What it does: The same six-agent pipeline drops into each host via its own config file (e.g. .claude/settings.json, .codex/config.toml, .vscode/mcp.json).

You get: The identical graded-and-cited research discipline, no matter which AI a teammate happens to use.

The same pipeline ships with config for nine hosts, so a teammate on Codex or Copilot gets exactly the same grade-verify-cite discipline as the one on Claude Code.

6. Automate research in CI (a dossier on every issue or comment)

Situation: You want a dossier generated automatically when someone files an issue or comments.

Command: The shipped GitHub Actions host fires on workflow_dispatch or an issue comment and runs the harness non-interactively.

What it does: Runs the harness against the comment/event as the task, with a default-deny contents: read permission posture.

You get: Hands-free, repeatable research wired into your repo’s workflow.

The shipped GitHub Action runs the same pipeline non-interactively under a locked-down contents: read posture — so research happens automatically on each issue, with safe defaults.

08

How to implement it — three commands

Now that you know why, here’s how.

Straight from the repo. Install once, wire it into your host, confirm it loaded:

    # installs the `ruvn` command
    npm i -g @ruvnet/ruvn

    # wires the harness into Claude Code (.claude/ settings + plugin)
    ruvn init

    # health check: confirms the kernel + host adapter load
    ruvn doctor

    # or one-off, no install:
    npx @ruvnet/ruvn init

Then, in Claude Code, ask it to run the research pipeline on your question — the agents and the grading rubric in CLAUDE.md are now available to it.

Want to prove it against a live model first?

Set an OpenRouter key and run the validation — it exercises all 6 agents on a real sample question, and each must return a sensible, on-task answer.

    export OPENROUTER_API_KEY=sk-or-...
    npm run validate:openrouter

What’s actually inside (the pieces you get)

6 agent definitions — each a plain prompt + model tier you can read and tweak (src/agents/*.ts). scout / grader / synth / fact-check / citer run on the sonnet tier; web-searcher runs on the cheaper haiku tier.
A tiny CLI — init, doctor, --version (bin/cli.js).
Host config for 9 hosts: the .claude/, .codex/, .vscode/, .opencode/, .openclaw/, and .github/ directories, plus AGENTS.md / SYSTEM.md / trust.json / cli-config.yaml / rvm.manifest.toml.
A real smoke test — boots the kernel + host adapter so a broken install fails loudly (__tests__/smoke.test.ts).
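The "plain prompt + model tier" idea from the list above can be pictured as a small table of records. This sketch is not the content of `src/agents/*.ts`: the `AgentDefinition` shape is hypothetical, and the prompt strings are paraphrased from this page rather than copied from the repo. Only the agent names and their tiers come from the text.

```typescript
type ModelTier = "sonnet" | "haiku";

// Hypothetical record shape; the repo's real agent files may be structured differently.
interface AgentDefinition {
  id: string;
  tier: ModelTier;
  prompt: string; // paraphrased from this page, not the repo's actual prompt text
}

const AGENT_DEFINITIONS: AgentDefinition[] = [
  { id: "scout", tier: "sonnet", prompt: "Break the question into 3 to 7 precise sub-questions." },
  { id: "web-searcher", tier: "haiku", prompt: "Collect raw sources for each sub-question; do not judge yet." },
  { id: "source-grader", tier: "sonnet", prompt: "Grade each source A, B, C, or D." },
  { id: "synthesizer", tier: "sonnet", prompt: "Write findings from A and B sources only." },
  { id: "fact-checker", tier: "sonnet", prompt: "Try to disprove each claim; delete what cannot be supported." },
  { id: "citer", tier: "sonnet", prompt: "Attach a numbered citation to every claim; build the graded bibliography." },
];

// Look up which agents run on a given tier.
function agentIdsOnTier(tier: ModelTier): string[] {
  return AGENT_DEFINITIONS.filter((a) => a.tier === tier).map((a) => a.id);
}

console.log(agentIdsOnTier("haiku"));
```

Only the collecting step sits on the cheaper tier here, while the judging steps all sit on sonnet.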

Technical view — what each command does to your project

Each command, demystified: install adds the CLI; ruvn init writes the host config and the CLAUDE.md gate into your project; ruvn doctor boots the adapter to confirm it loaded. Then you just ask your AI to run the pipeline.

Friendly view — the six helpers are now on call

Six friendly paper-character helpers ready at a worktable, representing the ruvn pipeline now installed and available inside your AI host. — After three commands, the six little helpers are on call inside the AI you already use. Ask for research, and they line up — scout, search, grade, write, fact-check, cite.

🎧

The NotebookLM studio — listen, watch, skim, or open the whole thing

A full media pack: audio, video, slides, an infographic & a report.

We fed ruvn’s own primer, README, and CLAUDE.md into NotebookLM and had it build a whole media pack — each piece tuned for a newcomer: the same plain-language story (a confident, unsourced AI paragraph → a graded, cited dossier you can actually defend) told as audio, video, slides, an infographic, and a written report. Start wherever you like — or open the live notebook and explore the sources yourself.

Open the NotebookLM studio — audio, video, slides & more

The live, public notebook: every artifact below plus the source documents they were built from. Free, no sign-up to view.

▶ Open the studio →

Audio overview — play this first

A warm two-host conversation: the confident-blend trap, how grading + citation fix it, and how to start.

Written report — the briefing doc

A skimmable deep-dive: the six-agent pipeline (with model tiers), the A/B/C/D rubric, scope, and honest limits.

Read the report →

Video overview

A short animated explainer: the before→after, in motion.

Open the video →

Tip: also playable in the public notebook above.

Slide deck

A detailed deck you can flip through — or hand to a colleague.

Open the slides (PDF) →

PDF rides inside the drop-in zip too.

Infographic — the whole story on one sheet

From AI “vibes” to graded evidence: before→after, the A/B/C/D rubric, and the three commands to start.

ruvn infographic: From AI vibes to graded evidence. Before: standard AI blends a real study with random forum posts into one unverified paragraph. After: a graded dossier with primary and expert sources (Grade A/B), a TL;DR, and verified, cited synthesis. The A/B/C/D standard: A/B sources used for synthesis, C (Wikipedia) for context only, D (forums, dead links) discarded. It rides inside tools you already use, like Claude Code; it is honestly labeled as an early beta; and it starts in three steps: npm i -g @ruvnet/ruvn, ruvn init, ask your question.

Made with NotebookLM: every piece is generated from ruvn’s own primer, README, and CLAUDE.md, then checked against the source. The audio, report, slides (PDF) and infographic also ride inside the drop-in zip (for-humans/studio/); the full video and the live notebook are linked from the zip’s README to keep the download light (see the file tree below).

09

The drop-in — one download, two halves

What’s actually inside the zip.

Beyond installing the tool, there’s a drop-in knowledge bundle: one kb/ folder with two halves — one for you to read, one for your AI to query against the real source. Here’s the actual file tree, every file with a plain-English note on what it is:

ruvn-dropin/  (one download: two halves, plus a README + manifest)
├─ for-humans/  (the half you read & listen to)
│  ├─ ruvn-primer.md  the whole tool explained in plain English: what it is, why, how to use it
│  └─ studio/  NotebookLM media: an AI-generated media pack; start here if you’d rather listen, watch or skim than read
│     ├─ 🎧 ruvn-audio-overview.m4a  audio overview; play this first. A two-host conversation that walks you through ruvn end-to-end.
│     ├─ 📄 ruvn-report.md  the report: a written deep-dive briefing on ruvn you can skim or search.
│     ├─ 📈 ruvn-infographic.png  the infographic: the whole before→after story on one sheet.
│     ├─ 📊 ruvn-slides.pdf  the slide deck: a detailed deck to flip through or hand on.
│     └─ audio-overview-prompt.md  the exact prompt used to generate the audio; reuse or tweak it.
├─ for-ai/  (the half your AI queries)
│  ├─ ruvn-kb.rvf  the “brain”: one searchable vector DB of ruvn’s own code + docs (384-dim, runs anywhere Node 18+ does)
│  ├─ ruvn-kb.passages.jsonl  every doc + all the source, as plain searchable text
│  ├─ ruvn-kb.ids.json  the index that joins each vector back to its passage
│  ├─ ask-kb.mjs  ask the brain a question straight from the command line
│  ├─ kb-mcp-server.mjs  wire the brain into Claude Code / Cursor as an MCP tool
│  ├─ kb.config.mjs  the retrieval config the tools read (intent routing, primer slugs)
│  └─ package.json  the two deps (@ruvector/rvf + the local embedder); npm i and go
└─ (top level)
   ├─ README.md  what it is + the 3-step setup below (points you at the studio audio first)
   └─ manifest.json  provenance: source repo, SHA, build date, embedder, so you can tell it’s current

The full video and the live public notebook are linked from the README to keep the zip light.
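The three data files in for-ai/ fit together as vectors, an id index, and passages. The sketch below shows that join with plain arrays and cosine similarity. It is a conceptual illustration only: it does not read the real `.rvf` file or use the `@ruvector/rvf` API, the in-memory shapes are assumptions, and the toy vectors have 3 dimensions where the real ones have 384.

```typescript
// Hypothetical in-memory shape of one line of the passages file.
interface Passage { id: string; text: string; }

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0; // treat a missing component as 0
    dot += x * y;
    normA += x * x;
  });
  b.forEach((y) => { normB += y * y; });
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// vectors[i] is joined back to its passage through ids[i].
function bestPassage(
  query: number[], vectors: number[][], ids: string[], passages: Passage[],
): Passage | undefined {
  let bestIndex = -1;
  let bestScore = -Infinity;
  for (let i = 0; i < vectors.length; i++) {
    const score = cosineSimilarity(query, vectors[i] ?? []);
    if (score > bestScore) { bestScore = score; bestIndex = i; }
  }
  const bestId = ids[bestIndex];
  return passages.filter((p) => p.id === bestId)[0];
}

// Toy data standing in for the real files.
const toyPassages: Passage[] = [
  { id: "p1", text: "Grade A: primary source, under ~2 years old, on-topic." },
  { id: "p2", text: "ruvn doctor confirms the kernel and host adapter load." },
];
const toyIds = ["p1", "p2"];
const toyVectors = [[1, 0, 0], [0, 1, 0]];
const toyHit = bestPassage([0.9, 0.1, 0], toyVectors, toyIds, toyPassages);
console.log(toyHit ? toyHit.text : "no match");
```

In the real bundle the query vector would come from the local embedder, and the shipped ask-kb.mjs script handles the lookup for you.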

The 3-step drop-in

Listen first (optional): open for-humans/studio/ and play 🎧 ruvn-audio-overview.m4a — a friendly two-host walkthrough of the whole tool. Prefer reading? Open 📄 ruvn-report.md.
Unzip for-ai/ into your project, then npm i (two deps) and ask the brain a question: node ask-kb.mjs ruvn "how does the source-grader decide A vs B?"
Wire it into your AI host: add the 2-line .mcp.json pointing at kb-mcp-server.mjs, and paste the CLAUDE.md gate — now your AI answers from the real ruvn source, not from guesses.
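For step three, the `.mcp.json` content can be sketched as a plain object and printed as JSON. The `mcpServers` layout follows the project-level `.mcp.json` format Claude Code reads; the server key `ruvn-kb` and the relative path are placeholders chosen here, not values taken from the bundle's README.

```typescript
// Assumptions: the server key "ruvn-kb" and the relative path are placeholders;
// adjust both to wherever for-ai/ was unzipped.
const mcpConfig = {
  mcpServers: {
    "ruvn-kb": {
      command: "node",
      args: ["./for-ai/kb-mcp-server.mjs"],
    },
  },
};

// Print the JSON to paste into .mcp.json at the project root.
const mcpJson = JSON.stringify(mcpConfig, null, 2);
console.log(mcpJson);
```

If the bundle's README gives its own snippet, prefer that one; this sketch only shows the general shape of the file.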

Confirm it works: after wiring it in, ask “Using the kb tool, how does ruvn’s source-grader decide an A vs a B?” A correct, grounded answer that quotes the rubric means the drop-in is live. (The whole bundle is self-contained: the for-humans/studio/ audio + report ride inside the zip, not just on this page.)

Friendly view — the bundle dropping into your laptop

A small kb folder being dropped into an open laptop, splitting into two labeled halves: a printed primer booklet for humans on the left, and a glowing brain-file for your AI on the right. — One small folder, two halves: a primer booklet *you* read, and a queryable “brain” *your AI* answers from. Drop it in, wire two lines, and the AI stops guessing about ruvn and starts quoting the real source.

⇩ Download the drop-in Smart Zip

Self-contained: the for-humans/studio/ audio overview + report ride inside.

10

Honest limits — what ruvn is not

Stated plainly, not hidden.

Early beta — v0.1.1. This is a young, small package.
It’s a research tool, not an oracle. It grades and cites evidence; it does not give medical advice or make efficacy claims. Its output is “a starting dossier to verify, not a conclusion.”
It needs a host with web access. ruvn ships no model of its own — the kernel makes no model calls; your AI host provides the brain and the web-search tool. No host, no run.
Quality is bounded by what’s findable and by the grader’s judgment. Grades are assigned by an AI applying the A/B/C/D rubric — solid, but not a human peer-reviewer. Treat the dossier as a strong first pass to verify.
Domain DNA. It was built for the gamma-entrainment / ruv-neural project, so its built-in examples and validation tasks are 40 Hz-flavored. The pipeline is general; the examples just smell of its origin.
“Signed session bundles” is the sibling tool’s job. That phrase in the one-liner belongs to ruv-neural (the closed-loop OS that runs / measures / signs protocols); ruvn is the research front-end. Don’t oversell ruvn as the protocol-signer.

In one line: ruvn makes research auditable; it does not make it infallible. It hands you a strong, checkable first pass; the checking is still yours to do.

Stop trusting the confident paragraph. Get a graded dossier.