AI Product · Built solo

BestAnswers.AI

A multi-agent answer engine: four AI personas debate a question in parallel, and a meta-judge merges the strongest reasoning into one answer with the disagreements left visible.

Role: Product, design & engineering (solo)
When: Personal product · ongoing
Stack: React · TypeScript · Vite · Tailwind · multi-LLM

Visit the live product

01 Problem

Ask a single model and you get one confident voice — and no way to tell a careful answer from a fluent-sounding wrong one. People cope by pasting the same prompt into ChatGPT, Claude, and Gemini and eyeballing the differences. That’s slow, and it hides the most useful signal of all: where capable models actually disagree. I wanted a tool where disagreement is the product, not a thing to paper over — so you can trust an answer because you can see what was contested and why one line of reasoning won.

02 Constraints

Solo build — every decision had to be cheap enough for one person to ship and maintain.
Multiple model providers (Gemini, Llama, Mistral) with different latencies and failure modes.
Responses stream at different speeds — the UI can’t assume they arrive together.
A non-technical visitor has to grasp “four AIs argued, here’s the verdict” in seconds.
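
One way to live with the provider constraints above is to start every persona call at once and let each one fail on its own. The sketch below is a minimal illustration, not the product's actual code: `PersonaCall`, `PersonaResult`, `withTimeout`, and `askAll` are hypothetical names, and the timeout value is an assumption.

```typescript
// Hypothetical shapes for illustration; none of these names come from the product.
type PersonaCall = { persona: string; run: () => Promise<string> };

type PersonaResult =
  | { persona: string; status: "ok"; text: string }
  | { persona: string; status: "failed"; reason: string };

// Reject if the wrapped promise has not settled within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${ms} ms`)),
      ms,
    );
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

// Start every call at once. A slow or failing provider becomes a "failed"
// entry instead of rejecting the whole batch, so the UI can still show
// whatever did arrive. Results keep the order of `calls`.
function askAll(calls: PersonaCall[], timeoutMs: number): Promise<PersonaResult[]> {
  return Promise.all(
    calls.map((call) =>
      // Promise.resolve().then(...) also captures a synchronous throw from run().
      withTimeout(Promise.resolve().then(() => call.run()), timeoutMs).then(
        (text): PersonaResult => ({ persona: call.persona, status: "ok", text }),
        (error): PersonaResult => ({
          persona: call.persona,
          status: "failed",
          reason: error instanceof Error ? error.message : String(error),
        }),
      ),
    ),
  );
}
```

Because each call is converted into a result object before `Promise.all` sees it, one provider timing out never hides the other answers.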

03 Core design decision

Make the disagreement visible instead of hiding it behind one merged answer.

The whole bet is that trust comes from transparency. So the verdict is never shown alone — it sits on top of the four persona positions, with the meta-judge’s reasoning for why one won. The hard part was making that legible rather than overwhelming, which is where the rejected directions came in.

Directions I rejected

Single merged answer, sources hidden. Tested cleanest, but it’s just another confident black box: it throws away the one thing that makes this different.
Four columns streaming simultaneously. Built it; it read as chaos. Four live text streams racing each other is unreadable and stressful, not transparent.
Chat-style transcript of the “debate”. Cute, but it buried the verdict and tripled reading time. Theatre over usefulness.

04 Design ↔ code, in one loop

I couldn’t spec the verdict view in a static frame — the feel of it depended entirely on how real model responses behaved at runtime. So I designed it in code: built the debate view, wired up four live calls, and watched what actually happened.

What happened was the simultaneous-streaming chaos above. Seeing it run told me the sequencing was the real design problem, not the layout. I reworked the order — personas resolve, then the judge speaks — in the same file I’d just been styling. Design decision and implementation were one act, and the running thing corrected the next decision. The verdict graph in this portfolio’s hero is that same idea, miniaturised.
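
The "personas resolve, then the judge speaks" ordering can be expressed as a small state gate. This is a sketch under assumptions rather than the shipped implementation: the event names, `DebateState`, `reduce`, `phase`, and `visibleVerdict` are all hypothetical. It shows one way to keep the verdict hidden until every persona has finished, even if the judge's text arrives early.

```typescript
// Hypothetical state and events for illustration only.
type PersonaState = { name: string; text: string; done: boolean };
type DebateState = { personas: PersonaState[]; verdict: string | null };

type DebateEvent =
  | { type: "chunk"; persona: string; text: string }
  | { type: "personaDone"; persona: string }
  | { type: "verdict"; text: string };

function startDebate(names: string[]): DebateState {
  return {
    personas: names.map((name) => ({ name, text: "", done: false })),
    verdict: null,
  };
}

// Pure reducer: every event returns a new state, which suits React's useReducer.
function reduce(state: DebateState, event: DebateEvent): DebateState {
  switch (event.type) {
    case "chunk": {
      const { persona, text } = event;
      return {
        ...state,
        personas: state.personas.map((p) =>
          p.name === persona ? { ...p, text: p.text + text } : p,
        ),
      };
    }
    case "personaDone": {
      const { persona } = event;
      return {
        ...state,
        personas: state.personas.map((p) =>
          p.name === persona ? { ...p, done: true } : p,
        ),
      };
    }
    case "verdict":
      return { ...state, verdict: event.text };
  }
}

// The phase is derived from state, so the ordering cannot drift out of sync.
function phase(state: DebateState): "debating" | "judging" | "done" {
  if (state.personas.some((p) => !p.done)) return "debating";
  return state.verdict === null ? "judging" : "done";
}

// The gate: the verdict is stored whenever it arrives but shown only at "done".
function visibleVerdict(state: DebateState): string | null {
  return phase(state) === "done" ? state.verdict : null;
}
```

Deriving the phase instead of storing it means the view only has to ask one question before rendering the verdict.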

05 What shipped & what it moved

BestAnswers.AI is live and built end-to-end by me.

Live: shipped and publicly usable. Try it directly at bestanswersai.com; no signup is needed to ask a question.
4: models debating per query. Parallel calls across Gemini, Llama, and Mistral, merged by a meta-judge.
1: transparent verdict. Every answer ships with the reasoning for why one position won, not just the conclusion.

06 What I’d redesign today

Add a “why these four?” affordance — the persona choice is currently implicit; it should be inspectable.
Persist a verdict as a shareable link, so the transparency travels with the answer.
Let the user weight the personas for their context (e.g. favour the Engineer for a technical question) and show how that changes the verdict.
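
The persona-weighting idea could be prototyped as a plain scoring step. The sketch below rests on a loud assumption: that the meta-judge can emit a numeric score per persona, which nothing above confirms. `PersonaScore`, `Weights`, `pickWinner`, and the "Skeptic" persona are hypothetical; only "Engineer" is named in the list above.

```typescript
// Assumption: the judge produces a score per persona (higher is stronger).
type PersonaScore = { persona: string; score: number };

// User-chosen multipliers; a persona without an entry keeps weight 1.
type Weights = Record<string, number>;

// Return the persona with the highest weighted score, or null for no input.
// Ties go to the persona listed first.
function pickWinner(scores: PersonaScore[], weights: Weights = {}): string | null {
  let best: { persona: string; weighted: number } | null = null;
  for (const { persona, score } of scores) {
    const weighted = score * (weights[persona] ?? 1);
    if (best === null || weighted > best.weighted) {
      best = { persona, weighted };
    }
  }
  return best === null ? null : best.persona;
}

// Made-up scores, purely to show the effect of a weight.
const exampleScores: PersonaScore[] = [
  { persona: "Engineer", score: 0.6 },
  { persona: "Skeptic", score: 0.7 },
];

const unweighted = pickWinner(exampleScores);                  // "Skeptic"
const technical = pickWinner(exampleScores, { Engineer: 1.5 }); // "Engineer"
```

Running both calls side by side is also what would let the interface show how a weight changed the verdict, since the unweighted winner stays available for comparison.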
