Glass Box

A live tool that turns an AI agent from a black box into a glass one, every decision readable, reversible, and sourced, without reading a line of code.

Product Design|AI Agents|Trust & Transparency|Interactive Prototype

Concept, design + build, solo·2026

Try the concept ↓

The problem

An agent did a few hours of work overnight, and now you have to bet on it.

It scanned five hundred candidates, surfaced twelve, and skipped the rest. The output arrives as a clean ranked list, a verdict with no visible reasoning behind it. And you're about to act on that verdict: who gets contacted, who gets passed over, whose shot at a role quietly ends in a filter you never saw run. The agent's whole value is that it worked while you didn't. That's also exactly what makes it frightening.

You get two bad options. Trust the list blindly, fast, and a leap of faith every time. Or redo the agent's reasoning yourself to check it, which means you've hired yourself as the agent's intern and thrown away the reason you used an agent at all. Speed or trust, pick one.

And there's a second party nobody's designing for: the people on the wrong side of an invisible decision. The forty-seven candidates cut for "tenure under eighteen months" don't get a second look, because no human can see why they were cut, or that the rule was even applied. Opacity isn't only a trust problem for the operator. It's a fairness problem for everyone the agent decided about and never had to explain.

The approach

Make the box glass.

Not "add an explainability tab", make visibility the primary surface, so seeing what the agent did is the default, not a place you go digging. Three properties carry it.

Readable

Every match traces to the exact line that triggered it, not a vague "92% match," but the actual sentence in the résumé the agent keyed on. Reasoning is shown as the evidence, not a score.

Reversible

Any decision, a match, a skip, is one click to undo. That single interaction reframes everything: the agent's call is a draft, not a verdict. It proposes; you dispose.

Steerable

Toggle a filter chip and the whole ranking recomputes live. You correct the agent's priorities in flight, watch candidates re-rank in real time, without touching code or waiting on a re-run.

Transparency as a feature, not a footnote. You should be able to audit an agent without becoming an engineer, and the audit trail itself becomes the thing that earns trust, instead of asking for it.

source line per match

The evidence, not a score, every match traces to the exact sentence the agent keyed on.

~500

decisions per run

Each one made readable, not just the twelve that surfaced.

1 click

to reverse a skip

The agent proposes, the human disposes, any decision is one click to undo.

Real time

re-ranking

Correct a priority and the list recomputes live, minutes, not a fresh run.

Try it

Click Run Agent. Watch it think out loud.

This is the actual interactive concept, not a video. Run the agent, then poke at its decisions: every match has a source, every skip can be undone, every filter re-ranks the list as you watch.

Where this breaks

Turning the box to glass solves "you can't see the decision." It quietly raises five harder questions.

A reason you can read isn't a reason that's right.

The reasoning shown explains why the agent says it matched. But a fluent, well-sourced explanation can rationalize a bad call as smoothly as a good one. Legibility doesn't just reveal quality; it can manufacture confidence in decisions that don't deserve it.

Provenance assumes the stated reason is the real reason.

Showing "the line that triggered the match" implies the agent decided the way it explains. With LLMs, the explanation is often post-hoc, generated after the fact, plausible, and not necessarily how the decision was actually reached. I'm showing a reason. I can't fully prove it's the reason.

Reversible decisions move the bias; they don't remove it.

Letting a human undo a skip surfaces fairness, but the human re-ranking brings their own bias, now wearing the agent's authority. "Steerable" can quietly mean "steerable toward my own blind spots," with a clean audit trail that makes it look rigorous.

“Without reading code” means trusting a translation.

The visibility layer is itself an interpretation of what the agent did; I chose what to surface, what to compress, what to leave out. The lens has a point of view. A glass box you didn't build is still a box you have to take on faith.

Transparency at scale becomes noise.

Seeing twelve decisions is clarity. Seeing five hundred is a second haystack. The unsolved design question isn't "show more", it's how much to show before visibility becomes its own way of hiding, and I don't think I've found that line yet.

What I learned

I went in thinking the hard part was making the agent's decisions visible; the hard part was realizing that visible and trustworthy aren't the same thing; a clear explanation can make a wrong decision more convincing, not less.

"Reversible" turned out to be the most honest word in the project: the moment you let someone undo the agent's call, you've admitted it was never the authority, just the first draft, and that reframing changed how I designed every screen after it.

And I kept catching myself designing the lens to flatter the agent, to make its reasoning look sound, which taught me that a transparency tool has a point of view, and pretending it's neutral is the most dangerous thing it can do.

Try the live concept→

Explore more case studies

Design System