
I Designed an AI Agent Workspace Before We Called It That

  • Writer: Neelasaraswathi Venkataraman
  • Mar 5
  • 7 min read

There's a particular kind of design challenge that doesn't come with a reference library. No established patterns, no clear analogies, no "how competitor X solved this." You're building something that didn't exist before — and that means you're also building the mental models your users will rely on.

That was Covalent.


The product, briefly

Before you could run AI workloads across cloud GPUs, quantum processors, or your university's computing cluster in one unified workflow, you had to do it manually — connect to each system separately, split your code into pieces, submit each piece, wait, collect results, and stitch everything back together. Dozens of files. Multiple terminals. No single view of what ran, what failed, or what it cost.

Covalent changed that. You write your entire Python workflow in one file. The platform handles dispatch — routing tasks to the right compute resources, whether that's a GPU cluster, a quantum processor, or your own hardware — and brings back a unified result. For physics researchers and quantum computing teams, this was genuinely new infrastructure.
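To make the "one file in, unified result out" model concrete, here is a minimal stdlib-only sketch. This is not the real Covalent API — the task functions, the `dispatch` helper, and the status record are all illustrative — but it shows the shape of the idea: the workflow lives in a single Python file, each task is routed by a dispatcher, and you get back both the result and a record of what ran.

```python
# Toy illustration (NOT the real Covalent API): one file describes the
# whole workflow; a dispatcher runs each task and keeps a record of
# what executed and with what status.

def preprocess(data):       # in Covalent, this might run on local hardware
    return [v * 2 for v in data]

def train(batch):           # ...while this might be routed to a GPU cluster
    return sum(batch)

def dispatch(workflow, *args):
    """Run the workflow, recording each task's final status."""
    record = []

    def run(task, *task_args):
        try:
            out = task(*task_args)
            record.append((task.__name__, "COMPLETED"))
            return out
        except Exception:
            record.append((task.__name__, "FAILED"))
            raise

    result = workflow(run, *args)
    return result, record

def pipeline(run, data):
    batch = run(preprocess, data)
    return run(train, batch)

result, record = dispatch(pipeline, [1, 2, 3])
# result is 12; record lists every task with its final status
```

The point of the sketch is the contract, not the plumbing: the researcher writes `pipeline` once, and the platform owns routing, status tracking, and result collection.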

I was the product designer on the platform from 2023 to 2025, working with a fully remote team of engineers, researchers, and founders — many of whom were physicists who used the product themselves.


Problem Statement

The platform had shipped its MVP and the team was using it in earnest — running real experiments, submitting real dispatches, accumulating real costs. A dispatch is a workflow run: you send code, it executes across compute resources, you get back results and a record of everything that happened.

The problem was that researchers don't run one dispatch. They run dozens — iterating on inputs, comparing outputs, hunting down failures. And the interface had been built for a simpler version of that reality. As usage grew, the friction became impossible to ignore.

Three areas kept surfacing:

Finding things was hard. The navigation hierarchy reflected how the product was architected, not how people actually moved through it. Users came in looking for a specific dispatch, not navigating a tree of experiments and projects.

Comparing runs was nearly impossible. Researchers needed to look across multiple dispatches simultaneously — same experiment, different parameters — and the interface gave them no real way to do that.

Understanding what happened inside a run was painful. The graph showing execution flow was visually prominent but functionally limited. Key details were buried in stacked popovers and a side modal that fought the interface for space.


Discovery

My background is in technology, but not quantum computing. So I started where you have to start when the domain is genuinely new to you: I talked to the people who lived in it.

The team at Covalent was unusual in a way that shaped everything. This wasn't a typical SaaS company where product, engineering, and users are separate groups. The founders and engineers were physicists and researchers who ran their own experiments on the platform. They were building the tool they needed — and they felt its friction firsthand.

I spoke with them directly about how they used the platform day to day, watched sessions where they moved through real workflows, and paid close attention to the moments where they worked around the interface rather than with it — copying IDs into separate notes, keeping a mental count of failed nodes, navigating away and back to reconstruct context they'd lost.

That last pattern was the most telling. When people build their own workarounds for a tool they also built, something is genuinely broken.

One thing I had to calibrate early: the mental model for "user overwhelm" doesn't apply here the way it does in consumer products. These users run simulations on heavy, outdated scientific software. They work in complex IDEs. Complexity itself wasn't the problem — hidden complexity was. If something failed, they needed to find the cause fast. Burying information in extra layers would have been more frustrating than a dense interface.



Synthesis

A few clear themes emerged from what I heard and observed.

The pinned dispatches feature — meant to give quick access to frequently used runs — occupied significant screen real estate but was rarely used. The assumption behind it didn't match actual behaviour.

The navigation structure (Experimentation → Projects → Dispatches) made sense architecturally but not mentally. Users oriented around dispatches directly, not the hierarchy above them.

The dispatch table needed to support comparison, not just listing. Researchers needed to see input and output values across multiple rows simultaneously — and the current table made that a multi-step exercise.

The graph page had a compounding interaction problem. Clicking a node opened a popover. Interacting with that popover opened another. Three layers deep, stacked over a dense dark interface, with no clear way back. The graph itself was genuinely useful — a quick visual read of how far a workflow had progressed, where it branched, where it failed — but it was being obscured by the interaction model around it.

And cost had no home. Running quantum and GPU compute is expensive, and an inattentive configuration choice could generate an unexpectedly large bill. There was no persistent, clear view of what a dispatch had cost. For a team of PhDs running their own experiments, this wasn't a minor UX gap.


Design Decisions

Rethinking the dispatch list

We removed pinned dispatches from prime real estate and restructured the page around the list itself. Pagination was replaced with infinite scroll and temporal grouping — This Week and Past Dispatches — borrowing a scroll marker pattern from Google Photos that lets users navigate by date without losing their place. Latest dispatches surfaced first, matching how people actually accessed their work.

The status filter row — Completed, Failed, Running, Pending — became more than navigation. Each state showed a count, turning the filter into a lightweight summary of dispatch distribution at a glance.
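The two list-page decisions above — recency-first grouping and status filters that double as counts — are simple data transforms. A hypothetical sketch (the dispatch records and field names are invented for illustration):

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical dispatch records; "status" and "started" are illustrative.
now = datetime(2024, 6, 14)
dispatches = [
    {"id": "d1", "status": "COMPLETED", "started": now - timedelta(days=1)},
    {"id": "d2", "status": "FAILED",    "started": now - timedelta(days=2)},
    {"id": "d3", "status": "RUNNING",   "started": now - timedelta(days=9)},
]

# Filter row counts: each status label carries its own tally,
# turning the filter into a one-glance summary of the workload.
counts = Counter(d["status"] for d in dispatches)

# Temporal grouping: recent work surfaces first, older runs below.
week_ago = now - timedelta(days=7)
this_week = [d for d in dispatches if d["started"] >= week_ago]
past = [d for d in dispatches if d["started"] < week_ago]
```

The interface work was in the presentation — scroll markers, sticky headers — but the underlying model is just this grouping applied to the dispatch list.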



The horizontal scroll table

There was some internal debate about this one. Horizontal scroll doesn't read as a "sophisticated" design choice on the surface. But the reasoning was straightforward.

Think of it like tracking a delivery. You don't open the tracking page every time — only when you're anticipating something, or when something goes wrong. The equivalent for these users was the dispatch detail view. You go there when you need to dig in. The list view is where you scan, compare, and decide what needs attention.

For that scanning to work, you need comparable information visible at once. We introduced a horizontally scrollable table with a sticky Dispatch ID column, so users could scroll across Input and Output values for multiple dispatches simultaneously without losing track of which row they were on. Users were running on one or two monitors — wide enough that the full table was usually visible — and we were moving away from pinned columns anyway, simplifying the interaction model.

The goal wasn't elegance. It was giving researchers the fastest possible read across the data they actually needed to compare.


Fixing the graph page

We kept the graph. There was a real question about whether to remove it entirely in favour of a pure list view — at a certain scale, a graph with hundreds of nodes is hard to parse. But the graph earns its place at a glance: you can immediately see how far a workflow has progressed, where branches converge, where something broke and how much of the run was still intact. That spatial overview is genuinely hard to replace with a list. We kept it as an anchor and fixed what was broken around it.

The popover problem was resolved by removing the stacking interaction entirely. We introduced a structured list view alongside the graph — all functions in the dispatch, sortable and filterable — with errored functions surfaced at the top by default. The mental model being: if you're looking at this page after something went wrong, the first thing you see is what failed.
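The errors-first default ordering is a small rule worth stating precisely. A sketch, with invented function records standing in for the dispatch's nodes:

```python
# Sketch of the list view's default ordering: errored functions first,
# then alphabetical. The records and field names are illustrative.
functions = [
    {"name": "load_data", "status": "COMPLETED"},
    {"name": "fit_model", "status": "FAILED"},
    {"name": "plot",      "status": "COMPLETED"},
]

# In Python, False sorts before True, so failed functions rise to the top.
ordered = sorted(functions, key=lambda f: (f["status"] != "FAILED", f["name"]))
```

A user landing on the page after a failure sees `fit_model` first, with the healthy functions below — the sorted list encodes the "show me what broke" priority directly.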

The result summary moved out of the side modal and into a dedicated overview panel at the top of the page. Status, cost, inputs, outputs, dispatch metrics — all present before you scrolled to the graph. The modal had been fighting the interface for space. The overview panel gave that information the legibility it deserved.



Rebuilding the design system

Running in parallel to all of this was the design system.

Covalent's MVP had been built quickly, as MVPs are. Components were inconsistent, patterns had diverged, and the visual language had accumulated debt. As we rebuilt the interface, I rebuilt the system alongside it — versioning out old components deliberately, creating new ones grounded in the actual patterns we were introducing, and documenting decisions so engineering could move without needing to check in on every implementation.

The system isn't the point. The decisions inside it are. But without it, what we were building would have been a one-time improvement rather than a foundation. In a product at this stage — genuinely new category, small team, moving fast — a maintained system is what makes consistency possible without slowing everything down.



Outcome

The company was acquired before structured usage data could be collected on the redesigned interface. What I have is qualitative: the team used it, found it faster to navigate, and the improvements were incorporated into the product that was carried forward.

That's an incomplete outcome by any rigorous measure, and I'd rather say that than construct a number. What I can point to is the reasoning behind each decision — and the fact that those decisions were grounded in how the actual users moved through a genuinely novel kind of tool.


The thing I keep thinking about now

At the time, we described this as a quantum computing orchestration platform. We talked about dispatches and lattices and compute backends. The vocabulary was specific to the domain.

But the underlying design problem — how do you give someone visibility and control over a complex workflow that runs across multiple systems, surfaces partial failures, and needs to show cost, status, and results in a way that's actually scannable — that problem didn't stay in quantum computing.

That's exactly what agentic AI interfaces are now grappling with. An AI agent runs a task, breaks it into subtasks, routes them to different tools or models, some succeed, some fail, costs accumulate, and the user needs to understand what happened and why. The mental models are structurally identical.

The design patterns we reached for in Covalent — progressive disclosure, prioritising errors, cost visibility as a trust signal, list views alongside graph views, recency-first information architecture — these are the same decisions that matter in every agent workspace being designed today.

I didn't know I was designing an early agentic interface. I was just trying to make a complex system legible to the people who depended on it. But looking back, the problems were the same. The vocabulary just hadn't caught up yet.

That's the thing about designing at the edge of a new category: the patterns you develop aren't domain-specific. They're the foundation for what comes next.

— Neela

 
 
 
