The Challenge
Long-running AI workflows need resilience. When they fail silently, users lose trust.
AI Ticket Generation Takes Time
AI ticket specification generation takes 60-90 seconds. If users closed their browser during generation, the work was lost and tickets never got updated. We needed background processing with automatic resume.
No Progress Visibility
Users had no idea what was happening during 60+ second workflows
Lost Work on Browser Close
Closing the browser meant starting over, a poor experience for long-running processes
Complex State Management
Needed to track workflow state, resume polling, and update UI seamlessly
Architecture Decision
Webhooks vs polling? Server-side queue vs client-side? Many options to evaluate
The Secret Weapon
Structured ticket specifications created a shared language between developer and AI.
Q-165_ticket.json
Structured blueprint for implementation
Acceptance Criteria
- Progress indicator displays immediately
- Real-time updates stream to UI
- Ticket updates persist if browser closes
- Generation completes in browser-close tests
Risks & Mitigations
- Orphaned processes → add timeouts
- Firebase costs → monitor usage
- Security → validate authenticated users
Success Metrics
- >95% completion rate → achieved 100%
- <2s to first update → achieved <1s
- User feedback on visibility
How It Guided Implementation
Dependencies Section Pointed to Existing Code
Mentioned "Progress component in src/components/ui/progress.tsx" — we reused it instead of building new
Tasks Identified Exact Files to Modify
TicketView.tsx, useMastraClient.ts — no guessing about architecture
Risks Prevented Common Pitfalls
We added temp runId cleanup and Firestore security rules proactively, before they could become problems
Metrics Defined Success Upfront
Clear targets meant we knew exactly when we were done
The Implementation
What we built and why we chose client-side polling over a server-side queue.
Pre-Created Tracking Documents
Firestore subcollection created before workflow starts
Immediate RunId Updates
Callback updates document within 1 second of workflow start
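A minimal sketch of these first two steps, assuming the Firebase modular SDK; the `generationRuns` subcollection, field names, and function names are illustrative, not the actual QEEK schema:

```ts
import {
  getFirestore,
  doc,
  setDoc,
  updateDoc,
  serverTimestamp,
} from "firebase/firestore";

// Assumes initializeApp() has already run elsewhere in the app.
const db = getFirestore();

// Pre-create the tracking document before starting the workflow, so the UI
// has something to watch even if the browser closes moments later.
export async function createTrackingDoc(ticketId: string) {
  const trackingRef = doc(db, "tickets", ticketId, "generationRuns", "current");
  await setDoc(trackingRef, {
    status: "starting", // updated once the workflow reports its runId
    runId: null,
    createdAt: serverTimestamp(),
  });
  return trackingRef;
}

// Callback invoked as soon as the workflow hands back a runId (target: <1s).
export async function recordRunId(ticketId: string, runId: string) {
  const trackingRef = doc(db, "tickets", ticketId, "generationRuns", "current");
  await updateDoc(trackingRef, { runId, status: "running" });
}
```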
Automatic Resume Detection
Page load checks for pending workflows and resumes
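Resume detection is then a single query on page load for runs that never reached a terminal state. Again a hedged sketch, reusing the status values assumed above:

```ts
import { getFirestore, collection, getDocs, query, where } from "firebase/firestore";

const db = getFirestore();

// On page load, find any generation runs still marked as in-flight and hand
// them back to the polling hook so the UI picks up where it left off.
export async function findPendingRuns(ticketId: string) {
  const runsRef = collection(db, "tickets", ticketId, "generationRuns");
  const pending = query(runsRef, where("status", "in", ["starting", "running"]));
  const snapshot = await getDocs(pending);
  return snapshot.docs.map((d) => ({ id: d.id, ...d.data() }));
}
```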
Real-Time Progress Bar
Updates based on workflow step completion
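The progress bar itself can stay thin: derive a percentage from completed steps and feed it to the existing Progress component the spec pointed at. This assumes a shadcn-style component taking a 0-100 `value` prop and an `@/` path alias; the step names are placeholders for the real workflow steps:

```tsx
import { Progress } from "@/components/ui/progress";

// Placeholder step names; the real workflow defines its own.
const WORKFLOW_STEPS = ["analyze", "draft", "refine", "persist"];

export function GenerationProgress({ completedSteps }: { completedSteps: string[] }) {
  const percent = Math.round((completedSteps.length / WORKFLOW_STEPS.length) * 100);
  return <Progress value={percent} aria-label="Ticket generation progress" />;
}
```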
Seamless UI Updates
Specification appears without page refresh
Cleanup for Failed Workflows
Handles temp runIds and stale documents
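Cleanup is the safety net for the orphaned-process risk called out in the spec. A sketch under the same assumed schema, with a hypothetical 10-minute staleness cutoff:

```ts
import { getFirestore, collection, deleteDoc, getDocs, query, where } from "firebase/firestore";

const db = getFirestore();
const STALE_AFTER_MS = 10 * 60 * 1000; // hypothetical cutoff; generation normally finishes in 60-90s

// Delete tracking documents whose workflow never reported completion,
// including temp runIds left behind by failed starts.
export async function cleanupStaleRuns(ticketId: string) {
  const runsRef = collection(db, "tickets", ticketId, "generationRuns");
  const inFlight = query(runsRef, where("status", "in", ["starting", "running"]));
  const snapshot = await getDocs(inFlight);
  const stale = snapshot.docs.filter((d) => {
    const createdAt = d.data().createdAt?.toMillis?.() ?? 0;
    return Date.now() - createdAt > STALE_AFTER_MS;
  });
  await Promise.all(stale.map((d) => deleteDoc(d.ref)));
}
```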
Architecture Decision: Client-Side Polling
The ticket suggested a "server-side queue", but we chose client-side polling with Firestore persistence instead (see the sketch after the lists below).
Why This Worked Better
- Simpler — no additional infrastructure
- More reliable — direct Mastra API polling
- Easier to debug and monitor
What We Avoided
- Complex webhook setup
- Additional server-side queue
- Network reliability concerns
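In practice the client-side approach reduces to a loop like the one below. `getRunStatus` is a stand-in for whichever status call the Mastra client (wired up in useMastraClient.ts) actually exposes, and the status shape and interval are assumptions:

```ts
// Minimal polling loop; status shape and interval are illustrative.
type RunStatus = {
  state: "running" | "completed" | "failed";
  completedSteps: string[];
};

export async function pollRun(
  getRunStatus: (runId: string) => Promise<RunStatus>,
  runId: string,
  onProgress: (status: RunStatus) => void,
  intervalMs = 2000,
): Promise<RunStatus> {
  // Poll until the workflow reaches a terminal state, reporting progress on
  // every tick so the UI and the Firestore tracking document stay current.
  for (;;) {
    const status = await getRunStatus(runId);
    onProgress(status);
    if (status.state !== "running") return status;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```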
The Results
Before
- 60+ seconds with no feedback
- Lost work if browser closed
- Users didn't know if it was working
- Had to stay on page entire time
After
- Real-time progress updates
- Work continues in background
- Clear visibility into each step
- Can close browser and return later
Key Takeaways
Lessons from building resilient AI workflows in a single session.
Structured Specifications Enable AI Collaboration
Well-structured ticket specifications (with acceptance criteria, tasks, dependencies, risks, and metrics) create a shared language between developers and AI. This structure eliminates ambiguity and enables smooth, iterative collaboration.
Clear Metrics Drive Success
Defining success metrics upfront (">95% completion rate", "<2s to first update") meant we knew exactly when we were done. We didn't just meet these targets — we exceeded them (100%, <1s).
Simpler Is Often Better
The ticket suggested a complex server-side queue, but client-side polling with Firestore persistence proved simpler, more reliable, and easier to maintain. Don't over-engineer when a straightforward solution works.
"The structured ticket spec was the difference between AI collaboration and AI confusion. Every decision, every risk, every metric was already documented — the AI just executed."
Ready to build with structured specifications?
See how QEEK can help your team write better tickets and ship faster.