The Challenge
Long-running AI workflows need resilience. When they fail silently, users lose trust.
AI Ticket Generation Takes Time
AI ticket specification generation takes 60-90 seconds. If users closed their browser during generation, the work was lost and tickets never got updated. We needed background processing with automatic resume.
No Progress Visibility
Users had no idea what was happening during 60+ second workflows
Lost Work on Browser Close
Closing the browser meant starting over, a poor experience for long-running processes
Complex State Management
Needed to track workflow state, resume polling, and update UI seamlessly
Architecture Decision
Webhooks vs polling? Server-side queue vs client-side? Many options to evaluate
The Secret Weapon
Structured ticket specifications created a shared language between developer and AI.
Q-165_ticket.json
Structured blueprint for implementation
Acceptance Criteria
- Progress indicator displays immediately
- Real-time updates stream to UI
- Ticket updates persist if browser closes
- Generation completes in browser-close tests
Risks & Mitigations
- Orphaned processes → add timeouts
- Firebase costs → monitor usage
- Security → validate authenticated users
Success Metrics
- >95% completion rate → achieved 100%
- <2s to first update → achieved <1s
- User feedback on visibility
How It Guided Implementation
Dependencies Section Pointed to Existing Code
Mentioned "Progress component in src/components/ui/progress.tsx" — we reused it instead of building new
Tasks Identified Exact Files to Modify
TicketView.tsx, useMastraClient.ts — no guessing about architecture
Risks Prevented Common Pitfalls
We added temp runId cleanup and Firestore security rules proactively, before they could become problems
Metrics Defined Success Upfront
Clear targets meant we knew exactly when we were done
The Implementation
What we built and why we chose client-side polling over a server-side queue.
Pre-Created Tracking Documents
Firestore subcollection created before workflow starts
Immediate RunId Updates
Callback updates document within 1 second of workflow start
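A minimal sketch of these first two steps, assuming the Firebase modular SDK; the `generationRuns` subcollection, field names, and function names are illustrative, not the actual QEEK schema:

```ts
import {
  getFirestore,
  doc,
  setDoc,
  updateDoc,
  serverTimestamp,
} from "firebase/firestore";

// Assumes initializeApp() has already run elsewhere in the app.
const db = getFirestore();

// Pre-create the tracking document before starting the workflow, so the UI
// has something to watch even if the browser closes moments later.
export async function createTrackingDoc(ticketId: string) {
  const trackingRef = doc(db, "tickets", ticketId, "generationRuns", "current");
  await setDoc(trackingRef, {
    status: "starting", // updated once the workflow reports its runId
    runId: null,
    createdAt: serverTimestamp(),
  });
  return trackingRef;
}

// Callback invoked as soon as the workflow hands back a runId (target: <1s).
export async function recordRunId(ticketId: string, runId: string) {
  const trackingRef = doc(db, "tickets", ticketId, "generationRuns", "current");
  await updateDoc(trackingRef, { runId, status: "running" });
}
```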
Automatic Resume Detection
Page load checks for pending workflows and resumes
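Resume detection is then a single query on page load for runs that never reached a terminal state. Again a hedged sketch, reusing the status values assumed above:

```ts
import { getFirestore, collection, getDocs, query, where } from "firebase/firestore";

const db = getFirestore();

// On page load, find any generation runs still marked as in-flight and hand
// them back to the polling hook so the UI picks up where it left off.
export async function findPendingRuns(ticketId: string) {
  const runsRef = collection(db, "tickets", ticketId, "generationRuns");
  const pending = query(runsRef, where("status", "in", ["starting", "running"]));
  const snapshot = await getDocs(pending);
  return snapshot.docs.map((d) => ({ id: d.id, ...d.data() }));
}
```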
Real-Time Progress Bar
Updates based on workflow step completion
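The progress bar itself can stay thin: derive a percentage from completed steps and feed it to the existing Progress component the spec pointed at. This assumes a shadcn-style component taking a 0-100 `value` prop and an `@/` path alias; the step names are placeholders for the real workflow steps:

```tsx
import { Progress } from "@/components/ui/progress";

// Placeholder step names; the real workflow defines its own.
const WORKFLOW_STEPS = ["analyze", "draft", "refine", "persist"];

export function GenerationProgress({ completedSteps }: { completedSteps: string[] }) {
  const percent = Math.round((completedSteps.length / WORKFLOW_STEPS.length) * 100);
  return <Progress value={percent} aria-label="Ticket generation progress" />;
}
```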
Seamless UI Updates
Specification appears without page refresh
Cleanup for Failed Workflows
Handles temp runIds and stale documents
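Cleanup is the safety net for the orphaned-process risk called out in the spec. A sketch under the same assumed schema, with a hypothetical 10-minute staleness cutoff:

```ts
import { getFirestore, collection, deleteDoc, getDocs, query, where } from "firebase/firestore";

const db = getFirestore();
const STALE_AFTER_MS = 10 * 60 * 1000; // hypothetical cutoff; generation normally finishes in 60-90s

// Delete tracking documents whose workflow never reported completion,
// including temp runIds left behind by failed starts.
export async function cleanupStaleRuns(ticketId: string) {
  const runsRef = collection(db, "tickets", ticketId, "generationRuns");
  const inFlight = query(runsRef, where("status", "in", ["starting", "running"]));
  const snapshot = await getDocs(inFlight);
  const stale = snapshot.docs.filter((d) => {
    const createdAt = d.data().createdAt?.toMillis?.() ?? 0;
    return Date.now() - createdAt > STALE_AFTER_MS;
  });
  await Promise.all(stale.map((d) => deleteDoc(d.ref)));
}
```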
Architecture Decision: Client-Side Polling
The ticket suggested a "server-side queue", but we chose client-side polling with Firestore persistence instead (see the sketch after the lists below).
Why This Worked Better
- Simpler — no additional infrastructure
- More reliable — direct Mastra API polling
- Easier to debug and monitor
What We Avoided
- Complex webhook setup
- Additional server-side queue
- Network reliability concerns
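In practice the client-side approach reduces to a loop like the one below. `getRunStatus` is a stand-in for whichever status call the Mastra client (wired up in useMastraClient.ts) actually exposes, and the status shape and interval are assumptions:

```ts
// Minimal polling loop; status shape and interval are illustrative.
type RunStatus = {
  state: "running" | "completed" | "failed";
  completedSteps: string[];
};

export async function pollRun(
  getRunStatus: (runId: string) => Promise<RunStatus>,
  runId: string,
  onProgress: (status: RunStatus) => void,
  intervalMs = 2000,
): Promise<RunStatus> {
  // Poll until the workflow reaches a terminal state, reporting progress on
  // every tick so the UI and the Firestore tracking document stay current.
  for (;;) {
    const status = await getRunStatus(runId);
    onProgress(status);
    if (status.state !== "running") return status;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```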
The Results
Before
- 60+ seconds with no feedback
- Lost work if browser closed
- Users didn't know if it was working
- Had to stay on page entire time
After
- Real-time progress updates
- Work continues in background
- Clear visibility into each step
- Can close browser and return later
Key Takeaways
Lessons from building resilient AI workflows in a single session.
Structured Specifications Enable AI Collaboration
Well-structured ticket specifications (with acceptance criteria, tasks, dependencies, risks, and metrics) create a shared language between developers and AI. This structure eliminates ambiguity and enables smooth, iterative collaboration.
Clear Metrics Drive Success
Defining success metrics upfront (">95% completion rate", "<2s to first update") meant we knew exactly when we were done. We didn't just meet these targets — we exceeded them (100%, <1s).
Simpler Is Often Better
The ticket suggested a complex server-side queue, but client-side polling with Firestore persistence proved simpler, more reliable, and easier to maintain. Don't over-engineer when a straightforward solution works.
"The structured ticket spec was the difference between AI collaboration and AI confusion. Every decision, every risk, every metric was already documented — the AI just executed."
Ready to build with structured specifications?
See how QEEK can help your team write better tickets and ship faster.