Problem framing
"Our users are drowning in email."
This is a verbalized headache, not a problem statement. Who is the user — a Gmail power user with 200 emails/day, or an exec with 30 high-stakes threads? Their pain looks completely different. The PRD never says.
Reframe: "Mid-career managers who receive 80+ emails a day self-report missing the 3-5 messages that matter about twice a week." Now you have something you can measure.
Success metrics
"Increase DAU. Reduce inbox time."
Activity disguised as ambition. DAU goes up if the feature is annoying enough that people keep checking. Inbox time goes down if users abandon the app. Neither is the outcome you want.
Pick a metric that only moves if the feature is actually serving the user. Try: "median time-to-first-action on high-importance threads, measured at the user level." That goes down only if your model picks the right threads AND the product surfaces them well.
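To show how little it takes to pin this metric down, here is a minimal sketch. Everything in it is an assumption for illustration, not something the PRD specifies: the field names (`user_id`, `received_at`, `first_action_at`, `high_importance`), timestamps in seconds, and the reading of "at the user level" as "take each user's median first, then the median across users" so that heavy senders don't dominate.

```python
from collections import defaultdict
from statistics import median


def user_level_median_ttfa(threads):
    """Return (per-user medians, median of those medians) in seconds.

    `threads` is an iterable of dicts with hypothetical keys:
      user_id, received_at, first_action_at (None if never acted on),
      high_importance (bool).
    """
    per_user = defaultdict(list)
    for t in threads:
        # Only high-importance threads count toward this metric.
        if not t["high_importance"]:
            continue
        # Threads with no action yet are skipped in this sketch. A real
        # pipeline has to decide how to treat them, because dropping them
        # flatters the number.
        if t["first_action_at"] is None:
            continue
        per_user[t["user_id"]].append(t["first_action_at"] - t["received_at"])

    # Aggregate per user first so one heavy user cannot dominate.
    user_medians = {user: median(delays) for user, delays in per_user.items()}
    overall = median(user_medians.values()) if user_medians else None
    return user_medians, overall
```

Note the skipped-thread comment: if users ignore important threads entirely, this version improves. Decide how unanswered threads count before you commit to the metric.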
AI-specific gaps
The PRD doesn't answer:
- What's your eval set, and who labels truth? ("Important to whom" is the whole game.)
- What does the user experience when the model is wrong? Quiet failure (a mislabel nobody sees) or loud failure (the user notices)?
- What's the recovery? A "this wasn't important" button? An undo?
For an AI triage product, your eval plan IS your product plan. Spend a week on it before writing the second draft of this PRD.
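A first pass at that eval plan could be as small as the sketch below. It assumes a hypothetical eval set where each thread has two "important?" labels, one from the user and one from an outside annotator, plus the model's prediction. The function names and the two-labeler setup are illustrative choices, not something the PRD describes.

```python
def precision_recall(predicted, truth):
    """Precision and recall of boolean predictions against boolean labels."""
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p and t)
    fp = sum(1 for p, t in pairs if p and not t)
    fn = sum(1 for p, t in pairs if not p and t)
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return precision, recall


def agreement_rate(labels_a, labels_b):
    """Fraction of items on which two labelers gave the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and the same length")
    matches = sum(1 for a, b in zip(labels_a, labels_b) if a == b)
    return matches / len(labels_a)


# Hypothetical six-thread eval set (made-up labels, for illustration only).
user_labels = [True, True, False, False, True, False]
annotator_labels = [True, False, False, False, True, True]
model_flags = [True, True, False, True, True, False]

vs_user = precision_recall(model_flags, user_labels)
vs_annotator = precision_recall(model_flags, annotator_labels)
labeler_agreement = agreement_rate(user_labels, annotator_labels)
```

On this toy data the same model scores differently depending on whose labels count as truth, and the two labelers disagree on a third of the threads. That gap is the "important to whom" question in numbers.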
Bottom line
Right instinct, wrong altitude. Rewrite the Problem section so an angry user can recognize themselves. Replace the metrics with one outcome metric you can move only by being right. Add an Eval section — that's where this PRD will live or die.
