The $0.99 Resolution: Why Outcome Pricing Only Works When Outcomes Are Binary

Outcome-based AI pricing aligns incentives when outcomes are binary and measurable. Data from Intercom, Chargeflow, Sierra, Salesforce, and HubSpot reveals why definition clarity determines whether the model works or breaks.

Bear Lumen Team

Research

#outcome-based pricing #AI billing #Intercom #finance operations #resolution pricing

Intercom charges $0.99 per resolution. Chargeflow takes 25% of recovered chargebacks. Sierra AI crossed $150M in ARR selling outcomes to enterprise brands.

Outcome-based pricing has become the dominant model for AI agents in customer support, and the thesis is appealing: the vendor gets paid when the buyer gets value. Perfect alignment, at least on paper.

In practice, one question determines whether the model works or breaks: what counts as an outcome? The companies getting it right chose outcomes that are binary and measurable. A chargeback won or lost, a ticket resolved or escalated. The companies struggling chose outcomes that are subjective, opaque, or defined by the vendor who profits from counting more of them.


The Binary Test

Outcome pricing works when the outcome passes a two-part test: it's binary, meaning it happened or it didn't, and both sides can verify it independently.

Chargeflow passes cleanly. A chargeback is either won or lost, the credit card network decides rather than Chargeflow, and the merchant can verify the result in their own payment processor dashboard. Chargeflow takes 25% of recovered funds, and if the chargeback is lost, the merchant pays nothing. The incentive structure is airtight because the outcome is a fact, not an interpretation.

Sierra negotiates outcomes per contract. A "resolved support conversation" is defined in writing before the deal closes; a "saved cancellation" means the customer stayed; a "revenue event" means money moved. At $150K+ annual commitments, both sides invest in getting the definition right, so the outcome is negotiated but still binary once defined.

Intercom's Fin is where the line starts to blur. A "resolution" can be hard (the customer confirms) or soft (no follow-up within 24 hours). The hard resolution is binary. The soft resolution is an assumption, where silence within an arbitrary window gets treated as success.

That assumption carries a cost.


When Silence Passes for Success

One support team followed up with customers who received soft resolutions from Fin, and only 62% said their issue was actually resolved. A 38% false-positive rate, with every false positive billed at the full $0.99.
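The arithmetic behind that cost is short. The sketch below assumes the 62% confirmation rate from that one team's follow-up holds across all soft resolutions, which a single sample can't establish. The function name is ours, for illustration only.

```python
def price_per_confirmed_resolution(billed_price: float, confirmed_rate: float) -> float:
    """Effective price per resolution the customer would actually confirm.

    confirmed_rate is the share of billed soft resolutions that customers
    say were really resolved (0.62 in the follow-up sample above).
    """
    if not 0 < confirmed_rate <= 1:
        raise ValueError("confirmed_rate must be in (0, 1]")
    return billed_price / confirmed_rate


effective = price_per_confirmed_resolution(0.99, 0.62)
print(f"${effective:.2f} per confirmed soft resolution")  # prints "$1.60 per confirmed soft resolution"
```

On those assumptions, a $0.99 list price behaves more like $1.60 for every soft resolution that actually helped someone.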

Silence can mean the answer worked. It can also mean the customer gave up, called phone support instead, or tried the solution 25 hours later and found it broken. All four count as billable resolutions under a 24-hour window.

Community posts have flagged another edge case: Fin counts a resolution even when a human agent takes over the conversation, as long as the AI responded first. The AI's contribution might have been a generic greeting, but the resolution fee still applies.
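A buyer who wants to check the invoice could label conversations from their own records. The sketch below is one way to do that, not Intercom's logic: the field names, the label names, and the choice to mark human takeovers as "disputed" are all assumptions made for illustration.

```python
from typing import NamedTuple, Optional

SILENCE_WINDOW_HOURS = 24.0  # soft-resolution window described above


class Conversation(NamedTuple):
    # Hypothetical fields; a real helpdesk export will name these differently.
    customer_confirmed: bool               # customer explicitly said the answer helped
    hours_until_followup: Optional[float]  # None means the customer never came back
    human_took_over: bool                  # a human agent joined after the AI replied


def audit_label(conv: Conversation) -> str:
    """Label a conversation from the buyer's side, independent of the invoice."""
    if conv.customer_confirmed:
        return "hard"        # binary: the customer said so
    if conv.human_took_over:
        return "disputed"    # reportedly billed, but a human finished the job
    if conv.hours_until_followup is None or conv.hours_until_followup > SILENCE_WINDOW_HOURS:
        return "soft"        # silence inside the window, assumed to be success
    return "reopened"        # customer came back inside the window


sample = [
    Conversation(True, None, False),
    Conversation(False, 25.0, False),   # tried the fix 25 hours later
    Conversation(False, None, True),
    Conversation(False, 3.0, False),
]
print([audit_label(c) for c in sample])  # ['hard', 'soft', 'disputed', 'reopened']
```

Only the "hard" label is a fact both sides can verify. Everything else is a judgment call about what the silence or the handoff meant.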

At that point the outcome is no longer binary. It's probabilistic, and the entity assigning the probability is the same entity collecting the fee. With token-based billing you can at least count tokens yourself, and with seats you know how many you have. Resolution billing means trusting the vendor's counting methodology with no independent verification path.


Five Vendors, Five Definitions

The word "resolution" sounds like a standard. It isn't.

| Provider | Price | What Counts | Verification | Risk |
| --- | --- | --- | --- | --- |
| Intercom (Fin) | $0.99/resolution | Customer confirms OR 24h silence | 24-hour window | Pays for abandoned conversations |
| Zendesk | $1.50-$2.00/resolution | AI analyzes conversation for relevance | 72-hour AI evaluation | Black-box classification |
| HubSpot (Breeze) | $0.50/resolved conversation | "Conversation successfully completed" | Not disclosed | Lowest price, least transparency |
| Salesforce (Agentforce) | $2/conversation or $0.10/action | Completed conversation or discrete action | 24h inactivity | Simple query costs the same as complex case |
| Sierra | Custom | Contractually defined per deal | Agreed per contract | Requires $150K+ commitment |

Price points vary 4x for superficially similar products. HubSpot charges $0.50 and Zendesk charges up to $2.00, both resolving customer support tickets with AI, and the difference isn't quality. It's definition breadth and verification rigor.

Zendesk's 72-hour window with AI-based verification is more conservative than Intercom's 24-hour silence rule, but the AI that decides whether a resolution was "satisfactory" is a black box. You can't audit the classification logic. You can only compare your internal satisfaction scores against the invoice and hope they correlate.

Salesforce's trajectory tells the clearest story. The original $2-per-conversation model drew criticism because a simple order status check cost the same as a complex multi-step case, so Salesforce introduced $0.10 per action pricing. When the market leader changes its billing unit within 18 months, the model is still finding its footing.
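The two billing units cross over at a specific conversation length. The sketch below works in cents to avoid floating-point noise, and it assumes the two prices can be compared for the same workload. How many "actions" a given conversation consumes is not something the figures above tell us, so the action counts are hypothetical.

```python
PER_CONVERSATION_CENTS = 200  # $2 per conversation
PER_ACTION_CENTS = 10         # $0.10 per action


def cheaper_unit(actions_in_conversation: int) -> str:
    """Which billing unit costs less for a conversation with this many actions."""
    per_action_total = actions_in_conversation * PER_ACTION_CENTS
    if per_action_total < PER_CONVERSATION_CENTS:
        return "per-action"
    if per_action_total > PER_CONVERSATION_CENTS:
        return "per-conversation"
    return "equal"


break_even_actions = PER_CONVERSATION_CENTS // PER_ACTION_CENTS
print(break_even_actions)   # 20
print(cheaper_unit(2))      # per-action
print(cheaper_unit(35))     # per-conversation
```

Under these assumptions, a simple order status check with a couple of actions costs a fraction of $2, which is the complaint the per-action unit appears to address.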


The Risk Inversion

For vendors building on outcome pricing, there's a structural consequence worth sitting with. The model shifts risk from the buyer to the vendor. The buyer pays only when value is delivered, which is the selling point, and it's also the danger.

The vendor absorbs the cost of every failed attempt. If 30% of AI resolution attempts require human escalation, the vendor eats the inference cost on that 30% with zero revenue: the model inputs, the compute, the API calls to the underlying LLM, all consumed and none recovered. For the buyer this is elegant. For the vendor, every failed resolution is a sunk cost that never appears on the invoice but absolutely appears on the margin report.
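One way to put a number on that sunk cost is to spread the cost of every attempt over the attempts that get billed. The sketch below assumes failed and successful attempts cost the same to run, and the $0.35 cost per attempt is a made-up figure, not any vendor's actual inference cost.

```python
def cost_per_successful_outcome(cost_per_attempt: float, success_rate: float) -> float:
    """Spread the cost of every attempt, failed or not, over the billable ones."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


PRICE = 0.99             # per billed resolution, from the Intercom example
COST_PER_ATTEMPT = 0.35  # hypothetical inference cost per attempt
SUCCESS_RATE = 0.70      # 30% of attempts escalate to a human

unit_cost = cost_per_successful_outcome(COST_PER_ATTEMPT, SUCCESS_RATE)
print(f"cost per billable outcome: ${unit_cost:.2f}")
print(f"margin per billable outcome: ${PRICE - unit_cost:.2f}")
```

With these inputs, a $0.35 attempt becomes roughly a $0.50 billable outcome. The failure rate acts as a multiplier on cost that the invoice never shows.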

This is, frankly, how outcome pricing should work. The product mistake belongs to the builder, and the bill shouldn't. But absorbing variance deliberately is very different from absorbing it blindly.

And the margin dynamics compound in a way that's easy to miss. Better AI resolves more tickets, which is good, but better AI also attempts more complex tickets, which raises the failure rate on hard cases. The resolution rate improves on average while the cost per failed attempt grows at the tail. The aggregate number looks healthy. The per-outcome unit economics tell a different story.

One support team improved their help center content, making Fin more effective. Resolution rate jumped from 40% to 65%, and costs rose over 60%. The AI got better and the invoice got bigger, the same quality-cost inversion that hit GitHub Copilot: product improvement drives cost increase.
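The jump is what per-resolution billing predicts when volume stays flat. A minimal sketch, assuming a constant 1,000 conversations per month (a hypothetical volume) and the $0.99 price:

```python
def monthly_bill(conversations: int, resolution_rate: float,
                 price_per_resolution: float = 0.99) -> float:
    """Bill under per-resolution pricing: only resolved conversations are charged."""
    return conversations * resolution_rate * price_per_resolution


before = monthly_bill(1000, 0.40)
after = monthly_bill(1000, 0.65)
increase = (after - before) / before
print(f"{increase:.1%}")  # 62.5%
```

The increase depends only on the ratio of the two resolution rates (0.65 / 0.40), so the same 62.5% shows up at any volume and any per-resolution price.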


Forecasting in the Dark

Traditional SaaS costs are predictable: 50 seats at $100/month equals $5,000/month, every month. Outcome-based costs depend on customer behavior, issue complexity, AI effectiveness, and seasonality, all at once.

One founder's Intercom bill went from $200/month to $1,400 during a product launch, a 7x swing from a single external event. The spike isn't a bug. It's the model working as designed, and it's exactly the kind of number a CFO can't put in a quarterly plan. "Somewhere between $3,000 and $12,000 depending on how good our knowledge base gets" is not a number.
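The contrast with seat pricing is easy to see once both are written down. In the sketch below, the seat figures come from the example above, while the conversation volumes and resolution rates are hypothetical planning inputs chosen only to show how wide the range gets.

```python
from typing import Tuple


def seat_cost(seats: int, price_per_seat: float) -> float:
    """Seat pricing: one number, the same every month."""
    return seats * price_per_seat


def outcome_cost_range(conversations: Tuple[int, int],
                       resolution_rates: Tuple[float, float],
                       price_per_resolution: float = 0.99) -> Tuple[float, float]:
    """Outcome pricing: a range driven by volume and AI effectiveness together."""
    low = conversations[0] * resolution_rates[0] * price_per_resolution
    high = conversations[1] * resolution_rates[1] * price_per_resolution
    return low, high


print(seat_cost(50, 100.0))  # 5000.0

low, high = outcome_cost_range(conversations=(8_000, 15_000),
                               resolution_rates=(0.40, 0.65))
print(f"between ${low:,.0f} and ${high:,.0f}")
```

Two uncertain inputs multiply, so modest ranges on each produce a bill that can roughly triple between the low and high case.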

Resolution costs also compound with growth. More customers means more interactions means more resolutions, and unlike infrastructure there are no real economies of scale on resolutions at most providers. Zendesk offers committed rates ($1.50 vs $2.00), but the discount requires pre-purchasing volume, which only works if you can forecast volume accurately. That makes the problem circular: the discount depends on a forecast, and the forecast is exactly what outcome pricing makes hard.
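How accurate does the forecast need to be for a commitment to pay off? The sketch below uses the $1.50 and $2.00 rates and adds two assumptions that are ours, not Zendesk's published terms: unused committed volume is forfeited, and usage above the commitment is billed at the $2.00 rate.

```python
def pay_as_you_go(resolutions: int, rate: float = 2.00) -> float:
    return resolutions * rate


def committed(resolutions: int, committed_volume: int,
              committed_rate: float = 1.50, overage_rate: float = 2.00) -> float:
    """Pre-purchased volume is paid in full; anything beyond it is overage."""
    overage = max(0, resolutions - committed_volume)
    return committed_volume * committed_rate + overage * overage_rate


COMMITMENT = 1_000  # hypothetical pre-purchased resolutions per month
for actual in (500, 750, 1_200):
    print(actual, pay_as_you_go(actual), committed(actual, COMMITMENT))
# 500 1000.0 1500.0
# 750 1500.0 1500.0
# 1200 2400.0 1900.0
```

Under these terms the commitment only wins if actual usage lands above 75% of the committed volume (1.50 / 2.00). A forecast that overshoots by more than a third turns the discount into a surcharge.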

For vendors, the forecasting problem is worse. The buyer's unpredictable volume is at least bounded by their customer base. The vendor's cost per outcome is bounded by nothing: an upstream model price increase, a shift in ticket complexity, a regression in the knowledge base, and the cost-per-outcome number moves without the vendor changing a single line of code.


The Dividing Line

Outcome pricing is, we'd argue, the future for AI products that deliver measurable results. No other model ties payment so directly to impact, and the alignment between vendor revenue and buyer value is real.

But the model has a prerequisite most vendors skip: cost-per-outcome visibility.

The companies succeeding with it, Sierra and Chargeflow among them, share two traits. Their outcomes are binary and independently verifiable, and they know precisely what each outcome costs them to deliver. Sierra negotiates the definition; Chargeflow lets the card network decide. Neither leaves the outcome definition to a probabilistic model controlled by the party collecting the fee.

The vendors struggling, or pivoting in Salesforce's case, share the opposite traits. The outcome is fuzzy, the measurement is opaque, and nobody on the vendor side can answer the most important question in the business: what does it cost us to deliver one successful outcome?

Without that number, outcome pricing is a promise without a foundation. You're guaranteeing results to the buyer while flying blind on what those results cost you, and when the margin report arrives, the gap between "resolutions billed" and "resolutions that cost less than $0.99 to deliver" decides whether the model is sustainable or quietly bleeding cash.
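That gap can be computed once each billed resolution carries its own delivery cost. The sketch below assumes such per-outcome costs already exist, with failed attempts attributed to them. The function name and the sample costs are hypothetical.

```python
from typing import Dict, List


def outcome_margin_report(delivery_costs: List[float],
                          price: float = 0.99) -> Dict[str, float]:
    """Compare resolutions billed with resolutions that cost less than the price."""
    profitable = sum(1 for cost in delivery_costs if cost < price)
    total_margin = sum(price - cost for cost in delivery_costs)
    return {
        "billed": len(delivery_costs),
        "profitable": profitable,
        "total_margin": round(total_margin, 2),
    }


# Hypothetical fully loaded cost of five billed resolutions, in dollars.
report = outcome_margin_report([0.12, 0.30, 0.45, 1.40, 2.10])
print(report)  # {'billed': 5, 'profitable': 3, 'total_margin': 0.58}
```

In this made-up sample, two expensive resolutions consume most of the margin earned by the other three, which an average cost per resolution would hide.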

If you're building toward outcome pricing, Bear Lumen connects inference costs to individual outcomes for you, so you see your true cost per resolution continuously instead of discovering it on the margin report. The pricing framework covers where that number fits in the rest of the pricing decision.