Operations Metrics
On this page
- Collections: Contact rate over 50%, promise-kept over 60%, recovery rate over 5%
- Monitoring: Alert response under 15 min (critical), false alert rate under 20%
- Reconciliation: Match rate over 99%, exception rate under 1%, resolution under 24h
- Incident: MTTD under 15 min, MTTR under 30 min, recurrence under 10%
- Related: Fraud Metrics, Chargeback Metrics
Operations metrics tell you whether your payment infrastructure is running smoothly. Fraud and chargeback metrics measure outcomes; operations metrics measure the machine. If reconciliation is breaking, alerts are slow, or incidents keep recurring, your outcomes will suffer no matter how good your fraud rules or chargeback representment are.
Most SMBs don't need all of these on day one. Start with reconciliation and monitoring metrics. Add the rest as your volume and team grow.
Collections Metrics
Collections metrics apply only if you invoice customers, offer payment plans, or need to recover failed payments (including subscription dunning). If you're a standard e-commerce merchant where customers pay at checkout, skip to Monitoring Metrics below.
| Metric | Definition | Target |
|---|---|---|
| Contact rate | Reached / Attempted | >50% |
| Promise-to-pay rate | Promises / Contacts | >30% |
| Promise kept rate | Kept / Promised | >60% |
| Roll rate | Moved to worse bucket / Total in bucket | Under 40% |
| Recovery rate | Recovered $ / Charged-off $ | >5% |
How to Read Collections Metrics
The collections funnel works like a sales funnel: each step has a conversion rate, and the product of all steps gives you your overall recovery.
Example:
100 past-due accounts
x 55% contact rate = 55 reached
x 35% promise-to-pay = 19 promises
x 65% promise kept = 12 recovered
Overall recovery: 12 / 100 = 12%
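The same funnel arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the function name `collections_funnel` and its parameters are made up for this example, and each stage is rounded to whole accounts to match the figures above.

```python
def collections_funnel(accounts: int, contact_rate: float,
                       promise_rate: float, kept_rate: float) -> dict:
    """Walk past-due accounts through the funnel, one stage at a time.

    Each stage is rounded to whole accounts, as in the worked example.
    """
    reached = round(accounts * contact_rate)
    promises = round(reached * promise_rate)
    recovered = round(promises * kept_rate)
    overall = recovered / accounts if accounts else 0.0
    return {
        "reached": reached,
        "promises": promises,
        "recovered": recovered,
        "overall_recovery": overall,
    }


# Rates taken from the example: 55% contact, 35% promise-to-pay, 65% kept.
result = collections_funnel(100, 0.55, 0.35, 0.65)
print(result)
```

Because the stages multiply, a small improvement at the top of the funnel (contact rate) carries through every later step.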
What "recovery rate" means for different business types:
| Business Type | What You're Recovering | Typical Rate |
|---|---|---|
| Subscriptions (failed payment dunning) | Expired/declined card retries | 10-30% |
| Invoicing (net-30/60 terms) | Late payments | 85-95% (most pay eventually) |
| Bad debt (charged-off accounts) | Accounts sent to collections | 5-15% |
For most SMBs, the collections metric that matters most is dunning recovery rate: what percentage of failed subscription payments you recover through retry logic and customer outreach. See Failed Payment Collection for dunning strategies.
When Collections Metrics Signal a Problem
- Contact rate under 30%. Your email is going to spam, your phone numbers are wrong, or customers are avoiding you. Check delivery rates and try alternate channels.
- Promise-kept rate under 40%. Customers are agreeing to pay and then not following through. Shorten the payment window, offer immediate payment links, or require partial payment upfront.
- Roll rate over 50%. Accounts are aging faster than you're recovering them. Your dunning sequence may need more urgency or earlier escalation.
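If you already compute these three rates, the warning thresholds above can be checked automatically. The sketch below assumes rates are expressed as fractions (0.30 for 30%); the function name and warning strings are illustrative only.

```python
def collections_warnings(contact_rate: float, promise_kept_rate: float,
                         roll_rate: float) -> list[str]:
    """Return a warning for each rate that crosses a problem threshold."""
    warnings = []
    if contact_rate < 0.30:
        warnings.append("contact rate under 30%: check deliverability and channels")
    if promise_kept_rate < 0.40:
        warnings.append("promise-kept rate under 40%: shorten the payment window")
    if roll_rate > 0.50:
        warnings.append("roll rate over 50%: escalate earlier in the dunning sequence")
    return warnings


# Example: poor contact rate and high roll rate, healthy promise-kept rate.
flags = collections_warnings(contact_rate=0.25, promise_kept_rate=0.65, roll_rate=0.55)
for flag in flags:
    print(flag)
```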
Monitoring Metrics
These measure how quickly you detect and respond to issues in your payment operations.
| Metric | Definition | Target |
|---|---|---|
| Alert response time | Time to acknowledge alert | Under 15 min (critical) |
| False alert rate | False alerts / Total alerts | Under 20% |
| Mean time to detect | Issue start to alert firing | Under 5 min |
| Dashboard uptime | Available time / Total time | >99.9% |
What to Monitor (and How to Know If It's Working)
| What You Monitor | Why | Alert Threshold |
|---|---|---|
| Authorization rate | Sudden drop = processor issue or rule problem | Drops more than 5 percentage points from baseline |
| Payout arrival | Late payouts = cash flow risk | Any payout more than 1 business day late |
| Chargeback volume | Spike = fraud attack or fulfillment issue | More than 2x your daily average |
| Decline rate | Spike = false positive rule or issuer problem | Rises more than 3 percentage points |
| Processor status | Downtime = lost revenue | Any outage notification |
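The numeric thresholds in the table translate directly into a daily check. This is a rough sketch under stated assumptions: `DailySnapshot` and `check_thresholds` are hypothetical names, rates are in percent, and you supply your own baseline and daily chargeback average from whatever reporting you already have.

```python
from dataclasses import dataclass


@dataclass
class DailySnapshot:
    auth_rate: float     # percent of authorization attempts approved
    decline_rate: float  # percent of attempts declined
    chargebacks: int     # chargebacks filed that day


def check_thresholds(today: DailySnapshot, baseline: DailySnapshot,
                     avg_daily_chargebacks: float) -> list[str]:
    """Compare today's numbers to baseline using the table's thresholds."""
    alerts = []
    # Authorization rate: drop of more than 5 percentage points.
    if baseline.auth_rate - today.auth_rate > 5.0:
        alerts.append("auth_rate_drop")
    # Decline rate: rise of more than 3 percentage points.
    if today.decline_rate - baseline.decline_rate > 3.0:
        alerts.append("decline_rate_rise")
    # Chargeback volume: more than 2x the daily average.
    if today.chargebacks > 2 * avg_daily_chargebacks:
        alerts.append("chargeback_spike")
    return alerts


baseline = DailySnapshot(auth_rate=92.0, decline_rate=8.0, chargebacks=2)
today = DailySnapshot(auth_rate=85.5, decline_rate=14.5, chargebacks=5)
alerts = check_thresholds(today, baseline, avg_daily_chargebacks=2.0)
print(alerts)
```

Payout arrival and processor status are not covered here because they depend on bank and status-page data rather than a simple rate comparison.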
For SMBs: What "Monitoring" Actually Looks Like
If you're running under $100K/month, "monitoring" doesn't mean a dashboard with real-time charts. It means:
- Check your processor dashboard daily. 5 minutes. Look at yesterday's sales count, decline count, and any flagged transactions.
- Set up email alerts. Stripe, Square, and most processors can email you when a chargeback is filed or a payout fails.
- Review payouts weekly. Make sure the money hitting your bank matches what your processor dashboard says.
That's it. As you grow, add more:
| Volume | Monitoring Level |
|---|---|
| Under $50K/month | Daily dashboard check + email alerts |
| $50K-$250K/month | Add weekly reconciliation + chargeback ratio tracking |
| $250K-$1M/month | Add automated alerts on auth rate, decline rate, fraud rate |
| Over $1M/month | Real-time dashboards, dedicated ops person, SLA tracking |
False Alert Rate: The Alert Fatigue Problem
False Alert Rate = Alerts that required no action / Total alerts
Example:
You receive 20 alerts per week.
4 require investigation or action.
16 are noise (normal fluctuations, known issues, duplicates).
False alert rate = 16 / 20 = 80% (way too high)
If your false alert rate is above 30%, you'll start ignoring alerts, and the one time a real problem occurs, you'll miss it. Fix by:
- Raising thresholds so only meaningful deviations trigger alerts
- Deduplicating alerts (one alert per issue, not one per occurrence)
- Adding time-based suppression (don't alert on a 5-minute blip; alert on a 30-minute trend)
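The formula and two of the fixes above (deduplication and time-based suppression) can be sketched as follows. All names are illustrative. The suppression check assumes you sample a metric at a fixed interval, so a `window` of 6 samples at 5-minute spacing corresponds to a 30-minute trend.

```python
def false_alert_rate(total_alerts: int, actionable_alerts: int) -> float:
    """Share of alerts that required no action."""
    if total_alerts == 0:
        return 0.0
    return (total_alerts - actionable_alerts) / total_alerts


def dedupe_by_issue(alerts: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Keep the first alert per issue key; alerts are (issue_key, message) pairs."""
    seen: set[str] = set()
    unique = []
    for issue_key, message in alerts:
        if issue_key not in seen:
            seen.add(issue_key)
            unique.append((issue_key, message))
    return unique


def sustained_breach(samples: list[float], threshold: float, window: int) -> bool:
    """True only if the last `window` samples are all below the threshold.

    A single low sample (a blip) does not trigger; a sustained run does.
    """
    if len(samples) < window:
        return False
    return all(value < threshold for value in samples[-window:])


print(false_alert_rate(total_alerts=20, actionable_alerts=4))   # the example above
blip = [92, 91, 80, 92, 93, 92, 91]    # one bad sample, then recovery
trend = [92, 85, 84, 83, 85, 84, 82]   # six consecutive low samples
print(sustained_breach(blip, threshold=87, window=6))
print(sustained_breach(trend, threshold=87, window=6))
```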
Reconciliation Metrics
Reconciliation is matching your processor's records against your bank deposits to make sure you received the money you're owed.
| Metric | Definition | Target |
|---|---|---|
| Match rate | Auto-matched / Total items | >99% |
| Exception rate | Manual review needed / Total | Under 1% |
| Resolution time | Exception to resolution | Under 24 hours |
| Aged exceptions | Exceptions >3 days old | 0 |
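As a rough illustration of how the table's definitions turn into numbers, the sketch below computes match rate, exception rate, and aged exceptions. It assumes every item that is not auto-matched needs manual review, and that you track the date each open exception was raised; the function and field names are hypothetical.

```python
from datetime import date


def reconciliation_metrics(total_items: int, auto_matched: int,
                           open_exception_dates: list[date],
                           today: date) -> dict:
    """Compute the reconciliation metrics defined in the table."""
    if total_items <= 0:
        raise ValueError("total_items must be positive")
    exceptions = total_items - auto_matched
    # Aged exceptions: still open and more than 3 days old.
    aged = sum(1 for opened in open_exception_dates if (today - opened).days > 3)
    return {
        "match_rate": auto_matched / total_items,
        "exception_rate": exceptions / total_items,
        "aged_exceptions": aged,
    }


recon = reconciliation_metrics(
    total_items=1000,
    auto_matched=994,
    open_exception_dates=[date(2024, 3, 1), date(2024, 3, 6)],
    today=date(2024, 3, 7),
)
print(recon)
```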
How Reconciliation Works
| Your records say | Processor says | Bank deposit |
|---|---|---|
| 100 transactions | 100 transactions | $9,500 (after fees) |
| $10,000 gross | $10,000 gross | |
| $500 in fees | $500 in fees | |
When all three match, you're reconciled. When they don't, you have an exception to investigate.
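A three-way check like this one is simple to express in code. The sketch below is one possible shape, not a standard API: `three_way_check` is a made-up name, and amounts are held as integer cents to avoid floating-point rounding on money.

```python
def three_way_check(ledger_gross_cents: int, ledger_fees_cents: int,
                    processor_gross_cents: int, processor_fees_cents: int,
                    bank_deposit_cents: int) -> list[str]:
    """Return the list of exceptions; an empty list means reconciled."""
    exceptions = []
    if ledger_gross_cents != processor_gross_cents:
        exceptions.append("gross_mismatch")
    if ledger_fees_cents != processor_fees_cents:
        exceptions.append("fee_mismatch")
    # The bank should receive the processor's gross minus its fees.
    expected_deposit = processor_gross_cents - processor_fees_cents
    if bank_deposit_cents != expected_deposit:
        exceptions.append("deposit_mismatch")
    return exceptions


# The example above: $10,000 gross, $500 fees, $9,500 deposited.
print(three_way_check(1_000_000, 50_000, 1_000_000, 50_000, 950_000))
# Same records, but only $9,000 arrived (for example, a processor hold).
print(three_way_check(1_000_000, 50_000, 1_000_000, 50_000, 900_000))
```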
Common reconciliation exceptions and what causes them:
| Exception | Common Cause | Resolution |
|---|---|---|
| Bank deposit doesn't match processor total | Processor held funds (reserve, rolling hold) | Check processor dashboard for holds |
| Transaction in your records but not processor's | Transaction voided or not captured | Check auth-vs-capture timing |
| Processor shows a transaction you don't have | Duplicate charge, test transaction, or glitch | Investigate and refund if needed |
| Fees don't match expected amount | Downgrade, rate change, or chargeback fee | Compare fee schedule to actual charges |
For SMBs: Minimum Viable Reconciliation
At minimum, check weekly:
- Do your payouts match your sales? Compare your processor dashboard's payout report against your bank statement. If the numbers are off by more than your expected fee percentage, investigate.
- Are all payouts arriving on time? Your processor's payout schedule (usually T+2 business days) should be consistent. Late payouts can signal a hold or an issue.
- Are there any unexpected fees? Look for chargeback fees, PCI non-compliance fees, or statement fees you didn't expect.
This takes 10 minutes per week. It catches 90% of issues before they become serious.
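The "on time" check can be automated if you export sale and arrival dates. The sketch below assumes a T+2 business-day schedule and counts only weekends as non-business days; bank holidays are deliberately ignored, so treat a "late" result near a holiday with caution. Function names are illustrative.

```python
from datetime import date, timedelta


def add_business_days(start: date, days: int) -> date:
    """Advance `days` business days from `start`, skipping Saturdays and Sundays."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            remaining -= 1
    return current


def payout_is_late(sale_date: date, arrival_date: date, settle_days: int = 2) -> bool:
    """True if the payout arrived after the expected settlement date."""
    return arrival_date > add_business_days(sale_date, settle_days)


thursday = date(2024, 3, 7)
print(add_business_days(thursday, 2))                 # lands on the following Monday
print(payout_is_late(thursday, date(2024, 3, 11)))    # arrived on the expected date
print(payout_is_late(thursday, date(2024, 3, 12)))    # arrived one day after
```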
Incident Metrics
An "incident" is anything that disrupts your ability to process payments: processor outage, sudden decline spike, fraud attack, payout failure, or integration error.
| Metric | Definition | Target |
|---|---|---|
| Mean time to detect (MTTD) | Incident start to detection | Under 15 min |
| Mean time to respond (MTTR) | Detection to response start | Under 30 min |
| Mean time to resolve | Detection to resolution | Under 4 hours |
| Incident recurrence rate | Repeat incidents / Total | Under 10% |
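If you keep even a simple incident log, these four metrics fall out of the timestamps. The sketch below is one way to compute them. The `Incident` record and its fields are hypothetical, the sample log is invented for illustration, and "repeat" is assumed to mean an incident whose type has already appeared earlier in the log.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    kind: str                    # e.g. "processor_outage"
    started: datetime
    detected: datetime
    response_started: datetime
    resolved: datetime


def mean_minutes(deltas: list[timedelta]) -> float:
    return sum(d.total_seconds() for d in deltas) / 60 / len(deltas)


def incident_metrics(incidents: list[Incident]) -> dict:
    """Compute MTTD, MTTR (respond), mean time to resolve, and recurrence."""
    if not incidents:
        raise ValueError("need at least one incident")
    seen_kinds: set[str] = set()
    repeats = 0
    for inc in sorted(incidents, key=lambda i: i.started):
        if inc.kind in seen_kinds:
            repeats += 1
        seen_kinds.add(inc.kind)
    return {
        "mttd_min": mean_minutes([i.detected - i.started for i in incidents]),
        "mttr_respond_min": mean_minutes(
            [i.response_started - i.detected for i in incidents]),
        "mean_resolve_min": mean_minutes([i.resolved - i.detected for i in incidents]),
        "recurrence_rate": repeats / len(incidents),
    }


def at(day: int, hour: int, minute: int) -> datetime:
    return datetime(2024, 3, day, hour, minute)


# Invented sample log: two processor outages and one decline spike.
log = [
    Incident("processor_outage", at(1, 10, 0), at(1, 10, 10), at(1, 10, 30), at(1, 12, 10)),
    Incident("decline_spike", at(2, 9, 0), at(2, 9, 20), at(2, 9, 30), at(2, 10, 20)),
    Incident("processor_outage", at(3, 14, 0), at(3, 14, 6), at(3, 14, 26), at(3, 15, 6)),
]
metrics = incident_metrics(log)
print(metrics)
```

In this sample, the second processor outage counts as a repeat, so one of three incidents recurred, well above the 10% target.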
Common Payment Incidents and Response
| Incident | Detection Method | First Response |
|---|---|---|
| Processor outage | Auth rate drops to 0% | Check processor status page, notify customers, enable backup processor if available |
| Fraud spike | Chargeback alerts or sudden block rate increase | Tighten rules, check for card testing, review recent approvals |
| Payout failure | Bank deposit missing on expected date | Contact processor support, check for holds or compliance issues |
| Integration error | Checkout errors spike in your logs | Roll back recent code changes, check API status |
| Decline spike | Auth rate drops 10+ percentage points within hours | Check if a fraud rule is over-triggering, contact processor if systemic |
Post-Incident: The Only Metric That Prevents Repeats
Recurrence rate measures whether the same type of incident keeps happening. If it's above 10%, you're treating symptoms instead of causes.
After every significant incident, document:
- What happened
- When you detected it
- How you fixed it
- What you'll change to prevent it from happening again
The prevention step is the one most teams skip. A processor outage might prompt you to set up a backup processor. A fraud spike might prompt you to add a velocity rule that would have caught it earlier.
Process Metrics
| Metric | Definition | Target |
|---|---|---|
| SLA adherence | Within SLA / Total | >95% |
| Process error rate | Errors / Total actions | Under 1% |
| Escalation rate | Escalated / Total | Under 5% |
| Documentation completeness | Required fields complete | 100% |
These matter most for teams with more than one person handling payment operations. If you're a solo operator, focus on reconciliation and monitoring metrics first and add process metrics when you hire your second person.
SLA Adherence: What SLAs to Set
| Process | Reasonable SLA |
|---|---|
| Chargeback response | Within 5 business days of receipt |
| Refund processing | Within 2 business days of approval |
| Customer escalation | Same business day |
| Reconciliation exception | Resolved within 24 hours |
| Fraud alert review | Within 4 hours (business hours) |
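SLA adherence is the share of items completed within the agreed window. The sketch below measures it in elapsed hours, which suits the 24-hour and 4-hour SLAs in the table; SLAs stated in business days would need a business-day calculation instead. The function name and sample durations are illustrative.

```python
def sla_adherence(durations_hours: list[float], sla_hours: float) -> float:
    """Fraction of items completed within the SLA (Within SLA / Total)."""
    if not durations_hours:
        return 1.0  # nothing was due, so nothing was missed
    within = sum(1 for hours in durations_hours if hours <= sla_hours)
    return within / len(durations_hours)


# Invented example: four reconciliation exceptions against a 24-hour SLA.
resolution_hours = [2.0, 30.0, 12.0, 23.5]
adherence = sla_adherence(resolution_hours, sla_hours=24)
print(f"{adherence:.0%}")  # compare against the >95% target
```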
Staffing Metrics
| Metric | Definition | Notes |
|---|---|---|
| Cases per analyst | Active cases / FTE | Track trend |
| Utilization rate | Productive time / Available | 75-85% target |
| Training completion | Completed / Required | 100% |
| Quality score | Accuracy on audited cases | >95% |
Staffing metrics are relevant when you have dedicated payment operations staff. For most SMBs under $1M/month, payment operations is a part-time responsibility, not a full-time role.
When to Hire for Payment Operations
| Volume | Who Handles Payments |
|---|---|
| Under $100K/month | Founder or office manager (1-2 hours/week) |
| $100K-$500K/month | Designated person, part-time (5-10 hours/week) |
| $500K-$2M/month | Full-time operations person |
| Over $2M/month | Operations team (2-3 people) |
See Who Owns What for role definitions and Scaling Milestones for when to add headcount.
Test to Run (2 Weeks)
Reconciliation accuracy check:
- Pick one week of payouts from your processor.
- Match each payout against your bank deposits. Do the amounts match (within expected fee range)?
- For any mismatch, identify the cause: held funds, chargeback deduction, fee discrepancy, or timing difference.
- Track how long each mismatch takes to resolve.
- Set up a simple weekly check going forward (10 to 15 minutes, same day each week).
Success criteria: All payouts match bank deposits within your expected fee range. Any exceptions are explained and resolved within 24 hours. If you find unexplained discrepancies, contact your processor.
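If your processor and bank both export CSVs, the matching step in this test can be sketched as below. It is a minimal illustration: amounts are integer cents, `tolerance_cents` stands in for your expected fee variance, and matching is a simple first-fit on amount with no date logic. `match_payouts` is a made-up name.

```python
def match_payouts(payouts_cents: list[int], deposits_cents: list[int],
                  tolerance_cents: int = 0) -> tuple[list[int], list[int]]:
    """Pair each payout with one bank deposit of (nearly) the same amount.

    Returns (unmatched_payouts, unmatched_deposits); both empty means clean.
    """
    remaining = list(deposits_cents)
    unmatched_payouts = []
    for payout in payouts_cents:
        hit = next((d for d in remaining if abs(d - payout) <= tolerance_cents), None)
        if hit is None:
            unmatched_payouts.append(payout)
        else:
            remaining.remove(hit)  # each deposit can match only one payout
    return unmatched_payouts, remaining


# Invented week: three payouts, one of which arrived $30 short.
missing, extra = match_payouts([95_000, 120_000, 43_000], [120_000, 95_000, 40_000])
print(missing, extra)
```

Each leftover pair is an exception to explain: here the $430 payout and the $400 deposit likely belong together, with a $30 deduction to trace.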
Where This Breaks
- Solo operator trying to track everything. Start with reconciliation and monitoring only. Add incident, process, and staffing metrics as your team grows. Tracking 25 metrics with no team to act on them is busywork.
- Alerts without action. Monitoring metrics are worthless if nobody responds. Before adding a new alert, ask: "Who will see this, and what will they do?" If the answer is "nobody" and "nothing," don't create it.
- Measuring process metrics without process. If your chargeback response workflow is "whoever notices it handles it," process error rate and SLA adherence are meaningless. Document the process first, then measure it.
- Seasonal distortion. Holiday volume spikes can make incident rates and exception rates look better (lower percentage on higher volume) while masking real problems. Compare like periods: this January vs. last January, not January vs. December.
- Vendor dashboard as single source of truth. Your processor's dashboard may not show holds, reserves, or mid-cycle fee changes clearly. Reconcile against your bank statement, not just the processor report.
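For the seasonal-distortion point above, comparing like periods is easy to automate once you store one rate per month. The sketch below assumes a dictionary keyed by `(year, month)`; the function name and the sample exception rates are invented for illustration.

```python
from typing import Optional


def like_period_delta(rates: dict[tuple[int, int], float],
                      year: int, month: int) -> Optional[float]:
    """Percentage-point change versus the same month one year earlier.

    Returns None when either period is missing.
    """
    current = rates.get((year, month))
    prior = rates.get((year - 1, month))
    if current is None or prior is None:
        return None
    return current - prior


# Invented monthly exception rates, in percent.
exception_rates = {(2023, 1): 0.8, (2023, 12): 0.5, (2024, 1): 1.1}
delta = like_period_delta(exception_rates, 2024, 1)
print(delta)
```

Against December, January looks like a sharp deterioration; against the previous January, the change is smaller and reflects the same seasonal mix.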
Next Steps
New to operations metrics?
- Start with the operations checklist - Daily and weekly monitoring tasks
- Understand benchmarks - Industry comparison standards
Setting up monitoring?
- Configure alerts - Set up notifications for key thresholds
- Review processor management - Acquirer relationship metrics
Tracking specific areas?
- Chargeback metrics - Dispute ratio monitoring
- Fraud metrics - Fraud rate tracking
- Reading statements - Understanding your costs
Tracking operations is just one piece. See also: Payments Metrics · Fraud Metrics · Chargeback Metrics · Compliance Metrics
Related Topics
- Fraud Metrics - Fraud rate tracking
- Chargeback Metrics - Dispute ratios
- Compliance Metrics - Compliance KPIs
- Processor Management - Acquirer relationships
- Reading Statements - Understanding costs
- ACH Operations - Bank payment metrics
- Terminal Operations - CP metrics
- Representment Workflow - Win rate tracking
- Benchmarks - Industry comparisons
- Who Owns What - Ownership clarity
- Operations Checklist - Daily and weekly monitoring
- Scaling Milestones - Growth thresholds
- Running Fraud Operations - Operational cadence