Rules vs. ML

Prerequisites

Before choosing detection approaches, understand:

TL;DR
  • Rules: Fast to deploy, interpretable, great for known patterns and regulatory requirements
  • ML: Better for complex patterns, scale, and novel fraud, but needs data science expertise
  • Best approach: Hybrid, with hard rules (blocklists, OFAC) → ML scoring → soft rules (thresholds, overrides)
  • Start with rules: Rely on your processor's ML (Stripe Radar, Adyen Risk) until you have data science resources
  • Layer in velocity rules, device fingerprinting, and behavioral analytics

This guide compares rule-based and machine learning approaches to fraud detection and shows how to combine them.

Overview

Both rules and machine learning have their place in fraud detection. The question isn't "which one" but "how to combine them effectively."

Comparison

Aspect            | Rules                   | Machine Learning
Interpretability  | High – clear logic      | Lower – "black box"
Speed to deploy   | Fast – hours/days       | Slower – weeks/months
Maintenance       | Manual updates          | Retraining required
Novel fraud       | Misses new patterns     | Can detect anomalies
Known fraud       | Excellent               | Good
False positives   | Higher if too strict    | Optimizable
Expertise needed  | Domain knowledge        | Data science + domain

When to Use Rules

Rules excel when:

Known Fraud Patterns

IF transaction_country != billing_country
AND account_age < 7_days
AND transaction_amount > $500
THEN block
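
As a rough illustration, the rule above could be written as a plain predicate. This is a minimal sketch: the `Transaction` class and its field names are assumptions made for the example, not part of any particular fraud platform.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    # Hypothetical fields chosen to mirror the rule's conditions.
    transaction_country: str
    billing_country: str
    account_age_days: int
    amount_usd: float


def should_block(tx: Transaction) -> bool:
    """Return True only when all three conditions of the rule hold."""
    return (
        tx.transaction_country != tx.billing_country
        and tx.account_age_days < 7
        and tx.amount_usd > 500
    )


# A new account buying from a different country trips the rule;
# the same purchase from a long-standing account does not.
risky = Transaction("BR", "US", account_age_days=2, amount_usd=750.0)
established = Transaction("BR", "US", account_age_days=400, amount_usd=750.0)
print(should_block(risky), should_block(established))
```

Keeping each rule as a small pure function makes it easy to unit test and to read during an investigation.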

Regulatory Requirements

IF customer_on_OFAC_list = true
THEN block (no exceptions)

See AML Basics for compliance requirements.

Business Logic

IF order_contains(gift_cards) 
AND order_total > $1000
AND first_order = true
THEN manual_review

Immediate Response

When you discover a new fraud pattern, a rule can be deployed in minutes, well before a model could be retrained.

When to Use ML

Machine learning excels when:

Pattern Complexity

  • Hundreds of features interacting
  • Non-linear relationships
  • Patterns too complex for human rule-writing

Scale

  • Millions of transactions
  • Need for real-time scoring
  • Too many segments for manual rules

Evolution

  • Fraud patterns shifting constantly
  • Need to catch novel approaches
  • Want to optimize over time

Probability Needed

  • Gradated risk scores (not just yes/no)
  • Threshold tuning required
  • Different actions at different confidence levels
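
One way to picture "different actions at different confidence levels" is a small mapping from score to action. The sketch below assumes a 0 to 100 score scale; the `review_at` and `block_at` defaults are illustrative values to be tuned against your own data.

```python
def action_for_score(score: float, review_at: float = 50.0, block_at: float = 80.0) -> str:
    """Map a model risk score (assumed 0-100) to one of three actions."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score > block_at:
        return "block"
    if score >= review_at:
        return "manual_review"
    return "approve"


for s in (10, 65, 92):
    print(s, action_for_score(s))
```

Because the thresholds are parameters, they can be adjusted without touching the model itself.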

The Hybrid Approach

Most production systems combine both:

Layer 1: Hard Rules (Pre-ML)

  • Blocklists
  • Regulatory blocks such as OFAC screening (no exceptions)

Layer 2: ML Scoring

  • Probability of fraud
  • Contextual risk assessment
  • Feature-rich evaluation

Layer 3: Soft Rules (Post-ML)

  • Threshold application (e.g. score > 80 → block); see risk scoring
  • Override rules (VIP customers)
  • Manual review triggers
  • Business logic gates
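
The three layers can be sketched as a single decision function. Everything named here is hypothetical: the order fields, the `BLOCKLISTED_EMAILS` set, the `toy_model` scorer, the review threshold of 50, and the choice to downgrade a VIP block to manual review are all assumptions for illustration.

```python
from typing import Callable

# Hypothetical blocklist; in practice this would come from a managed data store.
BLOCKLISTED_EMAILS = {"fraudster@example.com"}


def decide(order: dict, score_model: Callable[[dict], float]) -> str:
    # Layer 1: hard rules run before the model and are never overridden.
    if order.get("on_sanctions_list") or order.get("email") in BLOCKLISTED_EMAILS:
        return "block"

    # Layer 2: the model produces a risk score (assumed 0-100).
    score = score_model(order)

    # Layer 3: soft rules turn the score into an action.
    if score > 80:
        # Assumed override: VIP customers go to review instead of being blocked.
        return "manual_review" if order.get("is_vip") else "block"
    if score > 50:
        return "manual_review"
    return "approve"


def toy_model(order: dict) -> float:
    """Stand-in for a real model: scores large orders as high risk."""
    return 90.0 if order.get("amount", 0) > 1000 else 10.0


print(decide({"email": "a@example.com", "amount": 2000}, toy_model))
print(decide({"email": "a@example.com", "amount": 2000, "is_vip": True}, toy_model))
print(decide({"email": "fraudster@example.com", "amount": 5}, toy_model))
```

Note that the hard-rule layer returns before the model is called, so a sanctions or blocklist hit can never be softened by a low score or an override.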

Building Effective Rules

Rule Anatomy

RULE: High-risk first purchase
CONDITIONS:
- first_order = true
- order_value > $300
- shipping_address != billing_address
- email_age_days < 30
ACTION: manual_review
RATIONALE: New customers with high-value orders
to different addresses have 3x the baseline fraud rate
PERFORMANCE:
- Triggers: 2.3% of orders
- Fraud rate when triggered: 8.2%
- FP rate: 45%
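
A rule with this anatomy can be kept as data rather than scattered `if` statements. The following is a minimal sketch, assuming orders arrive as dictionaries with the field names used in the rule; the `Rule` class is a hypothetical structure, not a specific rule engine's API.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    conditions: Tuple[Callable[[dict], bool], ...]
    action: str
    rationale: str

    def evaluate(self, order: dict) -> Optional[str]:
        """Return the action if every condition holds, otherwise None."""
        return self.action if all(cond(order) for cond in self.conditions) else None


high_risk_first_purchase = Rule(
    name="High-risk first purchase",
    conditions=(
        lambda o: o["first_order"],
        lambda o: o["order_value"] > 300,
        lambda o: o["shipping_address"] != o["billing_address"],
        lambda o: o["email_age_days"] < 30,
    ),
    action="manual_review",
    rationale="New customers with high-value orders to different addresses",
)

order = {
    "first_order": True,
    "order_value": 450,
    "shipping_address": "1 Main St",
    "billing_address": "9 Oak Ave",
    "email_age_days": 3,
}
print(high_risk_first_purchase.evaluate(order))
```

Storing the name and rationale alongside the conditions keeps the documentation next to the logic it describes.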

Rule Hygiene

  1. Document every rule – Purpose, conditions, rationale
  2. Track performance – Hit rate, precision, recall
  3. Review quarterly – Remove underperformers
  4. Sunset old rules – Don't let rules accumulate
  5. Version control – Track changes over time
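
To make "track performance" concrete, hit rate, precision, and recall can be computed from labeled outcomes. This sketch assumes you have, for each order, whether the rule triggered and whether the order was later confirmed as fraud; the sample data is made up.

```python
from typing import Dict, Sequence


def rule_metrics(triggered: Sequence[bool], is_fraud: Sequence[bool]) -> Dict[str, float]:
    """Compute hit rate, precision, and recall for one rule."""
    if len(triggered) != len(is_fraud):
        raise ValueError("inputs must have the same length")
    total = len(triggered)
    hits = sum(triggered)                       # orders the rule fired on
    fraud_total = sum(is_fraud)                 # all confirmed fraud
    true_pos = sum(t and f for t, f in zip(triggered, is_fraud))
    return {
        "hit_rate": hits / total if total else 0.0,
        "precision": true_pos / hits if hits else 0.0,
        "recall": true_pos / fraud_total if fraud_total else 0.0,
    }


# Ten illustrative orders: the rule fired on three, one of which was fraud,
# and it missed one other fraudulent order.
triggered = [True, True, False, False, True, False, False, False, False, False]
is_fraud = [True, False, False, True, False, False, False, False, False, False]
print(rule_metrics(triggered, is_fraud))
```

A rule with a high hit rate but low precision mostly adds friction, which makes it a candidate for the quarterly review.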

Building Effective ML

Feature Engineering

Best features typically include:

  • Velocity (transactions per hour/day/week)
  • Deviation from normal (customer's own baseline)
  • Network features (links to other accounts)
  • Device/IP reputation
  • Time-based patterns
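
Two of these features, velocity and deviation from the customer's own baseline, can be sketched with the standard library alone. The function names and the use of a z-score for "deviation" are assumptions for illustration; a production feature store would compute these incrementally.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev
from typing import Sequence


def velocity(timestamps: Sequence[datetime], now: datetime, window: timedelta) -> int:
    """Count transactions that fall in the window (now - window, now]."""
    return sum(1 for t in timestamps if now - window < t <= now)


def deviation_from_baseline(amount: float, history: Sequence[float]) -> float:
    """z-score of an amount against the customer's past amounts.

    Returns 0.0 when there is too little history or no variance.
    """
    if len(history) < 2:
        return 0.0
    sd = pstdev(history)
    return (amount - mean(history)) / sd if sd else 0.0


now = datetime(2024, 1, 1, 12, 0)
past = [
    now - timedelta(minutes=10),
    now - timedelta(minutes=50),
    now - timedelta(hours=3),
    now - timedelta(days=2),
]
print(velocity(past, now, timedelta(hours=1)), velocity(past, now, timedelta(days=1)))
print(deviation_from_baseline(300.0, [20.0, 30.0, 40.0]))
```

Computing the same velocity over several windows (hour, day, week) gives the model a view of both bursts and sustained activity.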

Model Considerations

Factor             | Recommendation
Algorithm          | Gradient boosting (XGBoost, LightGBM) often wins
Training data      | Use confirmed fraud, not just chargebacks
Refresh frequency  | Monthly minimum, weekly ideal
Feature stability  | Monitor for drift
Explainability     | Use SHAP values for investigation
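
For the "monitor for drift" recommendation, one commonly used measure is the Population Stability Index (PSI). The source does not name a method, so treat this as one possible choice. The sketch below bins a feature on its training-time range and compares the proportions against recent data; the bin count and the `eps` floor for empty bins are assumptions.

```python
import math
from typing import List, Sequence


def psi(expected: Sequence[float], actual: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index between a baseline and a recent sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for a constant feature

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # Floor empty bins at eps so the logarithm is defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


baseline = [i / 100 for i in range(100)]        # feature values at training time
shifted = [0.5 + i / 200 for i in range(100)]   # recent values, skewed upward
print(psi(baseline, baseline), psi(baseline, shifted))
```

A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, but the right alert level depends on the feature and should be checked against your own history.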

Next Steps

Just starting fraud detection?

  1. Start with rules → Known patterns are easier to block with rules
  2. Use your processor's ML → Stripe Radar, Adyen Risk for baseline scoring
  3. Build a review queue → Some transactions need human eyes

Improving your detection?

  1. Analyze your rule performance → Which rules catch fraud? Which just add friction?
  2. Add velocity rules → Velocity Rules guide
  3. Layer signals → Rules + ML + device fingerprinting together

Going advanced?

  1. Build custom ML models → If you have the data science resources
  2. Invest in feature engineering → Better signals beat better algorithms
  3. A/B test continuously → Track precision/recall tradeoffs