In our previous explorations, we deconstructed how TikTok predicts what you’ll love and how Elasticsearch finds needles in petabyte-scale haystacks. But in the world of Fintech, the engineering challenge isn't just about discovery or engagement; it’s about adversarial defense.
When a user swipes a credit card or initiates a wire transfer, the system has roughly 50 to 200 milliseconds to decide: Is this legitimate, or is this a heist? If the system is too slow, the user experience suffers (latency). If it’s too lenient, the company loses millions (False Negatives). If it’s too strict, you block a legitimate customer’s honeymoon dinner (False Positives).
Today, we deconstruct the architecture of a modern, real-time fraud detection engine.
1. The Latency Budget: The "Hot Path" vs. "Cold Path"
A fraud detection system is split into two distinct temporal loops:
The Hot Path (Synchronous)
This happens while the transaction is "pending." The payment gateway is literally waiting for a 200 OK or a 403 Forbidden.
Budget: < 100ms.
Logic: Simple velocity checks (e.g., "Has this card been used 5 times in the last minute?") and pre-computed ML model scoring.
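The velocity check above can be sketched in plain Python. A minimal, illustrative version (in production this per-card state would live in Redis or Aerospike, not in process memory, and the thresholds here are the example's, not a recommendation):

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60   # "in the last minute"
MAX_SWIPES = 5        # "used 5 times"
_swipes = defaultdict(deque)  # card_id -> deque of swipe timestamps

def velocity_check(card_id, now=None):
    """Return True if the transaction passes the velocity rule."""
    now = time.time() if now is None else now
    window = _swipes[card_id]
    # Evict timestamps that have aged out of the sliding window.
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()
    if len(window) >= MAX_SWIPES:
        return False  # too many swipes in the window: flag for block/review
    window.append(now)
    return True
```

The eviction-on-read pattern keeps the check O(1) amortized, which matters when the whole Hot Path budget is under 100ms.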
The Cold Path (Asynchronous)
This happens after the decision is made. It involves complex graph analysis, deep learning retraining, and human-in-the-loop review.
Budget: Seconds to minutes.
Logic: Detecting organized fraud rings or updating "blacklists" that will be pushed to the Hot Path for future transactions.
2. Feature Engineering: The State Management Problem
The hardest part of fraud detection isn't the machine learning model; it’s the Data Pipeline. To know if a transaction is fraudulent, a model needs "Features."
Simple features like transaction_amount are easy. But "Stateful Features" are difficult:
How many distinct IP addresses has this user logged in from in the last 24 hours?
What is the Z-score of this transaction amount compared to the user's average over 6 months?
The Solution: Stream Processing with Flink & Redis
Modern architectures use Apache Flink to process event streams (from Kafka). Flink maintains "Sliding Windows" in memory to calculate these aggregates.
For example, a Flink job can compute a per-card sum of transaction amounts over a sliding window, emitting an updated aggregate as each event arrives.
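The mechanics of that windowed aggregate can be shown in plain Python. This is a sketch of what the Flink operator's managed state does, not Flink's actual API (the real job would use a DataStream sliding window over a Kafka source and sink results to Redis):

```python
from collections import deque

class SlidingWindowSum:
    """Per-key sliding-window sum, mimicking the state a Flink
    windowing operator maintains for each card or user."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.events = {}  # key -> deque of (timestamp, amount)

    def add(self, key, ts, amount):
        """Ingest one event and return the fresh windowed aggregate."""
        q = self.events.setdefault(key, deque())
        q.append((ts, amount))
        # Evict events older than the window boundary.
        while q and ts - q[0][0] > self.window:
            q.popleft()
        return sum(a for _, a in q)
```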
These aggregates are then pushed into a low-latency Key-Value store like Redis or Aerospike. When the Hot Path receives a transaction, it "enriches" the request by fetching these pre-computed features from Redis in ~1ms.
3. The Decision Engine: Rules + ML
A robust system layers hard rules and machine learning in a defense-in-depth design, so that neither a brittle rule set nor an opaque model is the sole gatekeeper.
Hard Rules (The Safety Net)
Even the best model can misfire or be slow to adapt to a new exploit. Hard rules (e.g., "Block all transactions from countries on the OFAC sanctions list") act as the first line of defense. These are often managed in a Rules Engine like Drools or a custom Go-based evaluator.
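A minimal rule-evaluator sketch, in the spirit of the Drools/Go evaluators mentioned above. The rule names, country codes, and amount cap are illustrative placeholders, not a real sanctions list or policy:

```python
# Illustrative placeholders only; a real deployment sources this
# from the official OFAC list and a managed rules service.
SANCTIONED_COUNTRIES = {"XX", "YY"}

RULES = [
    ("sanctioned_country", lambda txn: txn["country"] in SANCTIONED_COUNTRIES),
    ("amount_cap",         lambda txn: txn["amount"] > 10_000),
]

def apply_hard_rules(txn):
    """Return (decision, triggered_rule). Hard rules short-circuit
    before the ML model is ever consulted."""
    for name, predicate in RULES:
        if predicate(txn):
            return "BLOCK", name
    return "CONTINUE", None
```

Keeping rules as data (a list of named predicates) lets analysts ship a new rule in minutes, while model retraining takes days.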
Machine Learning (The Scalpel)
For subtle patterns, systems use models like XGBoost (Gradient Boosted Trees) or Isolation Forests.
Input: A feature vector of ~500 variables (user age, device fingerprint, geolocation, past behavior).
Output: A probability score P(Fraud).
Thresholding: If P > 0.9, block. If 0.7 < P < 0.9, trigger Multi-Factor Authentication (MFA). If P < 0.7, allow.
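The thresholding step above is a one-function decision layer. A direct sketch, using the example's thresholds:

```python
def decide(p_fraud):
    """Map the model's P(Fraud) score to an action."""
    if p_fraud > 0.9:
        return "BLOCK"
    if p_fraud > 0.7:
        return "CHALLENGE_MFA"  # step-up authentication instead of a hard block
    return "ALLOW"
```

In practice these cut-offs are tuned against the business cost of False Positives vs. False Negatives, and often vary by merchant category or transaction amount.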
4. Graph Databases: Detecting the "Sybil" Attack
Professional fraudsters don't just use one stolen card; they use thousands of accounts that appear unrelated. This is where Graph Neural Networks (GNNs) and Graph Databases (like Neo4j or AWS Neptune) become critical.
By mapping transactions as a graph:
Nodes: Users, Cards, Devices, IP Addresses.
Edges: "Transacted with," "Logged in from," "Shared email with."
The system can detect Identity Clusters. If 500 different accounts all share the same device fingerprint (e.g., a hardware ID) or have sent money to the same "mule" account, the graph reveals a fraud ring that a row-based SQL query would struggle to find.
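The cluster-detection idea can be sketched with a union-find over (account, shared-attribute) edges, where an attribute might be a device fingerprint or a mule-account ID. This is a toy connected-components pass, standing in for what a graph database query or GNN would do at scale:

```python
from collections import defaultdict

def find_fraud_rings(edges, min_size=3):
    """Group accounts that share an attribute (device, mule account)
    into connected components and flag suspiciously large clusters."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for account, attr in edges:
        union(("acct", account), ("attr", attr))

    clusters = defaultdict(set)
    for node in parent:
        if node[0] == "acct":
            clusters[find(node)].add(node[1])
    return [c for c in clusters.values() if len(c) >= min_size]
```

Because the link is transitive (account A shares a device with B, B shares an email with C), connected components surface rings that no single pairwise SQL join reveals.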
5. The "Feedback Loop": Fighting Model Decay
Fraudsters are constantly evolving. A model that worked yesterday will fail tomorrow. This is known as Model Decay.
Shadow Deployments
Before a new fraud model goes live, it runs in "Shadow Mode." It scores real transactions, but its decisions aren't used. Data scientists compare the shadow scores against actual fraud cases (Chargebacks) that arrive 30-60 days later to calculate Precision and Recall.
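The shadow-evaluation step boils down to joining the shadow model's flags against the delayed chargeback labels. A minimal sketch (transaction IDs here are hypothetical):

```python
def precision_recall(shadow_flags, chargebacks):
    """Compare the shadow model's flagged transaction IDs against the
    chargebacks that arrive 30-60 days later (the ground-truth labels)."""
    flagged, fraud = set(shadow_flags), set(chargebacks)
    true_pos = len(flagged & fraud)
    precision = true_pos / len(flagged) if flagged else 0.0
    recall = true_pos / len(fraud) if fraud else 0.0
    return precision, recall
```

The 30-60 day label lag is the painful part: a model can look great in shadow mode for weeks before its first real scorecard arrives.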
Online Learning
Some advanced systems utilize Online Learning, where the model updates its weights incrementally as new labeled data arrives, rather than waiting for a weekly batch retrain.
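One concrete form of online learning is a per-example SGD step on a logistic-regression scorer: each newly labeled transaction nudges the weights immediately. A dependency-free sketch (real systems would more likely use a streaming-capable learner such as an incrementally trained gradient-boosted or linear model behind a feature store):

```python
import math

def sgd_update(weights, features, label, lr=0.1):
    """One online SGD step for logistic regression: adjust weights
    toward the new labeled example instead of waiting for a batch retrain."""
    z = sum(w * x for w, x in zip(weights, features))
    pred = 1.0 / (1.0 + math.exp(-z))  # sigmoid -> P(Fraud)
    error = pred - label               # gradient of log-loss w.r.t. z
    return [w - lr * error * x for w, x in zip(weights, features)]
```

The trade-off: online updates adapt within minutes of a new attack pattern, but a poisoned or mislabeled stream can also degrade the model just as quickly, which is why many teams cap the learning rate and keep the batch-trained model as a fallback.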
Summary: Architecture Checklist
Inbound Event: Captured via Kafka.
Enrichment: Flink/Redis provides windowed features (Velocity, Averages).
Scoring: ML Model (XGBoost) provides a P(Fraud) score.
Action: Rules engine decides: Allow, Block, or Challenge (MFA).
Analytics: Graph DB uncovers hidden connections for the Cold Path.
References & Further Reading
Stripe: Scaling Fraud Detection with ML - An industry-standard look at how Stripe uses "Radar" to score billions of transactions.
Uber Michelangelo: Real-time Feature Engineering - How Uber handles the "State" problem at massive scale.
Monzo: Building a Modern Fraud Engine - A fintech-specific view on combining rules and models in a microservices environment.
DoorDash: Using Graph Neural Networks for Fraud - A deep dive into how graph architecture prevents promotional abuse.
Zillow: Real-time Feature Store Architecture - Technical details on the plumbing required to keep ML models fed with fresh data.