Payment Systems: the simple map and the messy details (so you don’t learn them the hard way)

24 September, 2025

People imagine payments are “just” clicking a button. In reality, a payment system is a small army of components working together under tight compliance rules, strict latency requirements, and constant fraud pressure.

The actors and the happy path

At the highest level:

  • Payer (cardholder / buyer): has a card, wallet, or bank account.

  • Merchant: your store / app.

  • Payment Gateway: the API/stack that accepts card data from the frontend and talks to processors.

  • Acquirer (merchant bank): receives the merchant’s transactions and talks to card networks.

  • Card Network (Visa, Mastercard, etc.): routes the transaction to the issuing bank.

  • Issuer (cardholder’s bank): approves or declines the authorization.

  • Settlement / Clearing: money is moved and fees are applied; reconciliation happens.

Happy path (short):

  1. User hits “Pay.”

  2. The merchant’s gateway sends an authorization request, which travels via the acquirer and card network to the issuer.

  3. Issuer approves → gateway returns success.

  4. Merchant captures payment (immediate or later).

  5. Settlement: funds flow from issuer → acquirer → merchant (minus fees).

The two-phase model: Authorization vs Capture

  • Authorization: “Is the card valid, and does it have enough funds?” The issuer places a hold on the amount without moving money, so this step needs to be fast and reversible.

  • Capture: when you actually take the money (immediately or after fulfillment). Captures can fail, so plan for failed captures, partial captures, and refunds.

Why separate? It protects merchants who need to confirm stock or do shipping checks before taking money.
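
The two phases can be pictured as a small state machine. Here is a minimal Python sketch, assuming a hypothetical in-memory `Payment` class and a `voided` state that are not part of any gateway's API; a real integration would call the gateway at each step.

```python
from dataclasses import dataclass


@dataclass
class Payment:
    """Hypothetical in-memory model of one payment; amounts are in cents."""
    authorized_cents: int
    captured_cents: int = 0
    status: str = "authorized"

    def capture(self, amount_cents: int) -> None:
        """Take some or all of the authorized amount (partial capture allowed)."""
        if self.status != "authorized":
            raise ValueError(f"cannot capture a payment in state {self.status!r}")
        if not 0 < amount_cents <= self.authorized_cents:
            raise ValueError("capture must be positive and within the authorized amount")
        self.captured_cents = amount_cents
        self.status = "captured"

    def void(self) -> None:
        """Release the hold without taking money; only valid before capture."""
        if self.status != "authorized":
            raise ValueError(f"cannot void a payment in state {self.status!r}")
        self.status = "voided"


# Authorize 42.00 at checkout, then capture only what actually shipped.
payment = Payment(authorized_cents=4200)
payment.capture(3000)
```

Rejecting a capture on anything that is not in the `authorized` state is the point of the sketch: it keeps a retried or out-of-order request from taking money twice.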

Key technical patterns you must implement

1. Tokenization

Never store raw card numbers (PANs). Tokenization swaps a PAN for a token your system uses; tokenization can be done by a gateway (recommended). This reduces PCI scope.
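
To make the idea concrete, here is a toy Python sketch of what a vault does. `ToyVault` and the `tok_` prefix are made up for illustration; in practice the gateway runs this service and your code only ever sees the token.

```python
import secrets


class ToyVault:
    """Illustration only: a real vault is a PCI-scoped service run by the gateway."""

    def __init__(self) -> None:
        self._pan_by_token: dict[str, str] = {}

    def tokenize(self, pan: str) -> str:
        # The token is random, so it reveals nothing about the PAN.
        token = "tok_" + secrets.token_urlsafe(16)
        self._pan_by_token[token] = pan
        return token

    def detokenize(self, token: str) -> str:
        # Only the vault can map a token back to the card number.
        return self._pan_by_token[token]


vault = ToyVault()
token = vault.tokenize("4242424242424242")  # a test number, not a real card
order = {"order_id": "order_123", "payment_method_token": token}  # no PAN stored
```

Everything outside the vault (orders, logs, analytics) carries only the token, which is why tokenization shrinks the part of your system that PCI audits have to cover.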

2. Idempotency

Requests may be retried by clients or networks. Provide an Idempotency-Key header so duplicate authorizations don’t charge twice.

Simple header example:

Idempotency-Key: <uuid-v4>

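The server should store each recent key together with its response for a retention window of 24 to 72 hours, and replay the stored response when the same key arrives again.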
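
A minimal Python sketch of the server side, assuming a hypothetical in-memory `IdempotencyStore` and a `charge` function that stands in for the real gateway call. It is single-process only; a production version would need a shared store with an atomic insert (for example a unique constraint on the key).

```python
import time
import uuid
from typing import Callable, Dict, Optional, Tuple

TTL_SECONDS = 24 * 3600  # lower end of the 24-72 hour window


class IdempotencyStore:
    """Hypothetical in-memory store; production code would use a shared database."""

    def __init__(self, now: Callable[[], float] = time.time) -> None:
        self._now = now
        self._saved: Dict[str, Tuple[float, dict]] = {}

    def get(self, key: str) -> Optional[dict]:
        entry = self._saved.get(key)
        if entry is None:
            return None
        stored_at, response = entry
        if self._now() - stored_at > TTL_SECONDS:
            del self._saved[key]  # expired: treat the key as unseen
            return None
        return response

    def put(self, key: str, response: dict) -> None:
        self._saved[key] = (self._now(), response)


charges_made = []


def charge(store: IdempotencyStore, key: str, amount_cents: int) -> dict:
    """Replay the stored response for a repeated key instead of charging again."""
    cached = store.get(key)
    if cached is not None:
        return cached
    charges_made.append(amount_cents)  # stand-in for the real gateway call
    response = {"status": "authorized", "amount": amount_cents}
    store.put(key, response)
    return response


store = IdempotencyStore()
key = str(uuid.uuid4())
first = charge(store, key, 4200)
retry = charge(store, key, 4200)  # same key: no second charge
```

The retry gets back the exact response of the first call, so the client cannot tell the difference and the card is charged once.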

3. Webhook handling & signature verification

Gateways send asynchronous events (settlement, chargeback). Verify each webhook’s HMAC signature against the raw request body, and respond with 2xx only after successful processing.

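HMAC check (Node.js sketch; the hex-encoded x-gw-signature header and reject() are placeholders):

const crypto = require('crypto');
const computed = crypto.createHmac('sha256', secret).update(request.rawBody).digest();
const received = Buffer.from(request.headers['x-gw-signature'] || '', 'hex');
if (received.length !== computed.length || !crypto.timingSafeEqual(computed, received)) reject(); // constant-time compare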
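
A slightly fuller sketch in Python shows the whole handler: verify first, then deduplicate, then acknowledge. The hex-encoded signature, the `id` field on the event, the event type name, and the in-memory `processed_event_ids` set are all assumptions for illustration; check your gateway's documentation for its actual scheme.

```python
import hashlib
import hmac
import json

processed_event_ids = set()  # stand-in for a durable table of handled events


def sign(secret: bytes, raw_body: bytes) -> str:
    return hmac.new(secret, raw_body, hashlib.sha256).hexdigest()


def handle_webhook(secret: bytes, raw_body: bytes, signature: str) -> int:
    """Return the HTTP status to send back to the gateway."""
    expected = sign(secret, raw_body)
    # compare_digest runs in constant time, so it does not leak how many
    # leading characters of the signature matched.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return 401
    event = json.loads(raw_body)
    if event["id"] in processed_event_ids:
        return 200  # duplicate delivery: acknowledge without reprocessing
    processed_event_ids.add(event["id"])  # real handler work would go here
    return 200


secret = b"whsec_example"
body = json.dumps({"id": "evt_1", "type": "chargeback.created"}).encode()
```

Note that the signature is computed over the raw bytes, before any JSON parsing; re-serializing the parsed body can change whitespace or key order and break the check.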

4. Reconciliation & eventual consistency

Settlement happens later. Your transaction records should track each state (authorized, captured, settled, refunded), and reconciliation jobs should match gateway reports against your ledger entries.
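
The core of a reconciliation job is a three-way diff. A minimal Python sketch, assuming both sides can be reduced to a mapping from gateway transaction ID to amount in cents (the `gw_` IDs and the `reconcile` function are hypothetical):

```python
from typing import Dict, List


def reconcile(ledger: Dict[str, int], gateway_report: Dict[str, int]) -> Dict[str, List[str]]:
    """Compare amounts (in cents) keyed by gateway transaction ID."""
    return {
        # We recorded it, the gateway report does not mention it.
        "missing_at_gateway": sorted(ledger.keys() - gateway_report.keys()),
        # The gateway settled something we have no record of.
        "missing_in_ledger": sorted(gateway_report.keys() - ledger.keys()),
        # Both sides know the transaction but disagree on the amount.
        "amount_mismatch": sorted(
            tx for tx in ledger.keys() & gateway_report.keys()
            if ledger[tx] != gateway_report[tx]
        ),
    }


ledger = {"gw_1": 4200, "gw_2": 1500, "gw_3": 999}
report = {"gw_1": 4200, "gw_2": 1400, "gw_4": 250}
diff = reconcile(ledger, report)
```

Here `gw_3` is missing at the gateway, `gw_4` is missing in the ledger, and `gw_2` has an amount mismatch. Each bucket usually needs a different follow-up, which is why the sketch keeps them apart instead of returning a single list of "problems".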

5. Retry & backoff

Network calls to gateways should use exponential backoff with jitter. But don’t retry blindly when the gateway says declined; retry only on transport errors or when the outcome is unknown.
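
A minimal Python sketch of that rule, using the "full jitter" variant of exponential backoff. The `TransportError` and `Declined` exception classes and the default delays are assumptions; map them to whatever your gateway client actually raises.

```python
import random
import time
from typing import Callable


class TransportError(Exception):
    """Hypothetical: the request may or may not have reached the gateway."""


class Declined(Exception):
    """Hypothetical: the gateway gave a definitive 'no'; retrying will not help."""


def call_with_backoff(
    send: Callable[[], dict],
    max_attempts: int = 4,
    base_delay: float = 0.2,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    for attempt in range(max_attempts):
        try:
            return send()  # Declined propagates immediately, no retry
        except TransportError:
            if attempt == max_attempts - 1:
                raise
            # Full jitter: wait a random time up to the exponential cap.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
    raise RuntimeError("unreachable when max_attempts >= 1")
```

Every retry should reuse the same idempotency key as the first attempt; otherwise a request that did reach the gateway before the timeout can be charged again.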

6. Observability & SLAs

Track metrics: authorization latency, success rate, decline reasons, webhook retries, chargeback rate. Set SLOs (e.g., 99.95% of authorization requests answered within 500 ms).
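
Checking a latency SLO like the example above is simple arithmetic. A small Python sketch, assuming you already collect per-request latencies in milliseconds (the function names and the sample data are made up):

```python
from typing import List


def fraction_within(latencies_ms: List[float], threshold_ms: float) -> float:
    """Share of requests that completed within the latency threshold."""
    if not latencies_ms:
        return 1.0  # no traffic, nothing violated
    within = sum(1 for ms in latencies_ms if ms <= threshold_ms)
    return within / len(latencies_ms)


def slo_met(latencies_ms: List[float], threshold_ms: float = 500, target: float = 0.9995) -> bool:
    return fraction_within(latencies_ms, threshold_ms) >= target


# 10,000 authorizations, 4 of them slow: 99.96% within 500 ms, SLO met.
sample = [120.0] * 9996 + [900.0] * 4
```

At a 99.95% target, 10,000 requests leave room for only five slow ones, so a single bad minute at the gateway can use up the whole budget.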

Security & Compliance checklist (non-negotiable)

  • Use a PCI DSS compliant gateway, or if you must vault PANs yourself, encrypt them with keys held in HSMs.

  • TLS everywhere; HSTS on web.

  • Use tokenization or payment vaulting so your servers never touch PANs.

  • Rate-limit payment endpoints and monitor for credential stuffing.

  • Store minimal PII; encrypt at rest.

  • Implement 3-D Secure (3DS) for high-risk flows; it adds checkout friction, so apply it selectively.

Fraud & risk controls (practical)

  • Real-time rules: velocity limits per card/IP, BIN checks, country/geolocation mismatches.

  • Device fingerprinting (but be privacy-aware & compliant).

  • ML scoring for risk — start with simple rules then evolve.

  • Use chargeback thresholds and alerts (chargebacks are business-killers).
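
A velocity limit is the easiest of these rules to start with. A minimal Python sketch of a sliding-window limiter, assuming in-memory state and a hypothetical rule of three attempts per card token per minute:

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class VelocityLimiter:
    """Allow at most `limit` attempts per key within a sliding time window."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._attempts: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str, now: float) -> bool:
        attempts = self._attempts[key]
        # Drop timestamps that have fallen out of the window.
        while attempts and now - attempts[0] >= self.window_seconds:
            attempts.popleft()
        if len(attempts) >= self.limit:
            return False
        attempts.append(now)
        return True


# Hypothetical rule: no more than 3 attempts per card token per minute.
limiter = VelocityLimiter(limit=3, window_seconds=60)
results = [limiter.allow("tok_abc", now=t) for t in (0, 10, 20, 30, 61)]
```

The fourth attempt (at 30 seconds) is blocked; the fifth (at 61 seconds) goes through because the first attempt has left the window. The same class works per IP or per device by changing the key.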

Data model (simplified)

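-- PostgreSQL-flavored sketch; adjust types for your database
CREATE TABLE transactions (
  id UUID PRIMARY KEY,
  merchant_id UUID NOT NULL,
  amount_cents BIGINT NOT NULL CHECK (amount_cents >= 0),  -- integer minor units, never floats
  currency CHAR(3) NOT NULL,                               -- ISO 4217 code, e.g. 'USD'
  status TEXT NOT NULL CHECK (status IN ('authorized','captured','settled','refunded','failed')),
  gateway_tx_id VARCHAR(255),
  idempotency_key VARCHAR(255),
  created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  UNIQUE (merchant_id, idempotency_key)                    -- one transaction per key per merchant
);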

Keep an audit log with raw gateway responses for disputes.

API contract example (minimal)

POST /payments
Payload:

{
  "amount": 4200,
  "currency": "USD",
  "payment_method_token": "tok_abc",
  "order_id": "order_123"
}

Headers:

Idempotency-Key: <uuid>
Authorization: Bearer <service-token>

Responses need structured decline codes to show users or to trigger retries (e.g., insufficient_funds, card_expired, suspected_fraud).
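
A minimal Python sketch of both sides of this contract: validating the payload above and turning a decline code into a structured response. The `DECLINE_POLICY` table, its messages, and the `retry_later` flag are hypothetical policy choices, not something a gateway prescribes.

```python
from typing import Dict, List

# Hypothetical policy table: user-facing text and whether a later retry may succeed.
DECLINE_POLICY: Dict[str, dict] = {
    "insufficient_funds": {"message": "Insufficient funds. Try another card.", "retry_later": True},
    "card_expired": {"message": "This card has expired.", "retry_later": False},
    "suspected_fraud": {"message": "Payment could not be completed.", "retry_later": False},
}


def validate_payment_request(payload: dict) -> List[str]:
    """Return a list of validation errors; an empty list means the payload is acceptable."""
    errors = []
    amount = payload.get("amount")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(amount, int) or isinstance(amount, bool) or amount <= 0:
        errors.append("amount must be a positive integer in minor units")
    currency = payload.get("currency")
    if not (isinstance(currency, str) and len(currency) == 3
            and currency.isalpha() and currency.isupper()):
        errors.append("currency must be a 3-letter uppercase code")
    for field in ("payment_method_token", "order_id"):
        if not isinstance(payload.get(field), str) or not payload[field]:
            errors.append(f"{field} is required")
    return errors


def decline_response(code: str) -> dict:
    """Build a structured decline; unknown codes fall back to a generic message."""
    policy = DECLINE_POLICY.get(code, {"message": "Payment declined.", "retry_later": False})
    return {"status": "declined", "decline_code": code, **policy}
```

Keeping the machine-readable `decline_code` separate from the user-facing message lets you stay vague with the customer (useful for suspected fraud) while still logging the real reason.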

Failure modes & defenses

  • Network timeouts → idempotency + retries

  • Duplicate submissions → idempotency keys

  • Gateway down → failover to second gateway, queue events for later

  • Chargebacks → fast dispute handling, customer support playbook

  • Fraud spikes → emergency blocklists + rate limits
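
The "gateway down" row can be sketched in a few lines of Python. This assumes a hypothetical `GatewayUnavailable` exception that is raised only when the request definitely did not reach the gateway, and a plain list standing in for a durable retry queue.

```python
from typing import Callable, List, Tuple


class GatewayUnavailable(Exception):
    """Hypothetical: raised when a gateway cannot be reached at all."""


def authorize_with_failover(
    gateways: List[Tuple[str, Callable[[dict], dict]]],
    request: dict,
    retry_queue: List[dict],
) -> dict:
    """Try gateways in priority order; queue the request if every one is down."""
    for name, authorize in gateways:
        try:
            result = authorize(request)
            return {**result, "gateway": name}
        except GatewayUnavailable:
            continue  # only outages trigger failover; declines are returned as-is
    retry_queue.append(request)
    return {"status": "pending", "gateway": None}
```

Two caveats before relying on this: a timeout is not the same as "down" (the first gateway may have authorized the card, so failing over can double-authorize), and tokens issued by one gateway generally do not work at another, so failover needs a payment method both gateways can use.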

Build vs Buy — a brutally honest take

  • Buy (use a reputable gateway) if you want to ship fast, reduce PCI and legal burden, and process fewer than millions of transactions per month. Most startups should buy.

  • Build only if you handle huge volume and margins justify running your own acquiring/settlement stack. Even then, start with third-party gateways and evolve.

Operational runbook (what to have ready)

  • Playbook for webhook signature rotation.

  • Reconciliation job: daily, or hourly for high volume.

  • Alerts for suspicious decline patterns, CPU on payment workers, webhook backlog.

  • Chargeback response templates and timelines.

A simple sequence diagram (Mermaid)

sequenceDiagram
  buyer->>merchant: Submit card/token
  merchant->>gateway: Authorize request
  gateway->>acquirer: Forward auth
  acquirer->>card_network: Route auth
  card_network->>issuer: Request approval
  issuer-->>card_network: Approve/Decline
  card_network-->>acquirer: Return result
  acquirer-->>gateway: Return result
  gateway-->>merchant: Auth result
  merchant-->>buyer: Show success/decline

Final advice — what I’d do if I were you (straight talk)

  1. Use a vaulting gateway (Stripe, Adyen, Braintree, Mollie, etc.). Don’t store PANs.

  2. Implement idempotency and robust webhook verification from day one. These two save you from the worst incidents.

  3. Automate reconciliation and alerts: money mismatches destroy trust.

  4. Start with simple fraud rules, then add ML after you have labeled data.

  5. Measure everything: decline rates by BIN, issuer, geography, and device.