How one payments network does it all: Scaling without headcount, and keeping humans in the loop

TL;DR

AI handles volume; humans make every remediation call. Affirm keeps a human in the loop on every remediation decision—blocking an indicator, quarantining an email, isolating a host—because organizational context still can’t be taught to a model.
Affirm saw roughly a 4x drop in investigations requiring manual intervention. The team handed layer one triage to Expel and put the reclaimed time into novel detections and custom automation.
GuardDuty, CloudTrail, and Expel are baked into infrastructure-as-code at the AWS Organizations level. Every new account is covered the moment it’s provisioned. No manual onboarding, no gaps.

This blog is based on a webinar featuring Tyler Zito, Senior Technical Partner Manager at Expel (host); Drew Gallis, Staff Security Engineer (Observability & Detection Engineer Lead) at Affirm; Guhan Kumaraguru, Staff Security Engineer (Platform Security Lead) at Affirm; and Ashok Mahajan, Senior Partner Solution Architect at AWS. Register to watch the full session on demand.

When Affirm stood up its security operations program, the team was small. Three to five people. And they were facing a question most security leaders eventually face: build in-house, or partner with a managed detection and response (MDR) provider? The answer shaped not just how they’d handle alert volume—but how they’d keep humans in the loop on every decision that actually matters.

Drew Gallis, Staff Security Engineer (Observability & Detection Engineer Lead) at Affirm, made the math sound simple: “We would honestly need around five times the headcount just to do layer one services.”

A team that size can’t cover dozens of log sources, write detections for every platform they use, and still build the work that actually moves a program forward. Something has to give.

We sat down with Drew; Guhan Kumaraguru, Staff Security Engineer (Platform Security Lead) at Affirm; and Ashok Mahajan, Senior Partner Solution Architect at AWS, to walk through how Affirm built a cloud detection and response program that scales without growing headcount to match.

The full session goes deeper on the architecture, workflows, and what they’d do differently. Register to watch on demand.

The real cost of doing it all in-house

The instinct to build everything yourself is understandable. More control, more visibility, no dependency on a third party. But, in practice, it looks different.

“Even doing that in-house can generate a lot of unnecessary alert fatigue because, at the end of the day, you’re playing that layer one triage strategy,” he said. “Without this layer one funnel, we would definitely be stuck just responding to alerts to determine whether or not they’re true positives or false positives.”

The result is a reactive cycle that’s hard to break out of. Active threats get the attention they need. The strategic work that would actually grow the program keeps sliding.

Affirm’s answer was to let Expel absorb the volume, so the team could spend its time where Affirm’s own context mattered: novel detections, custom automation, and coverage for the in-house platforms that no vendor could build for them.

Build the foundation first

None of that works without an AWS architecture that doesn’t fight against you. For Ashok, that means security observability isn’t something you bolt on later. You design it into the infrastructure from day one.

In practice, that means:

Use AWS Organizations to centralize everything. A dedicated security account holds all your logging and monitoring: GuardDuty delegated admin, org-level CloudTrail, and Security Hub aggregating findings across every account and region.
Enable GuardDuty and CloudTrail at the org level. Every new account is covered the moment it’s provisioned. No manual step to forget, no account that slips through.
Build a tagging strategy into account vending from day one. Tags like environment, application owner, and data classification become critical metadata when you’re prioritizing findings at scale. Add them early. Retrofitting is painful.

“When you do that,” Ashok said, “onboarding a new account becomes a non-event. It just works.”
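To make the tagging point concrete, here is a minimal sketch of how account tags could feed finding prioritization. It is an illustration only, not Affirm's or Expel's implementation: the tag keys, the weight tables, and the `Finding` shape are all assumptions chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical tag keys and weights; real values depend on your own tagging policy.
REQUIRED_TAGS = ("environment", "application_owner", "data_classification")
ENV_WEIGHT = {"prod": 3, "staging": 2, "dev": 1}
DATA_WEIGHT = {"restricted": 3, "internal": 2, "public": 1}


@dataclass
class Finding:
    account_id: str
    title: str
    severity: float  # assumed numeric severity from the detection source


def missing_tags(tags: dict[str, str]) -> list[str]:
    """Return the required tag keys that are absent or empty for an account."""
    return [key for key in REQUIRED_TAGS if not tags.get(key)]


def priority(finding: Finding, account_tags: dict[str, dict[str, str]]) -> float:
    """Scale a finding's severity by how sensitive its account is."""
    tags = account_tags.get(finding.account_id, {})
    # Unknown or missing tags get the highest weight, so an untagged
    # account is never silently pushed to the bottom of the queue.
    env = ENV_WEIGHT.get(tags.get("environment", ""), 3)
    data = DATA_WEIGHT.get(tags.get("data_classification", ""), 3)
    return finding.severity * env * data


if __name__ == "__main__":
    account_tags = {
        "111111111111": {
            "environment": "prod",
            "application_owner": "payments",
            "data_classification": "restricted",
        },
        "222222222222": {
            "environment": "dev",
            "application_owner": "tooling",
            "data_classification": "public",
        },
    }
    findings = [
        Finding("222222222222", "Unusual API call", 8.0),
        Finding("111111111111", "Unusual API call", 5.0),
    ]
    for f in sorted(findings, key=lambda f: priority(f, account_tags), reverse=True):
        print(f.account_id, f.title, priority(f, account_tags))
```

Running `missing_tags` as a check inside account vending is one way to catch gaps at creation time rather than retrofitting them later.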

Guhan’s team took it one step further and embedded Expel directly into their infrastructure-as-code provisioning pipeline. When Affirm spins up a new AWS account, monitoring coverage comes with it automatically, with no separate engineering effort. “Security is an innovation enabler rather than a blocker,” he said. “We’re not coming in the way of business growth.”

Connecting a new account through a CloudFormation stack takes about seven to nine minutes. And because everything rolls up to an org-level CloudTrail, new accounts are pulled automatically into the stream Expel is already ingesting.

Drew and Guhan walk through their full org-level setup in the session. Hear how they wired it together.

What the signal-to-noise math looks like

More than 15 coverage areas feed into Expel: AWS native services, SaaS tools, identity platforms, and Affirm’s own in-house applications. That’s a large volume of raw events every month. The number that reaches Affirm’s on-call engineers looks nothing like it.

“We’ve seen approximately a 4x decrease in investigations where we would have to do manual intervention,” Guhan said.

The funnel works like this. GuardDuty findings and CloudTrail events flow into Expel, where analysts, supported by Ruxie, Expel’s AI, enrich each alert with organizational context before making a triage call. Take a developer connecting straight into a running container from an unusual IP. In isolation, it looks suspicious. With context, the engineer is on the Kubernetes team and the pattern shows up all the time. It’s authorized, and it never pages Affirm.

What escalates is the stuff that needs human eyes: a publicly accessible S3 bucket policy that doesn’t fit normal patterns, or an IAM action that’s off for the account in question. The kind of thing an automated system might wave through, but that an analyst with Affirm-specific context flags for a second look.

“A vast majority of the alerts that were generated did not require direct intervention from Affirm,” Drew said. The number, he added, speaks for itself in terms of how much time the team has gotten back.
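The enrichment step in that funnel can be sketched in a few lines. This is a toy model under stated assumptions, not Expel's triage logic: the `OrgContext` store, the action names, and the two outcome labels are hypothetical, and in the real workflow an analyst makes the call rather than a lookup table.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    actor: str
    action: str
    source_ip: str


@dataclass
class OrgContext:
    # Hypothetical context store: who is on which team, and which
    # (team, action) pairs are known to be routine for that team.
    team_of: dict[str, str] = field(default_factory=dict)
    routine: set[tuple[str, str]] = field(default_factory=set)


def triage(alert: Alert, ctx: OrgContext) -> str:
    """Recommend an outcome for an alert after adding organizational context."""
    team = ctx.team_of.get(alert.actor)
    if team is not None and (team, alert.action) in ctx.routine:
        # Known, authorized pattern: no page to the on-call engineer.
        return "close_as_authorized"
    # Anything the context cannot explain goes to a human for a second look.
    return "escalate"


if __name__ == "__main__":
    ctx = OrgContext(
        team_of={"dev1": "kubernetes"},
        routine={("kubernetes", "container_exec")},
    )
    print(triage(Alert("dev1", "container_exec", "203.0.113.7"), ctx))
    print(triage(Alert("dev1", "put_bucket_policy", "203.0.113.7"), ctx))
```

The default matters more than the match: when context is missing, the sketch escalates instead of closing.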

Want to see how the funnel holds up under real traffic? Register for the full webinar.

AI for volume, humans for judgment

Every vendor is promising fully autonomous SOC capabilities right now. Affirm is deliberate about where that framing breaks down.

“The part where Expel has real humans in the loop for the first layer of triage remains pretty pivotal for us,” Guhan said. “To make the final call is where we still really value avoiding hallucination and focusing on strategic and deeper analysis during incidents specifically.”

At Affirm, remediation always requires human sign-off. Blocking a malicious indicator, quarantining an email, isolating a host: each one carries organizational weight that automation can’t fully assess.
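One way to picture that sign-off requirement is as a gate in front of every remediation action. The sketch below is an assumption-laden illustration, not Affirm's tooling: `RemediationRequest`, `ApprovalRequired`, and the action names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RemediationRequest:
    action: str  # e.g. "block_indicator", "quarantine_email", "isolate_host"
    target: str
    approved_by: Optional[str] = None  # set only when a human signs off


class ApprovalRequired(Exception):
    """Raised when a remediation is attempted without human sign-off."""


def execute(request: RemediationRequest,
            run: Callable[[RemediationRequest], None]) -> None:
    """Run a remediation action only if a named human has approved it."""
    if not request.approved_by:
        raise ApprovalRequired(
            f"{request.action} on {request.target} needs human sign-off"
        )
    run(request)


if __name__ == "__main__":
    request = RemediationRequest("isolate_host", "host-42")
    try:
        execute(request, lambda r: print("running", r.action))
    except ApprovalRequired as err:
        print("blocked:", err)

    request.approved_by = "analyst@example.com"
    execute(request, lambda r: print("running", r.action, "approved by", r.approved_by))
```

Automation can still prepare the request and gather the evidence. Only the final step waits on a person.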

Ashok frames it as the shared responsibility model applied to detection and response. AWS provides the telemetry and the native detection layer. Partners like Expel add correlation, enrichment, and judgment across the full environment, not just AWS but identity providers, SaaS tools, and endpoints too. “The goal is not to replace humans,” he said. “It’s to amplify them. Give them better tools, better data, more context to focus on the problem that actually requires human intelligence.”

Drew put the sharpest point on it. “Context is one of the most important things when it comes to incident response. That context is not something you can teach AI out of the box for every system you work with. Context is king.”

The metrics leadership actually cares about

For security teams making the case internally, Guhan tracks three things:

Mean time to detect and mean time to respond. Affirm pulls these from Expel’s investigation data, and not just for escalated incidents. It’s everything Expel closed, and how fast. That gives leadership a full picture of program performance instead of a cherry-picked subset.
Alert volume as a funnel. The 4x reduction is a concrete, intuitive answer to “what are we getting for this?” Affirm processes a large volume of events, Expel handles the vast majority, and the on-call team only gets paged when something genuinely needs a human.
Operational efficiency. How much strategic work is getting done alongside the day-to-day load? Guhan’s team adds log sources, writes novel detections, and grows coverage every quarter with no matching jump in headcount. That’s the metric that proves the model works.

“Although our team size isn’t necessarily growing very much year over year, our ability to cover larger platforms and systems is,” Drew said.
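For teams that want to reproduce the first two metrics, here is a rough sketch of the arithmetic. The `Investigation` fields are hypothetical, not Expel's export schema, and definitions of mean time to detect and mean time to respond vary between teams. This version measures detection from event occurrence and response from detection to resolution.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Investigation:
    occurred_at: datetime
    detected_at: datetime
    resolved_at: datetime
    escalated: bool  # True if it reached the customer's on-call team


def mean_time_to_detect(items: list[Investigation]) -> timedelta:
    """Average gap between an event occurring and being detected."""
    seconds = [(i.detected_at - i.occurred_at).total_seconds() for i in items]
    return timedelta(seconds=mean(seconds))


def mean_time_to_respond(items: list[Investigation]) -> timedelta:
    """Average gap between detection and resolution."""
    seconds = [(i.resolved_at - i.detected_at).total_seconds() for i in items]
    return timedelta(seconds=mean(seconds))


def escalation_rate(items: list[Investigation]) -> float:
    """Share of investigations that needed the on-call team (the funnel)."""
    return sum(i.escalated for i in items) / len(items)


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, 12, 0)
    history = [
        Investigation(t0, t0 + timedelta(minutes=10), t0 + timedelta(minutes=40), False),
        Investigation(t0, t0 + timedelta(minutes=20), t0 + timedelta(minutes=30), True),
    ]
    print("MTTD:", mean_time_to_detect(history))
    print("MTTR:", mean_time_to_respond(history))
    print("Escalation rate:", escalation_rate(history))
```

Computing these over every closed investigation, not only the escalated ones, matches the full-picture framing above.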

The takeaway

Security operations doesn’t have to be a headcount problem. Affirm’s foundation did the heavy lifting: AWS native tooling at the org level, infrastructure-as-code provisioning, and an MDR provider handling layer one triage. With that in place, a small team can cover a large and growing environment without drowning in alerts.

The catch is timing. The teams that get this right designed for it from the start. As Ashok put it: “Once you’re drowning in alerts, it’s too late to redesign the system.”

Build it before you’re underwater. Watch Drew, Guhan, and Ashok lay out the whole playbook. Register for the on-demand session.
