Detection engineering against low-and-slow operations
When the dwell time is measured in months rather than minutes, the detection problem stops being about signatures and becomes about baselines. Notes from a year of working financial-sector telemetry.
By IMF Research
Published 2026-04-01
Reading time: 5 min
Most threat hunting curricula teach detection as a signature problem. Identify the malicious indicator, write a rule, ship the rule. This works for ransomware crews on a forty-eight-hour clock and for commodity malware that drops the same DLL on every host. It does not work for the operations that actually concern us in financial infrastructure: the ones that take eight months from initial access to the moment a settlement file is altered.
Low-and-slow operations look like normal operations, individually. The attacker logs in during business hours from an account whose holder does that for a living. They run commands the holder runs daily. They touch files the holder touches monthly. There is no rule that fires on any one of these actions because the rule would also fire ten thousand times a day on the work of the people we are trying to protect.
What works instead is baselines.
The shift from rules to baselines
A rule asks: did this specific thing happen? A baseline asks: how does the recent distribution of activity compare to the prior distribution? The unit of detection is no longer “malicious event” but “deviation from established norm.”
The cost of this shift is significant. A rule is a one-line YAML or Sigma file that any analyst can author. A baseline is a continuously updated statistical model with explicit assumptions about stationarity, seasonal effects, and confidence intervals. It requires storage of historical telemetry, often for a year or more. It requires a labeling process to confirm or reject anomalies, and that process must run fast enough to feed back into the model before drift accumulates.
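To make the contrast concrete, here is a minimal sketch in Python; the field names and the indicator are illustrative, not a schema anyone ships. A rule is a predicate over a single event; a baseline compares the recent distribution of a feature against the principal's own prior distribution.

```python
from collections import Counter

def rule_fires(event: dict) -> bool:
    """A rule: a fixed predicate on a single event (the indicator is hypothetical)."""
    return event.get("process_name") == "badtool.exe"

def baseline_deviation(prior_events: list[dict], recent_events: list[dict],
                       feature: str = "source_asn") -> float:
    """A baseline: compare the recent distribution of one feature against the
    principal's prior distribution. Returns total variation distance in [0, 1]:
    0 means identical distributions, 1 means completely disjoint."""
    prior = Counter(e[feature] for e in prior_events)
    recent = Counter(e[feature] for e in recent_events)
    p_total = sum(prior.values()) or 1
    r_total = sum(recent.values()) or 1
    keys = set(prior) | set(recent)
    return 0.5 * sum(abs(prior[k] / p_total - recent[k] / r_total) for k in keys)
```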
Most teams that announce a “behavioural detection program” do not do this. They install a tool that claims to. The tool computes z-scores on a handful of fixed features over a fixed lookback, alerts on the top n outliers per day, and gradually trains the analysts to acknowledge and dismiss the alerts. The detection rate against actual low-and-slow operations is approximately zero.
What detection coverage looks like, concretely
For a corporate banking environment, the telemetry sources that have proven load-bearing in our engagements are these.
Authentication. Every authentication, successful or failed, including non-interactive service-account use. Source IP, geolocation, ASN, time-of-day, day-of-week, prior session continuity. Per-principal baseline, refreshed weekly with a thirty-day lookback. The useful signal is not “user logged in from a new country”; that fires constantly and the noise drowns the operations we care about. The useful signal is “user logged in from a country whose ASN distribution does not match this user’s prior pattern, and the session originated outside the user’s prior temporal envelope, and the immediate post-login action sequence does not match this user’s prior task pattern.” All three must agree.
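A minimal sketch of that three-way conjunction, assuming a per-principal profile object rebuilt weekly from the thirty-day lookback; the profile fields and the rarity threshold below are illustrative, not prescriptive.

```python
from dataclasses import dataclass, field

@dataclass
class AuthProfile:
    """Per-principal profile, rebuilt weekly from a thirty-day lookback."""
    asn_share: dict[str, float] = field(default_factory=dict)  # ASN -> share of prior logins
    active_hours_utc: set[int] = field(default_factory=set)    # hours with prior activity
    usual_first_actions: set[str] = field(default_factory=set) # typical post-login actions

@dataclass
class LoginEvent:
    asn: str
    hour_utc: int
    first_actions: list[str]

def anomalous_login(event: LoginEvent, profile: AuthProfile,
                    rare_asn_share: float = 0.01) -> bool:
    """Alert only when all three deviations agree; any single one is noise."""
    unfamiliar_asn = profile.asn_share.get(event.asn, 0.0) < rare_asn_share
    outside_temporal_envelope = event.hour_utc not in profile.active_hours_utc
    unfamiliar_task_pattern = not any(a in profile.usual_first_actions
                                      for a in event.first_actions)
    return unfamiliar_asn and outside_temporal_envelope and unfamiliar_task_pattern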
Process lineage. Every process start with full command-line and parent chain, retained for a year minimum. Per-host and per-user-role baselines. The attackers we lose to know this telemetry exists; they also know that any single command they run will look ordinary in isolation. What they cannot easily fake is the full chain of parent and child processes across hours. If excel.exe starts powershell.exe, that is not unusual on its own — Excel macros do this all the time. If the same powershell.exe later spawns nslookup to a domain whose DNS fingerprint matches none of the user’s prior queries, and then exits without producing a visible side-effect, the chain is anomalous even though no individual hop is.
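One way to score the chain rather than the hop, sketched below under the assumption that per-role parent-to-child transition counts are available from the retained lineage data, is to treat each transition as a bigram and sum the surprisal of the whole chain; the counts and names are illustrative.

```python
import math
from collections import Counter

def chain_surprisal(chain: list[str], role_bigrams: Counter, total: int,
                    smoothing: float = 1.0) -> float:
    """Score a parent->child process chain by summing the surprisal of each
    transition against the role's historical transition counts. A high score
    means the chain as a whole is rare, even if every hop has been seen before."""
    score = 0.0
    vocab = max(len(role_bigrams), 1)
    for parent, child in zip(chain, chain[1:]):
        count = role_bigrams.get((parent, child), 0)
        prob = (count + smoothing) / (total + smoothing * vocab)
        score += -math.log(prob)
    return score

# Illustrative counts: excel -> powershell is routine for this role,
# powershell -> nslookup is almost never seen.
role_bigrams = Counter({("excel.exe", "powershell.exe"): 4200,
                        ("powershell.exe", "nslookup.exe"): 3})
total = sum(role_bigrams.values())
print(chain_surprisal(["excel.exe", "powershell.exe", "nslookup.exe"],
                      role_bigrams, total))
```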
Egress. Per-host outbound flow logs at the netflow or VPC-flow level, with TLS SNI and HTTP host where available. The signal is again not the single new destination but the aggregate distribution shift. Adversaries who exfiltrate on a low-and-slow timescale typically use a destination that has been live for months and looks like a SaaS endpoint. What they cannot avoid is that, at some point, traffic to that destination starts to grow.
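A sketch of that growth check, assuming per-destination weekly byte totals have already been aggregated from the flow logs; the history length and the three-sigma threshold are illustrative choices, not a tuned configuration.

```python
from statistics import mean, pstdev

def growing_destinations(weekly_bytes: dict[str, list[int]],
                         min_weeks: int = 12,
                         growth_sigma: float = 3.0) -> list[str]:
    """Flag destinations whose latest week of outbound volume sits well above
    that destination's own prior weekly distribution. weekly_bytes maps a
    destination to a chronological list of weekly byte totals."""
    flagged = []
    for dest, series in weekly_bytes.items():
        if len(series) < min_weeks:
            continue  # not enough history to call it a long-lived destination
        prior, latest = series[:-1], series[-1]
        mu, sigma = mean(prior), pstdev(prior)
        if sigma == 0:
            sigma = max(0.1 * mu, 1.0)  # avoid divide-by-zero on flat baselines
        if latest > mu + growth_sigma * sigma:
            flagged.append(dest)
    return flagged
```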
Financial-system audit logs. This is the work that actually pays for itself. Read access to settlement files, payment-instruction tables, beneficial-ownership records. Per-account, per-role baselines. The detection that lands the conviction in a low-and-slow operation is rarely “someone did the wrong thing”; it is “someone read records they have never read before, in a sequence that traces a target.” The data is already there; very few institutions actually look at it.
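A minimal sketch of the “never read before” check, assuming audit events carry a stable record-class identifier per read; the class and method names are illustrative.

```python
from collections import defaultdict

class FirstTimeReadDetector:
    """Track, per principal, which record classes they have read before and
    flag reads of a class absent from that principal's history."""

    def __init__(self) -> None:
        self._seen: dict[str, set[str]] = defaultdict(set)

    def warm_up(self, historical_reads: list[tuple[str, str]]) -> None:
        """historical_reads: (principal, record_class) pairs from the lookback window."""
        for principal, record_class in historical_reads:
            self._seen[principal].add(record_class)

    def observe(self, principal: str, record_class: str) -> bool:
        """Return True on the principal's first recorded read of this class."""
        first_time = record_class not in self._seen[principal]
        self._seen[principal].add(record_class)
        return first_time
```

Sequencing is then a matter of counting how many first-time flags a principal accumulates inside a short window, which is where the pattern of reads tracing a target becomes visible.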
The labeling problem
A baseline is only useful if its alerts are graded back. For every alert emitted, you need a recorded outcome — true positive, false positive benign, false positive misconfiguration — so that the model can be tuned. A team that does not grade alerts does not have a detection program; it has an alert-ignoring program.
The grading must be fast. Two weeks is too long; the analyst no longer remembers context. Same-day is the floor. To make same-day grading possible, the alert volume must be low enough — single-digit alerts per analyst per day, at most. To make alert volume that low, the baselines must be tight, which means they must be based on real per-principal history, which means the data retention must be long, which means the storage budget must be generous. None of these requirements are optional.
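The grading record itself can be small; what matters is that every alert gets one and that per-detector precision is computed from it. A sketch with illustrative field names:

```python
from dataclasses import dataclass
from enum import Enum

class Outcome(Enum):
    TRUE_POSITIVE = "true_positive"
    FALSE_POSITIVE_BENIGN = "false_positive_benign"
    FALSE_POSITIVE_MISCONFIG = "false_positive_misconfiguration"

@dataclass
class GradedAlert:
    alert_id: str
    detector: str    # which baseline emitted the alert
    emitted_at: str  # ISO-8601 timestamp
    graded_at: str   # same-day grading is the floor
    outcome: Outcome
    notes: str = ""

def detector_precision(graded: list[GradedAlert]) -> dict[str, float]:
    """Per-detector precision from graded alerts; the detectors whose precision
    collapses are the ones whose baselines need retuning first."""
    true_positives: dict[str, int] = {}
    totals: dict[str, int] = {}
    for alert in graded:
        totals[alert.detector] = totals.get(alert.detector, 0) + 1
        if alert.outcome is Outcome.TRUE_POSITIVE:
            true_positives[alert.detector] = true_positives.get(alert.detector, 0) + 1
    return {d: true_positives.get(d, 0) / totals[d] for d in totals}
```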
What we are watching for in 2026
Three trends increase the cost of detection and the value of doing it well. First, more identity providers are emitting telemetry that looks like the telemetry of long-lived sessions, even when the underlying session has actually rotated. The signal of “stable session” is weakening. Second, agent-driven account activity — the legitimate kind — is accelerating. An automated process refactoring a codebase looks more and more like an unauthorised process exfiltrating that codebase. Distinguishing the two requires intent context that most SIEM schemas do not carry. Third, treasury-management and payment-rail integrations are expanding the surface across which a low-and-slow operation can land.
We expect the operations we catch in 2026 to require attention to sequences spanning more than one telemetry domain — identity, process, flow, application — over windows of weeks. The teams that win on this will have invested in the tedious infrastructure work years before the incident; the teams that lose will be the ones who bought the dashboard.