Sandbox detection orchestrator

TL;DR

You want to bail out before doing anything risky if the host looks like a sandbox or analyst's machine. No single signal is conclusive (a low-end laptop has 2 cores too, an admin's hostname might just be "DESKTOP-X1") — so this orchestrator stacks 7 orthogonal dimensions and fires when ANY of them flags.

Each dimension catches a different sandbox class:

| Dimension | Catches | False positive on |
|---|---|---|
| Debugger | Live analyst with attached debugger | nothing in practice |
| VM/Hypervisor | Cuckoo, Joe Sandbox, most public sandboxes | Hyper-V on a real Win11 laptop |
| Hardware (cores / RAM / disk) | Underprovisioned VMs (2 CPU / 4 GB RAM / 60 GB disk baseline) | Low-end real machines |
| User/host name | Generic analyst defaults (admin / user / sandbox / malware / WORKSTATION-1) | Lazy real-user provisioning |
| Analysis tool processes | procmon / wireshark / fiddler / x64dbg actively running | Reverse engineers on real machines |
| Fake-domain DNS | Sandbox internet simulation (every domain resolves) | Captive-portal hotspots |
| Time-based | Sandboxes that fast-forward time.Sleep | Real machines under heavy load |

Quick-pick:

| You want to… | Use | Cost |
|---|---|---|
| Apply the canonical defender-baseline check | DefaultConfig + Checker.IsSandboxed | <100ms total (most checks are syscalls/file reads) |
| Tighten/relax a specific dimension | Mutate Config fields then NewChecker(cfg) | same |
| Stop on first hit (default) vs collect all reasons | Config.StopOnFirst (true/false) | StopOnFirst=false sums all check times |

What this DOES NOT do:

  • Doesn't bypass anything — orchestrator only DECIDES. Pair with os.Exit(0) or a "play dead" branch in your implant.
  • No HVCI / hardware-virt-aware probes — that's recon/antivm.HypervisorPresent. Add manually if you want it scoring.
  • The time-based check depends on goroutine scheduling, so heavily loaded real hosts can trip the timing dimension. Tune EvasionTimeout generously, or disable the check when targeting under-resourced Windows servers.

Primer — vocabulary

Five terms recur on this page:

Sandbox — automated analysis environment (Cuckoo, JoeSandbox, ANY.RUN, hybrid-analysis, AV vendor labs) that runs your sample in a VM and traces every syscall + network packet. The implant's job is to bail before producing trace data the analyst can use.

Hardware threshold: a sandbox is usually a VM provisioned with the bare minimum to boot Windows: 2 cores, 4 GB RAM, 60 GB disk. Real user machines have more. Setting minimum thresholds catches under-provisioned VMs without flagging typical real laptops, whose specs exceed these values by 4-8×.

Fake-domain DNS — most sandboxes simulate internet to keep the sample doing things. Side effect: nonexistent-deadbeef.invalid resolves with a real-looking IP. A real machine returns NXDOMAIN.

Time-based detection — some sandboxes accelerate Sleep/time.Sleep so the sample doesn't waste analysis budget on long sleep loops. The orchestrator measures wall-clock elapsed during a known-duration sleep; mismatch → sandbox.

Indicator dimension — a category of fingerprint signal (debugger / hardware / username / etc.). Each runs independently; the orchestrator combines verdicts.

How It Works

flowchart TD
    subgraph cfg [Config dimensions]
        DEBUG[Debugger]
        VM[VM/Hypervisor]
        HW[Hardware<br>cores / RAM / disk]
        IDENT[User / hostname]
        PROC[Process names]
        DNS[Fake-domain DNS]
        TIME[Time-based]
    end
    DEBUG --> AGG[Checker.IsSandboxed]
    VM --> AGG
    HW --> AGG
    IDENT --> AGG
    PROC --> AGG
    DNS --> AGG
    TIME --> AGG
    AGG --> OUT{any check fires?}
    OUT -->|yes| RES[true + reason]
    OUT -->|no| NORMAL[false + nil]

Per-dimension tunables in Config: each check has a threshold and an enable flag. DefaultConfig ships defender-baseline values; operators harden against specific targets by tightening or relaxing.

API → godoc

pkg.go.dev/github.com/oioio-space/maldev/recon/sandbox is the authoritative reference for every exported symbol. This page teaches the concepts; the godoc is the specification.

Examples

Simple — defender baseline

import (
    "context"
    "fmt"
    "os"

    "github.com/oioio-space/maldev/recon/sandbox"
)

c := sandbox.New(sandbox.DefaultConfig())
if hit, reason, _ := c.IsSandboxed(context.Background()); hit {
    fmt.Fprintf(os.Stderr, "bail: %s\n", reason)
    os.Exit(0)
}

Composed — strict thresholds

Harden against a specific defender pipeline by raising hardware thresholds and adding custom usernames.

cfg := sandbox.DefaultConfig()
cfg.MinCPUCores = 4
cfg.MinRAMGB = 8
cfg.BadUsernames = append(cfg.BadUsernames,
    "test", "demo", "vagrant",
)
c := sandbox.New(cfg)

Advanced — full audit + report

// c is the Checker built in the previous examples; ctx is any context.Context.
results := c.CheckAll(ctx)
for _, r := range results {
    if r.Detected {
        fmt.Printf("%-15s %s\n", r.Name, r.Detail)
    }
}

Replace the binary IsSandboxed verdict with a 0..100 score so operators can tune the bail threshold per engagement.

import (
    "context"
    "log"

    "github.com/oioio-space/maldev/recon/sandbox"
)

ctx := context.Background()
c := sandbox.New(sandbox.DefaultConfig())
results := c.CheckAll(ctx)
score := sandbox.Score(results)
if score >= 60 {
    log.Printf("bail: sandbox score=%d", score)
    return
}

Audit / tune the weights:

for name, w := range sandbox.Weights() {
    log.Printf("weight[%s] = %d", name, w)
}

OPSEC & Detection

| Artefact | Where defenders look |
|---|---|
| Many checks then early exit | Sandboxes self-flag: they exhausted their analysis budget |
| Fake-domain DNS resolution | Sandboxes often sinkhole; the DNS query itself is logged |
| Analysis-tool process enumeration | Sandboxes know they run wireshark; the enumeration succeeds |
| BusyWait followed by exit | Time-based sandbox decoys |

D3FEND counters:

  • D3-EI — sandbox design itself.

Hardening for the operator:

  • Calibrate thresholds against the actual target stack — too strict means false positives on real low-spec targets.
  • Layer with timing BusyWait; sandboxes time out before a 30-second wait completes.
  • Run the full IsSandboxed once at startup, then cache — re-running on every callback is wasted effort.

MITRE ATT&CK

| T-ID | Name | Sub-coverage | D3FEND counter |
|---|---|---|---|
| T1497 | Virtualization/Sandbox Evasion | full (multi-factor orchestrator) | D3-EI |

Limitations

  • No bypass for VMI. Bare-metal volatility analysis defeats every check.
  • False positives on low-spec real users. Tightening hardware thresholds catches sandboxes but may catch real embedded / minimal-VM targets. The Score helper + operator-chosen threshold gives finer control than the binary IsSandboxed: a single hardware check failing on a real low-spec target only contributes 3-5 points; the operator's bail threshold (typically 50-70) absorbs that noise.
  • Score weights are static. The current detectionWeights are tuned for "default-defender baseline" target shapes. Targets with unusual hardware (cheap VPS, dense Docker hosts) may need re-weighting via Weights() audit + a custom aggregator.
  • DNS check requires outbound resolution. Air-gapped sandboxes that NXDOMAIN everything still defeat the fake-domain probe.
  • No rootkit awareness. Hooks installed by sandbox kernel drivers are out of scope; pair with evasion/unhook + recon/hwbp for kernel-hook detection.

See also