Sandbox detection orchestrator


TL;DR

Multi-factor sandbox / VM / analysis-environment detector. Aggregates seven check dimensions (antidebug, antivm, hardware thresholds, suspicious user/host names, analysis-tool processes, fake-domain DNS interception, time-based) into a single Checker.IsSandboxed result. On a hit the call returns true plus a human-readable reason, so callers can bail and log why.

Primer

No single signal is conclusive. CPU core count alone won't tell you Cuckoo from a low-end laptop; VM detection alone misses bare-metal forensic workstations. The orchestrator stacks indicators across orthogonal dimensions so high-confidence sandboxes (Cuckoo, Joe Sandbox, ANY.RUN, hybrid-analysis) light up across multiple checks while real targets light up across zero or one.

The default configuration is calibrated against the canonical public sandbox baselines: 2 cores, 4 GB RAM, 60 GB disk, generic usernames (admin, user, sandbox, malware), analysis tools (procmon, wireshark, fiddler, x32dbg/x64dbg).

How It Works

flowchart TD
    subgraph cfg [Config dimensions]
        DEBUG[Debugger]
        VM[VM/Hypervisor]
        HW[Hardware<br>cores / RAM / disk]
        IDENT[User / hostname]
        PROC[Process names]
        DNS[Fake-domain DNS]
        TIME[Time-based]
    end
    DEBUG --> AGG[Checker.IsSandboxed]
    VM --> AGG
    HW --> AGG
    IDENT --> AGG
    PROC --> AGG
    DNS --> AGG
    TIME --> AGG
    AGG --> OUT{any check fires?}
    OUT -->|yes| RES[true + reason]
    OUT -->|no| NORMAL[false + nil]

Config holds the per-dimension tunables: each check has a threshold and an enable flag. DefaultConfig ships defender-baseline values; operators adapt to a specific target by tightening or relaxing those thresholds.

API Reference

| Symbol | Description |
|---|---|
| `type Config` | Per-dimension thresholds + enable flags |
| `DefaultConfig() Config` | Defender-baseline calibration |
| `type Checker` | Orchestrator instance |
| `New(cfg) *Checker` | Build a checker |
| `Checker.IsSandboxed(ctx) (bool, string, error)` | Run all enabled checks; first match wins (binary verdict) |
| `Checker.CheckAll(ctx) []Result` | Run every check; return all results (per-check breakdown) |
| `Score(results []Result) int` | Aggregate `[]Result` into a 0..100 confidence score, capped at 100 |
| `Weights() map[string]int` | Returns a copy of the per-check score weights for audit/tuning |

Scoring weights

| Check | Weight | Rationale |
|---|---|---|
| debugger | 20 | active analyst attached |
| vm | 18 | virt detection probe matched |
| domain | 15 | sandbox DNS resolves a known-fake domain |
| process | 13 | analysis tool (procmon / wireshark / …) running |
| username | 12 | analyst-flavour user name |
| hostname | 12 | analyst-flavour hostname |
| process_count | 7 | unusually low PID population |
| connectivity | 6 | no real internet egress |
| ram | 5 | below MinRAMGB |
| disk | 5 | below MinDiskGB |
| cpu | 3 | below MinCPUCores |

Sum of all weights = 116. The aggregate is capped at 100 so a "matched everything" outcome lands at the ceiling. Operators pick a bail threshold (typically 50–70) per their tolerance for false positives.
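The arithmetic behind the cap can be sketched as follows. This is a minimal illustration, not the library's implementation: the Result type here is a local stand-in that only mirrors the Name, Detected and Detail fields used in the examples on this page, and score is a hypothetical helper that copies the weights from the table above.

```go
package main

import "fmt"

// Result is a local stand-in for sandbox.Result (assumption: the real
// type may carry more fields than the three read on this page).
type Result struct {
	Name     string
	Detected bool
	Detail   string
}

// weights copies the table above.
var weights = map[string]int{
	"debugger": 20, "vm": 18, "domain": 15, "process": 13,
	"username": 12, "hostname": 12, "process_count": 7,
	"connectivity": 6, "ram": 5, "disk": 5, "cpu": 3,
}

// score sums the weights of the checks that fired and caps the total at 100.
// Check names missing from the table contribute 0.
func score(results []Result) int {
	total := 0
	for _, r := range results {
		if r.Detected {
			total += weights[r.Name]
		}
	}
	if total > 100 {
		return 100
	}
	return total
}

func main() {
	// Two hardware checks fire on a low-spec machine: 3 + 5.
	lowSpec := []Result{{Name: "cpu", Detected: true}, {Name: "ram", Detected: true}}
	fmt.Println(score(lowSpec)) // 8

	// Every check fires: the raw sum is 116, reported as 100.
	var all []Result
	for name := range weights {
		all = append(all, Result{Name: name, Detected: true})
	}
	fmt.Println(score(all)) // 100
}
```

With these weights, two hardware hits stay far below a bail threshold in the 50–70 range, while a debugger, VM and fake-domain hit together (20 + 18 + 15 = 53) already reach the lower end of it.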

Examples

Simple — defender baseline

import (
    "context"
    "fmt"
    "os"

    "github.com/oioio-space/maldev/recon/sandbox"
)

// Build a checker from the default thresholds and exit on the first hit.
c := sandbox.New(sandbox.DefaultConfig())
if hit, reason, _ := c.IsSandboxed(context.Background()); hit {
    fmt.Fprintf(os.Stderr, "bail: %s\n", reason)
    os.Exit(0)
}

Composed — strict thresholds

Harden against a specific defender pipeline by raising hardware thresholds and adding custom usernames.

cfg := sandbox.DefaultConfig()
cfg.MinCPUCores = 4
cfg.MinRAMGB = 8
cfg.SuspiciousUsernames = append(cfg.SuspiciousUsernames,
    "test", "demo", "vagrant",
)
c := sandbox.New(cfg)

Advanced — full audit + report

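// c is a *sandbox.Checker built as in the examples above.
ctx := context.Background()
results := c.CheckAll(ctx)
for _, r := range results {
    if r.Detected {
        // Print only the checks that fired, with their detail string.
        fmt.Printf("%-15s %s\n", r.Name, r.Detail)
    }
}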

Replace the binary IsSandboxed verdict with a 0..100 score so operators can tune the bail threshold per engagement.

import (
    "context"
    "log"

    "github.com/oioio-space/maldev/recon/sandbox"
)

ctx := context.Background()
c := sandbox.New(sandbox.DefaultConfig())
results := c.CheckAll(ctx)
score := sandbox.Score(results)
if score >= 60 { // bail threshold chosen per engagement
    log.Printf("bail: sandbox score=%d", score)
    return
}

Audit / tune the weights:

for name, w := range sandbox.Weights() {
    log.Printf("weight[%s] = %d", name, w)
}

OPSEC & Detection

| Artefact | Where defenders look |
|---|---|
| Many checks then early exit | Sandboxes self-flag: they exhausted their analysis budget |
| Fake-domain DNS resolution | Sandboxes often sinkhole; the DNS query itself is logged |
| Analysis-tool process enumeration | Sandboxes know they run wireshark; the enumeration succeeds |
| BusyWait followed by exit | Time-based sandbox decoys |

D3FEND counters:

  • D3-EI — sandbox design itself.

Hardening for the operator:

  • Calibrate thresholds against the actual target stack — too strict means false positives on real low-spec targets.
  • Layer with timing BusyWait; sandboxes time out before a 30-second wait completes.
  • Run the full IsSandboxed once at startup, then cache — re-running on every callback is wasted effort.

MITRE ATT&CK

| T-ID | Name | Sub-coverage | D3FEND counter |
|---|---|---|---|
| T1497 | Virtualization/Sandbox Evasion | full (multi-factor orchestrator) | D3-EI |

Limitations

  • No bypass for VMI (virtual machine introspection). Bare-metal Volatility analysis defeats every check.
  • False positives on low-spec real users. Tightening hardware thresholds catches sandboxes but may catch real embedded / minimal-VM targets. The Score helper + operator-chosen threshold gives finer control than the binary IsSandboxed: a single hardware check failing on a real low-spec target only contributes 3-5 points; the operator's bail threshold (typically 50-70) absorbs that noise.
  • Score weights are static. The current detectionWeights are tuned for "default-defender baseline" target shapes. Targets with unusual hardware (cheap VPS, dense Docker hosts) may need re-weighting via Weights() audit + a custom aggregator.
  • DNS check requires outbound resolution. Air-gapped sandboxes that NXDOMAIN everything still defeat the fake-domain probe.
  • No rootkit awareness. Hooks installed by sandbox kernel drivers are out of scope; pair with evasion/unhook + recon/hwbp for kernel-hook detection.
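The "custom aggregator" mentioned in the static-weights limitation can be sketched as a copy-and-override over a weight map. This is an illustration under assumptions: reweight and aggregate are hypothetical helpers, Result is a local stand-in for the library type, and the base map is written out by hand here where real code would start from the copy returned by Weights().

```go
package main

import "fmt"

// Result is a local stand-in for sandbox.Result (assumption).
type Result struct {
	Name     string
	Detected bool
}

// reweight returns a copy of base with overrides applied; base is not modified.
func reweight(base, overrides map[string]int) map[string]int {
	out := make(map[string]int, len(base)+len(overrides))
	for k, v := range base {
		out[k] = v
	}
	for k, v := range overrides {
		out[k] = v
	}
	return out
}

// aggregate sums the weights of fired checks under w and caps the total at 100.
func aggregate(results []Result, w map[string]int) int {
	total := 0
	for _, r := range results {
		if r.Detected {
			total += w[r.Name]
		}
	}
	if total > 100 {
		return 100
	}
	return total
}

func main() {
	// Subset of the documented weights, for brevity.
	base := map[string]int{"debugger": 20, "vm": 18, "ram": 5, "disk": 5, "cpu": 3}

	// Example override for targets where virtualisation and a low core
	// count are normal: those two signals are set to zero.
	custom := reweight(base, map[string]int{"vm": 0, "cpu": 0})

	results := []Result{
		{Name: "vm", Detected: true},
		{Name: "cpu", Detected: true},
		{Name: "ram", Detected: true},
	}
	fmt.Println(aggregate(results, base))   // 26
	fmt.Println(aggregate(results, custom)) // 5
}
```

Because the override works on a copy, the weights used elsewhere stay unchanged, and the same []Result can be scored under several weightings for comparison.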

See also