greywall/analysis.md
Mathieu Virbel 4ea4592d75 docs: add macOS learning mode analysis with fs_usage approach
Document fs_usage as a viable alternative to strace for macOS
--learning mode. SIP blocks all dtrace-based tools (dtrace, dtruss,
opensnoop) even with sudo, but fs_usage uses the kdebug kernel
facility which is unaffected. Requires admin access only for the
passive monitor process — the sandboxed command stays unprivileged.
2026-02-22 19:07:30 -06:00


Greywall Sandboxing Architecture — Deep Analysis

Overview

Greywall is a sandboxing layer that wraps commands in restrictive environments. It blocks network access by default (allowlist-based), restricts filesystem operations, and controls command execution. It supports Linux (bubblewrap + seccomp + Landlock + eBPF) and macOS (sandbox-exec / Seatbelt SBPL profiles).


Part 1: How Components Work Together (Linux)

The Problem Each Component Solves

There are five distinct security concerns. No single Linux technology can address all of them, which is why Greywall composes multiple layers:

| Security Concern | Technology | Why This One? |
|---|---|---|
| Process/network isolation | Bubblewrap (namespaces) | Only namespaces can create a truly isolated network stack and PID space |
| Filesystem visibility | Bubblewrap (mount namespace) | Only mount namespaces can make files literally invisible (not mounted = doesn't exist) |
| Filesystem access rights | Landlock (LSM) | Only a kernel LSM can enforce access rights that survive mount misconfiguration |
| Dangerous syscall blocking | Seccomp BPF | Only seccomp can block specific system calls (ptrace, mount, reboot) |
| Violation visibility | eBPF (bpftrace) | Only kernel tracing can observe denied operations across all layers |

Why No Single Layer Is Sufficient

Why can't Bubblewrap do everything? Bubblewrap controls what's visible in the filesystem (mount-time). But once a file IS mounted, bwrap has no say in what operations are performed on it. A read-only bind mount prevents writes, but bwrap cannot block ptrace, mount, or reboot syscalls — those aren't filesystem operations. And if a mount is misconfigured (edge case with symlinks, race conditions), bwrap alone provides no fallback.

Why can't Seccomp do everything? Seccomp filters syscalls by number and (optionally) argument values. It can block ptrace (syscall 101 on x86_64) or mount (165), but it cannot make path-based decisions. For openat(fd, "path", flags), the filter receives only the raw argument registers, so the path is just a pointer that the BPF program cannot dereference. It therefore cannot evaluate whether the path resolves to /home/user/.ssh/id_rsa or /tmp/safe.txt: that would require reading user memory and replicating the kernel's path resolution logic inside a BPF program, which seccomp does not permit. Seccomp is blind to filesystem semantics.
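To make the limitation concrete, here is a minimal Go sketch of the data a seccomp filter has to work with. The struct mirrors the kernel's `struct seccomp_data`; the `Verdict` type, the `blocked` map and `decide` are illustrative names and not Greywall code. The syscall numbers are the x86_64 values quoted in this document.

```go
package main

import "fmt"

// SeccompData mirrors the kernel's struct seccomp_data, which is the only
// input a seccomp BPF filter receives for each syscall.
type SeccompData struct {
	Nr                 int32     // syscall number
	Arch               uint32    // AUDIT_ARCH_* value
	InstructionPointer uint64    // address of the syscall instruction
	Args               [6]uint64 // raw register values; pointers are not dereferenced
}

// Verdict is a simplified stand-in for the SECCOMP_RET_* actions.
type Verdict string

const (
	Allow Verdict = "ALLOW"
	Deny  Verdict = "EPERM"
)

// blocked holds a few x86_64 syscall numbers mentioned in the text.
var blocked = map[int32]string{101: "ptrace", 165: "mount", 169: "reboot"}

// decide can only look at numbers. For openat, Args[1] is an address in the
// caller's memory, so the path string itself is never visible here.
func decide(d SeccompData) Verdict {
	if _, ok := blocked[d.Nr]; ok {
		return Deny
	}
	return Allow
}

func main() {
	// openat is syscall 257 on x86_64; 0x7ffd1234 is a made-up pointer value.
	openat := SeccompData{Nr: 257, Args: [6]uint64{0, 0x7ffd1234, 0}}
	fmt.Println("openat:", decide(openat))
	fmt.Println("ptrace:", decide(SeccompData{Nr: 101}))
}
```

Whatever path the pointer refers to, `decide` sees the same kind of input, which is why path decisions are delegated to bwrap mounts and Landlock.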

Why can't Landlock do everything? Landlock is primarily a filesystem LSM. It controls READ_FILE, WRITE_FILE, EXECUTE, MAKE_DIR, etc. Newer Landlock ABIs (v4 and later) add TCP bind/connect rules, but nothing comparable to a domain allowlist, and Landlock is not a general syscall filter: it cannot block kernel module loading (init_module) or system control (reboot), and it does not replace seccomp's ptrace blocking. Landlock also cannot isolate namespaces: it restricts access within the current namespace, it doesn't create new ones.

Why can't eBPF do everything? eBPF (as used here via bpftrace) is observation-only. It attaches to tracepoints at syscall exit and reads return values. It cannot block, modify, or deny any operation. Its purpose is to report violations after they've been caught by other layers. (eBPF can enforce policy via LSM hooks in newer kernels, but Greywall uses bpftrace for monitoring, not enforcement.)
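The "read return values at syscall exit" step amounts to a small classification of negative return codes. The following Go sketch shows that logic in isolation, assuming the monitor sees the raw return value from a sys_exit tracepoint; `violationName` is a hypothetical helper, and the real monitor is a bpftrace script rather than Go.

```go
package main

import (
	"fmt"
	"syscall"
)

// violationName reports whether ret (a raw syscall return value, negative on
// error) is one of the denial codes the monitor filters for.
func violationName(ret int64) (string, bool) {
	if ret >= 0 {
		return "", false // success is never a violation
	}
	switch syscall.Errno(-ret) {
	case syscall.EACCES:
		return "EACCES", true
	case syscall.EPERM:
		return "EPERM", true
	case syscall.EROFS:
		return "EROFS", true
	}
	return "", false // other errors (ENOENT, EINTR, ...) are ignored
}

func main() {
	for _, ret := range []int64{0, -int64(syscall.ENOENT), -int64(syscall.EACCES)} {
		name, ok := violationName(ret)
		fmt.Println(ret, name, ok)
	}
}
```

Note that a filter like this never sees ENOENT from an unmounted file, so denials by invisibility are silent unless another signal is used.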

Component Interaction Diagram

                    ┌─────────────────────────────────────────────────────────┐
                    │                    THREAT                               │
                    │  Sandboxed process attempts forbidden action            │
                    └──────────────┬──────────────────────────────────────────┘
                                   │
        ┌──────────────────────────┼──────────────────────────────┐
        │                          │                              │
        ▼                          ▼                              ▼
  ┌───────────┐          ┌──────────────┐              ┌──────────────────┐
  │ Filesystem │          │   Syscall    │              │    Network       │
  │  access    │          │   (ptrace,   │              │    connection    │
  │            │          │   mount...)  │              │                  │
  └─────┬─────┘          └──────┬───────┘              └────────┬─────────┘
        │                       │                               │
        ▼                       ▼                               ▼
  ┌───────────┐          ┌──────────────┐              ┌──────────────────┐
  │ Bubblewrap │          │   Seccomp    │              │   Bubblewrap     │
  │ Mount NS   │          │   BPF        │              │   Network NS     │
  │            │          │              │              │   (--unshare-net)│
  │ File not   │          │ Syscall #    │              │                  │
  │ mounted?   │          │ in blocklist?│              │ Isolated stack,  │
  │ → ENOENT   │          │ → EPERM      │              │ no host network  │
  └─────┬─────┘          └──────┬───────┘              └────────┬─────────┘
        │ (file IS mounted)     │                               │
        ▼                       │                               ▼
  ┌───────────┐                 │                      ┌──────────────────┐
  │ Landlock   │                 │                      │ tun2socks        │
  │ (kernel)   │                 │                      │ TUN device       │
  │            │                 │                      │                  │
  │ Has access │                 │                      │ All traffic →    │
  │ right?     │                 │                      │ SOCKS5 proxy     │
  │ → EACCES   │                 │                      │ (allowlist)      │
  └─────┬─────┘                 │                      └────────┬─────────┘
        │                       │                               │
        └───────────┬───────────┘───────────────────────────────┘
                    │
                    ▼
            ┌──────────────┐
            │   eBPF       │
            │   (bpftrace) │
            │              │
            │  Observes    │
            │  EACCES,     │
            │  EPERM,      │
            │  EROFS       │
            │  returns     │
            │              │
            │  → Logs to   │
            │    stderr    │
            └──────────────┘

Execution Timeline

This is the exact order of operations when greywall -- <command> runs:

PHASE 1: PRE-SANDBOX (on host)
─────────────────────────────────────────────────────────────────
1. Command blocking engine checks command against deny/allow lists
2. Environment sanitization strips LD_PRELOAD, LD_LIBRARY_PATH, etc.
3. ProxyBridge starts: socat creates Unix socket → external SOCKS5
4. DnsBridge starts: socat creates Unix socket → host DNS server
5. ReverseBridge starts: socat listens on exposed ports → Unix sockets
6. Seccomp BPF filter generated and written to temp file
7. Bubblewrap arguments assembled (mounts, namespaces, capabilities)
8. Inner bash script generated (network setup + command execution)

PHASE 2: SANDBOX CREATION (bwrap)
─────────────────────────────────────────────────────────────────
9.  bwrap creates new PID namespace (--unshare-pid)
10. bwrap creates new network namespace (--unshare-net)
11. bwrap sets up filesystem:
    - --tmpfs / (empty root) OR --ro-bind / / (read-only root)
    - System paths mounted read-only (/usr, /bin, /lib, /etc...)
    - CWD mounted read-write
    - /dev mounted with --dev-bind
    - /proc mounted fresh
    - /tmp as tmpfs
    - .env files masked with /dev/null bind mount
    - Protected files forced read-only
    - Unix sockets from bridges bind-mounted in
12. bwrap loads seccomp BPF filter (via fd 3 → --seccomp 3)
13. bwrap drops capabilities (except CAP_NET_ADMIN if proxy enabled)
14. bwrap executes inner bash script

PHASE 3: INNER SCRIPT (inside sandbox)
─────────────────────────────────────────────────────────────────
15. Script brings up loopback: ip link set lo up
16. Script creates TUN device: ip tuntap add dev tun0 mode tun
17. Script configures routing: ip route add default via 198.18.0.1
18. Script starts socat (localhost:18321 → proxy Unix socket)
19. Script starts tun2socks (TUN → SOCKS5 on localhost:18321)
20. Script configures DNS (socat relay or resolv.conf)
21. Script starts reverse bridge listeners (socat for each port)
22. Script waits 0.3s for services to initialize

PHASE 4: LANDLOCK APPLICATION (inside sandbox)
─────────────────────────────────────────────────────────────────
23. greywall re-executes: greywall --landlock-apply -- bash -c "<cmd>"
24. Reads config from GREYWALL_CONFIG_JSON env var
25. Sets PR_SET_NO_NEW_PRIVS (required for Landlock)
26. Creates Landlock ruleset (filesystem access rights bitmask)
27. Adds PATH_BENEATH rules for each allowed path
28. Applies LANDLOCK_RESTRICT_SELF (irrevocable)
29. syscall.Exec() replaces process with user command

PHASE 5: COMMAND EXECUTION (fully sandboxed)
─────────────────────────────────────────────────────────────────
30. User command runs with ALL layers active simultaneously:
    - Network: isolated namespace, traffic through TUN → proxy
    - Filesystem: bwrap mounts (visibility) + Landlock (access rights)
    - Syscalls: seccomp BPF blocking dangerous calls
    - Environment: sanitized (no LD_PRELOAD etc.)

PHASE 6: MONITORING (parallel, on host)
─────────────────────────────────────────────────────────────────
31. eBPF monitor started AFTER sandbox process begins
32. bpftrace attaches to syscall exit tracepoints
33. Filters for EACCES/EPERM/EROFS returns from sandbox PIDs
34. Logs violations to stderr in real-time

PHASE 7: CLEANUP (on host, after command exits)
─────────────────────────────────────────────────────────────────
35. eBPF monitor stopped (bpftrace killed)
36. ReverseBridge stopped (socat killed, sockets removed)
37. DnsBridge stopped (socat killed, socket removed)
38. ProxyBridge stopped (socat killed, socket removed)
39. tun2socks binary removed from /tmp
40. Seccomp filter file removed
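Steps 7 and 9 to 12 of the timeline can be sketched as a function that assembles the bwrap argument list. This is a simplified illustration, not Greywall's actual code: `SandboxSpec` and `bwrapArgs` are hypothetical names, capability handling and bridge sockets are omitted, and only the tmpfs-root variant is shown. The flags themselves (`--unshare-pid`, `--unshare-net`, `--tmpfs`, `--ro-bind`, `--bind`, `--dev-bind`, `--proc`, `--seccomp`) are standard bubblewrap options.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// SandboxSpec is a hypothetical, simplified description of one sandbox run.
type SandboxSpec struct {
	Cwd       string   // working directory, mounted read-write
	ReadOnly  []string // system paths mounted read-only
	MaskedEnv []string // .env files hidden behind /dev/null
	SeccompFD int      // fd on which the compiled BPF filter is passed
}

// bwrapArgs returns the arguments in the order bwrap applies them; later
// mounts shadow earlier ones, so the .env masks come after the CWD bind.
func bwrapArgs(s SandboxSpec) []string {
	args := []string{"--unshare-pid", "--unshare-net", "--tmpfs", "/"}
	for _, p := range s.ReadOnly {
		args = append(args, "--ro-bind", p, p)
	}
	args = append(args,
		"--bind", s.Cwd, s.Cwd,
		"--dev-bind", "/dev", "/dev",
		"--proc", "/proc",
		"--tmpfs", "/tmp",
	)
	for _, f := range s.MaskedEnv {
		args = append(args, "--ro-bind", "/dev/null", f)
	}
	return append(args, "--seccomp", strconv.Itoa(s.SeccompFD))
}

func main() {
	spec := SandboxSpec{
		Cwd:       "/home/user/project",
		ReadOnly:  []string{"/usr", "/bin", "/lib", "/etc"},
		MaskedEnv: []string{"/home/user/project/.env"},
		SeccompFD: 3,
	}
	fmt.Println("bwrap " + strings.Join(bwrapArgs(spec), " "))
}
```

Keeping argument assembly in a pure function like this makes mount ordering easy to unit-test without launching a sandbox.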

What Catches What — Attack Scenarios

| Attack | 1st defense | 2nd defense | 3rd defense | Reported by |
|---|---|---|---|---|
| Read ~/.ssh/id_rsa | bwrap: file not mounted → ENOENT | Landlock: no READ_FILE right → EACCES | | eBPF: logs EACCES |
| Write to .env | bwrap: masked with /dev/null bind → writes go nowhere | Landlock: no WRITE_FILE right → EACCES | | eBPF: logs EACCES |
| curl evil.com | bwrap: --unshare-net → no host network | tun2socks: routes through proxy → proxy denies | | eBPF: logs ECONNREFUSED |
| ptrace(pid) | seccomp: syscall 101 blocked → EPERM | | | eBPF: logs EPERM |
| mount /dev/sda /mnt | seccomp: syscall 165 blocked → EPERM | | | eBPF: logs EPERM |
| reboot (inside sandbox) | seccomp: syscall 169 blocked → EPERM | | | eBPF: logs EPERM |
| LD_PRELOAD=evil.so cmd | Env sanitization: LD_PRELOAD stripped before sandbox starts | | | |
| greywall -- "git push" | Command blocker: denied before sandbox created (caller-side only) | | | |
| git push (inside sandbox) | Network namespace: no outbound connectivity | tun2socks: proxy denies | | eBPF: logs ECONNREFUSED |
| Move .bashrc elsewhere | bwrap: .bashrc mounted read-only → EROFS | Landlock: no REMOVE_FILE right | | eBPF: logs EROFS |
| Create file in /etc | bwrap: /etc mounted --ro-bind → EROFS | Landlock: no MAKE_REG right → EACCES | | eBPF: logs EROFS |
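The LD_PRELOAD row relies on environment sanitization happening before the sandbox starts. A minimal Go sketch of that step follows. The document names LD_PRELOAD, LD_LIBRARY_PATH and DYLD_* and then says "etc.", so the exact rule set below is an assumption, and `sanitizeEnv` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeEnv returns env without variables that influence the dynamic
// loader. Assumed rule set: LD_PRELOAD, LD_LIBRARY_PATH and any DYLD_* name.
func sanitizeEnv(env []string) []string {
	out := make([]string, 0, len(env))
	for _, kv := range env {
		name, _, _ := strings.Cut(kv, "=")
		if name == "LD_PRELOAD" || name == "LD_LIBRARY_PATH" || strings.HasPrefix(name, "DYLD_") {
			continue // drop loader-injection variables
		}
		out = append(out, kv)
	}
	return out
}

func main() {
	env := []string{
		"PATH=/usr/bin",
		"LD_PRELOAD=evil.so",
		"DYLD_INSERT_LIBRARIES=x.dylib",
		"HOME=/home/user",
	}
	fmt.Println(sanitizeEnv(env))
}
```

In practice the input would be `os.Environ()` and the result would be passed as the child's environment, so the stripped variables never reach bwrap or sandbox-exec.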

The Layering Principle

┌──────────────────────────────────────────────────────────────┐
│                                                              │
│  LAYER 0: CALLER-SIDE PRE-FILTER (before sandbox)            │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ Command blocking: rejects the TOP-LEVEL command        │  │
│  │   string passed to greywall by the caller (e.g., an   │  │
│  │   AI agent framework). Does NOT intercept commands     │  │
│  │   executed inside the sandbox by child processes.      │  │
│  │ Env sanitization: strip LD_PRELOAD, DYLD_* etc.       │  │
│  │   from the environment before launching the sandbox.   │  │
│  └────────────────────────────────────────────────────────┘  │
│                                                              │
│  LAYER 1: ISOLATION (bwrap namespaces)                       │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ What it does: Creates a separate world                 │  │
│  │ Filesystem: Only mounted files are visible             │  │
│  │ Network: Separate network stack, no host access        │  │
│  │ PID: Can't see/signal host processes                   │  │
│  │                                                        │  │
│  │ Analogy: Putting process in a room with only           │  │
│  │ selected items. Items not in the room don't exist.     │  │
│  └────────────────────────────────────────────────────────┘  │
│                                                              │
│  LAYER 2: ENFORCEMENT (seccomp + Landlock)                   │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ Seccomp: "You cannot USE these capabilities"           │  │
│  │  → Blocks ptrace, mount, reboot, kernel module load    │  │
│  │  → Operates on syscall numbers (capability-based)      │  │
│  │  → Cannot make path-based decisions                    │  │
│  │                                                        │  │
│  │ Landlock: "You cannot ACCESS these paths/operations"   │  │
│  │  → Controls read/write/execute/create/delete per path  │  │
│  │  → Operates on filesystem paths (resource-based)       │  │
│  │  → Cannot block non-filesystem syscalls                │  │
│  │                                                        │  │
│  │ Together: Seccomp blocks dangerous capabilities,       │  │
│  │ Landlock restricts resource access. Orthogonal.        │  │
│  └────────────────────────────────────────────────────────┘  │
│                                                              │
│  LAYER 3: NETWORK CONTROL (tun2socks + bridges)              │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ What it does: Routes ALL traffic through SOCKS5 proxy  │  │
│  │ TUN device captures every packet (transparent)         │  │
│  │ Proxy applies allowlist (domain/IP filtering)          │  │
│  │ DNS either bridged to host or forced through proxy     │  │
│  │                                                        │  │
│  │ Why not just bwrap --unshare-net?                      │  │
│  │ → That blocks ALL network. We need selective access.   │  │
│  │ → tun2socks re-enables controlled network via proxy.   │  │
│  └────────────────────────────────────────────────────────┘  │
│                                                              │
│  LAYER 4: OBSERVATION (eBPF)                                 │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ What it does: Watches for denied operations            │  │
│  │ Doesn't block anything — reports what was blocked      │  │
│  │ Catches EACCES/EPERM/EROFS from ANY layer              │  │
│  │                                                        │  │
│  │ Why needed: Without monitoring, violations are silent. │  │
│  │ eBPF tells you WHAT was blocked and WHY.               │  │
│  │ Essential for debugging sandbox configs.               │  │
│  └────────────────────────────────────────────────────────┘  │
│                                                              │
└──────────────────────────────────────────────────────────────┘

Seccomp vs Landlock — Why Both?

These two enforcement layers are orthogonal — they protect against completely different threat classes:

                    Seccomp BPF                 Landlock
                    ───────────                 ────────
Domain:             Syscall numbers             Filesystem paths
Question asked:     "Is this syscall allowed?"  "Can this path be accessed?"
Granularity:        Per-syscall                 Per-path + per-operation
Can block ptrace:   YES                         NO (not a filesystem op)
Can block mount:    YES                         NO (not a filesystem op)
Can block reboot:   YES                         NO
Can block read():   NO (too common)             YES (per-path)
Can block write():  NO (too common)             YES (per-path)
Can block mkdir():  NO (too common)             YES (per-path)
Can block rename:   NO                          YES (REFER right, ABI v2+)
Runs at:            Syscall entry               VFS operation
Fallback:           Returns EPERM               Returns EACCES

Seccomp answers: "Should this process be able to call ptrace() at all?" Landlock answers: "Should this process be able to read /home/user/.ssh/id_rsa?"

They have zero overlap in what they protect. Using only one leaves an entire class of attacks unaddressed.

Bubblewrap vs Landlock — Why Both?

These two have overlapping filesystem protection, deliberately:

                    Bubblewrap mounts            Landlock
                    ─────────────────            ────────
Mechanism:          Mount-time visibility        Runtime access control
When applied:       Before process starts        After process starts
Failure mode:       File doesn't exist           File exists but EACCES
Granularity:        Whole directories             Individual files + operations
Symlink handling:   Resolves before mounting     Kernel handles
Race conditions:    Possible (mount vs access)   None (kernel-enforced)
Edge cases:         /etc/resolv.conf symlinks,   None known
                    cross-mount boundaries

Why Landlock exists as a backup to bwrap:

  1. Mount misconfiguration: If bwrap accidentally makes a file visible (symlink edge case, mount order issue), Landlock still denies access
  2. Operation granularity: bwrap can make a file visible but read-only. Landlock can additionally block EXECUTE, REMOVE, TRUNCATE separately
  3. Defense in depth: Two independent mechanisms with different failure modes. Both must fail for access to be granted

Real example from the code: .env files are masked with --ro-bind /dev/null .env in bwrap. But the Landlock layer ALSO denies READ_FILE on those paths. If the bwrap mask somehow fails (race condition, mount order), Landlock catches it.


Part 2: macOS Architecture

What macOS Has (Single-Layer Model)

macOS uses Apple's Seatbelt (sandbox-exec) — a single, comprehensive policy engine:

┌──────────────────────────────────────────────────────────────┐
│              User Command (sandboxed)                         │
├──────────────────────────────────────────────────────────────┤
│  Seatbelt SBPL profile (sandbox-exec)                        │
│                                                              │
│   ┌─ Filesystem rules ──────────────────────────────────┐    │
│   │  file-read-data, file-write-data, file-write-unlink │    │
│   │  (deny default) + explicit allows per path          │    │
│   └─────────────────────────────────────────────────────┘    │
│   ┌─ Network rules ─────────────────────────────────────┐    │
│   │  network-outbound, network-inbound, network-bind    │    │
│   │  IP:port based filtering                            │    │
│   └─────────────────────────────────────────────────────┘    │
│   ┌─ IPC rules ─────────────────────────────────────────┐    │
│   │  mach-lookup (XPC service allowlist)                │    │
│   │  ~20 essential macOS services whitelisted           │    │
│   └─────────────────────────────────────────────────────┘    │
│   ┌─ Hardware rules ────────────────────────────────────┐    │
│   │  iokit-open, iokit-get-properties                   │    │
│   │  sysctl-read, sysctl-write                          │    │
│   └─────────────────────────────────────────────────────┘    │
│   ┌─ Log tagging ───────────────────────────────────────┐    │
│   │  CMD64_<base64>_END_<session> in deny messages      │    │
│   │  Violations go to system.log                        │    │
│   └─────────────────────────────────────────────────────┘    │
│                                                              │
├──────────────────────────────────────────────────────────────┤
│  Environment sanitization (DYLD_* stripped)                   │
├──────────────────────────────────────────────────────────────┤
│  Command blocking engine (shared with Linux)                 │
├──────────────────────────────────────────────────────────────┤
│  Host macOS kernel (MACF hooks enforce Seatbelt)             │
└──────────────────────────────────────────────────────────────┘

Seatbelt is enforced by MACF (Mandatory Access Control Framework) kernel hooks — the same infrastructure that enforces SIP. In some ways, it's architecturally similar to Linux's Landlock (kernel-level, path-based), but it covers more domains (network, IPC, hardware) in a single profile rather than requiring separate technologies.

What Seatbelt Already Covers (vs Linux Equivalent)

| Linux component | macOS Seatbelt equivalent | Coverage |
|---|---|---|
| Landlock filesystem control | file-read-data, file-write-data, file-write-unlink | Equivalent |
| Seccomp (partially) | process-exec, process-fork, signal, mach-lookup | Partial: blocks at operation level, not raw syscall level |
| Network namespace rules | network-outbound, network-inbound, network-bind | Similar: IP/port filtering, but no true isolation |
| No equivalent on Linux | mach-lookup (XPC service control) | macOS-only: controls which system services are accessible |
| No equivalent on Linux | iokit-open, sysctl-read/write | macOS-only: hardware and kernel parameter access |

Key insight: Seatbelt is not "weaker" than any single Linux component — it's a different architecture. It's one comprehensive policy engine instead of multiple composable ones. The trade-off is: less defense-in-depth (single point of failure) but broader coverage per layer (filesystem + network + IPC + hardware in one profile).


Part 3: Security Level Comparison

Per-Threat Security Rating

| Threat | Linux | macOS | Gap |
|---|---|---|---|
| Read unauthorized files | ●●●●● | ●●●○○ | Linux: 2 layers (bwrap+Landlock). macOS: 1 layer (Seatbelt). No backup if Seatbelt is misconfigured |
| Write to protected files | ●●●●● | ●●●●○ | Linux: 2 layers + .env masking. macOS: Seatbelt + unlink blocking. Close, but single layer |
| Exfiltrate data via network | ●●●●● | ●●○○○ | Linux: namespace isolation + transparent proxy (catches ALL traffic). macOS: env var proxy (apps can ignore it) |
| DNS exfiltration | ●●●●○ | ●○○○○ | Linux: DNS bridge with filtering. macOS: no DNS control at all |
| Dangerous syscalls (ptrace, mount) | ●●●●● | ●●●○○ | Linux: seccomp blocks 24 syscalls. macOS: SIP blocks some, Seatbelt blocks some process ops, but no explicit syscall filtering |
| Library injection (LD_PRELOAD) | ●●●●● | ●●●●● | Both strip dangerous env vars. macOS additionally has SIP preventing DYLD injection on system binaries |
| Process visibility | ●●●●● | ●○○○○ | Linux: PID namespace. macOS: no PID isolation (can see all processes) |
| Command execution control | ●●●●● | ●●●●● | Same engine on both platforms |
| SSH command control | ●●●●● | ●●●●● | Same engine on both platforms |
| Protected file awareness (.env, .gitconfig) | ●●●●● | ●●●●● | Same lists, same protection on both |
| Violation detection/reporting | ●●●●○ | ●●○○○ | Linux: eBPF real-time monitoring. macOS: violations in system.log, not captured programmatically |
| Config auto-generation (learning) | ●●●●● | ○○○○○ | Linux: strace-based learning. macOS: not implemented. Two viable approaches: (1) Seatbelt (allow (with report)) + log stream (no root), (2) sudo fs_usage (requires admin, more reliable; not dtrace-based, unaffected by SIP) |
| IPC control | ●●○○○ | ●●●●● | macOS: explicit Mach IPC allowlist. Linux: no equivalent (D-Bus not filtered) |
| Sandbox escape via child process | ●●●●● | ●●●○○ | Linux: namespaces are inherited by all children. macOS: the sandbox-exec profile is inherited, but there is no namespace boundary |

Overall Security Posture

Linux:   ████████████████████░░  ~90% (multi-layer, defense in depth)
macOS:   ████████████░░░░░░░░░░  ~60% (single comprehensive layer, gaps in network/monitoring/learning)

The 30% gap breaks down as:

  • ~10% Network control (no transparent proxy, no DNS filtering)
  • ~8% No learning mode (can't auto-generate configs)
  • ~5% No violation monitoring (can't see what's being blocked)
  • ~4% No PID isolation (can see host processes)
  • ~3% No defense-in-depth for filesystem (single Seatbelt layer)

Part 4: Closing the macOS Gap (Constraint: No Root, No Containers)

Hard Constraints

These are non-negotiable requirements for any macOS solution:

  1. No root/sudo/admin access — must run as a regular user
  2. No containerization — process runs on the host (macOS 26 Linux containers are out of scope)
  3. SIP enabled — default macOS, cannot ask users to disable it
  4. All traffic must be captured — same fail-closed behavior as Linux

Tool-by-Tool Privilege Audit

Every macOS tracing/filtering tool was evaluated against the "no root" constraint:

| Tool | Requires root? | Requires admin? | Works as user? | Verdict |
|---|---|---|---|---|
| sandbox-exec (Seatbelt) | No | No | Yes | Current approach, works |
| log stream (violation monitoring) | No | No | Yes | Viable for monitoring |
| fs_usage (filesystem tracing) | Yes | | No | Blocked without sudo. Viable for learning mode with admin access; not dtrace-based, unaffected by SIP. See "macOS Learning Mode via fs_usage" section. |
| eslogger (Endpoint Security) | Yes + Full Disk Access | | No | Blocked |
| dtrace / dtruss / opensnoop | Yes + SIP blocks entirely | | No | Blocked even with sudo: SIP disables the syscall dtrace provider system-wide, not just for SIP-protected binaries. All three tools are dtrace-based and fail identically. |
| pfctl / pf rules | Yes | | No | Blocked |
| Network Extensions | No | Yes (system dialog) | No | Blocked |
| Endpoint Security framework | Yes + Apple-restricted entitlement | | No | Blocked |
| TUN/TAP devices | Yes | | No | Blocked |
| DYLD_INSERT_LIBRARIES | No | No | Partial | Works for non-hardened binaries only |

Only three mechanisms work as a regular user: sandbox-exec, log stream, and DYLD_INSERT_LIBRARIES (with caveats).


1. Learning Mode

Why traditional tracing tools are all blocked

| Tool | Why it fails without root |
|---|---|
| fs_usage | Requires root. Uses kernel tracing facility (kdebug). "Permission denied" without sudo. |
| eslogger | Requires root + Full Disk Access. Endpoint Security API enforces ES_NEW_CLIENT_RESULT_ERR_NOT_PRIVILEGED. |
| dtrace / dtruss | Requires root. With SIP enabled, even root cannot trace SIP-protected system binaries (/bin/sh, /usr/bin/env). |
| opensnoop | Requires root. Built on DTrace, so it has the same limitation. |
| Endpoint Security framework | Requires com.apple.developer.endpoint-security.client restricted entitlement (Apple approval, months-long process) AND root. |
| FileMonitor (Objective-See) | Built on Endpoint Security, so it has the same entitlement + root requirement. |
| FSEvents API | Works as user BUT has no per-process attribution: it reports filesystem changes without recording which process caused them. Useless for learning. |
| kqueue EVFILT_VNODE | Works as user for monitoring writes/deletes/renames, BUT cannot detect reads/opens (no NOTE_OPEN or NOTE_ACCESS flag). Also no per-process attribution. |

There is no standalone filesystem tracing tool on macOS that works without root. Apple considers process tracing a privileged operation.

Viable approach: Seatbelt (allow ... (with report)) + log stream

This is positive learning — like strace — not iterative reverse learning. The process runs once, to completion, and greywall captures what it accessed.

How it works:

The SBPL language supports a (with report) modifier on allow rules. By default, only denied operations are logged. Adding (with report) causes the sandbox kernel to also log permitted operations to the macOS unified log. These entries are readable via log stream without root.

; Learning profile — allow everything, report everything
(version 1)
(allow default)  ; permissive baseline

; Report all file reads
(allow file-read-data (with report) (subpath "/"))

; Report all file writes
(allow file-write-data (with report) (subpath "/"))
(allow file-write-create (with report) (subpath "/"))
(allow file-write-unlink (with report) (subpath "/"))

; Report all network
(allow network-outbound (with report))
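A profile like the one above is regular enough to be rendered from a list of operations. The Go sketch below reproduces exactly the rules shown, under the assumption that the learning profile is built as a plain string; `reportedFileOps` and `learningProfile` are illustrative names, not the project's API.

```go
package main

import (
	"fmt"
	"strings"
)

// reportedFileOps are the file operations reported in the example profile.
var reportedFileOps = []string{
	"file-read-data",
	"file-write-data",
	"file-write-create",
	"file-write-unlink",
}

// learningProfile renders a permissive SBPL profile that allows everything
// and attaches (with report) to the listed operations.
func learningProfile() string {
	var b strings.Builder
	b.WriteString("(version 1)\n(allow default)\n")
	for _, op := range reportedFileOps {
		fmt.Fprintf(&b, "(allow %s (with report) (subpath \"/\"))\n", op)
	}
	b.WriteString("(allow network-outbound (with report))\n")
	return b.String()
}

func main() {
	fmt.Print(learningProfile())
}
```

The resulting string would be passed to `sandbox-exec -p`, as in step 3 of the learning flow.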

On the host side, greywall captures the log stream:

log stream --style compact --predicate 'sender=="Sandbox"'

Single-pass learning flow:

greywall --learning -- <command>
    ↓
1. Generate a PERMISSIVE Seatbelt profile with (allow ... (with report)) on
   every operation category (file-read, file-write, network, etc.)
    ↓
2. Start `log stream` in background goroutine, filtering for Sandbox events
   with the session suffix tag
    ↓
3. Run: sandbox-exec -p '<learning-profile>' /bin/bash -c '<command>'
   → Process runs to completion normally (all operations allowed)
   → Every file read/write/network operation logged by the sandbox kernel
    ↓
4. After command exits, stop `log stream`, parse captured events
   → Extract file paths and operation types (read vs write)
   → Filter out system paths, temp paths, shared libraries
    ↓
5. Feed into existing CollapsePaths() and buildTemplate()
   (same platform-independent code as Linux learning.go)
    ↓
6. Save template to ~/.config/greywall/learned/<cmdname>.json
   → Auto-loaded on next run

Comparison with Linux strace approach:

| Aspect | Linux (strace) | macOS (Seatbelt report) |
|---|---|---|
| Mechanism | ptrace syscall tracing | MACF kernel hooks + unified log |
| Runs as user? | Yes (ptrace same-UID) | Yes (sandbox-exec + log stream) |
| Single-pass? | Yes | Yes |
| Traces child processes? | Yes (-f flag) | Yes (sandbox profile is inherited by children) |
| Read vs write distinction? | Yes (O_RDONLY vs O_WRONLY flags) | Yes (file-read-data vs file-write-data operations) |
| Path detail? | Full path in syscall args | Full path in log message |
| Interactive/TUI support? | Yes (foreground strace) | Yes (sandbox-exec preserves terminal) |
| Security during learning? | Relaxed (no seccomp/Landlock) | Relaxed (permissive profile) |
| Output format | Structured syscall log | Unified log text (needs parsing) |

What this does NOT capture (limitations vs strace):

  • Exact open flags (O_CREAT, O_TRUNC, O_APPEND): Seatbelt logs the operation type but not the flags passed to open(2)
  • File descriptor numbers — not relevant for template generation
  • Syscall return values — Seatbelt doesn't log success/failure details for allowed operations

These limitations don't matter for template generation — the goal is "which paths were read, which were written," and Seatbelt (with report) provides exactly that.

Implementation requirements:

  • New file: learning_darwin.go (parallel to learning_linux.go)
  • CheckLearningAvailable() → verify sandbox-exec exists (always true on macOS)
  • GenerateLearningProfile() → SBPL with (allow ... (with report)) rules
  • ParseSandboxLog() → parse log stream output, extract paths + operation types, map to StraceResult{WritePaths, ReadPaths}
  • Rest of pipeline: reuse GenerateLearnedTemplate(), CollapsePaths(), buildTemplate() from learning.go
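A rough Go sketch of what a ParseSandboxLog() step could look like is given below. It rests on a loud assumption: that each relevant log line contains the SBPL operation name followed by a space and an absolute path. The real unified-log layout for (with report) entries has to be verified on a live system, and the sample lines in `main` are invented for illustration. `LearnedPaths` is a hypothetical stand-in for the StraceResult type named above.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// LearnedPaths is a hypothetical stand-in for StraceResult.
type LearnedPaths struct {
	ReadPaths  []string
	WritePaths []string
}

// Operations to look for, and whether each counts as a write.
var ops = []struct {
	name    string
	isWrite bool
}{
	{"file-read-data", false},
	{"file-write-data", true},
	{"file-write-create", true},
	{"file-write-unlink", true},
}

// parseSandboxLog extracts "<operation> <absolute path>" pairs from log text.
// ASSUMPTION: the line format is not confirmed; only the op/path pair is used.
func parseSandboxLog(text string) LearnedPaths {
	var res LearnedPaths
	seen := map[string]bool{}
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		for _, op := range ops {
			_, rest, ok := strings.Cut(sc.Text(), op.name+" ")
			if !ok || !strings.HasPrefix(rest, "/") {
				continue
			}
			path := strings.TrimSpace(rest)
			key := fmt.Sprintf("%t\x00%s", op.isWrite, path)
			if !seen[key] { // de-duplicate per (read/write, path)
				seen[key] = true
				if op.isWrite {
					res.WritePaths = append(res.WritePaths, path)
				} else {
					res.ReadPaths = append(res.ReadPaths, path)
				}
			}
			break
		}
	}
	return res
}

func main() {
	// Invented sample lines; not captured from a real system.
	sample := "Sandbox: cat(4242) allow file-read-data /Users/u/project/README.md\n" +
		"Sandbox: sh(4243) allow file-write-create /Users/u/project/out.txt\n"
	fmt.Printf("%+v\n", parseSandboxLog(sample))
}
```

Filtering of system paths and shared libraries (step 4 of the flow) would happen after this extraction, before the paths reach CollapsePaths().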

Alternative: Iterative reverse learning (fallback)

If the (with report) log entries prove too noisy or unreliable, a fallback approach exists:

  1. Run with a restrictive sandbox profile
  2. Capture denial messages via log stream --predicate 'sender=="Sandbox"'
  3. Parse denials → discover what paths the process needs
  4. Relax the config, run again
  5. Repeat until no more denials

This is slower (multi-pass, process may fail early on first run) but doesn't depend on (with report) behavior.
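The multi-pass loop can be sketched in Go as follows. The `runFunc` hook is hypothetical: it stands for "run the command under a profile that allows these paths and return the denied paths parsed from log stream". The sketch only shows the convergence logic, with a pass limit so a misbehaving command cannot loop forever.

```go
package main

import "fmt"

// runFunc is a hypothetical hook: run the command with the given allowlist
// and return the paths that were denied during that run.
type runFunc func(allowed map[string]bool) []string

// learnIteratively widens the allowlist until a run produces no new denials
// or maxPasses is reached. It returns the allowlist and the passes used.
func learnIteratively(run runFunc, maxPasses int) (map[string]bool, int) {
	allowed := map[string]bool{}
	for pass := 1; pass <= maxPasses; pass++ {
		added := 0
		for _, p := range run(allowed) {
			if !allowed[p] {
				allowed[p] = true
				added++
			}
		}
		if added == 0 {
			return allowed, pass // converged: nothing new was denied
		}
	}
	return allowed, maxPasses
}

func main() {
	// Simulated command that aborts at its first denied path, which is the
	// "process may fail early" case described above.
	needs := []string{"/etc/hosts", "/usr/lib/libfoo.dylib", "/Users/u/data"}
	fake := func(allowed map[string]bool) []string {
		for _, p := range needs {
			if !allowed[p] {
				return []string{p}
			}
		}
		return nil
	}
	allowed, passes := learnIteratively(fake, 10)
	fmt.Println(len(allowed), "paths learned in", passes, "passes")
}
```

With a command that stops at its first denial, the number of passes grows with the number of distinct paths, which is the main cost of this fallback compared with single-pass learning.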


2. Violation Monitoring — WORKS Without Root

log stream is the one viable tool. It can filter sandbox violations without root:

log stream --style compact --predicate 'sender=="Sandbox"'

Greywall already generates a unique sessionSuffix per sandbox session (in macos.go). All Seatbelt deny rules include this suffix in the (with message ...) tag, so violations can be filtered to the current session.

Implementation plan:

  • Start log stream as a subprocess when --monitor flag is set
  • Filter predicate: sender=="Sandbox" AND eventMessage CONTAINS "<sessionSuffix>"
  • Parse output in a goroutine, extract violation type and path
  • Format and print to stderr (matching Linux eBPF output format)

This is directly achievable and proven by Anthropic's sandbox-runtime.
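
A minimal Go sketch of the first two steps of the plan, assuming the session suffix contains only characters that need no escaping inside the predicate string. The function names are hypothetical; the log stream arguments are the ones shown above.

```go
package main

import (
	"fmt"
	"os/exec"
)

// violationPredicate builds the log stream filter for one sandbox session.
// Assumption: sessionSuffix is alphanumeric, so it needs no quoting.
func violationPredicate(sessionSuffix string) string {
	return fmt.Sprintf(`sender=="Sandbox" AND eventMessage CONTAINS "%s"`, sessionSuffix)
}

// logStreamCommand prepares (but does not start) the monitor subprocess.
// The caller would attach a pipe to Stdout and parse lines in a goroutine.
func logStreamCommand(sessionSuffix string) *exec.Cmd {
	return exec.Command("log", "stream", "--style", "compact",
		"--predicate", violationPredicate(sessionSuffix))
}
```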


3. DNS Filtering — IMPOSSIBLE as Transparent, PARTIAL via Seatbelt

The problem: On macOS, DNS resolution goes through mDNSResponder via a Unix domain socket (/private/var/run/mDNSResponder). There is no way to intercept or redirect DNS queries for a specific process without root.

Why a local DNS proxy doesn't work without root:

  • A user-level process CAN bind to a high port (e.g., 5353) and run a DNS proxy
  • But the sandboxed process uses the system resolver, which queries mDNSResponder via Unix socket, NOT via UDP to a configurable nameserver
  • You cannot override per-process DNS on macOS without root (changing the resolver configuration through scutil requires root, and networksetup requires admin; both act system-wide rather than per process)
  • Setting NAMESERVER env var does nothing — macOS programs use getaddrinfo() which goes through mDNSResponder

What Seatbelt CAN do:

  • Block the mDNSResponder Unix socket connection: (deny network-outbound (remote unix-socket (path-literal "/private/var/run/mDNSResponder"))) — this blocks ALL DNS, including resolution of allowed hosts
  • There is no Seatbelt rule to selectively allow DNS for specific domains

Realistic options:

  1. Block all DNS via Seatbelt (deny the mDNSResponder socket) — nuclear option, breaks most programs
  2. Allow all DNS (current behavior) — no filtering
  3. DYLD_INSERT_LIBRARIES to intercept getaddrinfo() and filter at the libc level — works for non-hardened binaries only (see section 4)

Verdict: Per-domain DNS filtering is not possible on macOS as a regular user without DYLD_INSERT_LIBRARIES, and even then only for non-hardened binaries.


4. Transparent Network Proxy — PARTIAL via DYLD_INSERT_LIBRARIES

This is the most nuanced gap. There is exactly one mechanism that works without root: library interposition.

How it works:

┌─────────────────────────────────────────────────────┐
│  greywall sets up:                                   │
│  1. Local SOCKS5 proxy on localhost:PORT             │
│  2. redirect.dylib (intercepts connect/sendto/etc.)  │
│  3. Seatbelt profile: deny all network EXCEPT        │
│     localhost:PORT                                    │
│  4. DYLD_INSERT_LIBRARIES=redirect.dylib             │
├─────────────────────────────────────────────────────┤
│  sandbox-exec runs the target with:                  │
│  - Seatbelt profile active (kernel-enforced)         │
│  - redirect.dylib loaded (user-space interposition)  │
├─────────────────────────────────────────────────────┤
│  When target calls connect():                        │
│  - redirect.dylib intercepts → rewrites to proxy     │
│  - Seatbelt allows because dest is localhost:PORT    │
│                                                      │
│  If interposition fails (hardened binary):            │
│  - connect() goes to original destination             │
│  - Seatbelt BLOCKS because dest is not localhost:PORT │
│  - Result: EPERM → no network access (FAIL-CLOSED)  │
└─────────────────────────────────────────────────────┘

The key property is fail-closed: Even when DYLD interposition doesn't work, the Seatbelt network deny is kernel-enforced and cannot be bypassed. The process simply has no network access rather than unfiltered access.
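
The two halves of the setup in the diagram can be sketched in Go as below. This is an illustration under assumptions: the function names are hypothetical, the rule syntax follows the `(allow network-outbound (remote ip "host:port"))` form used elsewhere in this document, and the dylib path is a placeholder.

```go
package main

import "fmt"

// proxyOnlyNetworkRules returns SBPL rules that deny all network access
// and then re-allow outbound connections to the local proxy port only.
// This is the kernel-enforced half that makes the design fail-closed.
func proxyOnlyNetworkRules(proxyPort int) string {
	return fmt.Sprintf(
		"(deny network*)\n(allow network-outbound (remote ip \"localhost:%d\"))\n",
		proxyPort)
}

// redirectEnv appends the interposition variable to a copy of the base
// environment. dylibPath is wherever the redirect dylib was extracted.
func redirectEnv(base []string, dylibPath string) []string {
	env := append([]string{}, base...)
	return append(env, "DYLD_INSERT_LIBRARIES="+dylibPath)
}
```

If dyld strips the variable for a hardened binary, only the second function's effect is lost; the rules from the first still apply.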

When DYLD_INSERT_LIBRARIES works:

  • Homebrew-installed binaries (/opt/homebrew/bin/*)
  • User-compiled binaries
  • Binaries without hardened runtime or __RESTRICT segment
  • Most CLI tools installed via package managers (npm global, pip, cargo, etc.)

When it does NOT work (dyld strips the variable):

  • System binaries (/usr/bin/*, /bin/*, /sbin/*) — SIP-protected
  • Mac App Store apps — hardened runtime
  • Binaries with CS_RESTRICT, CS_REQUIRE_LV, or CS_RUNTIME code signing flags
  • Binaries with __RESTRICT,__restrict segment

Execution order matters: DYLD_INSERT_LIBRARIES injection happens BEFORE Seatbelt profile enforcement. The dylib loads during dyld initialization, then sandbox-exec applies the profile via __mac_syscall. This is documented behavior and the basis of multiple CVEs.

What the redirect.dylib would need to intercept:

  • connect() — TCP connection establishment
  • sendto() — UDP (including DNS if not going through mDNSResponder)
  • getaddrinfo() — DNS resolution (for DNS filtering)
  • bind() — Prevent binding to non-localhost addresses

Implementation approach:

  • Write a small C dylib using __attribute__((used)) __DATA,__interpose section
  • Dylib redirects all connect() calls to a local SOCKS5 proxy
  • Dylib intercepts getaddrinfo() for domain-based filtering
  • Embed the compiled dylib in the greywall binary (extract to temp at runtime, like tun2socks on Linux)
  • Seatbelt profile denies all network except localhost:PROXY_PORT

Comparison with Linux approach:

| Aspect | Linux (tun2socks) | macOS (DYLD + Seatbelt) |
| --- | --- | --- |
| Capture rate | 100% (kernel TUN device) | Variable (depends on binary hardening) |
| Bypass possible? | No (namespace isolation) | Yes (hardened/system binaries) |
| Fail mode when bypass | N/A | Fail-closed (Seatbelt blocks all network) |
| Works for Go binaries? | Yes (kernel-level) | Needs testing (Go on macOS has called libSystem rather than issuing raw syscalls since Go 1.12, so interposition may apply) |
| Works for static binaries? | Yes (kernel-level) | No (no dynamic linker) |
| Root required? | No (bwrap is unprivileged) | No |

5. Summary: What's Achievable

| Linux feature | macOS equivalent | Works as user? | Coverage |
| --- | --- | --- | --- |
| Filesystem sandbox | Seatbelt SBPL | Yes | Full: kernel-enforced, same as Linux |
| Network deny | Seatbelt (deny network*) | Yes | Full: kernel-enforced |
| Network redirect | DYLD interposition + Seatbelt | Yes | Partial: non-hardened binaries only. Fail-closed for hardened. |
| DNS filtering | DYLD getaddrinfo() intercept | Yes | Partial: same binary restrictions as network redirect |
| Violation monitoring | log stream parsing | Yes | Full: captures all Seatbelt denials |
| Learning mode | Seatbelt (allow (with report)) + log stream | Yes | Full for sandbox-exec scope: single-pass, positive learning |
| Env sanitization | Strip DYLD_* | Yes | Full: already implemented |
| Command blocking | Shared engine | Yes | Full: already implemented |
| Seccomp equivalent | Seatbelt operation-level rules | Yes | Partial: covers process/network/IPC but not raw syscall numbers |
| PID isolation | Impossible | | |
| Landlock equivalent | Impossible | | |

6. Honest Assessment: The Fundamental macOS Limitation

The core problem is Apple's security architecture:

On Linux, sandboxing tools run as an unprivileged user because the kernel provides user namespaces — a mechanism explicitly designed for unprivileged process isolation. bwrap uses CLONE_NEWUSER to create sandboxes without root. strace uses ptrace which is allowed between processes with the same UID.

On macOS, Apple's security model is: the OS protects the user FROM processes, not processes FROM each other. All the powerful mechanisms (Endpoint Security, Network Extensions, DTrace, pf) are designed for system administrators and enterprise MDM, not for user-level process isolation. Apple's answer to "how do I sandbox a process?" is "use the App Sandbox entitlement" — which requires being an app developer with code signing, not a CLI tool.

The result:

  • Filesystem control: Fully solved. sandbox-exec works as a user and is kernel-enforced.
  • Network control: Partially solved. Seatbelt deny is kernel-enforced (fail-closed), but transparent redirection only works for non-hardened binaries via DYLD interposition.
  • Observability: Mostly solved. log stream works for both violation monitoring and learning mode via Seatbelt's (with report) mechanism.
  • Process isolation: Not possible. No user namespaces on macOS.

There is no path to full parity with Linux without root privileges for network redirection and process isolation. But filesystem sandboxing, learning mode, and violation monitoring are all achievable as a regular user — the macOS gap is narrower than it first appears.

What Would Change With Root

For documentation completeness — if the "no root" constraint were relaxed:

| Feature | Tool | Privilege needed | Improvement over user-level |
| --- | --- | --- | --- |
| Learning mode (structured) | fs_usage | sudo | More structured output, exact flags |
| Learning mode (JSON) | eslogger | sudo + Full Disk Access | Process tree tracking, JSON format |
| DNS filtering (system-level) | Local DNS proxy + scutil | sudo | System-level DNS redirect, works for all programs |
| Transparent proxy (all traffic) | pf rules | sudo | Catches 100% of traffic including hardened binaries |
| Transparent proxy (Apple-approved) | Network Extension | Admin (system dialog) | Officially supported, future-proof |

macOS Learning Mode via fs_usage (with admin access)

Why dtrace-based tools are ALL blocked by SIP

A common misconception is that SIP only blocks tracing of SIP-protected binaries (those in /usr/bin/, /System/, etc.). In reality, SIP disables the syscall dtrace provider entirely, regardless of what process you're targeting. Even sudo dtrace fails:

$ sudo dtrace -n 'syscall::open*:entry /execname == "myapp"/ { printf("%s", copyinstr(arg0)); }'
dtrace: system integrity protection is on, some features will not be available
dtrace: failed to initialize dtrace: DTrace requires additional privileges

This eliminates all three commonly suggested tools:

| Tool | Based on | Status with SIP |
| --- | --- | --- |
| dtrace (custom scripts) | dtrace | Blocked: syscall provider disabled |
| dtruss (strace equivalent) | dtrace | Blocked: same reason |
| opensnoop (file open tracer) | dtrace | Blocked: same reason |

Partially disabling SIP to re-enable DTrace (csrutil enable --without dtrace) requires rebooting into Recovery mode and is not a viable requirement for end users.

fs_usage — the only viable macOS tracing tool

fs_usage is not dtrace-based. It reads directly from the kernel's kdebug tracing facility, which SIP does not restrict. It works with just sudo:

sudo fs_usage -w -f filesys opencode

This was confirmed working on macOS with SIP enabled (tested February 2026).

Key properties:

  • Name-based filtering: fs_usage <processname> filters by comm name, not PID. Start it before the process exists — it catches events from the moment any process with that name spawns. No PID race condition.
  • Full path capture: Output includes resolved absolute paths for every filesystem operation.
  • Operation types: Distinguishes reads (open, read) from writes (open with write flags, write, mkdir, unlink, rename).
  • No SIP issues: Works for ALL binaries — system, Homebrew, user-compiled, hardened or not.
  • Pre-installed: Ships with every macOS installation.
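
The read/write distinction described in the list above could be expressed as a small classifier. This Go sketch is hypothetical: `classifyOp` and the `access` type are not part of greywall, the operation names are only the ones listed above, and how the write intent of an `open` is recovered from fs_usage output is left to the parser (it is passed in as a boolean here).

```go
package main

// access is the kind of filesystem access an operation implies.
type access int

const (
	accessNone access = iota
	accessRead
	accessWrite
)

// classifyOp maps an fs_usage operation name to read or write access.
// openedForWrite only matters for "open"; the caller is assumed to have
// derived it from the flags shown in the fs_usage line.
func classifyOp(op string, openedForWrite bool) access {
	switch op {
	case "open":
		if openedForWrite {
			return accessWrite
		}
		return accessRead
	case "read":
		return accessRead
	case "write", "mkdir", "unlink", "rename":
		return accessWrite
	}
	return accessNone // operation not relevant for template generation
}
```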

Architecture: privilege separation

The critical design point: sudo is used only for the monitoring process, completely isolated from the sandboxed command. The sandboxed command never receives elevated privileges.

greywall (orchestrator, unprivileged)
├── sudo fs_usage -w -f filesys <cmdname>     ← privileged monitor (separate process)
│   └── reads kernel kdebug tracebuffer        ← passive observer, no interaction with sandbox
│   └── output piped to temp file
│
└── sandbox-exec -p '<permissive>' -- <cmd>    ← sandboxed command (unprivileged)
    └── runs as current user, no sudo
    └── no access to the monitor process

The monitor and the sandboxed command share no file descriptors, no IPC, no environment. The sudo elevation cannot leak to the sandbox.
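
A minimal Go sketch of how the two processes could be constructed separately, using the command lines from the tree above. The function names are hypothetical, and the permissive profile text is supplied by the caller.

```go
package main

import "os/exec"

// monitorCommand prepares the privileged observer. Only this process is
// started through sudo; it never receives the target's arguments.
func monitorCommand(cmdName string) *exec.Cmd {
	return exec.Command("sudo", "fs_usage", "-w", "-f", "filesys", cmdName)
}

// sandboxCommand prepares the unprivileged target under sandbox-exec.
// It is built independently of the monitor, so nothing is inherited from it.
func sandboxCommand(profile string, argv []string) *exec.Cmd {
	args := append([]string{"-p", profile, "--"}, argv...)
	return exec.Command("sandbox-exec", args...)
}
```

In this sketch the orchestrator would start the monitor first with its output redirected to a temp file, then run the sandboxed command, and finally terminate the monitor.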

Learning flow on macOS

greywall --learning -- <command>
    ↓
1. Prompt user for admin password (sudo)
    ↓
2. Start `sudo fs_usage -w -f filesys <cmdname>` in background
   → Output redirected to temp file
    ↓
3. Start `sandbox-exec -p '<permissive-profile>' -- <command>`
   → Permissive Seatbelt profile (allow default)
   → Runs as current user, no elevation
    ↓
4. Wait for sandboxed command to exit
    ↓
5. Kill fs_usage monitor, parse temp file
   → Extract file paths and operation types
   → Distinguish reads vs writes from fs_usage output format
   → Filter out system paths, temp paths, shared libraries
    ↓
6. Feed into existing CollapsePaths() and buildTemplate()
   → Same platform-independent pipeline as Linux
    ↓
7. Save template to ~/.config/greywall/learned/<cmdname>.json
   → Auto-loaded on next run

Comparison: Linux strace vs macOS fs_usage for learning

| Aspect | Linux (strace) | macOS (fs_usage) |
| --- | --- | --- |
| Underlying mechanism | ptrace syscall tracing | kdebug kernel tracing facility |
| Privilege required | None (ptrace own child) | sudo (for kdebug access) |
| SIP/security restriction | N/A | No restriction (not dtrace-based) |
| Process filtering | Automatic (traces child) | Name-based (fs_usage <name>) |
| Catches from process start? | Yes (strace launches the command) | Yes (name filter, start monitor first) |
| Full paths? | Yes (in syscall arguments) | Yes (resolved in output) |
| Read vs write distinction? | Yes (O_RDONLY vs O_WRONLY flags) | Yes (operation type in output) |
| Traces child processes? | Yes (-f flag) | Yes (all processes matching name) |
| Works for all binaries? | Yes | Yes (no SIP restriction) |
| Interactive/TUI support? | Yes (foreground strace) | Yes (sandbox-exec preserves terminal) |
| Output format | Structured syscall log | Text with operation, path, timing |
| Post-processing | ParseStraceLog() | New ParseFsUsageLog() needed |

Implementation requirements

  • New file: learning_darwin.go (replace current stub that returns "learning mode is only available on Linux")
  • CheckFsUsageAvailable() → verify fs_usage exists (always true on macOS) and sudo access
  • StartFsUsageMonitor(cmdName string) → spawn sudo fs_usage -w -f filesys <cmdname>, redirect output to temp file
  • StopFsUsageMonitor() → kill the fs_usage process
  • ParseFsUsageLog(logPath string) → parse fs_usage output, extract paths + operation types, return *StraceResult{WritePaths, ReadPaths}
  • Rest of pipeline: reuse GenerateLearnedTemplate(), CollapsePaths(), buildTemplate() from learning.go
  • Manager changes: wrapCommandLearning() needs a macOS path that starts the monitor, runs sandbox-exec with permissive profile, then stops monitor

Open questions

  • Name collision: fs_usage <name> matches ALL processes with that name system-wide. If another process named opencode is running, its events would be captured too. Mitigation: post-filter by PID range and timing (events between sandbox start and stop), or warn the user.
  • sudo UX: How to handle the sudo prompt? Options: (a) prompt inline before learning starts, (b) use osascript for a macOS password dialog, (c) require pre-authentication (sudo -v) before running greywall.
  • fs_usage output format: The exact format varies slightly across macOS versions. Needs testing on macOS 13, 14, and 15 to ensure the parser is robust.

Part 4b: Complete Functionality Table — Linux vs macOS

Every sandboxing capability, how it's implemented on each platform, and what to do about gaps.

Filesystem Control

| Functionality | Linux | macOS (current) | macOS (proposed, no root) |
| --- | --- | --- | --- |
| Deny-by-default filesystem | bwrap --tmpfs / + selective --ro-bind mounts | Seatbelt (deny default) + explicit (allow file-read-data (subpath ...)) | Already implemented. Equivalent. |
| Read control (system paths) | bwrap mounts /usr, /bin, /lib, /etc read-only | Seatbelt (allow file-read-data (subpath "/usr")) etc. | Already implemented. Equivalent. |
| Read control (user paths) | bwrap mounts specific home dirs read-only | Seatbelt (allow file-read-data (subpath "~/.nvm")) etc. | Already implemented. Equivalent. |
| Write control (CWD) | bwrap --bind cwd cwd | Seatbelt (allow file-write* (subpath cwd)) | Already implemented. Equivalent. |
| Sensitive file masking (.env) | bwrap --ro-bind /dev/null .env (file is replaced with empty) | Seatbelt (deny file-read-data (literal ".env")) (file exists but is unreadable) | Already implemented. Slightly different: Linux hides the file, macOS blocks access. Both prevent data leakage. |
| Protected file read-only (.bashrc, .gitconfig) | bwrap --ro-bind | Seatbelt (deny file-write* (literal ...)) | Already implemented. Equivalent. |
| Glob pattern matching | Landlock: expand globs, add PATH_BENEATH rules | Seatbelt: convert globs to regex via GlobToRegex() | Already implemented. Equivalent. |
| File movement blocking | Landlock REFER right (ABI v2+) blocks cross-dir renames | Seatbelt (deny file-write-unlink) with ancestor path blocking | Already implemented. macOS is actually more explicit here. |
| Symlink escape prevention | bwrap resolves symlinks before mounting | Seatbelt resolves at kernel level (MACF hooks) | Already implemented. Both handle this. |
| Kernel-level fs enforcement (defense-in-depth) | Landlock LSM: second layer behind bwrap mounts | Not possible: no user-space LSM on macOS | Gap: impossible. Seatbelt is the single enforcement layer. No kernel LSM available to regular users. If Seatbelt has a bug, there's no backup. |
| Filesystem visibility (mount namespace) | bwrap --tmpfs /: files literally don't exist in sandbox | Not possible: no mount namespace on macOS | Gap: impossible. Seatbelt denies access but files remain visible to stat() (metadata). (allow file-read-metadata) is required globally for path resolution. |

Network Control

| Functionality | Linux | macOS (current) | macOS (proposed, no root) |
| --- | --- | --- | --- |
| Network isolation (namespace) | bwrap --unshare-net: separate network stack | Not possible: no network namespace on macOS | Gap: impossible. Seatbelt denies network operations but the process shares the host network stack. |
| Block all outbound | bwrap network namespace (no interfaces) | Seatbelt (deny network*), kernel-enforced via MACF | Already implemented. Equivalent effect. |
| Transparent proxy (all traffic) | tun2socks: TUN device captures all packets, routes through SOCKS5 | Not implemented | Proposed: DYLD_INSERT_LIBRARIES with connect() interposition dylib + Seatbelt (deny network-outbound) except proxy port. Works for non-hardened binaries. Fail-closed: hardened binaries get zero network (Seatbelt blocks). |
| Proxy env var fallback | HTTP_PROXY/HTTPS_PROXY/ALL_PROXY set when TUN unavailable | HTTP_PROXY/HTTPS_PROXY/ALL_PROXY set when proxy configured | Already implemented. Equivalent. Apps that respect env vars are proxied on both platforms. |
| DNS filtering | DnsBridge: socat relays DNS over Unix socket to host DNS server with filtering | Not implemented | Proposed: DYLD_INSERT_LIBRARIES intercepting getaddrinfo() for domain filtering. Same binary restrictions as network redirect. Alternative: block all DNS via Seatbelt (nuclear, breaks most programs). |
| Exposed ports (inbound) | ReverseBridge: socat on host forwards to Unix socket inside sandbox | Seatbelt (allow network-bind (local ip "localhost:*")) + (allow network-inbound) | Already implemented. Different mechanism, same effect. |
| Unix socket access | bwrap --bind mounts sockets into sandbox | Seatbelt (allow network* (remote unix-socket (path-literal ...))) | Already implemented. Equivalent. |

Syscall/Operation Control

| Functionality | Linux | macOS (current) | macOS (proposed, no root) |
| --- | --- | --- | --- |
| Block ptrace | seccomp BPF: ptrace(101) → EPERM | Seatbelt: no process-info* for other processes; SIP also blocks ptrace on system binaries | Partially covered. Seatbelt + SIP provide equivalent protection for most scenarios. |
| Block mount/umount | seccomp BPF: mount(165), umount2(166) → EPERM | Not relevant: regular users cannot mount on macOS (no mount privilege) | No action needed. macOS doesn't allow user-level mounting. |
| Block reboot/shutdown | seccomp BPF: reboot(169) → EPERM + command blocker denies reboot | Command blocker denies reboot, shutdown etc. | Already implemented. Command blocker is shared. macOS also requires admin for reboot. |
| Block kernel module ops | seccomp BPF: init_module, delete_module → EPERM | Not relevant: kext loading requires root + SIP exemption on macOS | No action needed. |
| Block privilege escalation | seccomp BPF: personality, userfaultfd, perf_event_open, bpf, keyctl | Not relevant: these syscalls either don't exist or are already restricted by SIP on macOS | No action needed. macOS kernel restricts these at the platform level. |
| Mach IPC control | Not applicable (Linux has no Mach IPC) | Seatbelt (allow mach-lookup ...): allowlist of ~20 essential XPC services | Already implemented. macOS-specific. |
| IOKit control | Not applicable | Seatbelt (allow iokit-open ...): GPU, power management | Already implemented. macOS-specific. |
| sysctl control | Not implemented (could add via seccomp argument filtering) | Seatbelt (allow sysctl-read ...): 50+ sysctls allowlisted | Already implemented. macOS is ahead here. |

Process Isolation

| Functionality | Linux | macOS (current) | macOS (proposed, no root) |
| --- | --- | --- | --- |
| PID namespace | bwrap --unshare-pid: can't see host processes | Not possible | Gap: impossible. No user namespace equivalent on macOS. Sandboxed process can see all host PIDs. |
| Session isolation | bwrap --new-session: detach from controlling terminal | Not possible | Gap: impossible. sandbox-exec does not detach sessions. |
| Capability dropping | bwrap drops caps; --cap-add for specific ones | Not applicable: macOS doesn't use Linux capabilities model | Different model. macOS uses entitlements, not capabilities. |

Observability

| Functionality | Linux | macOS (current) | macOS (proposed, no root) |
| --- | --- | --- | --- |
| Violation monitoring | eBPF (bpftrace): attaches to syscall exit tracepoints, reports EACCES/EPERM/EROFS | Not implemented | Proposed: log stream filtering for sender=="Sandbox" with session suffix. Works as user. Proven by Anthropic's sandbox-runtime. Infrastructure (session suffix, log tagging) already exists in greywall codebase. |
| Learning mode (positive) | strace: traces file-access syscalls, parses log, generates config template | Not implemented | Proposed: Seatbelt (allow ... (with report)) + log stream. Permissive sandbox profile logs every permitted operation. Parse log → extract paths → generate template. Single-pass, positive learning, works as user. |
| Log tagging | Not implemented (eBPF shows PID but no command tag) | Seatbelt (with message `"CMD64_<base64>_END_<suffix>"`): per-session violation tags | Already implemented. macOS is ahead here. |
| Template auto-loading | Saved to `~/.config/greywall/learned/<cmd>.json`, loaded on next run | Not implemented | Reuse same mechanism: template format and auto-loading are platform-independent. |

Caller-Side Pre-Filter

These operate BEFORE the sandbox is created. They filter the top-level command string submitted to greywall -- <command> by the caller (e.g., an AI agent framework). They do NOT intercept commands executed by child processes inside the sandbox — that's the job of the runtime layers above (seccomp blocks the reboot syscall, Seatbelt blocks filesystem operations, network namespace blocks connections, etc.).

| Functionality | Linux | macOS (current) | macOS (proposed, no root) |
| --- | --- | --- | --- |
| Command blocking | Shared engine: deny/allow lists, shell parsing, nested expansion. Rejects greywall -- "git push" before sandbox starts. | Same | Already implemented. Shared code. |
| SSH policy | Shared engine: host patterns, remote command filtering | Same | Already implemented. Shared code. |
| Env sanitization (Linux) | Strip LD_PRELOAD, LD_LIBRARY_PATH, LD_AUDIT, all LD_* | N/A on macOS | N/A. |
| Env sanitization (macOS) | N/A on Linux | Strip DYLD_INSERT_LIBRARIES, DYLD_LIBRARY_PATH, all DYLD_* | Already implemented. Note: greywall must selectively NOT strip DYLD_INSERT_LIBRARIES when it's setting it for the redirect.dylib. The sanitization should strip user-provided DYLD vars but preserve greywall's own. |
| Dangerous file lists | Shared: .gitconfig, .bashrc, .zshrc, .env*, .git/hooks, .vscode, .idea | Same | Already implemented. Shared lists. |

Summary: Gap Status

| Gap | Status | Reason |
| --- | --- | --- |
| Learning mode | Solvable | Two approaches: (1) Seatbelt (allow (with report)) + log stream (no root needed), (2) sudo fs_usage (requires admin, more reliable, not dtrace-based, unaffected by SIP). |
| Violation monitoring | Solvable | log stream with session suffix filtering. No root needed. |
| Transparent network proxy | Partially solvable | DYLD interposition for non-hardened binaries. Fail-closed via Seatbelt for the rest. |
| DNS filtering | Partially solvable | DYLD getaddrinfo() intercept for non-hardened binaries. |
| Kernel-level fs backup (Landlock) | Impossible | No user-space LSM on macOS. |
| PID namespace isolation | Impossible | No user namespaces on macOS. |
| Network namespace isolation | Impossible | No user namespaces on macOS. |
| Mount namespace (file visibility) | Impossible | No mount namespace on macOS. |

Part 5: Detailed Component Reference

Linux Components

Bubblewrap (bwrap) — Namespace Isolation

Primary sandboxing primitive. Creates isolated namespaces for the sandboxed process.

Namespaces used:

| Namespace | Flag | Purpose |
| --- | --- | --- |
| Network | --unshare-net | Isolates network stack (no host network access) |
| PID | --unshare-pid | Process ID isolation |
| Session | --new-session | Detach from controlling terminal (disabled in learning mode) |

Filesystem mounting — three modes:

| Mode | Trigger | Root mount | Description |
| --- | --- | --- | --- |
| Deny-by-default | defaultDenyRead: true (default) | --tmpfs / | Empty root; system paths selectively mounted read-only. CWD mounted read-write. |
| Legacy | defaultDenyRead: false | --ro-bind / / | Entire root filesystem mounted read-only; specific paths overridden. |
| Learning | --learning flag | --ro-bind / / | Root read-only, home + CWD writable. Relaxed for strace tracing. |

Special filesystem handling:

| Path | Mount type | Reason |
| --- | --- | --- |
| /dev | --dev-bind | Preserve host device permissions |
| /proc | --proc | Fresh procfs |
| /tmp | --tmpfs | Always writable, isolated from host |
| /etc/resolv.conf | Special cross-mount handling | May be a symlink crossing mount boundaries |
| .env* files | Empty file bind mount | Mask sensitive project files |

Seccomp BPF — Syscall Filtering

BPF program generated and loaded at sandbox startup to block dangerous syscalls.

  1. BPF program generated as raw bytecode (8 bytes per instruction)
  2. Program loads the syscall number, compares against a blocklist
  3. Blocked syscalls return SECCOMP_RET_ERRNO | EPERM (silent denial)
  4. Unblocked syscalls return SECCOMP_RET_ALLOW
  5. Filter passed to bwrap via file descriptor: exec 3<filter; bwrap --seccomp 3
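
Steps 1 to 4 can be sketched in Go as follows. This is a simplified illustration, not greywall's generator: the names are hypothetical, the program omits the architecture check a production filter would normally perform first, and it assumes a little-endian target and at most 255 blocked syscalls (the jump offset is one byte). The opcode and return constants are the standard classic-BPF and seccomp values.

```go
package main

import "encoding/binary"

const (
	opLoadWordAbs = 0x20       // BPF_LD | BPF_W | BPF_ABS
	opJumpEqual   = 0x15       // BPF_JMP | BPF_JEQ | BPF_K
	opReturn      = 0x06       // BPF_RET | BPF_K
	retAllow      = 0x7fff0000 // SECCOMP_RET_ALLOW
	retErrno      = 0x00050000 // SECCOMP_RET_ERRNO
	errnoEPERM    = 1
)

// instruction mirrors struct sock_filter: 8 bytes per instruction.
type instruction struct {
	code   uint16
	jt, jf uint8
	k      uint32
}

// buildBlocklistFilter emits: load syscall number, one equality test per
// blocked syscall, then "allow" followed by "deny with EPERM".
func buildBlocklistFilter(blocked []uint32) []byte {
	n := len(blocked)
	prog := make([]instruction, 0, n+3)
	// seccomp_data.nr lives at offset 0.
	prog = append(prog, instruction{code: opLoadWordAbs, k: 0})
	for i, nr := range blocked {
		// On match, skip the remaining tests and the allow instruction.
		prog = append(prog, instruction{code: opJumpEqual, jt: uint8(n - i), k: nr})
	}
	prog = append(prog,
		instruction{code: opReturn, k: retAllow},
		instruction{code: opReturn, k: retErrno | errnoEPERM},
	)
	out := make([]byte, 0, 8*len(prog))
	for _, in := range prog {
		out = binary.LittleEndian.AppendUint16(out, in.code)
		out = append(out, in.jt, in.jf)
		out = binary.LittleEndian.AppendUint32(out, in.k)
	}
	return out
}
```

The resulting byte slice is what would be written to the file descriptor handed to bwrap in step 5.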

Blocked syscalls (27 listed below; ioperm and iopl apply to x86_64 only):

| Category | Syscalls |
| --- | --- |
| Process debugging/injection | ptrace, process_vm_readv, process_vm_writev |
| Kernel/privilege escalation | personality, userfaultfd, perf_event_open, bpf, keyctl, add_key, request_key |
| System control | mount, umount2, pivot_root, swapon, swapoff, sethostname, setdomainname |
| Kernel manipulation | kexec_load, kexec_file_load, reboot, init_module, finit_module, delete_module |
| System operations | syslog, acct, ioperm (x86_64), iopl (x86_64) |

Landlock — Kernel Filesystem Access Control

Linux Security Module (LSM) providing fine-grained filesystem access control, available since Linux 5.13.

  1. Greywall re-executes itself with --landlock-apply flag inside the sandbox
  2. Config passed via GREYWALL_CONFIG_JSON environment variable
  3. Ruleset created with SYS_LANDLOCK_CREATE_RULESET
  4. Rules added for each allowed path with SYS_LANDLOCK_ADD_RULE (type PATH_BENEATH)
  5. Ruleset applied with SYS_LANDLOCK_RESTRICT_SELF (irrevocable)
  6. PR_SET_NO_NEW_PRIVS required via prctl()

Access rights controlled (ABI v1-v5):

| Right | ABI | Description |
| --- | --- | --- |
| EXECUTE | v1 | Execute files |
| READ_FILE, READ_DIR | v1 | Read files and list directories |
| WRITE_FILE | v1 | Write/truncate files |
| MAKE_REG, MAKE_DIR, MAKE_SOCK, MAKE_FIFO, MAKE_SYM | v1 | Create filesystem objects |
| REMOVE_FILE, REMOVE_DIR | v1 | Delete filesystem objects |
| REFER | v2 | Cross-directory renames |
| TRUNCATE | v3 | Truncate files |
| IOCTL_DEV | v5 | Device ioctl operations |
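
Because rights arrive with different ABI versions, the handled-access mask has to be trimmed to what the running kernel supports. The Go sketch below shows one way to do that; the identifiers are hypothetical, and the bit positions follow the kernel's LANDLOCK_ACCESS_FS_* definitions (which also include MAKE_CHAR and MAKE_BLOCK, not listed in the table above).

```go
package main

// Bit positions follow the kernel's LANDLOCK_ACCESS_FS_* constants.
const (
	fsExecute uint64 = 1 << iota
	fsWriteFile
	fsReadFile
	fsReadDir
	fsRemoveDir
	fsRemoveFile
	fsMakeChar
	fsMakeDir
	fsMakeReg
	fsMakeSock
	fsMakeFifo
	fsMakeBlock
	fsMakeSym
	fsRefer    // ABI v2
	fsTruncate // ABI v3
	fsIoctlDev // ABI v5
)

// handledAccessFS returns every filesystem right available at the given
// Landlock ABI version, so newer rights are not requested on older kernels.
func handledAccessFS(abi int) uint64 {
	mask := (fsMakeSym << 1) - 1 // all ABI v1 rights (bits 0 through 12)
	if abi >= 2 {
		mask |= fsRefer
	}
	if abi >= 3 {
		mask |= fsTruncate
	}
	if abi >= 5 {
		mask |= fsIoctlDev
	}
	return mask
}
```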

eBPF Monitoring — Violation Detection

Real-time monitoring of sandbox violations via bpftrace. Observation only — does not enforce.

  1. bpftrace script generated with the sandbox PID
  2. Tracepoints attached to syscall exit points: openat, unlinkat, mkdirat, connect
  3. Filters for error codes: EACCES (-13), EPERM (-1), EROFS (-30), ECONNREFUSED (-111)
  4. PID filtering: pid >= SANDBOX_PID to exclude system daemons
  5. Violations formatted and printed to stderr
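
The filtering in steps 3 and 4 amounts to a small predicate. In greywall this logic lives in the generated bpftrace script; the Go version below is only an illustration of the same rule, with hypothetical names.

```go
package main

// violationErrnos lists the negative return values treated as violations.
var violationErrnos = map[int]string{
	-13:  "EACCES",
	-1:   "EPERM",
	-30:  "EROFS",
	-111: "ECONNREFUSED",
}

// classifyViolation reports whether a syscall exit event counts as a
// sandbox violation: the PID must be at or above the sandbox PID and the
// return value must be one of the tracked error codes.
func classifyViolation(pid, sandboxPID, ret int) (string, bool) {
	if pid < sandboxPID {
		return "", false // likely a pre-existing system process
	}
	name, ok := violationErrnos[ret]
	return name, ok
}
```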

Requirements: CAP_BPF or root, plus bpftrace installed. Graceful fallback if unavailable.

Transparent Network Proxy (tun2socks + bridges)

Network traffic routed through SOCKS5 proxy via TUN device for allowlist-based filtering.

Sandboxed process → TUN device (198.18.0.0/15) → tun2socks → socat → Unix socket → host socat → external SOCKS5 proxy

Bridges (all use socat + Unix sockets to cross namespace boundary):

| Bridge | Direction | Purpose |
| --- | --- | --- |
| ProxyBridge | Sandbox → Host | SOCKS5 proxy access |
| DnsBridge | Sandbox → Host | DNS resolution |
| ReverseBridge | Host → Sandbox | Inbound connections to sandbox services |

Environment Sanitization

| Platform | Stripped variables | Risk |
| --- | --- | --- |
| Linux | LD_PRELOAD, LD_LIBRARY_PATH, LD_AUDIT, LD_DEBUG, all LD_* | Shared library injection |
| macOS | DYLD_INSERT_LIBRARIES, DYLD_LIBRARY_PATH, DYLD_FRAMEWORK_PATH, all DYLD_* | Dylib injection |
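
Since both rows reduce to a prefix match, the stripping step can be sketched as one function. This is a hypothetical Go helper, not greywall's implementation; the platform is passed in as a string (for example the value of `runtime.GOOS`) so the behaviour is easy to exercise on either OS.

```go
package main

import "strings"

// sanitizeEnv returns a copy of env without dynamic-loader variables:
// everything starting with DYLD_ on macOS, LD_ elsewhere.
func sanitizeEnv(env []string, goos string) []string {
	prefix := "LD_"
	if goos == "darwin" {
		prefix = "DYLD_"
	}
	out := make([]string, 0, len(env))
	for _, kv := range env {
		if strings.HasPrefix(kv, prefix) {
			continue // drop injection-capable variable
		}
		out = append(out, kv)
	}
	return out
}
```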

macOS Components

Seatbelt / sandbox-exec — Profile-Based Sandbox

Uses Apple's built-in sandbox-exec command with SBPL (Sandbox Profile Language) profiles. Enforced by MACF kernel hooks.

Profile structure:

  1. (deny default (with message "logTag")) — block everything by default
  2. Essential process permissions (process-exec, process-fork, signal)
  3. Mach IPC allowlist (~20 essential system services)
  4. IOKit access (GPU memory, power management)
  5. sysctl reads (50+ hardware/kernel parameters)
  6. Filesystem read rules (system paths, CWD, user tooling)
  7. Filesystem write rules (CWD, tmpdir, default write paths)
  8. Mandatory deny rules (.env, .gitconfig, .bashrc, .git/hooks)
  9. Network rules (proxy host:port or localhost binding)
  10. PTY support (optional)

Network control modes:

| Mode | Rules | Use case |
| --- | --- | --- |
| Unrestricted | (allow network*) | Explicitly allowed |
| Full block | No network rules | Default (no proxy) |
| Local binding | (allow network-bind (local ip "localhost:*")) | Exposed ports |
| Proxy-based | (allow network-outbound (remote ip "host:port")) | External proxy access |

Shared Components (Both Platforms)

Command Blocking Engine

| Category | Commands |
| --- | --- |
| System control | shutdown, reboot, halt, poweroff, init 0/6, systemctl poweroff/reboot/halt |
| Kernel manipulation | insmod, rmmod, modprobe, kexec |
| Disk manipulation | mkfs.*, fdisk, parted, dd if= |
| Container escape | docker run -v /:/, docker run --privileged |
| Namespace escape | chroot, unshare, nsenter |

Shell parsing splits on |, ||, &&, ;. Nested invocations (bash -c 'git push') are expanded.
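
A deliberately naive Go sketch of the splitting step, to make the operator handling concrete. It is not the shared engine: the names are hypothetical, quoting and escapes are ignored, and nested invocations such as `bash -c '...'` are not expanded.

```go
package main

import "strings"

// splitCommands splits a command line on |, ||, && and ; and returns the
// trimmed, non-empty segments. Quotes are NOT honoured in this sketch.
func splitCommands(s string) []string {
	var parts []string
	start := 0
	flush := func(end int) {
		if seg := strings.TrimSpace(s[start:end]); seg != "" {
			parts = append(parts, seg)
		}
	}
	for i := 0; i < len(s); i++ {
		switch {
		case strings.HasPrefix(s[i:], "||"), strings.HasPrefix(s[i:], "&&"):
			flush(i)
			i++ // skip the second operator character
			start = i + 1
		case s[i] == '|' || s[i] == ';':
			flush(i)
			start = i + 1
		}
	}
	flush(len(s))
	return parts
}

// isDenied reports whether any segment equals a deny entry or starts with
// it followed by a space (so "git push" matches "git push origin main").
func isDenied(cmdline string, deny []string) bool {
	for _, seg := range splitCommands(cmdline) {
		for _, d := range deny {
			if seg == d || strings.HasPrefix(seg, d+" ") {
				return true
			}
		}
	}
	return false
}
```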

SSH Policy

Dedicated rules: allowed hosts (wildcards), denied hosts, allowed/denied remote commands, optional inheritance of global deny rules.

Dangerous File/Directory Protection

| Category | Items |
| --- | --- |
| Dangerous files | .gitconfig, .gitmodules, .bashrc, .bash_profile, .zshrc, .zprofile, .profile, .ripgreprc, .mcp.json |
| Dangerous directories | .vscode, .idea, .claude/commands, .claude/agents |
| Sensitive project files | .env, .env.local, .env.development, .env.production, .env.staging, .env.test |

Learning Mode (Linux-only, macOS planned)

Traces filesystem access patterns and generates configuration templates.

greywall --learning -- <command>
    → Relaxed sandbox (bwrap, no seccomp/Landlock)
    → strace traces file-access syscalls
    → Log parsed → paths extracted → collapsed → filtered
    → JSON template generated → saved to ~/.config/greywall/learned/
    → Auto-loaded on next run of same command

Why seccomp and Landlock are disabled in learning mode: strace uses ptrace(2) to trace syscalls. Seccomp blocks ptrace → strace can't attach. Since the goal is observability (not security), all enforcement layers except basic bwrap are disabled.


Part 6: Configuration Reference

{
  "extends": "base-config.json",
  "network": {
    "proxyUrl": "socks5://host:1080",
    "dnsAddr": "localhost:3153",
    "allowUnixSockets": ["/path/to.sock"],
    "allowAllUnixSockets": false,
    "allowLocalBinding": false,
    "allowLocalOutbound": null
  },
  "filesystem": {
    "defaultDenyRead": true,
    "allowRead": ["~/extra-data"],
    "denyRead": ["~/.ssh/id_*"],
    "allowWrite": ["."],
    "denyWrite": [],
    "allowGitConfig": false
  },
  "command": {
    "deny": ["git push", "npm publish"],
    "allow": ["git status"],
    "useDefaults": true
  },
  "ssh": {
    "allowedHosts": ["github.com"],
    "deniedHosts": [],
    "allowedCommands": ["git-upload-pack"],
    "deniedCommands": [],
    "allowAllCommands": false,
    "inheritDeny": false
  },
  "allowPty": false
}
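
For orientation, part of this configuration could be decoded with a Go struct like the one below. This is a sketch under assumptions: the type and field names are hypothetical and cover only a subset of the keys, `allowLocalOutbound` is modelled as a pointer so that `null` stays distinguishable from `false`, and plain `encoding/json` is used, so comments in a JSONC file would have to be stripped first.

```go
package main

import "encoding/json"

// Config mirrors a subset of the keys shown above (hypothetical type).
type Config struct {
	Extends string `json:"extends"`
	Network struct {
		ProxyURL           string   `json:"proxyUrl"`
		DNSAddr            string   `json:"dnsAddr"`
		AllowUnixSockets   []string `json:"allowUnixSockets"`
		AllowLocalOutbound *bool    `json:"allowLocalOutbound"` // nil when null
	} `json:"network"`
	Filesystem struct {
		DefaultDenyRead bool     `json:"defaultDenyRead"`
		AllowRead       []string `json:"allowRead"`
		AllowWrite      []string `json:"allowWrite"`
	} `json:"filesystem"`
	Command struct {
		Deny        []string `json:"deny"`
		Allow       []string `json:"allow"`
		UseDefaults bool     `json:"useDefaults"`
	} `json:"command"`
	AllowPty bool `json:"allowPty"`
}

// parseConfig decodes comment-free JSON into a Config.
func parseConfig(data []byte) (*Config, error) {
	var c Config
	if err := json.Unmarshal(data, &c); err != nil {
		return nil, err
	}
	return &c, nil
}
```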

Runtime Dependencies

| Dependency | Platform | Required | Purpose |
| --- | --- | --- | --- |
| bubblewrap (bwrap) | Linux | Yes | Namespace isolation |
| socat | Linux | Yes (if proxy/DNS) | Unix socket bridging |
| tun2socks | Linux | Embedded | Transparent network proxy |
| ip (iproute2) | Linux | Yes (if TUN) | TUN device setup |
| strace | Linux | Only for --learning | Filesystem access tracing |
| bpftrace | Linux | Optional | Violation monitoring |
| sandbox-exec | macOS | Yes (built-in) | Seatbelt sandbox |
| fs_usage | macOS | Only for --learning (requires sudo) | Filesystem access tracing (kdebug-based, not affected by SIP) |

Go dependencies (4): doublestar (glob), cobra (CLI), jsonc (config), golang.org/x/sys (syscalls).