How I Harden Docker Containers

Automated attacks against anything reachable from the internet are constant, so container hardening isn’t optional. Here’s the configuration I apply to every publicly exposed Docker container.

These settings assume docker-compose.yml files. The goal: minimize blast radius when (not if) something gets compromised.

Non-Root User

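Unless user namespace remapping is enabled, root inside a container is the same UID 0 as root on the host, so a container escape or an overly generous bind mount can hand an attacker host root. Force a non-privileged user: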

user: "1000:1000"

Match this to a real user/group on your host. On Unraid, 99:100 maps to nobody:users.
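
If the same Compose file runs on hosts with different service accounts, the IDs can come from the environment instead. A minimal sketch, assuming hypothetical PUID and PGID variables set in a .env file or the shell (the names are my choice here, not something Compose defines):

```yaml
services:
  myapp:
    image: myapp:1.0
    # PUID and PGID are hypothetical variable names. Compose substitutes
    # the value after ":-" when the variable is unset or empty.
    user: "${PUID:-1000}:${PGID:-1000}"
```

On the Unraid setup mentioned above, PUID=99 and PGID=100 would select nobody:users.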

Disable TTY and Stdin

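Interactive shells are useful for debugging, not production. Keep both off so a production container never has an interactive terminal attached. Compose defaults both to false, so setting them explicitly mainly documents the intent: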

tty: false
stdin_open: false

Read-Only Filesystem

If the container doesn’t need to write anywhere outside of explicitly mounted volumes, lock it down:

read_only: true

This breaks containers that write to unexpected locations. Start with it enabled, check the logs for write errors, and add writable tmpfs mounts only where genuinely needed.
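
As an illustration, suppose the logs show write failures under /run and /var/cache/myapp. Both paths are hypothetical, as are the size limits; the point is that each writable location gets its own small, restricted tmpfs instead of lifting read_only:

```yaml
services:
  myapp:
    image: myapp:1.0
    read_only: true
    tmpfs:
      # Writable scratch space only where the process actually needs it.
      # Paths and sizes are assumptions; take the real ones from your logs.
      - /run:rw,noexec,nosuid,nodev,size=8m
      - /var/cache/myapp:rw,noexec,nosuid,nodev,size=64m
```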

Block Privilege Escalation

Prevent processes inside the container from gaining new privileges after startup:

security_opt:
  - no-new-privileges:true

With this set, setuid and setgid binaries such as su or sudo still run but no longer grant elevated privileges, which closes off a common escalation path.

Drop All Capabilities

By default, Docker grants containers around 14 Linux capabilities. Most containers need zero of them. Drop everything:

cap_drop:
  - ALL

If the container fails, add back only what it specifically requires:

cap_drop:
  - ALL
cap_add:
  - NET_BIND_SERVICE  # Only if binding to ports below 1024

For containers that need network scanning or raw sockets (rare), you might need NET_RAW. Question why before adding it.
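
Some images start as root, fix ownership of their data directory, then switch to an unprivileged user on their own. Assuming a hypothetical image like that (run without a user: override), the add-back list might look like the sketch below. Treat the exact capabilities as something to confirm from the failure in the logs, not as a recipe:

```yaml
services:
  legacy-app:
    image: legacy-app:1.0   # hypothetical image whose entrypoint starts as root
    cap_drop:
      - ALL
    cap_add:
      - CHOWN    # entrypoint changes ownership of its data directory
      - SETUID   # entrypoint switches to an unprivileged user
      - SETGID   # and to that user's group
```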

Harden /tmp

The /tmp directory is the classic payload staging area. Mount it as tmpfs with execution disabled:

tmpfs:
  - /tmp:rw,noexec,nosuid,nodev,size=128m

A tmpfs lives in RAM (and swap), so the size limit prevents a malicious process from exhausting host memory through /tmp. noexec prevents downloaded payloads from being executed directly. nosuid blocks setuid binaries. nodev stops device files on the mount from being treated as devices.

Some applications (Plex, for example) need to execute files in /tmp for auto-updates. In those cases, change noexec to exec but keep the other restrictions.
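
For such a container the mount line changes in exactly one place. A sketch, with a placeholder image reference:

```yaml
services:
  plex:
    image: plex:example   # hypothetical image reference
    tmpfs:
      # exec instead of noexec; nosuid, nodev and the size cap stay in place
      - /tmp:rw,exec,nosuid,nodev,size=128m
```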

Resource Limits

Prevent containers from consuming all host resources:

pids_limit: 512
mem_limit: 3g
cpus: 3

Without pids_limit, a fork bomb inside the container can take down your host. Without mem_limit, a memory leak can eventually push the host’s OOM killer into terminating unrelated processes.

Set these based on actual container needs. Most services need far less than 3GB RAM and 3 CPUs.
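
The same limits can also be written under deploy.resources.limits, which the Compose specification defines. A sketch with smaller values for a lightweight service; the numbers are placeholders, and it is worth confirming with docker stats that your Compose version actually applies them:

```yaml
services:
  myapp:
    image: myapp:1.0
    deploy:
      resources:
        limits:
          cpus: "0.50"   # half a CPU core
          memory: 256M
          pids: 128
```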

Logging Limits

An attacker can fill your disk by generating massive log output. Cap it:

logging:
  driver: json-file
  options:
    max-size: "50m"
    max-file: "5"

This caps logs at roughly 250MB per container (5 files of 50MB each). Adjust based on your monitoring needs.
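
Docker also ships a local logging driver that accepts the same max-size and max-file options and compresses rotated files. A sketch, assuming nothing else on the host reads the json-file output directly:

```yaml
services:
  myapp:
    image: myapp:1.0
    logging:
      driver: local
      options:
        max-size: "10m"   # rotate after 10 MB
        max-file: "3"     # keep at most 3 files, about 30 MB before compression
```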

Read-Only Volume Mounts

If a container only reads data, mount it read-only:

volumes:
  - /mnt/tank/media:/media:ro
  - /mnt/tank/config:/config:rw  # Only config needs writing

A compromised Plex container shouldn’t be able to encrypt your media library.
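
The long volume syntax expresses the same thing more explicitly, which can be easier to review than a trailing :ro. A sketch reusing the paths above, with a placeholder image reference:

```yaml
services:
  plex:
    image: plex:example   # hypothetical image reference
    volumes:
      - type: bind
        source: /mnt/tank/media
        target: /media
        read_only: true      # same effect as the :ro suffix
      - type: bind
        source: /mnt/tank/config
        target: /config      # writable; only config needs writing
```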

Network Isolation

Run exposed containers in a separate network. If one gets compromised, it can’t reach your internal services:

networks:
  dmz:
    driver: bridge
  internal:
    driver: bridge
    internal: true

services:
  nginx:
    networks:
      - dmz
  database:
    networks:
      - internal

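The database is unreachable from the dmz network, and because the internal network sets internal: true, it has no route to the outside world either. The exposed nginx can only reach services you explicitly attach to dmz, typically by connecting them to both networks.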
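
A sketch of that bridging pattern, assuming a hypothetical api service that sits between nginx and the database (all image names are placeholders):

```yaml
networks:
  dmz:
    driver: bridge
  internal:
    driver: bridge
    internal: true   # no route to the outside world

services:
  nginx:
    image: nginx-example:1.0
    networks:
      - dmz
  api:
    image: my-api:1.0
    networks:
      - dmz        # reachable from nginx
      - internal   # can reach the database
  database:
    image: my-db:1.0
    networks:
      - internal   # never directly reachable from nginx
```

The trade-off is that a compromised api container can reach both sides, so it deserves the same hardening as the exposed service.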

Complete Example

Here’s a hardened container configuration:

services:
  myapp:
    image: myapp:1.0  # Pin versions, avoid :latest
    user: "1000:1000"
    read_only: true
    tty: false
    stdin_open: false
    security_opt:
      - no-new-privileges:true
    cap_drop:
      - ALL
    tmpfs:
      - /tmp:rw,noexec,nosuid,nodev,size=128m
    pids_limit: 256
    mem_limit: 512m
    cpus: 1
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
    volumes:
      - ./config:/config:rw
      - /data/media:/media:ro
    networks:
      - dmz

networks:
  dmz:
    driver: bridge

Reusing Configuration

YAML anchors prevent duplication across services:

x-hardened: &hardened
  restart: unless-stopped
  read_only: true
  tty: false
  stdin_open: false
  security_opt:
    - no-new-privileges:true
  cap_drop:
    - ALL
  pids_limit: 256
  logging:
    driver: json-file
    options:
      max-size: "10m"
      max-file: "3"

services:
  api:
    <<: *hardened
    image: my-api:1.0
    user: "1000:1000"
    # service-specific config...

  worker:
    <<: *hardened
    image: my-worker:1.0
    user: "1000:1000"
    # service-specific config...
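
One thing to watch: the << merge key is shallow. A key redefined in a service replaces the anchored value wholesale instead of merging into it. A sketch with a trimmed-down anchor and a hypothetical scanner service:

```yaml
x-hardened: &hardened
  read_only: true
  cap_drop:
    - ALL
  pids_limit: 256
  logging:
    driver: json-file
    options:
      max-size: "10m"
      max-file: "3"

services:
  scanner:
    <<: *hardened
    image: my-scanner:1.0   # hypothetical image
    user: "1000:1000"
    pids_limit: 64          # scalar override: replaces 256
    cap_add:
      - NET_RAW             # new key: cap_drop ALL from the anchor still applies
    logging:                # mapping override: replaces the whole anchored block,
      driver: json-file     # so driver and max-file must be repeated here
      options:
        max-size: "50m"
        max-file: "3"
```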

For larger setups, the include: directive (Docker Compose 2.20 or later) splits configurations across files:

include:
  - path: services/nginx/docker-compose.yml
  - path: services/api/docker-compose.yml

When Things Break

Some containers won’t start with these restrictions. My approach:

1. Apply all restrictions
2. Check docker logs <container>
3. Remove one restriction at a time until it works
4. Document why that specific container needs the exception

Common issues:

read_only fails: Add specific writable tmpfs mounts
cap_drop breaks networking: Add back NET_BIND_SERVICE or NET_RAW
noexec /tmp breaks updates: Switch to exec for that container
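
For the documentation step, comments next to the relaxed setting tend to survive longer than a separate notes file. A sketch for a hypothetical DNS service; the service, image, and reasons are made up for illustration:

```yaml
services:
  dns:
    image: my-dns:1.0   # hypothetical image
    read_only: true
    security_opt:
      - no-new-privileges:true
    cap_drop:
      - ALL
    cap_add:
      # EXCEPTION: binds to port 53; fails with "permission denied" without it.
      - NET_BIND_SERVICE
    tmpfs:
      # EXCEPTION: writes its PID file here; found via write errors in the logs.
      - /run:rw,noexec,nosuid,nodev,size=8m
```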

Limitations

These settings reduce blast radius. They don’t eliminate risk. A compromised container can still:

Exfiltrate data it has read access to
Attack other containers on the same network
Attempt to exploit kernel vulnerabilities

For higher-risk services, consider running them in VMs or on dedicated hardware.

This works for my self-hosted setup. Adjust the resource limits and network topology for yours.