InfraRunBook

    Docker Disk Space Usage Growing

    Docker
    Published: Apr 18, 2026
    Updated: Apr 18, 2026

    Docker won't clean up after itself — this runbook walks through every major cause of runaway disk usage on Docker hosts, with real CLI commands to diagnose and fix each one.


    Symptoms

    You log in to sw-infrarunbook-01 and 'df -h' shows /var/lib/docker sitting at 94% utilization. Maybe a 'docker pull' just failed mid-layer with 'write /var/lib/docker/overlay2/...: no space left on device'. Containers are refusing to start. A build pipeline that was green yesterday is now dying. Your monitoring alert fired at 3 AM and you need to recover fast.

    Docker disk usage is one of those problems that creeps up slowly and then hits you all at once. The daemon doesn't aggressively reclaim space on its own — it's designed to cache for speed and leave cleanup to the operator. If you haven't built cleanup into your workflow, the disk will fill. Every time.

    Common things you'll see when this happens:

    • 'docker pull' fails with 'write /var/lib/docker/overlay2/...: no space left on device'
    • Container fails to start with 'Error response from daemon: mkdir ...: no space left on device'
    • Builds die mid-layer during a RUN step with no obvious error in the Dockerfile
    • 'df -h' shows / or a dedicated Docker partition at 90% or higher
    • 'du -sh /var/lib/docker/*' shows several gigabytes spread across overlay2, containers, and volumes

    Before chasing individual causes, start with Docker's own accounting command. It gives you a breakdown by category and shows you exactly where to focus:

    infrarunbook-admin@sw-infrarunbook-01:~$ docker system df
    TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
    Images          47        12        18.4GB    14.1GB (76%)
    Containers      23        6         1.2GB     980MB (81%)
    Local Volumes   31        8         9.7GB     6.3GB (64%)
    Build Cache     186       0         3.2GB     3.2GB

    That output tells a story. More than 32 GB is sitting on disk right now, and the vast majority of it is reclaimable. Let's go through each root cause systematically.


    Root Cause 1: Unused Images Not Cleaned Up

    This is the most common offender on any long-running Docker host. Every time you pull a new version of an image, run a build, or have a CI pipeline push updated tags, the old image layers stay on disk. Docker doesn't delete them automatically. The old layers are kept because Docker doesn't know whether another image or container might still reference them — but in practice, on a busy host, most of those layers are completely orphaned.

    There are two categories to care about. Dangling images are untagged intermediate images — the kind produced when you rebuild an image and the old one loses its tag. Unreferenced images are fully tagged images that no running or stopped container is actually using. In my experience, a host that's been running a daily build pipeline for a few months can accumulate 30–50 image versions. That's easily 15–25 GB of layer data sitting idle.

    How to Identify

    infrarunbook-admin@sw-infrarunbook-01:~$ docker images
    REPOSITORY                                  TAG     IMAGE ID       CREATED        SIZE
    registry.solvethenetwork.com/app/api        v1.42   a3b4c5d6e7f8   2 hours ago    1.1GB
    registry.solvethenetwork.com/app/api        v1.41   9f8e7d6c5b4a   2 days ago     1.1GB
    registry.solvethenetwork.com/app/api        v1.40   1a2b3c4d5e6f   5 days ago     1.0GB
    <none>                                      <none>  deadbeef1234   6 days ago     980MB
    nginx                                       1.25    abcdef123456   1 week ago     192MB
    nginx                                       1.24    fedcba654321   3 weeks ago    190MB
    
    infrarunbook-admin@sw-infrarunbook-01:~$ docker images -f dangling=true
    REPOSITORY   TAG       IMAGE ID       CREATED       SIZE
    <none>       <none>    deadbeef1234   6 days ago    980MB

    How to Fix

    To remove only dangling images:

    infrarunbook-admin@sw-infrarunbook-01:~$ docker image prune
    WARNING! This will remove all dangling images.
    Are you sure you want to continue? [y/N] y
    Deleted Images:
    deleted: sha256:deadbeef1234...
    
    Total reclaimed space: 980MB

    The more useful option — removing all images not referenced by any container, running or stopped:

    infrarunbook-admin@sw-infrarunbook-01:~$ docker image prune -a
    WARNING! This will remove all images without at least one container associated to them.
    Are you sure you want to continue? [y/N] y
    Deleted Images:
    untagged: registry.solvethenetwork.com/app/api:v1.40
    deleted: sha256:1a2b3c4d5e6f...
    untagged: nginx:1.24
    deleted: sha256:fedcba654321...
    
    Total reclaimed space: 14.1GB

    If you need to be selective — for example, keeping images from the last 24 hours — you can filter by age: 'docker image prune -a --filter until=24h'.
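    For a retention policy tighter than an age filter, for example keeping only the newest few tags of one repository, a small helper works. This is a sketch: 'keep_newest' is a name invented here, and it relies on 'docker images' listing the newest images first.

```shell
# keep_newest: read image references (newest first, one per line) on
# stdin and print everything past the first KEEP entries, i.e. the
# deletion candidates.
keep_newest() {
  tail -n +"$(( $1 + 1 ))"
}

# Hypothetical wiring: keep the three newest tags of one repo, remove the rest.
#   docker images registry.solvethenetwork.com/app/api \
#     --format '{{.Repository}}:{{.Tag}}' | keep_newest 3 | xargs -r docker rmi
```

    Run the pipeline once without the 'xargs' stage first, as a dry run, so you can eyeball what would be removed.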


    Root Cause 2: Orphaned Volumes

    Docker volumes are intentionally persistent. When a container is removed, its volume is not — that's by design, so you don't lose your database data just because a container restarted. But this means every 'docker rm' without the '-v' flag leaves its anonymous volumes behind, and named volumes survive regardless. Over time, especially on hosts running docker-compose stacks that get torn down and rebuilt regularly, orphaned volumes accumulate quietly.

    I've seen this happen constantly with teams that run short-lived compose stacks for testing or staging. They do 'docker-compose up -d', test something, then 'docker-compose down' — not realizing that 'down' does not remove named volumes by default. Do that a hundred times over a few months and you have a hundred orphaned volumes, some of which might be holding gigabytes of database files that will never be read again.

    How to Identify

    infrarunbook-admin@sw-infrarunbook-01:~$ docker volume ls
    DRIVER    VOLUME NAME
    local     postgres_data_20240301
    local     postgres_data_20240315
    local     postgres_data_20240401
    local     redis_cache_old
    local     app_uploads_backup
    local     7f3a1b2c4d5e6f7a8b9c0d1e2f3a4b5c
    
    infrarunbook-admin@sw-infrarunbook-01:~$ docker volume ls -f dangling=true
    DRIVER    VOLUME NAME
    local     postgres_data_20240301
    local     postgres_data_20240315
    local     redis_cache_old
    local     7f3a1b2c4d5e6f7a8b9c0d1e2f3a4b5c
    
    infrarunbook-admin@sw-infrarunbook-01:~$ du -sh /var/lib/docker/volumes/*
    4.1G    /var/lib/docker/volumes/postgres_data_20240301
    4.0G    /var/lib/docker/volumes/postgres_data_20240315
    210M    /var/lib/docker/volumes/redis_cache_old
    12K     /var/lib/docker/volumes/7f3a1b2c4d5e6f7a8b9c0d1e2f3a4b5c
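    Before touching anything, it helps to confirm that no container, running or stopped, still references a volume. 'docker ps' supports a 'volume' filter for exactly this; the 'volume_in_use' helper below is a sketch with an invented name.

```shell
# volume_in_use: succeed when any container (running or stopped) still
# mounts the named volume.
volume_in_use() {
  [ -n "$(docker ps -aq --filter "volume=$1")" ]
}

# Peek inside a candidate volume read-only before deleting anything
# (assumes an alpine image is available):
#   docker run --rm -v postgres_data_20240301:/data:ro alpine ls -lah /data
```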

    How to Fix

    Before pruning volumes, verify the dangling ones are genuinely unused. Check what containers were using them and whether that data has been backed up or migrated. Named volumes that look like 'postgres_data_20240301' should be treated carefully — old doesn't mean unneeded. Once you're certain:

    infrarunbook-admin@sw-infrarunbook-01:~$ docker volume prune
    WARNING! This will remove anonymous local volumes not used by at least one container.
    Are you sure you want to continue? [y/N] y
    Deleted Volumes:
    7f3a1b2c4d5e6f7a8b9c0d1e2f3a4b5c
    
    Total reclaimed space: 12.29kB

    Note that on Docker 23.0 and later, 'docker volume prune' removes only anonymous volumes. Pass '--all' to include unused named volumes, or remove them explicitly once verified.

    To remove specific named volumes you've verified as safe:

    infrarunbook-admin@sw-infrarunbook-01:~$ docker volume rm postgres_data_20240301 postgres_data_20240315
    postgres_data_20240301
    postgres_data_20240315

    Going forward, use 'docker-compose down -v' whenever you want volumes cleaned up along with the stack. Build that habit into your teardown scripts from the start.


    Root Cause 3: Build Cache Not Pruned

    The Docker build cache is where intermediate image layers live during and after a build. Docker keeps these around so subsequent builds can reuse layers that haven't changed — this is what makes rebuilds fast when only your application code changes. The cache isn't free, though. On an active build host, it can quietly grow to 20–40 GB because it doesn't show up prominently in 'docker images'.

    BuildKit, which has been the default builder since Docker Engine 23.0, maintains its own separate cache on top of the classic layer cache. If you're running BuildKit builds — which is likely if you're on a modern Docker version — you need to account for both when investigating disk usage.

    How to Identify

    infrarunbook-admin@sw-infrarunbook-01:~$ docker system df
    TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
    Images          47        12        18.4GB    14.1GB (76%)
    Containers      23        6         1.2GB     980MB (81%)
    Local Volumes   31        8         9.7GB     6.3GB (64%)
    Build Cache     186       0         3.2GB     3.2GB
    
    infrarunbook-admin@sw-infrarunbook-01:~$ docker buildx du
    ID                                   RECLAIMABLE   SIZE        LAST ACCESSED
    s3b4hk9f5qlxf2a1c7d8e9               true          1.2GB       2 hours ago
    l1m2n3o4p5q6r7s8t9u0v1               true          890MB       6 hours ago
    w2x3y4z5a6b7c8d9e0f1g2               true          780MB       1 day ago
    ...
    Total: 3.2GB

    How to Fix

    To prune only dangling build cache — layers with no references to current images:

    infrarunbook-admin@sw-infrarunbook-01:~$ docker builder prune
    WARNING! This will remove all dangling build cache.
    Are you sure you want to continue? [y/N] y
    Deleted build cache objects:
    s3b4hk9f5qlxf2a1c7d8e9
    l1m2n3o4p5q6r7s8t9u0v1
    
    Total reclaimed space: 2.09GB

    To wipe the entire build cache — including layers that could theoretically speed up future builds:

    infrarunbook-admin@sw-infrarunbook-01:~$ docker builder prune --all
    WARNING! This will remove all build cache.
    Are you sure you want to continue? [y/N] y
    
    Total reclaimed space: 3.2GB

    The tradeoff is that your next build will be slower — all layers pull fresh. On a CI host that rebuilds from scratch on every run anyway, this is no loss at all. On a developer workstation where you rebuild frequently, be more selective with '--filter until=24h' to preserve recent cache entries.
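    The same filter works in a scheduled job. A sketch, with an invented function name; '--force' and the 'until' filter are standard 'docker builder prune' options:

```shell
# prune_old_build_cache: drop build-cache entries that have not been
# used for DAYS days, keeping recent layers so rebuilds stay fast.
prune_old_build_cache() {
  docker builder prune --force --filter "until=$(( $1 * 24 ))h"
}

# e.g. prune_old_build_cache 3   # anything unused for 3+ days
```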


    Root Cause 4: Container Logs Not Rotated

    By default, Docker's 'json-file' log driver writes container stdout and stderr to /var/lib/docker/containers/<container-id>/<container-id>-json.log with no size limit and no rotation. A container that logs aggressively — a web server recording every HTTP request, a service with a runaway debug logger, or an application caught in an error loop — will write indefinitely until the disk is full.

    I've seen this take down production hosts. A single Java service that had its log level accidentally set to DEBUG wrote 40 GB in under 18 hours. The container appeared perfectly healthy from a process standpoint. The disk did not.

    How to Identify

    infrarunbook-admin@sw-infrarunbook-01:~$ du -sh /var/lib/docker/containers/*/*-json.log | sort -rh | head -10
    38G     /var/lib/docker/containers/a1b2c3d4e5f6.../a1b2c3d4e5f6...-json.log
    2.1G    /var/lib/docker/containers/7f8e9d0c1b2a.../7f8e9d0c1b2a...-json.log
    450M    /var/lib/docker/containers/3c4d5e6f7a8b.../3c4d5e6f7a8b...-json.log
    
    infrarunbook-admin@sw-infrarunbook-01:~$ docker ps --format "{{.ID}} {{.Names}}"
    a1b2c3d4e5f6 api-service
    7f8e9d0c1b2a nginx-proxy
    3c4d5e6f7a8b worker

    You can also look up the log path for a specific container directly:

    infrarunbook-admin@sw-infrarunbook-01:~$ docker inspect --format='{{.LogPath}}' api-service
    /var/lib/docker/containers/a1b2c3d4e5f6.../a1b2c3d4e5f6...-json.log
    
    infrarunbook-admin@sw-infrarunbook-01:~$ ls -lh /var/lib/docker/containers/a1b2c3d4e5f6.../
    total 38G
    -rw-r----- 1 root root 38G Apr 18 03:22 a1b2c3d4e5f6...-json.log

    How to Fix

    For immediate disk recovery, truncate the log file without restarting the container. Don't delete it — the container process holds the file descriptor open and the space won't be freed until the process releases the handle:

    infrarunbook-admin@sw-infrarunbook-01:~$ truncate -s 0 /var/lib/docker/containers/a1b2c3d4e5f6.../a1b2c3d4e5f6...-json.log
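    When several containers are noisy at once, the same emergency truncation can be swept across the whole containers directory. A sketch; 'truncate_big_logs' is a name invented here, and the size argument uses find(1) syntax:

```shell
# truncate_big_logs: zero out any container json log under DIR larger
# than SIZE (a find(1) size spec such as 1G). Emergency relief only;
# log rotation in daemon.json is the real fix.
truncate_big_logs() {
  find "$1" -name '*-json.log' -size +"$2" -print -exec truncate -s 0 {} \;
}

# e.g. truncate_big_logs /var/lib/docker/containers 1G
```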

    Then fix the root cause. Configure log rotation globally in /etc/docker/daemon.json:

    {
      "log-driver": "json-file",
      "log-opts": {
        "max-size": "100m",
        "max-file": "5"
      }
    }

    Restart the Docker daemon to apply. Unless 'live-restore' is enabled in daemon.json, this restarts all containers, so plan accordingly:

    infrarunbook-admin@sw-infrarunbook-01:~$ systemctl restart docker

    You can also set log options per-container in your compose file, which takes precedence over the daemon default and is useful when specific services need tighter or looser limits:

    services:
      api-service:
        image: registry.solvethenetwork.com/app/api:v1.42
        logging:
          driver: json-file
          options:
            max-size: "100m"
            max-file: "5"

    Existing containers don't inherit daemon.json changes retroactively. You need to recreate them for new log settings to apply — a rolling restart works fine if you're using Compose.


    Root Cause 5: Overlay Filesystem Fragmentation and Inode Exhaustion

    The 'overlay2' storage driver maintains a directory per image layer and per container writable layer under /var/lib/docker/overlay2/. On a host that has created and destroyed many containers over time, two distinct and often misdiagnosed problems emerge: filesystem fragmentation, where allocated blocks are scattered inefficiently, and inode exhaustion, where the filesystem runs out of directory entries even though block space appears available.

    The inode problem is the one that catches people off guard. You run 'docker pull' and it fails. You check 'df -h' and see 40% free space. The host looks fine. Then you check 'df -i':

    How to Identify

    infrarunbook-admin@sw-infrarunbook-01:~$ df -h /var/lib/docker
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda1        200G  120G   80G  60% /
    
    infrarunbook-admin@sw-infrarunbook-01:~$ df -i /var/lib/docker
    Filesystem      Inodes   IUsed    IFree  IUse% Mounted on
    /dev/sda1     12582912 12580344    2568   100% /
    
    infrarunbook-admin@sw-infrarunbook-01:~$ ls /var/lib/docker/overlay2 | wc -l
    18453

    There it is: 100% inode usage, over 18,000 overlay2 directories, and plenty of block space. Docker can't create any new directories — every new container or layer creation fails. The host is effectively out of disk from Docker's perspective even though 'df -h' looks fine.
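    The inode check is easy to script into the same monitoring that watches block usage. A sketch assuming GNU df (the '--output' flag); the function name is invented:

```shell
# inode_usage: print the inode-usage percentage for a mount point.
inode_usage() {
  df --output=ipcent "$1" | tail -n 1 | tr -dc '0-9'
}

# e.g. [ "$(inode_usage /var/lib/docker)" -ge 90 ] && echo "inode alert"
```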

    For the deleted-but-open-file problem — where disk usage appears high but you can't account for it with 'du':

    infrarunbook-admin@sw-infrarunbook-01:~$ lsof | grep deleted | grep docker
    dockerd  1234  root  45u  REG  8,1  4294967296  1234567 /var/lib/docker/containers/a1b2c3.../a1b2c3...-json.log (deleted)
    
    infrarunbook-admin@sw-infrarunbook-01:~$ df -h /var/lib/docker
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda1        200G  198G  2.0G  99% /

    The log file was deleted from the filesystem, but the container process still has it open. The kernel won't free those blocks until the file descriptor is closed — meaning until that container stops or restarts. This is a classic discrepancy between 'df' and 'du'.
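    To put a number on how much space deleted-but-open files are pinning, the lsof output can be summed. A sketch; it assumes the usual lsof column layout where SIZE/OFF is the seventh field, and the function name is invented:

```shell
# sum_deleted: read `lsof +L1` output on stdin (files with link count
# zero, i.e. deleted but still open) and total the bytes they pin.
sum_deleted() {
  awk 'NR > 1 && $7 ~ /^[0-9]+$/ { sum += $7 } END { print sum + 0 }'
}

# Usage: sudo lsof -nP +L1 | sum_deleted   # bytes held by deleted files
```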

    How to Fix

    For inode exhaustion, the primary fix is removing unused Docker objects to free directory entries. A full system prune is often the fastest path to recovery:

    infrarunbook-admin@sw-infrarunbook-01:~$ docker system prune -a --volumes
    WARNING! This will remove:
      - all stopped containers
      - all networks not used by at least one container
      - all volumes not used by at least one container
      - all images without at least one container associated to them
      - all build cache
    
    Are you sure you want to continue? [y/N] y
    
    Total reclaimed space: 31.4GB
    
    infrarunbook-admin@sw-infrarunbook-01:~$ df -i /var/lib/docker
    Filesystem      Inodes   IUsed    IFree  IUse% Mounted on
    /dev/sda1     12582912  421083 12161829    4% /

    For the deleted-open-file scenario, restart the offending container to release its file descriptors:

    infrarunbook-admin@sw-infrarunbook-01:~$ docker restart api-service

    If you're persistently hitting inode limits, consider moving /var/lib/docker to a dedicated filesystem formatted with a higher inode density. With 'mkfs.ext4', the '-i' flag controls bytes-per-inode — a smaller value like '-i 4096' gives you more inodes for the same block count, at the cost of slightly less usable space.


    Root Cause 6: Exited Containers Accumulating

    Every stopped container retains its writable layer on disk until it's explicitly removed. This writable layer exists even if the container wrote nothing at all during its lifetime — it's allocated at container creation and holds any filesystem changes the container made. On a host running many short-lived jobs, cron containers, one-off migrations, or test runners, these writable layers pile up fast and silently.

    How to Identify

    infrarunbook-admin@sw-infrarunbook-01:~$ docker ps -a --filter status=exited
    CONTAINER ID   IMAGE                                           COMMAND          CREATED       STATUS
    b1c2d3e4f5a6   registry.solvethenetwork.com/app/migrate:v12   "./migrate up"   2 days ago    Exited (0) 2 days ago
    c2d3e4f5a6b7   registry.solvethenetwork.com/app/migrate:v11   "./migrate up"   4 days ago    Exited (0) 4 days ago
    ...
    
    infrarunbook-admin@sw-infrarunbook-01:~$ docker ps -a --filter status=exited | wc -l
    89

    How to Fix

    infrarunbook-admin@sw-infrarunbook-01:~$ docker container prune
    WARNING! This will remove all stopped containers.
    Are you sure you want to continue? [y/N] y
    Deleted Containers:
    b1c2d3e4f5a6...
    c2d3e4f5a6b7...
    ...
    
    Total reclaimed space: 1.2GB

    For any container you run as a one-off job, pass '--rm' to 'docker run' so the container is removed automatically when it exits. This single habit eliminates the accumulation entirely:

    infrarunbook-admin@sw-infrarunbook-01:~$ docker run --rm registry.solvethenetwork.com/app/migrate:v12 ./migrate up

    Prevention

    Set daemon-level log rotation before you run a single container in production. Put this in /etc/docker/daemon.json on every Docker host during provisioning. A 100 MB limit with five rotated files gives you 500 MB of log retention per container — more than enough for debugging, not enough to kill a disk:

    {
      "log-driver": "json-file",
      "log-opts": {
        "max-size": "100m",
        "max-file": "5"
      }
    }

    Schedule a nightly prune job. A systemd timer or cron entry that runs 'docker system prune -f' at 3 AM keeps things from accumulating. The '-f' skips the confirmation prompt for automated execution. If your workload allows removing unused images too, run the more aggressive '-af' variant. Be careful with '--volumes' in an unattended job, though: it deletes any volume no container is currently using, data and all:

    # /etc/cron.d/docker-cleanup
    0 3 * * * root docker system prune -af --volumes >> /var/log/docker-prune.log 2>&1
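    If the host standardizes on systemd timers instead of cron, an equivalent pair of units looks like this. A sketch; the unit names and paths are assumptions:

```ini
# /etc/systemd/system/docker-prune.service
[Unit]
Description=Prune unused Docker data

[Service]
Type=oneshot
ExecStart=/usr/bin/docker system prune -af

# /etc/systemd/system/docker-prune.timer
[Unit]
Description=Run docker-prune nightly at 03:00

[Timer]
OnCalendar=*-*-* 03:00:00
Persistent=true

[Install]
WantedBy=timers.target
```

    Enable it with 'systemctl enable --now docker-prune.timer'. 'Persistent=true' runs a missed prune at next boot if the host was down at 3 AM.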

    Always use '--rm' for one-off containers. Any container running a job and exiting — migrations, backups, test runners, data imports — should be launched with 'docker run --rm'. Make it a team standard. It costs nothing and eliminates an entire category of disk accumulation with zero operational overhead.

    Tag cleanup in your CI/CD pipeline. After pushing a new image and deploying it, have your pipeline explicitly prune old images on the build host. Don't rely on manual cleanup. A post-deploy 'docker image prune -af' step takes a few seconds and keeps the build host lean indefinitely.

    Use 'docker-compose down -v' for ephemeral stacks. When tearing down compose stacks that you don't intend to reuse — test environments, staging stacks spun up for a review — always include '-v'. Build it into your teardown scripts from day one and it's never a problem.

    Put Docker on its own partition. If you're provisioning new hosts, move /var/lib/docker to a dedicated LVM volume or block device. When Docker fills its disk, the host OS, SSH daemon, and system logs are all unaffected. Recovery becomes a Docker problem, not a full host recovery problem. You can also resize that volume independently without touching the root filesystem — a much lower-stakes operation at 3 AM.

    Monitor with alert thresholds. Set alerts at 70% and 85% disk utilization on the Docker host's filesystem. 70% is your early warning — plenty of time to schedule a prune during business hours. 85% is your page-me-now threshold. Don't wait for 100%, because by then you're already in recovery mode.
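    Those thresholds are simple to enforce from cron or a monitoring exec check. A sketch assuming GNU df; 'disk_alert' is a name invented here:

```shell
# disk_alert: exit non-zero when MOUNT's block usage is at or above
# THRESHOLD percent, so it can gate a page or a cleanup job.
disk_alert() {
  local used
  used=$(df --output=pcent "$1" | tail -n 1 | tr -dc '0-9')
  [ "$used" -lt "$2" ]
}

# e.g. disk_alert /var/lib/docker 85 || echo "page: docker partition above 85%"
```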

    Docker disk usage growing is a solved problem. It just requires intentional habits and a bit of automation. The daemon won't clean up after itself, so the operator has to. Build the prune job, set the log rotation, use '--rm', and this stops being an incident.

    Frequently Asked Questions

    What is the fastest way to recover disk space from Docker right now?

    Run 'docker system prune -af --volumes' to remove all stopped containers, unused images, build cache, and unused anonymous volumes in one command (on Docker 23.0 and later, named volumes are left alone). This is the nuclear option — it will reclaim the most space immediately but you'll lose cached image layers, meaning your next build or pull will be slower.

    Why does 'df' show high disk usage but 'du /var/lib/docker' shows much less?

    This is almost always caused by deleted files that are still held open by a running process. Docker log files that were deleted manually while a container is still running are a common cause. The kernel doesn't free the blocks until the file descriptor is closed. Find them with 'lsof | grep deleted | grep docker' and restart the offending containers to release the handles.

    How do I prevent Docker from filling the disk on a CI build server?

    Set up a nightly cron job running 'docker system prune -af' to remove dangling images and stopped containers. Add a post-deploy step in your pipeline that runs 'docker image prune -af' after pushing new images. For one-off build containers, always use 'docker run --rm' so writable layers are cleaned up on exit.

    My Docker host has plenty of block space but containers still fail to create. What's wrong?

    You've likely exhausted the filesystem's inode table. The overlay2 storage driver creates thousands of directories — one per image layer and container — which burns through inodes even when block space is plentiful. Check with 'df -i /var/lib/docker'. If IUse% is at or near 100%, run 'docker system prune -a --volumes' to remove unused layers and free directory entries.

    Does configuring log rotation in daemon.json apply to existing containers?

    No. Log rotation settings in daemon.json only apply to containers created after the daemon is restarted with the new configuration. Existing containers continue using whatever log settings were in place when they were created. You need to recreate those containers — for example with 'docker-compose up --force-recreate' — for the new settings to take effect.
