InfraRunBook

    Docker Build Failing in CI Pipeline

    CI/CD
    Published: Apr 11, 2026
    Updated: Apr 11, 2026

    A senior engineer's troubleshooting guide for Docker build failures in CI pipelines, covering every major root cause from missing base images and registry auth failures to secrets injection, layer cache invalidation, and disk exhaustion.


    Symptoms

    Your CI pipeline kicks off, Docker starts building, and then it dies. Sometimes it dies fast with a cryptic error about a missing image. Sometimes it sits there for five minutes before timing out on a network pull. Sometimes it worked perfectly yesterday and now it doesn't, and nothing in the diff looks relevant. Whatever the failure mode, the build step is red and your team's deployment is blocked.

    Common symptoms that bring engineers to this page include:

    • The pipeline exits with a non-zero code during the docker build step
    • Error messages referencing manifest unknown, not found, or unauthorized
    • Builds that succeed locally but fail in CI every single time
    • Build times that have ballooned from 90 seconds to 12 minutes without explanation
    • The CI runner running out of disk mid-build
    • Secrets that are clearly set as CI environment variables but are completely invisible inside the build context

    These failures cluster around a handful of root causes. Let's walk through each one with the exact error output you'll see and the concrete steps to fix it.


    Root Cause 1: Base Image Not Found

    Why It Happens

    Your FROM line references an image that doesn't exist at the registry. The tag was deleted after a cleanup job, the image name has a typo, the image is private and the runner isn't authenticated, or you're pointing at an internal registry that the CI runner can't reach over the network. In my experience, the sneakiest version of this is when someone builds a custom base image on their workstation, pushes it to an internal registry, updates the FROM line, and then commits without documenting how to rebuild that base. Every other engineer on the team has the image cached locally and never notices, until CI picks up a fresh runner that has never seen it.

    How to Identify It

    The error is usually unambiguous. You'll see something like this in your pipeline log:

    Step 1/12 : FROM node:18-alpine-custom
    ERROR: failed to solve: node:18-alpine-custom: failed to resolve source metadata
      for docker.io/library/node:18-alpine-custom: docker.io/library/node:18-alpine-custom: not found

    For a private internal registry the message shifts slightly:

    Step 1/12 : FROM registry.solvethenetwork.com/internal/base-node:3.1
    ERROR: failed to solve: registry.solvethenetwork.com/internal/base-node:3.1:
      failed to authorize: failed to fetch oauth token: unexpected status: 401 Unauthorized

    Reproduce it locally by running a direct pull:

    docker pull registry.solvethenetwork.com/internal/base-node:3.1

    If that fails on your workstation too, the image genuinely doesn't exist at that tag. If it succeeds locally but fails in CI, you have a registry auth problem — skip ahead to Root Cause 5. Run this to check if a local cached copy has been silently powering your local builds all along:

    docker image ls | grep base-node

    How to Fix It

    First, confirm which tags are actually available in the registry. For an internal registry behind token auth:

    curl -u infrarunbook-admin:$REGISTRY_TOKEN \
      https://registry.solvethenetwork.com/v2/internal/base-node/tags/list

    If the tag doesn't exist, either rebuild and push it or update your FROM to a tag that does. For public base images, stop using mutable tags like latest or 18-alpine; those can silently change under you. Prefer digest pinning:

    FROM node:18-alpine@sha256:a1b2c3d4e5f67890abcdef1234567890abcdef1234567890abcdef1234567890

    Get the current digest for any image with:

    docker pull node:18-alpine && docker inspect node:18-alpine --format='{{index .RepoDigests 0}}'
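To keep unpinned tags from creeping back in, a pre-build lint can reject any FROM line that lacks a digest. A minimal sketch; the function name is made up for illustration, and it deliberately does not handle multi-stage references like FROM builder, so treat it as a starting point rather than a full Dockerfile parser:

```shell
#!/bin/sh
# Hypothetical CI lint: fail when a Dockerfile contains FROM lines that
# are not pinned to an @sha256 digest. Multi-stage references such as
# "FROM builder" are not special-cased here.
check_pinned() {
  dockerfile=$1
  # Count FROM lines lacking a digest ("|| true" keeps the count at 0
  # from failing the script when the shell runs with set -e).
  unpinned=$(grep -iE '^[[:space:]]*FROM[[:space:]]' "$dockerfile" \
    | grep -cv '@sha256:' || true)
  if [ "$unpinned" -gt 0 ]; then
    echo "FAIL: $unpinned unpinned FROM line(s) in $dockerfile"
    return 1
  fi
  echo "OK: all FROM lines in $dockerfile are digest-pinned"
}
```

Run it as a step before docker build; a non-zero exit fails the job before any context is streamed.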

    Root Cause 2: Build Context Too Large

    Why It Happens

    When you run docker build ., Docker streams the entire build context (everything in the current directory) to the daemon before executing a single FROM instruction. Without a .dockerignore file, that context includes node_modules (often 300MB+), the .git directory (which can be gigabytes on long-lived repos), build artifacts, test fixtures, log files, and whatever else has accumulated. In CI this creates two compounding problems: the transfer itself is slow and eats into job time limits, and some runner configurations cap the context size or have memory constraints that cause the daemon to OOM during the send.

    How to Identify It

    Look for this line at the very top of your build output, before any numbered steps:

    Sending build context to Docker daemon  1.247GB

    Anything over 100MB deserves scrutiny. A typical application's build context should be under 50MB. Find the biggest contributors before touching anything:

    du -sh * .[^.]* 2>/dev/null | sort -rh | head -20

    With BuildKit enabled, the plain-progress output reports the context transfer size as it streams. Note that this starts a real build; cancel it once you've seen the number:

    docker build --progress=plain . 2>&1 | grep 'transferring context'

    How to Fix It

    Create a .dockerignore file in the root of your build context. The syntax is identical to .gitignore. A solid baseline that covers most Node.js and Python projects:

    .git
    .gitignore
    .dockerignore
    node_modules
    dist
    build
    *.log
    *.md
    .env
    .env.*
    coverage
    .nyc_output
    __pycache__
    *.pyc
    .pytest_cache
    .vscode
    .idea
    tests/fixtures/large-dataset

    After adding this, the difference in context size is usually dramatic:

    # Before .dockerignore
    Sending build context to Docker daemon  1.247GB
    
    # After .dockerignore
    Sending build context to Docker daemon  4.821MB

    If your repository structure puts the Dockerfile somewhere other than the project root, or the directory you're building from pulls in files you can't exclude, pass the Dockerfile path and a narrower context directory separately. Docker reads .dockerignore from the root of the context you pass (./src below), not from the Dockerfile's directory; newer BuildKit releases also honor a Dockerfile-specific ignore file named Dockerfile.dockerignore alongside the Dockerfile:

    docker build -f ./docker/Dockerfile ./src
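Once the context is slimmed down, it's worth guarding against the regression. A rough sketch of a pre-build check; the function name and the 50MB budget are assumptions, and plain du does not honor .dockerignore, so this measures an upper bound rather than what Docker will actually send:

```shell
#!/bin/sh
# Hypothetical pre-build guard (not a Docker feature): abort the job
# when the build context directory exceeds a size budget, before any
# bytes are streamed to the daemon. du measures the whole tree and
# ignores .dockerignore, so treat the result as an upper bound.
ctx_guard() {
  dir=$1
  max_kb=$2
  ctx_kb=$(du -sk "$dir" | cut -f1)
  if [ "$ctx_kb" -gt "$max_kb" ]; then
    echo "FAIL: context is ${ctx_kb}KB, budget is ${max_kb}KB"
    return 1
  fi
  echo "OK: context is ${ctx_kb}KB (budget ${max_kb}KB)"
}

# Example: enforce a 50MB budget before building
# ctx_guard . 51200 && docker build .
```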

    Root Cause 3: Secret Not Available During Build

    Why It Happens

    Build-time secrets are genuinely tricky, and this is one of the most common sources of confusion I see from engineers who are new to CI/CD. You've set the CI variable (an NPM token, a private pip index password, a GitHub PAT for installing private packages) and it's absolutely there in the runner environment. But your RUN npm install step fails with a 401. The reason is that Docker build runs in an isolated environment. Environment variables from the CI runner are not automatically forwarded into the build. You have to explicitly declare and pass them, and if you get the method wrong, you either don't get the secret at all or you accidentally bake it into an image layer where it can be extracted later.

    How to Identify It

    The failure surfaces as a package manager authentication error during a RUN step:

    Step 7/14 : RUN npm ci
    npm ERR! code E401
    npm ERR! 401 Unauthorized - GET https://npm.solvethenetwork.com/@internal%2fcore - unauthenticated
    
    # Or for pip:
    Step 8/14 : RUN pip install -r requirements.txt --index-url https://pypi.solvethenetwork.com/simple/
    ERROR: 401 Client Error: Unauthorized for url: https://pypi.solvethenetwork.com/simple/requests/

    Confirm the variable exists on the runner but isn't reaching the build: add a CI debug step before your docker build command that prints the variable's length (never its value), then check whether the same variable is visible inside a RUN step.
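A minimal sketch of that debug step, assuming the secret is exported as NPM_TOKEN (a stand-in name; substitute your real variable):

```shell
#!/bin/sh
# Hypothetical CI debug step. NPM_TOKEN is a stand-in name; in a real
# pipeline it comes from the secret store. Print only the length,
# never the value, so nothing leaks into CI logs.
NPM_TOKEN=${NPM_TOKEN:-example-token}
echo "NPM_TOKEN length on runner: ${#NPM_TOKEN}"

# To probe the build side (requires Docker, so shown commented out):
#   printf 'FROM alpine\nRUN printenv NPM_TOKEN || echo MISSING\n' \
#     | docker build --no-cache -f - .
# The RUN step prints MISSING, demonstrating that runner environment
# variables are not forwarded into the build.
```

A length of 0 means the variable never reached the runner at all, which is a CI configuration problem rather than a Docker one.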

    Frequently Asked Questions

    Why does my Docker build work locally but fail in CI?

    The most common reasons are missing registry credentials on the CI runner, build secrets that exist in the CI environment but aren't forwarded into the Docker build context, or a local image cache that your workstation has but the ephemeral runner does not. Start by running the build with DOCKER_BUILDKIT=1 and --progress=plain to see which layers are being rebuilt, and confirm the runner can authenticate to your registry with an explicit docker login step.

    How do I pass environment variable secrets into a Docker build?

    Use BuildKit's --secret flag with a RUN --mount=type=secret instruction in your Dockerfile. This mounts the secret only during that specific build step without persisting it in any image layer. Set DOCKER_BUILDKIT=1 in your CI environment and pass the secret with: docker build --secret id=my_secret,env=MY_ENV_VAR -t myimage . Then inside the Dockerfile use: RUN --mount=type=secret,id=my_secret VALUE=$(cat /run/secrets/my_secret) command
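Put together, the pattern from that answer looks roughly like this. A sketch assuming an npm setup whose .npmrc reads the token from NPM_TOKEN; the registry host, secret id, and file names are placeholders:

```dockerfile
# syntax=docker/dockerfile:1
FROM node:18-alpine
WORKDIR /app
COPY package*.json .npmrc ./
# The secret is mounted only for this RUN step and never written to a
# layer. Assumes .npmrc contains a line like:
#   //npm.solvethenetwork.com/:_authToken=${NPM_TOKEN}
RUN --mount=type=secret,id=npm_token \
    NPM_TOKEN=$(cat /run/secrets/npm_token) npm ci
COPY . .
CMD ["node", "server.js"]
```

Build it with: DOCKER_BUILDKIT=1 docker build --secret id=npm_token,env=NPM_TOKEN -t myapp .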

    What is the fastest way to reduce Docker build context size?

    Create a .dockerignore file in your build context root and exclude node_modules, .git, dist, build artifacts, logs, and test fixtures. A proper .dockerignore can reduce a gigabyte-scale context down to a few megabytes, dramatically improving both build speed and CI reliability. Run du -sh * .[^.]* | sort -rh to identify the biggest directories before writing the file.

    How do I preserve Docker layer cache across CI jobs?

    Use registry-based cache with BuildKit's inline cache. Pass --build-arg BUILDKIT_INLINE_CACHE=1 and --cache-from pointing at a previously pushed cache image. After a successful build, push the image with a stable tag (like :cache) in addition to the commit SHA tag. On the next run, --cache-from will pull that image and use its embedded cache metadata to skip unchanged layers even on a fresh runner.
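Wired into a pipeline, that pattern looks roughly like this GitLab-style job; the job name and CI_* variables are placeholders, so adapt them to your CI system:

```yaml
# Hypothetical CI job sketch: build with inline cache, then push a
# stable :cache tag to seed the next run on a fresh runner.
build:
  variables:
    DOCKER_BUILDKIT: "1"
  script:
    # The cache image won't exist on the very first run; don't fail.
    - docker pull "$CI_REGISTRY_IMAGE:cache" || true
    - docker build
        --build-arg BUILDKIT_INLINE_CACHE=1
        --cache-from "$CI_REGISTRY_IMAGE:cache"
        -t "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"
        -t "$CI_REGISTRY_IMAGE:cache" .
    - docker push "$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA"
    - docker push "$CI_REGISTRY_IMAGE:cache"
```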

    How do I stop Docker Hub rate limiting from breaking my CI builds?

    Authenticate to Docker Hub in your CI pipeline even when pulling public images — authenticated pulls have a much higher rate limit than anonymous pulls. For high-volume pipelines, mirror frequently used base images to your internal registry and reference them in your Dockerfiles instead of pulling directly from Docker Hub on every build.
