InfraRunBook

    Nginx Upstream Load Balancing: Round-Robin, Least Connections, IP Hash, and Failover

    Nginx
    Published: Mar 25, 2026
    Updated: Mar 25, 2026

    A complete production guide to Nginx upstream load balancing covering round-robin, least_conn, ip_hash, weighted distribution, keepalive connections, passive health checks, and failover for multi-backend deployments.


    Introduction

    When your application outgrows a single backend server, Nginx's upstream load balancing becomes the backbone of your horizontal scaling strategy. Unlike DNS round-robin or hardware load balancers, Nginx gives you granular control over traffic distribution, failure detection, and connection pooling — all from a single configuration file on sw-infrarunbook-01.

    This guide covers every major upstream load balancing method available in Nginx open source: round-robin, weighted distribution, least connections, IP hash, and generic hash. You will also learn how to configure passive health checks, backup servers, upstream keepalive connections, retry logic, and timeout tuning to build a production-grade load balancing layer for solvethenetwork.com.

    Prerequisites

    • Nginx installed on sw-infrarunbook-01 (Ubuntu 22.04 or AlmaLinux 9)
    • Two or more backend application servers reachable within your RFC 1918 network
    • DNS for solvethenetwork.com pointing to the Nginx server's public IP
    • Root or sudo access on sw-infrarunbook-01

    Understanding the Upstream Block

    The upstream block lives inside the http context in your Nginx configuration. It defines a named group of backend servers that a proxy_pass directive can reference by name. The upstream block owns the routing logic — which server receives each request, how failures are tracked, and how connections are managed.

    A minimal upstream block looks like this:

    upstream app_backend {
        server 10.0.1.11:8080;
        server 10.0.1.12:8080;
        server 10.0.1.13:8080;
    }

    You reference it in a server block using proxy_pass:

    server {
        listen 80;
        server_name solvethenetwork.com;
    
        location / {
            proxy_pass http://app_backend;
            proxy_set_header Host              $host;
            proxy_set_header X-Real-IP         $remote_addr;
            proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
        }
    }

    With no additional directives, Nginx uses weighted round-robin by default — distributing requests evenly across all available servers in sequence.

    Round-Robin Load Balancing (Default)

    Round-robin is the implicit default. Nginx cycles through the server list sequentially: the first request goes to 10.0.1.11, the second to 10.0.1.12, the third to 10.0.1.13, then back to 10.0.1.11. No extra directive is needed.

    upstream app_backend {
        server 10.0.1.11:8080;
        server 10.0.1.12:8080;
        server 10.0.1.13:8080;
    }

    Round-robin works best when backends are homogeneous — same CPU, memory, and network throughput — and when individual requests have roughly equal processing cost. It is the simplest starting point and performs well for the majority of stateless API workloads.

    Weighted Load Balancing

    When backend servers have unequal capacity, the weight parameter directs proportionally more traffic to stronger nodes. A server with weight=5 receives five times as many requests as one with weight=1.

    upstream app_backend {
        server 10.0.1.11:8080 weight=5;
        server 10.0.1.12:8080 weight=3;
        server 10.0.1.13:8080 weight=1;
    }

    Out of every nine requests in this example, 10.0.1.11 handles five, 10.0.1.12 handles three, and 10.0.1.13 handles one. Weighted round-robin is particularly useful during rolling hardware upgrades where a newer, more powerful server should absorb the majority of traffic while a legacy node is being decommissioned.

    Least Connections (least_conn)

    Round-robin distributes requests evenly without regard to how long each takes. If some requests hold a backend connection open for several seconds — think file uploads, report exports, or slow database queries — that server accumulates a backlog while others sit idle. The least_conn directive routes each new request to the backend with the fewest active connections at that instant.

    upstream app_backend {
        least_conn;
        server 10.0.1.11:8080;
        server 10.0.1.12:8080;
        server 10.0.1.13:8080;
    }

    Use least_conn for any workload where request duration varies significantly. It pairs naturally with weight — a server with a higher weight receives new connections proportionally more often when connection counts are otherwise equal across the pool.
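The pairing can be sketched as follows (the weights here are illustrative):

```nginx
upstream app_backend {
    least_conn;
    server 10.0.1.11:8080 weight=3;  # newer, higher-capacity node
    server 10.0.1.12:8080 weight=1;
    server 10.0.1.13:8080 weight=1;
}
```

With equal active connection counts, 10.0.1.11 is chosen three times as often as either of the other nodes; when its connection count climbs, least_conn steers new requests away regardless of weight.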

    IP Hash (ip_hash)

    When your application stores session state in server memory rather than in a shared external store, you need session persistence: the same client must always reach the same backend. The ip_hash directive hashes the client IP and pins that client to a specific upstream server for the duration of its session.

    upstream app_backend {
        ip_hash;
        server 10.0.1.11:8080;
        server 10.0.1.12:8080;
        server 10.0.1.13:8080;
    }

    For IPv4, Nginx hashes the first three octets, meaning all clients within the same /24 subnet resolve to the same backend. For IPv6, the full address is used. If the assigned backend becomes unavailable, Nginx automatically rehashes the client to a different live server.

    Operational note: ip_hash cannot be combined with least_conn. For deployments that require both persistence and load-aware routing, the better architectural choice is to externalize session state to Redis and use least_conn without ip_hash.

    Generic Hash (hash)

    The hash directive lets you define an arbitrary Nginx variable as the hashing key. This enables consistent routing by request URI, cookie, query argument, or any combination. Routing the same URL to the same backend consistently maximizes in-process cache hit rates on each node.

    upstream app_backend {
        hash $request_uri consistent;
        server 10.0.1.11:8080;
        server 10.0.1.12:8080;
        server 10.0.1.13:8080;
    }

    The optional consistent keyword enables ketama-style consistent hashing, which minimizes key redistribution when a server is added to or removed from the pool. Without consistent, adding one server causes nearly every key to remap to a different backend, invalidating all existing routing assignments simultaneously.
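The key does not have to be the request URI. As a sketch, hashing on a query argument keeps related requests on the same backend (the tenant_id argument name is a hypothetical example):

```nginx
upstream app_backend {
    # Route all requests carrying the same ?tenant_id= value to one backend
    hash $arg_tenant_id consistent;
    server 10.0.1.11:8080;
    server 10.0.1.12:8080;
    server 10.0.1.13:8080;
}
```

Any combination of variables can form the key, for example hash $scheme$request_uri consistent.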

    Passive Health Checks and Failover

    Nginx open source relies on passive health checking: it monitors real traffic responses rather than issuing synthetic probes. The max_fails and fail_timeout parameters control how quickly a troubled server is pulled from rotation and how long it remains out of service.

    upstream app_backend {
        least_conn;
        server 10.0.1.11:8080 max_fails=3 fail_timeout=30s;
        server 10.0.1.12:8080 max_fails=3 fail_timeout=30s;
        server 10.0.1.13:8080 max_fails=3 fail_timeout=30s;
    }

    In this configuration, a backend is marked unavailable after three failed attempts within a 30-second window. It stays out of rotation for 30 seconds before Nginx attempts to route a single probe request to it again. If that probe succeeds, the server re-enters the pool. The Nginx defaults are max_fails=1 and fail_timeout=10s — raising max_fails to 3 prevents brief network blips from triggering unnecessary failovers.

    Backup Servers

    The backup parameter designates a server as a hot standby. Nginx routes no traffic to a backup server while at least one primary is alive. Only when every non-backup server is in a failed or down state does Nginx begin sending traffic to the backup.

    upstream app_backend {
        least_conn;
        server 10.0.1.11:8080 max_fails=3 fail_timeout=30s;
        server 10.0.1.12:8080 max_fails=3 fail_timeout=30s;
        server 10.0.1.13:8080 max_fails=3 fail_timeout=30s;
        server 10.0.1.20:8080 backup;
    }

    Use a backup server for a degraded-mode endpoint that serves cached or simplified responses during a full primary outage, or for a geographically distant server that is too far away for normal traffic but acceptable as a last resort.

    To temporarily drain a server without removing its configuration line — useful during maintenance windows — mark it down. Nginx will not route any traffic to it until down is removed and the config is reloaded:

    upstream app_backend {
        server 10.0.1.11:8080;
        server 10.0.1.12:8080 down;
        server 10.0.1.13:8080;
    }

    Keepalive Connections to Upstream

    By default Nginx opens a new TCP connection to the upstream for every proxied request and tears it down afterward. At high request rates this creates significant overhead from repeated TCP three-way handshakes and TIME_WAIT socket accumulation. The keepalive directive instructs each Nginx worker process to maintain a cache of idle persistent connections to the upstream group.

    upstream app_backend {
        least_conn;
        server 10.0.1.11:8080 max_fails=3 fail_timeout=30s;
        server 10.0.1.12:8080 max_fails=3 fail_timeout=30s;
        server 10.0.1.13:8080 max_fails=3 fail_timeout=30s;
        keepalive 32;
    }

    The value 32 sets the maximum number of idle keepalive connections cached per worker process for this upstream group. When all cached connections are in use, Nginx opens new connections normally. For keepalive to function correctly, the proxy location block must also be configured as follows:

    location / {
        proxy_pass         http://app_backend;
        proxy_http_version 1.1;
        proxy_set_header   Connection "";
        proxy_set_header   Host             $host;
        proxy_set_header   X-Real-IP        $remote_addr;
        proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;
    }

    Setting proxy_http_version 1.1 is required because HTTP/1.0 does not support persistent connections. Clearing the Connection header with proxy_set_header Connection "" prevents Nginx from forwarding a Connection: close header from the client to the upstream, which would force the upstream to terminate the connection after every response.

    Proxy Timeout Configuration

    Three directives govern how long Nginx waits at each phase of an upstream interaction:

    • proxy_connect_timeout — Maximum time to establish the TCP connection to the upstream. The default is 60 seconds; for backends on the same LAN, 5 seconds is more appropriate and enables faster failover.
    • proxy_send_timeout — Maximum idle time between two successive write operations to the upstream. Resets with each write. Default is 60 seconds.
    • proxy_read_timeout — Maximum idle time between two successive read operations from the upstream. This is not the total request duration but the inter-packet gap. Default is 60 seconds.

    Combined in a location block:

    location / {
        proxy_pass http://app_backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header Host             $host;
        proxy_set_header X-Real-IP        $remote_addr;
        proxy_set_header X-Forwarded-For  $proxy_add_x_forwarded_for;
    
        proxy_connect_timeout  5s;
        proxy_send_timeout    60s;
        proxy_read_timeout    60s;
    }

    For long-running backend operations such as PDF generation or large data exports, raise proxy_read_timeout appropriately. For fast internal microservice calls, keep proxy_connect_timeout short so Nginx detects a downed backend within seconds and retries on another node.
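A common pattern is to give long-running endpoints their own location with a raised read timeout while the rest of the site keeps tight defaults. A sketch, assuming a hypothetical /exports/ path:

```nginx
# Slow report exports: tolerate up to 5 minutes of silence between reads
location /exports/ {
    proxy_pass         http://app_backend;
    proxy_http_version 1.1;
    proxy_set_header   Connection "";

    proxy_connect_timeout   5s;
    proxy_read_timeout    300s;
}
```

Scoping the long timeout to one location means a hung backend on ordinary pages still fails fast.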

    Retry Logic with proxy_next_upstream

    The proxy_next_upstream directive defines the conditions under which Nginx retries a failed request on the next available server. This complements passive health checks by handling in-flight request failures rather than server-level failures.

    location / {
        proxy_pass http://app_backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    
        proxy_next_upstream      error timeout invalid_header http_500 http_502 http_503;
        proxy_next_upstream_tries 2;
        proxy_next_upstream_timeout 10s;
    }

    proxy_next_upstream_tries 2 limits the retry chain to two attempts total, preventing a slow backend from forcing the request to be tried against every server in the pool before returning an error to the client. Since Nginx 1.9.13, non-idempotent HTTP methods such as POST are not retried at all unless you explicitly add non_idempotent — a safeguard against duplicate side effects like a double-submitted payment or a repeated database write. Be conservative with http_500 as a retry condition: an application-level 500 is often a bug that every backend will reproduce, so many deployments omit http_500 from the directive values and handle application errors separately.
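A more conservative variant retries only when the backend never produced a valid response, letting application 500s surface directly (a sketch):

```nginx
location / {
    proxy_pass         http://app_backend;
    proxy_http_version 1.1;
    proxy_set_header   Connection "";

    # Retry only on transport failures and gateway-level errors
    proxy_next_upstream       error timeout http_502 http_503;
    proxy_next_upstream_tries 2;
}
```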

    Complete Production Configuration

    The following is a full production-ready configuration for sw-infrarunbook-01, load balancing HTTPS traffic to three application backends serving solvethenetwork.com using least connections, keepalive, passive health checks, a backup server, and retry logic.

    # /etc/nginx/conf.d/solvethenetwork.conf
    
    upstream app_backend {
        least_conn;
    
        server 10.0.1.11:8080 max_fails=3 fail_timeout=30s;
        server 10.0.1.12:8080 max_fails=3 fail_timeout=30s;
        server 10.0.1.13:8080 max_fails=3 fail_timeout=30s;
    
        # Hot standby: receives traffic only when all primaries are down
        server 10.0.1.20:8080 backup;
    
        keepalive 32;
    }
    
    server {
        listen 80;
        server_name solvethenetwork.com;
        return 301 https://$host$request_uri;
    }
    
    server {
        listen 443 ssl;
        server_name solvethenetwork.com;
    
        ssl_certificate      /etc/letsencrypt/live/solvethenetwork.com/fullchain.pem;
        ssl_certificate_key  /etc/letsencrypt/live/solvethenetwork.com/privkey.pem;
        ssl_protocols        TLSv1.2 TLSv1.3;
        ssl_ciphers          HIGH:!aNULL:!MD5;
    
        access_log  /var/log/nginx/solvethenetwork_access.log;
        error_log   /var/log/nginx/solvethenetwork_error.log warn;
    
        location / {
            proxy_pass         http://app_backend;
            proxy_http_version 1.1;
            proxy_set_header   Connection        "";
            proxy_set_header   Host              $host;
            proxy_set_header   X-Real-IP         $remote_addr;
            proxy_set_header   X-Forwarded-For   $proxy_add_x_forwarded_for;
            proxy_set_header   X-Forwarded-Proto $scheme;
    
            proxy_connect_timeout   5s;
            proxy_send_timeout     60s;
            proxy_read_timeout     60s;
    
            proxy_next_upstream      error timeout http_502 http_503;
            proxy_next_upstream_tries 2;
        }
    
        # Lightweight liveness check served by Nginx itself
    location /nginx-health {
        access_log off;
        default_type text/plain;
        return 200 "OK\n";
    }
    }

    Testing and Validation

    Test the configuration syntax on sw-infrarunbook-01 before applying it:

    sudo nginx -t

    Expected output on success:

    nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
    nginx: configuration file /etc/nginx/nginx.conf test is successful

    Reload Nginx gracefully, without dropping active connections:

    sudo systemctl reload nginx

    Verify distribution across all backends by sending a batch of requests and checking each backend's access log:

    for i in {1..12}; do curl -s -o /dev/null -w "%{http_code}\n" https://solvethenetwork.com/; done

    Each of the three primary backends should show roughly four entries in its access log for the twelve requests sent. With least_conn, distribution may not be perfectly even if some backends are already serving open connections when the test runs — this is expected and correct behavior.
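Rather than checking each backend's log separately, you can record which upstream served each request on the load balancer itself using the $upstream_addr variable (the log_format name and file path here are assumptions):

```nginx
# In the http context
log_format upstream_tally '$remote_addr "$request" -> $upstream_addr';

# In the server block for solvethenetwork.com
access_log /var/log/nginx/upstream_tally.log upstream_tally;
```

A quick tally such as awk '{print $NF}' /var/log/nginx/upstream_tally.log | sort | uniq -c then shows exactly how many requests each backend received.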

    Monitoring with stub_status

    Nginx open source includes a basic status module that exposes aggregate connection and request counters. Enable it on a loopback-only listener so it is never reachable from the internet:

    server {
        listen 127.0.0.1:8081;
    
        location /nginx_status {
            stub_status;
            allow 127.0.0.1;
            deny  all;
        }
    }

    Then query it locally:

    curl http://127.0.0.1:8081/nginx_status

    Sample output:

    Active connections: 47
    server accepts handled requests
     142680 142680 389201
    Reading: 0 Writing: 5 Waiting: 42

    For per-upstream metrics — request rate per backend, active connections per server, upstream error rates — consider the open-source nginx-module-vts module or the nginx-prometheus-exporter sidecar to feed data into a Prometheus and Grafana stack. Nginx Plus provides these natively through its upstream health check API, but the open-source path covered here is sufficient for the majority of production deployments.
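For a quick command-line check, the stub_status output is simple to parse with awk. A sketch that operates on the sample output above (swap the canned printf for the curl shown in the comment on a live server):

```shell
# Canned stub_status output for demonstration; in production use:
#   status=$(curl -s http://127.0.0.1:8081/nginx_status)
status=$(printf '%s\n' \
  'Active connections: 47' \
  'server accepts handled requests' \
  ' 142680 142680 389201' \
  'Reading: 0 Writing: 5 Waiting: 42')

# Pull out the headline numbers by position in the fixed 4-line format
active=$(printf '%s\n' "$status" | awk '/^Active connections/ {print $3}')
requests=$(printf '%s\n' "$status" | awk 'NR==3 {print $3}')
waiting=$(printf '%s\n' "$status" | awk '/^Reading/ {print $6}')
echo "active=$active requests=$requests waiting=$waiting"
```

A steadily climbing Waiting count alongside low Reading/Writing numbers indicates healthy keepalive reuse rather than a problem.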


    Frequently Asked Questions

    Q: What is the default load balancing method in Nginx if I do not specify one?

    A: The default is weighted round-robin. Nginx cycles through the server list in order, sending one request to each server in turn. If all servers have equal weight (the default weight is 1), requests are distributed evenly. You do not need to add any directive to use round-robin — it is active as soon as you define an upstream block with multiple server entries.

    Q: Can I combine ip_hash with least_conn in the same upstream block?

    A: No. ip_hash and least_conn are mutually exclusive load balancing methods and cannot be used together in the same upstream block. If you need session persistence alongside connection-aware routing, the recommended approach is to move session state out of application memory and into an external shared store such as Redis, then use least_conn alone without ip_hash.

    Q: What is the difference between max_fails and proxy_connect_timeout?

    A: They operate at different layers. max_fails is an upstream-level counter that tracks how many failed requests a server produces within the fail_timeout window before Nginx removes it from rotation. proxy_connect_timeout is a per-request timer that controls how long Nginx waits when attempting to open a single TCP connection to an upstream. A connection timeout counts as one failure toward max_fails, but the two settings serve different purposes and should both be configured.

    Q: How do I temporarily take an upstream server out of rotation for maintenance?

    A: Add the down parameter to the server entry in the upstream block and reload Nginx. Nginx will not route any traffic to that server until down is removed and the configuration is reloaded again. This is safer than deleting the server line entirely because it preserves the configuration and makes the maintenance state explicit and auditable in version control.

    Q: Does proxy_next_upstream retry POST requests safely?

    A: By default Nginx does not retry non-idempotent requests (POST, LOCK, PATCH) to prevent unintended duplicate side effects such as double-submitting a payment or inserting a database record twice. If you include non_idempotent in the proxy_next_upstream directive value, Nginx will retry these methods, but this should only be done when your backend handlers are explicitly designed to be idempotent. For most applications, leave non-idempotent retry disabled.
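If you do opt in, the flag sits alongside the other retry conditions (a sketch; enable only when backend handlers are idempotent):

```nginx
location / {
    proxy_pass http://app_backend;

    # non_idempotent allows POST/LOCK/PATCH to be retried as well
    proxy_next_upstream       error timeout non_idempotent;
    proxy_next_upstream_tries 2;
}
```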

    Q: What value should I set for the keepalive directive?

    A: The keepalive value is the number of idle connections cached per worker process for the whole upstream group — not a global total, and not a per-server figure. A reasonable starting point is 32 for most applications. Multiply it by the number of Nginx worker processes to estimate the total idle connections held open: for example, 4 worker processes with keepalive 32 can maintain up to 128 idle upstream connections. Tune this based on your backend servers' connection limits and your actual request concurrency.

    Q: Can I use upstream load balancing to proxy to HTTPS backend servers?

    A: Yes. Change the upstream server entries to use port 443 and change proxy_pass to https://app_backend. Nginx will establish TLS to each upstream server. You can control upstream TLS verification with proxy_ssl_verify on and proxy_ssl_trusted_certificate. For internal backends with self-signed certificates, you may set proxy_ssl_verify off, though enabling verification is strongly recommended in production environments.
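A sketch of the verified-TLS variant (the CA bundle path is an assumption):

```nginx
upstream app_backend_tls {
    server 10.0.1.11:443;
    server 10.0.1.12:443;
}

location / {
    proxy_pass https://app_backend_tls;

    # Verify upstream certificates against an internal CA bundle
    proxy_ssl_verify              on;
    proxy_ssl_trusted_certificate /etc/nginx/internal-ca.pem;
    # Send SNI so the upstream can select the right certificate
    proxy_ssl_server_name         on;
}
```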

    Q: What happens to in-flight requests when I reload Nginx after changing the upstream block?

    A: Nginx's graceful reload (systemctl reload nginx or nginx -s reload) spawns new worker processes with the updated configuration and lets existing worker processes finish their current requests before shutting down. In-flight requests are not dropped. Only after all active connections on the old workers have closed do the old workers exit. This means a reload is safe to perform during live traffic as long as the reload itself completes successfully.

    Q: How does the consistent keyword in the hash directive affect backend additions?

    A: Without consistent, the hash algorithm maps keys to buckets using a simple modulo of the total server count. Adding or removing one server changes the modulo divisor, causing nearly every key to remap to a different server — destroying cache locality. With consistent, Nginx uses ketama consistent hashing, where each server occupies a range on a virtual ring. Adding a server only remaps the keys that fall within that server's new range, leaving the rest of the key space undisturbed. This makes pool membership changes far less disruptive for cache-affinity workloads.

    Q: Is the Nginx open-source upstream module the same as Nginx Plus upstream?

    A: The core upstream proxying functionality is the same, but Nginx Plus adds active health checks (Nginx sends periodic synthetic probes to upstreams instead of relying solely on real traffic failures), a live upstream reconfiguration API, per-upstream dashboard metrics, and the sticky cookie persistence directive. For most production use cases the open-source passive health checking covered in this guide is entirely sufficient. Active health checks become valuable when you cannot afford even a single real request hitting a failed server before Nginx detects the outage.

    Q: How many upstream groups can I define in a single Nginx configuration?

    A: There is no hard limit enforced by Nginx. You can define as many upstream blocks as needed within the http context. Each upstream group is independent, with its own server pool, load balancing method, health check settings, and keepalive pool. It is common to define separate upstream groups for different services — one for the main application, one for an API gateway, one for a media processing cluster — all proxied from different location blocks within the same or different server blocks.

    Q: Do I need to restart or reload Nginx when a backend server recovers after being marked as failed?

    A: No manual intervention is required. After a server is marked unavailable and the fail_timeout period expires, Nginx automatically routes a single probe request to it. If that request succeeds, the server is returned to the active pool without any reload. This recovery is handled entirely by the Nginx worker processes at runtime. A configuration reload is only necessary when you change the static upstream configuration itself, such as adding or removing server entries or changing parameters.
