InfraRunBook

    Nginx Upstream Load Balancing: Round-Robin, Least Connections, IP Hash, and Failover

    Nginx
    Published: Mar 25, 2026
    Updated: Mar 25, 2026

    A complete production guide to Nginx upstream load balancing covering round-robin, least_conn, ip_hash, weighted distribution, keepalive connections, passive health checks, and failover for multi-backend deployments.


    Introduction

    When your application outgrows a single backend server, Nginx's upstream load balancing becomes the backbone of your horizontal scaling strategy. Unlike DNS round-robin or hardware load balancers, Nginx gives you granular control over traffic distribution, failure detection, and connection pooling — all from a single configuration file on sw-infrarunbook-01.

    This guide covers every major upstream load balancing method available in Nginx open source: round-robin, weighted distribution, least connections, IP hash, and generic hash. You will also learn how to configure passive health checks, backup servers, upstream keepalive connections, retry logic, and timeout tuning to build a production-grade load balancing layer for solvethenetwork.com.

    Prerequisites

    • Nginx installed on sw-infrarunbook-01 (Ubuntu 22.04 or AlmaLinux 9)
    • Two or more backend application servers reachable within your RFC 1918 network
    • DNS for solvethenetwork.com pointing to the Nginx server's public IP
    • Root or sudo access on sw-infrarunbook-01

    Understanding the Upstream Block

    The upstream block lives inside the http context in your Nginx configuration. It defines a named group of backend servers that a proxy_pass directive can reference by name. The upstream block owns the routing logic — which server receives each request, how failures are tracked, and how connections are managed.

    A minimal upstream block looks like this:

    upstream app_backend {
        server 10.0.1.11:8080;
        server 10.0.1.12:8080;
        server 10.0.1.13:8080;
    }

    You reference it in a server block using proxy_pass:

    server {
        listen 80;
        server_name solvethenetwork.com;
    
        location / {
            proxy_pass http://app_backend;
            proxy_set_header Host              $host;
            proxy_set_header X-Real-IP         $remote_addr;
            proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
        }
    }

    With no additional directives, Nginx uses weighted round-robin by default — distributing requests evenly across all available servers in sequence.

    Round-Robin Load Balancing (Default)

    Round-robin is the implicit default. Nginx cycles through the server list sequentially: the first request goes to 10.0.1.11, the second to 10.0.1.12, the third to 10.0.1.13, then back to 10.0.1.11. No extra directive is needed.

    upstream app_backend {
        server 10.0.1.11:8080;
        server 10.0.1.12:8080;
        server 10.0.1.13:8080;
    }

    Round-robin works best when backends are homogeneous — same CPU, memory, and network throughput — and when individual requests have roughly equal processing cost. It is the simplest starting point and performs well for the majority of stateless API workloads.

    Weighted Load Balancing

    When backend servers have unequal capacity, the weight parameter directs proportionally more traffic to stronger nodes. A server with weight=5 receives five times as many requests as one with weight=1.

    upstream app_backend {
        server 10.0.1.11:8080 weight=5;
        server 10.0.1.12:8080 weight=3;
        server 10.0.1.13:8080 weight=1;
    }

    Out of every nine requests in this example, 10.0.1.11 handles five, 10.0.1.12 handles three, and 10.0.1.13 handles one. Weighted round-robin is particularly useful during rolling hardware upgrades where a newer, more powerful server should absorb the majority of traffic while a legacy node is being decommissioned.

    Least Connections (least_conn)

    Round-robin distributes requests evenly without regard to how long each takes. If some requests hold a backend connection open for several seconds — think file uploads, report exports, or slow database queries — that server accumulates a backlog while others sit idle. The least_conn directive routes each new request to the backend with the fewest active connections at that instant.

    upstream app_backend {
        least_conn;
        server 10.0.1.11:8080;
        server 10.0.1.12:8080;
        server 10.0.1.13:8080;
    }

    Use least_conn for any workload where request duration varies significantly. It pairs naturally with weight — a server with a higher weight receives new connections proportionally more often when connection counts are otherwise equal across the pool.
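The pairing can be sketched as follows (the weights here are illustrative):

```nginx
upstream app_backend {
    least_conn;
    server 10.0.1.11:8080 weight=3;  # newer, higher-capacity node
    server 10.0.1.12:8080 weight=1;
    server 10.0.1.13:8080 weight=1;
}
```

With equal active connection counts, 10.0.1.11 is chosen three times as often as either of the other nodes; when its connection count climbs, least_conn steers new requests away regardless of weight.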

    IP Hash (ip_hash)

    When your application stores session state in server memory rather than in a shared external store, you need session persistence: the same client must always reach the same backend. The ip_hash directive hashes the client IP and pins that client to a specific upstream server for the duration of its session.

    upstream app_backend {
        ip_hash;
        server 10.0.1.11:8080;
        server 10.0.1.12:8080;
        server 10.0.1.13:8080;
    }

    For IPv4, Nginx hashes the first three octets, meaning all clients within the same /24 subnet resolve to the same backend. For IPv6, the full address is used. If the assigned backend becomes unavailable, Nginx automatically rehashes the client to a different live server.

    Operational note: ip_hash cannot be combined with least_conn. For deployments that require both persistence and load-aware routing, the better architectural choice is to externalize session state to Redis and use least_conn without ip_hash.

    Generic Hash (hash)

    The hash directive lets you define an arbitrary Nginx variable as the hashing key. This enables consistent routing by request URI, cookie, query argument, or any combination. Routing the same URL to the same backend consistently maximizes in-process cache hit rates on each node.

    upstream app_backend {
        hash $request_uri consistent;
        server 10.0.1.11:8080;
        server 10.0.1.12:8080;
        server 10.0.1.13:8080;
    }

    The optional consistent keyword enables ketama-style consistent hashing, which minimizes key redistribution when a server is added to or removed from the pool. Without consistent, adding one server causes nearly every key to remap to a different backend, invalidating all existing routing assignments simultaneously.
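The key does not have to be the request URI. As a sketch, hashing on a query argument keeps related requests on the same backend (the tenant_id argument name is a hypothetical example):

```nginx
upstream app_backend {
    # Route all requests carrying the same ?tenant_id= value to one backend
    hash $arg_tenant_id consistent;
    server 10.0.1.11:8080;
    server 10.0.1.12:8080;
    server 10.0.1.13:8080;
}
```

Any combination of variables can form the key, for example hash $scheme$request_uri consistent.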

    Passive Health Checks and Failover

    Nginx open source relies on passive health checking: it monitors real traffic responses rather than issuing synthetic probes. The max_fails and fail_timeout parameters control how quickly a troubled server is pulled from rotation and how long it remains out of service.

    upstream app_backend {
        least_conn;
        server 10.0.1.11:8080 max_fails=3 fail_timeout=30s;
        server 10.0.1.12:8080 max_fails=3 fail_timeout=30s;
        server 10.0.1.13:8080 max_fails=3 fail_timeout=30s;
    }

    In this configuration, a backend is marked unavailable after three failed attempts within a 30-second window. It stays out of rotation for 30 seconds before Nginx attempts to route a single probe request to it again. If that probe succeeds, the server re-enters the pool. The Nginx defaults are max_fails=1 and fail_timeout=10s — raising max_fails to 3 prevents brief network blips from triggering unnecessary failovers.

    Backup Servers

    The backup parameter designates a server as a hot standby. Nginx routes no traffic to a backup server while at least one primary is alive. Only when every non-backup server is in a failed or down state does Nginx begin sending traffic to the backup.

    upstream app_backend {
        least_conn;
        server 10.0.1.11:8080 max_fails=3 fail_timeout=30s;
        server 10.0.1.12:8080 max_fails=3 fail_timeout=30s;
        server 10.0.1.13:8080 max_fails=3 fail_timeout=30s;
        server 10.0.1.20:8080 backup;
    }

    Use a backup server for a degraded-mode endpoint that serves cached or simplified responses during a full primary outage, or for a geographically distant server that is too far away for normal traffic but acceptable as a last resort.

    To temporarily drain a server without removing its configuration line — useful during maintenance windows — mark it down. Nginx will not route any traffic to it until down is removed and the config is reloaded:

    upstream app_backend {
        server 10.0.1.11:8080;
        server 10.0.1.12:8080 down;
        server 10.0.1.13:8080;
    }

    Keepalive Connections to Upstream

    By default Nginx opens a new TCP connection to the upstream for every proxied request and tears it down afterward. At high request rates this creates significant overhead from repeated TCP three-way handshakes and TIME_WAIT socket accumulation. The keepalive directive instructs each Nginx worker process to maintain a cache of idle persistent connections to the upstream group.

    upstream app_backend {
        least_conn;
        server 10.0.1.11:8080 max_fails=3 fail_timeout=30s;
        server 10.0.1.12:8080 max_fails=3 fail_timeout=30s;
        server 10.0.1.13:8080 max_fails=3 fail_timeout=30s;
        keepalive 32;
    }

    The value 32 sets the maximum number of idle keepalive connections cached per worker process for this upstream group. When all cached connections are in use, Nginx opens new connections normally. For keepalive to function correctly, the proxy location block must also be configured as follows:

    location / {
        proxy_pass         http://app_backend;
        proxy_http_version 1.1;
        proxy_set_header   Connection "";
        proxy_set_header   Host             $host;
        proxy_set_header   X-Real-IP        $remote_addr;
        proxy_set_header   X-Forwarded-For  $proxy_add_x_forwarded_for;
    }

    Setting proxy_http_version 1.1 is required because HTTP/1.0 does not support persistent connections. Clearing the Connection header with proxy_set_header Connection "" prevents Nginx from forwarding a Connection: close header from the client to the upstream, which would force the upstream to terminate the connection after every response.

    Proxy Timeout Configuration

    Three directives govern how long Nginx waits at each phase of an upstream interaction:

    • proxy_connect_timeout — Maximum time to establish the TCP connection to the upstream. The default is 60 seconds; for backends on the same LAN, 5 seconds is more appropriate and enables faster failover.
    • proxy_send_timeout — Maximum idle time between two successive write operations to the upstream. Resets with each write. Default is 60 seconds.
    • proxy_read_timeout — Maximum idle time between two successive read operations from the upstream. This is not the total request duration but the inter-packet gap. Default is 60 seconds.

    Combined in a location block:

    location / {
        proxy_pass http://app_backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header Host             $host;
        proxy_set_header X-Real-IP        $remote_addr;
        proxy_set_header X-Forwarded-For  $proxy_add_x_forwarded_for;
    
        proxy_connect_timeout  5s;
        proxy_send_timeout    60s;
        proxy_read_timeout    60s;
    }

    For long-running backend operations such as PDF generation or large data exports, raise proxy_read_timeout appropriately. For fast internal microservice calls, keep proxy_connect_timeout short so Nginx detects a downed backend within seconds and retries on another node.
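A common pattern is to give long-running endpoints their own location with a raised read timeout while the rest of the site keeps tight defaults. A sketch, assuming a hypothetical /exports/ path:

```nginx
# Slow report exports: tolerate up to 5 minutes of silence between reads
location /exports/ {
    proxy_pass         http://app_backend;
    proxy_http_version 1.1;
    proxy_set_header   Connection "";

    proxy_connect_timeout   5s;
    proxy_read_timeout    300s;
}
```

Scoping the long timeout to one location means a hung backend on ordinary pages still fails fast.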

    Retry Logic with proxy_next_upstream

    The proxy_next_upstream directive defines the conditions under which Nginx retries a failed request on the next available server. This complements passive health checks by handling in-flight request failures rather than server-level failures.

    location / {
        proxy_pass http://app_backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    
        proxy_next_upstream      error timeout invalid_header http_500 http_502 http_503;
        proxy_next_upstream_tries 2;
        proxy_next_upstream_timeout 10s;
    }

    proxy_next_upstream_tries 2 limits the retry chain to two attempts total, preventing a slow backend from forcing the request to be tried against every server in the pool before returning an error to the client. Since Nginx 1.9.13, non-idempotent HTTP methods such as POST are not retried at all unless you explicitly add non_idempotent — a safeguard against duplicate side effects like a double-submitted payment or a repeated database write. Be conservative with http_500 as a retry condition: an application-level 500 is often a bug that every backend will reproduce, so many deployments omit http_500 from the directive values and handle application errors separately.
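A more conservative variant retries only when the backend never produced a valid response, letting application 500s surface directly (a sketch):

```nginx
location / {
    proxy_pass         http://app_backend;
    proxy_http_version 1.1;
    proxy_set_header   Connection "";

    # Retry only on transport failures and gateway-level errors
    proxy_next_upstream       error timeout http_502 http_503;
    proxy_next_upstream_tries 2;
}
```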

    Complete Production Configuration

    The following is a full production-ready configuration for sw-infrarunbook-01, load balancing HTTPS traffic to three application backends serving solvethenetwork.com using least connections, keepalive, passive health checks, a backup server, and retry logic.

    # /etc/nginx/conf.d/solvethenetwork.conf
    
    upstream app_backend {
        least_conn;
    
        server 10.0.1.11:8080 max_fails=3 fail_timeout=30s;
        server 10.0.1.12:8080 max_fails=3 fail_timeout=30s;
        server 10.0.1.13:8080 max_fails=3 fail_timeout=30s;
    
        # Hot standby: receives traffic only when all primaries are down
        server 10.0.1.20:8080 backup;
    
        keepalive 32;
    }
    
    server {
        listen 80;
        server_name solvethenetwork.com;
        return 301 https://$host$request_uri;
    }
    
    server {
        listen 443 ssl;
        server_name solvethenetwork.com;
    
        ssl_certificate      /etc/letsencrypt/live/solvethenetwork.com/fullchain.pem;
        ssl_certificate_key  /etc/letsencrypt/live/solvethenetwork.com/privkey.pem;
        ssl_protocols        TLSv1.2 TLSv1.3;
        ssl_ciphers          HIGH:!aNULL:!MD5;
    
        access_log  /var/log/nginx/solvethenetwork_access.log;
        error_log   /var/log/nginx/solvethenetwork_error.log warn;
    
        location / {
            proxy_pass         http://app_backend;
            proxy_http_version 1.1;
            proxy_set_header   Connection        "";
            proxy_set_header   Host              $host;
            proxy_set_header   X-Real-IP         $remote_addr;
            proxy_set_header   X-Forwarded-For   $proxy_add_x_forwarded_for;
            proxy_set_header   X-Forwarded-Proto $scheme;
    
            proxy_connect_timeout   5s;
            proxy_send_timeout     60s;
            proxy_read_timeout     60s;
    
            proxy_next_upstream      error timeout http_502 http_503;
            proxy_next_upstream_tries 2;
        }
    
        # Lightweight liveness check served by Nginx itself
    location /nginx-health {
        access_log off;
        default_type text/plain;
        return 200 "OK\n";
    }
    }

    Testing and Validation

    Test the configuration syntax on sw-infrarunbook-01 before applying it:

    sudo nginx -t

    Expected output on success:

    nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
    nginx: configuration file /etc/nginx/nginx.conf test is successful

    Reload Nginx gracefully, without dropping active connections:

    sudo systemctl reload nginx

    Verify distribution across all backends by sending a batch of requests and checking each backend's access log:

    for i in {1..12}; do curl -s -o /dev/null -w "%{http_code}\n" https://solvethenetwork.com/; done

    Each of the three primary backends should show roughly four entries in its access log for the twelve requests sent. With least_conn, distribution may not be perfectly even if some backends are already serving open connections when the test runs — this is expected and correct behavior.
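Rather than checking each backend's log separately, you can record which upstream served each request on the load balancer itself using the $upstream_addr variable (the log_format name and file path here are assumptions):

```nginx
# In the http context
log_format upstream_tally '$remote_addr "$request" -> $upstream_addr';

# In the server block for solvethenetwork.com
access_log /var/log/nginx/upstream_tally.log upstream_tally;
```

A quick tally such as awk '{print $NF}' /var/log/nginx/upstream_tally.log | sort | uniq -c then shows exactly how many requests each backend received.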

    Monitoring with stub_status

    Nginx open source includes a basic status module that exposes aggregate connection and request counters. Enable it on a loopback-only listener so it is never reachable from the internet:

    server {
        listen 127.0.0.1:8081;
    
        location /nginx_status {
            stub_status;
            allow 127.0.0.1;
            deny  all;
        }
    }

    Then query it locally:

    curl http://127.0.0.1:8081/nginx_status

    Sample output:

    Active connections: 47
    server accepts handled requests
     142680 142680 389201
    Reading: 0 Writing: 5 Waiting: 42

    For per-upstream metrics — request rate per backend, active connections per server, upstream error rates — consider the open-source nginx-module-vts module or the nginx-prometheus-exporter sidecar to feed data into a Prometheus and Grafana stack. Nginx Plus provides these natively through its upstream health check API, but the open-source path covered here is sufficient for the majority of production deployments.
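For a quick command-line check, the stub_status output is simple to parse with awk. A sketch that operates on the sample output above (swap the canned printf for the curl shown in the comment on a live server):

```shell
# Canned stub_status output for demonstration; in production use:
#   status=$(curl -s http://127.0.0.1:8081/nginx_status)
status=$(printf '%s\n' \
  'Active connections: 47' \
  'server accepts handled requests' \
  ' 142680 142680 389201' \
  'Reading: 0 Writing: 5 Waiting: 42')

# Pull out the headline numbers by position in the fixed 4-line format
active=$(printf '%s\n' "$status" | awk '/^Active connections/ {print $3}')
requests=$(printf '%s\n' "$status" | awk 'NR==3 {print $3}')
waiting=$(printf '%s\n' "$status" | awk '/^Reading/ {print $6}')
echo "active=$active requests=$requests waiting=$waiting"
```

A steadily climbing Waiting count alongside low Reading/Writing numbers indicates healthy keepalive reuse rather than a problem.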


    Frequently Asked Questions

    Q: What is the default load balancing method in Nginx if I do not specify one?

    A: The default is weighted round-robin. Nginx cycles through the server list in order, sending one request to each server in turn. If all servers have equal weight (the default weight is 1), requests are distributed evenly. You do not need to add any directive to use round-robin — it is active as soon as you define an upstream block with multiple server entries.

    Q: Can I combine ip_hash with least_conn in the same upstream block?

    A: No. ip_hash and least_conn are mutually exclusive load balancing methods and cannot be used together in the same upstream block. If you need session persistence alongside connection-aware routing, the recommended approach is to move session state out of application memory and into an external shared store such as Redis, then use least_conn alone without ip_hash.

    Q: What is the difference between max_fails and proxy_connect_timeout?

    A: They operate at different layers. max_fails is an upstream-level counter that tracks how many failed requests a server produces within the fail_timeout window before Nginx removes it from rotation. proxy_connect_timeout is a per-request timer that controls how long Nginx waits when attempting to open a single TCP connection to an upstream. A connection timeout counts as one failure toward max_fails, but the two settings serve different purposes and should both be configured.

    Q: How do I temporarily take an upstream server out of rotation for maintenance?

    A: Add the down parameter to the server entry in the upstream block and reload Nginx. Nginx will not route any traffic to that server until down is removed and the configuration is reloaded again. This is safer than deleting the server line entirely because it preserves the configuration and makes the maintenance state explicit and auditable in version control.

    Q: Does proxy_next_upstream retry POST requests safely?

    A: By default Nginx does not retry non-idempotent requests (POST, LOCK, PATCH) to prevent unintended duplicate side effects such as double-submitting a payment or inserting a database record twice. If you include non_idempotent in the proxy_next_upstream directive value, Nginx will retry these methods, but this should only be done when your backend handlers are explicitly designed to be idempotent. For most applications, leave non-idempotent retry disabled.
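If you do opt in, the flag sits alongside the other retry conditions (a sketch; enable only when backend handlers are idempotent):

```nginx
location / {
    proxy_pass http://app_backend;

    # non_idempotent allows POST/LOCK/PATCH to be retried as well
    proxy_next_upstream       error timeout non_idempotent;
    proxy_next_upstream_tries 2;
}
```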

    Q: What value should I set for the keepalive directive?

    A: The keepalive value is the number of idle connections cached per worker process for the whole upstream group — not a global total, and not a per-server figure. A reasonable starting point is 32 for most applications. Multiply it by the number of Nginx worker processes to estimate the total idle connections held open: for example, 4 worker processes with keepalive 32 can maintain up to 128 idle upstream connections. Tune this based on your backend servers' connection limits and your actual request concurrency.

    Q: Can I use upstream load balancing to proxy to HTTPS backend servers?

    A: Yes. Change the upstream server entries to use port 443 and change proxy_pass to https://app_backend. Nginx will establish TLS to each upstream server. You can control upstream TLS verification with proxy_ssl_verify on and proxy_ssl_trusted_certificate. For internal backends with self-signed certificates, you may set proxy_ssl_verify off, though enabling verification is strongly recommended in production environments.
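A sketch of the verified-TLS variant (the CA bundle path is an assumption):

```nginx
upstream app_backend_tls {
    server 10.0.1.11:443;
    server 10.0.1.12:443;
}

location / {
    proxy_pass https://app_backend_tls;

    # Verify upstream certificates against an internal CA bundle
    proxy_ssl_verify              on;
    proxy_ssl_trusted_certificate /etc/nginx/internal-ca.pem;
    # Send SNI so the upstream can select the right certificate
    proxy_ssl_server_name         on;
}
```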

    Q: What happens to in-flight requests when I reload Nginx after changing the upstream block?

    A: Nginx's graceful reload (systemctl reload nginx or nginx -s reload) spawns new worker processes with the updated configuration and lets existing worker processes finish their current requests before shutting down. In-flight requests are not dropped. Only after all active connections on the old workers have closed do the old workers exit. This means a reload is safe to perform during live traffic as long as the reload itself completes successfully.

    Q: How does the consistent keyword in the hash directive affect backend additions?

    A: Without consistent, the hash algorithm maps keys to buckets using a simple modulo of the total server count. Adding or removing one server changes the modulo divisor, causing nearly every key to remap to a different server — destroying cache locality. With consistent, Nginx uses ketama consistent hashing, where each server occupies a range on a virtual ring. Adding a server only remaps the keys that fall within that server's new range, leaving the rest of the key space undisturbed. This makes pool membership changes far less disruptive for cache-affinity workloads.

    Q: Is the Nginx open-source upstream module the same as Nginx Plus upstream?

    A: The core upstream proxying functionality is the same, but Nginx Plus adds active health checks (Nginx sends periodic synthetic probes to upstreams instead of relying solely on real traffic failures), a live upstream reconfiguration API, per-upstream dashboard metrics, and the sticky cookie persistence directive. For most production use cases the open-source passive health checking covered in this guide is entirely sufficient. Active health checks become valuable when you cannot afford even a single real request hitting a failed server before Nginx detects the outage.

    Q: How many upstream groups can I define in a single Nginx configuration?

    A: There is no hard limit enforced by Nginx. You can define as many upstream blocks as needed within the http context. Each upstream group is independent, with its own server pool, load balancing method, health check settings, and keepalive pool. It is common to define separate upstream groups for different services — one for the main application, one for an API gateway, one for a media processing cluster — all proxied from different location blocks within the same or different server blocks.

    Q: Do I need to restart or reload Nginx when a backend server recovers after being marked as failed?

    A: No manual intervention is required. After a server is marked unavailable and the fail_timeout period expires, Nginx automatically routes a single probe request to it. If that request succeeds, the server is returned to the active pool without any reload. This recovery is handled entirely by the Nginx worker processes at runtime. A configuration reload is only necessary when you change the static upstream configuration itself, such as adding or removing server entries or changing parameters.
