Introduction
When your application outgrows a single backend server, Nginx's upstream load balancing becomes the backbone of your horizontal scaling strategy. Unlike DNS round-robin or hardware load balancers, Nginx gives you granular control over traffic distribution, failure detection, and connection pooling — all from a single configuration file on sw-infrarunbook-01.
This guide covers every major upstream load balancing method available in Nginx open source: round-robin, weighted distribution, least connections, IP hash, and generic hash. You will also learn how to configure passive health checks, backup servers, upstream keepalive connections, retry logic, and timeout tuning to build a production-grade load balancing layer for solvethenetwork.com.
Prerequisites
- Nginx installed on sw-infrarunbook-01 (Ubuntu 22.04 or AlmaLinux 9)
- Two or more backend application servers reachable within your RFC 1918 network
- DNS for solvethenetwork.com pointing to the Nginx server's public IP
- Root or sudo access on sw-infrarunbook-01
Understanding the Upstream Block
The upstream block lives inside the http context in your Nginx configuration. It defines a named group of backend servers that a proxy_pass directive can reference by name. The upstream block owns the routing logic — which server receives each request, how failures are tracked, and how connections are managed.
A minimal upstream block looks like this:
upstream app_backend {
    server 10.0.1.11:8080;
    server 10.0.1.12:8080;
    server 10.0.1.13:8080;
}

You reference it in a server block using proxy_pass:
server {
    listen 80;
    server_name solvethenetwork.com;

    location / {
        proxy_pass http://app_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

With no additional directives, Nginx uses weighted round-robin by default — distributing requests evenly across all available servers in sequence.
Round-Robin Load Balancing (Default)
Round-robin is the implicit default. Nginx cycles through the server list sequentially: the first request goes to 10.0.1.11, the second to 10.0.1.12, the third to 10.0.1.13, then back to 10.0.1.11. No extra directive is needed.
upstream app_backend {
    server 10.0.1.11:8080;
    server 10.0.1.12:8080;
    server 10.0.1.13:8080;
}

Round-robin works best when backends are homogeneous — same CPU, memory, and network throughput — and when individual requests have roughly equal processing cost. It is the simplest starting point and performs well for the majority of stateless API workloads.
Weighted Load Balancing
When backend servers have unequal capacity, the weight parameter directs proportionally more traffic to stronger nodes. A server with weight=5 receives five times as many requests as one with weight=1.
upstream app_backend {
    server 10.0.1.11:8080 weight=5;
    server 10.0.1.12:8080 weight=3;
    server 10.0.1.13:8080 weight=1;
}

Out of every nine requests in this example, 10.0.1.11 handles five, 10.0.1.12 handles three, and 10.0.1.13 handles one. Weighted round-robin is particularly useful during rolling hardware upgrades where a newer, more powerful server should absorb the majority of traffic while a legacy node is being decommissioned.
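The 5:3:1 split can be simulated in a few lines. The sketch below implements smooth weighted round-robin, the technique Nginx adopted for its default balancer; this is an illustration only, not Nginx source code, and the backend addresses are the example ones above.

```python
from collections import Counter

def smooth_weighted_rr(weights, n_requests):
    """Pick a backend for each request using smooth weighted round-robin."""
    current = {server: 0 for server in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(n_requests):
        # Each server's running score grows by its weight...
        for server, weight in weights.items():
            current[server] += weight
        # ...the highest-scoring server wins and pays back the total weight.
        winner = max(current, key=current.get)
        current[winner] -= total
        picks.append(winner)
    return picks

picks = smooth_weighted_rr({"10.0.1.11": 5, "10.0.1.12": 3, "10.0.1.13": 1}, 9)
print(Counter(picks))
# Over nine requests: 10.0.1.11 five times, 10.0.1.12 three, 10.0.1.13 once
```

Note how the smooth variant interleaves picks (the heavy server is not hit five times in a row), which avoids bursts against any single backend.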
Least Connections (least_conn)
Round-robin distributes requests evenly without regard to how long each takes. If some requests hold a backend connection open for several seconds — think file uploads, report exports, or slow database queries — that server accumulates a backlog while others sit idle. The least_conn directive routes each new request to the backend with the fewest active connections at that instant.
upstream app_backend {
    least_conn;
    server 10.0.1.11:8080;
    server 10.0.1.12:8080;
    server 10.0.1.13:8080;
}

Use least_conn for any workload where request duration varies significantly. It pairs naturally with weight — a server with a higher weight receives new connections proportionally more often when connection counts are otherwise equal across the pool.
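The selection rule itself is tiny. The sketch below shows the equal-weight case only, with invented in-flight connection counts for illustration:

```python
# Hypothetical snapshot of in-flight requests per backend at arrival time
active = {"10.0.1.11:8080": 12, "10.0.1.12:8080": 3, "10.0.1.13:8080": 7}

# least_conn (equal weights): the backend with the fewest active
# connections receives the new request
target = min(active, key=active.get)
print(target)
# -> 10.0.1.12:8080
```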
IP Hash (ip_hash)
When your application stores session state in server memory rather than in a shared external store, you need session persistence: the same client must always reach the same backend. The ip_hash directive hashes the client IP and pins that client to a specific upstream server for the duration of its session.
upstream app_backend {
    ip_hash;
    server 10.0.1.11:8080;
    server 10.0.1.12:8080;
    server 10.0.1.13:8080;
}

For IPv4, Nginx hashes the first three octets, meaning all clients within the same /24 subnet resolve to the same backend. For IPv6, the full address is used. If the assigned backend becomes unavailable, Nginx automatically rehashes the client to a different live server.
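The /24 grouping can be demonstrated with a toy hash. This is not Nginx's internal hash function; any deterministic hash keyed on the first three octets exhibits the same pinning property:

```python
import hashlib

backends = ["10.0.1.11:8080", "10.0.1.12:8080", "10.0.1.13:8080"]

def pick_backend(client_ip):
    # Key on the first three octets only, mirroring ip_hash's IPv4 behavior
    subnet = ".".join(client_ip.split(".")[:3])
    digest = int(hashlib.sha256(subnet.encode()).hexdigest(), 16)
    return backends[digest % len(backends)]

# Any two clients in the same /24 are pinned to the same backend
print(pick_backend("203.0.113.7"), pick_backend("203.0.113.250"))
```

The consequence for real deployments: clients behind the same corporate NAT or carrier subnet all land on one backend, which can skew load.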
Operational note: ip_hash cannot be combined with least_conn. For deployments that require both persistence and load-aware routing, the better architectural choice is to externalize session state to Redis and use least_conn without ip_hash.
Generic Hash (hash)
The hash directive lets you define an arbitrary Nginx variable as the hashing key. This enables consistent routing by request URI, cookie, query argument, or any combination. Routing the same URL to the same backend consistently maximizes in-process cache hit rates on each node.
upstream app_backend {
    hash $request_uri consistent;
    server 10.0.1.11:8080;
    server 10.0.1.12:8080;
    server 10.0.1.13:8080;
}

The optional consistent keyword enables ketama-style consistent hashing, which minimizes key redistribution when a server is added to or removed from the pool. Without consistent, adding one server changes the mapping wholesale, remapping nearly every key to a different backend and destroying existing routing assignments at once.
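The difference is easy to quantify with a toy comparison. The sketch below uses MD5 and 100 virtual points per server, which are illustrative choices, not Nginx's actual ketama parameters:

```python
import bisect
import hashlib

def hash_int(key):
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

def build_ring(servers, vnodes=100):
    # Each server occupies many points on a virtual ring
    ring = sorted((hash_int(f"{s}#{i}"), s) for s in servers for i in range(vnodes))
    return [point for point, _ in ring], [server for _, server in ring]

def lookup(points, owners, key):
    # A key belongs to the first server point clockwise from its hash
    idx = bisect.bisect(points, hash_int(key)) % len(points)
    return owners[idx]

keys = [f"/article/{i}" for i in range(1000)]
old, new = ["s1", "s2", "s3"], ["s1", "s2", "s3", "s4"]

# Modulo hashing: changing the divisor from 3 to 4 remaps most keys
moved_modulo = sum(hash_int(k) % 3 != hash_int(k) % 4 for k in keys)

# Consistent hashing: only keys falling in s4's new ranges move (about 1/4)
p_old, o_old = build_ring(old)
p_new, o_new = build_ring(new)
moved_ring = sum(lookup(p_old, o_old, k) != lookup(p_new, o_new, k) for k in keys)

print(f"modulo remapped {moved_modulo}/1000, consistent remapped {moved_ring}/1000")
```

Run it and the modulo scheme remaps roughly three quarters of the keys, while the ring remaps only about the quarter that the new server takes over.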
Passive Health Checks and Failover
Nginx open source relies on passive health checking: it monitors real traffic responses rather than issuing synthetic probes. The max_fails and fail_timeout parameters control how quickly a troubled server is pulled from rotation and how long it remains out of service.
upstream app_backend {
    least_conn;
    server 10.0.1.11:8080 max_fails=3 fail_timeout=30s;
    server 10.0.1.12:8080 max_fails=3 fail_timeout=30s;
    server 10.0.1.13:8080 max_fails=3 fail_timeout=30s;
}

In this configuration, a backend is marked unavailable after three failed attempts within a 30-second window. It stays out of rotation for 30 seconds before Nginx attempts to route a single probe request to it again. If that probe succeeds, the server re-enters the pool. The Nginx defaults are max_fails=1 and fail_timeout=10s — raising max_fails to 3 prevents brief network blips from triggering unnecessary failovers.
Backup Servers
The backup parameter designates a server as a hot standby. Nginx routes no traffic to a backup server while at least one primary is alive. Only when every non-backup server is in a failed or down state does Nginx begin sending traffic to the backup.
upstream app_backend {
    least_conn;
    server 10.0.1.11:8080 max_fails=3 fail_timeout=30s;
    server 10.0.1.12:8080 max_fails=3 fail_timeout=30s;
    server 10.0.1.13:8080 max_fails=3 fail_timeout=30s;
    server 10.0.1.20:8080 backup;
}

Use a backup server for a degraded-mode endpoint that serves cached or simplified responses during a full primary outage, or for a geographically distant server that is too far away for normal traffic but acceptable as a last resort.
To temporarily drain a server without removing its configuration line — useful during maintenance windows — mark it down. Nginx will not route any traffic to it until down is removed and the config is reloaded:
upstream app_backend {
    server 10.0.1.11:8080;
    server 10.0.1.12:8080 down;
    server 10.0.1.13:8080;
}

Keepalive Connections to Upstream
By default Nginx opens a new TCP connection to the upstream for every proxied request and tears it down afterward. At high request rates this creates significant overhead from repeated TCP three-way handshakes and TIME_WAIT socket accumulation. The keepalive directive instructs each Nginx worker process to maintain a cache of idle persistent connections to the upstream group.
upstream app_backend {
    least_conn;
    server 10.0.1.11:8080 max_fails=3 fail_timeout=30s;
    server 10.0.1.12:8080 max_fails=3 fail_timeout=30s;
    server 10.0.1.13:8080 max_fails=3 fail_timeout=30s;
    keepalive 32;
}

The value 32 sets the maximum number of idle keepalive connections cached per worker process for this upstream group. When all cached connections are in use, Nginx opens new connections normally. For keepalive to function correctly, the proxy location block must also be configured as follows:
location / {
    proxy_pass http://app_backend;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}

Setting proxy_http_version 1.1 is required because HTTP/1.0 does not support persistent connections. Clearing the Connection header with proxy_set_header Connection "" prevents Nginx from forwarding a Connection: close header from the client to the upstream, which would force the upstream to terminate the connection after every response.
Proxy Timeout Configuration
Three directives govern how long Nginx waits at each phase of an upstream interaction:
- proxy_connect_timeout — Maximum time to establish the TCP connection to the upstream. The default is 60 seconds; for backends on the same LAN, 5 seconds is more appropriate and enables faster failover.
- proxy_send_timeout — Maximum idle time between two successive write operations to the upstream. Resets with each write. Default is 60 seconds.
- proxy_read_timeout — Maximum idle time between two successive read operations from the upstream. This is not the total request duration but the inter-packet gap. Default is 60 seconds.
location / {
    proxy_pass http://app_backend;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_connect_timeout 5s;
    proxy_send_timeout 60s;
    proxy_read_timeout 60s;
}

For long-running backend operations such as PDF generation or large data exports, raise proxy_read_timeout appropriately. For fast internal microservice calls, keep proxy_connect_timeout short so Nginx detects a downed backend within seconds and retries on another node.
Retry Logic with proxy_next_upstream
The proxy_next_upstream directive defines the conditions under which Nginx retries a failed request on the next available server. This complements passive health checks by handling in-flight request failures rather than server-level failures.
location / {
    proxy_pass http://app_backend;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    proxy_next_upstream error timeout invalid_header http_500 http_502 http_503;
    proxy_next_upstream_tries 2;
    proxy_next_upstream_timeout 10s;
}

proxy_next_upstream_tries 2 limits the retry chain to two attempts in total, preventing a slow backend from forcing the request to be tried against every server in the pool before returning an error to the client. Be conservative when enabling retry for http_500 on non-idempotent HTTP methods such as POST — retrying a payment submission or database write could cause duplicate side effects. To restrict retries to safe methods only, omit http_500 from the directive values and handle application-level 500 errors separately.
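The retry budget can be pictured with a small sketch. This is illustrative pseudologic, not Nginx source; the backend addresses and the fake transport function are invented for the example:

```python
def proxy_with_retries(backends, send, tries=2):
    """Attempt a request on successive backends, stopping after `tries` attempts."""
    last_error = None
    for backend in backends[:tries]:
        try:
            return backend, send(backend)
        except ConnectionError as err:
            last_error = err  # retryable failure: fall through to the next backend
    raise last_error  # retry budget exhausted without a success

# Simulated pool: the first backend is down, the second is healthy
def fake_send(backend):
    if backend == "10.0.1.11:8080":
        raise ConnectionError("connect timeout")
    return "200 OK"

pool = ["10.0.1.11:8080", "10.0.1.12:8080", "10.0.1.13:8080"]
print(proxy_with_retries(pool, fake_send))
# The client gets a response from 10.0.1.12; 10.0.1.13 is never attempted
```

With tries=2, even if both attempted backends failed, the third would not be tried and the client would see the error, which is exactly the bounded-latency trade-off the directive buys you.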
Complete Production Configuration
The following is a full production-ready configuration for sw-infrarunbook-01, load balancing HTTPS traffic to three application backends serving solvethenetwork.com using least connections, keepalive, passive health checks, a backup server, and retry logic.
# /etc/nginx/conf.d/solvethenetwork.conf
upstream app_backend {
    least_conn;
    server 10.0.1.11:8080 max_fails=3 fail_timeout=30s;
    server 10.0.1.12:8080 max_fails=3 fail_timeout=30s;
    server 10.0.1.13:8080 max_fails=3 fail_timeout=30s;
    # Hot standby: receives traffic only when all primaries are down
    server 10.0.1.20:8080 backup;
    keepalive 32;
}

server {
    listen 80;
    server_name solvethenetwork.com;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl;
    server_name solvethenetwork.com;

    ssl_certificate /etc/letsencrypt/live/solvethenetwork.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/solvethenetwork.com/privkey.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers HIGH:!aNULL:!MD5;

    access_log /var/log/nginx/solvethenetwork_access.log;
    error_log /var/log/nginx/solvethenetwork_error.log warn;

    location / {
        proxy_pass http://app_backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_connect_timeout 5s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
        proxy_next_upstream error timeout http_502 http_503;
        proxy_next_upstream_tries 2;
    }

    # Lightweight liveness check served by Nginx itself
    location /nginx-health {
        access_log off;
        return 200 "OK\n";
        add_header Content-Type text/plain;
    }
}

Testing and Validation
Test the configuration syntax on sw-infrarunbook-01 before applying it:
sudo nginx -t

Expected output on success:

nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful

Reload Nginx gracefully, without dropping active connections:

sudo systemctl reload nginx

Verify distribution across all backends by sending a batch of requests and checking each backend's access log:

for i in {1..12}; do curl -s -o /dev/null -w "%{http_code}\n" https://solvethenetwork.com/; done

Each of the three primary backends should show roughly four entries in its access log for the twelve requests sent. With least_conn, distribution may not be perfectly even if some backends are already serving open connections when the test runs — this is expected and correct behavior.
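You can also count the split on the proxy itself rather than on each backend, assuming you have extended your log_format to record the $upstream_addr variable. The sample log lines below are invented for illustration:

```python
from collections import Counter

# Hypothetical access-log lines ending in "upstream=<addr>", produced by a
# custom log_format that appends $upstream_addr (an assumption, not a default)
sample_lines = [
    '203.0.113.7 "GET / HTTP/1.1" 200 upstream=10.0.1.11:8080',
    '203.0.113.8 "GET / HTTP/1.1" 200 upstream=10.0.1.12:8080',
    '203.0.113.9 "GET / HTTP/1.1" 200 upstream=10.0.1.13:8080',
    '203.0.113.7 "GET / HTTP/1.1" 200 upstream=10.0.1.11:8080',
]

# Tally how many requests each backend served
counts = Counter(line.rsplit("upstream=", 1)[1] for line in sample_lines)
for backend, hits in counts.most_common():
    print(backend, hits)
```

In practice you would read /var/log/nginx/solvethenetwork_access.log instead of a hardcoded list; the parsing stays the same.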
Monitoring with stub_status
Nginx open source includes a basic status module that exposes aggregate connection and request counters. Enable it on a loopback-only listener so it is never reachable from the internet:
server {
    listen 127.0.0.1:8081;

    location /nginx_status {
        stub_status;
        allow 127.0.0.1;
        deny all;
    }
}

curl http://127.0.0.1:8081/nginx_status

Sample output:

Active connections: 47
server accepts handled requests
 142680 142680 389201
Reading: 0 Writing: 5 Waiting: 42

For per-upstream metrics — request rate per backend, active connections per server, upstream error rates — consider the open-source nginx-module-vts module or the nginx-prometheus-exporter sidecar to feed data into a Prometheus and Grafana stack. Nginx Plus provides these natively through its upstream health check API, but the open-source path covered here is sufficient for the majority of production deployments.
Frequently Asked Questions
Q: What is the default load balancing method in Nginx if I do not specify one?
A: The default is weighted round-robin. Nginx cycles through the server list in order, sending one request to each server in turn. If all servers have equal weight (the default weight is 1), requests are distributed evenly. You do not need to add any directive to use round-robin — it is active as soon as you define an upstream block with multiple server entries.
Q: Can I combine ip_hash with least_conn in the same upstream block?
A: No. ip_hash and least_conn are mutually exclusive load balancing methods and cannot be used together in the same upstream block. If you need session persistence alongside connection-aware routing, the recommended approach is to move session state out of application memory and into an external shared store such as Redis, then use least_conn alone without ip_hash.
Q: What is the difference between max_fails and proxy_connect_timeout?
A: They operate at different layers. max_fails is an upstream-level counter that tracks how many failed requests a server produces within the fail_timeout window before Nginx removes it from rotation. proxy_connect_timeout is a per-request timer that controls how long Nginx waits when attempting to open a single TCP connection to an upstream. A connection timeout counts as one failure toward max_fails, but the two settings serve different purposes and should both be configured.
Q: How do I temporarily take an upstream server out of rotation for maintenance?
A: Add the down parameter to the server entry in the upstream block and reload Nginx. Nginx will not route any traffic to that server until down is removed and the configuration is reloaded again. This is safer than deleting the server line entirely because it preserves the configuration and makes the maintenance state explicit and auditable in version control.
Q: Does proxy_next_upstream retry POST requests safely?
A: By default Nginx does not retry non-idempotent requests (POST, LOCK, PATCH) to prevent unintended duplicate side effects such as double-submitting a payment or inserting a database record twice. If you include non_idempotent in the proxy_next_upstream directive value, Nginx will retry these methods, but this should only be done when your backend handlers are explicitly designed to be idempotent. For most applications, leave non-idempotent retry disabled.
Q: What value should I set for the keepalive directive?
A: The keepalive value is the number of idle connections cached per worker process for the whole upstream group, not a global total and not a per-server figure. A reasonable starting point is 32 for most applications. Multiply this by the number of Nginx worker processes to estimate the total idle connections held open. For example, 4 worker processes with keepalive 32 can maintain up to 128 idle connections to the group in total. Tune this based on your backend servers' connection limits and your actual request concurrency.
Q: Can I use upstream load balancing to proxy to HTTPS backend servers?
A: Yes. Change the upstream server entries to use port 443 and change proxy_pass to https://app_backend. Nginx will establish TLS to each upstream server. You can control upstream TLS verification with proxy_ssl_verify on and proxy_ssl_trusted_certificate. For internal backends with self-signed certificates, you may set proxy_ssl_verify off, though enabling verification is strongly recommended in production environments.
Q: What happens to in-flight requests when I reload Nginx after changing the upstream block?
A: Nginx's graceful reload (systemctl reload nginx or nginx -s reload) spawns new worker processes with the updated configuration and lets existing worker processes finish their current requests before shutting down. In-flight requests are not dropped. Only after all active connections on the old workers have closed do the old workers exit. This means a reload is safe to perform during live traffic as long as the reload itself completes successfully.
Q: How does the consistent keyword in the hash directive affect backend additions?
A: Without consistent, the hash algorithm maps keys to buckets using a simple modulo of the total server count. Adding or removing one server changes the modulo divisor, causing nearly every key to remap to a different server — destroying cache locality. With consistent, Nginx uses ketama consistent hashing, where each server occupies a range on a virtual ring. Adding a server only remaps the keys that fall within that server's new range, leaving the rest of the key space undisturbed. This makes pool membership changes far less disruptive for cache-affinity workloads.
Q: Is the Nginx open-source upstream module the same as Nginx Plus upstream?
A: The core upstream proxying functionality is the same, but Nginx Plus adds active health checks (Nginx sends periodic synthetic probes to upstreams instead of relying solely on real traffic failures), a live upstream reconfiguration API, per-upstream dashboard metrics, and the sticky cookie persistence directive. For most production use cases the open-source passive health checking covered in this guide is entirely sufficient. Active health checks become valuable when backend failures are infrequent and you cannot afford even a single request hitting a failed server before Nginx detects the outage.
Q: How many upstream groups can I define in a single Nginx configuration?
A: There is no hard limit enforced by Nginx. You can define as many upstream blocks as needed within the http context. Each upstream group is independent, with its own server pool, load balancing method, health check settings, and keepalive pool. It is common to define separate upstream groups for different services — one for the main application, one for an API gateway, one for a media processing cluster — all proxied from different location blocks within the same or different server blocks.
Q: Do I need to restart or reload Nginx when a backend server recovers after being marked as failed?
A: No manual intervention is required. After a server is marked unavailable and the fail_timeout period expires, Nginx automatically sends a single probe request to it. If that request succeeds, the server is returned to the active pool without any reload. This recovery is handled entirely by the Nginx worker processes at runtime. A configuration reload is only necessary when you change the static upstream configuration itself, such as adding or removing server entries or changing parameters.
