Symptoms
When HAProxy runs out of connections, the failure mode depends on which limit you've actually hit, but the experience for end users is consistent: requests start hanging, then fail with 503s. You'll typically see log entries like this:
[WARNING] 101/143022 (1234) : Proxy web_frontend reached the maxconn limit (2000). Please check the global maxconn and the frontend's maxconn.
192.168.1.50:54321 [12/Apr/2026:14:32:01.123] web_frontend web_backend/app01 0/30000/-1/-1/30001 503 212 - - sQ-- 47/47/47/47/0 0/0 "GET /api/data HTTP/1.1"
Or, at the socket level:
connect() failed for server web_backend/app01: no more sockets
On the client side, browsers report connection timeouts or reset connections. Your monitoring dashboards show queue depth climbing, and the HAProxy stats page at http://192.168.1.10:8404/stats shows backend servers going orange or red with session counts pinned at their configured maximum. Error rates spike. On-call gets paged.
The tricky part is that "connection limits reached" isn't a single error — it's a category of failure with at least six distinct root causes, each requiring a different fix. Raising the wrong number gets you nowhere. This article covers all of them.
Root Cause 1: Global maxconn Too Low
Why It Happens
The global maxconn directive in haproxy.cfg caps the total number of concurrent connections HAProxy will accept across all frontends combined. When this limit is hit, HAProxy stops accepting new TCP connections at the socket level. New clients receive a connection refused or a timeout, depending on how the OS handles the listen backlog.
This is the most common cause I see on freshly deployed instances. The traditional compiled-in default maxconn is 2000 (recent HAProxy versions may derive a default from the FD limit instead): fine for a lab, completely inadequate for any real production workload. Someone installs it, forgets to tune it, and everything is fine until the first real traffic spike.
How to Identify It
Query the stats socket directly:
echo "show info" | socat stdio /var/run/haproxy/admin.sock | grep -E "Maxconn|MaxconnReached|CurrConns"
Expected output when you're hitting the limit:
Maxconn: 2000
MaxconnReached: 1847
CurrConns: 2000
If CurrConns equals Maxconn and MaxconnReached is a non-zero and growing number, you've found it. The HAProxy log will also contain the explicit warning shown in the Symptoms section above.
How to Fix It
Edit /etc/haproxy/haproxy.cfg and raise the global maxconn in the global section:
global
maxconn 50000
log /dev/log local0
log /dev/log local1 notice
stats socket /var/run/haproxy/admin.sock mode 660 level admin expose-fd listeners
stats timeout 30s
user haproxy
group haproxy
Before settling on 50000, do the math. Each connection consumes roughly 20–50 KB of kernel memory for socket buffers. At 50,000 connections you're looking at up to 2.5 GB of potential memory pressure. Size it to your actual traffic peak plus a 30% headroom buffer. You can count peak requests per minute from the accept timestamp in your access logs (field 2 in the log format shown above; adjust the field number if your lines carry a syslog prefix):
awk '{print $2}' /var/log/haproxy.log | cut -d: -f1-3 | sort | uniq -c | sort -rn | head -5
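Request counts alone aren't concurrency. By Little's law, concurrent connections are roughly arrival rate times average session duration. A quick sketch with illustrative numbers (both inputs are assumptions, not measurements from this system):

```shell
# Little's law: concurrency ~= arrival rate x average session duration.
req_per_min=12000     # illustrative peak from the per-minute log count
avg_session_ms=250    # illustrative average total session time
# Convert: (req/min * ms/req) / (60000 ms/min) = concurrent connections
concurrent=$(( req_per_min * avg_session_ms / 60000 ))
echo "estimated peak concurrent connections: ${concurrent}"
```

Add your 30% headroom on top of that estimate before picking a maxconn value.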
After updating the config, validate and reload:
haproxy -c -f /etc/haproxy/haproxy.cfg && systemctl reload haproxy
Root Cause 2: Per-Server maxconn Too Low
Why It Happens
Even with a generous global maxconn, each backend server has its own connection limit. When a server's per-server maxconn is reached, HAProxy doesn't reject the connection outright — it queues it. If the queue fills up too, then you get immediate 503s. This is a layered failure and it's easy to miss if you're only looking at the global stats.
The default per-server maxconn is 0 (unlimited) unless you've explicitly set it. But I've seen inherited configs where someone set a conservative value years ago, the backend scaled up, and now those limits are consistently being hit during peak hours.
How to Identify It
Check the current server state:
echo "show servers state web_backend" | socat stdio /var/run/haproxy/admin.sock
Output:
1 web_backend 1 app01 192.168.10.11 2 0 1 1 100 100 0 0 0 0
1 web_backend 2 app02 192.168.10.12 2 0 1 1 100 100 0 0 0 0
For session counts, pull the CSV stats and compare current vs. maximum sessions per server:
echo "show stat" | socat stdio /var/run/haproxy/admin.sock | awk -F',' 'NR==1{for(i=1;i<=NF;i++) if($i=="scur"||$i=="smax"||$i=="pxname"||$i=="svname") col[i]=$i} NR>1{out=""; for(i in col) out=out col[i]"="$i" "; print out}'
If a specific server shows its scur (current sessions) permanently pegged at the same value during load, that server is the bottleneck, not the global limit.
How to Fix It
Raise the maxconn for each server in the backend definition:
backend web_backend
balance leastconn
option httpchk GET /healthz
server app01 192.168.10.11:8080 check maxconn 500 weight 1
server app02 192.168.10.12:8080 check maxconn 500 weight 1
server app03 192.168.10.13:8080 check maxconn 500 weight 1
The right value depends on what your application server can actually handle. Raising HAProxy's per-server maxconn beyond what the app can tolerate just pushes the bottleneck downstream. If you're running Gunicorn with 8 workers and 4 threads each, the effective max is 32 concurrent requests — no amount of HAProxy tuning changes that. Check your app server's thread pool configuration and match your HAProxy per-server maxconn to it, or slightly below to leave room for healthcheck overhead.
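That sizing logic is easy to script. A sketch assuming a hypothetical Gunicorn-style app server; the worker and thread counts here are illustrative, not taken from any real config:

```shell
# Hypothetical app-server concurrency, Gunicorn-style: workers x threads.
workers=8
threads=4
app_capacity=$(( workers * threads ))        # total concurrent requests the app can serve
server_maxconn=$(( app_capacity * 9 / 10 ))  # ~10% headroom for health checks
echo "suggested per-server maxconn: ${server_maxconn}"
```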
Root Cause 3: File Descriptor Limit
Why It Happens
Every TCP connection requires a file descriptor. HAProxy opens two FDs per proxied connection — one for the client side, one for the backend side — plus additional FDs for log sockets, the stats socket, TLS session storage, and config files. If the OS or process-level FD limit is lower than what HAProxy's maxconn requires, HAProxy will either fail to start correctly or begin refusing connections before it ever reaches its configured maxconn ceiling.
The rough formula is: required FDs = (maxconn × 2) + 500. This catches people every time they raise maxconn without raising the corresponding FD limits. HAProxy will actually warn you at startup if this happens, but those warnings scroll by fast and are easy to miss.
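The formula is simple enough to compute inline whenever you change maxconn:

```shell
# Required FD count per the rough formula above: (maxconn x 2) + 500.
maxconn=50000
required_fds=$(( maxconn * 2 + 500 ))
echo "set ulimit-n to at least: ${required_fds}"
```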
How to Identify It
Check HAProxy's actual process limits:
cat /proc/$(pidof haproxy | awk '{print $1}')/limits | grep "open files"
Output showing a problem:
Max open files 4096 4096 files
A limit of 4096 with a maxconn of 50000 means HAProxy will exhaust FDs at roughly 1,800 active connections (two FDs per connection, plus about 500 for overhead) and start failing. Also check the shell's limit and the kernel-wide ceiling:
ulimit -n
sysctl fs.file-max
And check HAProxy's startup journal for the explicit warning:
journalctl -u haproxy --since "2 hours ago" | grep -iE "ulimit|descriptor|fd limit|too low"
HAProxy startup output when FD limit is insufficient:
[WARNING] 101/143022 (1234) : FD limit (4096) too low for maxconn=50000. Please raise 'ulimit-n' to at least 100500.
How to Fix It
Set ulimit-n in the global section of haproxy.cfg:
global
maxconn 50000
ulimit-n 100500
Then lock in the limit at the systemd level so it survives restarts:
mkdir -p /etc/systemd/system/haproxy.service.d/
cat > /etc/systemd/system/haproxy.service.d/limits.conf <<'EOF'
[Service]
LimitNOFILE=100500
EOF
Add the system-wide PAM limits in /etc/security/limits.conf:
haproxy soft nofile 100500
haproxy hard nofile 100500
root soft nofile 100500
root hard nofile 100500
And tune the kernel in /etc/sysctl.conf:
fs.file-max = 500000
net.ipv4.ip_local_port_range = 1024 65535
net.core.somaxconn = 65535
net.ipv4.tcp_tw_reuse = 1
Apply everything and verify:
sysctl -p
systemctl daemon-reload
systemctl reload haproxy
cat /proc/$(pidof haproxy | awk '{print $1}')/limits | grep "open files"
Expected output after fix:
Max open files 100500 100500 files
Root Cause 4: Thread Saturation
Why It Happens
HAProxy processes connections across worker threads. In older single-process configurations, everything ran on a single thread and you could hit CPU saturation long before any numerical connection limit. With modern HAProxy 2.x, the nbthread directive distributes connection handling across multiple threads tied to CPU cores.
Thread saturation happens when all threads are blocked — waiting on slow TLS handshakes, waiting on backend I/O, or stuck in kernel socket operations. The effect looks identical to a maxconn limit from the outside: new connections queue up and time out. But the fix is entirely different. Raising maxconn won't help at all if your threads are the bottleneck.
In my experience, this manifests most often on systems that were migrated from old single-core VM configurations where nbthread was never set, or was left at 1 or 2 on an 8-core host.
How to Identify It
Check the configured thread count and current task load:
echo "show info" | socat stdio /var/run/haproxy/admin.sock | grep -E "^Nbthread|^Tasks|^Run_queue|^Idle_pct"
Output showing saturation:
Nbthread: 2
Tasks: 51234
Run_queue: 8472
Idle_pct: 3
An Idle_pct near zero with a large Run_queue relative to Nbthread is the clearest signal. Cross-reference with CPU utilization:
top -p $(pidof haproxy | tr ' ' ',')
If you see two cores pegged at 100% while six others sit idle, you're thread-saturated. On a loaded system, also check the per-thread CPU breakdown:
echo "show threads" | socat stdio /var/run/haproxy/admin.sock
How to Fix It
Set nbthread to match your core count. Leave one core for the OS on busy systems:
global
maxconn 50000
nbthread 7
ulimit-n 100500
cpu-map auto:1/1-7 0-6
The cpu-map directive pins each HAProxy thread to a specific CPU core. This reduces cache misses and prevents HAProxy threads from fighting the kernel scheduler for CPU time. The format auto:1/1-7 0-6 maps threads 1 through 7 of process 1 to CPU cores 0 through 6.
Reload and verify:
haproxy -c -f /etc/haproxy/haproxy.cfg && systemctl reload haproxy
echo "show info" | socat stdio /var/run/haproxy/admin.sock | grep -E "^Nbthread|^Idle_pct"
Expected output after fix:
Nbthread: 7
Idle_pct: 42
Worth noting: if your backend responses are genuinely slow (consistently over 500ms average), adding threads provides some relief but doesn't fully solve the problem. You also need to address backend latency or add more backend servers. Thread saturation amplifies slow-backend problems rather than causing them independently.
Root Cause 5: Queue Overflow
Why It Happens
When a backend server hits its per-server maxconn, HAProxy doesn't immediately send a 503. Instead, it places the request into a queue and waits for a connection slot to free up. That queue has its own limit, configured by the maxqueue directive, which defaults to 0 (unlimited). When the queue fills, HAProxy sends a 503 immediately with no further waiting.
Queue overflow is almost always a secondary failure. It's a symptom of per-server maxconn being too low, backends being too slow, or a sudden traffic spike that outpaces your connection pool. But the log signatures it produces are distinct, and understanding them helps you diagnose the primary cause faster.
How to Identify It
Look for the queue-related status flags in HAProxy logs. The sQ termination state in the log line means the session ended while waiting in the server queue:
192.168.1.50:54321 [12/Apr/2026:14:32:01.123] web_frontend web_backend/app01 0/30000/-1/-1/30001 503 212 - - sQ-- 47/47/47/47/0 0/0 "GET /api/data HTTP/1.1"
The -1 for the backend connect time confirms HAProxy never actually connected to the backend — the request died while queued. Pull current queue depths directly:
echo "show stat" | socat stdio /var/run/haproxy/admin.sock | awk -F',' 'NR==1{for(i=1;i<=NF;i++){if($i=="pxname"||$i=="svname"||$i=="qcur"||$i=="qmax") col[i]=$i}} NR>1{line=""; for(i in col) line=line col[i]"="$i" "; print line}'
qcur is the current queue depth; qmax is the historical maximum observed. If qcur is regularly non-zero on a server, that server is the bottleneck and connections are queuing up waiting for it.
How to Fix It
Address the underlying cause first — usually per-server maxconn being too low relative to what the app can handle, or slow backend response times causing connections to pile up. Then tune queue behavior explicitly so failures are fast rather than slow:
backend web_backend
balance leastconn
option httpchk GET /healthz HTTP/1.1\r\nHost:\ solvethenetwork.com
timeout queue 10s
timeout connect 3s
timeout server 30s
default-server maxconn 300 maxqueue 100 check inter 5s fall 3 rise 2
server app01 192.168.10.11:8080 weight 1
server app02 192.168.10.12:8080 weight 1
server app03 192.168.10.13:8080 weight 1
Setting timeout queue to 10 seconds means requests don't queue indefinitely — they fail fast with a 503 rather than hanging until the client times out. A client that gets a 503 in 10 seconds can retry or show a meaningful error. A client waiting 90 seconds for a queue timeout is burning a socket on both ends and delivering a worse user experience anyway.
Set maxqueue proportional to your expected burst duration times your request rate. If you process 100 req/s and legitimate traffic bursts last 2 seconds before backends catch up, a queue of 200 is reasonable. A queue of 10,000 just means 10,000 clients wait longer before getting their 503, and your backend has no chance of draining it before the next wave arrives.
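The same arithmetic as a small helper; the rate and burst length are the illustrative numbers from above, not measurements:

```shell
# maxqueue sizing: expected request rate x expected burst duration.
req_per_sec=100   # illustrative sustained request rate
burst_seconds=2   # illustrative burst length before backends catch up
maxqueue=$(( req_per_sec * burst_seconds ))
echo "suggested maxqueue: ${maxqueue}"
```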
Root Cause 6: Slow Backend Responses Causing Connection Buildup
Why It Happens
This one is situational, but I've seen it contribute to connection limit exhaustion more often than any pure misconfiguration. If your backend servers respond slowly — database queries taking 5–10 seconds, GC pauses, upstream API latency — connections pile up because HAProxy holds each connection open until the backend responds or the timeout fires. At steady-state traffic levels, slow backends mean connections accumulate faster than they're released, and you'll hit your limits even if they're configured correctly for normal conditions.
How to Identify It
Check average backend response times from the stats output:
echo "show stat" | socat stdio /var/run/haproxy/admin.sock | awk -F',' 'NR==1{for(i=1;i<=NF;i++){if($i=="pxname"||$i=="svname"||$i=="rtime"||$i=="ttime") col[i]=$i}} NR>1{line=""; for(i in col) line=line col[i]"="$i" "; print line}'
rtime is the average backend response time in milliseconds over the last 1024 requests. If it's consistently above 1000 ms, your backends are struggling and connections will pile up under load. Also look at active session count relative to your traffic rate:
echo "show info" | socat stdio /var/run/haproxy/admin.sock | grep -E "^CurrConns|^MaxconnReached|^Uptime"
If concurrent connections are high but your request rate is moderate, long-lived connections are the issue. Total session time is the fifth value in the slash-separated timing field of the default log format (field 5 in the log samples shown above). Sort by it to find outliers, adjusting the field number if your lines carry a syslog prefix:
awk '{split($5, t, "/"); print t[5], $0}' /var/log/haproxy.log | sort -rn | head -20
How to Fix It
Set aggressive but realistic server timeouts. Don't leave them at 0 or a multi-minute value in production:
defaults
timeout connect 3s
timeout client 30s
timeout server 30s
timeout queue 10s
I've encountered configurations where someone set timeout server 0 to "avoid timeouts" and the result was thousands of half-open connections holding the load balancer hostage. Timeouts exist to evict dead connections, not to punish your users. Set them to the 95th-percentile response time of your slowest legitimate request plus a reasonable buffer — not infinity.
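One way to get that 95th percentile is straight from the access log. A sketch using inline sample lines in place of /var/log/haproxy.log; it assumes the timer field (Tq/Tw/Tc/Tr/Tt) is field 5, as in the log samples above, so adjust the field number if your lines carry a syslog prefix:

```shell
# Nearest-rank 95th percentile of total session time (Tt, in ms).
# The three printf lines stand in for real log entries.
p95=$(printf '%s\n' \
  'c [t] fe be/s1 0/0/1/2/120 200 512 - - ----' \
  'c [t] fe be/s1 0/0/1/2/80 200 512 - - ----' \
  'c [t] fe be/s1 0/0/1/2/450 200 512 - - ----' \
  | awk '{split($5, t, "/"); print t[5]}' \
  | sort -n \
  | awk '{v[NR]=$1} END{i=int(NR*0.95); if (NR*0.95 > i) i++; print v[i]}')
echo "p95 total session time (ms): ${p95}"
```

Point the pipeline at the real log file and set timeout server a comfortable margin above the number it prints.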
If slow responses come from a specific backend endpoint rather than all traffic, route that traffic to a dedicated backend (with use_backend and a path ACL) so it gets its own timeout and a tight connection cap:
backend slow_api_backend
timeout server 60s
server api01 192.168.10.21:9090 check maxconn 50
Prevention
The best time to tune connection limits is before a production incident, not during one. Here's the full reference configuration for sw-infrarunbook-01 running HAProxy 2.8, incorporating all the fixes discussed above:
global
maxconn 50000
nbthread 7
ulimit-n 100500
cpu-map auto:1/1-7 0-6
log /dev/log local0
log /dev/log local1 notice
stats socket /var/run/haproxy/admin.sock mode 660 level admin expose-fd listeners
stats timeout 30s
user haproxy
group haproxy
defaults
log global
mode http
option httplog
option dontlognull
timeout connect 3s
timeout client 30s
timeout server 30s
timeout queue 10s
errorfile 503 /etc/haproxy/errors/503.http
frontend web_frontend
bind 192.168.1.10:80
bind 192.168.1.10:443 ssl crt /etc/haproxy/certs/solvethenetwork.com.pem
maxconn 40000
default_backend web_backend
backend web_backend
balance leastconn
option httpchk GET /healthz HTTP/1.1\r\nHost:\ solvethenetwork.com
default-server maxconn 300 maxqueue 100 check inter 5s fall 3 rise 2
server app01 192.168.10.11:8080 weight 1
server app02 192.168.10.12:8080 weight 1
server app03 192.168.10.13:8080 weight 1
listen stats
bind 192.168.1.10:8404
stats enable
stats uri /stats
stats refresh 10s
stats auth infrarunbook-admin:changeme
Build connection capacity estimation into your deployment process. Use this rough formula when sizing a new instance:
- maxconn = available_memory_MB × 20 (conservative) — each connection uses roughly 50 KB
- ulimit-n = (maxconn × 2) + 500
- per-server maxconn = app server thread pool size × 0.9
- maxqueue = expected burst requests × burst duration in seconds
Monitor proactively rather than reactively. Add these metrics to your alerting stack before the limits become an incident:
# Prometheus alerting rules (HAProxy exporter)
- alert: HaproxyFrontendSaturation
expr: haproxy_frontend_current_sessions / haproxy_frontend_limit_sessions > 0.80
for: 2m
annotations:
summary: "HAProxy frontend {{ $labels.proxy }} at {{ $value | humanizePercentage }} capacity"
- alert: HaproxyBackendQueueBuildup
expr: haproxy_backend_current_queue > 10
for: 1m
annotations:
summary: "HAProxy backend {{ $labels.proxy }} queue depth {{ $value }}"
- alert: HaproxyServerSaturation
expr: haproxy_server_current_sessions / haproxy_server_limit_sessions > 0.85
for: 2m
annotations:
summary: "HAProxy server {{ $labels.server }} at capacity"
Load-test your limits before they test you. Drive sustained load against a staging environment with real request patterns and watch the stats socket in real time:
wrk -t12 -c1000 -d60s http://192.168.1.10/healthz
# In a second terminal, watch the stats live
watch -n1 "echo 'show info' | socat stdio /var/run/haproxy/admin.sock | grep -E 'CurrConns|MaxconnReached|Tasks|Run_queue|Idle_pct'"
This tells you exactly where saturation begins before it happens in production. Document the results. When traffic doubles six months from now, you'll want to know which limit to raise first without having to rediscover it under pressure.
