
    HAProxy Connection Limits Reached

    HAProxy
    Published: Apr 12, 2026
    Updated: Apr 12, 2026

    Diagnose and fix HAProxy connection limit failures covering global maxconn, per-server limits, file descriptors, thread saturation, and queue overflow with real commands and config examples.


    Symptoms

    When HAProxy runs out of connections, the failure mode depends on which limit you've actually hit, but the experience for end users is consistent: requests start hanging, then fail with 503s. You'll typically see log entries like this:

    [WARNING] 101/143022 (1234) : Proxy web_frontend reached the maxconn limit (2000). Please check the global maxconn and the frontend's maxconn.
    192.168.1.50:54321 [12/Apr/2026:14:32:01.123] web_frontend web_backend/app01 0/30000/-1/-1/30001 503 212 - - sQ-- 47/47/47/47/0 0/0 "GET /api/data HTTP/1.1"

    Or, at the socket level:

    connect() failed for server web_backend/app01: no more sockets

    On the client side, browsers report connection timeouts or reset connections. Your monitoring dashboards show queue depth climbing, and the HAProxy stats page at http://192.168.1.10:8404/stats shows backend servers going orange or red with session counts pinned at their configured maximum. Error rates spike. On-call gets paged.

    The tricky part is that "connection limits reached" isn't a single error — it's a category of failure with six distinct root causes, each requiring a different fix. Raising the wrong number gets you nowhere. This article covers all of them.


    Root Cause 1: Global maxconn Too Low

    Why It Happens

    The global maxconn directive in haproxy.cfg caps the total number of concurrent connections HAProxy will accept across all frontends combined. When this limit is hit, HAProxy stops accepting new TCP connections at the socket level. New clients receive a connection refused or a timeout, depending on how the OS handles the listen backlog.

    This is the most common cause I see on freshly deployed instances. The HAProxy default maxconn is 2000 — fine for a lab, completely inadequate for any real production workload. Someone installs it, forgets to tune it, and everything is fine until the first real traffic spike.

    How to Identify It

    Query the stats socket directly:

    echo "show info" | socat stdio /var/run/haproxy/admin.sock | grep -E "Maxconn|MaxconnReached|CurrConns"

    Expected output when you're hitting the limit:

    Maxconn: 2000
    MaxconnReached: 1847
    CurrConns: 2000

    If CurrConns equals Maxconn and MaxconnReached is non-zero and growing, you've found it. The HAProxy log will also contain the explicit warning shown in the Symptoms section above.

    How to Fix It

    Edit /etc/haproxy/haproxy.cfg and raise the global maxconn in the global section:

    global
        maxconn 50000
        log /dev/log local0
        log /dev/log local1 notice
        stats socket /var/run/haproxy/admin.sock mode 660 level admin expose-fd listeners
        stats timeout 30s
        user haproxy
        group haproxy

    Before settling on 50000, do the math. Each connection consumes roughly 20–50 KB of kernel memory for socket buffers, so at 50,000 connections you're looking at up to 2.5 GB of potential memory pressure. Size it to your actual traffic peak plus 30% headroom. You can approximate peak concurrency from your access logs: count requests per minute, then multiply by the average session duration (Little's law). This bins log lines by minute; adjust the field number to match your log format, since a syslog prefix shifts field positions:

    awk '{print $4}' /var/log/haproxy.log | cut -d: -f1-3 | sort | uniq -c | sort -rn | head -5

    After updating the config, validate and reload:

    haproxy -c -f /etc/haproxy/haproxy.cfg && systemctl reload haproxy
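    The sizing arithmetic above can be sketched as a quick shell calculation. The memory budget, per-connection cost, and measured peak below are illustrative assumptions, not values from any real host:

```shell
# Rough maxconn sizing: memory budget divided by ~50 KB per connection,
# capped against observed peak concurrency plus 30% headroom.
mem_budget_mb=4096      # assumed memory you can spare for socket buffers
per_conn_kb=50          # conservative upper bound per connection
mem_ceiling=$(( mem_budget_mb * 1024 / per_conn_kb ))

peak_concurrency=30000  # assumed measured peak from your logs
with_headroom=$(( peak_concurrency * 13 / 10 ))

# maxconn is the smaller of the memory ceiling and peak-plus-headroom
maxconn=$(( mem_ceiling < with_headroom ? mem_ceiling : with_headroom ))
echo "memory ceiling: $mem_ceiling, peak+30%: $with_headroom, maxconn: $maxconn"
```

    Swap in your own memory budget and measured peak before trusting the result.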

    Root Cause 2: Per-Server maxconn Too Low

    Why It Happens

    Even with a generous global maxconn, each backend server has its own connection limit. When a server's per-server maxconn is reached, HAProxy doesn't reject the connection outright — it queues it. If the queue fills up too, requests fail immediately with 503s. This is a layered failure, and it's easy to miss if you're only looking at the global stats.

    The default per-server maxconn is 0 (unlimited) unless you've explicitly set it. But I've seen inherited configs where someone set a conservative value years ago, the backend scaled up, and now those limits are consistently being hit during peak hours.

    How to Identify It

    Check the current server state:

    echo "show servers state web_backend" | socat stdio /var/run/haproxy/admin.sock

    Output:

    1 web_backend 1 app01 192.168.10.11 2 0 1 1 100 100 0 0 0 0
    1 web_backend 2 app02 192.168.10.12 2 0 1 1 100 100 0 0 0 0

    That output confirms each server's operational state, but it does not include session counts. Pull the CSV stats to compare current (scur) vs. max (smax) sessions per server:

    echo "show stat" | socat stdio /var/run/haproxy/admin.sock | awk -F',' 'NR==1{for(i=1;i<=NF;i++) if($i=="scur"||$i=="smax"||$i=="pxname"||$i=="svname") col[i]=$i} NR>1{out=""; for(i in col) out=out col[i]"="$i" "; print out}'

    If a specific server shows its scur (current sessions) permanently pegged at the same value during load, that server is the bottleneck, not the global limit.

    How to Fix It

    Raise the maxconn for each server in the backend definition:

    backend web_backend
        balance leastconn
        option httpchk GET /healthz
        server app01 192.168.10.11:8080 check maxconn 500 weight 1
        server app02 192.168.10.12:8080 check maxconn 500 weight 1
        server app03 192.168.10.13:8080 check maxconn 500 weight 1

    The right value depends on what your application server can actually handle. Raising HAProxy's per-server maxconn beyond what the app can tolerate just pushes the bottleneck downstream. If you're running Gunicorn with 8 workers and 4 threads each, the effective max is 32 concurrent requests — no amount of HAProxy tuning changes that. Check your app server's thread pool configuration and match your HAProxy per-server maxconn to it, or slightly below to leave room for healthcheck overhead.
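    The worker/thread arithmetic is simple enough to script. The Gunicorn figures below are the hypothetical example from this section, not defaults from any real deployment:

```shell
# Match HAProxy's per-server maxconn to the app server's real concurrency.
workers=8                 # assumed Gunicorn worker processes
threads_per_worker=4      # assumed threads per worker
app_capacity=$(( workers * threads_per_worker ))

# Leave ~10% headroom for healthcheck and admin traffic
server_maxconn=$(( app_capacity * 9 / 10 ))
echo "app capacity: $app_capacity, suggested per-server maxconn: $server_maxconn"
```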


    Root Cause 3: File Descriptor Limit

    Why It Happens

    Every TCP connection requires a file descriptor. HAProxy opens two FDs per proxied connection — one for the client side, one for the backend side — plus additional FDs for log sockets, the stats socket, TLS session storage, and config files. If the OS or process-level FD limit is lower than what HAProxy's maxconn requires, HAProxy will either fail to start correctly or begin refusing connections before it ever reaches its configured maxconn ceiling.

    The rough formula is: required FDs = (maxconn × 2) + 500. This catches people every time they raise maxconn without raising the corresponding FD limits. HAProxy will actually warn you at startup if this happens, but those warnings scroll by fast and are easy to miss.
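    The FD formula is easy to sanity-check in shell. The commented comparison against the live process is a sketch using the same /proc path shown later in this section:

```shell
# required FDs = (maxconn * 2) + 500: two FDs per proxied connection
# (client side + server side) plus overhead for logs, stats, listeners.
maxconn=50000
required_fds=$(( maxconn * 2 + 500 ))
echo "required ulimit-n: $required_fds"

# Compare against the running process (uncomment on a host with HAProxy):
# current=$(awk '/open files/ {print $4}' /proc/$(pidof haproxy | awk '{print $1}')/limits)
# [ "$current" -ge "$required_fds" ] || echo "FD limit too low: $current"
```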

    How to Identify It

    Check HAProxy's actual process limits:

    cat /proc/$(pidof haproxy | awk '{print $1}')/limits | grep "open files"

    Output showing a problem:

    Max open files            4096                 4096                 files

    A limit of 4096 with a maxconn of 50000 means HAProxy will exhaust FDs at around 2000 active connections and start failing. Also check the system-wide limit:

    ulimit -n
    sysctl fs.file-max

    And check HAProxy's startup journal for the explicit warning:

    journalctl -u haproxy --since "2 hours ago" | grep -iE "ulimit|descriptor|fd limit|too low"

    HAProxy startup output when FD limit is insufficient:

    [WARNING] 101/143022 (1234) : FD limit (4096) too low for maxconn=50000. Please raise 'ulimit-n' to at least 100500.

    How to Fix It

    Set ulimit-n in the global section of haproxy.cfg:

    global
        maxconn 50000
        ulimit-n 100500

    Then lock in the limit at the systemd level so it survives restarts:

    mkdir -p /etc/systemd/system/haproxy.service.d/
    cat > /etc/systemd/system/haproxy.service.d/limits.conf <<'EOF'
    [Service]
    LimitNOFILE=100500
    EOF

    Add the system-wide PAM limits in /etc/security/limits.conf:

    haproxy    soft    nofile    100500
    haproxy    hard    nofile    100500
    root       soft    nofile    100500
    root       hard    nofile    100500

    And tune the kernel in /etc/sysctl.conf:

    fs.file-max = 500000
    net.ipv4.ip_local_port_range = 1024 65535
    net.core.somaxconn = 65535
    net.ipv4.tcp_tw_reuse = 1

    Apply everything and verify:

    sysctl -p
    systemctl daemon-reload
    systemctl reload haproxy
    cat /proc/$(pidof haproxy | awk '{print $1}')/limits | grep "open files"

    Expected output after fix:

    Max open files            100500               100500               files

    Root Cause 4: Thread Saturation

    Why It Happens

    HAProxy processes connections across worker threads. In older single-process configurations, everything ran on a single thread and you could hit CPU saturation long before numerical connection limits. With modern HAProxy 2.x, the nbthread directive distributes connection handling across multiple threads tied to CPU cores.

    Thread saturation happens when all threads are blocked — waiting on slow TLS handshakes, waiting on backend I/O, or stuck in kernel socket operations. The effect looks identical to a maxconn limit from the outside: new connections queue up and time out. But the fix is entirely different. Raising maxconn won't help at all if your threads are the bottleneck.

    In my experience, this manifests most often on systems that were migrated from old single-core VM configurations where nbthread was never set, or was left at 1 or 2 on an 8-core host.

    How to Identify It

    Check the configured thread count and current task load:

    echo "show info" | socat stdio /var/run/haproxy/admin.sock | grep -E "^Nbthread|^Tasks|^Run_queue|^Idle_pct"

    Output showing saturation:

    Nbthread: 2
    Tasks: 51234
    Run_queue: 8472
    Idle_pct: 3

    An Idle_pct near zero with a large Run_queue relative to Nbthread is the clearest signal. Cross-reference with CPU utilization:

    top -p $(pidof haproxy | tr ' ' ',')

    If you see two cores pegged at 100% while six others sit idle, you're thread-saturated. On a loaded system, also check the per-thread CPU breakdown:

    echo "show threads" | socat stdio /var/run/haproxy/admin.sock

    How to Fix It

    Set nbthread to match your core count. Leave one core for the OS on busy systems:

    global
        maxconn 50000
        nbthread 7
        ulimit-n 100500
        cpu-map auto:1/1-7 0-6

    The cpu-map directive pins each HAProxy thread to a specific CPU core. This reduces cache misses and prevents HAProxy threads from fighting the kernel scheduler for CPU time. The format auto:1/1-7 0-6 maps threads 1 through 7 of process 1 to CPU cores 0 through 6.
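    The cores-minus-one rule can be derived on the host itself. A minimal sketch, assuming nproc is available (the cpu-map line in the comment mirrors the config above):

```shell
# nbthread = core count minus one, reserving a core for the OS.
cores=$(nproc)
if [ "$cores" -gt 1 ]; then
    nbthread=$(( cores - 1 ))
else
    nbthread=1
fi
echo "suggested nbthread: $nbthread"
# Matching pin: cpu-map auto:1/1-$nbthread 0-$(( nbthread - 1 ))
```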

    Reload and verify:

    haproxy -c -f /etc/haproxy/haproxy.cfg && systemctl reload haproxy
    echo "show info" | socat stdio /var/run/haproxy/admin.sock | grep -E "^Nbthread|^Idle_pct"

    Expected output after fix:

    Nbthread: 7
    Idle_pct: 42

    Worth noting: if your backend responses are genuinely slow (consistently over 500ms average), adding threads provides some relief but doesn't fully solve the problem. You also need to address backend latency or add more backend servers. Thread saturation amplifies slow-backend problems rather than causing them independently.


    Root Cause 5: Queue Overflow

    Why It Happens

    When a backend server hits its per-server maxconn, HAProxy doesn't immediately send a 503. Instead, it places the request into a queue and waits for a connection slot to free up. That queue has its own limit — configured by the maxqueue directive, or defaulting to 0 (unlimited in older HAProxy versions, or calculated from maxconn in newer ones). When the queue fills, HAProxy sends a 503 immediately with no further waiting.

    Queue overflow is almost always a secondary failure. It's a symptom of per-server maxconn being too low, backends being too slow, or a sudden traffic spike that outpaces your connection pool. But the log signatures it produces are distinct, and understanding them helps you diagnose the primary cause faster.

    How to Identify It

    Look for the queue-related status flags in HAProxy logs. The sQ termination state in the log line means the session ended while waiting in the server queue:

    192.168.1.50:54321 [12/Apr/2026:14:32:01.123] web_frontend web_backend/app01 0/30000/-1/-1/30001 503 212 - - sQ-- 47/47/47/47/0 0/0 "GET /api/data HTTP/1.1"

    The -1 for the backend connect time confirms HAProxy never actually connected to the backend — the request died while queued. Pull current queue depths directly:

    echo "show stat" | socat stdio /var/run/haproxy/admin.sock | awk -F',' 'NR==1{for(i=1;i<=NF;i++){if($i=="pxname"||$i=="svname"||$i=="qcur"||$i=="qmax") col[i]=$i}} NR>1{line=""; for(i in col) line=line col[i]"="$i" "; print line}'

    qcur is the current queue depth; qmax is the historical maximum observed. If qcur is regularly non-zero on a server, that server is the bottleneck and connections are queuing up waiting for it.

    How to Fix It

    Address the underlying cause first — usually per-server maxconn being too low relative to what the app can handle, or slow backend response times causing connections to pile up. Then tune queue behavior explicitly so failures are fast rather than slow:

    backend web_backend
        balance leastconn
        option httpchk GET /healthz HTTP/1.1\r\nHost:\ solvethenetwork.com
        timeout queue 10s
        timeout connect 3s
        timeout server 30s
        default-server maxconn 300 maxqueue 100 check inter 5s fall 3 rise 2
        server app01 192.168.10.11:8080 weight 1
        server app02 192.168.10.12:8080 weight 1
        server app03 192.168.10.13:8080 weight 1

    Setting timeout queue to 10 seconds means requests don't queue indefinitely — they fail fast with a 503 rather than hanging until the client times out. A client that gets a 503 in 10 seconds can retry or show a meaningful error. A client waiting 90 seconds for a queue timeout is burning a socket on both ends and delivering a worse user experience anyway.

    Set maxqueue proportional to your expected burst duration times your request rate. If you process 100 req/s and legitimate traffic bursts last 2 seconds before backends catch up, a queue of 200 is reasonable. A queue of 10,000 just means 10,000 clients wait longer before getting their 503, and your backend has no chance of draining it before the next wave arrives.
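    That burst math, plus a drain-rate sanity check, sketched in shell. All inputs are the illustrative numbers from this section (100 req/s, 2-second bursts, 3 servers, 200 ms average response time):

```shell
# maxqueue ~= request rate * expected burst duration
req_per_sec=100
burst_seconds=2
maxqueue=$(( req_per_sec * burst_seconds ))

# Sanity check: can the backends drain the queue before the next burst?
# drain rate ~= servers * per-server maxconn / avg response time
servers=3
per_server_maxconn=300
avg_rtime_ms=200
drain_per_sec=$(( servers * per_server_maxconn * 1000 / avg_rtime_ms ))
echo "suggested maxqueue: $maxqueue (drain capacity: ${drain_per_sec}/s)"
```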


    Root Cause 6: Slow Backend Responses Causing Connection Buildup

    Why It Happens

    This one is situational, but I've seen it contribute to connection limit exhaustion more often than any pure misconfiguration. If your backend servers respond slowly — database queries taking 5–10 seconds, GC pauses, upstream API latency — connections pile up because HAProxy holds each connection open until the backend responds or the timeout fires. At steady-state traffic levels, slow backends mean connections accumulate faster than they're released, and you'll hit your limits even if they're configured correctly for normal conditions.

    How to Identify It

    Check average backend response times from the stats output:

    echo "show stat" | socat stdio /var/run/haproxy/admin.sock | awk -F',' 'NR==1{for(i=1;i<=NF;i++){if($i=="pxname"||$i=="svname"||$i=="rtime"||$i=="ttime") col[i]=$i}} NR>1{line=""; for(i in col) line=line col[i]"="$i" "; print line}'

    rtime is the average backend response time in milliseconds. If it's climbing above 1000ms consistently, your backends are struggling and connections will pile up under load. Also look at active session count relative to your traffic rate:

    echo "show info" | socat stdio /var/run/haproxy/admin.sock | grep -E "^CurrConns|^MaxconnReached|^Uptime"

    If concurrent connections are high but your request rate is moderate, long-lived connections are the issue. Check your HAProxy log format for response times: in the default httplog format the timing field holds five slash-separated values, and the last one (Tt) is total session time in milliseconds. Sort by it to find outliers (the timing field is $5 in the sample log line above; adjust the field number if your logs carry a syslog prefix):

    awk '{n=split($5, t, "/"); print t[n], $0}' /var/log/haproxy.log | sort -rn | head -20

    How to Fix It

    Set aggressive but realistic server timeouts. Don't leave them at 0 or a multi-minute value in production:

    defaults
        timeout connect 3s
        timeout client 30s
        timeout server 30s
        timeout queue 10s

    I've encountered configurations where someone set timeout server 0 to "avoid timeouts" and the result was thousands of half-open connections holding the load balancer hostage. Timeouts exist to evict dead connections, not to punish your users. Set them to the 95th-percentile response time of your slowest legitimate request plus a reasonable buffer — not infinity.
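    A rough way to estimate that 95th percentile from existing logs. Same field-position assumption as above: Tt is the last slash-separated value of the timing field, which is $5 when logs have no syslog prefix — adjust to your format:

```shell
# Approximate p95 of total session time (Tt, in ms) from access logs.
awk '{n=split($5, t, "/"); if (t[n] ~ /^[0-9]+$/) print t[n]}' /var/log/haproxy.log \
  | sort -n \
  | awk '{v[NR]=$1} END {i=int(NR*0.95); if (i<1) i=1; print "p95(ms):", v[i]}'
```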

    If slow responses are from a specific backend endpoint rather than all traffic, use ACLs to apply tighter timeouts selectively:

    backend slow_api_backend
        timeout server 60s
        server api01 192.168.10.21:9090 check maxconn 50

    Prevention

    The best time to tune connection limits is before a production incident, not during one. Here's the full reference configuration for sw-infrarunbook-01 running HAProxy 2.8, incorporating all the fixes discussed above:

    global
        maxconn 50000
        nbthread 7
        ulimit-n 100500
        cpu-map auto:1/1-7 0-6
        log /dev/log local0
        log /dev/log local1 notice
        stats socket /var/run/haproxy/admin.sock mode 660 level admin expose-fd listeners
        stats timeout 30s
        user haproxy
        group haproxy
    
    defaults
        log global
        mode http
        option httplog
        option dontlognull
        timeout connect 3s
        timeout client 30s
        timeout server 30s
        timeout queue 10s
        errorfile 503 /etc/haproxy/errors/503.http
    
    frontend web_frontend
        bind 192.168.1.10:80
        bind 192.168.1.10:443 ssl crt /etc/haproxy/certs/solvethenetwork.com.pem
        maxconn 40000
        default_backend web_backend
    
    backend web_backend
        balance leastconn
        option httpchk GET /healthz HTTP/1.1\r\nHost:\ solvethenetwork.com
        default-server maxconn 300 maxqueue 100 check inter 5s fall 3 rise 2
        server app01 192.168.10.11:8080 weight 1
        server app02 192.168.10.12:8080 weight 1
        server app03 192.168.10.13:8080 weight 1
    
    listen stats
        bind 192.168.1.10:8404
        stats enable
        stats uri /stats
        stats refresh 10s
        stats auth infrarunbook-admin:changeme

    Build connection capacity estimation into your deployment process. Use this rough formula when sizing a new instance:

    • maxconn = available_memory_MB × 20 (conservative) — each connection uses roughly 50 KB
    • ulimit-n = (maxconn × 2) + 500
    • per-server maxconn = app server thread pool size × 0.9
    • maxqueue = expected burst requests × burst duration in seconds
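    Those four formulas can live in one helper script as part of the deployment process. A minimal sketch; the inputs are the hypothetical example values used throughout this article:

```shell
#!/bin/sh
# Derive the connection-limit stack from four inputs, using the
# rules of thumb above. Input values are illustrative assumptions.
mem_mb=4096            # memory available to HAProxy
pool_size=32           # app server thread pool size
req_per_sec=100        # steady request rate
burst_seconds=2        # expected burst duration

maxconn=$(( mem_mb * 20 ))
ulimit_n=$(( maxconn * 2 + 500 ))
server_maxconn=$(( pool_size * 9 / 10 ))
maxqueue=$(( req_per_sec * burst_seconds ))

printf 'maxconn %s\nulimit-n %s\nper-server maxconn %s\nmaxqueue %s\n' \
    "$maxconn" "$ulimit_n" "$server_maxconn" "$maxqueue"
```

    Run it at provisioning time and paste the output straight into the config review.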

    Monitor proactively rather than reactively. Add these metrics to your alerting stack before the limits become an incident:

    # Prometheus alerting rules (HAProxy exporter)
    - alert: HaproxyFrontendSaturation
      expr: haproxy_frontend_current_sessions / haproxy_frontend_limit_sessions > 0.80
      for: 2m
      annotations:
        summary: "HAProxy frontend {{ $labels.proxy }} at {{ $value | humanizePercentage }} capacity"
    
    - alert: HaproxyBackendQueueBuildup
      expr: haproxy_backend_current_queue > 10
      for: 1m
      annotations:
        summary: "HAProxy backend {{ $labels.proxy }} queue depth {{ $value }}"
    
    - alert: HaproxyServerSaturation
      expr: haproxy_server_current_sessions / haproxy_server_limit_sessions > 0.85
      for: 2m
      annotations:
        summary: "HAProxy server {{ $labels.server }} at capacity"

    Load-test your limits before they test you. Drive sustained load against a staging environment with real request patterns and watch the stats socket in real time:

    wrk -t12 -c1000 -d60s http://192.168.1.10/healthz
    
    # In a second terminal, watch the stats live
    watch -n1 "echo 'show info' | socat stdio /var/run/haproxy/admin.sock | grep -E 'CurrConns|MaxconnReached|Tasks|Run_queue|Idle_pct'"

    This tells you exactly where saturation begins before it happens in production. Document the results. When traffic doubles six months from now, you'll want to know which limit to raise first without having to rediscover it under pressure.

    Frequently Asked Questions

    What is the default maxconn in HAProxy and is it enough for production?

    The default global maxconn in HAProxy is 2000, which is sufficient only for very low-traffic environments or development. Most production deployments should set this to at least 10,000 and size it based on available memory using the formula: maxconn = available_memory_MB × 20.

    How do I check if HAProxy is currently hitting its connection limits?

    Run: echo "show info" | socat stdio /var/run/haproxy/admin.sock | grep -E 'Maxconn|MaxconnReached|CurrConns|Idle_pct' — if CurrConns equals Maxconn or MaxconnReached is non-zero, you are hitting the global limit. Check the stats page at your configured bind address for a visual overview.

    Why do I get 503 errors even though my global maxconn isn't reached?

    503 errors can occur when the per-server maxconn on a specific backend server is hit, when the connection queue overflows (maxqueue), when the file descriptor limit is exhausted at the OS level, or when backend servers are responding too slowly and connections pile up. Check per-server session counts and queue depths in addition to the global limit.

    How do I calculate the right ulimit-n value for HAProxy?

    Use the formula: ulimit-n = (maxconn × 2) + 500. For example, if your global maxconn is 50000, set ulimit-n to 100500. This accounts for two file descriptors per proxied connection plus overhead for sockets, log handles, and the stats socket.

    Does raising nbthread in HAProxy help with connection limits?

    Raising nbthread doesn't increase the connection limit itself, but it prevents thread saturation from masking the available capacity. If all worker threads are blocked on slow I/O or TLS handshakes, new connections will queue up even if maxconn is not reached. Use nbthread equal to your CPU core count minus one to fully utilize available processing capacity.

    What does the sQ termination state mean in HAProxy logs?

    The sQ-- termination state in HAProxy access logs means the session was killed while waiting in the server queue — HAProxy never actually established a connection to the backend. This indicates backend servers have hit their per-server maxconn limit and the connection queue filled up or the timeout queue timer expired.
