InfraRunBook

    HAProxy Backend Health Checks: TCP, HTTP, Custom Checks, Inter, Fall/Rise Tuning, and Agent Checks

    HAProxy
    Published: Feb 17, 2026
    Updated: Feb 17, 2026

    Master HAProxy backend health checks with production-ready configurations for TCP checks, HTTP health endpoints, custom check scripts, agent checks, and precise inter/fall/rise tuning to keep your infrastructure resilient.


    Introduction

    Backend health checks are the heartbeat of any HAProxy deployment. Without proper health checking, HAProxy blindly sends traffic to servers that may be down, degraded, or in a maintenance window — causing 502 errors, timeouts, and user-facing outages. This guide covers every health check mechanism HAProxy offers, from simple TCP connect probes to sophisticated external agent checks, with real production configurations you can deploy today.

    We will walk through TCP checks, Layer 7 HTTP checks, custom check scripts, external agent checks, advanced http-check directives introduced in HAProxy 2.2+, and the critical timing parameters (inter, fall, rise, timeout check) that control how fast HAProxy detects failures and recovers servers.


    1. Understanding HAProxy Health Check Architecture

    HAProxy health checks run in a dedicated subsystem, separate from data-path connections. Each backend server gets its own check task that operates on an independent timer. Key concepts:

    • Check interval (inter) — how often HAProxy probes the server
    • Fall threshold — how many consecutive failed checks before marking server DOWN
    • Rise threshold — how many consecutive passed checks before marking server UP
    • Check timeout — how long to wait for a health check response before declaring failure
    • Check address/port — optionally probe a different IP or port than the data path

    The default behavior (no option directive) is a Layer 4 TCP connect check — HAProxy opens a TCP socket and considers the server healthy if the three-way handshake completes. This is fast but tells you nothing about application health.
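    For intuition, a Layer 4 connect probe boils down to a few lines of Python (an illustrative sketch of the semantics, not HAProxy's actual implementation):

```python
import socket

def tcp_check(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP three-way handshake completes within `timeout`.

    This mirrors HAProxy's default Layer 4 check: connect, then close.
    Success says nothing about application health, only that the port accepts connections.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

    For example, tcp_check("10.10.50.11", 3306) corresponds to the check on the first server line above.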


    2. Layer 4 TCP Health Checks

    TCP checks are the simplest and lowest-overhead option. HAProxy performs a TCP connect to the server's IP and port, then immediately closes the connection.

    2.1 Basic TCP Check

    backend bk_infrarunbook_app
        mode tcp
        balance roundrobin
        option tcp-check
        server srv-infrarunbook-01 10.10.50.11:3306 check inter 5s fall 3 rise 2
        server srv-infrarunbook-02 10.10.50.12:3306 check inter 5s fall 3 rise 2

    When you add the check keyword to a server line, HAProxy enables health checking. The option tcp-check line explicitly selects Layer 4 mode (this is also the default when no other check option is specified).

    2.2 TCP Check with Send/Expect Sequences

    For protocols that require a handshake (MySQL, Redis, SMTP), you can script the conversation:

    backend bk_infrarunbook_redis
        mode tcp
        balance roundrobin
        option tcp-check
        tcp-check connect
        tcp-check send "PING\r\n"
        tcp-check expect string +PONG
        server redis-infrarunbook-01 10.10.50.21:6379 check inter 3s fall 3 rise 2
        server redis-infrarunbook-02 10.10.50.22:6379 check inter 3s fall 3 rise 2

    This sends the Redis PING command and expects +PONG in response. If the server responds with an error or the connection is refused, the check fails.
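    The same send/expect conversation can be modeled in Python to test a Redis node by hand (a simplified stand-in for the tcp-check rules; the PING and +PONG strings come from the config above):

```python
import socket

def redis_check(host: str, port: int, timeout: float = 2.0) -> bool:
    """Connect, send PING, and expect a reply containing +PONG."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            s.sendall(b"PING\r\n")          # tcp-check send "PING\r\n"
            reply = s.recv(64)
            return b"+PONG" in reply        # tcp-check expect string +PONG
    except OSError:
        return False
```

    Running redis_check("10.10.50.21", 6379) by hand is a quick way to verify what the tcp-check sequence will see.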

    2.3 MySQL TCP Health Check

    backend bk_infrarunbook_mysql
        mode tcp
        balance leastconn
        option tcp-check
        tcp-check connect
        tcp-check expect binary 0a   # MySQL greeting starts with protocol version 10 (0x0a)
        server db-infrarunbook-01 10.10.50.31:3306 check inter 5s fall 3 rise 2
        server db-infrarunbook-02 10.10.50.32:3306 check inter 5s fall 3 rise 2

    2.4 SMTP TCP Health Check

    backend bk_infrarunbook_smtp
        mode tcp
        option tcp-check
        tcp-check connect
        tcp-check expect rstring ^220
        tcp-check send "EHLO solvethenetwork.com\r\n"
        tcp-check expect rstring ^250
        tcp-check send "QUIT\r\n"
        tcp-check expect rstring ^221
        server smtp-infrarunbook-01 10.10.50.41:25 check inter 10s fall 3 rise 2

    3. Layer 7 HTTP Health Checks

    HTTP health checks are the most common choice for web application backends. HAProxy sends a real HTTP request and inspects the response status code, headers, or body.

    3.1 Basic HTTP Check (Legacy Syntax)

    backend bk_infrarunbook_web
        mode http
        balance roundrobin
        option httpchk GET /healthz HTTP/1.1\r\nHost:\ app.solvethenetwork.com
        http-check expect status 200
        server web-infrarunbook-01 10.10.50.51:8080 check inter 3s fall 3 rise 2
        server web-infrarunbook-02 10.10.50.52:8080 check inter 3s fall 3 rise 2
        server web-infrarunbook-03 10.10.50.53:8080 check inter 3s fall 3 rise 2

    The option httpchk directive defines the HTTP method, path, and optional headers. The http-check expect line tells HAProxy what a healthy response looks like.

    3.2 Modern HTTP Check Syntax (HAProxy 2.2+)

    HAProxy 2.2 introduced a structured http-check syntax that is cleaner and more powerful:

    backend bk_infrarunbook_api
        mode http
        balance leastconn
    
        option httpchk
        http-check connect
        http-check send meth GET uri /api/v1/health ver HTTP/1.1 hdr Host app.solvethenetwork.com hdr User-Agent "HAProxy-HealthCheck/2.8"
        http-check expect status 200
    
        server api-infrarunbook-01 10.10.50.61:8443 check inter 3s fall 3 rise 2 ssl verify none
        server api-infrarunbook-02 10.10.50.62:8443 check inter 3s fall 3 rise 2 ssl verify none
        server api-infrarunbook-03 10.10.50.63:8443 check inter 3s fall 3 rise 2 ssl verify none

    3.3 Multiple Expect Conditions

    You can chain multiple expectations. HAProxy 2.2+ supports complex matching:

    backend bk_infrarunbook_portal
        mode http
        option httpchk
        http-check send meth GET uri /healthz ver HTTP/1.1 hdr Host portal.solvethenetwork.com
        http-check expect status 200
        http-check expect hdr name "Content-Type" value "application/json"
        http-check expect string "status":"healthy"
    
        server portal-infrarunbook-01 10.10.50.71:8080 check inter 5s fall 3 rise 2
        server portal-infrarunbook-02 10.10.50.72:8080 check inter 5s fall 3 rise 2

    3.4 Expect Patterns Reference

    • http-check expect status 200 — exact status code match
    • http-check expect status 200-299 — status code range
    • http-check expect ! status 503 — negation (anything except 503)
    • http-check expect string OK — body contains exact string
    • http-check expect rstring ^{"status":"(ok|healthy)"} — body matches regex
    • http-check expect hdr name "X-Health" value "up" — response header match

    3.5 Checking a Different Port

    Some applications expose a health endpoint on a management port separate from the data port:

    backend bk_infrarunbook_grpc
        mode tcp
        balance roundrobin
    
        option httpchk
        http-check send meth GET uri /healthz ver HTTP/1.1 hdr Host grpc.solvethenetwork.com
        http-check expect status 200
    
        server grpc-infrarunbook-01 10.10.50.81:50051 check port 8081 inter 5s fall 3 rise 2
        server grpc-infrarunbook-02 10.10.50.82:50051 check port 8081 inter 5s fall 3 rise 2

    The check port 8081 directive tells HAProxy to run the HTTP health check against port 8081 while forwarding real traffic to port 50051.

    3.6 Checking a Different IP Address

        server app-infrarunbook-01 10.10.50.91:8080 check addr 10.10.50.91 port 8081 inter 3s fall 3 rise 2

    The addr keyword lets you probe a different IP (useful when health endpoints listen on a management interface).


    4. Inter, Fall, Rise, and Timeout Tuning

    Getting these parameters right is the difference between sub-second failover and minutes of downtime.

    4.1 Parameter Definitions

    • inter <time> — interval between checks when server is in its current state (default: 2s)
    • fastinter <time> — interval between checks during a state transition (optional, defaults to inter)
    • downinter <time> — interval between checks when server is DOWN (optional, defaults to inter)
    • fall <count> — consecutive failed checks to mark server DOWN (default: 3)
    • rise <count> — consecutive passed checks to mark server UP (default: 2)
    • timeout check <time> — maximum time to wait for a check response (uses timeout server if not set)

    4.2 Calculating Failover Time

    The worst-case time to detect a failure is:

    detection_time = inter × fall
    
    # Example: inter 3s, fall 3
    detection_time = 3s × 3 = 9 seconds (worst case)

    The average detection time is roughly:

    avg_detection_time = inter × (fall - 0.5)
    
    # Example: inter 3s, fall 3
    avg_detection_time = 3s × 2.5 = 7.5 seconds
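    When planning inter and fall values, a small helper (illustrative) makes the trade-off explicit:

```python
def detection_times(inter_s: float, fall: int) -> tuple:
    """Return (worst_case, average) failure-detection time in seconds.

    Worst case: the server dies immediately after passing a check, so a
    full inter * fall window elapses before the fall threshold is hit.
    On average the failure lands mid-interval, hence the (fall - 0.5) factor.
    """
    return inter_s * fall, inter_s * (fall - 0.5)
```

    detection_times(3, 3) returns (9.0, 7.5), matching the worked example above.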

    4.3 Aggressive Health Checking for Low-Latency Failover

    backend bk_infrarunbook_critical
        mode http
        balance roundrobin
        option httpchk
        http-check send meth GET uri /healthz ver HTTP/1.1 hdr Host critical.solvethenetwork.com
        http-check expect status 200
        timeout check 2s
    
        default-server inter 1s fastinter 500ms downinter 5s fall 2 rise 3
    
        server crit-infrarunbook-01 10.10.50.101:8080 check
        server crit-infrarunbook-02 10.10.50.102:8080 check
        server crit-infrarunbook-03 10.10.50.103:8080 check

    This configuration yields:

    • Worst-case detection: 1s × 2 = 2 seconds
    • During transitions, checks run every 500ms for faster resolution
    • Once a server is confirmed DOWN, checks slow to every 5s to reduce load
    • A server must pass 3 consecutive checks (3 × 500ms = 1.5s minimum) before returning to UP

    4.4 Conservative Health Checking for Stable Backends

    backend bk_infrarunbook_static
        mode http
        balance roundrobin
        option httpchk
        http-check send meth GET uri /ping ver HTTP/1.1 hdr Host static.solvethenetwork.com
        http-check expect status 200
        timeout check 5s
    
        default-server inter 10s fall 5 rise 3
    
        server static-infrarunbook-01 10.10.50.111:80 check
        server static-infrarunbook-02 10.10.50.112:80 check

    Slower checks, higher fall threshold — less probe traffic and more tolerance for transient glitches.

    4.5 Tuning Guidelines Table

    ┌─────────────────────┬────────────┬──────────┬───────────┬───────────────────────┐
    │ Use Case            │ inter      │ fall     │ rise      │ Worst-Case Detection  │
    ├─────────────────────┼────────────┼──────────┼───────────┼───────────────────────┤
    │ Real-time API       │ 1s         │ 2        │ 3         │ 2 seconds             │
    │ Standard web app    │ 3s         │ 3        │ 2         │ 9 seconds             │
    │ Background workers  │ 10s        │ 5        │ 3         │ 50 seconds            │
    │ Database proxy      │ 5s         │ 3        │ 2         │ 15 seconds            │
    │ CDN / static files  │ 30s        │ 3        │ 2         │ 90 seconds            │
    └─────────────────────┴────────────┴──────────┴───────────┴───────────────────────┘

    5. External Check (Script-Based)

    For complex health validation that can't be expressed in TCP or HTTP patterns, HAProxy can execute an external script. The script's exit code determines the check result (0 = success, non-zero = failure).

    5.1 Enable External Checks in Global Section

    global
        log /dev/log local0
        external-check
        insecure-fork-wanted

    Security Note: External checks fork a new process for every check. This has performance implications and requires insecure-fork-wanted in HAProxy 2.x. Use external checks only when TCP/HTTP checks are insufficient.

    5.2 External Check Script

    Create the check script at /etc/haproxy/checks/check_infrarunbook_app.sh:

    #!/bin/bash
    # External health check for infrarunbook application
    # Arguments passed by HAProxy's external-check:
    #   $1 = proxy address    $2 = proxy port
    #   $3 = server address   $4 = server port
    # (HAProxy also exports HAPROXY_SERVER_ADDR and HAPROXY_SERVER_PORT)
    
    SERVER_ADDR="$3"
    SERVER_PORT="$4"
    
    # Check application health endpoint
    RESPONSE=$(curl -s -o /dev/null -w "%{http_code}" \
      --connect-timeout 2 \
      --max-time 5 \
      "http://${SERVER_ADDR}:${SERVER_PORT}/healthz")
    
    if [ "$RESPONSE" = "200" ]; then
        # Check that the app reports healthy database connectivity
        DB_STATUS=$(curl -s --connect-timeout 2 --max-time 5 \
          "http://${SERVER_ADDR}:${SERVER_PORT}/healthz" | \
          python3 -c "import sys,json; print(json.load(sys.stdin).get('database','unknown'))" 2>/dev/null)
        if [ "$DB_STATUS" = "connected" ]; then
            exit 0
        fi
    fi
    
    exit 1

    Make the script executable and owned by the haproxy user:

    chmod 755 /etc/haproxy/checks/check_infrarunbook_app.sh
    chown haproxy:haproxy /etc/haproxy/checks/check_infrarunbook_app.sh
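    If you prefer Python for the check logic, the same validation can be sketched as follows (a hypothetical alternative script, not the one HAProxy requires; the /healthz JSON shape is assumed from the bash version above, and HAProxy only looks at the exit code):

```python
#!/usr/bin/env python3
"""External check: exit 0 only if /healthz returns 200 and reports the database connected."""
import json
import sys
import urllib.request

def check(addr, port, timeout=5.0):
    """Return 0 (healthy) or 1 (unhealthy), suitable for sys.exit()."""
    try:
        with urllib.request.urlopen(f"http://{addr}:{port}/healthz", timeout=timeout) as resp:
            if resp.status != 200:
                return 1
            body = json.load(resp)
    except Exception:
        return 1
    return 0 if body.get("database") == "connected" else 1

if __name__ == "__main__" and len(sys.argv) >= 5:
    # external-check passes: <proxy_address> <proxy_port> <server_address> <server_port>
    sys.exit(check(sys.argv[3], sys.argv[4]))
```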

    5.3 Backend Configuration

    backend bk_infrarunbook_complex
        mode http
        balance roundrobin
        option external-check
        external-check path "/usr/bin:/usr/local/bin:/bin"
        external-check command /etc/haproxy/checks/check_infrarunbook_app.sh
    
        server complex-infrarunbook-01 10.10.50.121:8080 check inter 10s fall 3 rise 2
        server complex-infrarunbook-02 10.10.50.122:8080 check inter 10s fall 3 rise 2

    6. Agent Health Checks

    Agent checks are a powerful mechanism where HAProxy connects to a small agent daemon running on each backend server. The agent responds with a string that can dynamically control the server's state, weight, and administrative status — enabling graceful draining and load-aware balancing.

    6.1 Agent Protocol

    The agent is a TCP service that returns one or more keywords separated by commas:

    • up — mark server operationally UP
    • down — mark server operationally DOWN
    • drain — stop sending new connections (existing ones continue)
    • maint — put server in maintenance mode
    • ready — remove administrative drain/maint status
    • 50% — set server weight to 50% of its configured value
    • maxconn:100 — set dynamic maxconn limit
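    As an illustration of the wire format, a reply line can be parsed like this (the function and its dict layout are my own, covering only the keywords listed above):

```python
def parse_agent_reply(reply: str) -> dict:
    """Parse a space/comma-separated agent reply into a dict of actions."""
    actions = {}
    for token in reply.strip().replace(",", " ").split():
        if token in ("up", "down", "drain", "maint", "ready"):
            actions["state"] = token
        elif token.endswith("%"):
            # bare percentage: relative weight adjustment
            actions["weight_pct"] = int(token[:-1])
        elif token.startswith("maxconn:"):
            actions["maxconn"] = int(token.split(":", 1)[1])
    return actions
```

    For example, parse_agent_reply("up 50%\n") yields {"state": "up", "weight_pct": 50}.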

    6.2 Simple Agent Script (Python)

    Create /opt/infrarunbook/haproxy-agent.py:

    #!/usr/bin/env python3
    """HAProxy agent check for infrarunbook application.
    
    Listens on port 9707 and returns server status.
    """
    import socket
    import subprocess
    import sys
    
    def get_status():
        """Check local application health and return agent string."""
        try:
            # Check if application is running
            result = subprocess.run(
                ['systemctl', 'is-active', 'infrarunbook-app'],
                capture_output=True, text=True, timeout=3
            )
            if result.stdout.strip() != 'active':
                return 'down\n'
    
            # Check application load (CPU-based weight adjustment)
            with open('/proc/loadavg', 'r') as f:
                load_1m = float(f.read().split()[0])
    
            import multiprocessing
            cpu_count = multiprocessing.cpu_count()
            load_pct = (load_1m / cpu_count) * 100
    
            if load_pct > 95:
                return 'drain\n'
            elif load_pct > 80:
                return 'up 25%\n'
            elif load_pct > 60:
                return 'up 50%\n'
            elif load_pct > 40:
                return 'up 75%\n'
            else:
                return 'up 100%\n'
        except Exception:
            return 'down\n'
    
    def main():
        server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind(('0.0.0.0', 9707))
        server.listen(5)
        print('HAProxy agent listening on port 9707', flush=True)
    
        while True:
            client, addr = server.accept()
            try:
                status = get_status()
                client.sendall(status.encode())
            finally:
                client.close()
    
    if __name__ == '__main__':
        main()

    6.3 Systemd Service for the Agent

    [Unit]
    Description=HAProxy Agent Check for infrarunbook
    After=network.target
    
    [Service]
    ExecStart=/usr/bin/python3 /opt/infrarunbook/haproxy-agent.py
    Restart=always
    RestartSec=5
    User=nobody
    Group=nogroup
    
    [Install]
    WantedBy=multi-user.target

    Save the unit as /etc/systemd/system/haproxy-agent-infrarunbook.service, then reload systemd and start it:

    systemctl daemon-reload
    systemctl enable --now haproxy-agent-infrarunbook.service

    6.4 HAProxy Backend with Agent Checks

    backend bk_infrarunbook_dynamic
        mode http
        balance roundrobin
        option httpchk
        http-check send meth GET uri /healthz ver HTTP/1.1 hdr Host app.solvethenetwork.com
        http-check expect status 200
    
        server dyn-infrarunbook-01 10.10.50.131:8080 check inter 3s fall 3 rise 2 agent-check agent-port 9707 agent-inter 5s
        server dyn-infrarunbook-02 10.10.50.132:8080 check inter 3s fall 3 rise 2 agent-check agent-port 9707 agent-inter 5s
        server dyn-infrarunbook-03 10.10.50.133:8080 check inter 3s fall 3 rise 2 agent-check agent-port 9707 agent-inter 5s

    This combines both a standard HTTP health check and an agent check. The HTTP check gates UP/DOWN status, while the agent dynamically adjusts weight. Both must pass for the server to receive traffic.


    7. Graceful Server Drain with Health Checks

    Before taking a server out for maintenance, you want to drain existing connections gracefully.

    7.1 Using the Runtime API

    # Set server to DRAIN mode (no new connections, existing ones finish)
    echo "set server bk_infrarunbook_web/web-infrarunbook-01 state drain" | \
      socat stdio /var/run/haproxy/admin.sock
    
    # Check current sessions
    echo "show servers state bk_infrarunbook_web" | \
      socat stdio /var/run/haproxy/admin.sock
    
    # Wait for connections to drain, then set to MAINT
    echo "set server bk_infrarunbook_web/web-infrarunbook-01 state maint" | \
      socat stdio /var/run/haproxy/admin.sock
    
    # After maintenance, bring server back
    echo "set server bk_infrarunbook_web/web-infrarunbook-01 state ready" | \
      socat stdio /var/run/haproxy/admin.sock
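    socat can be replaced with a few lines of Python if you are scripting these steps (a minimal helper; the socket path matches the examples above):

```python
import socket

def haproxy_cmd(command: str, sock_path: str = "/var/run/haproxy/admin.sock") -> str:
    """Send one command to the HAProxy runtime API socket and return the reply.

    The runtime API processes a single newline-terminated command per
    connection (unless interactive prompt mode is enabled).
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(sock_path)
        s.sendall(command.encode() + b"\n")
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode()
```

    For example: haproxy_cmd("set server bk_infrarunbook_web/web-infrarunbook-01 state drain").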

    7.2 Using the Agent Check for Zero-Downtime Deploys

    Your deployment script can toggle the agent response. This assumes an agent variant that serves the contents of a status file, rather than computing load as the agent in section 6.2 does:

    #!/bin/bash
    # deploy-infrarunbook.sh — Zero-downtime deploy
    SERVER="dyn-infrarunbook-01"
    BACKEND="bk_infrarunbook_dynamic"
    SOCK="/var/run/haproxy/admin.sock"
    
    # 1. Signal drain via agent file
    echo "drain" > /opt/infrarunbook/agent-status.txt
    sleep 30  # Wait for connections to drain
    
    # 2. Deploy new version
    systemctl stop infrarunbook-app
    rsync -a /opt/infrarunbook/releases/latest/ /opt/infrarunbook/current/
    systemctl start infrarunbook-app
    sleep 5  # Let app warm up
    
    # 3. Signal ready
    echo "up 100%" > /opt/infrarunbook/agent-status.txt

    8. Monitoring Health Check Status

    8.1 Stats Page

    frontend ft_infrarunbook_stats
        bind 10.10.50.1:9000
        mode http
        stats enable
        stats uri /haproxy-stats
        stats realm "InfraRunBook HAProxy Stats"
        stats auth infrarunbook-admin:S3cur3P@ssw0rd!
        stats refresh 5s
        stats show-legends
        stats show-node

    The stats page displays each server's check status, last check duration, number of consecutive failures, and current weight — essential for health check debugging.

    8.2 Prometheus Metrics

    frontend ft_infrarunbook_prometheus
        bind 10.10.50.1:8405
        mode http
        http-request use-service prometheus-exporter if { path /metrics }
        no log

    Key metrics to watch:

    • haproxy_server_check_failures_total — cumulative check failures
    • haproxy_server_check_duration_seconds — check latency
    • haproxy_server_status — current UP/DOWN/DRAIN/MAINT status
    • haproxy_server_weight — effective weight (affected by agent checks)

    8.3 Log-Based Monitoring

    Enable check logging to see state transitions in your syslog:

    global
        log /dev/log local0 info
    
    defaults
        log global
        option log-health-checks

    With option log-health-checks, HAProxy logs every state transition:

    Feb 17 09:42:15 lb-infrarunbook-01 haproxy[2847]: Health check for server bk_infrarunbook_web/web-infrarunbook-02 failed, reason: Layer7 wrong status, code: 503, check duration: 12ms, status: 2/3 DOWN.
    Feb 17 09:42:18 lb-infrarunbook-01 haproxy[2847]: Health check for server bk_infrarunbook_web/web-infrarunbook-02 failed, reason: Layer7 wrong status, code: 503, check duration: 8ms, status: 1/3 DOWN.
    Feb 17 09:42:48 lb-infrarunbook-01 haproxy[2847]: Health check for server bk_infrarunbook_web/web-infrarunbook-02 succeeded, reason: Layer7 check passed, code: 200, check duration: 5ms, status: 3/3 UP.
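    These lines have a stable shape, so they are easy to parse for alerting (a regex sketch; the field names are my own labels for the sample lines above):

```python
import re

HC_RE = re.compile(
    r"Health check for server (?P<backend>[^/]+)/(?P<server>\S+) "
    r"(?P<result>failed|succeeded), reason: (?P<reason>[^,]+), "
    r"code: (?P<code>\d+), check duration: (?P<duration_ms>\d+)ms, "
    r"status: (?P<remaining>\d+)/(?P<threshold>\d+) (?P<state>UP|DOWN)\."
)

def parse_health_log(line: str):
    """Extract the health-check fields from one syslog line, or None if it doesn't match."""
    m = HC_RE.search(line)
    return m.groupdict() if m else None
```

    Applied to the first sample line above, this yields backend "bk_infrarunbook_web", code "503", and state "DOWN".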

    9. Complete Production Configuration

    Here is a full, production-ready configuration combining all techniques:

    #---------------------------------------------------------------------
    # Global settings — lb-infrarunbook-01
    #---------------------------------------------------------------------
    global
        log /dev/log    local0
        log /dev/log    local1 notice
        chroot /var/lib/haproxy
        stats socket /var/run/haproxy/admin.sock mode 660 level admin expose-fd listeners
        stats timeout 30s
        user haproxy
        group haproxy
        daemon
        maxconn 50000
        ssl-default-bind-ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256
        ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384
        ssl-default-bind-options ssl-min-ver TLSv1.2 no-tls-tickets
        tune.ssl.default-dh-param 2048
    
    #---------------------------------------------------------------------
    # Defaults
    #---------------------------------------------------------------------
    defaults
        log     global
        option  httplog
        option  dontlognull
        option  log-health-checks
        timeout connect 5s
        timeout client  30s
        timeout server  30s
        timeout check   3s
        retries 3
        default-server inter 3s fall 3 rise 2
    
    #---------------------------------------------------------------------
    # Stats frontend
    #---------------------------------------------------------------------
    frontend ft_infrarunbook_stats
        bind 10.10.50.1:9000
        mode http
        stats enable
        stats uri /haproxy-stats
        stats auth infrarunbook-admin:S3cur3P@ssw0rd!
        stats refresh 5s
    
    #---------------------------------------------------------------------
    # Main HTTP frontend
    #---------------------------------------------------------------------
    frontend ft_infrarunbook_https
        bind *:443 ssl crt /etc/haproxy/certs/solvethenetwork.com.pem alpn h2,http/1.1
        mode http
        http-request set-header X-Forwarded-Proto https
    
        acl is_api path_beg /api/
        acl is_static path_beg /static/ /assets/ /images/
        acl is_ws hdr(Upgrade) -i websocket
    
        use_backend bk_infrarunbook_api if is_api
        use_backend bk_infrarunbook_static if is_static
        use_backend bk_infrarunbook_ws if is_ws
        default_backend bk_infrarunbook_web
    
    #---------------------------------------------------------------------
    # Web backend — HTTP health check
    #---------------------------------------------------------------------
    backend bk_infrarunbook_web
        mode http
        balance roundrobin
        option httpchk
        http-check send meth GET uri /healthz ver HTTP/1.1 hdr Host app.solvethenetwork.com
        http-check expect status 200
    
        server web-infrarunbook-01 10.10.50.51:8080 check weight 100
        server web-infrarunbook-02 10.10.50.52:8080 check weight 100
        server web-infrarunbook-03 10.10.50.53:8080 check weight 100 backup
    
    #---------------------------------------------------------------------
    # API backend — HTTP check + agent check for load-aware balancing
    #---------------------------------------------------------------------
    backend bk_infrarunbook_api
        mode http
        balance leastconn
        option httpchk
        http-check send meth GET uri /api/v1/health ver HTTP/1.1 hdr Host api.solvethenetwork.com
        http-check expect status 200
        http-check expect string "status":"healthy"
    
        default-server inter 2s fastinter 500ms fall 2 rise 3
    
        server api-infrarunbook-01 10.10.50.61:8080 check agent-check agent-port 9707 agent-inter 5s
        server api-infrarunbook-02 10.10.50.62:8080 check agent-check agent-port 9707 agent-inter 5s
        server api-infrarunbook-03 10.10.50.63:8080 check agent-check agent-port 9707 agent-inter 5s
    
    #---------------------------------------------------------------------
    # Static backend — lightweight check, long interval
    #---------------------------------------------------------------------
    backend bk_infrarunbook_static
        mode http
        balance roundrobin
        option httpchk
        http-check send meth HEAD uri /static/health.txt ver HTTP/1.1 hdr Host static.solvethenetwork.com
        http-check expect status 200
    
        default-server inter 15s fall 3 rise 2
    
        server static-infrarunbook-01 10.10.50.111:80 check
        server static-infrarunbook-02 10.10.50.112:80 check
    
    #---------------------------------------------------------------------
    # WebSocket backend — TCP check with longer timeouts
    #---------------------------------------------------------------------
    backend bk_infrarunbook_ws
        mode http
        balance source
        option httpchk
        http-check send meth GET uri /ws/health ver HTTP/1.1 hdr Host ws.solvethenetwork.com
        http-check expect status 200
        timeout tunnel 3600s
    
        server ws-infrarunbook-01 10.10.50.141:8080 check inter 5s fall 3 rise 2
        server ws-infrarunbook-02 10.10.50.142:8080 check inter 5s fall 3 rise 2

    10. Troubleshooting Health Checks

    10.1 Common Issues and Solutions

    • All servers showing DOWN on stats page: Check that the health endpoint returns the expected status code. Test manually with curl: curl -v http://10.10.50.51:8080/healthz
    • Flapping servers (rapidly switching UP/DOWN): Increase fall and rise thresholds; check for intermittent application errors
    • Health checks passing but application broken: Your health endpoint isn't checking enough. Add database connectivity, cache availability, and dependency checks to your /healthz endpoint
    • High CPU on backend servers from health checks: Increase inter or use downinter to reduce checks against known-DOWN servers; consider HEAD instead of GET
    • Check timeout errors: Increase timeout check or investigate why your health endpoint is slow

    10.2 Runtime API Diagnostic Commands

    # Show all servers and their check status
    echo "show servers state" | socat stdio /var/run/haproxy/admin.sock
    
    # Show detailed backend info
    echo "show stat" | socat stdio /var/run/haproxy/admin.sock | \
      cut -d, -f1,2,18,19,20,21,37 | column -s, -t
    
    # Force the server's health state to UP (as if a check had just succeeded)
    echo "set server bk_infrarunbook_web/web-infrarunbook-01 health up" | \
      socat stdio /var/run/haproxy/admin.sock
    
    # Manually disable health checks (USE WITH CAUTION)
    echo "disable health bk_infrarunbook_web/web-infrarunbook-01" | \
      socat stdio /var/run/haproxy/admin.sock
    
    # Re-enable health checks
    echo "enable health bk_infrarunbook_web/web-infrarunbook-01" | \
      socat stdio /var/run/haproxy/admin.sock

    10.3 Simulating Health Check Requests

    # Replicate exactly what HAProxy sends
    curl -v \
      -H "Host: app.solvethenetwork.com" \
      -H "User-Agent: HAProxy-HealthCheck/2.8" \
      http://10.10.50.51:8080/healthz
    
    # For SSL backend checks
    curl -vk \
      -H "Host: api.solvethenetwork.com" \
      https://10.10.50.61:8443/api/v1/health

    11. Best Practices Summary

    1. Always use application-level (HTTP) checks for HTTP backends — TCP checks only prove the port is open, not that the app is healthy
    2. Design meaningful health endpoints — check database connections, cache availability, disk space, and critical dependencies
    3. Use fastinter to accelerate state transitions without increasing steady-state probe frequency
    4. Use downinter to reduce load on servers that are already confirmed DOWN
    5. Combine HTTP checks with agent checks for load-aware balancing in dynamic environments
    6. Set timeout check explicitly — don't rely on the server timeout for health checks
    7. Enable option log-health-checks so state transitions appear in your logs and alerting
    8. Use default-server to avoid repeating check parameters on every server line
    9. Test health checks manually with curl before deploying to production
    10. Monitor check duration metrics — increasing check latency is an early warning of server degradation

    Frequently Asked Questions

    What is the difference between HAProxy TCP and HTTP health checks?

    TCP health checks (Layer 4) only verify that a TCP connection can be established to the backend server — they confirm the port is open but nothing about application health. HTTP health checks (Layer 7) send an actual HTTP request and inspect the response status code, headers, or body content. For web applications, HTTP checks are always recommended because a server can accept TCP connections while the application is in an error state.

    What do the inter, fall, and rise parameters mean in HAProxy?

    The 'inter' parameter sets the time interval between consecutive health checks (default 2 seconds). The 'fall' parameter specifies how many consecutive failed checks are required before marking a server as DOWN (default 3). The 'rise' parameter specifies how many consecutive successful checks are needed before marking a DOWN server as UP again (default 2). The worst-case failure detection time equals inter × fall.

    How do I calculate worst-case failover time in HAProxy?

    Worst-case failover detection time is calculated as inter × fall. For example, with inter 3s and fall 3, the worst-case detection time is 9 seconds. The average detection time is approximately inter × (fall - 0.5). You can reduce detection time by lowering inter and fall, but this increases health check traffic and may cause false positives with flapping servers.

    What is fastinter in HAProxy and when should I use it?

    The 'fastinter' parameter sets a shorter check interval that is used during state transitions — specifically when a server is in the process of being declared DOWN (has some failed checks but hasn't reached the fall threshold) or coming back UP (has some successful checks but hasn't reached the rise threshold). This accelerates the resolution of state changes without increasing the frequency of steady-state health checks. For example, inter 5s fastinter 500ms means checks run every 5 seconds normally but every 500ms during transitions.

    How do HAProxy agent checks work?

    Agent checks connect to a small TCP daemon running on each backend server (typically on a dedicated port like 9707). The agent responds with keywords like 'up', 'down', 'drain', 'maint', 'ready', or a percentage like '50%' to dynamically adjust the server's weight. Agent checks can be combined with regular health checks — the HTTP check determines UP/DOWN status while the agent adjusts weight and administrative state. This enables load-aware balancing and graceful deployment draining.

    Can I run health checks on a different port than the data port?

    Yes, use the 'check port' option on the server line. For example: 'server app-01 10.10.50.51:8080 check port 8081 inter 3s fall 3 rise 2' sends health checks to port 8081 while real traffic goes to port 8080. You can also use 'check addr' to probe a different IP address, which is useful when health endpoints run on a management network interface.

    How do I gracefully drain a server before maintenance in HAProxy?

    Use the HAProxy runtime API via the admin socket: 'echo "set server backend/server state drain" | socat stdio /var/run/haproxy/admin.sock'. This stops new connections from being routed to the server while existing connections finish naturally. Monitor active sessions via the stats page or 'show stat' command, then set the server to 'maint' once sessions reach zero. Alternatively, use an agent check that returns 'drain' during deployments for automated zero-downtime deploys.

    Why are all my HAProxy backend servers showing as DOWN?

    Common causes include: the health endpoint returning a non-200 status code, the health endpoint not existing (404), firewall rules blocking health check connections, the Host header in the check not matching the application's virtual host configuration, or SSL/TLS misconfiguration on check connections. Debug by testing manually with curl using the exact same request HAProxy sends: 'curl -v -H "Host: your-host" http://server-ip:port/healthz'. Also check HAProxy logs with 'option log-health-checks' enabled.

    What is the difference between external checks and agent checks in HAProxy?

    External checks fork a script/process on the HAProxy server itself for each check execution — the script's exit code determines pass/fail. Agent checks connect to a TCP daemon running on the backend server that returns status keywords. External checks are more flexible (can run any logic) but have higher overhead due to process forking and require 'insecure-fork-wanted' in HAProxy 2.x. Agent checks are lightweight, run on the backend servers, and can dynamically adjust weight and administrative state, making them better for load-aware balancing.

    How do I monitor HAProxy health check status in Prometheus?

    Enable the built-in Prometheus exporter by adding a frontend with 'http-request use-service prometheus-exporter if { path /metrics }'. Key metrics include haproxy_server_check_failures_total (cumulative failures), haproxy_server_check_duration_seconds (check latency), haproxy_server_status (current state), and haproxy_server_weight (effective weight). Set up Prometheus to scrape the endpoint and create Grafana dashboards or alerts for state transitions and increasing check durations.
