InfraRunBook

    Layer 4 vs Layer 7 Load Balancing Explained

    Networking
    Published: Apr 4, 2026
    Updated: Apr 4, 2026

    A deep-dive technical guide comparing Layer 4 and Layer 7 load balancing — covering how each works, performance trade-offs, real-world HAProxy and IPVS configurations, and when to choose one over the other.


    What Is Load Balancing?

    Load balancing is the practice of distributing incoming network requests across a pool of backend servers to maximize throughput, minimize response time, and prevent any single resource from becoming a bottleneck. In modern infrastructure, load balancers are a foundational data plane component — but not all load balancers are created equal. The distinction between Layer 4 and Layer 7 load balancing is one of the most consequential architectural decisions you will face when designing a resilient, scalable service.

    The "layer" terminology refers to the OSI model (Open Systems Interconnection model), a conceptual framework that partitions network communication into seven distinct abstraction layers. Layer 4 is the Transport Layer, responsible for end-to-end communication via TCP and UDP. Layer 7 is the Application Layer, where protocols like HTTP, HTTPS, gRPC, DNS over TLS, and WebSocket operate. Where a load balancer sits in this stack determines everything about what it can see, what decisions it can make, and what it costs you to run it.

    How Layer 4 Load Balancing Works

    A Layer 4 load balancer operates purely on transport-level metadata. It inspects the source IP, destination IP, source port, destination port, and protocol — collectively known as the 5-tuple — to make forwarding decisions. It has no awareness of payload content whatsoever. It does not read HTTP headers, inspect cookies, parse URL paths, or even see the SNI hostname inside the TLS ClientHello, since all of that lives in the payload. To the L4 load balancer, all traffic is opaque byte streams belonging to TCP connections or UDP datagrams.

    There are two primary forwarding modes for L4 load balancers:

    • Network Address Translation (NAT): The load balancer rewrites the destination IP (and optionally the destination port) of each packet and forwards it to a selected backend. Return traffic from the backend passes back through the load balancer, which rewrites the source address to maintain the appearance of a single endpoint to the client. This is the most common mode for software-based L4 load balancers.
    • Direct Server Return (DSR): The load balancer rewrites only the destination MAC address at Layer 2 and forwards the frame directly to a backend on the same broadcast domain. The backend server holds the virtual IP on its loopback interface and responds directly to the client, completely bypassing the load balancer on the return path. Since outbound traffic (responses) is typically orders of magnitude larger than inbound traffic (requests), DSR dramatically reduces the load balancer's bandwidth requirements.
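
    In concrete terms, a DSR deployment needs two pieces of configuration: the VIP held on each backend's loopback with ARP replies for it suppressed, and gatewaying mode on the director. A minimal sketch, reusing the addresses from the IPVS example below (the exact sysctl scope can vary by distribution):

    # On each backend: hold the VIP on loopback and suppress ARP for it
    ip addr add 10.10.10.100/32 dev lo
    sysctl -w net.ipv4.conf.all.arp_ignore=1
    sysctl -w net.ipv4.conf.all.arp_announce=2

    # On the director: -g (gatewaying) selects DSR instead of -m (NAT)
    ipvsadm -A -t 10.10.10.100:80 -s rr
    ipvsadm -a -t 10.10.10.100:80 -r 10.10.20.11:80 -g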

    Because L4 load balancers work at the TCP connection level, they establish a persistent mapping between a client connection and a specific backend server for the entire lifetime of that TCP session. All packets belonging to a given 5-tuple flow are consistently forwarded to the same backend. This is connection-level affinity — not to be confused with application-level session stickiness.
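
    You can observe this affinity directly: IPVS keeps a per-flow connection table, and each entry records the client address, the VIP, and the real server the flow was pinned to.

    # Dump the IPVS connection table, one entry per tracked flow
    ipvsadm -L -c -n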

    A typical L4 load balancer configuration using Linux IPVS (IP Virtual Server) on sw-infrarunbook-01 looks like this:

    # L4 virtual service on sw-infrarunbook-01
    # VIP: 10.10.10.100:80 — distributes to three app backends
    
    ipvsadm -A -t 10.10.10.100:80 -s rr
    ipvsadm -a -t 10.10.10.100:80 -r 10.10.20.11:80 -m
    ipvsadm -a -t 10.10.10.100:80 -r 10.10.20.12:80 -m
    ipvsadm -a -t 10.10.10.100:80 -r 10.10.20.13:80 -m
    
    # Verify the virtual service table
    ipvsadm -L -n
    
    # Expected output:
    # IP Virtual Server version 1.2.1
    # Prot LocalAddress:Port Scheduler Flags
    #   -> RemoteAddress:Port    Forward Weight ActiveConn InActConn
    # TCP  10.10.10.100:80 rr
    #   -> 10.10.20.11:80        Masq    1      142        0
    #   -> 10.10.20.12:80        Masq    1      139        0
    #   -> 10.10.20.13:80        Masq    1      141        0
    
    # -s rr  = round-robin scheduling algorithm
    # -m     = masquerading (NAT forwarding mode)
    # -A     = add virtual service
    # -a     = add real server to virtual service

    In this configuration, sw-infrarunbook-01 is forwarding TCP connections arriving at 10.10.10.100:80 to one of three backend servers (10.10.20.11–13) using round-robin scheduling with NAT mode. The kernel-level IPVS module processes packets entirely in the forwarding path — no userspace process touches the packet content. It never reads a single byte of HTTP.

    How Layer 7 Load Balancing Works

    A Layer 7 load balancer operates as a full reverse proxy. It terminates the client's TCP connection (and TLS session, if applicable), reads and fully parses the application-layer protocol, makes a routing decision based on that application-level content, and then opens a new, independent TCP connection to the selected backend. From the backend's perspective, every request comes from the load balancer — not directly from the original client. This is why L7 load balancers must inject headers like X-Forwarded-For to preserve the original client IP address.

    This full-proxy model enables a significantly richer set of routing and traffic management capabilities:

    • Path-based routing: Route all requests to /api/ to one backend cluster, /static/ to a CDN origin, and /ws/ to a dedicated WebSocket tier.
    • Host-based virtual hosting: Route traffic for app.solvethenetwork.com to one backend pool and admin.solvethenetwork.com to another, all on a single listener IP and port.
    • Header inspection and manipulation: Inject, strip, or rewrite request and response headers before forwarding. Add correlation IDs, remove internal headers that should not be visible to clients, or rewrite redirect URLs.
    • TLS/SSL termination: Offload the CPU-intensive work of TLS handshakes and symmetric encryption to the load balancer tier, allowing backend servers to communicate over plain HTTP on the internal RFC 1918 network.
    • Cookie-based session persistence: Insert a sticky-session cookie to ensure a client always returns to the same backend application instance — critical for stateful applications storing session data in local memory.
    • Application-layer health checking: Send real HTTP requests to a health endpoint and evaluate the HTTP response code and body, rather than merely checking if a TCP port is open.
    • Rate limiting and WAF integration: Enforce per-IP or per-token rate limits, inspect requests for injection attacks, and block malicious traffic before it reaches the application tier.

    Here is a production-style HAProxy configuration on sw-infrarunbook-01 demonstrating multi-service L7 routing:

    # /etc/haproxy/haproxy.cfg on sw-infrarunbook-01
    # L7 reverse proxy for solvethenetwork.com services
    
    global
        log         127.0.0.1 local2
        chroot      /var/lib/haproxy
        user        infrarunbook-admin
        group       infrarunbook-admin
        maxconn     50000
        tune.ssl.default-dh-param 2048
    
    defaults
        mode                    http
        log                     global
        option                  httplog
        option                  dontlognull
        option                  http-server-close
        option                  forwardfor except 10.10.0.0/16
        timeout connect         5s
        timeout client          30s
        timeout server          30s
    
    frontend https_front
        bind 10.10.10.100:443 ssl crt /etc/ssl/solvethenetwork.com.pem
        bind 10.10.10.100:80
        redirect scheme https code 301 if !{ ssl_fc }
    
        # ACL definitions
        acl is_api        path_beg      /api/
        acl is_admin      hdr(host)     -i admin.solvethenetwork.com
        acl is_static     path_beg      /static/ /assets/ /media/
        acl is_websocket  hdr(Upgrade)  -i websocket
    
        # Routing rules (evaluated top-to-bottom)
        use_backend ws_cluster     if is_websocket
        use_backend admin_cluster  if is_admin
        use_backend api_cluster    if is_api
        use_backend static_cluster if is_static
        default_backend app_cluster
    
    backend api_cluster
        balance leastconn
        option httpchk GET /api/health HTTP/1.1\r\nHost:\ solvethenetwork.com
        http-check expect status 200
        server api-01 10.10.20.21:8080 check inter 5s fall 2 rise 3
        server api-02 10.10.20.22:8080 check inter 5s fall 2 rise 3
        server api-03 10.10.20.23:8080 check inter 5s fall 2 rise 3
    
    backend app_cluster
        balance roundrobin
        cookie SRVID insert indirect nocache httponly secure
        option httpchk GET /healthz
        server app-01 10.10.20.31:8080 check cookie s01
        server app-02 10.10.20.32:8080 check cookie s02
        server app-03 10.10.20.33:8080 check cookie s03
    
    backend admin_cluster
        balance leastconn
        acl valid_admin_src src 10.10.1.0/24
        http-request deny unless valid_admin_src
        server adm-01 10.10.20.41:8443 check ssl verify none
        server adm-02 10.10.20.42:8443 check ssl verify none
    
    backend ws_cluster
        balance source
        timeout tunnel  3600s
        server ws-01 10.10.20.51:9000 check
        server ws-02 10.10.20.52:9000 check
    
    backend static_cluster
        balance uri
        server cdn-01 10.10.20.61:80 check
        server cdn-02 10.10.20.62:80 check

    Notice the frontend performing TLS termination, inspecting the Host header and URL path, applying an IP-based access control rule on the admin backend, and routing WebSocket upgrades to a dedicated backend pool — none of which is possible with a pure L4 load balancer.

    Why It Matters: Performance, Cost, and Capability Trade-offs

    Choosing between L4 and L7 load balancing is not merely a technical curiosity — it has direct implications for throughput, latency, operational complexity, security posture, and infrastructure cost.

    Throughput and Latency

    Layer 4 load balancers are significantly faster and more resource-efficient per-connection. Because they operate on packet headers only, implementations like Linux IPVS run entirely inside the kernel without userspace context switches, and hardware ASIC-based appliances can process packets at line rate. A single L4 load balancer can sustain millions of concurrent connections at very low CPU overhead. Layer 7 load balancers must terminate TLS, parse HTTP/1.1 or HTTP/2 framing, evaluate ACLs, and open a new upstream connection for every request — all of which is CPU-intensive. Modern implementations (Envoy, HAProxy, Nginx) are highly optimized and the latency overhead is typically under one millisecond per request, which is acceptable for most web application workloads.
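
    If you want to quantify this overhead for your own workload, a simple approach is to benchmark the same endpoint twice, once through the load balancer and once directly against a backend, and compare the latency distributions. A sketch using wrk with the addresses from the examples in this article (any HTTP benchmarking tool works the same way):

    # Through the L7 load balancer (TLS termination included in the measurement)
    wrk -t4 -c100 -d30s --latency https://10.10.10.100/api/health

    # Directly against one backend, bypassing the proxy
    wrk -t4 -c100 -d30s --latency http://10.10.20.21:8080/api/health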

    Feature Set

    If your routing logic requires any application awareness — path-based routing, virtual hosting, A/B testing, canary deployments, mutual TLS, gRPC stream-level load balancing, HTTP/2 multiplexing, or request-level rate limiting — you need L7. An L4 load balancer cannot inspect or act on any of these signals. For non-HTTP protocols (PostgreSQL, Redis, SMTP, custom TCP-based protocols), L4 is typically the only viable option unless a protocol-aware proxy exists for your specific protocol.

    Security

    L7 load balancers are a natural integration point for Web Application Firewalls (WAF), DDoS mitigation at the request level, and centralized certificate lifecycle management. They can sanitize and strip headers that should not propagate to backends — for example, stripping a spoofed X-Forwarded-For header injected by an untrusted client. L4 load balancers forward traffic largely unexamined, providing minimal security value beyond basic port-level filtering. However, L4 load balancers have a smaller attack surface on the load balancer process itself, since there is no HTTP parser that could be exploited.

    Observability

    L7 load balancers produce rich access logs with HTTP-level detail: status codes, request latency broken down by URL path, request and response body sizes, user-agent strings, and backend selection decisions. L4 load balancers can log only connection-level metrics: bytes transferred, TCP reset counts, and connection duration. For application performance monitoring and SLO tracking, L7 logs are vastly more actionable and are the standard source of truth for HTTP error rate and latency SLIs.

    Real-World Examples and Use Cases

    Example 1: Database Replica Load Balancing (L4)

    Distributing read queries across a PostgreSQL replica pool is a canonical L4 use case. The load balancer does not need to understand the PostgreSQL wire protocol — it simply distributes TCP connections on port 5432 across multiple read replicas. Using HAProxy in TCP mode on sw-infrarunbook-01:

    # HAProxy TCP mode for PostgreSQL read replicas
    # sw-infrarunbook-01 /etc/haproxy/haproxy.cfg
    
    frontend pg_frontend
        bind 10.10.10.200:5432
        mode tcp
        default_backend pg_replicas
    
    backend pg_replicas
        mode tcp
        balance leastconn
        option tcp-check
        tcp-check connect
        server pg-replica-01 10.10.30.11:5432 check inter 10s fall 2 rise 2
        server pg-replica-02 10.10.30.12:5432 check inter 10s fall 2 rise 2
        server pg-replica-03 10.10.30.13:5432 check inter 10s fall 2 rise 2

    This configuration distributes PostgreSQL read traffic with leastconn scheduling — important for database workloads where connection duration varies significantly. TCP-level health checks verify the port is accepting connections without needing to authenticate to the database.
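
    HAProxy can also go one level deeper than a bare TCP check without fully authenticating: option pgsql-check sends a PostgreSQL startup message and verifies the server answers with a valid protocol response. A hedged variant of the backend above (the haproxy user name is an assumption; the role only needs to exist):

    backend pg_replicas
        mode tcp
        balance leastconn
        option pgsql-check user haproxy
        server pg-replica-01 10.10.30.11:5432 check inter 10s fall 2 rise 2
        server pg-replica-02 10.10.30.12:5432 check inter 10s fall 2 rise 2
        server pg-replica-03 10.10.30.13:5432 check inter 10s fall 2 rise 2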

    Example 2: Two-Tier Load Balancing Architecture

    Large-scale platforms frequently combine both layers. An L4 load balancer (implemented with IPVS, eBPF/XDP, or a hardware appliance) sits at the network edge and distributes TCP connections to a pool of L7 load balancer instances. The L7 tier then performs application-aware routing to the backend application clusters. This architecture provides the horizontal scalability and fault tolerance of L4 with the full feature richness of L7:

    Internet clients
            |
            v
    +--------------------+
    | L4 LB Tier         |  VIP: 10.10.10.100  (IPVS or hardware ASIC)
    | sw-infrarunbook-01 |  Scheduling: round-robin, no TLS, mode tcp
    +--------------------+
            |
            +-----------> L7 Instance A  10.10.10.111  (HAProxy)
            +-----------> L7 Instance B  10.10.10.112  (HAProxy)
            +-----------> L7 Instance C  10.10.10.113  (HAProxy)
                                  |
                  +---------------+---------------+
                  |               |               |
                  v               v               v
         api_cluster         app_cluster     admin_cluster
      10.10.20.21-23       10.10.20.31-33  10.10.20.41-42
    
    Flow: Client -> L4 LB (TCP forward) -> L7 LB (HTTP parse + route) -> Backend

    The L4 tier scales to millions of connections with trivial CPU cost. The L7 tier scales horizontally — adding more HAProxy instances grows application-layer throughput without changing the L4 VIP configuration. This is the fundamental pattern behind major cloud provider application load balancers.
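
    The L4 tier of this diagram reduces to a few lines of IPVS configuration, forwarding TCP 443 to the three HAProxy instances (addresses taken from the diagram; NAT mode shown, though DSR works equally well here):

    # L4 tier on sw-infrarunbook-01: spread TCP 443 across the L7 instances
    ipvsadm -A -t 10.10.10.100:443 -s rr
    ipvsadm -a -t 10.10.10.100:443 -r 10.10.10.111:443 -m
    ipvsadm -a -t 10.10.10.100:443 -r 10.10.10.112:443 -m
    ipvsadm -a -t 10.10.10.100:443 -r 10.10.10.113:443 -m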

    Common Misconceptions

    Misconception 1: "L7 load balancers are always too slow for production"

    While L7 does add processing overhead compared to L4, modern L7 load balancers can handle hundreds of thousands of HTTP requests per second on commodity hardware. For typical web application workloads, the added per-request latency is under one millisecond. The performance gap is only operationally significant for extremely high-throughput, latency-critical, non-HTTP workloads — such as financial trading infrastructure, raw DNS resolvers, or high-frequency gaming servers — where even microseconds matter.

    Misconception 2: "L4 is more secure because it doesn't inspect traffic"

    This reasoning is inverted. L7 load balancers offer more security capability precisely because they inspect traffic — they can enforce WAF rules, block known attack patterns, strip dangerous headers, and rate-limit abusive clients before any malicious payload reaches the application tier. The L4 counter-argument is that a simpler load balancer has a smaller attack surface on the load balancer process itself. Both considerations are valid; the net security posture of an L7 load balancer protecting a backend cluster is superior to an L4 load balancer passing all traffic through blindly.

    Misconception 3: "Sticky sessions always require L7"

    L4 load balancers can implement a form of session persistence using IP hash scheduling — consistently mapping a client IP address to the same backend. However, this breaks down when large numbers of clients share a single external IP (carrier-grade NAT, corporate proxies), creating severe imbalance. Cookie-based stickiness, which requires L7, is far more reliable and granular because it operates at the session level rather than the network address level.
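
    At L4, source-IP persistence takes one of two forms in IPVS, and both inherit the shared-IP imbalance problem described above:

    # Option 1: source-hashing scheduler maps each client IP to a backend
    ipvsadm -A -t 10.10.10.100:80 -s sh

    # Option 2: any scheduler plus a 300-second per-client persistence window
    ipvsadm -A -t 10.10.10.100:80 -s rr -p 300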

    Misconception 4: "An L7 load balancer can always inspect encrypted traffic"

    An L7 load balancer can only inspect HTTPS traffic if it terminates TLS — meaning it holds the private key and decrypts the session. If you configure TLS passthrough (SNI-based routing without decryption), the L7 load balancer can only use the SNI hostname from the TLS ClientHello, not the HTTP content. In TLS passthrough mode, the load balancer is functionally operating at Layer 4 for that traffic stream.
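
    In HAProxy, TLS passthrough with SNI routing is a mode tcp frontend that waits for the ClientHello and matches on req_ssl_sni. Note that there is no crt on the bind line, so nothing is decrypted. A sketch reusing hostnames from earlier examples (the backend names here are illustrative):

    frontend tls_passthrough
        bind 10.10.10.100:443
        mode tcp
        tcp-request inspect-delay 5s
        tcp-request content accept if { req_ssl_sni -m found }
        use_backend admin_tls if { req_ssl_sni -i admin.solvethenetwork.com }
        default_backend app_tls

    backend app_tls
        mode tcp
        server app-01 10.10.20.31:8443 check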

    Misconception 5: "You must choose one or the other"

    As illustrated in the two-tier architecture above, L4 and L7 load balancing are frequently deployed together in the same infrastructure, each doing what it does best. The L4 tier provides a horizontally scalable, highly available entry point; the L7 tier provides intelligent application routing. There is no binary choice — the right answer is often both, composed in layers.


    Frequently Asked Questions

    Q: What is the primary difference between Layer 4 and Layer 7 load balancing?

    A: Layer 4 load balancing routes traffic based solely on TCP/UDP transport-layer information — source IP, destination IP, and ports — without inspecting the payload. Layer 7 load balancing operates as a full reverse proxy, parsing application-layer protocols like HTTP to make routing decisions based on URL paths, Host headers, cookies, query parameters, and other application-level signals. L4 is faster and protocol-agnostic; L7 is more capable but protocol-specific.

    Q: When should I choose a Layer 4 load balancer over Layer 7?

    A: Choose Layer 4 when you need maximum throughput with minimal per-packet latency, when routing non-HTTP protocols (PostgreSQL, Redis, SMTP, custom TCP/UDP), when you want to preserve end-to-end TLS without terminating it at the load balancer, or when you are distributing connections across a pool of L7 load balancer instances. L4 is also the right choice for very high connection rate environments where L7 CPU overhead would become a bottleneck.

    Q: Can HAProxy do both Layer 4 and Layer 7 load balancing?

    A: Yes, within the same process and configuration file. HAProxy supports mode tcp for Layer 4 (transparent TCP proxy with no HTTP parsing) and mode http for Layer 7 (full HTTP reverse proxy with ACLs, header manipulation, and cookie management). You can define multiple frontends and backends, each operating in a different mode, simultaneously. This makes HAProxy highly versatile for mixed-protocol environments.

    Q: What is TLS termination and why does it matter for load balancing?

    A: TLS termination is the process of decrypting an inbound TLS connection at the load balancer rather than forwarding it encrypted to backend servers. It matters for several reasons: it offloads CPU-intensive cryptographic operations from application servers, it allows the L7 load balancer to inspect HTTP content that would otherwise be encrypted, it centralizes certificate management to a single point, and it simplifies backend deployments since they can listen on plain HTTP inside the RFC 1918 network. TLS termination is only possible at Layer 7 or at a dedicated TLS offload proxy.

    Q: How does health checking differ between L4 and L7 load balancers?

    A: An L4 health check typically verifies only that a TCP three-way handshake can be completed successfully to the backend port. An L7 health check sends a real HTTP request (e.g., GET /healthz HTTP/1.1) and validates that the response matches expected criteria — status code 200, or a specific JSON body field indicating all dependencies are healthy. L7 health checks are far more accurate: a backend process can accept TCP connections (the port is bound) while being completely unable to serve application traffic due to a crashed thread pool, a full database connection queue, or a failed downstream dependency.
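
    Both styles of check can be reproduced by hand, which makes the difference between "port open" and "actually healthy" concrete (backend address reused from the examples in this article):

    # L4-style check: does the TCP three-way handshake complete?
    nc -z -w 2 10.10.20.31 8080 && echo "port open"

    # L7-style check: does the application answer 200 on its health endpoint?
    curl -fsS -o /dev/null -w '%{http_code}\n' http://10.10.20.31:8080/healthz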

    Q: What load balancing scheduling algorithms are available at each layer?

    A: Layer 4 load balancers typically support round-robin, least connections, IP hash, and weighted variants of these algorithms. Layer 7 load balancers support all of the above plus URL hash (useful for cache affinity — routing requests for the same URL to the same cache node), header hash, consistent hashing (minimizes rehashing when backends are added or removed), random-with-two-choices, and resource-aware algorithms that evaluate backend CPU or queue depth reported via health check endpoints. The richer algorithm set at L7 reflects the additional context available from application-layer data.

    Q: Does gRPC require a Layer 7 load balancer?

    A: In practice, yes. gRPC uses HTTP/2 as its transport, which multiplexes multiple RPC streams over a single long-lived TCP connection. If you place an L4 load balancer in front of gRPC backends, it will pin all RPC traffic from a given client to a single backend for the lifetime of the connection, completely negating load distribution. An L7 load balancer that understands HTTP/2 framing (Envoy, Nginx with grpc_pass, HAProxy 2.x) can distribute individual gRPC streams — not just connections — across the backend pool, providing true per-RPC load balancing.

    Q: What is Direct Server Return and when should it be used?

    A: Direct Server Return (DSR) is an L4 forwarding mode in which the load balancer routes incoming packets to a backend by rewriting the Layer 2 destination MAC address, but the backend sends responses directly to the client without the traffic returning through the load balancer. This works because backends are configured with the virtual IP on their loopback interface and ARP responses for the VIP are suppressed on backend NICs. DSR is valuable when response traffic volume is much larger than request traffic — typical for file downloads, video streaming, or large API responses. It eliminates the load balancer as a bandwidth bottleneck on the outbound path. The constraint is that the load balancer and all backends must share the same Layer 2 broadcast domain.

    Q: How do major cloud providers map their products to L4 and L7?

    A: The mapping is consistent across providers. AWS offers Network Load Balancer (NLB) for Layer 4 and Application Load Balancer (ALB) for Layer 7. Google Cloud provides TCP Proxy and TCP/UDP Network Load Balancing at Layer 4, and Cloud HTTP(S) Load Balancing at Layer 7. Azure offers Azure Load Balancer (L4) and Application Gateway (L7). All of these cloud-managed products handle high availability and cross-zone distribution internally. The feature sets closely match the conceptual division described in this article — NLBs have lower latency and protocol flexibility, while ALBs and Application Gateways offer path routing, WAF modules, and managed TLS certificates.

    Q: What happens to WebSocket connections at a Layer 7 load balancer?

    A: WebSocket connections begin as a standard HTTP/1.1 upgrade request containing the header Upgrade: websocket. The L7 load balancer intercepts this request, applies its normal routing ACLs, selects a backend, and forwards the upgrade. Once the backend confirms the upgrade with a 101 Switching Protocols response, the load balancer switches that connection to a transparent tunnel mode, forwarding WebSocket frames bidirectionally without further HTTP parsing. Because WebSocket connections are long-lived (potentially hours or days), session persistence configuration is critical — using source IP or cookie affinity to ensure all frames in a session reach the same backend.

    Q: Is Kubernetes Ingress a Layer 4 or Layer 7 component?

    A: Kubernetes Ingress is a Layer 7 construct. The Ingress resource defines HTTP and HTTPS routing rules based on hostname and path, and the Ingress Controller (Nginx Ingress, Traefik, Envoy-based Contour, HAProxy Ingress) implements these as an L7 reverse proxy running inside the cluster. Kubernetes also provides Service resources of type LoadBalancer, which provision a cloud provider L4 load balancer (typically equivalent to an NLB) that routes TCP traffic to the cluster's NodePort or directly to pods via the cloud provider's native pod networking integration. In a typical production setup, both are present: an L4 NLB at the edge forwards traffic to the L7 Ingress Controller, which routes to pods.

    Q: Can a Layer 7 load balancer become a single point of failure?

    A: Yes, if deployed as a single instance it is by definition a SPOF. In production environments, L7 load balancers must be deployed in a highly available configuration. Common approaches include: active-passive pairs with a shared virtual IP managed by VRRP and Keepalived (failover on node failure); active-active clusters behind an upstream L4 load balancer or Anycast IP (all instances serve traffic simultaneously); or DNS-based load balancing with low TTLs pointing to multiple L7 instances. Cloud-managed L7 load balancers (AWS ALB, GCP HTTPS LB, Azure Application Gateway) handle availability internally and are inherently multi-zone by design. On self-managed infrastructure, always treat the load balancer tier itself as a component requiring its own HA architecture.
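
    A minimal active-passive sketch with Keepalived: both HAProxy nodes carry this VRRP block, the higher priority holds the VIP, and a tracked script sheds priority if the haproxy process dies (the interface name, password, and priority values are illustrative assumptions):

    # /etc/keepalived/keepalived.conf on the primary (standby uses priority 100)
    vrrp_script chk_haproxy {
        script "pidof haproxy"
        interval 2
        weight -20
    }

    vrrp_instance VI_1 {
        state MASTER
        interface eth0
        virtual_router_id 51
        priority 150
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass s3cr3t
        }
        virtual_ipaddress {
            10.10.10.100/24
        }
        track_script {
            chk_haproxy
        }
    }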
