Symptoms
When an F5 BIG-IP virtual server stops responding, users and monitoring systems surface a predictable set of indicators. Recognizing these before you log in to the BIG-IP dramatically narrows your search space.
- Clients receive Connection timed out or Connection refused when reaching the VIP address
- Browser returns HTTP 503 Service Unavailable with no F5 error page injected
- curl -Iv https://10.10.50.100 hangs indefinitely or resets immediately with a TCP RST
- Monitoring probes (Nagios, Zabbix, Datadog) alert that the virtual server health check has failed
- Application servers log upstream connection failures — no new requests arriving from the load balancer
- SSL handshake never completes for HTTPS virtual servers; the TLS client hello receives no server hello
- Traffic capture on the client side shows SYN packets transmitted but no SYN-ACK returned
- BIG-IP statistics counters for the virtual server show zero increments on current connections despite active client attempts
Root Cause 1: VIP Not Enabled
Why It Happens
The most frequently missed cause of a non-responding virtual server is that the VIP itself is administratively disabled. This occurs after maintenance windows where engineers forget to re-enable objects before closing the change ticket, after a configuration push that inadvertently sets the virtual server state to disabled, or following automated deployment scripts that toggle virtual server state without a corresponding re-enable step. The BIG-IP data plane completely ignores a disabled virtual server — no SYN is acknowledged, no connection is established, and no log entry is generated for dropped attempts.
How to Identify It
SSH to the BIG-IP as infrarunbook-admin and check the virtual server state with TMSH:
tmsh show ltm virtual vs_solvethenetwork_443
A disabled virtual server produces output similar to:
Ltm::Virtual Server: vs_solvethenetwork_443
Availability : offline
State : disabled
Reason : Virtual server is disabled
Current Sessions : 0
Total Sessions : 847291
To scan all virtual servers for any that are disabled:
tmsh list ltm virtual | grep -E "ltm virtual|disabled|destination"
ltm virtual vs_solvethenetwork_443 {
disabled
destination 10.10.50.100:443
ltm virtual vs_solvethenetwork_80 {
destination 10.10.50.100:80
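The grep one-liner works interactively; for automation it is easier to reuse a small parser. An illustrative sketch (the find_disabled helper and the embedded sample are ours, not part of TMOS — in practice pipe the live tmsh command into the function):

```shell
# Illustrative parser: print the name of every virtual server whose
# configuration carries the "disabled" flag. The here-string mirrors a
# saved `tmsh list ltm virtual` capture.
sample_config='ltm virtual vs_solvethenetwork_443 {
    disabled
    destination 10.10.50.100:443
}
ltm virtual vs_solvethenetwork_80 {
    destination 10.10.50.100:80
}'

find_disabled() {
    # Remember the current VS name; report it when a bare "disabled" line appears.
    awk '/^ltm virtual/ { vs = $3 } /^[[:space:]]*disabled[[:space:]]*$/ { print vs }'
}

printf '%s\n' "$sample_config" | find_disabled   # -> vs_solvethenetwork_443
```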
How to Fix It
Re-enable the virtual server from TMSH and persist the configuration:
tmsh modify ltm virtual vs_solvethenetwork_443 enabled
tmsh save sys config
Confirm the state change took effect:
tmsh show ltm virtual vs_solvethenetwork_443 field-fmt | grep -E "availability-state|enabled-state|status-reason"
status.availability-state available
status.enabled-state enabled
status.status-reason The virtual server is available
Root Cause 2: No Pool Members Up
Why It Happens
Even when the virtual server is enabled, it will refuse or drop new connections if every member of its default pool is marked down by the health monitor. Pool members go offline when the configured monitor fails its checks against the backend. Common triggers include application server restarts, a failed code deployment returning unexpected HTTP status codes, network ACL changes blocking the monitor source IP from reaching the pool member port, or health monitor timeouts configured too aggressively for the application's actual response time. With zero healthy members, BIG-IP has nowhere to forward new connections. Depending on the action-on-service-down setting, it will either reset the client connection or simply drop the SYN.
How to Identify It
Display pool and member availability:
tmsh show ltm pool pool_solvethenetwork_443 members
Ltm::Pool: pool_solvethenetwork_443
Availability : offline
State : enabled
Reason : The children pool member(s) are down
Ltm::Pool Member: 10.10.20.10:8443
Availability : offline
State : enabled
Reason : Pool member has been marked down by a monitor
Ltm::Pool Member: 10.10.20.11:8443
Availability : offline
State : enabled
Reason : Pool member has been marked down by a monitor
Ltm::Pool Member: 10.10.20.12:8443
Availability : offline
State : enabled
Reason : Pool member has been marked down by a monitor
Check what monitor is assigned and inspect its configuration:
tmsh list ltm pool pool_solvethenetwork_443 | grep monitor
tmsh list ltm monitor https mon_solvethenetwork_https
ltm monitor https mon_solvethenetwork_https {
defaults-from https
interval 5
recv "HTTP/1.1 200"
send "GET /health HTTP/1.1\r\nHost: solvethenetwork.com\r\nConnection: close\r\n\r\n"
timeout 16
}
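The example monitor follows the common F5 timing rule of thumb, timeout = 3 × interval + 1, which lets a member miss three consecutive probes before being marked down. A quick sketch of that arithmetic using the values shown above (the rule is a convention, not enforced by TMOS):

```shell
# Timing rule of thumb: timeout = 3 * interval + 1 tolerates three
# consecutive missed probes before the member is marked down.
interval=5
timeout=16
recommended=$(( 3 * interval + 1 ))
echo "recommended minimum timeout: ${recommended}s"   # -> 16s
if [ "$timeout" -ge "$recommended" ]; then
    echo "timing ok: member survives up to three missed probes"
else
    echo "timing too tight: raise timeout to at least ${recommended}s"
fi
```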
How to Fix It
Verify the backend application is actually reachable from the BIG-IP self-IP on the server VLAN:
ping -c 4 10.10.20.10
curl -kv --interface 10.10.10.1 https://10.10.20.10:8443/health
If the application is genuinely down, restore the application service on all backend servers. Once the application responds with the expected content, the BIG-IP monitor will automatically mark members available within one monitor interval. You can force an immediate re-evaluation by bouncing the member state:
tmsh modify ltm pool pool_solvethenetwork_443 members modify { 10.10.20.10:8443 { state user-down } }
tmsh modify ltm pool pool_solvethenetwork_443 members modify { 10.10.20.10:8443 { state user-up } }
If the application is healthy but the monitor is misconfigured (wrong recv string, wrong URI, wrong port), correct the monitor and save:
tmsh modify ltm monitor https mon_solvethenetwork_https recv "200 OK"
tmsh save sys config
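The recv string behaves as a match against the backend's response: if it never matches, the member stays down even though the application answers. A hypothetical simulation of that matching logic (the monitor_match helper and sample response are illustrative; real HTTPS monitors treat recv as a regular expression):

```shell
# Simulated recv match: the member goes "up" only when the recv string
# matches somewhere in the response. This backend answers HTTP/1.0, so
# recv "HTTP/1.1 200" fails while the corrected "200 OK" matches.
response='HTTP/1.0 200 OK
Content-Type: text/plain

healthy'

monitor_match() {   # $1 = recv string; response is read on stdin
    grep -q "$1" && echo up || echo down
}

printf '%s\n' "$response" | monitor_match "HTTP/1.1 200"   # -> down
printf '%s\n' "$response" | monitor_match "200 OK"         # -> up
```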
Root Cause 3: Routing Issue to VIP
Why It Happens
The virtual server is enabled and pool members are healthy, yet clients still cannot reach the VIP. The culprit is frequently a routing problem: the upstream router lacks a route to the VIP subnet, route redistribution has failed to inject the VIP into the routing domain, or a recent network change has invalidated the path. This is especially common when the VIP address lives on a different subnet than the BIG-IP management interface, when the F5 participates in dynamic routing (OSPF or BGP) and has lost its adjacency, or after a BIG-IP failover where the floating IP has moved but upstream ARP or routing tables have not refreshed.
How to Identify It
From a host on the client network (172.16.5.0/24), trace the path to the VIP:
traceroute 10.10.50.100
traceroute to 10.10.50.100 (10.10.50.100), 30 hops max, 60 byte packets
1 172.16.5.1 0.4 ms 0.3 ms 0.4 ms
2 * * *
3 * * *
4 * * *
The trace dying after the first hop means the distribution-layer router has no route to 10.10.50.100. On sw-infrarunbook-01, inspect the routing table:
show ip route 10.10.50.100
% Network not in table
On the BIG-IP, confirm the default gateway and routing table are intact:
tmsh show net route
tmsh list sys management-route
Net::Routes
Name Dest Gateway Type Interface
------------------------------------------------------
default 0.0.0.0/0 10.10.50.254 static external
10.10.20.0/24 10.10.10.1 -- connected internal
How to Fix It
Add a static host route on sw-infrarunbook-01 pointing to the F5 external self-IP as the next hop:
ip route 10.10.50.100 255.255.255.255 10.10.50.1
If the VIP subnet is a /24 rather than a host route, use:
ip route 10.10.50.0 255.255.255.0 10.10.50.1
For BGP or OSPF environments, verify the F5's dynamic routing adjacencies from the ZebOS routing shell (dynamic routing state is not exposed through plain tmsh show commands on most TMOS versions):
tmsh run util imish -e "show ip bgp summary"
tmsh run util imish -e "show ip ospf neighbor"
Confirm end-to-end reachability after the fix:
traceroute 10.10.50.100
curl -Iv http://10.10.50.100/
traceroute to 10.10.50.100, 30 hops max
1 172.16.5.1 0.4 ms
2 10.10.50.1 0.6 ms
3 10.10.50.100 0.9 ms
Root Cause 4: Profile Misconfiguration
Why It Happens
BIG-IP virtual servers rely on profiles — HTTP, SSL/TLS, TCP, UDP, OneConnect, and others — to define exactly how traffic is processed at each layer. A misconfigured profile can silently break traffic without surfacing an obvious error on the virtual server availability indicator. The most common failure modes are: an SSL client profile referencing a certificate that has expired or been deleted from the certificate store; a cipher suite mismatch between the client profile and what connecting clients support; an HTTP profile with broken header insertion that causes backends to reject the request; a TCP profile with idle timeouts shorter than application keepalive intervals; or a FastL4 profile inadvertently replacing a Full Proxy profile, stripping application-layer visibility the backend depends on.
How to Identify It
List the profiles attached to the virtual server:
tmsh list ltm virtual vs_solvethenetwork_443 profiles
ltm virtual vs_solvethenetwork_443 {
profiles {
clientssl_solvethenetwork {
context clientside
}
http { }
tcp { }
}
}
Inspect the SSL profile, paying close attention to the referenced certificate and key:
tmsh list ltm profile client-ssl clientssl_solvethenetwork
ltm profile client-ssl clientssl_solvethenetwork {
cert solvethenetwork.com.crt
key solvethenetwork.com.key
chain solvethenetwork-chain.crt
ciphers DEFAULT:!RC4:!EXPORT
}
Check certificate expiry:
tmsh list sys file ssl-cert solvethenetwork.com.crt | grep expiration
expiration-date 1680000000
expiration-string Mar 28, 2023
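The expiration-date field is a Unix epoch timestamp. A small sketch to render it human-readable and flag approaching expiry (GNU date assumed, as on the BIG-IP Linux shell; the 30-day threshold is a convention, not a BIG-IP default):

```shell
# Convert the epoch expiry and warn when fewer than 30 days remain.
expiry_epoch=1680000000
echo "expires: $(date -u -d @"$expiry_epoch" +'%b %e, %Y')"   # -> Mar 28, 2023
days_left=$(( (expiry_epoch - $(date +%s)) / 86400 ))
if [ "$days_left" -lt 30 ]; then
    echo "WARNING: certificate expires in ${days_left} day(s) - renew now"
fi
```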
An expired certificate causes TLS handshake failures. Examine /var/log/ltm for SSL error messages:
tail -100 /var/log/ltm | grep -iE "ssl|handshake|profile|err"
Apr 6 08:12:04 sw-infrarunbook-01 err tmm[14432]: 01260009:3: Connection error: ssl_hs_rxhello:97: unsupported version (70)
Apr 6 08:12:05 sw-infrarunbook-01 err tmm[14432]: 01260009:3: Connection error: ssl_hs_rxhello:97: peer does not support any known cipher suite
How to Fix It
Import the renewed certificate and key, then save:
tmsh install sys crypto cert solvethenetwork.com.crt from-local-file /var/tmp/solvethenetwork_2025.crt
tmsh install sys crypto key solvethenetwork.com.key from-local-file /var/tmp/solvethenetwork_2025.key
tmsh install sys crypto cert solvethenetwork-chain.crt from-local-file /var/tmp/solvethenetwork_chain_2025.crt
tmsh save sys config
For a cipher mismatch, update the SSL profile cipher string to include supported suites:
tmsh modify ltm profile client-ssl clientssl_solvethenetwork ciphers "ECDHE+AESGCM:ECDHE+AES:!RC4:!EXPORT:!aNULL"
tmsh save sys config
Verify the new certificate is active and not expired:
tmsh list sys file ssl-cert solvethenetwork.com.crt | grep expiration
expiration-date 1775000000
expiration-string Mar 31, 2026
Root Cause 5: Self-IP Conflict
Why It Happens
A Self-IP conflict occurs when another device on the network has been assigned the same IP address as the F5 BIG-IP's self-IP or floating self-IP. Because the self-IP resides on the same VLAN and subnet as the virtual server, a conflicting ARP entry on the upstream switch can redirect traffic destined for the VIP to the wrong device. The rogue device almost certainly does not know how to process the traffic, so all connections silently fail. This is especially dangerous in environments where IP Address Management (IPAM) is loosely enforced, where a new server has been provisioned without cross-checking existing allocations, or after a DR failover brings up a standby environment with the same IP assignments as production.
How to Identify It
From the BIG-IP, use arping to detect duplicate addresses on the external VLAN:
arping -I external 10.10.50.1 -c 5
ARPING 10.10.50.1 from 10.10.50.1 external
Unicast reply from 10.10.50.1 [00:50:56:AB:11:22] 1.012ms
Unicast reply from 10.10.50.1 [00:50:56:CD:33:44] 1.148ms
Unicast reply from 10.10.50.1 [00:50:56:AB:11:22] 0.994ms
Unicast reply from 10.10.50.1 [00:50:56:CD:33:44] 1.201ms
Sent 5 probes (5 broadcast(s))
Received 10 response(s)
Two distinct MAC addresses replying to a single IP is definitive proof of an address conflict. Verify on sw-infrarunbook-01:
show arp | include 10.10.50.1
Internet 10.10.50.1 0 0050.56ab.1122 ARPA Vlan50
Internet 10.10.50.1 0 0050.56cd.3344 ARPA Vlan50
Identify the offending device by tracing the MAC to a switch port:
show mac address-table | include 0050.56cd.3344
50 0050.56cd.3344 DYNAMIC Gi0/12
How to Fix It
Identify the device on Gi0/12 via your CMDB or LLDP neighbors, then reassign it to a non-conflicting IP address. After the change, clear the ARP cache on the upstream switch and verify resolution:
clear arp-cache interface Vlan50
clear ip arp 10.10.50.1
On the BIG-IP, gratuitously announce the correct ownership of the self-IP:
arping -A -I external -c 3 10.10.50.1
Confirm only a single MAC now responds:
arping -I external 10.10.50.1 -c 5
ARPING 10.10.50.1 from 10.10.50.1 external
Unicast reply from 10.10.50.1 [00:50:56:AB:11:22] 0.987ms
Unicast reply from 10.10.50.1 [00:50:56:AB:11:22] 0.943ms
Sent 5 probes (5 broadcast(s))
Received 5 response(s)
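The unique-MAC count can be scripted for repeatable before-and-after checks. A sketch that flags a conflict when more than one responder appears (the sample capture mirrors the conflicting output above; pipe live arping output through the same filter in practice):

```shell
# Count distinct MACs answering for one IP; more than one is a conflict.
arping_output='Unicast reply from 10.10.50.1 [00:50:56:AB:11:22] 1.012ms
Unicast reply from 10.10.50.1 [00:50:56:CD:33:44] 1.148ms
Unicast reply from 10.10.50.1 [00:50:56:AB:11:22] 0.994ms
Unicast reply from 10.10.50.1 [00:50:56:CD:33:44] 1.201ms'

macs=$(printf '%s\n' "$arping_output" | grep -o '\[[0-9A-Fa-f:]*\]' | sort -u | wc -l)
if [ "$macs" -gt 1 ]; then
    echo "CONFLICT: $macs devices answering for this IP"
else
    echo "ok: single responder"
fi
```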
Root Cause 6: iRule Dropping or Rejecting Traffic
Why It Happens
iRules are Tcl-based scripts that execute inline during traffic processing. An iRule containing an overly broad drop or reject statement, a logic error in a conditional branch, or an access control list that inadvertently matches legitimate source IPs will silently discard all matching connections. Engineers often attach iRules for logging or geo-blocking purposes without fully testing every branch. A single missing else clause can result in all traffic outside the explicitly permitted condition being dropped.
How to Identify It
tmsh list ltm virtual vs_solvethenetwork_443 rules
ltm virtual vs_solvethenetwork_443 {
rules {
irule_geo_block
}
}
tmsh list ltm rule irule_geo_block
ltm rule irule_geo_block {
when CLIENT_ACCEPTED {
if { [IP::addr [IP::client_addr] equals 172.16.5.0/24] } {
log local0. "Dropping client [IP::client_addr]"
drop
}
}
}
The subnet 172.16.5.0/24 is the primary client network — every connection is being dropped. Check LTM logs for the iRule drop messages:
grep "irule_geo_block" /var/log/ltm | tail -20
Apr 6 09:01:12 sw-infrarunbook-01 notice tmm: Rule irule_geo_block: Dropping client 172.16.5.44
Apr 6 09:01:13 sw-infrarunbook-01 notice tmm: Rule irule_geo_block: Dropping client 172.16.5.45
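Before detaching the rule, it helps to quantify the blast radius. A sketch that tallies drop log lines per client IP (the embedded sample mirrors the /var/log/ltm entries above; grep the live log in practice):

```shell
# Tally iRule drops per client IP, most-affected first.
log='Apr  6 09:01:12 sw-infrarunbook-01 notice tmm: Rule irule_geo_block: Dropping client 172.16.5.44
Apr  6 09:01:13 sw-infrarunbook-01 notice tmm: Rule irule_geo_block: Dropping client 172.16.5.45
Apr  6 09:01:14 sw-infrarunbook-01 notice tmm: Rule irule_geo_block: Dropping client 172.16.5.44'

printf '%s\n' "$log" | awk '/Dropping client/ { count[$NF]++ }
    END { for (ip in count) print count[ip], ip }' | sort -rn
# first line -> 2 172.16.5.44
```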
How to Fix It
Detach the iRule while you fix the logic:
tmsh modify ltm virtual vs_solvethenetwork_443 rules none
tmsh save sys config
Correct the iRule to invert the logic (block everything except the allowed range, or block only the genuinely unwanted ranges), then re-attach:
tmsh modify ltm virtual vs_solvethenetwork_443 rules { irule_geo_block }
tmsh save sys config
Root Cause 7: Connection Limit Exceeded
Why It Happens
BIG-IP allows administrators to cap the maximum concurrent connections on a virtual server, a pool, or individual pool members. When this ceiling is reached, new connections are refused even though the virtual server shows as available and all pool members are healthy. Connection limits are commonly set conservatively during initial deployment and never revisited as traffic grows organically. A sudden traffic spike — a marketing campaign, a batch job, or an application connection leak — can exhaust the limit in seconds.
How to Identify It
tmsh show ltm virtual vs_solvethenetwork_443 | grep -iE "conn|limit"
Current Connections 10000
Maximum Connections 10000
Total Connections 14823912
Connection Limit 10000
When Current Connections equals Connection Limit, the VIP is saturated. Also check whether individual pool members have their own limits:
tmsh show ltm pool pool_solvethenetwork_443 members detail | grep -E "Current|Limit|Conn"
How to Fix It
Raise or remove the limit on the virtual server (0 means unlimited):
tmsh modify ltm virtual vs_solvethenetwork_443 connection-limit 0
tmsh save sys config
Investigate the source of excess connections — whether a legitimate traffic spike, an application-level connection leak, or a DDoS — before permanently removing the ceiling. If the application is leaking connections, identifying and patching the leak is the correct long-term fix.
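The sizing arithmetic is simple enough to sketch. Here the 80% alert threshold and 50% headroom buffer are operational conventions used in this runbook, not BIG-IP defaults:

```shell
# Utilization and headroom math using the counters shown above.
limit=10000
current=10000
observed_peak=10000
pct=$(( current * 100 / limit ))
recommended_limit=$(( observed_peak * 3 / 2 ))   # peak + 50% headroom
echo "utilization: ${pct}%"                      # -> 100%
[ "$pct" -ge 80 ] && echo "ALERT: above 80% of connection limit"
echo "recommended new limit: ${recommended_limit}"   # -> 15000
```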
Root Cause 8: SNAT Misconfiguration Causing Asymmetric Routing
Why It Happens
If a virtual server uses SNAT automap or a SNAT pool, the F5 rewrites the source IP of the client connection before forwarding to the pool member. If the translated source address is not routable back to the BIG-IP — because the SNAT pool IP is not on a directly connected segment, or the backend server's default gateway points elsewhere — the backend's response packets will be delivered to the wrong next hop. The client's TCP handshake will never complete because the SYN-ACK never returns to the BIG-IP.
How to Identify It
tmsh list ltm virtual vs_solvethenetwork_443 source-address-translation
ltm virtual vs_solvethenetwork_443 {
source-address-translation {
pool snatpool_solvethenetwork_external
type snat
}
}
tmsh list ltm snatpool snatpool_solvethenetwork_external
ltm snatpool snatpool_solvethenetwork_external {
members {
10.10.30.50
}
}
If 10.10.30.50 is not reachable from the server VLAN (10.10.20.0/24), return traffic will be mis-routed. Verify reachability from a pool member:
ping -c 4 10.10.30.50 # run from pool member 10.10.20.10
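Whether the SNAT address sits on the server VLAN is a plain subnet-membership test. A self-contained sketch of that math (the ip_to_int and in_subnet helpers are illustrative; tools like ipcalc do the same job):

```shell
# Subnet-membership check with shell integer math on dotted quads.
ip_to_int() {
    local IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}
in_subnet() {   # in_subnet <ip> <network> <prefixlen>
    local ip net mask
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $(( ip & mask )) -eq $(( net & mask )) ] && echo inside || echo outside
}

in_subnet 10.10.30.50 10.10.20.0 24   # -> outside: replies will bypass the BIG-IP
in_subnet 10.10.20.5  10.10.20.0 24   # -> inside: safe SNAT source
```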
How to Fix It
Update the SNAT pool to use a self-IP address that resides on the same VLAN as the pool members, or switch to automap if the internal self-IP is already on that VLAN:
tmsh modify ltm virtual vs_solvethenetwork_443 source-address-translation { type automap }
tmsh save sys config
Prevention
Preventing virtual server outages on F5 BIG-IP requires a combination of operational discipline, monitoring depth, and change control hygiene.
- State verification after maintenance: After every maintenance window, run tmsh show ltm virtual and confirm all virtual servers show enabled and available before closing the change ticket. Automate this check in your post-change validation runbook.
- Certificate lifecycle management: Alert at least 30 days before any SSL certificate expires. Use the BIG-IP's built-in expiry reporting (tmsh list sys file ssl-cert | grep expiration) and feed it into your monitoring platform. Automate renewal with ACME or internal PKI tooling wherever possible.
- IPAM enforcement for Self-IPs: Register all F5 self-IPs, floating IPs, and VIP addresses in your IPAM system with a permanent reservation marked as infrastructure — do not reassign. Require IPAM allocation approval before any new server receives an IP on a shared subnet.
- Health monitor tuning: Align monitor intervals and timeouts with measured application response times under load. Use application-layer monitors (HTTP/HTTPS with a real health endpoint) rather than ICMP. Set timeout to at least 3x the interval.
- iRule peer review: Require peer review for every iRule change. Stage all iRule modifications on a non-production virtual server first. Use tmsh show ltm rule statistics after deployment to confirm events fire as expected.
- Routing validation after failover: After every planned or unplanned HA failover event, immediately run a synthetic transaction from the client network and verify the traceroute path lands on the correct BIG-IP unit.
- Connection limit headroom: Review connection limits quarterly against observed peak counts. Maintain at least a 50% headroom buffer above the historical peak. Set alerts at 80% utilization so you act before the limit is hit.
- Centralized log alerting: Stream /var/log/ltm to a SIEM. Create alerts for pool member down, SSL handshake failure, connection limit reached, and virtual server disabled events. Early detection cuts mean time to repair dramatically.
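The post-maintenance state check in the first bullet can be scripted as a pass/fail gate. A hedged sketch that filters a simplified stand-in for tmsh show ltm virtual field-fmt output (the here-string simulates two virtual servers; feed the live command instead):

```shell
# Post-change gate: fail if any VS is not both available and enabled.
show_output='ltm virtual vs_solvethenetwork_443
    status.availability-state available
    status.enabled-state enabled
ltm virtual vs_solvethenetwork_80
    status.availability-state offline
    status.enabled-state disabled'

bad=$(printf '%s\n' "$show_output" | awk '
    /^ltm virtual/                            { vs = $3 }
    /availability-state/ && $2 != "available" { print vs; next }
    /enabled-state/      && $2 != "enabled"   { print vs }' | sort -u)

if [ -n "$bad" ]; then
    echo "post-change check FAILED for: $bad"
    # exit 1   # uncomment in a real runbook to block ticket closure
else
    echo "post-change check passed"
fi
```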
Frequently Asked Questions
Q: How do I check the health of all virtual servers on an F5 BIG-IP at once?
A: Run tmsh show ltm virtual without specifying a name. This outputs every virtual server with its availability state, enabled state, and current connection count. Filter for offline objects (including the name line above each match) with tmsh show ltm virtual | grep -B3 -A4 "offline".
Q: What is the difference between a virtual server being "disabled" versus "unavailable"?
A: Disabled means an administrator has administratively turned off the object — the data plane ignores it completely regardless of pool health. Unavailable means the virtual server is enabled but has no healthy pool members to send traffic to. The remediation steps for each are entirely different.
Q: How do I test connectivity to the VIP directly from the BIG-IP without involving external clients?
A: Source the test from the BIG-IP's external self-IP to simulate an inbound client: curl -kv --interface 10.10.50.1 https://10.10.50.100/. For packet-level inspection, run tmsh run util tcpdump -i external "host 10.10.50.100 and tcp port 443".
Q: Can a virtual server show as "available" while all pool members are down?
A: Yes. The virtual server availability indicator reflects only the VS-level state. If the virtual server has no default pool configured and relies entirely on an iRule for traffic forwarding, it can appear green while pool members are all offline. Always check pool and member state independently using
tmsh show ltm pool.
Q: What logs should I check first when an F5 virtual server stops responding?
A: Start with /var/log/ltm — this captures pool member state changes, monitor failures, SSL errors, and iRule exceptions. For system-level problems check /var/log/messages. If APM (Access Policy Manager) is in use, also review /var/log/apm. Use tail -f /var/log/ltm and trigger a test connection to observe real-time events.
Q: How do I tell whether traffic is actually arriving at the F5 for the VIP?
A: Check the virtual server statistics counters with tmsh show ltm virtual vs_solvethenetwork_443. Watch whether the client-side Bits In and Total Connections counters increment while clients report failures. If counters are static, traffic is not reaching the F5 at all — investigate routing, firewall, or ARP upstream. If counters are incrementing but connections are not completing, the problem is on the BIG-IP or backend.
Q: What is a floating self-IP and why does it matter during failover?
A: A floating self-IP is an address shared by both units in an HA pair that always lives on the active unit. Pool members must use the floating self-IP as their default gateway — not the unit-specific self-IP. If pool members point to a non-floating self-IP, after a failover their return traffic routes to the standby unit, causing asymmetric routing and dropped connections.
Q: How do I force a pool member online without restarting the application?
A: Use tmsh modify ltm pool pool_solvethenetwork_443 members modify { 10.10.20.10:8443 { state user-up } }. This overrides the monitor state temporarily. The health monitor will still run and will mark the member down again if the application continues to fail — use this only as a short-term measure while investigating the root cause.
Q: Can a hardware or licensing issue cause a virtual server to stop responding?
A: Yes. An expired BIG-IP license can disable traffic processing features. Check license status with tmsh show sys license. Also run tmsh show sys performance all-stats to confirm the system is not CPU or memory constrained — a saturated TMM will cause packet drops even when configuration is correct.
Q: Should I reboot the BIG-IP to resolve a virtual server outage?
A: Almost never. A reboot triggers failover in an HA pair, disrupts all active connections, and does not fix configuration, routing, or certificate problems. Work through the diagnostic sequence — VIP state, pool health, routing, profiles, Self-IP conflict — before considering any service restart. A targeted config reload (tmsh load sys config) is far safer than a full reboot if you suspect a configuration drift issue.
Q: How do I confirm a Self-IP conflict has been fully resolved?
A: After reassigning the conflicting device and clearing the switch ARP cache, run arping -I external 10.10.50.1 -c 10 from the BIG-IP. All replies must come from a single MAC address — the F5 interface MAC. Also verify traffic statistics on the virtual server begin incrementing normally by watching tmsh show ltm virtual vs_solvethenetwork_443 in real time.
Q: What is the safest way to test an iRule change before applying it to a production virtual server?
A: Create a staging virtual server on a non-production VIP address (e.g., 10.10.50.199) with identical pool and profile configuration. Attach the new iRule to the staging VS only. Send synthetic test traffic and verify all expected paths through the iRule logic produce the correct outcome. Only after validation, apply the iRule to production and monitor LTM logs closely for the first 15 minutes.
