Symptoms
You're staring at an error you didn't expect. Maybe your application won't start. Maybe a cron job that ran fine yesterday is now throwing exceptions. The most common things you'll see when Redis connections go wrong:
- Could not connect to Redis at 127.0.0.1:6379: Connection refused
- NOAUTH Authentication required.
- WRONGPASS invalid username-password pair or user is disabled.
- ERR max number of clients reached
- Error: Connection timeout
- Error: Protocol error, got "\xff" as reply type byte (Redis is up, but TLS isn't configured on the client)
- Redis CLI hanging indefinitely with no output
- Application logs showing repeated reconnect attempts with exponential backoff
The frustrating thing about Redis connection problems is that several completely different root causes produce almost identical symptoms on the application side. Your app just sees "can't connect" — it's up to you to figure out why. Work through the causes below systematically rather than jumping to conclusions, and you'll resolve it faster every time.
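As a rough triage aid, the symptom strings above can be mapped to the root cause worth checking first. The following is a minimal Python sketch, not part of any Redis tooling: the names `HINTS` and `likely_root_cause` and the wording of the hints are illustrative assumptions.

```python
# Hypothetical triage helper: maps an error string to the cause to check first.
# The patterns are the symptom strings listed above; the hints are illustrative.
HINTS = [
    ("connection refused", "Redis not running, bind address, or wrong host/port"),
    ("noauth", "client sent no password (requirepass is set)"),
    ("wrongpass", "client sent the wrong password"),
    ("max number of clients reached", "maxclients limit hit"),
    ("protocol error", "plain-text client talking to a TLS port"),
    ("timeout", "firewall drop or unreachable host"),
    ("timed out", "firewall drop or unreachable host"),
]


def likely_root_cause(message: str) -> str:
    """Return a hint for the first known pattern found in the error message."""
    lowered = message.lower()
    for pattern, hint in HINTS:
        if pattern in lowered:
            return hint
    return "unknown: check service status and logs first"
```

A hint only narrows the search. Several causes share the same message, so the checks in each section below are still needed to confirm it.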
Root Cause 1: Redis Is Not Running
This is the first thing to check, and I've seen it burn experienced engineers who assume the service must be up. Redis crashes more than people expect — out-of-memory kills, a failed restart after a config change, a botched package upgrade that left the systemd unit in a failed state. Don't skip this step just because it feels too obvious.
Why It Happens
The Linux OOM killer will terminate Redis if the system runs low on memory. Redis will also fail to start if the configuration file has a syntax error, if the bind address is already in use by another process, or if the data directory it needs to write to doesn't exist or has wrong permissions. After OS upgrades or package manager operations, the systemd unit can end up disabled or in a failed state even if the binary is still installed.
How to Identify It
Start with a direct connection attempt:
redis-cli -h 127.0.0.1 -p 6379 ping
If Redis isn't running, you'll get:
Could not connect to Redis at 127.0.0.1:6379: Connection refused
Then check the service status:
systemctl status redis.service
A crashed Redis looks like this:
● redis.service - Advanced key-value store
Loaded: loaded (/lib/systemd/system/redis.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Sun 2026-04-19 08:12:34 UTC; 3min ago
Process: 14822 ExecStart=/usr/bin/redis-server /etc/redis/redis.conf (code=exited, status=1/FAILURE)
Main PID: 14822 (code=exited, status=1/FAILURE)
Apr 19 08:12:34 sw-infrarunbook-01 redis-server[14822]: # Creating Server TCP listening socket 127.0.0.1:6379: bind: Address already in use
Apr 19 08:12:34 sw-infrarunbook-01 systemd[1]: redis.service: Main process exited, code=exited, status=1/FAILURE
Dig further with the journal:
journalctl -u redis.service --since "1 hour ago" --no-pager
If Redis logs to a file instead:
tail -100 /var/log/redis/redis-server.log
If you suspect the OOM killer, check dmesg:
dmesg | grep -i "oom\|killed"
[ 4821.093421] Out of memory: Kill process 14822 (redis-server) score 892 or sacrifice child
[ 4821.093427] Killed process 14822 (redis-server) total-vm:1843200kB, anon-rss:921600kB
How to Fix It
Start or restart the service and verify it came up cleanly:
systemctl start redis.service
systemctl status redis.service
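If it fails to start, run the server in the foreground against the same configuration file so that a configuration error is reported directly on the terminal (stop it with Ctrl-C if it starts cleanly):
redis-server /etc/redis/redis.conf --daemonize no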
Make sure the unit is enabled so it survives reboots:
systemctl enable redis.service
If the OOM killer is the root cause, restarting alone won't fix it; it'll just crash again. Set a memory ceiling in /etc/redis/redis.conf so Redis manages its own memory budget:
maxmemory 512mb
maxmemory-policy allkeys-lru
Restart after updating the config and confirm the limits are active:
redis-cli -h 127.0.0.1 INFO memory | grep maxmemory
maxmemory:536870912
maxmemory_policy:allkeys-lru
Root Cause 2: Bind Address Blocking Remote Connections
By default, Redis binds only to 127.0.0.1. This is a sensible security default, but it trips up anyone who installs Redis expecting to connect from another host. In my experience, this is the most common misconfiguration ticket I see from teams that just stood up a new Redis node. The service is running perfectly, but anything outside the local machine gets a hard refusal.
Why It Happens
When Redis starts, it listens only on the interfaces specified by the bind directive in redis.conf. A fresh install from most Linux distributions will have:
bind 127.0.0.1 -::1
That means only loopback connections are accepted. Any connection attempt from another IP — even a different interface on the same machine — gets rejected before Redis's auth layer ever sees it.
How to Identify It
From the Redis host itself, everything looks fine:
redis-cli -h 127.0.0.1 ping
PONG
But from a remote host or a different source IP, you get:
redis-cli -h 10.10.20.45 -p 6379 ping
Could not connect to Redis at 10.10.20.45:6379: Connection refused
Confirm what interfaces Redis is actually listening on:
ss -tlnp | grep 6379
LISTEN 0 511 127.0.0.1:6379 0.0.0.0:* users:(("redis-server",pid=14901,fd=6))
That 127.0.0.1:6379 tells you everything: Redis won't accept a packet from any other address.
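The same check can be scripted when you need to run it across many hosts. This is a minimal Python sketch that assumes the column layout of the `ss -tlnp` output shown above; the function names `redis_listen_addresses` and `loopback_only` are hypothetical.

```python
def redis_listen_addresses(ss_output: str, port: int = 6379) -> list[str]:
    """Return the local addresses redis-server is listening on for `port`."""
    addresses = []
    for line in ss_output.splitlines():
        fields = line.split()
        # Assumed columns: State Recv-Q Send-Q Local:Port Peer:Port Process
        if len(fields) < 5 or fields[0] != "LISTEN" or "redis-server" not in line:
            continue
        host, _, local_port = fields[3].rpartition(":")
        if local_port == str(port):
            addresses.append(host)
    return addresses


def loopback_only(addresses: list[str]) -> bool:
    """True when Redis is listening, but only on loopback addresses."""
    return bool(addresses) and all(a in ("127.0.0.1", "[::1]") for a in addresses)
```

If `loopback_only` returns True on a host that remote clients are supposed to reach, the bind directive is the first thing to fix.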
How to Fix It
Edit /etc/redis/redis.conf and extend the bind directive to include the specific private interface you want to expose:
# Before
bind 127.0.0.1 -::1
# After — add the specific RFC 1918 address for your private network
bind 127.0.0.1 10.10.20.45
Don't bind to 0.0.0.0 without a firewall in front of Redis. Use the explicit private IP you need. Restart Redis and verify:
systemctl restart redis.service
ss -tlnp | grep 6379
LISTEN 0 511 127.0.0.1:6379 0.0.0.0:* users:(("redis-server",pid=15012,fd=6))
LISTEN 0 511 10.10.20.45:6379 0.0.0.0:* users:(("redis-server",pid=15012,fd=7))
Now test the remote connection:
redis-cli -h 10.10.20.45 -p 6379 ping
PONG
Root Cause 3: requirepass Not Matching
Authentication errors are deceptively simple but surprisingly common in multi-team environments where the Redis password gets rotated in one place without updating every client. The server is healthy and reachable — it just won't talk until you authenticate correctly.
Why It Happens
When requirepass is set in redis.conf, every connecting client must issue AUTH <password> before any other command will succeed. If the client sends the wrong password, sends no password at all, or the application config still references a stale credential, you get an immediate rejection. This also surfaces when an application's secrets manager is updated but the service isn't restarted to pick up the new value.
How to Identify It
Connecting without a password:
redis-cli -h 127.0.0.1 ping
NOAUTH Authentication required.
Connecting with the wrong password:
redis-cli -h 127.0.0.1 -a wrongpassword ping
WRONGPASS invalid username-password pair or user is disabled.
Verify the password is set and check its value in the config file:
grep -i "requirepass" /etc/redis/redis.conf
requirepass r3d1s@S3cur3Pass!
If you already have a working client connection, you can query the running config directly:
redis-cli -h 127.0.0.1 -a r3d1s@S3cur3Pass! CONFIG GET requirepass
1) "requirepass"
2) "r3d1s@S3cur3Pass!"
In application logs, this typically surfaces as:
NOAUTH Authentication required.
Redis::CommandError: NOAUTH Authentication required.
redis.exceptions.AuthenticationError: Authentication required.
How to Fix It
Ensure the connecting client is passing the correct password. For redis-cli:
redis-cli -h 127.0.0.1 -a r3d1s@S3cur3Pass! ping
PONG
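For application configuration, update the Redis connection URL to include the password. Reserved characters in the password must be percent-encoded, otherwise the @ inside this password is read as the end of the credentials (@ becomes %40, ! becomes %21). A standard Redis URL with authentication looks like:
redis://:r3d1s%40S3cur3Pass%21@10.10.20.45:6379/0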
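The example password contains an @, which is also the character that separates credentials from the host in a URL, so building the URL by plain string concatenation is fragile. A minimal sketch using only Python's standard library follows; `build_redis_url` is a hypothetical helper name, and the empty username matches the URL form shown above.

```python
from urllib.parse import quote, unquote, urlsplit


def build_redis_url(password: str, host: str, port: int = 6379, db: int = 0) -> str:
    """Build a redis:// URL with the password percent-encoded."""
    # safe="" encodes every reserved character, including "@" and "!"
    return f"redis://:{quote(password, safe='')}@{host}:{port}/{db}"


url = build_redis_url("r3d1s@S3cur3Pass!", "10.10.20.45")
print(url)  # redis://:r3d1s%40S3cur3Pass%21@10.10.20.45:6379/0

# Round trip: the host parses correctly and the password decodes to the original
parts = urlsplit(url)
assert parts.hostname == "10.10.20.45"
assert unquote(parts.password) == "r3d1s@S3cur3Pass!"
```

Whether a given client library decodes the percent-encoding for you is worth confirming in its documentation before relying on it.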
If you need to rotate the password without a service restart, use CONFIG SET. The change takes effect immediately for new connections, and existing authenticated connections aren't dropped:
redis-cli -h 127.0.0.1 -a r3d1s@S3cur3Pass! CONFIG SET requirepass newpassword123
Immediately write the new value to redis.conf as well so the next restart doesn't revert it:
sed -i 's/^requirepass .*/requirepass newpassword123/' /etc/redis/redis.conf
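Then update every application that connects to Redis right away. Connections that are already authenticated keep working, but any client that reconnects with the old password is rejected immediately, not at the next restart. Rolling this change without a coordinated update of all clients is how you end up with a partial outage.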
Root Cause 4: Max Clients Reached
This one hits production systems at the worst possible moment — usually during a traffic spike or after a connection leak accumulates silently over days. The server is healthy, the network is fine, auth is correct, but new connections are being flatly refused. I've seen this take down staging environments for hours because the error message looks unrelated to a connection pool problem at first glance.
Why It Happens
Redis enforces a hard ceiling on simultaneous client connections via the maxclients setting in redis.conf (default: 10000). When that limit is hit, any new connection attempt is rejected immediately. The most common causes are connection leaks in application code (connections opened but never returned to the pool) and connection pools misconfigured without a maximum size. Scale-out events can also trigger this: doubling your app instances doubles the number of Redis connections without any changes to Redis itself.
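The scale-out arithmetic is easy to check ahead of time. The sketch below is a back-of-the-envelope calculation in Python; the function names and the instance and pool sizes in the example are made-up numbers, not recommendations.

```python
def redis_connection_demand(app_instances: int, pool_max_per_instance: int,
                            other_clients: int = 0) -> int:
    """Worst-case number of simultaneous connections the fleet can open."""
    return app_instances * pool_max_per_instance + other_clients


def fits_under_maxclients(demand: int, maxclients: int = 10000) -> bool:
    """True when the worst case stays strictly below the Redis limit."""
    return demand < maxclients


# Hypothetical fleet: 40 instances, each with a pool capped at 200 connections
before = redis_connection_demand(40, 200)   # 8000
after = redis_connection_demand(80, 200)    # 16000 after doubling the instances
```

With these example numbers the fleet fits under the default limit before the scale-out and exceeds it afterwards, even though nothing changed on the Redis side.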
How to Identify It
The error is unmistakable:
redis-cli -h 127.0.0.1 -a r3d1s@S3cur3Pass! ping
ERR max number of clients reached
Check the current connected client count against the limit:
redis-cli -h 127.0.0.1 -a r3d1s@S3cur3Pass! INFO clients
# Clients
connected_clients:10000
cluster_connections:0
maxclients:10000
client_recent_max_input_buffer:20480
client_recent_max_output_buffer:0
blocked_clients:0
tracking_clients:0
clients_in_timeout_table:0
connected_clients exactly at maxclients is your confirmation. To see who's connected and spot idle connections, pull the client list:
redis-cli -h 127.0.0.1 -a r3d1s@S3cur3Pass! CLIENT LIST
id=1482 addr=10.10.20.12:54201 laddr=10.10.20.45:6379 fd=10 name= age=3601 idle=3601 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=20486 argv-mem=10 obl=0 oll=0 omem=0 tot-mem=22314 rbs=16384 rbp=16384 resp=2 cmd=NULL user=default
id=1483 addr=10.10.20.12:54202 laddr=10.10.20.45:6379 fd=11 name= age=3602 idle=3602 flags=N db=0 sub=0 ...
Connections with very high idle values (seconds since the last command) are dead weight. A connection idle for 3601 seconds is almost certainly leaked or abandoned.
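With thousands of connections, scanning CLIENT LIST by eye isn't practical. This minimal Python sketch filters the output by idle time, assuming the space-separated key=value layout shown above; the function names and the one-hour default threshold are illustrative choices.

```python
def parse_client_list(output: str) -> list[dict[str, str]]:
    """Turn CLIENT LIST output into one dict of fields per client."""
    clients = []
    for line in output.splitlines():
        # Each line is space-separated key=value pairs; other tokens are skipped
        fields = dict(pair.split("=", 1) for pair in line.split() if "=" in pair)
        if "id" in fields:
            clients.append(fields)
    return clients


def idle_clients(output: str, min_idle_seconds: int = 3600) -> list[tuple[str, str, int]]:
    """Return (id, addr, idle) for clients idle at least `min_idle_seconds`."""
    result = []
    for client in parse_client_list(output):
        idle = int(client.get("idle", "0"))
        if idle >= min_idle_seconds:
            result.append((client["id"], client.get("addr", "?"), idle))
    return result
```

Grouping the result by the address part of `addr` usually points at the application host that is leaking connections.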
How to Fix It
To get immediate relief, kill specific idle connections by their client ID:
# Kill a specific connection by ID
redis-cli -h 127.0.0.1 -a r3d1s@S3cur3Pass! CLIENT KILL ID 1482
# Verify the client count dropped
redis-cli -h 127.0.0.1 -a r3d1s@S3cur3Pass! INFO clients | grep connected_clients
The durable fix is enabling connection timeouts in /etc/redis/redis.conf so Redis automatically reclaims stale connections:
timeout 300
tcp-keepalive 60
timeout 300 closes connections that have been idle for 5 minutes. tcp-keepalive 60 sends TCP keepalive probes every 60 seconds to detect dead connections from clients that disappeared without a clean close. Apply these without a restart using CONFIG SET:
redis-cli -h 127.0.0.1 -a r3d1s@S3cur3Pass! CONFIG SET timeout 300
redis-cli -h 127.0.0.1 -a r3d1s@S3cur3Pass! CONFIG SET tcp-keepalive 60
To raise the client ceiling temporarily while you chase down the leak:
redis-cli -h 127.0.0.1 -a r3d1s@S3cur3Pass! CONFIG SET maxclients 20000
Then update redis.conf to persist the value. But don't stop there: trace the connection leak in your application. Verify the Redis client library is using a connection pool with bounded size and idle connection cleanup. Unbounded pools and fire-and-forget patterns without proper close handling are the root cause here more than anything on the Redis side.
Root Cause 5: TLS Not Configured on the Client
Redis has supported TLS since version 6.0, and teams are enabling it more often now for compliance and in-transit encryption. When TLS is enabled on the server but not configured on the client, you get cryptic protocol errors that aren't immediately obvious if you're not expecting TLS to be in play.
Why It Happens
When Redis is configured to require TLS, a plain TCP connection gets rejected at the protocol level. The client sends a regular Redis protocol greeting; Redis expects a TLS ClientHello. The result is a protocol mismatch that manifests as SSL errors, connection resets, or a garbled response byte. This commonly happens during a security hardening pass where TLS is enabled on Redis without a corresponding update to all application clients.
How to Identify It
A plain redis-cli connection to a TLS-enabled Redis port:
redis-cli -h 10.10.20.45 -p 6380 ping
Error: Protocol error, got "\xff" as reply type byte
In application logs you might see:
SSL_connect: SSL_ERROR_SYSCALL
error:0A000410:SSL routines::sslv3 alert handshake failure
Connection reset by peer
redis.exceptions.ConnectionError: Error 104 connecting to 10.10.20.45:6380. Connection reset by peer.
Verify TLS is enabled in the Redis configuration:
grep -E "tls-port|tls-cert-file|tls-key-file|tls-ca-cert-file" /etc/redis/redis.conf
tls-port 6380
tls-cert-file /etc/redis/tls/redis.crt
tls-key-file /etc/redis/tls/redis.key
tls-ca-cert-file /etc/redis/tls/ca.crt
Test the TLS handshake independently using openssl to isolate whether the issue is the certificate or the client configuration:
openssl s_client -connect 10.10.20.45:6380 -CAfile /etc/redis/tls/ca.crt
CONNECTED(00000003)
depth=1 CN = Redis-CA
verify return:1
depth=0 CN = sw-infrarunbook-01
verify return:1
---
Certificate chain
0 s:CN = sw-infrarunbook-01
i:CN = Redis-CA
a:PKEY: rsaEncryption, 2048 (bit); sigalg: RSA-SHA256
---
SSL handshake has read 1432 bytes and written 400 bytes
Verification: OK
If the openssl handshake succeeds but redis-cli doesn't, the TLS layer is working — the client just isn't configured to use it.
How to Fix It
For redis-cli, pass the TLS flags explicitly:
redis-cli -h 10.10.20.45 -p 6380 \
--tls \
--cacert /etc/redis/tls/ca.crt \
--cert /etc/redis/tls/client.crt \
--key /etc/redis/tls/client.key \
-a r3d1s@S3cur3Pass! ping
PONG
For Python applications using redis-py:
import redis
import ssl
r = redis.Redis(
    host='10.10.20.45',
    port=6380,
    password='r3d1s@S3cur3Pass!',
    ssl=True,
    ssl_certfile='/etc/redis/tls/client.crt',
    ssl_keyfile='/etc/redis/tls/client.key',
    ssl_ca_certs='/etc/redis/tls/ca.crt',
    ssl_cert_reqs=ssl.CERT_REQUIRED,
)
If you're getting certificate verification errors specifically — the TLS handshake completes but the cert is rejected — check that the CA on the client matches the CA that signed the Redis server cert, and that the server cert's CN or SAN includes the hostname or IP you're connecting to. Inspect the cert directly:
openssl x509 -in /etc/redis/tls/redis.crt -noout -text | grep -A2 "Subject Alternative Name\|Subject:"
Subject: CN = sw-infrarunbook-01
X509v3 Subject Alternative Name:
IP Address:10.10.20.45, DNS:sw-infrarunbook-01.solvethenetwork.com
A common mistake is generating the cert with the IP as the CN but then connecting by FQDN. The SAN field is what matters for modern TLS validation — make sure both the IP and the DNS name are present in the SAN if your clients use both.
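To make the SAN rule concrete, here is a deliberately simplified Python sketch of the matching logic. It is an approximation for reasoning about your own certificates, not the full validation a TLS library performs, and `san_covers` is a hypothetical name.

```python
import ipaddress


def san_covers(target: str, dns_names: list[str], ip_addresses: list[str]) -> bool:
    """Return True if `target` (hostname or IP) matches an entry in the SAN lists."""
    try:
        target_ip = ipaddress.ip_address(target)
    except ValueError:
        target_ip = None

    if target_ip is not None:
        # IP targets are matched only against IP Address entries, never DNS ones
        return any(ipaddress.ip_address(ip) == target_ip for ip in ip_addresses)

    host = target.lower().rstrip(".")
    for name in dns_names:
        name = name.lower().rstrip(".")
        if name == host:
            return True
        # A leading "*." wildcard covers exactly one leftmost label
        if name.startswith("*.") and "." in host and host.split(".", 1)[1] == name[2:]:
            return True
    return False
```

With the SAN shown above, connecting by 10.10.20.45 or by the full DNS name matches, while the short name sw-infrarunbook-01 does not, because it appears only in the CN.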
Root Cause 6: Firewall Rules Blocking Port 6379
Even when Redis is running and bound to the right address, a firewall rule can silently drop packets before they ever reach the process. The tell here is the behavior of the failure: bind address issues return an immediate connection refused because the OS kernel rejects the connection. A firewall drop causes the connection to hang until it times out. That distinction is useful during triage.
Why It Happens
iptables, nftables, UFW, or cloud security groups often default to deny-all on new interfaces. Loopback traffic is normally allowed, so local tests pass, but nobody added an allow rule for port 6379 on the private network interface that came later. This is especially common when Redis is migrated from a shared app server to a dedicated host and the firewall config doesn't follow the move.
How to Identify It
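A connection that hangs rather than immediately refusing points to a drop rule. Wrap the attempt in coreutils timeout so it gives up after 5 seconds; exit status 124 means the command was killed before redis-cli got any answer:
timeout 5 redis-cli -h 10.10.20.45 -p 6379 ping; echo "exit: $?"
exit: 124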
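The refused-versus-hang distinction can also be tested at the TCP level, without redis-cli in the picture. This is a minimal Python sketch using only the standard library; `probe_tcp` is a hypothetical name and the 5-second default is an arbitrary choice.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 5.0) -> str:
    """Classify a TCP connection attempt as 'open', 'refused', or 'timeout'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered with a reset: nothing is listening on that address/port
        return "refused"
    except (socket.timeout, TimeoutError):
        # No answer at all within the timeout: typical of a firewall drop rule
        return "timeout"
    except OSError as exc:
        # Anything else, for example no route to host or a DNS failure
        return f"error: {exc}"
```

A result of "refused" points back at the service or bind address, while "timeout" points at a firewall or routing problem between the client and the Redis host.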
Check the INPUT chain on the Redis host for port 6379:
iptables -L INPUT -n -v | grep 6379
No output means there's no explicit allow rule. Check UFW if that's the active firewall manager:
ufw status verbose | grep 6379
How to Fix It
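Allow the specific source subnet to reach port 6379. With iptables, insert the rule at the top of the chain so it is evaluated before any existing DROP or REJECT rule:
iptables -I INPUT -p tcp -s 10.10.20.0/24 --dport 6379 -j ACCEPT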
With UFW:
ufw allow from 10.10.20.0/24 to any port 6379 proto tcp
Never open Redis to the public internet. Always restrict to the specific RFC 1918 subnet your application servers live on. Persist the iptables rule so it survives reboots, either through iptables-save/iptables-restore or by writing it into your firewall management tool of choice.
Root Cause 7: Wrong Port or Hostname
Simple, but worth covering explicitly — particularly in environments running Redis Sentinel, Redis Cluster, or where the port was moved from the default 6379 to 6380 for a TLS migration. I've spent longer than I'd like to admit tracing a connection error that turned out to be an application config still pointing at 6379 after the server was reconfigured for TLS on 6380.
How to Identify and Fix It
Confirm the actual ports Redis is listening on:
ss -tlnp | grep redis
LISTEN 0 511 127.0.0.1:6379 0.0.0.0:* users:(("redis-server",pid=15012,fd=6))
LISTEN 0 511 10.10.20.45:6380 0.0.0.0:* users:(("redis-server",pid=15012,fd=7))
Cross-reference against the application's connection string. If the connection is by FQDN, verify DNS resolves to the right IP:
dig +short sw-infrarunbook-01.solvethenetwork.com
10.10.20.45
A stale DNS entry resolving to a decommissioned or reassigned host can produce the same connection refused error as a misconfigured bind address, or a timeout if nothing answers at that IP. Always confirm the full path (DNS resolution, port, and bind) before assuming the Redis process itself is the problem.
Prevention
Most Redis connection incidents are preventable with the right defaults set from the start, combined with proactive monitoring that alerts before users notice.
Set a maxmemory limit with an eviction policy appropriate for your workload so Redis never gets OOM-killed by the kernel. Enable timeout and tcp-keepalive from day one to automatically reclaim idle connections before they accumulate into a maxclients problem. Use requirepass with a strong password stored in your secrets manager, and rotate it through your secrets manager and automation, not by hand-editing config files on individual nodes.
For monitoring, track these metrics from INFO stats and INFO clients in your metrics stack: connected_clients, rejected_connections, and instantaneous_ops_per_sec. An alert on rejected_connections > 0 gives you advance warning of a client limit problem before it becomes an outage. Check it periodically yourself during normal operations:
redis-cli -h 10.10.20.45 -a r3d1s@S3cur3Pass! INFO stats | grep rejected_connections
rejected_connections:0
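One caveat when wiring this into alerting: rejected_connections is a cumulative counter that only returns to zero when the server restarts or its stats are reset, so a plain "greater than zero" rule keeps firing after the first incident. A common alternative is to alert on the increase between two samples. The Python sketch below assumes the key:value layout of INFO output; `parse_info` and `new_rejections` are hypothetical names.

```python
def parse_info(output: str) -> dict[str, str]:
    """Parse INFO output into a dict, skipping blank lines and '# Section' headers."""
    fields = {}
    for line in output.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key] = value
    return fields


def new_rejections(previous_info: str, current_info: str) -> int:
    """Connections rejected between two INFO stats samples."""
    before = int(parse_info(previous_info).get("rejected_connections", 0))
    after = int(parse_info(current_info).get("rejected_connections", 0))
    # A lower value than last time means the counter was reset in between,
    # so everything counted since then is new
    return after if after < before else after - before
```

Alerting when `new_rejections` is above zero for the latest sampling interval tells you the limit is being hit now, not at some point since the last restart.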
Automate a basic health check that runs redis-cli ping and alerts on anything other than PONG. Combine it with a check that explicitly tests authentication so you catch credential drift before applications do. A two-line shell script in your monitoring agent is enough; there's no need for anything complex.
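Where the monitoring agent runs Python instead of shell, the same ping-plus-auth check can be sketched with only the standard library by speaking the Redis wire protocol directly. This is a minimal sketch for a plain-text (non-TLS) port; the function names and returned status strings are illustrative assumptions.

```python
import socket
from typing import Optional


def encode_command(*parts: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [f"*{len(parts)}\r\n".encode()]
    for part in parts:
        data = part.encode()
        out.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(out)


def interpret_reply(reply: bytes) -> str:
    """Map the first bytes of a server reply to a short status string."""
    if reply.startswith(b"+PONG"):
        return "ok"
    if reply.startswith(b"-NOAUTH"):
        return "auth required"
    if reply.startswith(b"-WRONGPASS"):
        return "bad credentials"
    if reply.startswith(b"-ERR max number of clients"):
        return "maxclients reached"
    return "unexpected reply"


def health_check(host: str, port: int = 6379, password: Optional[str] = None,
                 timeout: float = 3.0) -> str:
    """Optionally AUTH, then PING, and report the outcome as a status string."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            if password is not None:
                sock.sendall(encode_command("AUTH", password))
                auth_reply = sock.recv(1024)
                if not auth_reply.startswith(b"+OK"):
                    return interpret_reply(auth_reply)
            sock.sendall(encode_command("PING"))
            return interpret_reply(sock.recv(1024))
    except OSError as exc:
        # Refused, timed out, reset, unreachable: all count as a failed check
        return f"unreachable: {exc}"
```

Running `health_check` once with the password and once without separates "Redis is down" from "the credential drifted", which is the distinction the alert needs to carry.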
For TLS, include cert expiry monitoring in your PKI workflow. Check expiry dates in your automation:
openssl x509 -enddate -noout -in /etc/redis/tls/redis.crt
notAfter=Apr 19 08:00:00 2027 GMT
Alert at 30 days out. An expired Redis cert is a silent outage for every application using TLS connections — and it always seems to expire on a Friday.
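The 30-day threshold is straightforward to compute from the notAfter line. This minimal Python sketch assumes the date format shown above and English month abbreviations; `days_until_expiry` is a hypothetical name.

```python
from datetime import datetime, timezone
from typing import Optional


def days_until_expiry(enddate_line: str, now: Optional[datetime] = None) -> int:
    """Parse a 'notAfter=...' line from `openssl x509 -enddate` into whole days left."""
    value = enddate_line.strip().split("=", 1)[1]
    if value.endswith(" GMT"):
        value = value[: -len(" GMT")]
    # Assumed layout after trimming: "Apr 19 08:00:00 2027"
    expires = datetime.strptime(value, "%b %d %H:%M:%S %Y").replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    # Negative values mean the certificate has already expired
    return (expires - now).days


def needs_renewal_alert(days_left: int, threshold_days: int = 30) -> bool:
    """True once the certificate is within the alert window."""
    return days_left <= threshold_days
```

Feeding it the output of the openssl command above from a scheduled job gives a number your monitoring can threshold on.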
Finally, document your Redis bind addresses, ports, authentication method, and TLS configuration in your runbook. Sounds obvious. But the single biggest factor in fast incident resolution is whether the on-call engineer can find the connection details and expected behavior in under two minutes, without needing to call someone at 3am who remembers how it was set up.
