Symptoms
TCP routing failures in Traefik are notoriously quiet. Unlike HTTP routing, where a misconfigured backend gives you a 502 or a well-formatted error page, TCP problems manifest as raw connection failures — and those are harder to reason about because Traefik often stays completely silent about them. Here's what you'll typically see:
- Running nc -zv 192.168.10.5 5432 returns Connection refused even though your PostgreSQL container is healthy and running
- A Redis client trying to reach 192.168.10.5:6379 through Traefik times out with no response
- SSH connections over a custom Traefik TCP entrypoint hang indefinitely and eventually drop
- The Traefik dashboard shows no TCP routers for the service you just configured
- Traefik logs are completely clean — no errors — which somehow makes the situation more confusing
- Direct connections to the backend, bypassing Traefik entirely, work fine — confirming the problem is in the proxy layer itself
In my experience, the most disorienting part is that Traefik often starts without any logged error when TCP routing is fundamentally broken. It silently ignores or skips invalid TCP router configurations, leaving you staring at a clean log with a broken service. Let's go through each cause systematically.
Root Cause 1: EntryPoint Not Defined for TCP
This is the most common cause I see on first-time TCP routing setups. Traefik separates its concerns cleanly between static configuration (entrypoints, providers, TLS resolvers) and dynamic configuration (routers, services, middlewares). If you define a TCP router in dynamic config but the entrypoint it references doesn't exist in the static config, Traefik will not load your router. The dashboard stays empty for that service, and the only trace is an error line in the log that is easy to overlook.
The key thing to understand is that Traefik entrypoints don't have an explicit protocol type. They're listening addresses. However, when HTTP and TCP routers share an entrypoint, TCP routers are evaluated first, so a catch-all HostSNI(`*`) TCP router on a shared entrypoint can swallow traffic meant for the HTTP routers. Best practice is to dedicate a separate entrypoint for each TCP service you're proxying.
How to Identify It
Start by inspecting your static configuration:
cat /etc/traefik/traefik.yml
A config that's missing a dedicated TCP entrypoint looks like this:
entryPoints:
  web:
    address: ":80"
  websecure:
    address: ":443"
Now check your TCP router definition in a dynamic config file:
tcp:
  routers:
    postgres-router:
      entryPoints:
        - tcp-postgres
      rule: "HostSNI(`*`)"
      service: postgres-svc
The router references tcp-postgres, but that entrypoint doesn't exist in the static config. To confirm Traefik is ignoring this router, query the API directly:
curl -s http://192.168.10.5:8080/api/tcp/routers | jq .
If your router is absent from the output, it hasn't loaded. Also check Traefik logs at startup for lines like:
level=error msg="entrypoint not found" entryPointName=tcp-postgres
How to Fix It
Add a dedicated entrypoint to your static configuration and restart Traefik. Static config changes always require a full restart — dynamic config hot-reloads, but static config does not:
entryPoints:
  web:
    address: ":80"
  websecure:
    address: ":443"
  tcp-postgres:
    address: ":5432"
systemctl restart traefik
If running in Docker:
docker restart traefik
After the restart, verify the entrypoint is live and owned by Traefik:
ss -tlnp | grep 5432
LISTEN 0 128 0.0.0.0:5432 0.0.0.0:* users:(("traefik",pid=3142,fd=12))
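The mismatch between static and dynamic configuration can also be caught before a deploy. What follows is a minimal sketch, not part of Traefik: the function name find_missing_entrypoints is hypothetical, and it assumes both files have already been parsed into Python dicts (for example with a YAML library of your choice).

```python
from typing import Dict, List


def find_missing_entrypoints(static_cfg: dict, dynamic_cfg: dict) -> Dict[str, List[str]]:
    """Map each TCP router to the entrypoints it references that the static config lacks."""
    defined = set((static_cfg.get("entryPoints") or {}).keys())
    routers = (dynamic_cfg.get("tcp") or {}).get("routers") or {}
    missing: Dict[str, List[str]] = {}
    for name, router in routers.items():
        unknown = [ep for ep in (router.get("entryPoints") or []) if ep not in defined]
        if unknown:
            missing[name] = unknown
    return missing


# The two configurations from this section, written as already-parsed dicts.
static_cfg = {"entryPoints": {"web": {"address": ":80"}, "websecure": {"address": ":443"}}}
dynamic_cfg = {
    "tcp": {
        "routers": {
            "postgres-router": {
                "entryPoints": ["tcp-postgres"],
                "rule": "HostSNI(`*`)",
                "service": "postgres-svc",
            }
        }
    }
}

print(find_missing_entrypoints(static_cfg, dynamic_cfg))
```

An empty result means every TCP router points at an entrypoint that exists; anything else is a router Traefik will not load.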
Root Cause 2: Router Rule Syntax Wrong
HTTP routers in Traefik support a rich rule language: Host, PathPrefix, Headers, Method, and combinations thereof. TCP routers don't get any of that. TCP routing is fundamentally different because at the TCP layer, before any application-level protocol has been negotiated, Traefik can't read HTTP headers. The main signal available for routing decisions at the TCP layer is the SNI field from a TLS ClientHello.
This means TCP routers are built around one rule type: HostSNI(). Newer Traefik releases add a few connection-level matchers such as HostSNIRegexp, ClientIP and ALPN, but nothing that inspects HTTP. I've seen engineers copy an HTTP router definition and paste it into a TCP router block, keeping rules like Host(`db.solvethenetwork.com`), which is HTTP-only syntax. Traefik rejects the entire router configuration without loading it.
How to Identify It
Look at your TCP router definition. A broken configuration might look like this:
tcp:
  routers:
    redis-router:
      entryPoints:
        - tcp-redis
      rule: "Host(`redis.solvethenetwork.com`)"
      service: redis-svc
Host() is HTTP-only syntax. Traefik will log an error at startup or on dynamic config reload:
level=error msg="Error while adding rule Host(`redis.solvethenetwork.com`) for tcp router redis-router: unsupported rule"
You can also query the API to see the router's status:
curl -s http://192.168.10.5:8080/api/tcp/routers/redis-router@file | jq .status
"disabled"
A disabled status on a router you just defined means Traefik loaded the file but rejected the router's configuration. Rule syntax is almost always the culprit.
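The copy-paste mistake described above is easy to lint for before Traefik ever loads the file. This is a rough sketch under assumptions: the helper name http_matchers_in_tcp_rules is hypothetical, the matcher list only covers the HTTP matchers mentioned in this article, and the dynamic config is assumed to be parsed into a dict already.

```python
import re
from typing import Dict

# HTTP-only matchers; "Host(" does not match inside "HostSNI(" because the
# pattern requires the opening parenthesis right after the matcher name.
HTTP_ONLY = re.compile(r"\b(Host|HostRegexp|Path|PathPrefix|Headers|Method)\s*\(")


def http_matchers_in_tcp_rules(dynamic_cfg: dict) -> Dict[str, str]:
    """Return {router name: first HTTP-only matcher found} for TCP routers."""
    routers = (dynamic_cfg.get("tcp") or {}).get("routers") or {}
    flagged: Dict[str, str] = {}
    for name, router in routers.items():
        match = HTTP_ONLY.search(router.get("rule", ""))
        if match:
            flagged[name] = match.group(1)
    return flagged


broken = {"tcp": {"routers": {"redis-router": {"rule": "Host(`redis.solvethenetwork.com`)"}}}}
fixed = {"tcp": {"routers": {"redis-router": {"rule": "HostSNI(`*`)"}}}}

print(http_matchers_in_tcp_rules(broken))
print(http_matchers_in_tcp_rules(fixed))
```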
How to Fix It
For non-TLS TCP services — raw TCP with no TLS involved — the only valid rule is the wildcard, which matches every connection arriving on that entrypoint:
tcp:
  routers:
    redis-router:
      entryPoints:
        - tcp-redis
      rule: "HostSNI(`*`)"
      service: redis-svc
For TLS-aware TCP routing where you want to route different services based on the SNI hostname the client presents, use the specific hostname variant:
tcp:
  routers:
    postgres-tls-router:
      entryPoints:
        - tcp-secure
      rule: "HostSNI(`db.solvethenetwork.com`)"
      tls:
        passthrough: true
      service: postgres-svc
Since you're using the file provider, Traefik hot-reloads on save. After correcting the rule, verify the router appears with the right status:
curl -s http://192.168.10.5:8080/api/tcp/routers | jq '.[].rule'
"HostSNI(`*`)"
Root Cause 3: Backend Not Accepting Raw TCP
This one is subtler and takes longer to diagnose. Your Traefik TCP router is correctly configured, the entrypoint is listening, connections reach Traefik — but they still fail. The problem is on the backend side. Either the backend requires a protocol negotiation that Traefik's raw TCP forwarding isn't performing, or the backend simply isn't reachable from the Traefik host's network perspective.
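A classic scenario: you configure Traefik to forward raw TCP to a PostgreSQL instance whose pg_hba.conf only contains hostssl entries. PostgreSQL negotiates TLS inside its own protocol (the client first sends an SSLRequest message rather than a TLS ClientHello), so if the client isn't configured for SSL, or Traefik terminates TLS and forwards plaintext, the connection reaches PostgreSQL unencrypted and is rejected. PostgreSQL also sees the connection as coming from Traefik's address rather than the client's, so an entry scoped to the client subnet won't match either. The client sees a confusing authentication-style failure rather than a clear connection error.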
How to Identify It
Always test the backend directly first, bypassing Traefik entirely:
nc -zv 192.168.10.20 5432
Connection to 192.168.10.20 5432 port [tcp/postgresql] succeeded!
Good — the backend is listening. Now test through Traefik's entrypoint:
nc -zv 192.168.10.5 5432
nc: connect to 192.168.10.5 port 5432 (tcp) failed: Connection refused
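The discrepancy confirms the problem is on the Traefik side rather than in the backend. A refused connection specifically means nothing is listening on the entrypoint (Root Cause 1 or 5); if the port accepts the connection but the session then drops, look at forwarding and backend reachability instead. Check what address Traefik's service definition is actually pointing to: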
curl -s http://192.168.10.5:8080/api/tcp/services/postgres-svc@file | jq .
{
  "loadBalancer": {
    "servers": [
      {
        "address": "192.168.10.20:5432"
      }
    ]
  }
}
Now verify the Traefik host itself can reach that address. If Traefik runs in Docker, this test should run inside the Traefik container:
docker exec traefik nc -zv 192.168.10.20 5432
If that fails, you have a network reachability problem. Check Docker network attachments:
docker inspect postgres-container | jq '.[0].NetworkSettings.Networks'
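The direct-versus-proxied comparison above can be wrapped in a small probe so the result is unambiguous. This is a sketch only: can_connect and diagnose are hypothetical helper names, and the probe has to run from the same network vantage point as Traefik (inside the container, if Traefik runs in Docker) for the backend result to mean anything.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port is accepted within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def diagnose(direct_ok: bool, proxied_ok: bool) -> str:
    """Turn the two probe results into a next step."""
    if direct_ok and proxied_ok:
        return "both ports accept connections; check protocol and TLS handling"
    if direct_ok:
        return "backend is up but the Traefik entrypoint is not accepting connections"
    if proxied_ok:
        return "entrypoint accepts connections but the backend is unreachable from here"
    return "neither port is reachable from this host"
```

With the addresses used in this article, a call would look like diagnose(can_connect("192.168.10.20", 5432), can_connect("192.168.10.5", 5432)).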
How to Fix It
If it's a Docker network isolation issue, make sure both containers share a network. In Docker Compose:
services:
  traefik:
    networks:
      - proxy-net
  postgres:
    networks:
      - proxy-net
networks:
  proxy-net:
    driver: bridge
If the backend requires TLS but Traefik is sending raw bytes, you have two choices: configure TLS passthrough in Traefik (see Root Cause 4), or relax the backend's TLS requirement for internal connections. For PostgreSQL, update pg_hba.conf to use host instead of hostssl for the Traefik subnet:
host all all 192.168.10.0/24 md5
Reload PostgreSQL after editing pg_hba.conf:
pg_ctlcluster 15 main reload
Root Cause 4: TLS Passthrough Misconfigured
TLS passthrough is one of Traefik's most useful TCP features but it's unforgiving when misconfigured. In passthrough mode, Traefik peeks at the TLS ClientHello to extract the SNI hostname for routing decisions, then forwards the entire encrypted stream to the backend without terminating TLS itself. The backend handles the full TLS handshake. Traefik never sees the decrypted data.
When passthrough is misconfigured, you'll typically see TLS handshake failures or SSL errors on the client side. The two most common mistakes are combining passthrough with Traefik-managed certificate resolvers (which are mutually exclusive), and configuring passthrough for connections that aren't TLS at all.
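To make the "peek at the ClientHello" step concrete, here is an illustrative sketch of pulling the SNI hostname out of the first TLS record. It is not Traefik's implementation; extract_sni and build_client_hello are hypothetical helpers, and the parser assumes the whole ClientHello fits in a single TLS record.

```python
import ssl
import struct
from typing import Optional


def extract_sni(client_hello: bytes) -> Optional[str]:
    """Return the SNI hostname from a raw TLS ClientHello record, or None."""
    # TLS record header: type(1) version(2) length(2); 0x16 means handshake.
    if len(client_hello) < 9 or client_hello[0] != 0x16:
        return None
    # Handshake header: msg_type(1) length(3); 0x01 means ClientHello.
    if client_hello[5] != 0x01:
        return None
    pos = 9 + 2 + 32  # skip both headers, client_version and random
    try:
        pos += 1 + client_hello[pos]  # session_id
        (suites_len,) = struct.unpack_from("!H", client_hello, pos)
        pos += 2 + suites_len  # cipher_suites
        pos += 1 + client_hello[pos]  # compression_methods
        (ext_total,) = struct.unpack_from("!H", client_hello, pos)
        pos += 2
        end = min(pos + ext_total, len(client_hello))
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", client_hello, pos)
            pos += 4
            if ext_type == 0:  # server_name extension
                # Layout: list_len(2) name_type(1) name_len(2) name
                if client_hello[pos + 2] != 0:
                    return None
                (name_len,) = struct.unpack_from("!H", client_hello, pos + 3)
                return client_hello[pos + 5 : pos + 5 + name_len].decode("ascii")
            pos += ext_len
    except (IndexError, struct.error):
        return None
    return None


def build_client_hello(hostname: str) -> bytes:
    """Produce a real ClientHello in memory, without opening any socket."""
    ctx = ssl.create_default_context()
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: the handshake stalls waiting for a server reply
    return outgoing.read()


print(extract_sni(build_client_hello("db.solvethenetwork.com")))
# A PostgreSQL SSLRequest is not a TLS record, so there is no SNI to find.
print(extract_sni(bytes.fromhex("0000000804d2162f")))
```

The second call shows why SNI-based rules only work when the very first bytes on the wire are a TLS ClientHello: anything else gives the router no hostname to match on.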
How to Identify It
A contradictory passthrough configuration looks like this:
tcp:
  routers:
    secure-db-router:
      entryPoints:
        - tcp-secure
      rule: "HostSNI(`db.solvethenetwork.com`)"
      tls:
        passthrough: true
        certResolver: letsencrypt
      service: secure-db-svc
Combining passthrough: true with certResolver is self-contradictory: if Traefik is passing TLS through without terminating it, it can't use a certificate resolver to manage its own cert. Traefik will log this at startup:
level=error msg="found TLS options and a TLS passthrough, which are mutually exclusive" routerName=secure-db-router
Another common mistake is enabling passthrough on a service whose backend isn't actually doing TLS. Verify the backend presents a TLS certificate before configuring passthrough:
openssl s_client -connect 192.168.10.20:8443 2>&1 | head -20
If you see SSL handshake has read 0 bytes or the connection closes immediately, the backend isn't serving TLS. Passthrough will not fix a non-TLS backend; it'll break routing further. Also verify through Traefik's address that the right certificate is being presented (i.e., the backend's cert, not Traefik's):
openssl s_client -connect 192.168.10.5:8443 -servername db.solvethenetwork.com 2>&1 | grep "subject\|issuer"
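If you see a certificate served by Traefik here (for example the self-signed TRAEFIK DEFAULT CERT, or one obtained by a Traefik certificate resolver), passthrough isn't active: Traefik is terminating TLS instead of forwarding it.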
How to Fix It
For true TLS passthrough, remove all Traefik-managed TLS configuration. The router should only specify passthrough: true with no cert resolver, no TLS options:
tcp:
  routers:
    secure-db-router:
      entryPoints:
        - tcp-secure
      rule: "HostSNI(`db.solvethenetwork.com`)"
      tls:
        passthrough: true
      service: secure-db-svc
  services:
    secure-db-svc:
      loadBalancer:
        servers:
          - address: "192.168.10.20:8443"
One more thing worth knowing: TLS passthrough depends entirely on the client sending an SNI field in its ClientHello. Some legacy clients and certain low-level TCP tools don't send SNI. If your clients don't send SNI, Traefik won't be able to match the HostSNI() rule and the connection will be dropped or fall through to a catch-all. In that case, dedicate a separate entrypoint per backend service and use HostSNI(`*`), accepting that you lose per-hostname routing granularity.
After fixing the configuration, test again:
openssl s_client -connect 192.168.10.5:8443 -servername db.solvethenetwork.com 2>&1 | grep "subject\|issuer"
You should now see the backend's own certificate, confirming Traefik is passing TLS through rather than terminating it.
Root Cause 5: Port Conflict
This one is embarrassingly common and I've been caught by it personally. Traefik tries to bind a port for a TCP entrypoint but something else is already listening on that address. On most Linux hosts, services like PostgreSQL, Redis, SSH, or even a second Traefik instance may already occupy the port you're trying to assign. When Traefik can't bind, it logs an error at startup — but depending on the version and how the entrypoint interacts with other config, Traefik may still start successfully with that entrypoint simply absent.
The tricky part is that a port conflict doesn't always prevent Traefik from starting. It may start, appear healthy, and simply have a silent gap where your TCP entrypoint should be.
How to Identify It
Check what's actually listening on your target port on sw-infrarunbook-01:
ss -tlnp | grep 5432
LISTEN 0 128 0.0.0.0:5432 0.0.0.0:* users:(("postgres",pid=1823,fd=5))
PostgreSQL is already bound to port 5432 on the host. Traefik can't also bind it. Check Traefik's startup logs for the bind error:
journalctl -u traefik --since "10 minutes ago" | grep -i "bind\|listen\|address already"
level=error msg="Error creating server: listen tcp :5432: bind: address already in use" entryPointName=tcp-postgres
In Docker environments, port conflicts surface differently. Check for containers already publishing that port to the host:
docker ps --format "table {{.Names}}\t{{.Ports}}" | grep 5432
postgres-container 0.0.0.0:5432->5432/tcp
If another container owns that host port, Docker will reject Traefik's attempt to publish the same port. Container startup fails with a "port is already allocated" error, so the Traefik container never comes up.
How to Fix It
You have two paths: change Traefik's entrypoint to a non-conflicting port, or stop the conflicting process if it's unintentional.
Option 1 — Change the entrypoint port. This is the right call when the existing service is legitimate and intentional. Clients connect to Traefik on the alternate port, and Traefik proxies to the backend's standard port:
entryPoints:
  tcp-postgres:
    address: ":15432"
This pattern works well when Traefik is meant to be the single network ingress point — clients always talk to Traefik, never directly to backends.
Option 2 — Stop the conflicting process if it's accidental. For example, if a development PostgreSQL instance is running on the host but shouldn't be:
systemctl stop postgresql
systemctl disable postgresql
systemctl restart traefik
Verify Traefik now owns the port:
ss -tlnp | grep 5432
LISTEN 0 128 0.0.0.0:5432 0.0.0.0:* users:(("traefik",pid=3501,fd=8))
Confirm the TCP router is loaded and active:
curl -s http://192.168.10.5:8080/api/tcp/routers | jq '.[].name'
Root Cause 6: Missing or Malformed Service Definition
Even with a correctly configured router, Traefik won't proxy anything if the TCP service definition is malformed. This happens frequently when converting from Docker label-based configuration to file-based, or when someone copies an HTTP service block into a TCP section without understanding the structural differences.
A TCP service must use loadBalancer.servers[].address in host:port format. Missing the port, which looks like a reasonable thing to do if you assume Traefik infers it, is a common mistake:
tcp:
  services:
    redis-svc:
      loadBalancer:
        servers:
          - address: "192.168.10.30"
Without the port, Traefik can't establish the backend connection. Check the service status through the API:
curl -s http://192.168.10.5:8080/api/tcp/services | jq '.[].status'
"disabled"
Fix it by specifying the full address with port:
tcp:
  services:
    redis-svc:
      loadBalancer:
        servers:
          - address: "192.168.10.30:6379"
After saving the corrected file, the service should transition to an enabled state within a few seconds. Re-query the API to confirm.
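A missing port is simple to detect mechanically. The following is a minimal sketch with hypothetical helper names (has_host_and_port, invalid_service_addresses), again assuming the dynamic config is already parsed into a dict. The check is deliberately simple and does not validate the host part beyond requiring it to be non-empty.

```python
from typing import Dict, List


def has_host_and_port(address: str) -> bool:
    """True if the address looks like host:port with a port in 1..65535."""
    host, sep, port = address.rpartition(":")
    return bool(sep and host and port.isdigit() and 0 < int(port) < 65536)


def invalid_service_addresses(dynamic_cfg: dict) -> Dict[str, List[str]]:
    """Map each TCP service to the server addresses that lack a usable port."""
    services = (dynamic_cfg.get("tcp") or {}).get("services") or {}
    bad: Dict[str, List[str]] = {}
    for name, svc in services.items():
        servers = (svc.get("loadBalancer") or {}).get("servers") or []
        wrong = [s.get("address", "") for s in servers
                 if not has_host_and_port(s.get("address", ""))]
        if wrong:
            bad[name] = wrong
    return bad


broken = {"tcp": {"services": {"redis-svc": {"loadBalancer": {"servers": [{"address": "192.168.10.30"}]}}}}}
fixed = {"tcp": {"services": {"redis-svc": {"loadBalancer": {"servers": [{"address": "192.168.10.30:6379"}]}}}}}

print(invalid_service_addresses(broken))
print(invalid_service_addresses(fixed))
```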
Prevention
TCP routing issues are largely avoidable with a disciplined configuration workflow. A few practices I've built into every Traefik deployment have consistently prevented the issues described above from reaching production.
Keep a clear inventory of all entrypoints in a single version-controlled static config file. Every entrypoint should have a comment explaining its purpose, the service it backs, and the owning team. When someone adds a new TCP service, they can immediately see whether the entrypoint they need already exists or needs to be added — and whether it would conflict with something already defined.
Automate port conflict detection before any Traefik deployment that adds new entrypoints. A simple check in your CI/CD or pre-deploy script run on sw-infrarunbook-01:
ss -tlnp | awk '{print $4}' | grep -E ':(5432|6379|6380|9000)$'
If this returns output, you have a conflict to resolve before deploying. Catch it early, not after wondering why Traefik started cleanly but a port never appeared.
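If the pre-deploy step already runs Python, the same conflict check can be done by simply trying to bind each port. This is a sketch under assumptions: port_is_free is a hypothetical helper, it only covers IPv4, and it must run on the host (and as a user) that will actually run Traefik, since binding ports below 1024 without privileges also fails and would be reported as a conflict.

```python
import socket
from typing import List


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Try to bind host:port; a failed bind means something else owns it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True


def conflicting_ports(ports: List[int], host: str = "0.0.0.0") -> List[int]:
    """Return the subset of ports that cannot be bound right now."""
    return [p for p in ports if not port_is_free(p, host)]


# Same port list as the ss check above.
planned = [5432, 6379, 6380, 9000]
print(conflicting_ports(planned))
```

The probe socket is closed again immediately, so running the check does not itself hold any of the ports.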
Use the Traefik API — not just the dashboard — as your verification tool after every deployment. The dashboard can lag and misrepresent partial failure states. Script a post-deploy check:
curl -s http://192.168.10.5:8080/api/tcp/routers | jq '[.[] | {name: .name, status: .status, rule: .rule}]'
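For a deploy pipeline that needs a pass/fail result rather than JSON to eyeball, the same query can be scripted. This is a sketch only: the function names are hypothetical, the sample data is made up for illustration, and it assumes the API is reachable without authentication at the address used throughout this article.

```python
import json
import urllib.request
from typing import List


def fetch_tcp_routers(api_base: str, timeout: float = 5.0) -> List[dict]:
    """Fetch the TCP router list from the Traefik API (same endpoint as the curl call)."""
    with urllib.request.urlopen(f"{api_base}/api/tcp/routers", timeout=timeout) as resp:
        return json.load(resp)


def routers_needing_attention(routers: List[dict]) -> List[dict]:
    """Keep only routers whose status is anything other than "enabled"."""
    return [
        {"name": r.get("name"), "status": r.get("status"), "rule": r.get("rule")}
        for r in routers
        if r.get("status") != "enabled"
    ]


# Illustrative sample shaped like the fields selected by the jq filter above.
sample = [
    {"name": "postgres-router@file", "status": "enabled", "rule": "HostSNI(`*`)"},
    {"name": "redis-router@file", "status": "disabled", "rule": "Host(`redis.solvethenetwork.com`)"},
]
print(routers_needing_attention(sample))
```

In a real post-deploy step, the input would come from fetch_tcp_routers("http://192.168.10.5:8080"), and a non-empty result would fail the deploy.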
Any router not showing "enabled" status needs immediate investigation before you declare the deploy successful.
For TLS passthrough specifically, document in your runbook exactly which services own their own TLS certificates and which rely on Traefik-managed certificates. These two models are mutually exclusive at the router level. Mixing them up is the single root cause of most passthrough misconfigurations, and a one-paragraph note in your docs prevents it entirely.
Finally, turn on access logging and confirm what your Traefik version actually records for TCP. Access logs in Traefik have historically covered HTTP requests only, so on versions that don't record TCP connections, raising the log level to DEBUG is the fallback for seeing how connections are handled. Either way the goal is the same: records that let you distinguish between a connection never reaching Traefik and a connection reaching Traefik but failing to route. The access log section of the static configuration looks like this:
accessLog:
  format: json
  fields:
    defaultMode: keep
Without those records, you're guessing whether the problem is network-level or Traefik-level. With them, debugging time drops significantly.
