Symptoms
The most obvious sign is the red OutOfSync badge sitting next to your application name in the ArgoCD UI. It's deceptively simple — that one badge can mean a dozen different things. You might also see the app flip to Degraded health at the same time, which usually means the sync didn't just drift, it actively failed. If auto-sync is enabled, you'll watch the retry counter climb. Notifications fire. People start pinging you in Slack.
From the CLI, argocd app list is your first stop:
NAME CLUSTER NAMESPACE PROJECT STATUS HEALTH SYNCPOLICY CONDITIONS
my-api-service https://kubernetes.default.svc production default OutOfSync Healthy Auto <none>
Running argocd app get my-api-service gives you the detail you actually need:
Name: my-api-service
Project: default
Server: https://kubernetes.default.svc
Namespace: production
URL: https://argocd.solvethenetwork.com/applications/my-api-service
Repo: git@github.com:solvethenetwork/k8s-manifests.git
Target: main
Path: apps/my-api-service
SyncWindow: Sync Allowed
Sync Policy: Automated
Sync Status: OutOfSync from main (a3f1c29)
Health Status: Healthy
GROUP KIND NAMESPACE NAME STATUS HEALTH HOOK MESSAGE
apps Deployment production my-api-service OutOfSync Healthy ...
Pay close attention to the CONDITION block. A ComparisonError means ArgoCD couldn't even generate the desired state to compare against — it never got that far. An actual diff listed under the resource table means ArgoCD can see what it wants, it just doesn't match what's live. Those two scenarios have completely different root causes, and this distinction will save you a lot of time.
Root Cause 1: Git Repository Not Accessible
In my experience, this is the one that catches teams off guard months into a smooth-running deployment. A security scan rotates SSH deploy keys across the organization. A personal access token hits its 90-day expiry. Someone renames the GitHub repository and forgets to update the ArgoCD repo registration. Or a network policy gets tightened and now blocks the argocd-repo-server pod from reaching out to your Git host.
The tell-tale sign is a ComparisonError in the conditions block of argocd app get:
CONDITION        MESSAGE                                                                  LAST TRANSITION
ComparisonError  rpc error: code = Unknown desc = error testing repository connectivity:  2026-04-18 09:12:34 +0000 UTC
                 ssh: handshake failed: ssh: unable to authenticate, attempted methods
                 [none publickey], no supported methods remain
You can also check the repo-server logs directly:
kubectl logs -n argocd deploy/argocd-repo-server | grep -i error
And list all registered repositories to see their current status:
argocd repo list
TYPE NAME REPO INSECURE STATUS MESSAGE
git git@github.com:solvethenetwork/k8s-manifests.git false Failed ssh: handshake failed: ...
To fix it: regenerate the deploy key in GitHub under repository Settings → Deploy Keys, then patch the Kubernetes secret and restart the repo-server. If you're using HTTPS with a token, update the password field in the repo secret:
kubectl -n argocd patch secret argocd-repo-creds-solvethenetwork \
--type='json' \
-p='[{"op":"replace","path":"/data/password","value":"'$(echo -n "ghp_newtoken_abc123" | base64)'"}]'
kubectl -n argocd rollout restart deploy/argocd-repo-server
Verify the fix by running argocd repo list again — the STATUS column should flip from Failed to Successful within a few seconds of the repo-server coming back up.
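If you register repositories declaratively instead of through the CLI, the credential lives in a labeled Secret that ArgoCD watches, and rotation becomes a kubectl apply. A minimal sketch — the Secret name and the key material are placeholders:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: k8s-manifests-repo              # placeholder name
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: repository   # tells ArgoCD this Secret is a repo registration
stringData:
  type: git
  url: git@github.com:solvethenetwork/k8s-manifests.git
  sshPrivateKey: |
    -----BEGIN OPENSSH PRIVATE KEY-----
    <rotated deploy key goes here>
    -----END OPENSSH PRIVATE KEY-----
```

Keeping this manifest in a sealed-secret or external-secret workflow means the next key rotation is a normal Git change rather than an emergency kubectl patch.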
Root Cause 2: RBAC Preventing Sync
ArgoCD has its own RBAC layer that lives entirely inside the argocd-rbac-cm ConfigMap and operates independently of Kubernetes RBAC. It controls who can do what to which applications, and it's common for these policies to drift out of alignment with how teams are actually structured — especially after an SSO reconfiguration or a project reorganization.
When RBAC blocks a sync, you'll see a PermissionDenied error immediately when you try to run the sync manually:
argocd app sync my-api-service
FATA[0001] rpc error: code = PermissionDenied desc = permission denied: applications, sync, default/my-api-service
Check the current policy to understand what's allowed:
kubectl -n argocd get configmap argocd-rbac-cm -o yaml
apiVersion: v1
data:
  policy.csv: |
    p, role:readonly, applications, get, */*, allow
    p, role:developer, applications, sync, staging/*, allow
    g, solvethenetwork:platform-team, role:admin
    g, solvethenetwork:developers, role:developer
  policy.default: role:readonly
kind: ConfigMap
In that example, the developer role can only sync staging applications. Anyone in the developers group who tries to touch production gets denied. The fix is to edit the ConfigMap and add the appropriate policy line:
kubectl -n argocd edit configmap argocd-rbac-cm
Add the required permission under policy.csv:
p, role:developer, applications, sync, production/*, allow
Or for a specific user account:
p, infrarunbook-admin, applications, sync, production/my-api-service, allow
ArgoCD watches the ConfigMap and picks up changes immediately — no restart required. Test the fix by re-running argocd app sync my-api-service with the affected account.
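Before editing the live ConfigMap, you can dry-run a policy decision offline with the argocd CLI's built-in RBAC checker — a sketch, assuming the argocd binary is installed; the role and app names follow the example above:

```shell
# Export the live policy to a local file, then ask the CLI whether the role
# would be allowed to sync the app. Neither command changes cluster state.
kubectl -n argocd get configmap argocd-rbac-cm \
  -o jsonpath='{.data.policy\.csv}' > policy.csv

argocd admin settings rbac can role:developer sync applications \
  'production/my-api-service' --policy-file policy.csv
```

It answers Yes or No, which makes it easy to confirm a policy edit does what you intended before and after touching the ConfigMap.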
Root Cause 3: Resource Health Check Failing
ArgoCD won't mark a sync complete — and won't report the application as Synced — while managed resources are in a Degraded health state. This design is intentional. It prevents ArgoCD from declaring victory when the cluster is actually broken. But it also means that a CrashLoopBackOff pod or a pending PVC can hold your sync status hostage even when the manifests themselves are perfectly correct.
The symptom here is that the resource table shows Synced (meaning the manifest was applied) but the health column shows Degraded:
argocd app get my-api-service
GROUP KIND NAMESPACE NAME STATUS HEALTH HOOK MESSAGE
apps Deployment production my-api-service Synced Degraded Deployment does not have minimum availability.
ReplicaSet production my-api-service-7f4d9b Synced Degraded
Pod production my-api-service-7f4d9b-x9p Synced Degraded Back-off restarting failed container
Drill in with standard kubectl commands to find the actual problem:
kubectl -n production describe deployment my-api-service
kubectl -n production get pods -l app=my-api-service
kubectl -n production logs my-api-service-7f4d9b-x9p --previous
Fix the underlying issue — wrong image tag, missing environment variable, insufficient memory limits, whatever kubectl logs tells you. Once the pods stabilize and the Deployment reaches minimum availability, ArgoCD updates the health status automatically and the OutOfSync condition resolves.
For Custom Resource Definitions that ArgoCD doesn't have built-in health checks for, you can define Lua health check scripts in argocd-cm. Without these, CRDs default to Progressing indefinitely and block your sync just the same:
kubectl -n argocd edit configmap argocd-cm
data:
  resource.customizations.health.batch_Job: |
    hs = {}
    if obj.status ~= nil then
      if obj.status.succeeded ~= nil and obj.status.succeeded > 0 then
        hs.status = "Healthy"
        return hs
      end
    end
    hs.status = "Progressing"
    return hs
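Many operator-managed CRDs follow the standard Ready-condition convention, so one generic pattern covers a lot of ground. Below is a sketch for cert-manager's Certificate, assuming the resource reports a Ready condition in status.conditions — swap the group_Kind key in the data field for whatever CRD you're covering:

```yaml
data:
  resource.customizations.health.cert-manager.io_Certificate: |
    hs = {}
    if obj.status ~= nil and obj.status.conditions ~= nil then
      for _, condition in ipairs(obj.status.conditions) do
        if condition.type == "Ready" and condition.status == "True" then
          hs.status = "Healthy"
          hs.message = condition.message
          return hs
        end
        if condition.type == "Ready" and condition.status == "False" then
          hs.status = "Degraded"
          hs.message = condition.message
          return hs
        end
      end
    end
    hs.status = "Progressing"
    hs.message = "Waiting for Ready condition"
    return hs
```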
Root Cause 4: Hook Job Failing
ArgoCD sync hooks are typically Kubernetes Jobs annotated with argocd.argoproj.io/hook set to PreSync, Sync, PostSync, or SyncFail. When one of these jobs fails, ArgoCD marks the entire sync operation as failed and the application stays OutOfSync. The deployment never reaches the cluster. Everything stops.
I've seen this bite teams hard when they add a database migration job as a PreSync hook. The migration fails — DB not reachable from the new network segment, schema conflict with a prior half-applied migration, whatever — and now no subsequent deployments can happen until someone clears the blockage. People stare at the ArgoCD UI convinced it's a GitOps problem when it's actually an application operations problem.
The resource table makes the failure clear:
argocd app get my-api-service
GROUP KIND NAMESPACE NAME STATUS HEALTH HOOK MESSAGE
batch Job production my-api-service-pre-sync-hook Failed Degraded PreSync Job has reached the specified backoff limit
apps Deployment production my-api-service OutOfSync Healthy
Get the logs from the hook job pod to understand the actual failure:
kubectl -n production get pods -l job-name=my-api-service-pre-sync-hook
NAME READY STATUS RESTARTS AGE
my-api-service-pre-sync-hook-z7m2x 0/1 Error 0 8m
kubectl -n production logs my-api-service-pre-sync-hook-z7m2x
Error: dial tcp 10.0.1.45:5432: connect: connection refused
Database migration failed. Exiting.
Fix the root cause first — in this case, restore connectivity to the Postgres instance at 10.0.1.45. Then delete the failed job to clear the blockage and re-trigger the sync:
kubectl -n production delete job my-api-service-pre-sync-hook
argocd app sync my-api-service
To prevent stale failed jobs from accumulating and causing this problem repeatedly, always set a hook deletion policy in your hook manifest annotations:
annotations:
  argocd.argoproj.io/hook: PreSync
  argocd.argoproj.io/hook-delete-policy: HookSucceeded
HookSucceeded cleans up the job only on success, leaving failed jobs in place so you can debug them. BeforeHookCreation (the other common option) deletes any existing hook resource before creating a new one — useful when you want a clean slate on every sync without manual intervention.
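Putting the pieces together, a PreSync migration hook might look like the sketch below — the image and command are placeholders, not from the original runbook. backoffLimit and activeDeadlineSeconds keep a broken migration from retrying or hanging indefinitely:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: my-api-service-pre-sync-hook
  annotations:
    argocd.argoproj.io/hook: PreSync
    argocd.argoproj.io/hook-delete-policy: BeforeHookCreation
spec:
  backoffLimit: 2               # fail fast instead of retrying for minutes
  activeDeadlineSeconds: 300    # hard cap so a hung migration can't stall the sync forever
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: registry.example.com/my-api-service-migrations:latest  # placeholder image
          command: ["/app/migrate", "up"]                               # placeholder command
```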
Root Cause 5: IgnoreDifferences Not Configured
This is, without question, the most common cause of a persistent OutOfSync status on an otherwise healthy application. External controllers mutate resources after ArgoCD syncs them. ArgoCD then sees a difference between what Git says and what lives in the cluster, marks the app OutOfSync, and — if auto-sync is enabled — tries to revert those mutations. This puts ArgoCD in a war with the external controller, and nobody wins.
The most frequent offenders: HPA scaling up replica counts beyond what the Deployment manifest specifies, cert-manager injecting annotations or rotating certificate secrets, Istio and Linkerd admission webhooks inserting sidecar containers, and Kubernetes itself normalizing fields like defaultMode on volume mounts from 0644 to 420 (decimal vs octal — a fun one to debug at midnight).
Run argocd app diff to see exactly what ArgoCD thinks is wrong:
argocd app diff my-api-service
===== apps/Deployment production/my-api-service ======
30c30
<     replicas: 3
---
>     replicas: 1
88c88
<           defaultMode: 420
---
>           defaultMode: 0644
Git has replicas: 1. The HPA scaled it to 3. ArgoCD wants to set it back to 1. If auto-sync is on, it will — and then the HPA will immediately scale it back to 3. You'll see this loop in the ArgoCD event history.
Fix it by adding ignoreDifferences to the Application spec:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-api-service
  namespace: argocd
spec:
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas
        - /spec/template/spec/volumes/0/configMap/defaultMode
  source:
    repoURL: git@github.com:solvethenetwork/k8s-manifests.git
    targetRevision: main
    path: apps/my-api-service
  destination:
    server: https://kubernetes.default.svc
    namespace: production
Apply it and the OutOfSync status resolves within seconds:
kubectl -n argocd apply -f my-api-service-app.yaml
argocd app get my-api-service | grep "Sync Status"
Sync Status: Synced to main (a3f1c29)
For fields you want to ignore globally across all applications — like sidecar injection mutations — configure resource.customizations.ignoreDifferences in argocd-cm rather than duplicating the same ignoreDifferences block in every Application manifest.
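A sketch of that global configuration in argocd-cm — the first key ignores replica counts on every Deployment cluster-wide, and the all key (available in newer ArgoCD releases) ignores any field last written by a listed field manager:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
data:
  # Ignore replica counts on every Deployment, in every application.
  resource.customizations.ignoreDifferences.apps_Deployment: |
    jsonPointers:
      - /spec/replicas
  # Ignore any field owned by these controllers' server-side-apply field managers.
  resource.customizations.ignoreDifferences.all: |
    managedFieldsManagers:
      - kube-controller-manager
```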
Root Cause 6: Webhook Not Configured
This one isn't technically an error — it's a misconfiguration that causes confusion. Without a Git webhook pointing at your ArgoCD instance, ArgoCD falls back to polling the repository every 3 minutes (controlled by the timeout.reconciliation setting in argocd-cm, defaulting to 180 seconds). A commit lands in main. The developer checks ArgoCD. The app still shows the old commit hash. It looks stuck. They ping you.
Check when ArgoCD last reconciled versus when the commit landed:
argocd app get my-api-service | grep -A2 "Sync Status"
Sync Status: OutOfSync from main (a3f1c29)
git log --oneline -3
b7d2e91 (HEAD -> main, origin/main) fix: correct image tag for v2.4.1
a3f1c29 feat: add readiness probe
8c1b043 chore: bump resource limits
ArgoCD is still on a3f1c29 even though b7d2e91 is on main. Check whether a webhook is configured in your Git provider by looking at the ArgoCD server logs for incoming webhook events — if you see none at all, polling is the only mechanism in play.
Set up the webhook in GitHub under repository Settings → Webhooks → Add webhook. Set the payload URL to https://argocd.solvethenetwork.com/api/webhook, content type to application/json, and select the push event. Generate a random secret and configure it in ArgoCD:
kubectl -n argocd edit secret argocd-secret
Add the key webhook.github.secret with the base64-encoded value of your webhook secret. After that, ArgoCD will receive push notifications within milliseconds of a commit landing on the target branch — no more polling lag.
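A quick way to generate the shared secret and produce the base64 value the Secret's data field expects — a sketch; the key name follows the GitHub example above:

```shell
# Generate a 40-character random secret (paste the plain value into the
# GitHub webhook form) and base64-encode it for the argocd-secret data field.
WEBHOOK_SECRET=$(openssl rand -hex 20)
ENCODED=$(printf '%s' "$WEBHOOK_SECRET" | base64 | tr -d '\n')
printf 'plain:   %s\n' "$WEBHOOK_SECRET"
printf 'encoded: %s\n' "$ENCODED"
# Then run: kubectl -n argocd edit secret argocd-secret
# and add under data:
#   webhook.github.secret: <encoded value>
```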
Root Cause 7: Manifest Rendering Error
If you're using Helm or Kustomize, a broken values file or an invalid kustomization.yaml can prevent ArgoCD from generating manifests at all. There's nothing to compare against the live cluster state, so the app shows OutOfSync — but the real condition is a ComparisonError buried underneath.
argocd app get my-api-service
CONDITION MESSAGE
ComparisonError rpc error: code = Unknown desc = helm template . --name-template my-api-service
--namespace production ... exit status 1:
Error: render error in "my-api-service/templates/deployment.yaml":
template: my-api-service/templates/deployment.yaml:23:18:
executing "my-api-service/templates/deployment.yaml"
at <.Values.image.tag>: nil pointer evaluating interface {}.tag
Test rendering locally before pushing to catch these early:
# For Helm
helm template my-api-service ./chart -f values-production.yaml
# For Kustomize
kustomize build overlays/production
In this case, image.tag is missing from the production values file. Add it, push to main, and ArgoCD picks it up on the next poll or webhook trigger. The ComparisonError clears and the diff appears normally.
Prevention
Most of these causes are preventable with a bit of upfront work. Here's what I'd put in place on any new ArgoCD installation before it touches production.
- Configure Git webhooks on day one. Polling is a fallback, not an architecture. Webhooks give you sub-second sync triggers and eliminate the confusion of apparent drift that's really just a polling delay.
- Audit ignoreDifferences before go-live. After your first sync in a staging environment, run argocd app diff on each application and catalog every field that external controllers modify. Add those fields to ignoreDifferences before they become incidents.
- Set hook deletion policies on every hook resource. Never ship a hook Job without argocd.argoproj.io/hook-delete-policy. Stale failed jobs will block your next sync at the worst possible moment.
- Alert on repo credential expiry before it happens. If you're using short-lived tokens or SSH keys with expiry dates, set a calendar reminder or a CronJob that checks expiry and fires a Slack alert with enough lead time to rotate without an incident.
- Add argocd app wait to your CI pipelines. After pushing a change, run argocd app wait --sync --health --timeout 300 my-api-service and fail the pipeline if it doesn't converge. This catches sync failures before you declare the deployment successful.
- Define Lua health checks for all CRDs your operators use. Without them, custom resources default to Progressing indefinitely and will hold up syncs just like a degraded Deployment would.
- Set up Prometheus alerts on sync status. The argocd_app_info metric exposes sync and health labels. Alert on OutOfSync conditions that persist beyond a few minutes:
groups:
  - name: argocd.rules
    rules:
      - alert: ArgoCDAppOutOfSync
        expr: argocd_app_info{sync_status="OutOfSync"} == 1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "ArgoCD app {{ $labels.name }} stuck OutOfSync"
          description: "{{ $labels.name }} in project {{ $labels.project }} has been OutOfSync for over 5 minutes."
OutOfSync is ArgoCD telling you something doesn't match. Sometimes that's fine — a legitimate drift you're aware of. Often it's a signal that something quietly broke. The teams that catch it fast are the ones who built observability around it from the start, not after the first production incident.
