InfraRunBook

    ArgoCD Application Out of Sync

    CI/CD
    Published: Apr 18, 2026
    Updated: Apr 18, 2026

    An ArgoCD OutOfSync status can mean anything from a broken Git credential to an external controller fighting your desired state. This runbook covers every major root cause with real CLI commands, error messages, and fixes.

    Symptoms

    The most obvious sign is the red OutOfSync badge sitting next to your application name in the ArgoCD UI. It's deceptively simple — that one badge can mean a dozen different things. You might also see the app flip to Degraded health at the same time, which usually means the sync didn't just drift, it actively failed. If auto-sync is enabled, you'll watch the retry counter climb. Notifications fire. People start pinging you in Slack.

    From the CLI, `argocd app list` is your first stop:

    NAME              CLUSTER                         NAMESPACE   PROJECT  STATUS     HEALTH   SYNCPOLICY  CONDITIONS
    my-api-service    https://kubernetes.default.svc  production  default  OutOfSync  Healthy  Auto        <none>

    Running `argocd app get my-api-service` gives you the detail you actually need:

    Name:               my-api-service
    Project:            default
    Server:             https://kubernetes.default.svc
    Namespace:          production
    URL:                https://argocd.solvethenetwork.com/applications/my-api-service
    Repo:               git@github.com:solvethenetwork/k8s-manifests.git
    Target:             main
    Path:               apps/my-api-service
    SyncWindow:         Sync Allowed
    Sync Policy:        Automated
    Sync Status:        OutOfSync from main (a3f1c29)
    Health Status:      Healthy
    
    GROUP  KIND        NAMESPACE   NAME            STATUS     HEALTH   HOOK  MESSAGE
    apps   Deployment  production  my-api-service  OutOfSync  Healthy        ...

    Pay close attention to the CONDITION block. A ComparisonError means ArgoCD couldn't even generate the desired state to compare against — it never got that far. An actual diff listed under the resource table means ArgoCD can see what it wants, it just doesn't match what's live. Those two scenarios have completely different root causes, and this distinction will save you a lot of time.

    Root Cause 1: Git Repository Not Accessible

    In my experience, this is the one that catches teams off guard months into a smooth-running deployment. A security scan rotates SSH deploy keys across the organization. A personal access token hits its 90-day expiry. Someone renames the GitHub repository and forgets to update the ArgoCD repo registration. Or a network policy gets tightened and now blocks the `argocd-repo-server` pod from reaching out to your Git host.

    The tell-tale sign is a ComparisonError in the conditions block of `argocd app get`:

    CONDITION         MESSAGE                                                                          LAST TRANSITION
    ComparisonError   rpc error: code = Unknown desc = error testing repository connectivity:          2026-04-18 09:12:34 +0000 UTC
                      ssh: handshake failed: ssh: unable to authenticate, attempted methods [none
                      publickey], no supported methods remain

    You can also check the repo-server logs directly:

    kubectl logs -n argocd deploy/argocd-repo-server | grep -i error

    And list all registered repositories to see their current status:

    argocd repo list
    
    TYPE  NAME  REPO                                              INSECURE  STATUS  MESSAGE
    git         git@github.com:solvethenetwork/k8s-manifests.git  false     Failed  ssh: handshake failed: ...

    To fix it: regenerate the deploy key in GitHub under repository Settings → Deploy Keys, then patch the Kubernetes secret and restart the repo-server. If you're using HTTPS with a token, update the password field in the repo secret:

    kubectl -n argocd patch secret argocd-repo-creds-solvethenetwork \
      --type='json' \
      -p='[{"op":"replace","path":"/data/password","value":"'$(echo -n "ghp_newtoken_abc123" | base64)'"}]'
    
    kubectl -n argocd rollout restart deploy/argocd-repo-server
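
    For a longer-term fix, the repository credential can be managed declaratively. ArgoCD treats any Secret in its namespace labeled `argocd.argoproj.io/secret-type: repository` as a repo registration; using `stringData` lets Kubernetes handle the base64 encoding. A minimal sketch (the Secret name and token value are placeholders):

    ```yaml
    apiVersion: v1
    kind: Secret
    metadata:
      name: repo-k8s-manifests          # placeholder name
      namespace: argocd
      labels:
        # This label is what makes ArgoCD pick the Secret up as a repository.
        argocd.argoproj.io/secret-type: repository
    stringData:
      type: git
      url: https://github.com/solvethenetwork/k8s-manifests.git
      username: git
      password: ghp_newtoken_abc123     # placeholder token; source from your secrets manager
    ```

    Applying this with `kubectl apply` (or via your GitOps bootstrap repo) means the next token rotation is a one-line change instead of a JSON patch under incident pressure.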

    Verify the fix by running `argocd repo list` again — the STATUS column should flip from Failed to Successful within a few seconds of the repo-server coming back up.

    Root Cause 2: RBAC Preventing Sync

    ArgoCD has its own RBAC layer that lives entirely inside the `argocd-rbac-cm` ConfigMap and operates independently of Kubernetes RBAC. It controls who can do what to which applications, and it's common for these policies to drift out of alignment with how teams are actually structured — especially after an SSO reconfiguration or a project reorganization.

    When RBAC blocks a sync, you'll see a PermissionDenied error immediately when you try to run the sync manually:

    argocd app sync my-api-service
    
    FATA[0001] rpc error: code = PermissionDenied desc = permission denied: applications, sync, default/my-api-service

    Check the current policy to understand what's allowed:

    kubectl -n argocd get configmap argocd-rbac-cm -o yaml
    
    apiVersion: v1
    data:
      policy.csv: |
        p, role:readonly, applications, get, */*, allow
        p, role:developer, applications, sync, staging/*, allow
        g, solvethenetwork:platform-team, role:admin
        g, solvethenetwork:developers, role:developer
      policy.default: role:readonly
    kind: ConfigMap

    In that example, the developer role can only sync staging applications. Anyone in the developers group who tries to touch production gets denied. The fix is to edit the ConfigMap and add the appropriate policy line:

    kubectl -n argocd edit configmap argocd-rbac-cm

    Add the required permission under `policy.csv`:

    p, role:developer, applications, sync, production/*, allow

    Or for a specific user account:

    p, infrarunbook-admin, applications, sync, production/my-api-service, allow

    ArgoCD watches the ConfigMap and picks up changes immediately — no restart required. Test the fix by re-running `argocd app sync my-api-service` with the affected account.
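
    As an alternative to widening the global RBAC ConfigMap, sync rights can be scoped to a single project with an AppProject role. A sketch assuming the `default` project (the role name `ci-deployer` is hypothetical; `proj:<project>:<role>` is the policy subject convention for project roles):

    ```yaml
    apiVersion: argoproj.io/v1alpha1
    kind: AppProject
    metadata:
      name: default
      namespace: argocd
    spec:
      sourceRepos:
      - git@github.com:solvethenetwork/k8s-manifests.git
      destinations:
      - server: https://kubernetes.default.svc
        namespace: production
      roles:
      - name: ci-deployer                # hypothetical role name
        description: Lets CI sync the production app without broader admin rights
        policies:
        - p, proj:default:ci-deployer, applications, sync, default/my-api-service, allow
    ```

    Project roles can also issue scoped tokens (`argocd proj role create-token`), which keeps CI credentials away from the global admin policy entirely.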

    Root Cause 3: Resource Health Check Failing

    ArgoCD won't mark a sync complete — and won't report the application as Synced — while managed resources are in a Degraded health state. This design is intentional. It prevents ArgoCD from declaring victory when the cluster is actually broken. But it also means that a CrashLoopBackOff pod or a pending PVC can hold your sync status hostage even when the manifests themselves are perfectly correct.

    The symptom here is that the resource table shows Synced (meaning the manifest was applied) but the health column shows Degraded:

    argocd app get my-api-service
    
    GROUP  KIND        NAMESPACE   NAME                       STATUS  HEALTH    HOOK  MESSAGE
    apps   Deployment  production  my-api-service             Synced  Degraded        Deployment does not have minimum availability.
           ReplicaSet  production  my-api-service-7f4d9b      Synced  Degraded
           Pod         production  my-api-service-7f4d9b-x9p  Synced  Degraded        Back-off restarting failed container

    Drill in with standard kubectl commands to find the actual problem:

    kubectl -n production describe deployment my-api-service
    kubectl -n production get pods -l app=my-api-service
    kubectl -n production logs my-api-service-7f4d9b-x9p --previous

    Fix the underlying issue — wrong image tag, missing environment variable, insufficient memory limits, whatever kubectl logs tells you. Once the pods stabilize and the Deployment reaches minimum availability, ArgoCD updates the health status automatically and the OutOfSync condition resolves.

    For Custom Resource Definitions that ArgoCD doesn't have built-in health checks for, you can define Lua health check scripts in `argocd-cm`. Without these, CRDs default to Progressing indefinitely and block your sync just the same:

    kubectl -n argocd edit configmap argocd-cm
    
    data:
      resource.customizations.health.batch_Job: |
        hs = {}
        if obj.status ~= nil then
          if obj.status.succeeded ~= nil and obj.status.succeeded > 0 then
            hs.status = "Healthy"
            return hs
          end
        end
        hs.status = "Progressing"
        return hs

    Root Cause 4: Hook Job Failing

    ArgoCD sync hooks are Kubernetes Jobs annotated with `argocd.argoproj.io/hook` set to PreSync, Sync, PostSync, or SyncFail. When one of these jobs fails, ArgoCD marks the entire sync operation as failed and the application stays OutOfSync. If the failure is in a PreSync hook, the deployment never reaches the cluster. Everything stops.

    I've seen this bite teams hard when they add a database migration job as a PreSync hook. The migration fails — DB not reachable from the new network segment, schema conflict with a prior half-applied migration, whatever — and now no subsequent deployments can happen until someone clears the blockage. People stare at the ArgoCD UI convinced it's a GitOps problem when it's actually an application operations problem.

    The resource table makes the failure clear:

    argocd app get my-api-service
    
    GROUP  KIND        NAMESPACE   NAME                          STATUS     HEALTH    HOOK     MESSAGE
    batch  Job         production  my-api-service-pre-sync-hook  Failed     Degraded  PreSync  Job has reached the specified backoff limit
    apps   Deployment  production  my-api-service                OutOfSync  Healthy

    Get the logs from the hook job pod to understand the actual failure:

    kubectl -n production get pods -l job-name=my-api-service-pre-sync-hook
    
    NAME                                 READY  STATUS  RESTARTS  AGE
    my-api-service-pre-sync-hook-z7m2x   0/1    Error   0         8m
    
    kubectl -n production logs my-api-service-pre-sync-hook-z7m2x
    
    Error: dial tcp 10.0.1.45:5432: connect: connection refused
    Database migration failed. Exiting.

    Fix the root cause first — in this case, restore connectivity to the Postgres instance at 10.0.1.45. Then delete the failed job to clear the blockage and re-trigger the sync:

    kubectl -n production delete job my-api-service-pre-sync-hook
    
    argocd app sync my-api-service

    To prevent stale failed jobs from accumulating and causing this problem repeatedly, always set a hook deletion policy in your hook manifest annotations:

    annotations:
      argocd.argoproj.io/hook: PreSync
      argocd.argoproj.io/hook-delete-policy: HookSucceeded

    `HookSucceeded` cleans up the job only on success, leaving failed jobs in place so you can debug them. `BeforeHookCreation` (the other common option) deletes any existing hook resource before creating a new one — useful when you want a clean slate on every sync without manual intervention.
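
    Putting the pieces together, a minimal PreSync migration hook might look like the sketch below — the image and command are placeholders to adapt to your migration tooling:

    ```yaml
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: my-api-service-pre-sync-hook
      annotations:
        # Run before the main sync phase applies the Deployment.
        argocd.argoproj.io/hook: PreSync
        # Clear any leftover hook from a previous sync before creating this one.
        argocd.argoproj.io/hook-delete-policy: BeforeHookCreation
    spec:
      backoffLimit: 2                 # fail fast instead of retrying indefinitely
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: migrate
            image: ghcr.io/solvethenetwork/my-api-migrations:latest   # placeholder image
            command: ["./migrate", "up"]                              # placeholder command
    ```

    A low `backoffLimit` matters here: the sooner the Job reaches its limit, the sooner ArgoCD surfaces the failed sync instead of silently retrying.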

    Root Cause 5: IgnoreDifferences Not Configured

    This is, without question, the most common cause of a persistent OutOfSync status on an otherwise healthy application. External controllers mutate resources after ArgoCD syncs them. ArgoCD then sees a difference between what Git says and what lives in the cluster, marks the app OutOfSync, and — if auto-sync is enabled — tries to revert those mutations. This puts ArgoCD in a war with the external controller, and nobody wins.

    The most frequent offenders: HPA scaling up replica counts beyond what the Deployment manifest specifies, cert-manager injecting annotations or rotating certificate secrets, Istio and Linkerd admission webhooks inserting sidecar containers, and Kubernetes itself normalizing fields like `defaultMode` on volume mounts from `0644` to `420` (octal vs decimal — a fun one to debug at midnight).

    Run `argocd app diff` to see exactly what ArgoCD thinks is wrong:

    argocd app diff my-api-service
    
    ===== apps/Deployment production/my-api-service ======
    30c30
    <         replicas: 3
    ---
    >         replicas: 1
    112c112
    <           defaultMode: 420
    ---
    >           defaultMode: 0644

    Git has `replicas: 1`. The HPA scaled it to 3. ArgoCD wants to set it back to 1. If auto-sync is on, it will — and then the HPA will immediately scale it back to 3. You'll see this loop in the ArgoCD event history.

    Fix it by adding `ignoreDifferences` to the Application spec:

    apiVersion: argoproj.io/v1alpha1
    kind: Application
    metadata:
      name: my-api-service
      namespace: argocd
    spec:
      ignoreDifferences:
      - group: apps
        kind: Deployment
        jsonPointers:
        - /spec/replicas
        # defaultMode lives in the Deployment's volume spec, not in the ConfigMap itself
        - /spec/template/spec/volumes/0/configMap/defaultMode
      source:
        repoURL: git@github.com:solvethenetwork/k8s-manifests.git
        targetRevision: main
        path: apps/my-api-service
      destination:
        server: https://kubernetes.default.svc
        namespace: production

    Apply it and the OutOfSync status resolves within seconds:

    kubectl -n argocd apply -f my-api-service-app.yaml
    
    argocd app get my-api-service | grep "Sync Status"
    Sync Status:  Synced to main (a3f1c29)

    For fields you want to ignore globally across all applications — like sidecar injection mutations — configure `resource.customizations.ignoreDifferences` in `argocd-cm` rather than duplicating the same ignoreDifferences block in every Application manifest.
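
    The per-kind key format in `argocd-cm` is `resource.customizations.ignoreDifferences.<group>_<kind>`. A sketch that ignores a webhook-injected sidecar on every Deployment (the container name `istio-proxy` is illustrative — match it to whatever your mesh injects):

    ```yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: argocd-cm
      namespace: argocd
    data:
      resource.customizations.ignoreDifferences.apps_Deployment: |
        jqPathExpressions:
        # Drop the injected sidecar container from the diff on all Deployments.
        - '.spec.template.spec.containers[] | select(.name == "istio-proxy")'
    ```

    `jqPathExpressions` handles list entries whose index varies between resources, which plain `jsonPointers` can't express.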

    Root Cause 6: Webhook Not Configured

    This one isn't technically an error — it's a misconfiguration that causes confusion. Without a Git webhook pointing at your ArgoCD instance, ArgoCD falls back to polling the repository every 3 minutes (controlled by the `timeout.reconciliation` setting in `argocd-cm`, defaulting to 180 seconds). A commit lands in main. The developer checks ArgoCD. The app still shows the old commit hash. It looks stuck. They ping you.
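
    If a webhook genuinely isn't an option — say ArgoCD sits behind a firewall your Git host can't reach — the polling interval itself can be shortened in `argocd-cm`:

    ```yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: argocd-cm
      namespace: argocd
    data:
      timeout.reconciliation: 60s   # poll Git every 60 seconds instead of the 180s default
    ```

    The application controller reads this setting at startup, so restart it after the change. Shorter intervals trade Git-host API load for freshness — don't go much below 60s on large installations.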

    Check when ArgoCD last reconciled versus when the commit landed:

    argocd app get my-api-service | grep -A2 "Sync Status"
    Sync Status:  OutOfSync from main (a3f1c29)
    
    git log --oneline -3
    b7d2e91 (HEAD -> main, origin/main) fix: correct image tag for v2.4.1
    a3f1c29 feat: add readiness probe
    8c1b043 chore: bump resource limits

    ArgoCD is still on `a3f1c29` even though `b7d2e91` is on main. Check whether a webhook is configured in your Git provider by looking at the ArgoCD server logs for incoming webhook events — if you see none at all, polling is the only mechanism in play.

    Set up the webhook in GitHub under repository Settings → Webhooks → Add webhook. Set the payload URL to `https://argocd.solvethenetwork.com/api/webhook`, content type to `application/json`, and select the push event. Generate a random secret and configure it in ArgoCD:

    kubectl -n argocd edit secret argocd-secret

    Add the key `webhook.github.secret` with the base64-encoded value of your webhook secret. After that, ArgoCD will receive push notifications within seconds of a commit landing on the target branch — no more polling lag.
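
    Editing `argocd-secret` by hand means base64-encoding the value yourself. Applying a patch through `stringData` lets Kubernetes do the encoding — a sketch with a placeholder secret value:

    ```yaml
    apiVersion: v1
    kind: Secret
    metadata:
      name: argocd-secret
      namespace: argocd
    stringData:
      # Must match the secret you entered on the GitHub webhook form.
      webhook.github.secret: use-a-long-random-string-here   # placeholder
    ```

    Apply it with `kubectl apply` (server-side apply merges cleanly with the existing keys in the Secret), then push a test commit and watch the argocd-server logs for the incoming webhook event.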

    Root Cause 7: Manifest Rendering Error

    If you're using Helm or Kustomize, a broken values file or an invalid `kustomization.yaml` can prevent ArgoCD from generating manifests at all. There's nothing to compare against the live cluster state, so the app shows OutOfSync — but the real condition is a ComparisonError buried underneath.

    argocd app get my-api-service
    
    CONDITION         MESSAGE
    ComparisonError   rpc error: code = Unknown desc = helm template . --name-template my-api-service
                      --namespace production ... exit status 1:
                      Error: render error in "my-api-service/templates/deployment.yaml":
                      template: my-api-service/templates/deployment.yaml:23:18:
                      executing "my-api-service/templates/deployment.yaml"
                      at <.Values.image.tag>: nil pointer evaluating interface {}.tag

    Test rendering locally before pushing to catch these early:

    # For Helm
    helm template my-api-service ./chart -f values-production.yaml
    
    # For Kustomize
    kustomize build overlays/production

    In this case, `image.tag` is missing from the production values file. Add it, push to main, and ArgoCD picks it up on the next poll or webhook trigger. The ComparisonError clears and the diff appears normally.


    Prevention

    Most of these causes are preventable with a bit of upfront work. Here's what I'd put in place on any new ArgoCD installation before it touches production.

    • Configure Git webhooks on day one. Polling is a fallback, not an architecture. Webhooks give you sub-second sync triggers and eliminate the confusion of apparent drift that's really just a polling delay.
    • Audit ignoreDifferences before go-live. After your first sync in a staging environment, run `argocd app diff` on each application and catalog every field that external controllers modify. Add those fields to `ignoreDifferences` before they become incidents.
    • Set hook deletion policies on every hook resource. Never ship a hook Job without `argocd.argoproj.io/hook-delete-policy`. Stale failed jobs will block your next sync at the worst possible moment.
    • Alert on repo credential expiry before it happens. If you're using short-lived tokens or SSH keys with expiry dates, set a calendar reminder or a CronJob that checks expiry and fires a Slack alert with enough lead time to rotate without an incident.
    • Add `argocd app wait` to your CI pipelines. After pushing a change, run `argocd app wait --sync --health --timeout 300 my-api-service` and fail the pipeline if it doesn't converge. This catches sync failures before you declare the deployment successful.
    • Define Lua health checks for all CRDs your operators use. Without them, custom resources default to Progressing indefinitely and will hold up syncs just like a degraded Deployment would.
    • Set up Prometheus alerts on sync status. The `argocd_app_info` metric exposes sync and health labels. Alert on OutOfSync conditions that persist beyond a few minutes:
    groups:
    - name: argocd.rules
      rules:
      - alert: ArgoCDAppOutOfSync
        expr: argocd_app_info{sync_status="OutOfSync"} == 1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "ArgoCD app {{ $labels.name }} stuck OutOfSync"
          description: "{{ $labels.name }} in project {{ $labels.project }} has been OutOfSync for over 5 minutes."

    OutOfSync is ArgoCD telling you something doesn't match. Sometimes that's fine — a legitimate drift you're aware of. Often it's a signal that something quietly broke. The teams that catch it fast are the ones who built observability around it from the start, not after the first production incident.

    Frequently Asked Questions

    How do I force ArgoCD to re-sync immediately without waiting for the poll cycle?

    Run `argocd app sync my-api-service` from the CLI, or click the Sync button in the UI. You can also trigger a refresh (without a sync) using `argocd app get my-api-service --refresh` which forces ArgoCD to re-fetch from Git immediately. Configuring a Git webhook is the permanent fix for polling delays.

    What is the difference between OutOfSync and Degraded in ArgoCD?

    OutOfSync means the desired state in Git doesn't match the live state in the cluster. Degraded is a health status meaning one or more resources are unhealthy (CrashLoopBackOff, failed Job, etc.). An app can be OutOfSync but Healthy, Synced but Degraded, or both OutOfSync and Degraded simultaneously. Each combination points to a different root cause.

    Why does my app show OutOfSync immediately after a successful sync?

    This is almost always an IgnoreDifferences issue. An external controller (HPA, cert-manager, an admission webhook) mutated a resource field immediately after ArgoCD applied it, causing ArgoCD to detect a new diff. Run `argocd app diff my-api-service` to identify the specific fields and add them to the ignoreDifferences block in your Application spec.

    Can ArgoCD auto-sync fix an OutOfSync application automatically?

    Yes, if `syncPolicy.automated` is configured in the Application spec. However, auto-sync won't help if the root cause is a ComparisonError (can't reach Git or render manifests), a failed hook blocking the sync phase, or RBAC preventing the operation. In those cases, auto-sync will retry and fail repeatedly until you fix the underlying issue.
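
    For reference, an automated sync policy with pruning, self-heal, and bounded retries looks roughly like this (the retry numbers are a reasonable starting point, not a recommendation — tune them to your environment):

    ```yaml
    spec:
      syncPolicy:
        automated:
          prune: true      # delete live resources that were removed from Git
          selfHeal: true   # revert manual changes made directly against the cluster
        retry:
          limit: 5
          backoff:
            duration: 5s   # first retry after 5s
            factor: 2      # double the wait on each attempt
            maxDuration: 3m
    ```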

    How do I check what is actually different between Git and the cluster?

    Use `argocd app diff my-api-service`. This shows a unified diff between the desired manifests (from Git) and the live manifests (from the cluster). You can also pass `--local` with a path to diff against a local version of your manifests before pushing, which is useful for validating changes in CI.
