InfraRunBook

    DNS Propagation Delay Issues

    DNS
    Published: Apr 6, 2026
    Updated: Apr 6, 2026

    A deep-dive troubleshooting guide for DNS propagation delays, covering every layer of the resolution stack from authoritative servers to ISP resolvers, OS caches, and zone transfer failures.


    Symptoms

    DNS propagation delays are among the most operationally disruptive issues in infrastructure work. You have updated a record on your authoritative nameserver — changed an A record, swapped an MX endpoint, removed a deprecated CNAME — and yet hours later, behavior is inconsistent. Some users reach the new destination; others are still hitting the old one. The discrepancy is hard to reproduce because it depends entirely on which resolver a client happens to be using.

    Common symptoms include:

    • Running dig solvethenetwork.com @8.8.8.8 returns the new IP while dig solvethenetwork.com @1.1.1.1 still returns the old one
    • Users in different geographic regions or on different ISPs receive different IP addresses for the same hostname
    • After a server migration, the decommissioned host continues receiving production traffic
    • TLS certificate issuance via ACME fails intermittently because the ACME challenge DNS record resolves to the wrong IP from the CA's resolver
    • Email delivery bounces or defers because MX changes haven't reached the recipient's mail server resolver
    • Monitoring and health checks report the new record while end-user support tickets report the old behavior
    • A newly created subdomain returns NXDOMAIN from some resolvers even though it exists on the authoritative server

    Propagation delays are rarely caused by a single failure. They are typically a combination of several independently-caching systems — each operating on its own TTL clock — that must all expire their cached data before the change is universally visible. Diagnosing the problem requires interrogating each layer of the resolution stack in isolation.


    Root Cause 1: TTL Too High on the Old Record

    Why It Happens

    The Time-to-Live (TTL) field on every DNS resource record instructs recursive resolvers how long they are permitted to serve that record from cache before they must re-query the authoritative server. If a record carried a TTL of 86400 (24 hours) at the time a resolver last fetched it, that resolver is fully compliant with RFC 1035 when it continues serving the cached answer for the next 24 hours — even after you've changed the record on the authoritative server. This is not a bug; it is DNS behaving exactly as designed.

    The failure mode is almost always a process failure, not a technical one. An operator makes a record change and simultaneously lowers the TTL in the same zone file edit. By the time any resolver sees the new TTL value, it has already cached the old record under the old TTL, so the lower TTL has no effect on the current cache cycle. It only benefits the next re-fetch — which still won't happen until the original high TTL has expired.

    How to Identify It

    Query the authoritative nameserver directly to see the TTL currently published in the zone:

    dig @ns1.solvethenetwork.com solvethenetwork.com A +norecurse
    
    ;; ANSWER SECTION:
    solvethenetwork.com.    86400   IN  A   10.10.1.75

    Then query a public recursive resolver. The TTL in the answer will count down from the value at the time of caching:

    dig @8.8.8.8 solvethenetwork.com A
    
    ;; ANSWER SECTION:
    solvethenetwork.com.    82341   IN  A   10.10.1.50

    A remaining TTL of 82341 means the resolver cached this entry approximately 4059 seconds ago (86400 − 82341) and will continue serving the stale answer for another 22.8 hours. The authoritative server has the new record (10.10.1.75) but the resolver still has the old one (10.10.1.50) locked in for nearly a full day.
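
    The arithmetic above can be sketched as a quick shell calculation. The TTL values are the canned ones from the example dig output; in practice you would substitute the numbers you observe:

```shell
# Estimate cache age and time-to-expiry from the TTL a recursive resolver
# returns, given the TTL published on the authoritative server.
# Values are taken from the example dig output above (canned data).
zone_ttl=86400     # TTL on the authoritative record
seen_ttl=82341     # TTL remaining in the resolver's cached answer
age=$((zone_ttl - seen_ttl))
hours_left=$(( (seen_ttl + 1800) / 3600 ))   # rounded to the nearest hour
echo "cached ${age}s ago; roughly ${hours_left}h before the resolver re-queries"
```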

    How to Fix It

    The only correct procedure is to lower the TTL well in advance of any planned record change — before the change window, not during it. The lead time must equal at least one full current TTL cycle so that all resolvers that were holding the record under the old TTL have had a chance to refresh and pick up the new low TTL.

    ; Step 1: 24+ hours before the change window, lower the TTL
    solvethenetwork.com.    300     IN  A   10.10.1.50
    
    ; Step 2: After one full TTL cycle, make the record change
    solvethenetwork.com.    300     IN  A   10.10.1.75
    
    ; Step 3: After confirming propagation, restore TTL to operational value
    solvethenetwork.com.    86400   IN  A   10.10.1.75

    If you are already past the point of no return — the record has changed but the old high-TTL version is cached everywhere — there is no shortcut. You must wait. You can monitor progress by querying multiple public resolvers every few minutes and watching the TTL value count down toward zero. When it hits zero, that resolver will re-query and pick up the new record.


    Root Cause 2: Negative Cache Not Expired

    Why It Happens

    RFC 2308 defines negative caching: when a resolver queries for a record that does not exist (NXDOMAIN response) or queries for a record type that has no entries for a name (NOERROR/NODATA), the resolver caches that negative result. The duration of that negative cache is governed by the MINIMUM field of the zone's SOA record — commonly called the negative TTL.

    This creates two distinct problem scenarios. First, if you create a new hostname or record type that previously did not exist, any resolver that already queried and received an NXDOMAIN response will refuse to re-query until the negative TTL expires — even though the record now exists. Second, if a record was briefly absent during a zone reload, BIND restart, or accidental deletion, resolvers may have cached an NXDOMAIN during that window. Those resolvers will continue returning NXDOMAIN until the negative cache expires, regardless of how quickly you restore the record.

    How to Identify It

    Inspect the SOA record to determine the negative TTL value (the last field in the SOA RDATA):

    dig solvethenetwork.com SOA +short
    ns1.solvethenetwork.com. infrarunbook-admin.solvethenetwork.com. 2024040601 3600 900 604800 300

    The seventh field (300) is the negative TTL in seconds. To confirm that a remote resolver has cached a negative response, query it and inspect the status and authority section:

    dig @8.8.8.8 api.solvethenetwork.com A
    
    ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 44271
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
    
    ;; AUTHORITY SECTION:
    solvethenetwork.com.    287     IN  SOA  ns1.solvethenetwork.com. infrarunbook-admin.solvethenetwork.com. 2024040601 3600 900 604800 300

    Status NXDOMAIN combined with a SOA record in the authority section is the definitive signature of a cached negative response. The TTL on the SOA entry (287) is the remaining seconds before this negative cache expires and the resolver will try again.
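
    A quick way to script this check is to grep dig's output for the two-part signature. The output below is a canned sample for illustration; in practice you would capture it live from the resolver under suspicion:

```shell
# Detect the cached-negative-response signature: NXDOMAIN status plus an
# SOA record in the AUTHORITY section. 'out' holds a canned sample of
# dig output; in practice: out=$(dig @8.8.8.8 api.solvethenetwork.com A)
out=';; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 44271
;; AUTHORITY SECTION:
solvethenetwork.com. 287 IN SOA ns1.solvethenetwork.com. infrarunbook-admin.solvethenetwork.com. 2024040601 3600 900 604800 300'
if echo "$out" | grep -q 'status: NXDOMAIN' && echo "$out" | grep -q ' SOA '; then
  echo "cached negative response; wait for the SOA TTL to expire"
fi
```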

    How to Fix It

    Reduce the negative TTL in the SOA record for zones subject to frequent changes. Edit the zone file on sw-infrarunbook-01 and set the seventh SOA field to 60 seconds:

    $TTL 86400
    @   IN  SOA  ns1.solvethenetwork.com. infrarunbook-admin.solvethenetwork.com. (
                2024040602  ; Serial — increment after every change
                3600        ; Refresh
                900         ; Retry
                604800      ; Expire
                60 )        ; Negative TTL — reduced from 300 to 60

    After editing, increment the serial and reload the zone:

    rndc reload solvethenetwork.com
    server: reloading zone 'solvethenetwork.com/IN': success

    For records that were just added and have already been negatively cached, you must wait for the negative TTL to expire on each affected resolver. You cannot remotely flush external caches. The only workaround for end users who cannot wait is to point them at a resolver with a fresher cache, or have them flush their local OS-level cache.


    Root Cause 3: ISP Resolver Caching Stale Data

    Why It Happens

    ISP-operated recursive resolvers are shared infrastructure serving potentially hundreds of thousands of subscribers. Some of these resolvers implement aggressive caching strategies that deliberately extend TTL values beyond what the authoritative server advertises. This behavior — sometimes called TTL stretching — reduces the resolver's upstream query volume and improves perceived response times for subscribers. It is technically non-compliant with RFC 1035 but is common enough in the wild to be a regular source of propagation complaints.

    Even when an ISP resolver faithfully honors TTL values, its scale amplifies the problem: a heavily used resolver may receive thousands of queries per minute for solvethenetwork.com. Those queries do not reset the TTL; the cached entry expires on schedule regardless of how often it is served. But every query that arrives during the remaining TTL window is answered from the stale cache, so a popular record on a busy resolver exposes far more users to the stale answer than the same record on a lightly used one.

    How to Identify It

    The telltale sign is divergence between well-known public resolvers and a specific ISP's resolver. Query each vantage point and compare:

    # Ground truth — authoritative server
    dig @ns1.solvethenetwork.com solvethenetwork.com A +norecurse +short
    10.10.1.75
    
    # Google Public DNS
    dig @8.8.8.8 solvethenetwork.com A +short
    10.10.1.75
    
    # Cloudflare DNS
    dig @1.1.1.1 solvethenetwork.com A +short
    10.10.1.75
    
    # ISP-assigned resolver (RFC 1918 address seen via DHCP)
    dig @192.168.100.1 solvethenetwork.com A +short
    10.10.1.50   # still returning old IP

    If the authoritative server and multiple public resolvers all return the new record but a specific ISP resolver returns the old one — and the TTL on the SOA-published record has clearly long expired — ISP-side TTL stretching or aggressive caching is the cause.

    How to Fix It

    You have no direct mechanism to flush a third-party ISP's resolver cache. Your practical options are:

    • Wait: Even ISP resolvers with aggressive caching eventually expire entries. Most will re-query within hours even if they ignore the published TTL.
    • Redirect users temporarily: Instruct affected users to configure their system DNS to 8.8.8.8 or 1.1.1.1 until the ISP cache clears.
    • Contact the ISP NOC: Some ISPs will flush their resolver cache for a specific domain on request. Success rate is variable, but worth attempting for critical migrations.
    • Proxy old to new: If traffic volume is critical, keep the old server running with a reverse proxy or redirect rule pointing to the new server while the ISP cache drains.

    The most effective long-term mitigation is strict pre-migration TTL reduction. If the record's TTL was already at 300 seconds for 24 hours before the change, even a misbehaving ISP resolver will re-fetch within 5 minutes of each subscriber's query cycle.
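
    The divergence check from the identification step can be automated by comparing every resolver's answer to the authoritative one. The answers here are canned to mirror the example output; in practice each would come from dig @<resolver> solvethenetwork.com A +short:

```shell
# Flag resolvers whose cached answer differs from the authoritative one.
# Resolver answers are canned for illustration; replace them with live
# 'dig @<resolver> solvethenetwork.com A +short' output in real use.
auth="10.10.1.75"
for entry in "8.8.8.8=10.10.1.75" "1.1.1.1=10.10.1.75" "192.168.100.1=10.10.1.50"; do
  resolver=${entry%%=*}
  answer=${entry#*=}
  [ "$answer" = "$auth" ] || echo "$resolver is stale (returned $answer)"
done
```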


    Root Cause 4: Authoritative Server Not Updated

    Why It Happens

    Production DNS deployments almost universally involve a primary authoritative server and one or more secondary servers. Changes applied to the primary do not automatically appear on secondaries. Secondaries must either receive a DNS NOTIFY message from the primary and then pull an IXFR or AXFR, or they must poll the primary periodically (every Refresh interval from the SOA) and detect a serial number change. If the NOTIFY is lost, the secondary's poll interval is long, or the transfer itself fails silently, the secondary will continue serving the old zone data indefinitely.

    This is particularly insidious because a zone's NS records list all authoritative servers equally. Recursive resolvers typically round-robin across listed nameservers or select them by latency. This means some resolvers will hit the updated primary and get the correct answer, while others hit the stale secondary and get the old answer. The result is non-deterministic: the same query from two different resolvers returns different answers, and neither client can explain the discrepancy.

    How to Identify It

    Query each listed authoritative server directly and compare SOA serials and record values:

    # Check SOA serial on primary
    dig @ns1.solvethenetwork.com solvethenetwork.com SOA +short
    ns1.solvethenetwork.com. infrarunbook-admin.solvethenetwork.com. 2024040602 3600 900 604800 300
    
    # Check SOA serial on secondary
    dig @ns2.solvethenetwork.com solvethenetwork.com SOA +short
    ns1.solvethenetwork.com. infrarunbook-admin.solvethenetwork.com. 2024040601 3600 900 604800 300

    The serial mismatch (2024040602 vs 2024040601) confirms the secondary is serving a stale zone version. Verify by querying the specific record that was changed on each server:

    dig @ns1.solvethenetwork.com www.solvethenetwork.com A +short
    10.10.1.75
    
    dig @ns2.solvethenetwork.com www.solvethenetwork.com A +short
    10.10.1.50   # old IP — secondary is stale
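
    Serial comparison is easy to mechanize: the serial is the third RDATA field of a dig ... SOA +short answer. The SOA strings below are canned copies of the example output; in practice you would capture one from each nameserver:

```shell
# Compare zone serials across nameservers. The serial is field 3 of the
# '+short' SOA answer. Strings are canned; in practice capture each with
#   dig @<ns> solvethenetwork.com SOA +short
soa_ns1="ns1.solvethenetwork.com. infrarunbook-admin.solvethenetwork.com. 2024040602 3600 900 604800 300"
soa_ns2="ns1.solvethenetwork.com. infrarunbook-admin.solvethenetwork.com. 2024040601 3600 900 604800 300"
s1=$(echo "$soa_ns1" | awk '{print $3}')
s2=$(echo "$soa_ns2" | awk '{print $3}')
if [ "$s1" != "$s2" ]; then
  echo "serial mismatch: ns1=$s1 ns2=$s2 (secondary is stale)"
fi
```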

    How to Fix It

    On sw-infrarunbook-01 (the primary), force a NOTIFY to all configured secondaries:

    rndc notify solvethenetwork.com
    zone 'solvethenetwork.com' is now notified

    Watch the BIND log for confirmation that the transfer completed cleanly:

    tail -f /var/log/named/named.log
    
    06-Apr-2024 14:22:10.341 notify: info: zone solvethenetwork.com/IN: sending notifies (serial 2024040602)
    06-Apr-2024 14:22:10.502 xfer-in: info: transfer of 'solvethenetwork.com/IN' from 10.10.1.10#53: Transfer completed: 1 messages, 14 records, 476 bytes, 0.001 secs (476000 bytes/sec)

    If NOTIFY fails, manually trigger a retransfer from the secondary:

    # Run on the secondary nameserver
    rndc retransfer solvethenetwork.com

    If zone transfers are being blocked, verify that the primary's named.conf allows the secondary's IP under allow-transfer, and confirm that TCP/53 is open between the servers at the firewall level. Zone transfers always use TCP; a firewall that only permits UDP/53 will silently block all transfers.


    Root Cause 5: Partial Zone Transfer

    Why It Happens

    A zone transfer — whether a full AXFR or an incremental IXFR — can fail midway through due to a network interruption, a TCP session timeout, a firewall stateful table overflow, or a TSIG authentication failure on a single packet in a multi-packet sequence. When this occurs, the secondary may apply only part of the zone changes, leaving the zone in an internally inconsistent state: some records reflect the new version, others still carry old values. Critically, the secondary's SOA serial may or may not reflect the intended zone version depending on exactly when in the transfer process the failure occurred.

    IXFR transfers are more susceptible to this failure mode than full AXFR transfers. An IXFR conveys only the diff — the sequence of deletions and additions between two zone serial numbers. If that diff is complex (many records changed in a single serial increment) or if the transfer spans multiple TCP segments, a mid-transfer reset can leave the secondary in an indeterminate state that is difficult to detect by serial comparison alone.

    How to Identify It

    The warning sign of a partial transfer is matching serial numbers across servers accompanied by mismatched record data. Start with serial comparison:

    dig @ns1.solvethenetwork.com solvethenetwork.com SOA +short
    ns1.solvethenetwork.com. infrarunbook-admin.solvethenetwork.com. 2024040605 3600 900 604800 300
    
    dig @ns2.solvethenetwork.com solvethenetwork.com SOA +short
    ns1.solvethenetwork.com. infrarunbook-admin.solvethenetwork.com. 2024040605 3600 900 604800 300

    Serials match — but now compare specific records that were part of the update:

    dig @ns1.solvethenetwork.com app.solvethenetwork.com A +short
    10.10.2.20
    
    dig @ns2.solvethenetwork.com app.solvethenetwork.com A +short
    10.10.1.80   # old value — partial transfer left this record unchanged

    Matching serials with divergent record data is the definitive fingerprint of a partial zone transfer. Confirm by checking the secondary's BIND log for IXFR errors:

    grep -i "ixfr\|xfer\|transfer\|failed" /var/log/named/named.log
    
    06-Apr-2024 12:14:33.110 xfer-in: error: transfer of 'solvethenetwork.com/IN' from 10.10.1.10#53: IXFR failed: unexpected end of input
    06-Apr-2024 12:14:33.112 xfer-in: info: transfer of 'solvethenetwork.com/IN' from 10.10.1.10#53: retrying AXFR
    06-Apr-2024 12:14:33.891 xfer-in: error: transfer of 'solvethenetwork.com/IN' from 10.10.1.10#53: transfer failed: timed out
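
    Because serials alone cannot be trusted after a suspected partial transfer, it helps to diff a sample of recently changed records across the two servers. The name/answer pairs below are canned; in practice each answer would come from dig @<server> <name> A +short:

```shell
# Compare per-record answers from primary and secondary even though the
# serials match. Canned data, formatted name=primary_answer:secondary_answer.
records="app=10.10.2.20:10.10.1.80 www=10.10.1.75:10.10.1.75"
for rec in $records; do
  name=${rec%%=*}
  pair=${rec#*=}
  pri=${pair%%:*}
  sec=${pair#*:}
  [ "$pri" = "$sec" ] || echo "$name diverges: primary=$pri secondary=$sec"
done
```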

    How to Fix It

    Force the secondary to discard its current zone state and perform a fresh full AXFR. The safest method is to remove the zone journal file (which tracks IXFR history) and retrigger the transfer:

    # On the secondary (sw-infrarunbook-01 acting as secondary)
    systemctl stop named
    rm /var/cache/bind/solvethenetwork.com.jnl
    systemctl start named
    
    # Then explicitly request retransfer
    rndc retransfer solvethenetwork.com

    Confirm a clean completion in the logs:

    06-Apr-2024 14:30:01.221 xfer-in: info: transfer of 'solvethenetwork.com/IN' from 10.10.1.10#53: Transfer completed: 4 messages, 52 records, 2041 bytes, 0.004 secs

    To reduce the frequency of partial IXFR failures for critical zones, add request-ixfr no; to the secondary's zone configuration so it always uses a full AXFR. This increases transfer bandwidth but eliminates incremental transfer failures as a failure mode entirely. Alternatively, deploy TSIG for authenticated zone transfers — TSIG covers the entire transfer stream at the message level, providing integrity verification and making partial transfer corruption detectable immediately.
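
    A named.conf sketch tying both mitigations together might look like the following. The key name, secret placeholder, and addresses are hypothetical, and the statement names assume BIND 9.16+ (older releases use "type slave;" and "masters"):

```
// Hypothetical secondary-side named.conf fragment. Key name, secret,
// and addresses are placeholders.
key "xfer-key" {
    algorithm hmac-sha256;
    secret "REPLACE_WITH_BASE64_KEY";
};

server 10.10.1.10 {            // the primary
    keys { "xfer-key"; };      // sign all traffic to the primary with TSIG
};

zone "solvethenetwork.com" {
    type secondary;
    primaries { 10.10.1.10; };
    file "/var/cache/bind/solvethenetwork.com.zone";
    request-ixfr no;           // always pull a full AXFR
};
```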


    Root Cause 6: Multiple Authoritative Servers Running Different Zone Versions

    Why It Happens

    Some organizations operate DNS in a stealth primary configuration where the authoritative servers listed in the parent zone's NS delegation are all secondaries, with the actual primary hidden from public view. Changes flow: operator edits primary → primary notifies all secondaries → secondaries transfer. If a deployment pipeline applies changes to some secondaries but not others — due to a partial Ansible run, a failed configuration management job, or a network partition — the visible authoritative servers diverge silently.

    How to Identify It

    Enumerate all NS records and query each in turn:

    dig solvethenetwork.com NS +short
    ns1.solvethenetwork.com.
    ns2.solvethenetwork.com.
    
    for ns in ns1.solvethenetwork.com ns2.solvethenetwork.com; do
      printf "%-35s" "$ns:"
      dig @$ns solvethenetwork.com A +short
    done
    
    ns1.solvethenetwork.com:    10.10.1.75
    ns2.solvethenetwork.com:    10.10.1.50

    How to Fix It

    Re-apply the full zone update to all out-of-sync authoritative servers. If you use a configuration management system, ensure all DNS nodes are included in the target inventory and that the run completes successfully on every node. After applying changes, always run the serial comparison check across all listed NS records as part of your post-change verification before closing the change ticket.


    Root Cause 7: OS and Browser DNS Cache

    Why It Happens

    End-user operating systems maintain a local DNS cache independent of any upstream resolver. On Linux, systemd-resolved or nscd caches records locally. On macOS, mDNSResponder handles caching. Web browsers implement their own separate DNS cache on top of the OS cache — Chromium-based browsers cache positive responses for up to 60 seconds by default, and this is not controlled by the record's TTL.

    How to Identify It

    If a specific user reports stale resolution but querying the same upstream resolver from a different machine returns the correct answer, the problem is local to that user's machine:

    # Check systemd-resolved cache statistics
    resolvectl statistics
    
    Current Transactions: 0
    Total Transactions: 5112
    Current Cache Size:  94
    Cache Hits:          4803
    Cache Misses:        309

    How to Fix It

    Flush the local OS cache and the browser cache:

    # Linux — systemd-resolved
    resolvectl flush-caches
    
    # Linux — nscd
    nscd -i hosts
    
    # macOS
    sudo dscacheutil -flushcache; sudo killall -HUP mDNSResponder

    For browser caches, navigate to the internal DNS cache management page and flush it directly:

    # Chromium-based browsers
    chrome://net-internals/#dns
    
    # Firefox
    about:networking#dns

    Prevention

    Most DNS propagation issues are preventable with disciplined operational processes. The following practices eliminate the majority of propagation-related incidents:

    • Pre-lower TTLs before every record change. Reduce the target record's TTL to 300 seconds or less at least one full TTL-cycle before the planned change. Never change the record value and TTL simultaneously in the same edit.
    • Keep the negative TTL low. Set the SOA MINIMUM field to 60–300 seconds for all zones where new records may be added. A high negative TTL (e.g., 86400) leaves newly created records invisible to resolvers that previously queried them for an entire day.
    • Alert on SOA serial mismatches. Implement monitoring that queries all authoritative nameservers for each zone's SOA serial every 5 minutes and alerts if any server lags behind the primary by more than one serial increment. This catches failed transfers before they affect users.
    • Authenticate zone transfers with TSIG. Deploy TSIG keys between primary and secondary servers. TSIG authenticates every DNS message in a transfer sequence, making partial or corrupted transfers detectable immediately.
    • Verify across all authoritative servers after every change. Before closing any DNS change, query every NS record listed for the zone and confirm the expected value and serial are returned by each server.
    • Use a DNS change runbook. Standardize a five-step procedure: lower TTL → wait one TTL cycle → make the record change → verify across all authoritatives and multiple resolvers → restore TTL.
    • Audit parent zone NS delegation regularly. Decommissioned nameservers left in parent-zone NS delegation will serve stale or empty responses to resolvers that happen to query them. Audit and clean up NS records whenever a nameserver is retired.
    • Document rollback values. Before making any DNS change, record the current record value and TTL. If rollback is needed, time-to-restore matters — have the previous zone file state committed to version control and ready to re-apply immediately.
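
    The serial-mismatch alerting described above reduces to a small loop. Serials are canned here; a real check would fetch each with dig @<ns> solvethenetwork.com SOA +short | awk '{print $3}':

```shell
# Alert when any authoritative server lags the primary's zone serial.
# Serials are canned for illustration; fetch live values with dig in
# a real monitoring check.
primary_serial=2024040602
for pair in "ns1=2024040602" "ns2=2024040601"; do
  ns=${pair%%=*}
  serial=${pair#*=}
  if [ "$serial" -lt "$primary_serial" ]; then
    echo "ALERT: $ns is serving stale serial $serial (primary is $primary_serial)"
  fi
done
```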

    Frequently Asked Questions

    Q: What is DNS propagation and why does it take time?

    A: DNS propagation is the process by which a change made on an authoritative nameserver becomes visible to all recursive resolvers worldwide. It takes time because resolvers cache records for the duration of the record's TTL. Until that TTL expires and the resolver re-queries the authoritative server, it continues serving the cached answer. There is no broadcast mechanism in DNS — each resolver independently decides when to refresh its cache.

    Q: How do I check whether a DNS change has propagated to a specific resolver?

    A: Query that resolver directly using the @ flag in dig. For example: dig @8.8.8.8 solvethenetwork.com A +short. Compare the result to your authoritative server: dig @ns1.solvethenetwork.com solvethenetwork.com A +norecurse +short. If the answers differ, the resolver is still serving cached data.

    Q: What TTL value should I normally set on A records?

    A: For stable infrastructure records that rarely change, 3600–86400 seconds is reasonable. For records on hosts that are subject to migrations or failover, 300–600 seconds is more appropriate. As a rule: the higher the TTL, the longer propagation takes when a change is needed. Match your TTL to your operational risk tolerance for change propagation latency.

    Q: Can I force external resolvers to clear their cache for my domain?

    A: For most public resolvers, no — you cannot directly flush their caches. Google Public DNS and Cloudflare each publish a self-service cache flush tool for individual names. For ISP resolvers, you can call the NOC and request a manual flush. In all cases, the most reliable solution is to have a low TTL in place before making the change so the propagation window is short by design.

    Q: What is the difference between AXFR and IXFR, and which should I use?

    A: AXFR, the full zone transfer, conveys the complete zone file. IXFR (incremental zone transfer) transfers only the diff between two serial numbers. IXFR is more bandwidth-efficient for large zones but is more complex and can fail in ways that leave the secondary in a partially-updated state. For small-to-medium zones or for critical zones where correctness is paramount, AXFR is the safer choice. You can force AXFR-only behavior on a secondary by setting request-ixfr no; in the zone block in named.conf.

    Q: How does negative caching affect newly created DNS records?

    A: If a resolver queried for a hostname before that hostname's record was created and received an NXDOMAIN response, the resolver caches that negative result for the duration of the zone's negative TTL (SOA MINIMUM field). Even after the record is created, that resolver will return NXDOMAIN until the negative cache entry expires. This is a common cause of confusion when provisioning new services — the record exists on the authoritative server but is invisible to resolvers that pre-cached the NXDOMAIN.

    Q: Why do some users see the new record immediately after a change while others do not?

    A: Different users use different resolvers. A user querying a resolver that hadn't previously cached the record will get the new answer immediately. A user querying a resolver that cached the old answer under a high TTL will continue seeing the old answer until that TTL expires. The geographic and ISP diversity of resolvers in use by your user base directly determines the spread of propagation lag you observe.

    Q: How do I verify that my BIND secondary performed a complete and correct zone transfer?

    A: Check three things: (1) compare the SOA serial on the primary and secondary using dig @primary solvethenetwork.com SOA +short and dig @secondary solvethenetwork.com SOA +short — they should match; (2) query several recently-changed records on both servers and compare the answers; (3) review the BIND transfer log at /var/log/named/named.log for the most recent xfer-in entry and confirm it completed without errors and with a reasonable record count.

    Q: What does it mean when my SOA serials match but record values differ between primary and secondary?

    A: This is the signature of a partial zone transfer. The secondary received enough of the transfer to update its serial to match the primary's, but the transfer was interrupted before all record changes were applied. The secondary's zone data is now internally inconsistent. The fix is to delete the zone journal file on the secondary, restart BIND, and allow a clean full AXFR to complete.

    Q: How can I test DNS propagation across multiple global resolvers at once?

    A: You can script a check across multiple known public resolvers using a loop:

    for resolver in 8.8.8.8 1.1.1.1 9.9.9.9 208.67.222.222; do
      printf "%-18s" "$resolver:"
      dig @$resolver solvethenetwork.com A +short
    done

    This gives you a snapshot of resolver agreement at a point in time and lets you identify which resolvers are still serving stale data.

    Q: Is there any way to speed up propagation after a mistake has already been made with a high TTL?

    A: Once a high-TTL record is cached by a resolver, you cannot force that resolver to expire it early. Your practical options are: (1) wait for the TTL to naturally expire; (2) run the old and new services in parallel so traffic going to either destination is handled correctly; (3) instruct users to switch to a public resolver like 8.8.8.8 which may have a fresher cache or will at least refresh as soon as the current TTL expires; (4) lower the TTL immediately so that the next re-fetch cycle (whenever it occurs) will result in a short new cache duration. Option 4 shortens the propagation tail even if it does not fix the current cache cycle.

    Q: How does TSIG help prevent zone transfer issues?

    A: TSIG (Transaction Signature, RFC 2845) adds a cryptographic MAC to each DNS message in a zone transfer sequence. This serves two purposes: authentication (the secondary can verify the transfer is coming from an authorized source) and integrity verification (any corruption or truncation of the transfer stream is immediately detectable). Without TSIG, a partial transfer that happens to terminate cleanly at a message boundary may not generate any log error, leaving the secondary silently inconsistent. With TSIG, any message that fails MAC verification causes the transfer to be rejected and logged, triggering a retry.
