Symptoms
High CPU on a FortiGate doesn't always announce itself cleanly. Sometimes it's a frantic message — the internet is slow — and you SSH in to find the box sitting at 98% CPU with no obvious culprit. Other times it's an SNMP alert that fires at 3am, and by the time you look at it the CPU has already settled back down. Either way, here's what you'll typically see when things are going wrong:
- The System Resources widget in the GUI shows CPU above 80%, often spiking to 100% and staying there
- get system performance status reports high user, kernel, or iowait CPU across one or more cores
- GUI management becomes sluggish — pages take 15-30 seconds to load or time out entirely
- SSH sessions lag badly, characters echo late, or the connection drops with a timeout
- Users report degraded throughput or complete connectivity loss on affected segments
- HA heartbeat timeouts trigger an unexpected failover even though the unit isn't truly down
- SNMP traps fire for CPU threshold breaches and won't clear
Before chasing individual processes, always start with the global picture. Run get system performance status to see the CPU breakdown across states, then follow with diagnose sys top to find what's actually consuming cycles. The combination of those two commands will point you at the right root cause 80% of the time.
FortiGate # get system performance status
CPU states: 92% user 6% system 0% nice 0% idle 2% iowait
CPU0 states: 91% user 7% system 0% nice 0% idle 2% iowait
CPU1 states: 93% user 5% system 0% nice 0% idle 2% iowait
Memory: 8066428k total, 6234512k used (77.3%), 1831916k free (22.7%)
Average network usage: 2345000 bps in 1 minute, 2289000 bps out 1 minute
Average sessions: 194321 sessions in 1 minute
Average session setup rate: 8432 sessions per second in last 1 minute
Virus caught: 0 total in 1 minute
IPS attacks blocked: 124 total in 1 minute
FortiGate uptime: 14 days, 6 hours, 23 minutes
Note the iowait figure — that tells you whether the CPU is waiting on disk I/O rather than actually computing. Then run diagnose sys top and give it 10-15 seconds to stabilize before reading the output. Press q to exit.
FortiGate # diagnose sys top
Run Time: 14 days, 6 hours, 23 minutes, 51 seconds
4U, 6N, 0H; 8060M total, 1823M free
ipsengine 52 S 38.9 4.3
scanunitd 85 S 29.1 5.1
miglogd 106 S 8.2 1.2
httpsd 123 S 2.1 0.8
In that output, the first number after the process name is the PID, the letter is the process state (S = sleeping, R = running), and the next two numbers are CPU% and memory%. Anything consistently above 10% CPU warrants investigation.
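If you collect this output over SSH for trending, a short parser turns it into structured data you can alert on. A minimal sketch in Python — the field layout (name, PID, state, CPU%, memory%) is assumed from the sample above, and real output may carry extra columns on some builds:

```python
import re

# Match one process line from `diagnose sys top`:
# name, PID, state letter, CPU%, memory% (layout assumed from the sample).
LINE_RE = re.compile(
    r"^\s*(?P<name>\S+)\s+(?P<pid>\d+)\s+(?P<state>[A-Z<])\s+"
    r"(?P<cpu>\d+(?:\.\d+)?)\s+(?P<mem>\d+(?:\.\d+)?)\s*$"
)

def parse_top(output, cpu_threshold=10.0):
    """Return (name, pid, cpu%) for processes above the threshold,
    sorted hottest first — the 10% bar comes from the guidance above."""
    hot = []
    for line in output.splitlines():
        m = LINE_RE.match(line)
        if m and float(m.group("cpu")) >= cpu_threshold:
            hot.append((m.group("name"), int(m.group("pid")),
                        float(m.group("cpu"))))
    return sorted(hot, key=lambda p: p[2], reverse=True)

sample = """\
ipsengine      52      S      38.9    4.3
scanunitd      85      S      29.1    5.1
miglogd       106      S       8.2    1.2
httpsd        123      S       2.1    0.8
"""
print(parse_top(sample))  # only ipsengine and scanunitd clear the 10% bar
```

Fed the sample output above, this surfaces ipsengine and scanunitd as the processes worth investigating.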
Root Cause 1: AV Scanning Consuming CPU
This is one of the most common culprits I've seen in environments that haven't revisited their security profiles since initial deployment. When antivirus scanning runs in proxy-based inspection mode, the FortiGate must fully buffer each file before it can scan it. For every HTTP download, every email attachment, every file transfer — the box holds the data in memory and processes it through the AV engine before forwarding it on. At any real scale, this is brutal on CPU.
It gets significantly worse when SSL inspection is also in the picture. The FortiGate decrypts the traffic, hands it off to the AV engine, scans it, then re-encrypts it for the client. You're running two CPU-intensive operations in sequence for every HTTPS connection. If you've got a few hundred users streaming large file downloads over HTTPS while proxy-mode AV is enabled, the CPU will feel every single one of those sessions.
How to Identify
In diagnose sys top, look for scanunitd or avd sitting consistently high. You can also check the AV engine status directly to see the active scan thread count and queue depth:
FortiGate # diagnose antivirus status
AV engine: 6.4.159
Signature DB: 1.00889
Extended DB: 1.00889
AI/ML model: available
Scan daemon: scanunitd (PID: 85)
Current scan threads: 8
Active scans: 312
Queue depth: 47
Dropped: 0
A queue depth above zero under normal traffic conditions is a red flag. It means the AV engine cannot keep up with the incoming file scan rate. Check which policies have AV enabled and what inspection mode they're running:
FortiGate # show firewall policy | grep -A5 "av-profile"
set av-profile "default"
set inspection-mode proxy
How to Fix
The most impactful single change is switching from proxy-based to flow-based inspection wherever your security requirements allow. Flow-based AV doesn't buffer entire files — it streams data through the engine as packets arrive, which dramatically reduces CPU and memory overhead. You won't catch every file-based threat that proxy mode catches, but in the vast majority of enterprise environments the tradeoff is worth it:
FortiGate # config firewall policy
FortiGate (policy) # edit 10
FortiGate (policy-10) # set inspection-mode flow
FortiGate (policy-10) # end
If you must stay in proxy mode for specific policies, tune the file size limit so large files are passed through without scanning. A 4 GB ISO download does not need AV inspection in most environments, and trying to scan it is a CPU tax with minimal security return:
FortiGate # config antivirus profile
FortiGate (profile) # edit "default"
FortiGate (default) # config http
FortiGate (default-http) # set av-scan-max-filesize 10240
FortiGate (default-http) # end
FortiGate (default) # end
Also review your SSL inspection scope. If you're decrypting and scanning traffic to known-good CDNs and software update servers, you're burning CPU for zero security value. Build SSL inspection exemption lists for categories like software updates, financial services, and health — and apply them to your SSL inspection profile aggressively.
Root Cause 2: IPS Engine Overloaded
The IPS engine is a resource hog by design — it's doing deep packet inspection, running pattern matching against thousands of signatures, and tracking connection state simultaneously. In my experience, IPS-related CPU spikes usually happen after one of three events: a firmware upgrade that refreshed the signature database and added new compute-heavy signatures, a sudden traffic spike the engine wasn't sized for, or someone applying a broad "default" IPS profile to policies carrying high-volume bulk data traffic like backups, large file sync jobs, or video conferencing streams.
The IPS engine runs as ipsengine in the process list and can spawn multiple worker threads across CPU cores. When the engine is overwhelmed, new connections start queuing behind the inspection backlog. Session setup latency climbs. Users see slow page loads even when the link itself has plenty of headroom remaining.
How to Identify
FortiGate # diagnose ips session status
IPS session table: current: 47821, max: 65535
IPS engine resource status:
Engine 0: CPU 91%, Memory 78%, Queue: 1243 packets pending
Engine 1: CPU 87%, Memory 76%, Queue: 998 packets pending
Anomaly detection: enabled
Protocol decoders: active
A pending packet queue in the IPS engine is the clearest indicator of overload. If packets are queuing, you're past the engine's processing capacity. Cross-reference with diagnose sys top — if ipsengine is sitting above 40% on an ongoing basis, it's your root cause. You can also check IPS anomaly detection counters to rule out a flood event:
FortiGate # diagnose ips anomaly list
List of active anomaly detectors:
tcp_syn_flood: threshold=2000, current=1847, status=normal
udp_flood: threshold=2000, current=342, status=normal
icmp_flood: threshold=250, current=12, status=normal
How to Fix
Start by reviewing what's actually in your IPS profile. The default profile includes a massive set of signatures, many of which are completely irrelevant to your environment. If you're protecting Windows web servers, you don't need Cisco IOS exploit signatures active. Build a custom profile scoped to your actual asset types:
FortiGate # config ips sensor
FortiGate (sensor) # edit "custom-web"
FortiGate (custom-web) # config entries
FortiGate (custom-web-entries) # edit 1
FortiGate (entry-1) # set severity high critical
FortiGate (entry-1) # set os Windows
FortiGate (entry-1) # set application Webserver
FortiGate (entry-1) # end
If your hardware model supports multiple IPS engine workers and you're currently running fewer than your core count allows, increasing the worker count can help absorb traffic spikes:
FortiGate # diagnose ips global
Current IPS engine count: 2
Maximum supported: 4
FortiGate # config ips global
FortiGate (global) # set engine-count 4
FortiGate (global) # end
Don't set engine-count higher than your CPU core count minus two — you need headroom for other critical processes. And don't apply IPS to policies carrying internal backup traffic or large file replication jobs. Those flows don't benefit from IPS and the volume will overwhelm the engine.
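The headroom rule above is easy to encode as a sanity check before touching engine-count. A hedged sketch — the function name and the two-core reserve are taken from the guidance here, not from any Fortinet tooling:

```python
def recommended_ips_engines(cpu_cores, max_supported):
    """Rule of thumb from the text: never exceed core count minus two,
    capped at what the hardware model supports, floor of one worker."""
    return max(1, min(max_supported, cpu_cores - 2))

# Hypothetical hardware examples:
print(recommended_ips_engines(cpu_cores=8, max_supported=4))  # -> 4
print(recommended_ips_engines(cpu_cores=4, max_supported=4))  # -> 2
```

On the 4-engine-capable unit shown earlier, an 8-core box can safely run all 4 workers; a 4-core box should stop at 2.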
Root Cause 3: Logging to Disk Heavily
This one surprises people. Logging seems innocuous — how could writing log entries cause 80% CPU? The answer is I/O wait. When the FortiGate is logging at high volume to local disk, the CPU must wait for disk writes to complete before it can move on. On spinning disk models this is particularly bad. Even on SSD-equipped units, if logging is verbose enough — every accepted session, every denied packet, every IPS event, every AV scan result — the log daemon can consume significant CPU just managing write buffers and flushes.
I've walked into environments where traffic logging was enabled at the "all" level on every policy, including high-volume internal policies carrying backup jobs that transferred hundreds of gigabytes nightly. The box was writing tens of thousands of log entries per second to a local hard drive. CPU iowait was sitting at 15-20% continuously, and the system felt sluggish even when network utilization was entirely reasonable.
How to Identify
FortiGate # get log disk usage
Log disk usage:
Total: 119 GB
Used: 98 GB (82%)
Free: 21 GB (18%)
FortiGate # diagnose log test
Disk log test: 8234 messages/sec write rate
Warning: high log write rate detected
FortiGate # diagnose hardware sysinfo disk
Disk: /dev/sda1
type: HDD
total: 119 GB
free: 21 GB
read: 12.3 MB/s
write: 48.7 MB/s (sustained high)
In diagnose sys top, watch for miglogd — the log management daemon — consuming more than 5-8% CPU consistently. That process being elevated under sustained load is a reliable indicator that logging throughput is the problem, not data plane traffic.
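Those two signals combine into a simple heuristic for automated checks. A sketch using the thresholds stated above (the 15% iowait and 8% miglogd figures come from this article's examples, not from a Fortinet baseline):

```python
def logging_bottleneck(iowait_pct, miglogd_cpu_pct):
    """Heuristic from the text: sustained iowait near the 15-20% range,
    or miglogd above its normal 5-8% band, points at log throughput
    rather than data plane traffic."""
    return iowait_pct >= 15 or miglogd_cpu_pct > 8

print(logging_bottleneck(iowait_pct=18, miglogd_cpu_pct=9))  # -> True
print(logging_bottleneck(iowait_pct=2, miglogd_cpu_pct=1))   # -> False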
How to Fix
The cleanest solution is to offload logging to FortiAnalyzer or an external syslog server and demote local disk to a fallback role. There's no good reason to use local disk as your primary log destination when you have options:
FortiGate # config log fortianalyzer setting
FortiGate (setting) # set status enable
FortiGate (setting) # set server 10.10.20.50
FortiGate (setting) # set reliable enable
FortiGate (setting) # end
Then drastically reduce local disk logging verbosity. You don't need to log every accepted session locally — focus on security events and errors:
FortiGate # config log disk filter
FortiGate (filter) # set severity warning
FortiGate (filter) # set forward-traffic disable
FortiGate (filter) # set local-traffic disable
FortiGate (filter) # set multicast-traffic disable
FortiGate (filter) # set sniffer-traffic disable
FortiGate (filter) # end
This alone can drop log write rates by 90% in environments that had verbose logging enabled across all policies. After making this change, watch diagnose sys top for miglogd — it should drop significantly within a few minutes as the write queue drains.
Root Cause 4: Session Table Full
Every connection through the FortiGate consumes an entry in the session table. When the table fills up, the situation deteriorates fast. New sessions may be dropped before they're established. The kernel works harder to age out stale entries. In some conditions you'll see CPU spike as the system simultaneously tries to find space for new sessions and runs garbage collection on expired ones — a feedback loop that makes things worse the longer it continues.
Session tables fill up for a handful of reasons: abnormally long session timeouts leaving entries in the table well past their useful life, a traffic spike from a legitimate event or a DDoS, a compromised host or misconfigured backup client hammering out connections, or simply a device model that's under-sized for the actual session count the environment generates.
How to Identify
FortiGate # get system session status
Session table: current sessions: 512847, max: 524288 (97% full)
IPv6 session table: current: 1284, max: 65536
FortiGate # diagnose sys session stat
session_count=512847
session_count_ipv6=1284
setup_rate=12381
exp_count=43
clash=0
memory_tension_drop=8423
ephemeral=0
removeable=0
kernel_session_id=last: 3289441923
The memory_tension_drop counter is the smoking gun. If that number is non-zero and climbing, the system is actively dropping sessions because it can't accommodate new ones. A session table at 97% of max with a rising drop counter is a critical condition — not something to monitor for a few hours while you think about it.
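For monitoring, the two signals above — table utilization and a climbing memory_tension_drop counter — can be folded into one severity check. A sketch; the 80% and 95% cut-offs are this article's recommendations, not FortiOS defaults:

```python
def session_table_health(current, maximum, tension_drops, prev_drops=0):
    """Classify session-table pressure: any new memory_tension_drop or
    >95% utilization is critical; >80% utilization is a warning."""
    pct = 100.0 * current / maximum
    if tension_drops > prev_drops or pct > 95:
        return "critical", pct
    if pct > 80:
        return "warning", pct
    return "ok", pct

# Numbers from the sample output above:
status, pct = session_table_health(512847, 524288, tension_drops=8423)
print(status, round(pct, 1))  # -> critical 97.8
```

Track prev_drops between polling intervals so a historical, static drop count from a past incident doesn't keep the alert pinned at critical.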
How to Fix
First, identify what's consuming the session table. Find the top talkers by source IP:
FortiGate # diagnose sys session full-stat
Source IP Sessions Bytes
10.10.1.147 48321 2.4 GB
10.10.1.203 31442 1.1 GB
10.10.1.89 12847 445 MB
That first host with 48,000 active sessions warrants immediate investigation. That could be a port scanner, a misconfigured backup agent with excessive concurrency, or a compromised machine. Identify it before you take action. In the meantime, aggressively reduce session timeouts to free up table space faster:
FortiGate # config system session-ttl
FortiGate (session-ttl) # set default 300
FortiGate (session-ttl) # end
FortiGate # config system settings
FortiGate (settings) # set tcp-halfopen-timer 10
FortiGate (settings) # set tcp-timewait-timer 1
FortiGate (settings) # set udp-idle-timer 60
FortiGate (settings) # end
If you've identified a problem source and need to clear its sessions immediately:
FortiGate # diagnose sys session filter src 10.10.1.147
FortiGate # diagnose sys session clear
Use that session clear command carefully — it drops active sessions from the filtered source immediately. If the host is a legitimate server under abnormal load, coordinate with the relevant team before pulling that trigger.
Root Cause 5: Hardware Bypass (NPU Offloading) Not Working
FortiGate appliances include dedicated Network Processing Units — purpose-built silicon that handles packet forwarding, NAT, and session management entirely in hardware, completely bypassing the main CPU. When offloading is working correctly, established high-volume sessions barely touch the CPU at all. When offloading is broken or disabled, all that traffic flows through the software path, and CPU climbs fast regardless of how well everything else is tuned.
NPU offloading failures are particularly insidious because they can happen silently. A firmware regression in the NPU driver, a misconfigured interface flag, an unsupported feature in the active traffic path, or a configuration toggle that was set during a previous troubleshooting session and never restored — any of these can result in traffic that should be offloaded instead hammering the CPU around the clock. I've seen this specific issue appear after routine maintenance firmware upgrades where the NPU driver had a quiet regression that wasn't caught until users started complaining about performance the next business day.
How to Identify
FortiGate # diagnose hardware npu np6 session-stats 0
NP6 session statistics:
Total offloaded sessions: 847
Sessions expected to offload: 94821
Offload ratio: 0.9%
Drops: 0
Errors: 0
Hardware bypass: disabled (unexpected)
A 0.9% offload ratio on a busy FortiGate is catastrophically low. In healthy operation, that number should be 80-95%+ for typical internet traffic flows. Check the per-interface NP assignment and fastpath configuration:
FortiGate # diagnose npu np6 port-list
Ports assigned to NP6:
port1 (wan1): NP6_0
port2 (wan2): NP6_0
port3 (internal): NP6_0
port4 (dmz): NP6_0
FortiGate # diagnose hardware npu np6 port-stats 0
Port stats for NP6_0:
RX packets: 4823941234
TX packets: 4721834923
Offloaded: 42381 (0.09% of RX)
Then check whether npu-fastpath is actually enabled on the relevant interfaces. This is a common place where the problem hides:
FortiGate # show system interface port1
config system interface
edit "port1"
set vdom "root"
set ip 10.10.0.1 255.255.255.0
set allowaccess ping https ssh
set type physical
set npu-fastpath disable
next
end
There it is. That set npu-fastpath disable line means every packet traversing port1 is being processed in software.
How to Fix
Re-enable NPU fastpath on the affected interfaces:
FortiGate # config system interface
FortiGate (interface) # edit "port1"
FortiGate (port1) # set npu-fastpath enable
FortiGate (port1) # end
If NPU acceleration has been disabled globally — something that sometimes happens during troubleshooting sessions and gets left behind — restore it at the system level:
FortiGate # config system global
FortiGate (global) # set np-acceleration enable
FortiGate (global) # end
After making these changes, recheck the offload stats within a couple of minutes. CPU will drop noticeably within 60-120 seconds once offloading is restored on a heavily loaded unit. If offload ratios remain low after re-enabling fastpath, the traffic path itself may contain features that explicitly prevent NPU offloading — proxy-based policies, certain DoS policy configurations, and SSL inspection all route traffic through the software path by design. Review which features are active in the affected traffic flow and weigh whether each one is necessary.
Root Cause 6: Excessive Management Plane Traffic
Sometimes the CPU isn't struggling with data plane traffic at all — it's the management plane. SNMP polling at an aggressive rate from multiple monitoring systems, automated scripts hammering the REST API, too many concurrent GUI sessions, or a runaway syslog connection can all drive CPU up in ways that look confusing until you check the right processes. The management daemons httpsd and snmpd are the usual suspects.
How to Identify
FortiGate # diagnose sys top
httpsd 123 S 18.4 0.8
snmpd 134 S 11.2 0.4
sshd 201 S 0.3 0.1
If httpsd is consistently above 10-12% and you don't have heavy API automation, open admin sessions or aggressive GUI polling from multiple browser tabs are likely contributors. Check active admin sessions with get system session list | grep https and look for stale sessions from monitoring tools that never properly close their connections.
How to Fix
Consolidate SNMP polling to a single source, increase the polling interval to at minimum 60 seconds, and restrict management access by source IP. For REST API automation, ensure scripts maintain persistent connections rather than opening a new HTTPS session on every individual API call — each new session goes through the full TLS handshake and auth flow, which adds up fast at scale.
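The persistent-connection point is worth showing concretely. A minimal stdlib sketch — the hostname, token, and endpoint path are hypothetical placeholders, and a real script would add error handling and reconnect logic:

```python
import http.client
import json

# Hypothetical management host and API token — placeholders only.
HOST = "fortigate.example.com"
TOKEN = "<api-token>"

# One long-lived HTTPS connection, reused for every call. The TLS
# handshake is paid once on the first request, not once per request.
conn = http.client.HTTPSConnection(HOST, timeout=10)

def api_get(path):
    """Issue a GET over the already-established connection."""
    conn.request("GET", path, headers={"Authorization": f"Bearer {TOKEN}"})
    resp = conn.getresponse()
    return json.loads(resp.read())
```

The anti-pattern is constructing a fresh HTTPSConnection (or equivalent) inside the polling loop — at one poll per second across several monitors, those redundant handshakes are exactly the httpsd load described above.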
Prevention
High CPU events are rarely truly unpredictable once you understand what to watch for. The following practices will catch problems before they cascade into outages.
Monitor CPU and session table proactively, not reactively. Configure SNMP thresholds at 70% CPU for a warning and 85% for critical. Do the same for session table utilization — alert at 80% full so you have time to investigate before the table is exhausted. Most monitoring platforms can pull these values directly from the FortiGate MIB. Don't wait for user complaints to be your detection mechanism.
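If your monitoring platform takes custom check scripts, the thresholds above map directly onto a severity function. A sketch using the 70/85 CPU figures recommended here:

```python
def cpu_alert_level(cpu_pct, warn=70, crit=85):
    """Map a polled CPU reading onto the recommended thresholds:
    70% warning, 85% critical."""
    if cpu_pct >= crit:
        return "critical"
    if cpu_pct >= warn:
        return "warning"
    return "ok"

print(cpu_alert_level(72))  # -> warning
print(cpu_alert_level(91))  # -> critical
```

The same shape works for session table utilization with the 80%-full warning threshold suggested above.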
Audit security profiles every six months. Proxy-based AV and IPS are powerful tools but expensive ones. Regularly review which policies carry these profiles and whether flow-based inspection would meet your security requirements. In most environments, proxy-mode is only genuinely justified on policies handling inbound SMTP or file-sharing traffic where full file reassembly is required for accurate detection. Everything else is often better served by flow-based inspection.
Offload logging to FortiAnalyzer or syslog from day one, not after a crisis. Local disk logging is a maintenance burden and a latent CPU risk. Configure an external log destination during initial deployment and use local disk only as a fallback. The cost of a syslog server is negligible compared to the cost of a 3am CPU incident caused by runaway local log writes.
Establish session table baselines and know your normal. Run get system session status at regular intervals and document what a normal session count looks like at peak hours for your environment. When the count doubles unexpectedly, you want to notice that before the table fills. Anomalous session counts are often an early indicator of a compromised host or a misconfigured application long before the situation becomes critical.
Test NPU offload ratios after every firmware upgrade. This takes two minutes and has saved me from post-upgrade performance regressions more than once. Run diagnose hardware npu np6 session-stats 0 before your maintenance window, note the baseline offload ratio, and recheck it an hour after the upgrade under normal traffic conditions. If the ratio drops significantly, you have a driver regression and you can open a TAC case before users notice anything wrong.
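The before/after comparison is simple enough to automate in the upgrade runbook. A sketch — the 10-point regression tolerance is an assumption for illustration, so tune it to how stable your own baseline is:

```python
def offload_ratio(offloaded, expected):
    """Ratio of sessions actually offloaded to sessions expected to
    offload, as reported by the NPU session stats."""
    return offloaded / expected if expected else 0.0

def offload_regression(baseline_ratio, current_ratio, tolerance=0.10):
    """True if the post-upgrade ratio fell more than `tolerance`
    (fractional) below the pre-upgrade baseline."""
    return (baseline_ratio - current_ratio) > tolerance

# Numbers from the broken unit shown earlier: 847 of 94821 sessions.
broken = offload_ratio(847, 94821)
print(f"{broken:.1%}")                        # -> 0.9%
print(offload_regression(0.92, broken))       # regression -> True
print(offload_regression(0.92, 0.88))         # normal drift -> False
```

A healthy unit per the text sits at 80-95%+, so any post-upgrade reading that trips this check justifies a TAC case before the next business day.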
Right-size the platform against real-world throughput numbers. FortiGate datasheets publish impressive throughput figures, but those baseline numbers reflect plain firewall forwarding — not AV, IPS, SSL inspection, and logging simultaneously. The relevant numbers for a fully-featured deployment are the "threat protection" throughput figures, which reflect multi-feature real-world loads. A box rated for 10 Gbps firewall throughput might handle 1.5 Gbps of SSL-inspected, IPS-scanned, AV-scanned traffic. If you're consistently running above 70% CPU during normal business hours, start the hardware refresh conversation now — not during the next outage at 2pm on a Tuesday.
