Server policy intermittently inaccessible
If a server policy is accessible most of the time but occasionally becomes inaccessible, perform the following steps to troubleshoot.
- Check whether the network connection is stable:
Ping continuously from a remote client to see whether there are any failures or long response times;
Ping the back-end server from FortiWeb to see whether there are any failures or long response times;
Visit the back-end server continuously from a remote client to see whether there are any failures or long response times;
When access to the server policy from a remote client fails, visit the back-end server from FortiWeb to see whether there are any failures or long response times.
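The ping checks above can be made easier to review by filtering the output for slow or failed replies. The following is a minimal sketch, assuming Linux-style ping output that contains `time=<value> ms`; the function name `flag_slow_pings` and the 100 ms threshold in the usage comment are illustrative, not part of FortiWeb.

```shell
# Hypothetical helper: reads ping output on stdin and prints only
# replies slower than the threshold ($1, in milliseconds) and failures.
flag_slow_pings() {
    awk -v limit="$1" '
        /time=/ {
            split($0, parts, "time=")
            ms = parts[2] + 0            # numeric prefix of e.g. "12.3 ms"
            if (ms > limit) print "SLOW " ms " ms: " $0
            next
        }
        /timed out|unreachable|100% packet loss/ { print "FAIL: " $0 }
    '
}

# Usage (replace the address with the target to test):
#   ping -c 600 192.0.2.10 | flag_slow_pings 100
```

An empty result means that no reply exceeded the threshold and no failure line was seen during the run.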
- Check whether the status of the back-end servers in the server pool is stable:
If server health check is ON, check the Event logs for health check down/up events;
If server health check is OFF, check the logs on the back-end server (Apache/Nginx logs or another monitoring system) if possible;
Visit the back-end server continuously from FortiWeb to see whether failures or long response times occur from time to time or when the connectivity issue occurs.
You can use curl on the FortiWeb back-end shell to visit the back-end server and check the response time.
Samples:
curl -o /dev/null -s -w '%{time_total}\n' http://<back-end server_IP>:<port>
curl -v https://<domain/IP>/ -A "check_HTTP" -so /dev/null --resolve <domain>:<port>:<IP> -k -w '%{time_namelookup}::%{time_connect}::%{time_starttransfer}::%{time_total}::%{speed_download}\n'
You can also run a script on the FortiWeb back-end shell to visit the back-end server periodically and record the return code and response time. Upload the script via System > Maintenance > Backup&Restore > GUI File Download/Upload > Upload, then use chmod to add the execute permission. However, this is not recommended when traffic is heavy.
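A periodic probe script of the kind described above might look like the following minimal sketch. It assumes that curl, awk, date, and sleep are available in the back-end shell. The function names, the 1-second slow threshold, and the rule that HTTP codes below 200 or at 400 and above count as failures are illustrative assumptions, not FortiWeb defaults.

```shell
#!/bin/sh
# Hypothetical probe helpers; adjust the thresholds to your environment.

# Classify one result: $1 HTTP code, $2 total time (s), $3 slow threshold (s).
classify() {
    awk -v code="$1" -v t="$2" -v limit="$3" 'BEGIN {
        if (code + 0 < 200 || code + 0 >= 400) print "FAIL"   # curl reports 000 when no response arrives
        else if (t + 0 > limit + 0) print "SLOW"
        else print "OK"
    }'
}

# Probe a URL once and print "<timestamp> <code> <seconds> <verdict>".
probe_once() {   # $1 URL, $2 slow threshold in seconds
    result=$(curl -o /dev/null -s -m 10 -w '%{http_code} %{time_total}' "$1")
    code=${result%% *}
    secs=${result#* }
    echo "$(date '+%Y-%m-%d %H:%M:%S') $code $secs $(classify "$code" "$secs" "$2")"
}

# Probe a fixed number of times so the script cannot run unattended forever.
probe_loop() {   # $1 URL, $2 number of probes, $3 interval (s), $4 log file
    i=0
    while [ "$i" -lt "$2" ]; do
        probe_once "$1" 1 >> "$4"
        i=$((i + 1))
        sleep "$3"
    done
}

# Example with placeholder values (one probe every 5 seconds for about an hour):
#   probe_loop http://192.0.2.10:8080/ 720 5 /tmp/backend_probe.log
```

Bounding the number of probes is a deliberate choice here, since an endless loop adds load during heavy traffic.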
- Check whether the FortiWeb system has a resource shortage:
Check the FortiWeb event logs to see whether there was high CPU or memory usage when the issue occurred;
Find logs like the one below in Log&Report > Event > Filter > Action > check-resource:
CPU usage too high,CPU usage is 64, process cmdbsvr
For more information, see Checking System Resource Issues.
Check other system logs, such as the NMON files "debug_<function name>.txt", to see whether CPU or memory usage was abnormal when the issue occurred;
For more information, see Retrieving system logs in backend system.
Check whether a high volume of logs is generated or sent to external log servers such as FortiAnalyzer.
Under heavy traffic load, especially with high RPS or CPS numbers, CPU usage may become extremely high if traffic logs are enabled and a high volume of logs is generated, written to disk, or sent to FortiAnalyzer or other remote log servers.
In these situations, run the following command to see whether the CPU usage of logd, indexd, or mysqld is high:
diagnose system top
- Check whether traffic reaches FortiWeb's performance bottlenecks:
CPU or memory exhaustion events are often caused by traffic reaching a performance bottleneck, a traffic burst, or a DDoS attack. You can double-check with the methods below.
Check whether any real-time performance figures are overloaded when the issue occurs, for example, the number of concurrent connections, connections per second, transactions per second, and throughput.
For more information, see Checking CPU information&Issues.
You can also check third-party network monitoring systems (if available) to confirm whether there were any traffic bursts, overloads, or bandwidth exhaustion events.
- Check whether the FortiWeb TCP ports used to connect to the pserver are exhausted:
This issue usually happens when the number of concurrent connections reaches the TCP port limit, especially when only one FortiWeb IP is used to connect to a single back-end server. The maximum number of connections from a single FortiWeb IP to one pserver is 64500.
This issue may also happen when the available connections are occupied by a large number of TIME_WAIT connections. If the number of TIME_WAIT connections stays very large, it may indicate that new TCP connections can hardly be established, which causes new requests to fail.
The number of established concurrent connections can be found in Dashboard > Total Connection or through the CLI:
diagnose policy total-session list
The number of TIME_WAIT connections can be seen in the back-end shell with netstat. Note that netstat also shows the established connections, but the number is doubled because FortiWeb establishes separate connections with the client and with the pserver.
/# netstat -antp | grep ESTABLISH | wc -l
19094
/# netstat -antp | grep TIME_WAIT | wc -l
38688
/# netstat -nat | awk '{print $6}' | sort | uniq -c | sort -r
56338 TIME_WAIT
33940 ESTABLISHED
427 SYN_SENT
251 LISTEN
221 FIN_WAIT1
196 FIN_WAIT2
5 SYN_RECV
1 established)
1 Foreign
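Because the 64500 limit applies per FortiWeb IP and pserver, it can help to break the netstat totals down by source IP and destination. The following is a minimal sketch, assuming the usual `netstat -ant` column layout (local address in column 4, foreign address in column 5, state in column 6) and IPv4 addresses; the function name `count_by_pair` is illustrative.

```shell
# Hypothetical helper: reads "netstat -ant" output on stdin and counts
# connections in the given state ($1) per "local IP -> foreign IP:port".
count_by_pair() {
    awk -v state="$1" '
        $6 == state {
            src = $4
            sub(/:[0-9]+$/, "", src)     # drop the local (ephemeral) port
            pairs[src " -> " $5]++
        }
        END { for (p in pairs) print pairs[p], p }
    ' | sort -rn
}

# Usage in the back-end shell:
#   netstat -ant | count_by_pair TIME_WAIT | head
#   netstat -ant | count_by_pair ESTABLISHED | head
```

A pair near the top of the list whose count approaches 64500 suggests that this particular FortiWeb IP is running out of ports toward that back-end server.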
- Add secondary IPs to the interface connected to the back-end server:
Secondary IPs are required for both methods below.
FortiWeb # show system interface port3
config system interface
edit "port3"
set type physical
set ip 10.13.4.254/24
set allowaccess ping ssh snmp http https FWB-manager
config secondaryip
edit 1
set ip 10.13.4.253/24
next
edit 2
set ip 10.13.4.252/24
next
end
end
- Method 1: Enable ip-src-balance or ip6-src-balance to allow FortiWeb to connect to back-end servers using the multiple IP addresses configured above. This is a global option that affects all server policies. FortiWeb uses a round-robin algorithm across all primary and secondary IPs to distribute connections to back-end servers:
config system network-option
set ip-src-balance enable
set ip6-src-balance enable
end
- Method 2: Enable client-real-ip and add the available secondary IPs configured above to IP/IP Range. Traffic matching that specific policy then connects to back-end servers using the secondary IPs added to IP/IP Range. To ensure FortiWeb receives the server's response, configure FortiWeb as the back-end server's gateway.
This option is available only in Reverse Proxy mode.
FortiWeb # show server-policy policy
config server-policy policy
edit "Test_Policy"
...
set client-real-ip enable
set real-ip-addr 10.13.4.253
next
end
- Check whether kernel or daemon coredump files were generated when the issue occurred.
Check for core* or coredump* files via System > Maintenance > Backup & Restore > GUI File Download/Upload or in "/var/log/gui_upload".
Note that kernel coredump files cannot be displayed by diagnose debug crashlog show on 7.0.1 and earlier builds; they can be shown on 7.0.2 and newer builds.
- Collect other debug logs or files for further investigation.
- Execute diagnose system top and diagnose system perf several times to find the top CPU-consuming processes;
- Collect pstack information for proxyd to check where proxyd may be stuck;
On 6.3:
FortiWeb # fn sh
/#
/# pidof proxyd
8602
/# pstack 8602 #replace with the actual proxyd_pid
… …
From 7.0.0 to 7.0.3:
FortiWeb # fn pidof proxyd
28913
FortiWeb # fn pstack 28913 #replace with the actual proxyd_pid
… …
On 7.0.4 and newer builds, you need to configure shell-access and use an SSH client to log in to the back-end shell before collecting pstack information. Refer to Run backend-shell commands for how to configure shell-access.
/# pidof proxyd
28913
/# pstack 28913 #replace with the actual proxyd_pid
… …
If proxyd gets stuck for 5 or 60 seconds (the value varies between builds), watchdog files such as "watchdog-proxyd-3991-1658580435.bt" are generated and archived in the debug log "console_log.tar.gz". For more information on pstack, see Retrieving system logs in backend system.
- Check the output on the console terminal;
Some critical system messages are printed to the console but not written to the system logs, so the console output is sometimes very useful for locating the problem. Keep in mind that printing a large number of messages to the console may reduce system performance.
- Download system debug logs, including the one-click debug log "console_log.tar.gz" and other logs that must be downloaded manually.
Most of the necessary system logs are included in the "console_log.tar.gz" archive, while some must be downloaded manually, especially on older FortiWeb versions.
For more information on collecting "console_log.tar.gz", see Collecting core/coredump files and logs.
For more information on the content of these logs, see Retrieving system logs in backend system.
The more complete the logs you collect, the more useful they are for further analysis.
Solution
To alleviate or resolve TCP port exhaustion, increase the maximum number of connections by adding IP addresses that FortiWeb uses to connect to the back-end servers, as described in the "Add secondary IPs" step and Methods 1 and 2 above.
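As a rough illustration of the effect, the connection ceiling toward one pserver can be estimated by multiplying the per-IP limit by the number of FortiWeb source IPs. This sketch assumes the ceiling scales linearly with the number of IPs, which the text implies but does not state explicitly; the function name `max_connections` is illustrative.

```shell
PER_IP_LIMIT=64500   # from the text: one FortiWeb IP to one pserver

# $1: number of FortiWeb source IPs (primary plus secondary)
max_connections() {
    echo $(( $1 * PER_IP_LIMIT ))
}

# Primary IP plus the two secondary IPs from the sample interface configuration
max_connections 3    # prints 193500
```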