Enhanced Watchdog Logging and Recovery for Worker Thread Failures (8.0.0)
FortiWeb 8.0.0 introduces enhancements to the watchdog mechanism that monitors the health of proxyd worker threads. These improvements offer better visibility into stuck threads and provide automatic recovery behavior to help maintain service continuity.
Previously, the watchdog could detect unresponsive proxyd worker threads, but its logs lacked detail and it did not support automated recovery. With this update, the system now:
- Logs detailed information when a worker becomes stuck.
- Automatically restarts the proxyd process if one or more threads remain unresponsive.
- Consolidates logs when multiple workers are affected in a short time window.
These changes improve FortiWeb's ability to detect, log, and recover from thread-level failures, making troubleshooting and monitoring more effective.
Key Enhancements
| Key Enhancements | Description |
|---|---|
| Thread-aware log naming | The watchdog log file now includes the proxyd process ID and the worker thread ID (TID), helping to identify the affected thread directly. |
| Consolidated logging | If multiple worker threads become stuck within a short time frame, FortiWeb writes only one combined watchdog log instead of separate files for each event. |
| Automatic proxyd restart | The system now restarts proxyd if any of the following conditions are met: |
| Stack trace logging | When a stuck worker is detected, FortiWeb captures and logs the full stack trace to assist with root cause analysis. This applies whether the issue is isolated or affects multiple threads. |
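
The consolidated logging behavior can be pictured with a small sketch. This is only a conceptual illustration of grouping events by a time window, not FortiWeb's implementation: the window length, the event representation, and the `consolidate` function are all hypothetical, since the actual window is not documented here.

```python
from typing import List, Tuple

# Hypothetical value: the real consolidation window is not documented.
WINDOW_SECONDS = 5.0

# A stuck-worker event: (detection time in seconds, worker thread ID).
StuckEvent = Tuple[float, int]


def consolidate(events: List[StuckEvent],
                window: float = WINDOW_SECONDS) -> List[List[StuckEvent]]:
    """Group events so that each group would produce one combined log."""
    groups: List[List[StuckEvent]] = []
    for event in sorted(events):
        # Join the current group if the event falls within the window
        # opened by that group's first event; otherwise start a new group.
        if groups and event[0] - groups[-1][0][0] <= window:
            groups[-1].append(event)
        else:
            groups.append([event])
    return groups
```

With these assumptions, two workers detected as stuck 1.5 seconds apart fall into one group (one combined log), while a worker that becomes stuck much later starts a new group.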
Watchdog Log Format
- Path: /var/log/gui_upload/watchdog-proxyd-<pid>-<timestamp>-<tid>.bt
- Contents:
  - Worker Stuck Status: Lists all affected worker threads, their thread IDs, and how long each has been unresponsive.
  - Stack Trace: Provides the call stack of each stuck thread to assist in debugging.
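
When collecting these files for analysis, it can help to split the file name back into its parts. The following is a minimal sketch based only on the file name layout shown above. It assumes that `<pid>` and `<tid>` are decimal integers and keeps `<timestamp>` as an opaque string, because its exact format is not stated here. The names `parse_watchdog_name` and `WatchdogLogName` are hypothetical helpers, not part of FortiWeb.

```python
import re
from typing import NamedTuple, Optional

# Layout taken from the documented path:
#   watchdog-proxyd-<pid>-<timestamp>-<tid>.bt
# Assumption: <pid> and <tid> are decimal integers. <timestamp> is captured
# as-is because its format is not documented here.
WATCHDOG_NAME = re.compile(
    r"^watchdog-proxyd-(?P<pid>\d+)-(?P<timestamp>.+)-(?P<tid>\d+)\.bt$"
)


class WatchdogLogName(NamedTuple):
    pid: int
    timestamp: str
    tid: int


def parse_watchdog_name(path: str) -> Optional[WatchdogLogName]:
    """Return the pid, timestamp, and tid encoded in a watchdog log name,
    or None if the name does not follow the expected layout."""
    name = path.rsplit("/", 1)[-1]  # drop any directory prefix
    match = WATCHDOG_NAME.match(name)
    if match is None:
        return None
    return WatchdogLogName(
        pid=int(match["pid"]),
        timestamp=match["timestamp"],
        tid=int(match["tid"]),
    )
```

For example, under these assumptions a file named `watchdog-proxyd-1234-1700000000-1240.bt` would be split into process ID 1234, timestamp string `1700000000`, and thread ID 1240. The sample values are made up for illustration.
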
These enhancements are backend-driven and require no manual configuration. The watchdog operates automatically as part of FortiWeb’s self-monitoring and fault recovery framework.