
Administration Guide

Diagnosing memory leak issues

If memory usage is very high and climbs rapidly within a short period, it might be a memory leak. You can analyze the issue with the following steps.

Note that a memory increase does not always mean a memory leak. A memory leak usually shows these symptoms:

  • Very fast, abnormal memory increase (usually under a normal or low traffic level)

  • Continuous memory increase without deallocation

  • Used memory is not released even after traffic drops or stops

The most important part of troubleshooting a memory leak is locating which module, process, or function causes the memory increase.

  1. Check history logs to see memory resource status:

    Log&Report > Event > Filter > Action > check-resource

    failure msg="mem usage raise too high,mem(67)

  2. Check whether there are any memory-related messages printed on the console.
  3. Check the number of connections to see whether the memory increase is caused by too many concurrent connections.

    /# netstat -nat | awk '{print $6}' | sort | uniq -c | sort -r

    319800 ESTABLISHED

    330 FIN_WAIT2

    251 LISTEN

    7 TIME_WAIT

    1 established)

    1 SYN_SENT

    1 Foreign

    If there are too many TIME_WAIT or FIN_WAIT2 connections, it may be abnormal, because it means connections are not being closed normally.

    If memory usage still does not decrease after the TIME_WAIT or FIN_WAIT2 connections are released, it may indicate a memory leak. See the sketch below for repeating this check over time.
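
    To watch for abnormal growth, the check above can be repeated on an interval. A minimal sketch (generic Linux shell; the 60-second interval is an arbitrary choice):

    # Print a timestamped count of connections per state every 60 seconds.
    while true; do
        date
        netstat -nat | awk '{print $6}' | sort | uniq -c | sort -rn
        sleep 60
    done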

  4. Execute diagnose debug memory several times, then compare the outputs to find which part/module/process shows the largest increase.

    Adjust the interval between runs, and how often you collect the output, according to how fast memory is increasing.
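
    For example, a minimal sketch on a management host, assuming each run's output was saved to a text file (e.g. via your terminal client's logging; the file names here are hypothetical):

    # Lines that differ between the snapshots show where usage grew.
    diff mem_snapshot_1.txt mem_snapshot_2.txt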

  5. Use diagnose debug jemalloc-heap and diagnose system jeprof to trace and analyze memory occupation, and the cause of the memory usage, over a period of time.
    • If the jemalloc profile is activated and memory usage exceeds the configured threshold, heap files are generated in the /var/log/gui_upload directory.
    • You can use diagnose debug jemalloc-heap to show or clear the heap files. At most 10 heap files are kept on the device.
    • You can use diagnose system jeprof to parse the heap files.
    • The jemalloc commands do not provide useful information when memory is not increasing.

      1) Enable the jemalloc profile:

      FortiWeb# diagnose debug jemalloc-conf proxyd enable

      2) If memory increases quickly, execute the command below to generate dump files.

      For example, you can wait until memory usage has increased by 10% and then run the command; it is better to repeat it each time memory grows by another 10%:

      FortiWeb# diagnose debug jemalloc proxyd dump

      3) Check the generated heap files:

      FortiWeb # diagnose debug jemalloc-heap show

      jeprof.out.28279.1641342474.heap

      jeprof.out.4973.1641276249.heap

      4) After collecting a few heap files, execute the command below to parse them. In the output, the first two columns are the bytes and percentage allocated directly by each function, the third column is the running cumulative percentage, and the fourth and fifth columns are the bytes and percentage allocated by the function together with its callees:

      FortiWeb # diagnose system jeprof proxyd

      Using local file /bin/proxyd

      Using local file /var/log/gui_upload/jeprof.out.28279.1641342474.heap

      Total: 124422365 B

      34403589 27.7% 27.7% 34403589 27.7% ssl3_setup_write_buffer

      34262011 27.5% 55.2% 34262011 27.5% ssl3_setup_read_buffer

      18062121 14.5% 69.7% 18062121 14.5% CRYPTO_zalloc

      17011023 13.7% 83.4% 17011023 13.7% _HTTP_init

      9905760 8.0% 91.3% 9905760 8.0% BUF_MEM_grow

      3195135 2.6% 93.9% 3195135 2.6% buffer_new

      1583640 1.3% 95.2% 18857320 15.2% HTTP_substream_process_ctx_create

      Using local file /bin/proxyd

      Using local file /var/log/gui_upload/jeprof.out.4973.1641276249.heap

      Total: 576387295 B

      175840569 30.5% 30.5% 175840569 30.5% ssl3_setup_write_buffer

      175415833 30.4% 60.9% 175415833 30.4% ssl3_setup_read_buffer

      81823328 14.2% 75.1% 81823328 14.2% CRYPTO_zalloc

      72087699 12.5% 87.6% 72612307 12.6% _HTTP_init

      8578052 1.5% 89.1% 84473564 14.7% HTTP_substream_process_ctx_create

      7654262 1.3% 90.5% 7654262 1.3% asn1_enc_save

      7311586 1.3% 91.7% 7311586 1.3% HTTP_get_modify_value_by_name

      6855757 1.2% 92.9% 6855757 1.2% pt_stream_create_svrinfo

      5851046 1.0% 93.9% 5851046 1.0% _hlp_parse_cookie

      5136808 0.9% 94.8% 5136808 0.9% HTTP_process_ctx_create

      5) Use a graph tool to analyze the function call relationships in the .heap files.

      This tool is for internal R&D investigation only; the steps below are for reference.

    • Generate a .dot file on the FortiWeb backend shell:

      jeprof --dot /bin/proxyd jeprof.out.4973.1641276249.heap > 1641276249.dot

      Or add the --base option with a previous .heap file to get the difference between two heaps:

      jeprof --base=jeprof.out.4973.1642276345.heap --dot /bin/proxyd jeprof.out.4973.1641276249.heap > diff.dot

    • Copy 1641276249.dot to an Ubuntu machine;
    • Install graphviz on Ubuntu:

      apt install graphviz

    • Generate a PNG image:

      dot -Tpng 1641276249.dot -o 1641276249.png

      The generated PNG shows the top memory-consuming functions and the function call relationships. Taking this case as an example, SSL buffer allocations dominate, so you can check whether the HTTPS traffic load has increased or related configuration has changed.
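
      If several .dot files were copied over, a small sketch to render them all at once (standard shell, nothing FortiWeb-specific):

      # Render every .dot file in the current directory to a PNG with the same base name.
      for f in *.dot; do
          dot -Tpng "$f" -o "${f%.dot}.png"
      done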

      6) You can also download the jeprof.out files and provide them to the support team for further investigation:

      /var/log/gui_upload# ls jeprof.out* -l

      -rw-r--r-- 1 root 0 109251 Sep 27 18:30 jeprof.out.11164.1632789019.heap

      -rw-r--r-- 1 root 0 111975 Dec 22 12:22 jeprof.out.3777.1640200954.heap

      Note: In jeprof.out.3777.1640200954.heap:

      3777 is the PID of proxyd.

      1640200954 is the UNIX timestamp; you can convert it to a human-readable date so that you only pay attention to recent dump files. This is useful for confirming which dump files are recent when there are many of them.

      For example:

      Epoch Converter - Unix Timestamp Converter (https://www.epochconverter.com/)
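
      Alternatively, you can convert the timestamp locally on any Linux host with GNU date (a sketch; the exact output depends on the host's time zone):

      # Convert the UNIX timestamp from the file name to a readable date.
      date -d @1640200954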

  6. Besides the jemalloc dump files, you can also generate proxyd object pool dump (pdump) logs with the command below.

    These logs, named proxyd-objpool-*-*.txt, contain memory statistics for key data structures.

    You can find them in the same directory, but you must download them manually via System > Maintenance > Backup & Restore, because they are not included in the one-click debug log download (console_log.tar.gz).

    FWB# diagnose debug jemalloc proxyd dump

    /var/log/gui_upload# ls -l proxyd*

    --wS--Sr-x 1 root 0 1417 Aug 3 10:38 proxyd-objpool-32741-1659548316.txt

    --wS--Sr-x 1 root 0 1417 Aug 3 10:38 proxyd-objpool-32741-1659548336.txt
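
    Since each dump produces a new snapshot, a quick sketch to see which object pools grew between two dumps (file names taken from the listing above):

    diff proxyd-objpool-32741-1659548316.txt proxyd-objpool-32741-1659548336.txt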

  7. Starting from the 6.4.0 GA release, a monitoring file is generated regularly at /var/log/gui_upload/debug_memory.txt. You can set a memory boundary for it: if memory usage reaches the boundary and proxyd or ml_daemon is among the top 10 memory consumers, their jemalloc debug function is enabled automatically.

    FortiWeb # show full system global

    config system global

    set debug-memory-boundary 70 #memory usage percentage, 1%-100%

    end
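
    To change the boundary, a minimal CLI sketch (the value is a percentage, 1-100, per the setting shown above; 80 here is an arbitrary example):

    config system global
      set debug-memory-boundary 80
    end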
