Web content filter
You can control web content by blocking access to web pages that contain specific words or patterns. This helps prevent access to pages with questionable material. Patterns can be words, phrases, wildcard expressions, or Perl regular expressions that match content on web pages. You can create multiple web content filter lists and then select the most suitable list for each web filter profile.
Enabling web content filtering involves three separate parts of the FortiGate configuration.
- The security policy allows certain network traffic based on the sender, receiver, interface, traffic type, and time of day.
- The web filter profile specifies what sort of web filtering is applied.
- The web content filter list contains blocked and exempt patterns.
The web content filter feature scans the content of every web page that is accepted by a security policy. The system administrator can specify banned words and phrases and attach a numerical value, or score, to each one to reflect its importance. When the web content filter scan detects banned content, it adds up the scores of the banned words and phrases found in the page. If the sum equals or exceeds the threshold set in the web filter profile, the FortiGate unit blocks the page.
General configuration steps
Follow the configuration procedures in the order given. If you perform additional actions between procedures, your configuration may produce different results.
- Create a web content filter list.
- Add patterns of words, phrases, wildcards, and regular expressions that match the content to be blocked or exempted.
- You can add the patterns in any order to the list. You need to add at least one pattern that blocks content.
- In a web filter profile, enable the web content filter and select a web content filter list from the options list.
To complete the configuration, you need to select a security policy or create a new one. Then, in the security policy, enable Webfilter and select the appropriate web filter profile from the list.
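The final step can be sketched in the CLI as follows. This is a minimal illustration rather than a definitive procedure: the policy ID 1 and the profile name "example-profile" are hypothetical, and the option names utm-status and webfilter-profile are assumed to apply to your FortiOS release. Check the CLI reference for your version before using it.

```
config firewall policy
    edit 1
        set utm-status enable
        set webfilter-profile "example-profile"
    next
end
```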
Creating a web filter content list
You can create multiple content lists and then select the most suitable one for each web filter profile. Web content lists can be created only from the CLI.
This example shows how to create a web content list called inappropriate language, with two entries, offensive and rude.
To create a web filter content list
    config webfilter content
        edit 3
            set name "inappropriate language"
            config entries
                edit offensive
                    set action block
                    set lang western
                    set pattern-type wildcard
                    set score 15
                    set status enable
                next
                edit rude
                    set action block
                    set lang western
                    set pattern-type wildcard
                    set score 5
                    set status enable
                next
            end
        next
    end
Configuring a web content filter list
Once you have created the web filter content list, you need to add web content patterns to it. There are two types of patterns: Wildcard and Regular Expression.
Use the Wildcard setting to block or exempt a single word or a text string of up to 80 characters. You can also use wildcard symbols, such as “*” or “?”, to represent one or more characters. For example, as a wildcard expression, forti*.com matches fortinet.com and forticare.com, because “*” stands for any characters appearing any number of times.
Use the Regular Expression setting to block or exempt patterns written as Perl regular expressions. These use some of the same symbols as wildcard expressions, but for different purposes. In a regular expression, “*” matches the character before it, repeated any number of times. For example, forti*.com matches fortiii.com but not fortinet.com or fortiice.com, because “*” here applies only to the “i” that precedes it.
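As an illustration of the Regular Expression setting, the following sketch adds the forti*.com pattern as a regular expression entry. It assumes the list with ID 3 from the earlier example, and it assumes that regexp is the keyword for this pattern type in your FortiOS release. The score of 10 is an arbitrary illustrative value.

```
config webfilter content
    edit 3
        config entries
            edit "forti*.com"
                set action block
                set lang western
                set pattern-type regexp
                set score 10
                set status enable
            next
        end
    next
end
```

With pattern-type set to wildcard instead, the same string would match fortinet.com and forticare.com, as described above.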
The maximum number of web content patterns in a list is 5000.
How content is evaluated
Every time the web content filter detects banned content on a web page, it adds the score for that content to the sum of scores for that web page. You set this score when you create a new pattern to block the content. The score can be any number from zero to 99999. Higher scores indicate more offensive content. When the sum of scores equals or exceeds the threshold score, the web page is blocked. The default score for web content filter is 10 and the default threshold is 10. This means that by default a web page is blocked by a single match. Blocked pages are replaced with a message indicating that the page is not accessible according to the Internet usage policy.
Banned words or phrases are evaluated according to the following rules:
- The score for each word or phrase is counted only once, even if that word or phrase appears many times in the web page.
- The score for any word in a phrase without quotation marks is counted.
- The score for a phrase in quotation marks is counted only if it appears exactly as written.
The following table shows how these rules apply to a web page that contains only this sentence: “The score for each word or phrase is counted only once, even if that word or phrase appears many times in the web page.”
Banned pattern rules
Banned pattern | Assigned score | Score added to the sum for the entire page | Threshold score | Comment |
---|---|---|---|---|
word | 20 | 20 | 20 | Appears twice but is counted only once. Web page is blocked. |
word phrase | 20 | 40 | 20 | Each word appears twice but is counted only once, giving a total score of 40. Web page is blocked. |
word sentence | 20 | 20 | 20 | “word” appears twice and “sentence” does not appear, but since any word in a phrase without quotation marks is counted, the score for this pattern is 20. Web page is blocked. |
“word sentence” | 20 | 0 | 20 | This phrase does not appear exactly as written. Web page is allowed. |
“word or phrase” | 20 | 20 | 20 | This phrase appears twice but is counted only once. Web page is blocked. |
Enabling the web content filter and setting the content threshold
When you enable the web content filter, the web filter blocks any web page for which the sum of scores for banned content equals or exceeds the content block threshold. The threshold is disregarded for any exemptions within the web content filter list.
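A minimal CLI sketch of this step might look like the following. The profile name "example-profile" is hypothetical, list ID 3 refers to the list created in the earlier example, and the option names bword-table and bword-threshold under config web are assumptions that should be verified against the CLI reference for your FortiOS release. The threshold of 20 is an illustrative value: with the example list, a page containing both offensive (score 15) and rude (score 5) would reach it, while a page containing only one of them would not.

```
config webfilter profile
    edit "example-profile"
        config web
            set bword-table 3
            set bword-threshold 20
        end
    next
end
```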