Bayesian Filtering Overview

Feature Overview

The Bayesian filter uses probability analysis of words and phrases to learn whether each is more commonly found in Ham (non-Spam) or Spam. The advantage of Bayesian filtering is that it adapts to the mail patterns of an individual organization, which can significantly improve the Spam catch rate. The longer the filter has been learning, the more accurate it becomes.
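To make the idea of per-word probability analysis concrete, here is a minimal Python sketch of a simplified naive Bayes scorer. It is an illustration only and not the SpamFilter implementation: the class name TinyBayes, the tokenizer, and the add-one smoothing are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9$'-]+", text.lower())


class TinyBayes:
    """Toy word-level Bayesian scorer (hypothetical, for illustration only)."""

    def __init__(self):
        # Number of learned messages of each class that contained a token.
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def learn(self, text, label):
        """Record each distinct token of the message under 'spam' or 'ham'."""
        self.counts[label].update(set(tokenize(text)))
        self.messages[label] += 1

    def spam_probability(self, text):
        """Combine per-token likelihood ratios into a spam probability."""
        # Prior odds from message counts, with add-one smoothing.
        log_odds = math.log((self.messages["spam"] + 1) / (self.messages["ham"] + 1))
        for token in set(tokenize(text)):
            p_spam = (self.counts["spam"][token] + 1) / (self.messages["spam"] + 2)
            p_ham = (self.counts["ham"][token] + 1) / (self.messages["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        # Numerically stable conversion from log-odds to a probability.
        if log_odds >= 0:
            return 1.0 / (1.0 + math.exp(-log_odds))
        e = math.exp(log_odds)
        return e / (1.0 + e)


if __name__ == "__main__":
    f = TinyBayes()
    f.learn("cheap pills buy now", "spam")
    f.learn("win money now", "spam")
    f.learn("meeting agenda for monday", "ham")
    f.learn("lunch on monday", "ham")
    print(f.spam_probability("buy cheap pills"))   # leans toward Spam
    print(f.spam_probability("monday meeting"))    # leans toward Ham
```

Because the counts come from the organization's own mail, the same word can push the score in different directions at different sites, which is the adaptation described above.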

 

Training

When the Bayesian filter is first enabled, it goes through a training period in which it learns which phrases are associated with Spam or Ham. After the initial training, the filter continues to learn and applies what it has learned to each new incoming message. When messages are scored incorrectly, the Bayesian filter can be re-trained. If a message is quarantined and then released, it is automatically re-learned as Ham. You can also use the SpamFilter Training Tool (formerly the Outlook Plug-in), which allows a user to mark received messages as Spam so that they are re-learned by the Bayesian filter.
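One way to picture re-learning is as moving a message's words from one class to the other. The Python sketch below assumes the message was previously learned under the wrong label; the TokenStore class and its method names are hypothetical and do not describe how the product stores its data.

```python
from collections import Counter


class TokenStore:
    """Hypothetical per-class token counts, for illustration only."""

    def __init__(self):
        self.tokens = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def learn(self, words, label):
        """Count each distinct word of a message under the given label."""
        self.tokens[label].update(set(words))
        self.messages[label] += 1

    def forget(self, words, label):
        """Undo an earlier learn() of the same message under this label."""
        self.tokens[label].subtract(set(words))
        # In-place addition of an empty Counter drops zero and negative counts.
        self.tokens[label] += Counter()
        self.messages[label] = max(0, self.messages[label] - 1)

    def relearn(self, words, old_label, new_label):
        """Move a message from one class to the other."""
        self.forget(words, old_label)
        self.learn(words, new_label)


if __name__ == "__main__":
    store = TokenStore()
    words = "quarterly invoice attached".split()
    store.learn(words, "spam")            # message was quarantined as Spam
    store.relearn(words, "spam", "ham")   # released, so re-learned as Ham
    print(store.messages)
```

Marking a received message as Spam with the Training Tool would be the same move in the opposite direction, from "ham" to "spam".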

 

Configuration

To turn on the Bayesian filter, access the user interface of your InstaGate or ThreatWall. Go to the SpamFilter menu option and click Bayesian Filter. Selecting the Bayesian Filtering check box starts the training process.

To enable use of the Training Tool, check the box labeled Allow Training Tool. After enabling Training Tool access, you have the option to set a management address. This is the IP address to which users' training communications are sent. If you would like users to be able to train the filter from outside the network, enter the WAN IP address in this box. For more information on the SpamFilter Training Tool, please read the guide available at the following URL:

http://www.esoft.com/kb/pdf/SpamFilterTraining.pdf

 

Troubleshooting

To verify that Bayesian scoring is working, you can enable detailed reporting on email messages. Go to the SpamFilter menu option in the user interface, click Settings, and then click the Advanced button. Check the box labeled Add Detailed Report to Message Headers. You can then view the headers of a message and look for Bayesian filter scoring.
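If you have saved the raw source of a message, the headers can also be scanned with a short script. The Python sketch below uses the standard email module and lists any header whose name starts with "X-Spam" or whose value mentions "bayes". The exact name and layout of the detailed report header are not documented here, so the matching rule and the sample message are assumptions.

```python
from email import message_from_string


def spam_headers(raw_message, needle="bayes"):
    """Return (name, value) pairs for headers that look spam-related.

    Assumption: the detailed report appears in a header whose name begins
    with "X-Spam" or whose value contains the word "bayes".
    """
    msg = message_from_string(raw_message)
    matches = []
    for name, value in msg.items():
        text = str(value)
        if name.lower().startswith("x-spam") or needle in text.lower():
            matches.append((name, text))
    return matches


if __name__ == "__main__":
    # Invented sample message; real header names and values may differ.
    sample = (
        "From: sender@example.com\n"
        "Subject: hello\n"
        "X-Spam-Report: bayes score 0.98\n"
        "\n"
        "Message body\n"
    )
    for name, value in spam_headers(sample):
        print(f"{name}: {value}")
```

If the script finds no matching headers after detailed reporting has been enabled, check the full header list by hand before concluding that scoring is not working.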

If you are having problems using the Training Tool, verify that the management address is set correctly. When you view the headers of the message, you should see the “X-Spam-Id” line. At the end of this line is the IP address and port used for training communication. Verify connectivity to this IP address and port. If you are external to the network, remote support will need to be enabled.
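The connectivity check can be scripted. The Python sketch below reads the X-Spam-Id header from a saved message, takes the IP address and port from the end of the line, and attempts a TCP connection. The header layout is an assumption: the sketch expects the value to end in an IPv4 address followed by a colon or space and a port number, and the sample values are invented.

```python
import re
import socket
from email import message_from_string

# Assumed layout: the header value ends with "<IPv4 address>:<port>"
# (or the same two fields separated by whitespace).
ENDPOINT_RE = re.compile(r"(\d{1,3}(?:\.\d{1,3}){3})[:\s]+(\d{1,5})\s*$")


def training_endpoint(raw_message):
    """Return (ip, port) from the end of the X-Spam-Id header, or None."""
    value = message_from_string(raw_message).get("X-Spam-Id")
    if value is None:
        return None
    match = ENDPOINT_RE.search(str(value))
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Invented sample; 192.0.2.10 is a documentation-only address.
    sample = "Subject: test\nX-Spam-Id: abc123 192.0.2.10:8443\n\nbody\n"
    endpoint = training_endpoint(sample)
    if endpoint is None:
        print("No usable X-Spam-Id header found")
    else:
        host, port = endpoint
        state = "reachable" if can_connect(host, port) else "not reachable"
        print(f"{host}:{port} is {state}")
```

A "not reachable" result from inside the network points to the management address or a firewall rule. From outside the network, it may simply mean that remote access has not been enabled.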

In rare circumstances, the Bayesian filter database may become corrupted. In these situations, you may see all messages scored with the same Bayesian score, or no Bayesian scoring at all even though the training period has ended. In this case, reset the Bayesian database and check functionality after re-training.

In almost all cases, you will not want to allow access to the Training Tool from the WAN. You should also grant Training Tool access only to your most trusted users. The tool can significantly affect the behavior of your SpamFilter because it allows control over the global whitelist and blacklist.

© 2012 eSoft. All rights reserved.