How does the Bayesian database affect email received by my Email Security Gateway?

  • Type: Knowledgebase
  • Date changed: 10 months ago
Solution #00001323

Scope:
This solution applies to all models of the Email Security Gateway, firmware versions 3.3 and higher.

Answer:
When fully configured, the Bayesian analysis process on the Email Security Gateway can raise or lower the spam score of a message by up to 5 points. The size and direction of the adjustment depend on how the Bayesian database has been trained (in other words, which messages have been marked as Spam and which as Not Spam).

When a message is marked as Spam (or Not Spam), the Email Security Gateway inserts all words, word pairs, and word triplets from that message into the Bayesian database as tokens. If the message was marked as Spam, the values associated with those tokens are increased slightly; if the message was marked as Not Spam, the values are decreased slightly. Once 200 messages have been marked as Spam and 200 as Not Spam, the tokens will be distributed across a range: highly negative values (words and short phrases seen only in messages marked Not Spam), values near zero (words and short phrases seen in Spam and Not Spam messages in equal measure), and highly positive values (words and short phrases seen only in messages marked Spam). This database of tokens and their values is used to adjust the spam score of each message, changing the score assigned by the Email Security Gateway's spam scanning process by a maximum of plus or minus 5 points.
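
The training step described above can be sketched in Python. This is only an illustration, not the Email Security Gateway's actual implementation: the function names extract_tokens and train, the lowercase word-splitting rule, and the step size of 1.0 are all assumptions made for the example.

```python
import re
from collections import defaultdict


def extract_tokens(text):
    """Return the distinct words, word pairs, and word triplets in text.

    Assumption: words are lowercase runs of letters, digits, and
    apostrophes. The appliance's real tokenizer is not documented here.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    tokens = set()
    for n in (1, 2, 3):  # single words, pairs, triplets
        for i in range(len(words) - n + 1):
            tokens.add(" ".join(words[i:i + n]))
    return tokens


def train(database, text, is_spam, step=1.0):
    """Nudge each token's value up for Spam, down for Not Spam.

    Assumption: a fixed hypothetical step; the article only says the
    values change "slightly".
    """
    delta = step if is_spam else -step
    for token in extract_tokens(text):
        database[token] += delta


database = defaultdict(float)
train(database, "cheap pills online", is_spam=True)
train(database, "meeting notes online", is_spam=False)

print(database["cheap pills"])    # 1.0  (seen only in Spam)
print(database["online"])         # 0.0  (seen equally in both)
print(database["meeting notes"])  # -1.0 (seen only in Not Spam)
```

With only two training messages the values are small; the article's point is that after a few hundred marked messages the values spread out enough to separate spam-leaning tokens from legitimate ones.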

The Email Security Gateway answers the following three questions to determine the Bayesian score of a message:
  • What percentage of the tokens extracted from the message were found in the Bayesian database?
  • Which category (Spam or Not Spam) has the higher occurrence rate for each of those tokens? Are there more Spam tokens or more Not Spam tokens?
  • What is the score for each of the tokens that were found in the Bayesian database?
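
As a rough illustration of how these three questions could combine into one adjustment, consider the Python sketch below. The article does not publish the actual formula, so the function name bayesian_adjustment, the scale parameter, and the rule "coverage times mean token value" are hypothetical; only the limit of plus or minus 5 points comes from the article.

```python
MAX_ADJUSTMENT = 5.0  # from the article: at most plus or minus 5 points


def bayesian_adjustment(tokens, database, scale=1.0):
    """Answer the three questions for one message and return the results.

    Assumption: the combining rule (coverage * mean value * scale) is a
    made-up stand-in for the appliance's undocumented calculation.
    """
    tokens = set(tokens)
    known = {t: database[t] for t in tokens if t in database}
    if not known:
        return {"coverage": 0.0, "spam_tokens": 0,
                "not_spam_tokens": 0, "adjustment": 0.0}

    # Question 1: what share of the message's tokens is in the database?
    coverage = len(known) / len(tokens)

    # Question 2: do more of the known tokens lean Spam or Not Spam?
    spam_tokens = sum(1 for value in known.values() if value > 0)
    not_spam_tokens = sum(1 for value in known.values() if value < 0)

    # Question 3: what are the scores of the known tokens?
    mean_value = sum(known.values()) / len(known)

    raw = scale * coverage * mean_value
    adjustment = max(-MAX_ADJUSTMENT, min(MAX_ADJUSTMENT, raw))
    return {"coverage": coverage, "spam_tokens": spam_tokens,
            "not_spam_tokens": not_spam_tokens, "adjustment": adjustment}


database = {"cheap": 8.0, "pills": 12.0, "meeting": -6.0}

spammy = bayesian_adjustment(["cheap", "pills"], database)
print(spammy["adjustment"])  # 5.0 (raw value 10.0, capped at the limit)

legit = bayesian_adjustment(["meeting", "agenda"], database)
print(legit["adjustment"])   # -3.0 (half the tokens known, mean value -6.0)
```

The cap is the part that matters in practice: however strongly the database leans on a message, the Bayesian step alone moves the final spam score by no more than 5 points in either direction.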

Additional Notes:
To check the Bayesian breakdown for a specific message, find the message on the Basic > Message Log page, click it, and in the new window that appears, click the View Bayesian Breakdown tab. Note that the breakdown is calculated from the Bayesian database as it exists when the page is generated, not as it was when the message arrived. If you or any other administrator has marked other messages as Spam or Not Spam since the message was received, the breakdown shown may differ from the calculation performed when the message was first received.


Link to This Page:
https://campus.barracuda.com/solution/50160000000GQAlAAO