I would like to quarantine an outgoing email when the To and CC headers contain a large number of recipients. How can I do this?

Import the email address pattern from https://www.ciphermail.com/patterns.html. Make sure that the action is set to Quarantine. The number of email addresses at which the email will be quarantined is determined by the threshold.
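The effect of the threshold can be illustrated with a minimal sketch. This is not the pattern from the patterns page: the regular expression `EMAIL_RE`, the function `should_quarantine`, and the assumption that the rule fires once the count reaches the threshold are all hypothetical and serve only to show the idea.

```python
import re

# Hypothetical, simplified email address pattern (lowercase only, because
# the scanned text is lowercased). The real pattern may differ.
EMAIL_RE = re.compile(r"[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,}")


def should_quarantine(text: str, threshold: int) -> bool:
    """Return True when the number of address matches reaches the threshold.

    Assumption: the rule triggers when the match count is greater than or
    equal to the threshold.
    """
    return len(EMAIL_RE.findall(text)) >= threshold


headers = "to: a@example.com, b@example.com cc: c@example.com"
print(should_quarantine(headers, 3))
print(should_quarantine(headers, 4))
```

With three addresses in the text, the first call reports a match and the second does not, so a higher threshold lets email with fewer recipients pass.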

Can the DLP module detect all information leakage?

No. The DLP module protects against accidental data leakage, but it does not protect you against a knowledgeable attacker. Any DLP vendor that claims it can detect all information leakage is not telling the complete story. Information can be rewritten or recoded in such a way that it becomes almost impossible to detect. For example, the word “secret” can be written in ASCII art:

 ####  ######  ####  #####  ###### #####
#      #      #    # #    # #        #
 ####  #####  #      #    # #####    #
     # #      #      #####  #        #
#    # #      #    # #   #  #        #
 ####  ######  ####  #    # ######   #

If all outgoing information should be screened, the best approach is to quarantine all outgoing email and let the DLP managers approve or reject each message. DLP rules can be specified for a user, for a domain, or in the global settings. This allows you, for example, to set up a rule that only quarantines outgoing email sent by certain users.

I have added a sentence to the list of patterns but somehow the sentence is not matched. Why is the sentence not matched?

It could be that a word from your sentence belongs to the list of words that are skipped (the skip list). All words from the skip list are removed from the text before scanning. For example, if the skip list contains the word “this” and the pattern is “this is a text”, the pattern will never match, because the pattern contains a word that has already been removed from the scanned text.
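The example above can be sketched in a few lines. The skip list `SKIP_WORDS` and the helper `remove_skipped` are hypothetical stand-ins, and splitting on spaces is an assumption about how words are separated; the sketch only shows why a pattern containing a skipped word cannot match.

```python
import re

# Hypothetical skip list with a single entry; the default list is much longer.
SKIP_WORDS = {"this"}


def remove_skipped(text: str, skip_words=SKIP_WORDS) -> str:
    """Drop every word that appears in the skip list (assumed behaviour)."""
    return " ".join(word for word in text.split(" ") if word not in skip_words)


scanned = remove_skipped("this is a text")
print(scanned)  # is a text

# The pattern still contains "this", which no longer exists in the scanned text.
print(re.search(r"this is a text", scanned) is None)  # True

# A pattern without the skipped word does match.
print(re.search(r"is a text", scanned) is not None)  # True
```

In practice this means a pattern should be written without any word that is on the skip list.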

Tip

To see what text the DLP engine scans, you can upload the MIME content (i.e., a raw email) to the “Extract text from a MIME message” page (see Extract text from MIME message).

What is the skip list?

The skip list is the list of words that are removed from the text before scanning. The main reason for skipping words is to reduce scanning time: words that are very common and carry no sensitive information are removed before the patterns are applied. The default skip list contains the 100 most common words in English.

I would like to match a word if the word contains uppercase characters but not when it contains lowercase characters. Is this possible?

No, this is not possible. When text is extracted from the email, the text is normalized using the following procedure:

  • All carriage returns and line feeds are replaced with spaces.

  • Consecutive spaces are trimmed to one space.

  • All characters are converted to lowercase.

  • The resulting text is Unicode normalized (NFC).

The text is normalized because patterns for normalized text can be simpler and therefore faster to match. Because all text is converted to lowercase, you cannot match uppercase characters.
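The four normalization steps listed above can be sketched as follows. This is an illustration using the Python standard library, not the gateway's own code; the function name `normalize` is hypothetical.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Apply the four documented normalization steps in order."""
    # 1. Replace carriage returns and line feeds with spaces.
    text = text.replace("\r", " ").replace("\n", " ")
    # 2. Collapse consecutive spaces into a single space.
    text = re.sub(r" {2,}", " ", text)
    # 3. Convert all characters to lowercase.
    text = text.lower()
    # 4. Apply Unicode normalization form NFC.
    return unicodedata.normalize("NFC", text)


print(normalize("Top\r\nSECRET   Plan"))  # top secret plan
```

Because the uppercase input “SECRET” becomes “secret” before any pattern is applied, a pattern containing “SECRET” has nothing to match against.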

If my pattern contains uppercase letters, it never matches any text. Why is that?

The short answer is that all text is converted to lowercase before scanning, so a pattern that contains uppercase characters can never match. Make sure that your pattern only contains lowercase characters. For a more detailed answer, see the question “I would like to match a word if the word contains uppercase characters but not when it contains lowercase characters. Is this possible?”.

Are there any pre-defined patterns?

You can download some patterns from https://www.ciphermail.com/patterns.html. If you have patterns that can be valuable for others and would like to share them, please contact us.

Are attachments also scanned?

Text-based attachments, for example .xml and .html, are scanned. The contents of binary attachments are currently not scanned. Support for binary attachments, such as .doc, .zip and .pdf, will be added to future versions of the CipherMail gateway. The document type, however, is detected and added to the extracted text. This allows you, for example, to create a rule that quarantines an email if it contains a PDF or Word file. The document type is detected even if the filename extension has been changed.

I cannot delete certain patterns. Why is that?

A pattern that is in use, for example because it is selected in the global settings, cannot be deleted. Before a pattern can be deleted, make sure that it is no longer selected by a user, a domain or the global settings. To get an overview of which settings use the pattern, click the pattern and select view usage.

Are email headers scanned?

Yes, email headers are also scanned. However, the following headers are skipped:

  • Received

  • From

  • Reply-To

  • References

  • Message-ID

These headers are skipped to make false positives less likely when scanning for multiple email addresses. If you want all headers to be scanned, or want additional headers to be skipped, change the list of skipped headers in dlp.xml.
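The effect of the skipped headers can be sketched with the Python standard library. This is an illustration only: the names `SKIPPED_HEADERS` and `scannable_headers` are hypothetical, and the sketch does not reflect the format of dlp.xml or how the gateway reads it.

```python
from email import message_from_string

# The headers listed above, lowercased for case-insensitive comparison.
SKIPPED_HEADERS = {"received", "from", "reply-to", "references", "message-id"}


def scannable_headers(raw_message: str):
    """Return (name, value) pairs for the headers that are not skipped."""
    msg = message_from_string(raw_message)
    return [
        (name, value)
        for name, value in msg.items()
        if name.lower() not in SKIPPED_HEADERS
    ]


raw = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Hi\n"
    "Message-ID: <1@example.com>\n"
    "\n"
    "body\n"
)
print(scannable_headers(raw))
```

Here only the To and Subject headers remain, so the sender address in From is not counted when a rule looks for multiple email addresses.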