Filtering foreign language charsets

Posted: Wed Mar 30, 2005 12:28 am
by wysocki
I get junk email that has foreign character sets in either the subject or sender (see selected lines in image below). How can I filter these out?


Posted: Wed Mar 30, 2005 12:43 am
by KY Dave
In PopTray, RIGHT CLICK on any of the emails that have the characters you want to filter, and add a rule to delete email if those characters are in the subject.


Then in PopTray on the RULES tab, click on the RULE you just added.
Change the CRITERIA to COMPARE = WILDCARD. In the TEXT area, delete all but one of the characters, then put an ASTERISK before the character and an ASTERISK after it.

SAVE RULES, QUIT PopTray and RESTART for changes to take effect.

Test the rule and see if it works. You might need to click SHIFT+CHECK the first time to make the rules fire.

On that same RULE, you can ADD ROW and do the same thing for another character. Be sure to change the CRITERIA of NEEDED to ANY ROW. You can add as many rows as you need to the same rule, or create a new rule.
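
To make the ANY ROW idea concrete, here is a rough Python sketch of how a multi-row wildcard rule behaves. This is not PopTray's actual code. The function names and the two sample characters are made up for illustration, and it assumes matching is case-insensitive.

```python
from fnmatch import fnmatchcase

def row_matches(subject, pattern):
    # One rule row: a wildcard such as "*x*" matches if x appears anywhere.
    # Lowercasing both sides is an assumption (case-insensitive matching).
    return fnmatchcase(subject.lower(), pattern.lower())

def rule_matches(subject, rows):
    # NEEDED = ANY ROW: the rule fires as soon as one row matches.
    return any(row_matches(subject, row) for row in rows)

# Hypothetical rows, one character per row, each wrapped in asterisks.
rows = ["*å*", "*î*"]

print(rule_matches("Ïðîäàæà", rows))          # contains "î"
print(rule_matches("Meeting at noon", rows))  # no listed character
```

Each extra character you want to catch is just one more entry in the list, which is the same as adding one more row to the rule.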
Looking at your image, I'd suggest you try K9 to classify your spam and let PopTray delete it automatically. If you would like to try it, click the link in my signature.

one-line regex

Posted: Wed Mar 30, 2005 5:31 am
by lemming
I think Dave's "delete anything with foreign chars" rule is too broad. Plus it requires too many rules/rows to be useful.

While it may stop the spam you described, it would also catch words like Raphaël, resumé and many Scandinavian and French names.

From the example you provided, I can see that every instance of that type of spam contains 3 consecutive foreign vowels.

So here is a one-line regular expression (regex) rule which will catch those spams:

Subject -> Reg Expr ->


The square brackets contain the foreign vowels to look for, while the curly brackets just mean "three or more occurrences of".

This would match any subject that has three foreign vowels together, but it would not trigger on something like resumé or flambé.
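
If you want to try the idea outside PopTray first, here is a small Python sketch of the same kind of pattern: a bracketed set of vowels followed by {3,}. The vowel set below is only an assumption for illustration, so swap in whichever characters actually show up in your spam. The function name is made up too.

```python
import re

# Hypothetical vowel set; the characters in your own rule may differ.
ACCENTED_VOWELS = "àáâãäåèéêëìíîïòóôõöùúûü"

# [ ... ] = any one of these characters, {3,} = three or more in a row.
# IGNORECASE mirrors the assumption that rules are not case sensitive.
PATTERN = re.compile("[" + ACCENTED_VOWELS + "]{3,}", re.IGNORECASE)

def looks_like_spam(subject):
    # True when the subject has three or more listed vowels together.
    return PATTERN.search(subject) is not None

for subject in ["àåî cheap offer", "resumé", "flambé", "Raphaël"]:
    print(subject, looks_like_spam(subject))
```

Only the first subject has three of the listed vowels in a row, so the single accented letters in the other words do not trigger it.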

Some notes:

1) Punctuation is very important in regex, in particular the square brackets, curly brackets and comma.
2) Make sure you do not include any regular (English) vowels in the regex.
3) PopTray rules are not case sensitive, and I believe that applies to foreign chars too.
4) If you need to add more foreign characters to the rule, just use the Character Map utility which is installed by default in Windows XP (accessories->system tools).
5) You should use "Mark as Spam" for initial testing.

I also noticed you're still on PopTray RC2. You should upgrade to the final version 3.1.

-Lemming 8)