The Rules & Regular Expressions Thread

General discussion about PopTray. You love it? You hate it? Talk about it here.

Moderators: KY Dave, jojobear99, Rdsok

lemming
Groupie
Posts: 55
Joined: Sun Jan 09, 2005 3:51 am
Location: Malaysia

Regex for most medications (update)

Post by lemming » Fri Jan 14, 2005 6:56 am

Hi, I'm testing various regexes I modified from SpamAssassin. These cover most medication spam:

viagra
(\\/|v).{0,2}[iíïìî1l!¡:\|].{0,2}[aáäàâå4@].{0,2}g.{0,2}r.{0,2}[aáäàâå4@]

vicodin
(\\/|v).{0,2}[iíïìî1l!¡:\|].{0,2}[cç¢].{0,2}[o0].{0,2}(d|c\|).{0,2}[iíïìî1l!¡:\|]

cialis
[cç©¢].{0,2}[iíïìî1l!¡:\|].{0,2}[aáäàâ4@].{0,2}[L1l\|].{0,2}[iíïìî1l!¡:\|].{0,2}[s$§]

oxycontin/oxycodone
(o|0|\(\)).{0,2}x.{0,2}y.{0,2}[c©¢ç].{0,2}(o|0|\(\)).{0,2}[nd]

xanax
x.{0,2}[aáäàâ4@].{0,2}(n|\|\\\|).{0,2}[aáäàâ4@].{0,2}x

These regexes have a lot in common. Some notes:

.{0,2} is a useful regex which will match "gappy" subjects like v.i.a.g.r.a or even v. i` a ~g -r 'a

[aáäàâ4@] will catch variants of a

[c©¢ç] will catch variants of c

[eèéêë] will catch variants of e

[iíïìî1l!¡:\|] will catch variants of i

(n|\|\\\|) will catch variants of n, including |\|

(o|0|\(\)) will catch variants of o, including ()

(\\/|v) will catch variants of v, including \/
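
As a rough illustration of how these building blocks combine, here is a minimal Python sketch. It assumes Python's re module behaves like PopTray's regex engine for these simple patterns, which has not been verified here. The VARIANTS table and the obfuscation_regex helper are hypothetical names made up for this example; the character classes themselves are copied from the notes above.

```python
import re

# Character classes copied from the notes above. The table and the helper
# below are illustrative only; they are not part of PopTray or SpamAssassin.
VARIANTS = {
    "a": r"[aáäàâ4@]",
    "c": r"[c©¢ç]",
    "e": r"[eèéêë]",
    "i": r"[iíïìî1l!¡:\|]",
    "n": r"(n|\|\\\|)",
    "o": r"(o|0|\(\))",
    "v": r"(\\/|v)",
}
GAP = r".{0,2}"  # allows up to two filler characters between letters


def obfuscation_regex(word):
    """Join each letter's variant class with the gap pattern."""
    parts = [VARIANTS.get(ch, re.escape(ch)) for ch in word.lower()]
    return GAP.join(parts)


viagra = re.compile(obfuscation_regex("viagra"), re.IGNORECASE)

subjects = [
    "Cheap v.i.a.g.r.a here",
    "v. i` a ~g -r 'a",
    "V1@GRA sale",
    "Meeting at noon",
]
for subject in subjects:
    print(subject, "->", bool(viagra.search(subject)))
```

Letters without an entry in the table (g, r, x and so on) are matched literally, so obfuscation_regex("xanax") reproduces the xanax pattern listed above.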

Hope you find them useful. Any corrections or improvements are welcome.

Regards,
Lemming.
Last edited by lemming on Wed Apr 27, 2005 9:10 am, edited 1 time in total.

tbp
First Timer
Posts: 1
Joined: Sat Feb 12, 2005 9:48 pm
Location: Dallas, Texas area

Post by tbp » Sat Apr 16, 2005 6:50 pm

How about a regex for a numeric range? Would be useful in trapping originating IPs. For example, I do a whois on an IP address and get the following (fictitious) result:

Stupid Spammer, Inc. 123.123.234.0 - 123.123.245.255

Obviously a twelve line rule will work, but there has to be a simple, elegant regular expression.

Isn't there?

lemming
Groupie
Posts: 55
Joined: Sun Jan 09, 2005 3:51 am
Location: Malaysia

Two-row regex

Post by lemming » Sat Apr 16, 2005 8:33 pm

Sure, you could come up with a one-row regex. But in this case, I think you could live with a two-row regex rule to make things readable:


Row 1 = 123\.123\.23[4-9]\.\d

Row 2 = 123\.123\.24[0-5]\.\d

The first row detects all IPs from 123.123.234.0 to 123.123.239.255
The second row detects all IPs from 123.123.240.0 to 123.123.245.255

Remember to use the "ANY Row" option.

-Lemming
tbp wrote:How about a regex for a numeric range? Would be useful in trapping originating IPs. For example, I do a whois on an IP address and get the following (fictitious) result:

Stupid Spammer, Inc. 123.123.234.0 - 123.123.245.255

Obviously a twelve line rule will work, but there has to be a simple, elegant regular expression.

Isn't there?

Borgtex
Groupie
Posts: 52
Joined: Mon Mar 08, 2004 1:32 pm

Post by Borgtex » Sun May 15, 2005 1:38 pm

If somebody still wants to know how to do it in one row:

123\.123\.2(3[4-9]|4[0-5])\.\d
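
For anyone who wants to check the range before adding the rule, here is a small Python sketch. It uses Python's re module as a stand-in for PopTray's regex engine (an assumption) and sweeps the third octet from 0 to 255 to see which values each version catches. The variable names are made up for this example.

```python
import re

# One-row version from this post.
one_row = re.compile(r"123\.123\.2(3[4-9]|4[0-5])\.\d")

# Two-row version; "ANY Row" is modelled as: at least one row matches.
two_rows = [
    re.compile(r"123\.123\.23[4-9]\.\d"),
    re.compile(r"123\.123\.24[0-5]\.\d"),
]

# Try every possible third octet and record the ones that match.
matched_one_row = [
    octet for octet in range(256)
    if one_row.search(f"123.123.{octet}.0")
]
matched_two_rows = [
    octet for octet in range(256)
    if any(row.search(f"123.123.{octet}.0") for row in two_rows)
]

print(matched_one_row[0], matched_one_row[-1], len(matched_one_row))
print(matched_one_row == matched_two_rows)
```

Both versions should catch exactly the twelve third-octet values 234 through 245, so the choice between them is only about readability.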

Helmut
Still here
Posts: 9
Joined: Sat May 21, 2005 9:54 am
Location: Germany

Post by Helmut » Sat May 21, 2005 10:19 am

homaquebec wrote:According to my knowledges (and my choices), here what it gives :

Image

...

Do you understand?
No, I don't understand. I do not have the choice of "Body", although I had been looking for it. Can it be it was forgotten in the German version?

Using PopTray 3.1

Cheers
Ciao

Helmut

Bateman
PopTray Family
Posts: 664
Joined: Sun Nov 11, 2001 9:53 pm
Location: Germany

Post by Bateman » Sat May 21, 2005 11:45 am

Helmut wrote:I do not have the choice of "Body", although I had been looking for it. Can it be it was forgotten in the German version?
Actually, nothing was forgotten. Try it yourself by changing the language back to English and you'll see that the "Body" field will not occur either. This screenshot relates to an older version of PopTray.

Helmut
Still here
Posts: 9
Joined: Sat May 21, 2005 9:54 am
Location: Germany

Post by Helmut » Sat May 21, 2005 1:24 pm

Bateman wrote:
Helmut wrote:I do not have the choice of "Body", although I had been looking for it. Can it be it was forgotten in the German version?
Actually, nothing was forgotten. Try it yourself by changing the language back to English and you'll see that the "Body" field will not occur either. This screenshot relates to an older version of PopTray.
Thanks, I found the correct answer in another thread: if retrieving the message body is not enabled under >>Advanced Options, then no "BODY" choice will appear. That simple.
Ciao

Helmut

lemming
Groupie
Posts: 55
Joined: Sun Jan 09, 2005 3:51 am
Location: Malaysia

beware "retrieve the message body"

Post by lemming » Sat May 21, 2005 4:32 pm

Note that on my setup (Eudora, Win XP, PopTray 3.1), the "retrieve the message body" option causes all mail to be marked as read by Eudora. Not a big deal for me cos I sort by date, but it may cause some people to overlook new mail.

-Lemming
Helmut wrote: Thanks, I found the correct answer in another thread: if retrieving the message body is not enabled under >>Advanced Options, then no "BODY" choice will appear. That simple.

Helmut
Still here
Posts: 9
Joined: Sat May 21, 2005 9:54 am
Location: Germany

Re: beware "retrieve the message body"

Post by Helmut » Sat May 21, 2005 5:39 pm

lemming wrote:Note that on my setup (Eudora, Win XP, PopTray 3.1), the "retrieve the message body" option causes all mail to be marked as read by Eudora. Not a big deal for me cos I sort by date, but it may cause some people to overlook new mail.
Hmm, does this also happen if you request only the first couple lines?
Ciao

Helmut

lemming
Groupie
Posts: 55
Joined: Sun Jan 09, 2005 3:51 am
Location: Malaysia

Re: beware "retrieve the message body"

Post by lemming » Mon May 23, 2005 10:18 am

Yeah, even for a few lines. But I think it's specific to my mail server, so it probably doesn't apply to most other users.
Helmut wrote:Hmm, does this also happen if you request only the first couple lines?

lemming
Groupie
Posts: 55
Joined: Sun Jan 09, 2005 3:51 am
Location: Malaysia

reading spam percentages from K9

Post by lemming » Fri May 27, 2005 6:51 am

I'm also using K9 in conjunction with PopTray. The rules you created can easily be summarised to single lines by using regular expressions.

To match any email with spam percentage of 50.0% - 85.9%:

\[Spam\]\[([567]\d|8[0-5])

To match any email with spam percentage of 86.0% - 99.9%:

\[Spam\]\[(8[6-9]|9\d)

The square brackets [ ] are special characters in regex, which is why they are prefixed with a backslash. The \d part just means "any single digit".

Since your cut-off points fall on whole percentages, everything past the decimal point can be ignored.
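
Here is a quick Python sketch for trying the two rules on sample subjects. It assumes subjects look like "[Spam][87.3%] ...", as implied by the wildcard rules quoted in this thread, and it uses Python's re module as a stand-in for PopTray's regex engine. The patterns are written here with explicit digit ranges (8[0-5] and 8[6-9]) so that the 80s split cleanly between 85 and 86. The classify function is a made-up helper, not a PopTray feature.

```python
import re

# 50.0% - 85.9%: tens digit 5, 6 or 7 with any units digit, or 80 to 85.
mark_as_spam = re.compile(r"\[Spam\]\[([5-7]\d|8[0-5])")
# 86.0% - 99.9%: 86 to 89, or tens digit 9 with any units digit.
delete_spam = re.compile(r"\[Spam\]\[(8[6-9]|9\d)")


def classify(subject):
    """Return the action the two rules would take for a subject line."""
    if delete_spam.search(subject):
        return "delete"
    if mark_as_spam.search(subject):
        return "mark as spam"
    return "keep"


for subject in [
    "[Spam][50.0%] Hot stocks",
    "[Spam][85.9%] Cheap watches",
    "[Spam][86.0%] You have won",
    "[Spam][99.9%] Free pills",
    "[Spam][49.9%] Newsletter",
    "Lunch on Friday?",
]:
    print(subject, "->", classify(subject))
```

Only the two whole-number digits are inspected, so 85.9% and 85.0% are treated the same way.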

Also, if you want to match a more complicated range, say 68.4% - 82.8%, that is also possible with a regex. Such a range would be difficult to match using only wildcards.
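
One possible way to write the 68.4% - 82.8% range is sketched below in Python. This is only an illustration: the pattern is my own construction, the "[Spam][NN.N%]" subject format is assumed from the wildcard rules quoted in this thread, and Python's re module stands in for PopTray's regex engine. The range is split into pieces that each have simple digit limits.

```python
import re

# 68.4 - 68.9 | 69.0 - 69.9 | 70.0 - 79.9 | 80.0 - 81.9 | 82.0 - 82.8
narrow_range = re.compile(
    r"\[Spam\]\[(68\.[4-9]|69\.\d|7\d\.\d|8[01]\.\d|82\.[0-8])%"
)

# Check every percentage from 0.0 to 99.9 in steps of 0.1.
# Values are kept as integer tenths (684 means 68.4%) to avoid float issues.
matched_tenths = [
    t for t in range(1000)
    if narrow_range.search(f"[Spam][{t // 10}.{t % 10}%] test subject")
]

print(matched_tenths[0], matched_tenths[-1], len(matched_tenths))
```

If the pattern is right, the first and last matches are 684 and 828 (that is, 68.4% and 82.8%) with no holes in between.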

-Lemming 8)
KY Dave wrote: I use K9 to filter my email. It is a Bayesian filter that learns what I consider spam and marks the email subjects accordingly with [SPAM] and the percentage. K9 can also use a DNSBL. It works between PopTray/OE and my mail server.

I use 2 rules in PopTray to filter ALL the spam.

My first rule MARKS AS SPAM any email with the percentage of 50.0% - 85.9%.


MARK AS SPAM RULE

SUBJECT, WILDCARD, *[Spam][5?.?%]*
ADD LINE ->  SUBJECT, WILDCARD, *[Spam][6?.?%]*
ADD LINE ->  SUBJECT, WILDCARD, *[Spam][7?.?%]*
ADD LINE ->  SUBJECT, WILDCARD, *[Spam][80.?%]*
ADD LINE ->  SUBJECT, WILDCARD, *[Spam][81.?%]*
ADD LINE ->  SUBJECT, WILDCARD, *[Spam][82.?%]*
ADD LINE ->  SUBJECT, WILDCARD, *[Spam][83.?%]*
ADD LINE ->  SUBJECT, WILDCARD, *[Spam][84.?%]*
ADD LINE ->  SUBJECT, WILDCARD, *[Spam][85.?%]*

IGNORE DON'T NOTIFY, MARK AS SPAM, ANY LINE
My second rule DELETES any SPAM email with the percentage of 86.0% - 99.9%.


DELETE SPAM RULE

ADD LINE ->  SUBJECT, WILDCARD, *[Spam][86.?%]*
ADD LINE ->  SUBJECT, WILDCARD, *[Spam][87.?%]*
ADD LINE ->  SUBJECT, WILDCARD, *[Spam][88.?%]*
ADD LINE ->  SUBJECT, WILDCARD, *[Spam][89.?%]*
ADD LINE ->  SUBJECT, WILDCARD, *[Spam][9?.?%]*
ADD LINE ->  SUBJECT, WILDCARD, *[DNSBL]*

IGNORE DON'T NOTIFY, DELETE, ANY LINE
This example has the breaking point at 85.9% for MARK AS SPAM and above that is DELETED. Following this example, it would be easy for you to set the percentage at the point you would like to use.
