RegExpr Request


KY Dave
Not the Developer
Posts: 1599
Joined: Thu Mar 14, 2002 7:29 pm
Location: Burkesville, KY. U.S.A.

Post by KY Dave » Thu May 13, 2004 2:53 pm

Since I use K9 to filter for Spam, I was hoping there could be a RegExpr that I could use to sort my email on spam percentage.

K9 marks each message's subject as spam and also adds a percentage to the subject, like this:
Subject: Add Permanent..enlargement to Manhood. [Spam][88.7%]

Is there a way using PopTray's RegExpr to filter only percentages above a certain amount?

I thought I could create separate lines in a rule using WILDCARD with something like [6?.?%] or [7?.?%], but that didn't work. I don't know whether it can't handle multiple wildcard characters or whether something else is wrong.

I thought there might be an easier way.
KY Dave

Family Blog
You can STOP SPAM using PopFile and PopTray.

Rdsok
PopTray Family
Posts: 1463
Joined: Fri Mar 19, 2004 11:36 pm
Location: Norman, Oklahoma USA

Post by Rdsok » Thu May 13, 2004 5:31 pm

Well, if you mean filter as in sort, you can't do that. If you mean selecting as spam only the ones over a certain %, you can. I'd like to mention that 'WILDCARD' won't do this; you'll have to use 'RegExpr' instead, which can use different types of wildcards. I'll try to explain an escape character and a shortcut that I'm going to use, then mention why you can't use the '.' part the way you've typed it.

Since regular expressions use several symbols for the things they are trying to define, they have a method to 'escape' a symbol so you can use it just as typed. Basically it just tells the regular expression that what you type next is really the symbol you want to use or look for.

The escape code they use is the '\' char. As an example, if you wanted to look for the '\' in a regexpr, it would be written with two, like this: '\\'.

The next part is one of the reasons I mentioned the last part. I'm just going to quote vitoco partially; if you want more info, see his post at Guide to Write Regular Expression Rules in PopTray. Anyway, here is the short quote:
- "." (dot) is a replacement for "?" wildcard.

- ".*" (dot-asterisk) is a replacement for "*" wildcard.
So if you had tried to put just a '.' in there, you would get a wildcard instead. What you want is '\.' so that it looks for a literal '.'.
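To see the difference between '.' and '\.' in practice, here is a small sketch. It uses Python's re module as a stand-in, which is an assumption: PopTray's own RegExpr engine is not Python and may differ in details, and the "88x7%" test string is made up for illustration.

```python
import re

# Subject line taken from the first post in this thread.
subject = "Add Permanent..enlargement to Manhood. [Spam][88.7%]"

# Unescaped dot: acts as a one-character wildcard.
loose = re.compile(r"88.7%")
# Escaped dot: matches only a literal '.'.
strict = re.compile(r"88\.7%")

print(loose.search(subject) is not None)    # both patterns find 88.7%
print(strict.search(subject) is not None)

# A made-up string with 'x' where the dot should be:
print(loose.search("88x7%") is not None)    # the wildcard dot accepts it
print(strict.search("88x7%") is not None)   # the escaped dot rejects it
```

The unescaped version is not wrong for the real subject, it is just looser than intended.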

Clear as mud.. good I'll go on. :D

RegExprs also have a few shortcuts that can help you define your expression. For instance, '\d' matches any single digit from 0 to 9.

One last part: if you want to look for more than one item, you enclose the items in '[]' brackets (you can also use '()' if only looking for one instance of the items listed; see vitoco's thread for more info) and separate each item with a '|' pipe sign. So to look for 'this' or 'that' it would be something like...

[this|that] or say for 3 items [this|that|the other thing].

To put this together for something you may be wanting, let's use the numbers you were using, and don't forget that certain symbols need to be 'escaped'. To find something in the 80+% range like [Spam][88.7%], the symbols '[', ']' and '.' need to be escaped. Here is what you'll need to find that...

SUBJECT --> RegExpr --> \[Spam\]\[8\d\.\d%\]

here is a modified version to find 70+,80+ or 90+% items...

SUBJECT --> RegExpr --> \[Spam\]\[[7|8|9]\d\.\d%\]

All that was added to the second was a '[' , '7', '|', and again '|','9' and finally a ']' to close the group

Here is another way using a range of numbers...

SUBJECT --> RegExpr --> \[Spam\]\[[7-9]\d\.\d%\]

The '-' defines it as a range, so anything between 7 and 9 is looked for.
In the same way you can use a-z for any letter from a to z.

To say it a different way, with the last expression you should be able to get a rule to fire with a % anywhere between 70% up to 99.9%.
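As a quick check of the range version, the sketch below runs that last expression against a few subjects. Python's re module is used as a stand-in for PopTray's engine (an assumption), and the sample subjects are made up.

```python
import re

# The range rule from this post, tried with Python's re module.
rule = re.compile(r"\[Spam\]\[[7-9]\d\.\d%\]")

# Made-up sample subjects, for illustration only.
samples = [
    "Cheap meds [Spam][88.7%]",
    "Cheap meds [Spam][70.0%]",
    "Cheap meds [Spam][69.9%]",
    "Cheap meds [Spam][100.0%]",
]

# Map each subject to whether the rule fires on it.
results = {s: rule.search(s) is not None for s in samples}
for s, hit in results.items():
    print(f"{s!r}: {hit}")
```

In this sketch the rule fires on 88.7% and 70.0% but not on 69.9% or 100.0%, since the first digit after the bracket has to be 7, 8 or 9.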

Summary of this info:
To use regular expressions use 'RegExpr' instead of 'WILDCARD'
Escape out symbols that regexpr use for variables and commands with '\'
"." (dot) is a replacement for "?" wildcard
You can use groups to refine your expression by using '[]' or '()' and separate each item with a '|'
A '|' means either 'or' or 'and/or' depending on how you use it.
A '-' defines a range like 1-9 is numbers one through nine, and a-z letters 'a' through 'z'

Regular expressions are easier to write than they are to read. It's not that they are hard to understand by any means. They just look odd, so they appear harder than they are. Certainly if you were able to follow this, I'd encourage you to use regexpr everywhere you can. Their flexibility can really give you power over defining your rules.

KY Dave

Post by KY Dave » Thu May 13, 2004 6:25 pm

Rdsok wrote: here is a modified version to find 70+,80+ or 90+% items...

SUBJECT --> RegExpr --> \[Spam\]\[[7|8|9]\d\.\d%\]

All that was added to the second was a '[' , '7', '|', and again '|','9' and finally a ']' to close the group

Here is another way using a range of numbers...

SUBJECT --> RegExpr --> \[Spam\]\[[7-9]\d\.\d%\]
Thanks, this is working for me. I changed the numbers and have one rule mark and the other delete the spam.

Thanks again :!:
KY Dave


++vitoco

Post by ++vitoco » Fri May 14, 2004 4:07 pm

Rdsok wrote: You can use groups to refine your expression by using '[]' or '()' and separate each item with a '|'
Note that '(aaa|bbb)' is used to select between words present in a string, while '[xyz]' is used to pick a single char from a set and doesn't need a '|' between the chars. Inside the brackets you can also describe ranges using '-'.

So your RegExpr using '[7|8|9]' will search for a 7, 8, 9 or pipe!!!
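The point about the pipe inside brackets can be checked with a short sketch. Python's re module stands in for PopTray's engine here (an assumption), and the helper name matches is made up for this example.

```python
import re

bracket_with_pipes = re.compile(r"[7|8|9]")  # character set: 7, 8, 9 or '|'
bracket_plain = re.compile(r"[789]")         # character set: 7, 8 or 9
group_with_pipes = re.compile(r"(7|8|9)")    # alternation: 7 or 8 or 9

def matches(pattern, text):
    """Return True when the whole text matches the pattern."""
    return pattern.fullmatch(text) is not None

for name, pattern in [
    ("[7|8|9]", bracket_with_pipes),
    ("[789]", bracket_plain),
    ("(7|8|9)", group_with_pipes),
]:
    print(name, "matches '8':", matches(pattern, "8"),
          "| matches '|':", matches(pattern, "|"))
```

All three accept a digit 8; only the bracket form written with pipes also accepts a literal '|'.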
Rdsok wrote:A '|' means either 'or' or 'and/or' depending on how you use it.
Huh! :?

I don't use K9, so I don't know if it is possible to have a 100% or a percent without a decimal like 88%. If so, the rule should be:

Subject | RegExpr | \[Spam\]\[(100|[7-9]\d)(\.\d)?%\]
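A rough way to try this rule outside PopTray is sketched below with Python's re module, which is an assumption and may not behave exactly like PopTray's engine. The subject strings are made up, and whether K9 ever produces 100% or a percentage without a decimal is still unknown.

```python
import re

# The rule above: "100" or a two-digit number from 70 to 99,
# followed by an optional ".d" decimal part.
rule = re.compile(r"\[Spam\]\[(100|[7-9]\d)(\.\d)?%\]")

def fires(subject):
    """Return True when the rule matches somewhere in the subject."""
    return rule.search(subject) is not None

# Made-up subjects covering the cases discussed in this post.
for subject in ["[Spam][88.7%]", "[Spam][88%]", "[Spam][100%]",
                "[Spam][100.0%]", "[Spam][69.9%]", "[Spam][8.5%]"]:
    print(subject, fires(subject))
```

The '?' after '(\.\d)' is what makes the decimal part optional, so both 88% and 88.7% are accepted.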

++Vitoco

KY Dave

Post by KY Dave » Fri May 14, 2004 5:12 pm

Thanks for all the responses.

I think the last thing users and I will need to know is how to break the keep/delete point in the middle.

Example
Marking As Spam everything from 00.0% to 84.5%.
Marking everything for Deletion from 85.6% to 100%.

I think if shown that, we can change the figures to fit our personal preferences.

Thanks for the answers,
KY Dave

Rdsok
PopTray Family
Posts: 1463
Joined: Fri Mar 19, 2004 11:36 pm
Location: Norman, Oklahoma USA
Contact:

Post by Rdsok » Fri May 14, 2004 5:40 pm

Thanks, Vitoco, for correcting me on the '|' pipe. I reread some of the regexpr stuff, and the 'and/or' part I had mentioned was a misunderstanding I got from info on repeat counts: (this|that) matches 'this' or 'that', and I read (this|that){2} as matching only when there is 'this' and 'that'.
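For what a repeat count like {2} does in common regex engines, here is a small sketch using Python's re module (an assumption: PopTray's engine should behave similarly here, but that was not checked). The count repeats the whole group, so any two alternatives in a row match, not specifically one of each.

```python
import re

# {2} repeats the group twice; each repetition may pick either alternative.
repeat = re.compile(r"(this|that){2}")

def whole_match(text):
    """Return True when the entire text matches the pattern."""
    return repeat.fullmatch(text) is not None

for text in ["thisthat", "thatthis", "thisthis", "this", "this and that"]:
    print(repr(text), whole_match(text))
```

Note that 'thisthis' matches, while 'this and that' does not, because the two repetitions must be directly adjacent.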

I thought, though, that a pipe always had to be escaped (I think you call it delimited) with a '\' to be recognized. I guess I stand corrected there also; you've had much more experience with this.

Also, thanks for pointing out that [1234] will match any of those digits. I've been working with whole words so much that that one had slipped away. I've only been using regular expressions much since about March, so I guess I'm not doing badly, but I'm not an expert yet.

KY Dave, obviously, I'd suggest you use Vitoco's rule instead of mine. His will also catch 100%, where mine will not, since it only looks for up to 99.9%. I don't think K9 will mark a message as 100%, but better safe than sorry.

Rdsok

CharlesB

Post by CharlesB » Fri May 21, 2004 4:17 pm

KY Dave wrote: I think the last thing users and I will need to know, is how to break the keep/delete point in the middle.

Example
Marking As Spam everything from 00.0% to 84.5%.
Marking everything for Deletion from 85.6% to 100%.

I think if shown that, we can change the figures to fit our personal preferences.
KY, after some experimentation, I have found that a regexp for marking 85.0% to 89.9% that works for me is

[spam\]\[8][5-9]\.\d%\]

Coupled with a modification of the rule from Rdsok, the second line of my expression is

[spam\]\[9]\d\.\d%\]

These two lines capture everything from 85.0% to 99.9% (which I now automatically delete).

For marking from 60 to 84.9, the following two line rule works for me:

[spam\]\[6|7]\d\.\d%]
[spam\]\[8][0-4]\.\d%\]

Please note that I do NOT use a leading '\' escape before the first '['. For whatever reason, I could not make that work in PT, but it did work without one.
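One possible reading of why the version without the leading '\' still fires: without the escape, the first '[' opens a character set, so '[spam\]\[8]' means "one character out of s, p, a, m, ], [ or 8" instead of the literal text. The sketch below shows this with Python's re module, which is an assumption; PopTray's engine may parse these expressions differently. The sample subjects are made up.

```python
import re

# First rule from this post, copied as written (no leading backslash).
charles = re.compile(r"[spam\]\[8][5-9]\.\d%\]")

def matched_text(subject):
    """Return the part of the subject the rule matched, or None."""
    m = charles.search(subject)
    return m.group(0) if m else None

# The set supplies the '8', so 85.0% to 89.9% still matches.
print(matched_text("[Spam][88.7%]"))
# The [Spam] tag itself is not required by this pattern.
print(matched_text("no tag here 88.7%]"))
# Other tens digits are not matched by this line.
print(matched_text("[Spam][78.7%]"))
# A single-digit percentage matches too, because '[' is in the set.
print(matched_text("[Spam][8.5%]"))
```

If K9 always writes two digits before the decimal point, the last case never comes up; if it does not, a fully escaped rule would avoid it.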

You have probably already done this yourself by now, but if not, hope this helps.

KY Dave

Post by KY Dave » Fri May 21, 2004 5:33 pm

Thanks CharlesB :!:

I had taken the time to post the technique for using WILDCARDS, but I didn't have a clue as to writing a REGEXP for the rules.

Your example works for me, and thanks again.
KY Dave


Strepy
First Timer
Posts: 1
Joined: Fri Jun 04, 2004 2:23 am
Location: Italy

Post by Strepy » Fri Jun 04, 2004 2:30 am

This seems to work fine for me:

Between 60 and 84.9%:
Reg1: Subject | RegExpr | \[Spam\]\[([6-7]\d|8[0-4])(\.\d)%\]
Mark as Spam
...

Between 85.0% and 100.0%:
Reg2: Subject | RegExpr | \[Spam\]\[(8[5-9]|9\d|100)(\.\d)%\]
Delete from Server
...
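To check that the two rules above split the range cleanly, here is a sketch that runs every value from 0.0% to 100.0% through both. It uses Python's re module as a stand-in for PopTray's engine (an assumption), and the function name action and the generated subjects are made up for this test.

```python
import re

# The two rules from this post.
mark_rule = re.compile(r"\[Spam\]\[([6-7]\d|8[0-4])(\.\d)%\]")
delete_rule = re.compile(r"\[Spam\]\[(8[5-9]|9\d|100)(\.\d)%\]")

def action(subject):
    """Return which rule would fire for a subject: delete, mark or none."""
    if delete_rule.search(subject):
        return "delete"
    if mark_rule.search(subject):
        return "mark"
    return "none"

def subject_for(tenths):
    """Build a made-up subject for a percentage given in tenths (887 -> 88.7%)."""
    return f"Test [Spam][{tenths // 10}.{tenths % 10}%]"

# Collect the action for every value from 0.0% to 100.0%.
actions = {tenths: action(subject_for(tenths)) for tenths in range(0, 1001)}
print(actions[599], actions[600], actions[849], actions[850], actions[1000])
```

Both rules require the decimal part, so in this sketch a subject such as [Spam][88%] would trigger neither; making '(\.\d)' optional with a trailing '?' is one way to cover that case.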
