The Rules & Regular Expressions Thread

General discussion about PopTray. You love it? You hate it? Talk about it here.

Moderators: KY Dave, jojobear99, Rdsok

NO CARRIER
Fanatic
Posts: 82
Joined: Sat May 10, 2003 1:52 pm
Location: Bulgaria

Post by NO CARRIER » Fri Mar 05, 2004 5:07 pm

Here are some RegEx rulesets for sendmail that I think we could also use in PopTray.

Code:

# This regular expression matches some spamware-generated Message-Id headers.
Kspammessageid regex -f -S  -aREJECT ^<(0000[0-9a-f]{8}\$$0000[0-9a-f]{4}\$$0000[0-9a-f]{4}|[0-9a-f]{12}\$$[0-9a-f]{7}[1-9a-f]\$$[0-9a-f]{8})@
# This regular expression matches some headers with just a random string.
Krandom regex -f -S  -aREJECT ^[.0-9A-Za-z]*[0-9][.0-9A-Za-z]*$$
# This regular expression matches some headers with just an all-numeric string.
Knumeric regex -f -S  -aREJECT ^[0-9]{8,10}$$
Kregistration regex -f -S  -aREJECT ^\#(01B0530810E603002D00|00F06206106618006920)$$
# These Content-Type headers are used by the Klez worm.
Kklez1 regex -f -S  -aKLEZ ^multipart/alternative; +boundary=[A-Z][0-9A-Za-z]+$$
Kklez2 regex -f -S  -aKLEZ ^multipart/alternative;  boundary="Boundary_\(ID_[+/0-9A-Za-z]{22}\)"$$
# These Content-Type headers are used by the Sobig worm.
Ksobig regex -f -S  -aSOBIG ^multipart/mixed;  boundary="(CSmtpMsgPart123X456|_NextPart)_000_[0-9A-Z]{8}"$$
# This Content-Type header is used by the Bugbear worm.
Kbugbear regex -f -S  -aBUGBEAR ^multipart/(alternative|mixed); boundary="----------[0-9A-Z]{14,15}"$$
# This Content-Type header is used by the Swen worm.
Kswen regex -f -S  -aSWEN ^multipart/(alternative|mixed); +boundary="[a-z]{5,}"$$
# This Content-Type header is used by the Beagle worm.
Kbeagle regex -f -S  -aBEAGLE ^multipart/mixed;         boundary="--------[0-9]{15}"$$
# This Content-Type header is used by the Netsky.B worm.
Knetsky regex -f -S  -aNETSKY ^multipart/mixed; boundary="[0-9]{8}"$$
# These Content-Type headers are spamware signatures.
Kboundary regex -f -S  -aREJECT ^multipart/(alternative|mixed); boundary="(=+xymMimeex22ader|[0-9A-Za-z]+---Minemindfxxyf)[0-9A-Za-z]+=+"$$
These are our mailserver's filter rules:

Code:

iframe src
name=.*\.vbs
name=.*\.scr
name=.*\.pif
name=.*\.bat
^Content.+name=.+\.vbs
^Content.+name=.+\.scr
THIS EMAIL COMPLIES WITH ALL REGULATIONS AND IS NOT SPAM
MONEY MAKING OPPORTUNITY
China Enterprise Management
<FONT>Hello,This is a  funny game<br>
<FONT>Hello,This is a  excite game<br>
^This game is my first work.<br>$
is a very dangerous virus that can infect on Win98/Me/2000
Unsolicited.*?Commercial.*?Electronic.*?Mail
Sent using e - BroadCast
 nigeria 
emailofferz\.biz
www\.mail15\.com
filename="winmail.dat"
^Content-Type: application/x-zip-compressed; name="message\.zip"
 mor.tgage q.uote 
Get.*?your.*?RISK-FREE
Life.*?Insurance
BestEmailDeals
viagra
THIS.*?IS.*?NOT.*?SPAM
university.*?diplomas
premature.*?ejaculation
 RedV 
Unsecured.*?Platinum.*?Card
lose.*?weight.*?while.*?you.*?sleep
unwanted.*?body.*?fat
"November 2003, Cumulative Patch"
"December 2003, Cumulative Patch"
 w0man 
--\>ove 
www.market.bg
 Impr0ve 
 w0men 
>Rem<!--
Content-Disposition: attachment; filename="pic.gif"
from.*?a.*?diploma.*?within.*?days
 V..gr. 
 W..ght.*?Loss 
 S.xual 
ContentCleaner
pen.s.*?enlargement
 y0ur 
Get.*?Your.*?Insurance.*?Quote
\/pics\/gv1.gif
 l0nely 
 horny 
 fr0m 
acc0unts
 c00l 
ARJ.EXE.exe
 d.*?ck l.*?ngt.*?h 
Via.gra
ERE.CTION
Drug.*?Prices
penis.*?enhancement
h.rb.l.*?p.tch
hottest.*?adult.*?site
guys.*?looking.*?for.*?hot.*?content
worth.*?ur.*?Attn\:Men
jenna.*?jameson
p.rn.*?4.*?free
p[a|4]r[i|1|l]s.*?h[i|1|l]lt[o|0]n
 V[1|i|l]c[0|o]d[1|i|l]n 
Increase P.n.s S.z.
THIS IS NOT A  S_P_A_M !
valium
xanax
\/loan\/
Increase your Manhood
Subject: =\?iso-8859-1
come.*?check.*?out.*?this.*?exclusive.*?footage
hot.*?brand.*?new.*?adult.*?site
^Content-Type: multipart/mixed; boundary="[0-9]+"
Note the spaces at the beginning and end of some of the expressions.
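One detail worth checking in the list above: inside square brackets, `|` is not an alternation operator but a literal character, so `[a|4]` matches `a`, `|`, or `4`. The sketch below uses Python's `re` module purely for illustration (PopTray's own regex engine may differ in details), and the "tightened" variant is a hypothetical alternative, not something from the original post.

```python
import re

# Pattern copied from the list above; '|' inside [...] is a literal pipe.
original = re.compile(r"p[a|4]r[i|1|l]s.*?h[i|1|l]lt[o|0]n", re.IGNORECASE)
# Hypothetical tightened variant without the pipes (not from the original post).
tightened = re.compile(r"p[a4]r[i1l]s.*?h[i1l]lt[o0]n", re.IGNORECASE)

samples = ["Paris Hilton", "p4r1s h1lt0n", "p|ris hilton"]

for text in samples:
    print(f"{text!r}: original={bool(original.search(text))}, "
          f"tightened={bool(tightened.search(text))}")
```

Whether matching the pipe is wanted is a judgment call: it does catch spellings that use `|` in place of a letter, but it is not needed to separate the alternatives.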

Borgtex
Groupie
Posts: 52
Joined: Mon Mar 08, 2004 1:32 pm

Post by Borgtex » Mon Mar 08, 2004 2:29 pm

Use at your own risk...

[bcdfghjklmnpqrstvwxyz]{5} - to filter a mail address like gftsfxytu@hotmail.com

\$.*\$ - looks for two dollar signs in a sentence, for a subject like "turn $50 into $50000"

[a-z]\s[a-z]\s[a-z]\s[a-z]\s[a-z] - for a subject with spaces between letters, e.g. "o r d e r o n l i n e"

[0-9]{4}[a-z][0-9]{2} - for an address containing a sequence of 4 digits, followed by a letter, followed by 2 digits, e.g. fht4538v34@hotmail.com
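To see why "use at your own risk" applies, the four patterns can be run against the examples from the post plus one harmless subject. This is a minimal sketch in Python's `re` module; case-insensitive matching is an assumption, and the last sample is a made-up legitimate subject.

```python
import re

# Patterns as posted; IGNORECASE is an assumption about how they are applied.
rules = {
    "five consonants": r"[bcdfghjklmnpqrstvwxyz]{5}",
    "two dollar signs": r"\$.*\$",
    "spaced letters": r"[a-z]\s[a-z]\s[a-z]\s[a-z]\s[a-z]",
    "4 digits, letter, 2 digits": r"[0-9]{4}[a-z][0-9]{2}",
}

def matching_rules(text):
    """Return the names of all rules whose pattern is found in text."""
    return [name for name, pat in rules.items()
            if re.search(pat, text, re.IGNORECASE)]

for text in [
    "gftsfxytu@hotmail.com",
    "turn $50 into $50000",
    "o r d e r o n l i n e",
    "fht4538v34@hotmail.com",
    "Check the lengths",  # hypothetical legitimate subject
]:
    print(text, "->", matching_rules(text))
```

The last sample trips the five-consonant rule because "lengths" contains "ngths", which is the kind of false positive to watch for before enabling "Delete on server".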
Last edited by Borgtex on Wed Mar 10, 2004 8:48 pm, edited 1 time in total.

homaquebec
PopTray Family
Posts: 913
Joined: Tue May 27, 2003 6:47 pm
Location: Québec (Canada)

A rule that works fine

Post by homaquebec » Tue Mar 09, 2004 12:51 am

The rule

Body --> Contains --> [Text]

Content-disposition: attachment; filename=*.pif

[x] Delete on server

succeeded. The message was deleted the next time mail was checked.

vitoco
Veteran
Posts: 422
Joined: Wed Jul 09, 2003 9:22 pm
Location: Chile

Post by vitoco » Tue Mar 09, 2004 5:53 am

vitoco wrote:Here is a useful Subject RegExp:

v\W?[1i]\W?[a4]\W?g\W?R\W?[a4]

Set Mark as Spam and it will catch any of the following:

VIAGRA
viagra
V14GR4
vi-agra
E n j o y - V i a g r a
v.i.a.g.r.a
V.i+A=g/r*a
I've extended the previous example:

Code:

(\\/|v)_*\W*_*[iíïìî1l\|]+_*\W*_*[aáäàâ4@]+_*\W*_*[gr6]+_*\W*_*[GR6]+_*\W*_*[aáäàâ4@]_*(\W.*)?$
It will catch all the previous variations plus more:

VIARGA
Vl@6R@
_V_I_A_G_R_A_
via-------gra
***V|agra***
viiargra
\/IAGRÄ

and it will NOT catch:

Vitoco Introduced A Great Regexp exAmple.
Tranvia Grande

Did you get the idea? 8)

++Vitoco

PS: Renier, you can use this in your help file.

BTW, Is there a simple way to test RegExp against strings without sending emails?
Can someone write a simple Delphi app for me with 2 text boxes (one for the regexp pattern and one for the test string), one Test button and a result message (OK/failed)? :roll: Renier? NO CARRIER? :roll:
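For checking patterns without sending emails, a few lines of script are enough. The sketch below uses Python's `re` module, which is an assumption for illustration only: it is not the regex library PopTray uses, so syntax details and case handling may differ. The helper name `try_pattern` is hypothetical.

```python
import re

def try_pattern(pattern, samples, flags=re.IGNORECASE):
    """Return {sample: bool} telling whether the pattern is found in each sample."""
    compiled = re.compile(pattern, flags)
    return {sample: bool(compiled.search(sample)) for sample in samples}

# Both patterns are copied from the post above.
simple = r"v\W?[1i]\W?[a4]\W?g\W?R\W?[a4]"
extended = (r"(\\/|v)_*\W*_*[iíïìî1l\|]+_*\W*_*[aáäàâ4@]+_*\W*_*[gr6]+"
            r"_*\W*_*[GR6]+_*\W*_*[aáäàâ4@]_*(\W.*)?$")

first_list = ["VIAGRA", "viagra", "V14GR4", "vi-agra",
              "E n j o y - V i a g r a", "v.i.a.g.r.a", "V.i+A=g/r*a"]
second_list = ["VIARGA", "Vl@6R@", "_V_I_A_G_R_A_", "via-------gra",
               "***V|agra***", "viiargra", r"\/IAGRÄ"]
not_caught = ["Vitoco Introduced A Great Regexp exAmple.", "Tranvia Grande"]

for label, pattern, samples in [
    ("simple / first list", simple, first_list),
    ("extended / second list", extended, second_list),
    ("extended / should not match", extended, not_caught),
]:
    print(label)
    for sample, found in try_pattern(pattern, samples).items():
        print(f"  {'OK    ' if found else 'failed'}  {sample}")
```

In the last group "failed" is the desired result, since those two lines are the ones the pattern is supposed to leave alone.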

Renier
Site Admin
Posts: 1957
Joined: Mon Oct 15, 2001 12:54 pm
Location: Cape Town, South-Africa

Post by Renier » Tue Mar 09, 2004 7:47 am

There is a very nice test program on the http://www.regexpstudio.com/ site. This is the RegExp library I'm using.

++vitoco

Post by ++vitoco » Tue Mar 09, 2004 3:18 pm

Renier wrote:There is a very nice test program on the http://www.regexpstudio.com/ site. This is the RegExp library I'm using.
The only example I found is in source code... Can someone compile it for me? :|

++Vitoco (unlogged)

Renier
Site Admin
Posts: 1957
Joined: Mon Oct 15, 2001 12:54 pm
Location: Cape Town, South-Africa

Post by Renier » Tue Mar 09, 2004 3:38 pm

RegExp Studio is the test program. No source code available, only the exe.

++vitoco

Post by ++vitoco » Tue Mar 09, 2004 3:47 pm

++vitoco wrote:The only example I found is in source code... Can someone compile it for me?
:oops: Don't worry, I found the executable in another zip by accident... :oops:
(the site is not very intuitive)

Using it, I found that my latest example will not catch VIAGRAA, so a plus sign must be added near the end:

(\\/|v)_*\W*_*[iíïìî1l\|]+_*\W*_*[aáäàâ4@]+_*\W*_*[gr6]+_*\W*_*[GR6]+_*\W*_*[aáäàâ4@]+_*(\W.*)?$

But if you want to ignore whatever immediately follows "viagra", remove everything in the pattern from that plus sign onward.
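The three variants discussed here can be compared side by side. This is a sketch in Python's `re` module with case-insensitive matching, both of which are assumptions; the variable names are made up for the example.

```python
import re

# The extended pattern split at the point where the plus sign is added.
head = (r"(\\/|v)_*\W*_*[iíïìî1l\|]+_*\W*_*[aáäàâ4@]+_*\W*_*[gr6]+"
        r"_*\W*_*[GR6]+_*\W*_*[aáäàâ4@]")
tail = r"_*(\W.*)?$"

without_plus = re.compile(head + tail, re.IGNORECASE)       # earlier version
with_plus = re.compile(head + "+" + tail, re.IGNORECASE)    # corrected version
prefix_only = re.compile(head, re.IGNORECASE)               # plus sign and tail removed

for text in ["VIAGRAA", "Tranvia Grande"]:
    print(text,
          "| without plus:", bool(without_plus.search(text)),
          "| with plus:", bool(with_plus.search(text)),
          "| prefix only:", bool(prefix_only.search(text)))
```

Dropping the tail makes the rule catch "VIAGRAA" too, but it also starts matching "Tranvia Grande", because nothing then requires the word to end after the final "a".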

++Vitoco (still unlogged)

Genius
First Timer
Posts: 2
Joined: Tue May 27, 2003 3:28 pm
Location: Ancona, Italy

Post by Genius » Wed Mar 10, 2004 8:20 pm

Renier wrote:For an explanation of the RegExpr syntax (and the library I use) you can have a look at this link: http://www.regexpstudio.com/TRegExpr/He ... yntax.html
Hi, here you can find a very good ITALIAN tutorial, it will be very useful for Italian people (like me :) )

http://fido.altervista.org/RegExp/regex.html

Thanks for your last version with regex filter, I hope it will help me to keep my addresses clean.
--
Ciao, Andrea

efgerman
Still here
Posts: 11
Joined: Wed Mar 10, 2004 6:54 pm

Post by efgerman » Wed Mar 10, 2004 9:15 pm

Hi!

There's a set of K9 antispam blacklist rules (several regexes) that may be useful. The author, Ed Cottrell <http://www.edcottrell.com/k9.cfm>, classifies it as "very aggressive", so you may want to tone it down. :wink:
Kind regards,
Euler German

Toff
First Timer
Posts: 1
Joined: Thu Mar 11, 2004 12:14 pm
Location: France

Post by Toff » Thu Mar 11, 2004 12:28 pm

A quite safe rule to delete all mails with *.pif, *.vbs, *.bat, *.com, *.scr attachments.

Code:

Name : Bad attachments
Needed : ALL Rows
Header -> Contains -> Content-Type: multipart/mixed;
Body -> Reg Expr -> ^Content-Disposition: attachment;[\s]*filename=".*\.(pif|vbs|bat|com|scr)"$

[x] Delete on server
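The two rows of this rule can be tried on sample text before turning on deletion. The sketch below is in Python's `re` module; multi-line and case-insensitive matching are assumptions about how the body expression is applied, and the message fragments are invented for illustration with bare `\n` line endings.

```python
import re

# Body pattern from the rule above. MULTILINE is assumed so that ^ and $
# anchor at line boundaries inside the body; IGNORECASE is likewise assumed.
BAD_ATTACHMENT = re.compile(
    r'^Content-Disposition: attachment;[\s]*filename=".*\.(pif|vbs|bat|com|scr)"$',
    re.IGNORECASE | re.MULTILINE,
)

def is_bad(header, body):
    """Both rows must match, mirroring 'Needed : ALL Rows'."""
    return ("Content-Type: multipart/mixed;" in header
            and bool(BAD_ATTACHMENT.search(body)))

# Hypothetical message fragments.
header = 'Content-Type: multipart/mixed; boundary="abc"'
infected = ('Content-Type: application/octet-stream\n'
            'Content-Disposition: attachment;\n'
            '\tfilename="document.pif"\n')
harmless = 'Content-Disposition: attachment; filename="report.pdf"\n'

print("infected:", is_bad(header, infected))
print("harmless:", is_bad(header, harmless))
```

The `[\s]*` part is what lets the rule match when the filename is folded onto the next line, as in the first fragment.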

vitoco
Veteran
Posts: 422
Joined: Wed Jul 09, 2003 9:22 pm
Location: Chile

Post by vitoco » Thu Mar 11, 2004 12:40 pm

Toff wrote:A quite safe rule to delete all mails with *.pif, *.vbs, *.bat, *.com, *.scr attachments.
This also requires enabling the "Retrieve Body while Checking" advanced option :!:

++V

Bateman
PopTray Family
Posts: 664
Joined: Sun Nov 11, 2001 9:53 pm
Location: Germany

Post by Bateman » Tue Mar 16, 2004 1:46 am

First of all, THX to vitoco for that "Viagra" RegExpr. Works great and gets even the weirdest spellings!

As proposed before, why not open a thread which only contains RegExpr for the most frequent and most annoying words/phrases in mails like Viagra, Xanax, mortgage, prescriptions and the like.

That would be easier to browse than this thread and would help RegExpr newbies (like me) quite a lot :wink:

ComputerBob
Guru
Posts: 278
Joined: Sat Jun 14, 2003 5:27 pm
Location: The Gulf Coast of the Sunshine State, USA

Post by ComputerBob » Tue Mar 16, 2004 3:18 am

Bateman wrote:As proposed before, why not open a thread which only contains RegExpr for the most frequent and most annoying words/phrases in mails like Viagra, Xanax, mortgage, prescriptions and the like.
That's what I was thinking when I started THIS thread, but you can't control which RegExpr people are going to post. The good news is that you can usually learn something, even from the ones that you don't plan to use. :wink:
ComputerBob - Making Geek-Speak Chic™
http://www.computerbob.com
One Of The Largest One-Person Sites On The Web
With Tons of Information, Software, Help, and Fun

Bateman
PopTray Family
Posts: 664
Joined: Sun Nov 11, 2001 9:53 pm
Location: Germany

Post by Bateman » Tue Mar 16, 2004 2:33 pm

Okay then, let's be more precise :)

Proposal:
What about an announcement (preferably a closed thread) where all properly working RegExpr are posted, moderated by Renier, vitoco and other RegExpr experts.

We could learn from this thread here, but would also have an easy to use and nice collection of useful expressions.

KY Dave
Not the Developer
Posts: 1599
Joined: Thu Mar 14, 2002 7:29 pm
Location: Burkesville, KY. U.S.A.

Post by KY Dave » Tue Mar 16, 2004 3:07 pm

KY Dave wrote:Posted: Fri Mar 12, 2004 8:44 am

IMO Concerning the Regular Expressions
I think an APPROVED EXPRESSION thread on the forum should be made where no one but Renier and maybe vitoco could post the regular expressions after they have been evaluated and tested. Leave the other thread going where all users can post their regular expressions and then pick the ones that work and post them to the APPROVED EXPRESSION thread.
I concur with your suggestion, but only Renier can comply.

A forum that was OPEN TO PUBLIC to READ, but LIMITED to POSTERS is what is needed. No one would need to have MODERATOR PERMISSION. It could be done with a USERGROUP being created and only that USERGROUP could post to that forum. The USERGROUP would be a CLOSED USERGROUP that Renier could add members as they became available.

I would suggest that NO CARRIER be one of the users that could post to the thread, along with the previous suggestion of Vitoco. They seem to have the knowledge to evaluate the REG EXPR.
KY Dave

Family Blog
You can STOP SPAM using PopFile and PopTray.

Borgtex
Groupie
Posts: 52
Joined: Mon Mar 08, 2004 1:32 pm

Post by Borgtex » Thu Mar 18, 2004 9:47 pm

Another rule for subjects like:

Worried _about D*l,CK*SIZ'E
Chepests Levit*ra on the Internet
Enhanc^e yo`ur...
Your one stop prescr%iptions
Why Pay for over priced P\rescription D*rugs


[a-z][%*$^`\\][a-z]
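A quick check of this rule against the listed subjects, sketched in Python's `re` module with case-insensitive matching assumed. The last two subjects are hypothetical and are not from the post.

```python
import re

# Pattern as posted; IGNORECASE is an assumption.
ODD_CHAR = re.compile(r"[a-z][%*$^`\\][a-z]", re.IGNORECASE)

spam_subjects = [
    "Worried _about D*l,CK*SIZ'E",
    "Chepests Levit*ra on the Internet",
    "Enhanc^e yo`ur...",
    "Your one stop prescr%iptions",
    r"Why Pay for over priced P\rescription D*rugs",
]
# Hypothetical legitimate subjects.
other_subjects = [
    "Meeting notes for Monday",
    r"Backup of C:\Users\bob",
]

for subject in spam_subjects + other_subjects:
    print(bool(ODD_CHAR.search(subject)), subject)
```

The Windows path in the last subject is flagged as well, since a backslash between two letters satisfies the pattern.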

Passer-by

Distributive

Post by Passer-by » Thu Mar 18, 2004 11:43 pm

Instead of discussing the rights of posting RegExp on the forum, why don't we consider including an *.ini file with the most common expressions in the distribution/installation package?
Advanced users will be able to modify them (if they find it necessary) and most users will just use the programme with predefined expressions.
Maybe later on an option like "update rules online" can be added... :wink:

Borgtex
Groupie
Posts: 52
Joined: Mon Mar 08, 2004 1:32 pm

Post by Borgtex » Thu Mar 25, 2004 9:18 pm

Updated:
Subject > Weird characters between letters (c0m*put3r): [a-z][\]\[#%*$~^`\\0-9][a-z]
Subject > Space or other punctuation symbols between letters, 5 or more times (B.u.y t_h_i_s): ([a-z](\s|\.|_|~)){5}

New:
Body > Random comments (<--iuhuhuihihiuhsdf-->): <!--[^>]*[bcdfghjklmnpqrstvwxyz]{5}[^>]*-->

Body > Strange subdomains (http://jhjkhdkjd.buyingonliners.com): http://[^\.]*[bcdfghjklmnpqrstvwxyz]{5}


As always, use with care.
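The four rules can be exercised in the same way. This sketch uses Python's `re` module with case-insensitive matching assumed; the rule labels, the `<!--qwrtpsdf-->` comment and the "My mp3s" subject are made up for the example.

```python
import re

# Patterns as posted; the dictionary keys are just labels for this sketch.
rules = {
    "weird characters": r"[a-z][\]\[#%*$~^`\\0-9][a-z]",
    "separators x5": r"([a-z](\s|\.|_|~)){5}",
    "random comment": r"<!--[^>]*[bcdfghjklmnpqrstvwxyz]{5}[^>]*-->",
    "strange subdomain": r"http://[^\.]*[bcdfghjklmnpqrstvwxyz]{5}",
}

def hits(rule, text):
    """True if the named rule's pattern is found in text (case-insensitive)."""
    return bool(re.search(rules[rule], text, re.IGNORECASE))

checks = [
    ("weird characters", "c0m*put3r"),
    ("separators x5", "B.u.y t_h_i_s"),
    ("random comment", "<!--qwrtpsdf-->"),           # hypothetical comment
    ("random comment", "<--iuhuhuihihiuhsdf-->"),    # sample quoted in the post
    ("strange subdomain", "http://jhjkhdkjd.buyingonliners.com"),
    ("weird characters", "My mp3s"),                 # hypothetical legitimate subject
]

for rule, text in checks:
    print(f"{rule:18} {hits(rule, text)!s:5} {text}")
```

Two things stand out: the comment sample quoted in the post is not matched, because it lacks the `!` and has at most four consonants in a row, and the weird-characters rule also fires on ordinary words with a digit between letters, such as "mp3s".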

NO CARRIER
Fanatic
Posts: 82
Joined: Sat May 10, 2003 1:52 pm
Location: Bulgaria

Post by NO CARRIER » Fri Mar 26, 2004 2:55 pm

KY Dave wrote:I would suggest that NO CARRIER be one of the users that could post to the thread, along with the previous suggestion of Vitoco. They seem to have the knowledge to evaluate the REG EXPR.
Sorry for the late answer...

Thanks KY Dave for the honour, but I'm not a RegEx guru like ++Vitoco. :D
