-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Spam filter to work with flags from Rspamd and bogofilter #300
base: master
Are you sure you want to change the base?
Conversation
Codecov Report
@@ Coverage Diff @@
## master #300 +/- ##
==========================================
- Coverage 47.08% 46.86% -0.22%
==========================================
Files 30 30
Lines 1079 1084 +5
==========================================
Hits 508 508
- Misses 571 576 +5
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, perfect timing (i submitted #299)! Thank you;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also needs updates to the documentation, describing the behaviour, and a test (especially considering this changes HeaderMatchingFilter
, which is used by users directly, and in ListMailsFilter
too.
self.header = [self.header] | ||
|
||
try: | ||
value = next(filter(None, map(message.get_header, self.header))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty hard to read. I assume it'll call message.get_header() with every header
configured in the filter, and filter out results that are an empty string (which can be the case if one of the headers doesn't exist)?
What's the next() in front of everything doing here? Why is value
still a single element? What happens if the filter returns multiple elements?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you read it right.
value
is a single element, the first matching header. This is because the HeaderMatchingFilter
only matches a string, it only has a pattern to match a string. Up until now the filter only matches to one header. For spam, as there are many spam flagging tools, you get different headers depending on the tool, thus you need to be able to search for many headers.
I assume, sysadmins don't run their many spam filters at one, they just pick a tool. Thus I pick the first not empty match(that is what next
does), which might be the only match. Later I just let the the same logic of the filter run its way and match the pattern to the string. There I use the benefits of the regex. Spamassasing and Rspamd Mark with YES, their spam, bogofilter uses Spam.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While it makes sense in the scope of SpamFilter
, this changes semantics in HeaderMatchingFilter
in a non-intuitive and undocumented way.
Can we expose this in HeaderMatchingFilter
in a less confusing way, and have it documented?
I started using bogofilter as it is low on resources, and enough for my email load. I realized I was missing other header matching options for spam.
This also Closes #299, for Rspamd
P.S. Happy if you also give this PR a hacktoberfest-accepted label