-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Can't parse multipart/related followed by multipart/alternative #70
Comments
This seems to be where the issue arose: 47a9e81#diff-4aeeafbfcb59f18a0a43c805646b96c69c0f72820b15ed2a24508621d2e0152dR684-R704 |
It is a bug. Incidentally I was going to change this behaviour for the next release, but didn't quite get to it yet. In fact, this is the last thing I wanted to do before the next release (which also fixes the other issues you reported). But I got busy with other tasks. Although I was aware the current behaviour deviates from the RFCs, I did not notice that it could fail in this way. I only thought it would accept some invalid data - not reject valid data. You have provided a nice test case :) I shall endeavour to fix this and release |
Thanks for the update and no need to apologise. I am enjoying using this library! If there's anything that I can help with in the meantime then please let me know. If it would be helpful, and you can give me a hint of what to fix, then I can try to fix it myself an submit a PR. |
Many thanks for the offer @tomjaguarpaw. And if I wasn't already halfway through the fix, I would take up your generous offer! But if you continue using this library, perhaps there is something you will contribute - besides your bug reports of course, which are themselves valuable contributions. For the sake of transparency (and reducing bus factor, I suppose), I will sketch the problem and the approach I'm taking to fix it. RIght now we parse the "type" parameter as a full content-type, possibly with parameters, instead of just type and subtype as specified by RFC 2387. The fix is to introduce a new type for "type and subtype" - no parameters - and parse just that. The |
Side note: how nice to find an RFC-compliant |
I just had to downgrade due to a related issue choking on this header: |
@singpolyma thanks for the report. I will investigate in the next couple weeks. |
Never mind, I see the problem.
and https://www.rfc-editor.org/rfc/rfc5322#section-3.6.4:
The value must contain an @singpolyma Can you please provide more context? What program is producing this message, and how widespread are these non-conformant messages? |
@singpolyma Can you please provide more context? What program is producing
this message, and how widespread are these non-conformant messages?
I would say over half the MMS our system sees have a header with this param
included and no @
|
@singpolyma thanks. I don't know anything about the implementation of MMS. Is there a normative specification somewhere? Which vendor(s) produce the non-conforming messages? Have you raised ticket(s) with them? Would it be possible to detect and "massage" the non-conforming messages into conformance before processing them with purebred-email? |
I could pre-massage but I'd basically need a header parser to do that with I guess. I think the MMS messages are forwarded mostly-unmodified from various carriers, so convincing Verizon to change how they format their SMTP headers is probably not something I can do. Using an older version of purebred is working for me for now. |
@singpolyma do you think you could send me some (real world, complete) sample messages that exhibit this non-conformance? And I'll have a think about how to approach this problem. |
mms1.txt I've found another spec violation that comes from specifically Google Voice they send like this example: This Content-Type header does not have a "type" parameter at all. It seems to me that both of these examples are syntactically valid, but violate various semantic requirements (required argument, syntax of particular argument in some contexts) and it would be nice if in these cases the message still parsed at all into parts, etc (which seems like it was more true on older versions of the library). |
I am experiencing a problem with version 0.5. This problem was not present in 0.4.3.
I have several emails in my archive that have the following header:
I the following parse error:
When parsing
multipart/related
wegetRequiredParam "type"
which succeeds and thenparsed parseContentType
which fails (https://github.com/purebred-mua/purebred-email/blob/v0.5/src/Data/MIME.hs#L928-L934).parseContentType
explicitly looks for aboundary
(https://github.com/purebred-mua/purebred-email/blob/v0.5/src/Data/MIME.hs#L593-L603) but there is none formultipart/alternative
. The boundary here belongs tomultipart/related
.I'm a bit confused at this point. I'm not familiar with the spec, but perhaps we should be looking for just something of form
type="<...>/<...>"
rather than something with another boundary? (If so then I'm surprised that I'm the first person to run into this.)The text was updated successfully, but these errors were encountered: