gh-49680: Don't rewrite CR/LF in imaplib APPEND literals#151557
gh-49680: Don't rewrite CR/LF in imaplib APPEND literals#151557harjothkhara wants to merge 2 commits into
Conversation
Documentation build overview
|
IMAP4.append() rewrote bare CR or LF bytes in the message to CRLF before sending it. An IMAP literal is a counted, opaque octet sequence, so rewriting its bytes corrupts payloads that legitimately contain bare CR/LF and changes the byte count the caller intends to send.
Send the message verbatim. _command() still owns the protocol framing (advertising the {N} octet count, the continuation handshake, and the trailing CRLF); append() now owns only the payload bytes.
Callers that relied on the normalization -- e.g. EmailMessage.as_bytes(), which uses bare LF under the default policy -- must now supply CRLF themselves (for example via email.policy.SMTP), as required by RFC 3501.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
4805d9f to
7e6bd8c
Compare
|
@sethmlarson would you mind weighing in on the behavior choice here? This PR makes |
serhiy-storchaka
left a comment
There was a problem hiding this comment.
Thanks, this is a good PR!
For backward compatibility, I think the new behavior should be optional, with the current normalization kept as the default -- note that EmailMessage.as_bytes() emits bare \n under the default policy.
I accidentally opened a concurrent PR, #152775, before noticing yours. We can either close mine and you add an option like its translate_line_endings, or I keep mine and add you as a co-author. In both cases I'd like to carry over your tests.
Which do you prefer?
| If *flags* is not already enclosed in parentheses, parentheses are | ||
| added automatically. | ||
|
|
||
| .. versionchanged:: 3.16 |
There was a problem hiding this comment.
| .. versionchanged:: 3.16 | |
| .. versionchanged:: next |
|
Thanks @serhiy-storchaka. Agreed on backward compatibility. Keeping normalization as the default and making verbatim opt-in is the right call, since Let's go with option B: keep your #152775, add me as co-author, and carry over the tests. No need to duplicate a clean PR that's ready to land. I'll close #151557 in favor of yours. On the tests: please pull whatever is useful. Mine also read the exact |
|
Closing in favor of #152775 (option B). See the discussion above. |
IMAP4.append()rewrites every bare CR or LF in the message to CRLF (viaMapCRLF.sub) before sending it. But an IMAP literal is a counted, opaque octet sequence ({N}+ exactly N bytes), so the rewrite corrupts payloads that legitimately contain bare CR/LF — e.g. encoded headers or an already-serialized message — and desyncs the byte count from what the caller meant to send.This patch makes
append()send the message verbatim;_command()still owns the protocol framing (the{N}literal, the continuation handshake, the trailing CRLF).Behavior change: callers that relied on the old normalization — notably
EmailMessage.as_bytes(), which emits bare\nunder the default policy — would now send those bytes unchanged and should ensure CRLF themselves (e.g.email.policy.SMTP). Whether treating the literal as opaque is the right fix, or another approach is preferred, is a maintainer call — see my comment below. The new tests read the exact{N}literal octets and assert byte-exact round-trips for a payload mixing\n,\r, and\r\n, and for a realEmailMessage.Note for reviewers: the module-level
MapCRLFregex is now unused. It's not in__all__and isn't documented; I left it in place to keep the diff focused, but happy to remove it (with its own NEWS line).Disclosure: I used AI assistance writing this; I've reviewed the change and can explain it.