Skip to content

gh-49680: Don't rewrite CR/LF in imaplib APPEND literals#151557

Closed
harjothkhara wants to merge 2 commits into
python:mainfrom
harjothkhara:codex/49680-imaplib-append-literal
Closed

gh-49680: Don't rewrite CR/LF in imaplib APPEND literals#151557
harjothkhara wants to merge 2 commits into
python:mainfrom
harjothkhara:codex/49680-imaplib-append-literal

Conversation

@harjothkhara

@harjothkhara harjothkhara commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

IMAP4.append() rewrites every bare CR or LF in the message to CRLF (via MapCRLF.sub) before sending it. But an IMAP literal is a counted, opaque octet sequence ({N} + exactly N bytes), so the rewrite corrupts payloads that legitimately contain bare CR/LF — e.g. encoded headers or an already-serialized message — and desyncs the byte count from what the caller meant to send.

This patch makes append() send the message verbatim; _command() still owns the protocol framing (the {N} literal, the continuation handshake, the trailing CRLF).

Behavior change: callers that relied on the old normalization — notably EmailMessage.as_bytes(), which emits bare \n under the default policy — would now send those bytes unchanged and should ensure CRLF themselves (e.g. email.policy.SMTP). Whether treating the literal as opaque is the right fix, or another approach is preferred, is a maintainer call — see my comment below. The new tests read the exact {N} literal octets and assert byte-exact round-trips for a payload mixing \n, \r, and \r\n, and for a real EmailMessage.

Note for reviewers: the module-level MapCRLF regex is now unused. It's not in __all__ and isn't documented; I left it in place to keep the diff focused, but happy to remove it (with its own NEWS line).

Disclosure: I used AI assistance writing this; I've reviewed the change and can explain it.

@read-the-docs-community

read-the-docs-community Bot commented Jun 16, 2026

Copy link
Copy Markdown

Documentation build overview

📚 cpython-previews | 🛠️ Build #33170388 | 📁 Comparing 7e6bd8c against main (c2ca772)

  🔍 Preview build  

2 files changed
± library/imaplib.html
± whatsnew/changelog.html

harjothkhara and others added 2 commits June 16, 2026 12:34
IMAP4.append() rewrote bare CR or LF bytes in the message to CRLF before sending it. An IMAP literal is a counted, opaque octet sequence, so rewriting its bytes corrupts payloads that legitimately contain bare CR/LF and changes the byte count the caller intends to send.

Send the message verbatim. _command() still owns the protocol framing (advertising the {N} octet count, the continuation handshake, and the trailing CRLF); append() now owns only the payload bytes.

Callers that relied on the normalization -- e.g. EmailMessage.as_bytes(), which uses bare LF under the default policy -- must now supply CRLF themselves (for example via email.policy.SMTP), as required by RFC 3501.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@harjothkhara harjothkhara force-pushed the codex/49680-imaplib-append-literal branch from 4805d9f to 7e6bd8c Compare June 16, 2026 19:45
@harjothkhara

harjothkhara commented Jun 17, 2026

Copy link
Copy Markdown
Contributor Author

@sethmlarson would you mind weighing in on the behavior choice here? This PR makes append() send message bytes unchanged rather than normalizing bare CR/LF. Should we keep that behavior, or reject input that still contains bare CR/LF?

@serhiy-storchaka serhiy-storchaka left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is a good PR!

For backward compatibility, I think the new behavior should be optional, with the current normalization kept as the default -- note that EmailMessage.as_bytes() emits bare \n under the default policy.

I accidentally opened a concurrent PR, #152775, before noticing yours. We can either close mine and you add an option like its translate_line_endings, or I keep mine and add you as a co-author. In both cases I'd like to carry over your tests.

Which do you prefer?

Comment thread Doc/library/imaplib.rst
If *flags* is not already enclosed in parentheses, parentheses are
added automatically.

.. versionchanged:: 3.16

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.. versionchanged:: 3.16
.. versionchanged:: next

@harjothkhara

harjothkhara commented Jul 1, 2026

Copy link
Copy Markdown
Contributor Author

Thanks @serhiy-storchaka. Agreed on backward compatibility. Keeping normalization as the default and making verbatim opt-in is the right call, since EmailMessage.as_bytes() emits bare \n under the default policy.

Let's go with option B: keep your #152775, add me as co-author, and carry over the tests. No need to duplicate a clean PR that's ready to land. I'll close #151557 in favor of yours.

On the tests: please pull whatever is useful. Mine also read the exact {N} literal octets and check a byte-exact round-trip for a payload mixing \n, \r, and \r\n, plus one for a real EmailMessage. Happy to adapt those to sit next to your test_append_translate_line_endings if you want the extra coverage. Let me know if you want me to push anything.

@harjothkhara

Copy link
Copy Markdown
Contributor Author

Closing in favor of #152775 (option B). See the discussion above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants