@jlokier pointed out the UTF-8 encoding requirement in the WebSocket spec.
https://datatracker.ietf.org/doc/html/rfc6455#section-5.6
Text
The "Payload data" is text data encoded as UTF-8. Note that a particular text frame might include a partial UTF-8 sequence; however, the whole message MUST contain valid UTF-8. Invalid UTF-8 in reassembled messages is handled as described in Section 8.1.
Autobahn has test cases of this kind, but surprisingly, we pass all of them.
Because we don't handle this case specifically, there is still potential for this bug to manifest itself.
The reason we pass the Autobahn tests is that we read with a buffer big enough to assemble all arrived UTF-8 frames into a complete message, which satisfies the spec condition. With a smaller buffer, the behavior will be different.
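To illustrate why buffer size matters, here is a minimal Python sketch (not our code; `split`, `validate_per_chunk`, and `validate_incrementally` are hypothetical names made up for this example). It assumes the payload arrives in fixed-size chunks and compares validating each chunk on its own with carrying partial UTF-8 sequences across chunk boundaries:

```python
import codecs


def split(data, size):
    """Cut a byte string into chunks of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def validate_per_chunk(chunks):
    """Naive check: decode every chunk independently.

    Fails on valid messages when a multi-byte sequence straddles
    a chunk boundary.
    """
    try:
        for chunk in chunks:
            chunk.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def validate_incrementally(chunks):
    """Stateful check: keep partial sequences between chunks."""
    decoder = codecs.getincrementaldecoder("utf-8")("strict")
    try:
        for chunk in chunks:
            decoder.decode(chunk)
        # final=True rejects a message that ends mid-sequence.
        decoder.decode(b"", final=True)
    except UnicodeDecodeError:
        return False
    return True


# 10 bytes: 'é' takes 2 bytes and '€' takes 3 bytes in UTF-8.
message = "héllo €".encode("utf-8")

big = split(message, 64)    # whole message fits in one chunk
small = split(message, 2)   # multi-byte sequences get cut apart

print(validate_per_chunk(big), validate_incrementally(big))
print(validate_per_chunk(small), validate_incrementally(small))
```

With the large chunk size both checks agree, which matches what we see with Autobahn today. With the small chunk size the per-chunk check rejects a valid message, while the incremental check still accepts it and still rejects invalid or truncated input.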