feat: support more delimiter for SSE events in GCP gemini #1536
base: main
Conversation
…tex AI Signed-off-by: Sukumar Gaonkar <[email protected]>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1536 +/- ##
==========================================
+ Coverage 84.14% 84.16% +0.02%
==========================================
Files 149 149
Lines 12971 12992 +21
==========================================
+ Hits 10914 10935 +21
Misses 1436 1436
Partials 621 621
☔ View full report in Codecov by Sentry.
}
lines := bytes.Split(bodyBytes, []byte("\n\n"))
bodyReader := io.MultiReader(bytes.NewReader(o.bufferedBody), body)
scanner := bufio.NewScanner(bodyReader)
When implementing the MCP SSE parser, I ran into issues with an initial implementation that used the Scanner: for some servers, long lines caused parsing to fail with bufio.ErrTooLong (bufio.Scanner: token too long). I only caught this when testing against real MCP servers.
I don't know whether that can happen here as well, but it's a data point in favour of not using the scanner if long lines can be expected.
Good point.
As of Go 1.25.3, the default MaxScanTokenSize appears to be 64 KiB.
Since each streaming chunk represents a couple of generated tokens, the size of each chunk should not exceed this limit, though I couldn't find docs that guarantee a chunk will be smaller than any specific size.
What comes from the LLM is inherently unpredictable, and I don't know if we can make that statement (for example, some tests use chunks that contain tool calls, and the length of the arguments could be unpredictable).
I agree it is unlikely to happen, but my main concern is that we are introducing a limitation we didn't have before, and I don't want it to come back to us in the form of a bug.
Two approaches I can think of:
- make MaxScanTokenSize configurable
- follow the MCP SSE parser logic to explicitly parse the chunks

I personally prefer option 1, as relying on the scanner to do the parsing is simpler and easier to maintain.
What do you think @nacx ?
What about a third option: in the previous implementation, we load everything into memory and do a bytes.Split. Why not just keep that, but use the right delimiter to split on? I mean, test for the presence of the delimiters to figure out the "split characters", then keep the original code, just splitting on those.
This way we don't change the behaviour, we keep it backwards-compatible, and we don't introduce a new config option whose value users won't know how to set confidently. Also, if we ever find loading everything into memory to be an issue (it hasn't been until now), we can address the stream-reader approach then, but it's probably not needed for this PR.
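The suggestion above could look roughly like this; the variable and helper names are hypothetical, and this assumes a response body uses one delimiter consistently:

```go
package main

import (
	"bytes"
	"fmt"
)

// sseDelimiters lists the three event delimiters allowed by the SSE spec:
// a pair of CRLF, LF, or CR. CRLF pairs are tested first.
var sseDelimiters = [][]byte{
	[]byte("\r\n\r\n"),
	[]byte("\n\n"),
	[]byte("\r\r"),
}

// splitEvents picks the delimiter actually present in the body and keeps
// the original bytes.Split approach. Hypothetical helper name.
func splitEvents(body []byte) [][]byte {
	for _, d := range sseDelimiters {
		if bytes.Contains(body, d) {
			return bytes.Split(body, d)
		}
	}
	// No delimiter found: treat the whole body as a single event.
	return [][]byte{body}
}

func main() {
	events := splitEvents([]byte("data: one\r\n\r\ndata: two\r\n\r\n"))
	for _, e := range events {
		fmt.Printf("%q\n", e)
	}
}
```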
Signed-off-by: Sukumar Gaonkar <[email protected]>
# Conflicts:
#	internal/translator/openai_gcpvertexai.go
unsure about test failures
/retest
Signed-off-by: Sukumar Gaonkar <[email protected]>
Description
The SSE event spec allows 3 different event delimiters (a pair of CRLF, LF, or CR). This PR updates the stream-processing translator for GCP VertexAI to support all 3 delimiters.
SSE docs about delimiters
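As a sketch, a splitter that handles all three delimiter pairs in one pass could use a regular expression with the longest alternative first, so CRLF pairs are not mis-split as bare CRs. This is illustrative; the actual translator may implement the split differently:

```go
package main

import (
	"fmt"
	"regexp"
)

// eventDelimiter matches any of the three SSE event terminators:
// a pair of CRLF, LF, or CR. Longest alternative listed first.
var eventDelimiter = regexp.MustCompile("\r\n\r\n|\n\n|\r\r")

func main() {
	body := "data: a\n\ndata: b\r\n\r\ndata: c\r\r"
	for _, ev := range eventDelimiter.Split(body, -1) {
		if ev != "" { // skip the empty trailing segment
			fmt.Printf("%q\n", ev)
		}
	}
}
```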