Skip to content

Conversation

@doubleclip118
Copy link

@doubleclip118 doubleclip118 commented Dec 18, 2025

This PR fixes the 'Invalid tools[0].function.name: empty string' error that occurs with non-ASCII document store titles. Instead of allowing Unicode characters (which would cause LLM API errors), it sanitizes tool names to ASCII-only and generates unique fallback names for non-ASCII titles.

- Add sanitizeToolName helper function that preserves Unicode letters (Korean, Chinese, Japanese, etc.)
- Use Unicode-aware regex pattern [^\p{L}\p{N}_-] instead of [^a-z0-9_-]
- Add fallback name generation when result is empty
- Fixes 'Invalid tools[0].function.name: empty string' error for non-ASCII titles
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @doubleclip118, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves an issue where tool names containing non-ASCII characters would lead to validation errors. It introduces a robust sanitization utility that supports Unicode characters and ensures a valid tool name is always generated, significantly enhancing the internationalization capabilities and stability of tool name handling within the system.

Highlights

  • Unicode Support for Tool Names: Introduced a new sanitizeToolName helper function that uses a Unicode-aware regular expression ([^\p{L}\p{N}_-]) to allow letters from various languages (e.g., Korean, Chinese, Japanese) in document store tool names.
  • Fallback Name Generation: The sanitizeToolName function now includes a fallback mechanism to generate a unique name (e.g., tool_timestamp) if the sanitization process results in an empty string, preventing 'Invalid tools[0].function.name: empty string' errors.
  • Refactored Tool Name Sanitization: Existing inline logic for sanitizing tool names for document stores and knowledge bases has been replaced with calls to the new sanitizeToolName helper, centralizing and improving the consistency of tool name generation.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a sanitizeToolName helper function to allow Unicode characters in tool names, addressing an issue where non-ASCII titles caused errors. The new function correctly uses a Unicode-aware regex and provides a fallback for empty names. The implementation is a significant improvement, centralizing the sanitization logic. My review includes a high-severity suggestion to prevent potential tool name collisions in the fallback mechanism and a medium-severity suggestion to simplify a conditional check for better code clarity.


// If the result is empty, generate a fallback name
if (!sanitized || sanitized.trim() === '') {
return `tool_${Date.now()}`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The fallback name generation using Date.now() is not guaranteed to be unique. If this function is called multiple times within the same millisecond for different tools whose names both sanitize to an empty string, it will produce duplicate tool names. This can lead to unpredictable behavior, as the agent might use the wrong tool. To ensure uniqueness, consider appending a random component to the fallback name.

Suggested change
return `tool_${Date.now()}`
return `tool_${Date.now()}_${Math.random().toString(36).slice(2, 7)}`

.replace(/[^\p{L}\p{N}_-]/gu, '') // Allow Unicode letters and numbers

// If the result is empty, generate a fallback name
if (!sanitized || sanitized.trim() === '') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The condition !sanitized || sanitized.trim() === '' can be simplified. Given the preceding replacement operations (.replace(/ /g, '_') and .replace(/[^\\p{L}\\p{N}_-]/gu, '')), the sanitized string will not contain any leading or trailing whitespace. If the string becomes empty, !sanitized will be true. Therefore, the sanitized.trim() === '' check is redundant and can be removed for clarity.

Suggested change
if (!sanitized || sanitized.trim() === '') {
if (!sanitized) {

- Simplify empty check by removing unnecessary trim() call
- Add random suffix to fallback name for uniqueness guarantee
- Prevents potential duplicate tool names in same millisecond
@doubleclip118
Copy link
Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a sanitizeToolName helper function to handle Unicode characters in tool names, preventing errors when non-ASCII titles are used. The new function replaces duplicated sanitization logic throughout the Agent.ts file.

My review focuses on the implementation of sanitizeToolName. While the intent to support Unicode is good, the current implementation may cause issues with LLM APIs that have strict naming conventions for tools. I've provided suggestions to make the sanitization more robust and compatible with major models like OpenAI, Anthropic, and Gemini. I've also included recommendations for enforcing a character limit and improving the generation of unique fallback names.

Comment on lines 73 to 85
const sanitizeToolName = (name: string): string => {
const sanitized = name
.toLowerCase()
.replace(/ /g, '_')
.replace(/[^\p{L}\p{N}_-]/gu, '') // Allow Unicode letters and numbers

// If the result is empty, generate a fallback name with random suffix for uniqueness
if (!sanitized) {
return `tool_${Date.now()}_${Math.random().toString(36).slice(2, 7)}`
}

return sanitized
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

While allowing Unicode characters in tool names is a good intention for internationalization, it might cause issues with the underlying LLM APIs. Most major models have strict restrictions on tool names:

  • OpenAI & Anthropic: ^[a-zA-Z0-9_-]{1,64}$
  • Google Gemini: [a-zA-Z0-9_]+ (max 63 chars, no hyphens)

The current regex /[^\p{L}\p{N}_-]/gu is too permissive and will allow characters that these models reject, potentially causing runtime errors.

A safer approach is to use a more restrictive regex that only allows characters supported by most models. The fallback mechanism you've added will correctly handle names that become empty after sanitization (like names composed entirely of non-ASCII characters).

Additionally, it's good practice to enforce a maximum length for tool names (e.g., 64 characters). I've included this in the suggestion below.

Suggested change
const sanitizeToolName = (name: string): string => {
const sanitized = name
.toLowerCase()
.replace(/ /g, '_')
.replace(/[^\p{L}\p{N}_-]/gu, '') // Allow Unicode letters and numbers
// If the result is empty, generate a fallback name with random suffix for uniqueness
if (!sanitized) {
return `tool_${Date.now()}_${Math.random().toString(36).slice(2, 7)}`
}
return sanitized
}
const sanitizeToolName = (name: string): string => {
const sanitized = name
.toLowerCase()
.replace(/ /g, '_')
.replace(/[^a-z0-9_-]/g, ''); // Restrict to characters allowed by major LLMs
// If the result is empty, generate a fallback name with random suffix for uniqueness
if (!sanitized) {
return `tool_${Date.now()}_${Math.random().toString(36).slice(2, 7)}`;
}
// Enforce a 64-character limit, which is common for tool names
return sanitized.slice(0, 64);
}


// If the result is empty, generate a fallback name with random suffix for uniqueness
if (!sanitized) {
return `tool_${Date.now()}_${Math.random().toString(36).slice(2, 7)}`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For generating unique identifiers, crypto.randomBytes is generally preferred over Math.random() as it provides cryptographically stronger random values, reducing the chance of collisions, however unlikely. Since this file uses CommonJS (module.exports), you can use require('crypto') inline.

Suggested change
return `tool_${Date.now()}_${Math.random().toString(36).slice(2, 7)}`
return `tool_${Date.now()}_${require('crypto').randomBytes(3).toString('hex')}`;

- Add crypto import at the top of the file
- Replace inline require('crypto') with imported crypto module
- Restrict tool names to LLM-compatible characters [a-z0-9_-]
- Add 64 character limit for tool names
- Use crypto.randomBytes for stronger random fallback names
@doubleclip118
Copy link
Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a sanitizeToolName helper function to allow Unicode characters in document store tool names, fixing an issue where non-ASCII titles would result in an invalid empty tool name. The new function also adds a fallback name generation mechanism and enforces a 64-character limit. The repeated sanitization logic has been correctly refactored to use this new centralized function.

My main feedback is that the regular expression in sanitizeToolName does not actually allow Unicode characters as intended by this pull request. I've provided a suggestion to correct the regex to align with the PR's objective.

const sanitized = name
.toLowerCase()
.replace(/ /g, '_')
.replace(/[^a-z0-9_-]/g, '') // Restrict to characters allowed by major LLMs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The current regex /[^a-z0-9_-]/g does not align with the PR's goal to support Unicode characters in tool names, as it strips out any non-ASCII characters. This contradicts the PR description which states: 'Use Unicode-aware regex pattern [^\p{L}\p{N}_-] instead of [^a-z0-9_-]'.

To fix this and allow Unicode characters from languages like Korean, Chinese, and Japanese, the regex should be updated. The u flag is necessary for Unicode property escapes to work correctly.

I've also updated the comment to reflect the change, as the previous comment was contradictory to allowing Unicode characters.

Suggested change
.replace(/[^a-z0-9_-]/g, '') // Restrict to characters allowed by major LLMs
.replace(/[^\p{L}\p{N}_-]/gu, '') // Allow Unicode letters/numbers, underscore, and hyphen

- Explicitly document ASCII-only restriction for LLM API compatibility
- Clarify that non-ASCII titles use auto-generated fallback names
- Mention the specific error being prevented
@doubleclip118
Copy link
Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a sanitizeToolName function to handle non-ASCII characters in tool names, preventing errors with LLM APIs. The new function correctly sanitizes names to be ASCII-only and generates a unique fallback name if the sanitization results in an empty string. The duplicated sanitization logic across the file has been refactored to use this new centralized function, which is a great improvement.

My feedback includes a suggestion to make the fallback name generation more robust by relying solely on cryptographic randomness instead of a timestamp, further reducing the chance of collisions.

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@doubleclip118
Copy link
Author

Hi @HenryHengZJ and @vinodkiran,

Thank you for your work on Flowise.

This PR addresses an issue related to the HTTP node in Agentflow v2 where certain responses fail to be handled properly under specific conditions. It includes improved handling for streams and empty response bodies, and adds safety checks to avoid runtime errors in common workflows.

Could you please take a look when you have time?
I can update this further based on feedback.

Thank you!

@@ -1,3 +1,4 @@
import crypto from 'crypto'

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: import { randomBytes } from 'crypto'


// If the result is empty (e.g., non-ASCII only input), generate a unique fallback name
if (!sanitized) {
return `tool_${crypto.randomBytes(8).toString('hex')}`
Copy link

@tianwei-liu tianwei-liu Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed you already used Math.random() below. I think it should be enough for the random name.
return `tool_${Date.now()}_${Math.random().toString(36).slice(2, 7)}`;
This can avoid importing the crypto library just for generating a random name, since it doesn't require cryptographic security.

It seems that Gemini proposed to use crypto.randomBytes. I'm open to either option.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants