fix: allow Unicode characters in document store tool names #5599

doubleclip118 · 2025-12-18T06:53:04Z

This PR fixes the 'Invalid tools[0].function.name: empty string' error that occurs with non-ASCII document store titles. Instead of allowing Unicode characters (which would cause LLM API errors), it sanitizes tool names to ASCII-only and generates unique fallback names for non-ASCII titles.

- Add sanitizeToolName helper function that preserves Unicode letters (Korean, Chinese, Japanese, etc.) - Use Unicode-aware regex pattern [^\p{L}\p{N}_-] instead of [^a-z0-9_-] - Add fallback name generation when result is empty - Fixes 'Invalid tools[0].function.name: empty string' error for non-ASCII titles

gemini-code-assist · 2025-12-18T06:53:18Z

Summary of Changes

Hello @doubleclip118, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves an issue where tool names containing non-ASCII characters would lead to validation errors. It introduces a robust sanitization utility that supports Unicode characters and ensures a valid tool name is always generated, significantly enhancing the internationalization capabilities and stability of tool name handling within the system.

Highlights

Unicode Support for Tool Names: Introduced a new sanitizeToolName helper function that uses a Unicode-aware regular expression ([^\p{L}\p{N}_-]) to allow letters from various languages (e.g., Korean, Chinese, Japanese) in document store tool names.
Fallback Name Generation: The sanitizeToolName function now includes a fallback mechanism to generate a unique name (e.g., tool_timestamp) if the sanitization process results in an empty string, preventing 'Invalid tools[0].function.name: empty string' errors.
Refactored Tool Name Sanitization: Existing inline logic for sanitizing tool names for document stores and knowledge bases has been replaced with calls to the new sanitizeToolName helper, centralizing and improving the consistency of tool name generation.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces a sanitizeToolName helper function to allow Unicode characters in tool names, addressing an issue where non-ASCII titles caused errors. The new function correctly uses a Unicode-aware regex and provides a fallback for empty names. The implementation is a significant improvement, centralizing the sanitization logic. My review includes a high-severity suggestion to prevent potential tool name collisions in the fallback mechanism and a medium-severity suggestion to simplify a conditional check for better code clarity.

gemini-code-assist · 2025-12-18T06:54:15Z

packages/components/nodes/agentflow/Agent/Agent.ts

+
+    // If the result is empty, generate a fallback name
+    if (!sanitized || sanitized.trim() === '') {
+        return `tool_${Date.now()}`


The fallback name generation using Date.now() is not guaranteed to be unique. If this function is called multiple times within the same millisecond for different tools whose names both sanitize to an empty string, it will produce duplicate tool names. This can lead to unpredictable behavior, as the agent might use the wrong tool. To ensure uniqueness, consider appending a random component to the fallback name.

Suggested change

return `tool_${Date.now()}`

return `tool_${Date.now()}_${Math.random().toString(36).slice(2, 7)}`

gemini-code-assist · 2025-12-18T06:54:15Z

packages/components/nodes/agentflow/Agent/Agent.ts

+        .replace(/[^\p{L}\p{N}_-]/gu, '') // Allow Unicode letters and numbers
+
+    // If the result is empty, generate a fallback name
+    if (!sanitized || sanitized.trim() === '') {


The condition !sanitized || sanitized.trim() === '' can be simplified. Given the preceding replacement operations (.replace(/ /g, '_') and .replace(/[^\\p{L}\\p{N}_-]/gu, '')), the sanitized string will not contain any leading or trailing whitespace. If the string becomes empty, !sanitized will be true. Therefore, the sanitized.trim() === '' check is redundant and can be removed for clarity.

Suggested change

if (!sanitized || sanitized.trim() === '') {

if (!sanitized) {

- Simplify empty check by removing unnecessary trim() call - Add random suffix to fallback name for uniqueness guarantee - Prevents potential duplicate tool names in same millisecond

doubleclip118 · 2025-12-19T04:20:17Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a sanitizeToolName helper function to handle Unicode characters in tool names, preventing errors when non-ASCII titles are used. The new function replaces duplicated sanitization logic throughout the Agent.ts file.

My review focuses on the implementation of sanitizeToolName. While the intent to support Unicode is good, the current implementation may cause issues with LLM APIs that have strict naming conventions for tools. I've provided suggestions to make the sanitization more robust and compatible with major models like OpenAI, Anthropic, and Gemini. I've also included recommendations for enforcing a character limit and improving the generation of unique fallback names.

gemini-code-assist · 2025-12-19T04:22:11Z

packages/components/nodes/agentflow/Agent/Agent.ts

+const sanitizeToolName = (name: string): string => {
+    const sanitized = name
+        .toLowerCase()
+        .replace(/ /g, '_')
+        .replace(/[^\p{L}\p{N}_-]/gu, '') // Allow Unicode letters and numbers
+
+    // If the result is empty, generate a fallback name with random suffix for uniqueness
+    if (!sanitized) {
+        return `tool_${Date.now()}_${Math.random().toString(36).slice(2, 7)}`
+    }
+
+    return sanitized
+}


While allowing Unicode characters in tool names is a good intention for internationalization, it might cause issues with the underlying LLM APIs. Most major models have strict restrictions on tool names:

OpenAI & Anthropic: ^[a-zA-Z0-9_-]{1,64}$

Google Gemini: [a-zA-Z0-9_]+ (max 63 chars, no hyphens)

The current regex /[^\p{L}\p{N}_-]/gu is too permissive and will allow characters that these models reject, potentially causing runtime errors.

A safer approach is to use a more restrictive regex that only allows characters supported by most models. The fallback mechanism you've added will correctly handle names that become empty after sanitization (like names composed entirely of non-ASCII characters).

Additionally, it's good practice to enforce a maximum length for tool names (e.g., 64 characters). I've included this in the suggestion below.

Suggested change

const sanitizeToolName = (name: string): string => {

const sanitized = name

.toLowerCase()

.replace(/ /g, '_')

.replace(/[^\p{L}\p{N}_-]/gu, '') // Allow Unicode letters and numbers

// If the result is empty, generate a fallback name with random suffix for uniqueness

if (!sanitized) {

return `tool_${Date.now()}_${Math.random().toString(36).slice(2, 7)}`

}

return sanitized

}

const sanitizeToolName = (name: string): string => {

const sanitized = name

.toLowerCase()

.replace(/ /g, '_')

.replace(/[^a-z0-9_-]/g, ''); // Restrict to characters allowed by major LLMs

// If the result is empty, generate a fallback name with random suffix for uniqueness

if (!sanitized) {

return `tool_${Date.now()}_${Math.random().toString(36).slice(2, 7)}`;

}

// Enforce a 64-character limit, which is common for tool names

return sanitized.slice(0, 64);

}

gemini-code-assist · 2025-12-19T04:22:11Z

packages/components/nodes/agentflow/Agent/Agent.ts

+
+    // If the result is empty, generate a fallback name with random suffix for uniqueness
+    if (!sanitized) {
+        return `tool_${Date.now()}_${Math.random().toString(36).slice(2, 7)}`


For generating unique identifiers, crypto.randomBytes is generally preferred over Math.random() as it provides cryptographically stronger random values, reducing the chance of collisions, however unlikely. Since this file uses CommonJS (module.exports), you can use require('crypto') inline.

Suggested change

return `tool_${Date.now()}_${Math.random().toString(36).slice(2, 7)}`

return `tool_${Date.now()}_${require('crypto').randomBytes(3).toString('hex')}`;

- Add crypto import at the top of the file - Replace inline require('crypto') with imported crypto module - Restrict tool names to LLM-compatible characters [a-z0-9_-] - Add 64 character limit for tool names - Use crypto.randomBytes for stronger random fallback names

doubleclip118 · 2025-12-19T04:40:44Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a sanitizeToolName helper function to allow Unicode characters in document store tool names, fixing an issue where non-ASCII titles would result in an invalid empty tool name. The new function also adds a fallback name generation mechanism and enforces a 64-character limit. The repeated sanitization logic has been correctly refactored to use this new centralized function.

My main feedback is that the regular expression in sanitizeToolName does not actually allow Unicode characters as intended by this pull request. I've provided a suggestion to correct the regex to align with the PR's objective.

gemini-code-assist · 2025-12-19T04:41:43Z

packages/components/nodes/agentflow/Agent/Agent.ts

+    const sanitized = name
+        .toLowerCase()
+        .replace(/ /g, '_')
+        .replace(/[^a-z0-9_-]/g, '') // Restrict to characters allowed by major LLMs


The current regex /[^a-z0-9_-]/g does not align with the PR's goal to support Unicode characters in tool names, as it strips out any non-ASCII characters. This contradicts the PR description which states: 'Use Unicode-aware regex pattern [^\p{L}\p{N}_-] instead of [^a-z0-9_-]'.

To fix this and allow Unicode characters from languages like Korean, Chinese, and Japanese, the regex should be updated. The u flag is necessary for Unicode property escapes to work correctly.

I've also updated the comment to reflect the change, as the previous comment was contradictory to allowing Unicode characters.

Suggested change

.replace(/[^a-z0-9_-]/g, '') // Restrict to characters allowed by major LLMs

.replace(/[^\p{L}\p{N}_-]/gu, '') // Allow Unicode letters/numbers, underscore, and hyphen

- Explicitly document ASCII-only restriction for LLM API compatibility - Clarify that non-ASCII titles use auto-generated fallback names - Mention the specific error being prevented

doubleclip118 · 2025-12-19T04:47:42Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a sanitizeToolName function to handle non-ASCII characters in tool names, preventing errors with LLM APIs. The new function correctly sanitizes names to be ASCII-only and generates a unique fallback name if the sanitization results in an empty string. The duplicated sanitization logic across the file has been refactored to use this new centralized function, which is a great improvement.

My feedback includes a suggestion to make the fallback name generation more robust by relying solely on cryptographic randomness instead of a timestamp, further reducing the chance of collisions.

packages/components/nodes/agentflow/Agent/Agent.ts

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

doubleclip118 · 2025-12-19T05:22:26Z

Hi @HenryHengZJ and @vinodkiran,

Thank you for your work on Flowise.

This PR addresses an issue related to the HTTP node in Agentflow v2 where certain responses fail to be handled properly under specific conditions. It includes improved handling for streams and empty response bodies, and adds safety checks to avoid runtime errors in common workflows.

Could you please take a look when you have time?
I can update this further based on feedback.

Thank you!

tianwei-liu · 2025-12-19T18:37:01Z

packages/components/nodes/agentflow/Agent/Agent.ts

@@ -1,3 +1,4 @@
+import crypto from 'crypto'


nit: import { randomBytes } from 'crypto'

tianwei-liu · 2025-12-19T18:43:05Z

packages/components/nodes/agentflow/Agent/Agent.ts

+
+    // If the result is empty (e.g., non-ASCII only input), generate a unique fallback name
+    if (!sanitized) {
+        return `tool_${crypto.randomBytes(8).toString('hex')}`


I noticed you already used Math.random() below. I think it should be enough for the random name.
return `tool_${Date.now()}_${Math.random().toString(36).slice(2, 7)}`;
This can avoid importing the crypto library just for generating a random name, since it doesn't require cryptographic security.

It seems that Gemini proposed to use crypto.randomBytes. I'm open to either option.

gemini-code-assist bot reviewed Dec 18, 2025

View reviewed changes

refactor: improve sanitizeToolName based on code review

a20dfc8

- Simplify empty check by removing unnecessary trim() call - Add random suffix to fallback name for uniqueness guarantee - Prevents potential duplicate tool names in same millisecond

gemini-code-assist bot reviewed Dec 19, 2025

View reviewed changes

docs: clarify sanitizeToolName behavior in comments

e3a6e0c

- Explicitly document ASCII-only restriction for LLM API compatibility - Clarify that non-ASCII titles use auto-generated fallback names - Mention the specific error being prevented

gemini-code-assist bot reviewed Dec 19, 2025

View reviewed changes

packages/components/nodes/agentflow/Agent/Agent.ts Outdated Show resolved Hide resolved

Update packages/components/nodes/agentflow/Agent/Agent.ts

cf04c93

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

HenryHengZJ requested review from jocelynlin-wd and tianwei-liu December 19, 2025 11:01

tianwei-liu reviewed Dec 19, 2025

View reviewed changes

doubleclip118 added 2 commits December 22, 2025 09:51

chore: fix prettier formatting

527e910

chore: fix prettier formatting

e89137d

	return `tool_${Date.now()}`
	return `tool_${Date.now()}_${Math.random().toString(36).slice(2, 7)}`

	if (!sanitized \|\| sanitized.trim() === '') {
	if (!sanitized) {

	return `tool_${Date.now()}_${Math.random().toString(36).slice(2, 7)}`
	return `tool_${Date.now()}_${require('crypto').randomBytes(3).toString('hex')}`;

	.replace(/[^a-z0-9_-]/g, '') // Restrict to characters allowed by major LLMs
	.replace(/[^\p{L}\p{N}_-]/gu, '') // Allow Unicode letters/numbers, underscore, and hyphen

Uh oh!

fix: allow Unicode characters in document store tool names #5599

Are you sure you want to change the base?

fix: allow Unicode characters in document store tool names #5599

Conversation

doubleclip118 commented Dec 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gemini-code-assist bot commented Dec 18, 2025

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Dec 18, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Dec 18, 2025

Choose a reason for hiding this comment

Uh oh!

doubleclip118 commented Dec 19, 2025

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Dec 19, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Dec 19, 2025

Choose a reason for hiding this comment

Uh oh!

doubleclip118 commented Dec 19, 2025

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Dec 19, 2025

Choose a reason for hiding this comment

Uh oh!

doubleclip118 commented Dec 19, 2025

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

doubleclip118 commented Dec 19, 2025

Uh oh!

tianwei-liu Dec 19, 2025

Choose a reason for hiding this comment

Uh oh!

tianwei-liu Dec 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

doubleclip118 commented Dec 18, 2025 •

edited

Loading

tianwei-liu Dec 19, 2025 •

edited

Loading