Conversation

@BoBer78 (Collaborator) commented Oct 20, 2025

Closes #580 #566 #352

Main things

  • Made the switch to the Response API.
  • The Messages DB schema is now completely different --> it mimics the Response API.
  • Switched to Vercel v5 --> it has the same schema as the Response API.
  • Refactored HIL to only allow Accept / Reject.
  • Fixed some frontend bugs.

DB Schema change

IMPORTANT: if you want to back up your local DB before running the alembic upgrade, this is what I used:

docker exec -i <CONTAINER_ID> pg_dump -U postgres -d neuroagent -F c -v > backup.dump

docker exec -i <CONTAINER_ID> pg_restore -U postgres -d neuroagent -v --clean < backup.dump

I only kept the mandatory fields so that we can send the messages back to OpenAI. Some non-mandatory info is lost in the process.

The DB messages go from User, AI_TOOL, TOOL, and AI_MESSAGE to only User and Assistant. Everything is then stored in the Parts of the messages:

  • User messages only have one Message part.
  • Assistant messages can have Reasoning, Tool_call, Tool_call_output, and Message parts (any number, in any order).
    One Assistant message now corresponds to one full turn of the agent loop in AgentRoutine (see the sketch below).
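
For illustration, here is a hedged sketch of what one assistant turn looks like under the new schema. The concrete values are invented; each part's output holds the raw Response API item:

# A sketch of one full agent turn (field values are made up; the part
# fields mirror the Parts table shown later in this PR):
assistant_message = {
    "role": "assistant",
    "parts": [
        {"order_index": 0, "type": "reasoning",
         "output": {"type": "reasoning", "summary": []}},
        {"order_index": 1, "type": "function_call",
         "output": {"type": "function_call", "call_id": "call_123",
                    "name": "get_weather", "arguments": "{}"}},
        {"order_index": 2, "type": "function_call_output",
         "output": {"type": "function_call_output", "call_id": "call_123",
                    "output": "22C"}},
        {"order_index": 3, "type": "message",
         "output": {"type": "message", "role": "assistant",
                    "content": [{"type": "output_text", "text": "It is 22C."}]}},
    ],
}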

The most complex part of the PR lies in the migration script.

I tested it extensively: you can write some messages, upgrade to the new schema, write some more messages, downgrade, and repeat ten times, and it works. I did not test all the weird edge cases, like stopping a HIL message at a very specific timing. A sketch of the round-trip I ran is below.
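
Roughly, the round-trip looks like this when driven from alembic's Python API (a sketch; the alembic.ini path is an assumption about the repo layout):

# Sketch of the tested upgrade/downgrade cycle; adjust the config path.
from alembic import command
from alembic.config import Config

cfg = Config("backend/alembic.ini")
for _ in range(10):
    command.upgrade(cfg, "head")  # old role-based rows -> messages + parts
    # ... write a few messages against the new schema here ...
    command.downgrade(cfg, "-1")  # parts -> old role-based rows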

Response API change

IMPORTANT: OpenRouter's Response API is still in beta. When streaming, it sometimes sends the chunks in a different order, and the main LLM is not compatible with it for now. Non-streamed responses and parsed structured output work fine.

With the new DB schema, the only thing we have to do is:
https://github.com/openbraininstitute/neuroagent/blob/f6250e7cb16586d049e05bb9e5b72ea08b23c1a0/backend/src/neuroagent/utils.py#L28C1-L40C27
And we can send it to OpenAI.
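
The linked helper amounts to roughly this (a paraphrase, not the exact code): since each part's output already is a raw Response API item, rebuilding the input is mostly a flatten-and-sort.

from typing import Any

def db_to_openai_input(messages: list["Messages"]) -> list[dict[str, Any]]:
    # Hedged paraphrase of the linked utils.py helper: flatten every
    # message's parts, in order, into a list of Response API input items.
    return [
        part.output
        for message in messages
        for part in sorted(message.parts, key=lambda p: p.order_index)
    ]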

AgentRoutine Change

I made everything work with the Pydantic schemas from OpenAI, so now there are no random dicts; everything is an OpenAI type. One very convenient change is that OpenAI now does the concatenation on their side: once a Part is finished streaming, OpenAI sends us a chunk with the complete part.

temp_stream_data: dict[str, Any] = {

This keeps the temporary data if the user stops the stream. It also keeps the current tool calls that need to be executed until they are done.
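
Pieced together from the hunks further down, its shape is at least the following (there may be keys I missed):

temp_stream_data: dict[str, Any] = {
    "reasoning": {},        # item_id -> in-flight reasoning item
    "tool_calls": {},       # item_id -> in-flight function tool call
    "tool_to_execute": {},  # item_id -> completed call awaiting execution
}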

The new parts are appended (when they are complete) both to the history and to the new_message with this function:
https://github.com/openbraininstitute/neuroagent/blob/f6250e7cb16586d049e05bb9e5b72ea08b23c1a0/backend/src/neuroagent/utils.py#L250C1-L272C27

Vercel v5

A lot of small things changed. The main annoyance was that the way the chunks are streamed changed completely.
But with the switch to the Response API, one Vercel chunk == one Response chunk, so it was not that complicated.

Comment on lines +138 to +156
class Parts(Base):
    """SQL table for storing Response API parts (JSONB format)."""

    __tablename__ = "parts"
    part_id: Mapped[uuid.UUID] = mapped_column(
        UUID, primary_key=True, default=lambda: uuid.uuid4()
    )
    message_id: Mapped[uuid.UUID] = mapped_column(
        UUID, ForeignKey("messages.message_id"), nullable=False
    )
    order_index: Mapped[int] = mapped_column(Integer, nullable=False)
    type: Mapped[PartType] = mapped_column(Enum(PartType), nullable=False)
    output: Mapped[dict[str, Any]] = mapped_column(JSONB, nullable=False)
    is_complete: Mapped[bool] = mapped_column(Boolean, nullable=False)
    validated: Mapped[bool] = mapped_column(Boolean, nullable=True)

    message: Mapped[Messages] = relationship("Messages", back_populates="parts")

    __table_args__ = (Index("ix_parts_message_id", "message_id"),)

I decided to only keep the necessary fields in the table. output is the raw OpenAI output. This is slightly annoying because if we want to get to the content of a message, it is nested quite deeply, for example:
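
# Illustrative access path: `output` holds a raw Response API message
# item, so the text sits two levels down.
text = part.output["content"][0]["text"]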

WebSearchTool,
# NowTool,
# WeatherTool,
WeatherTool,

I will remove it. Left it for review.

return {"input_cached": None, "input_noncached": None, "completion": None}


def append_part(

I defined a couple of functions here to make AgentRoutine more readable. Feel free to protest. A sketch of append_part is below.
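
From its call sites, append_part behaves roughly like this (a sketch, not the real implementation in utils.py):

def append_part(new_message, history, item, part_type):
    # Sketch inferred from the call sites: serialize the completed
    # Response API item once, then record it in both places.
    payload = item.model_dump()      # item is an OpenAI pydantic type
    history.append(payload)          # JSON input for OpenAI's next call
    new_message.parts.append(        # row for the `parts` table
        Parts(
            order_index=len(new_message.parts),
            type=part_type,
            output=payload,
            is_complete=True,
        )
    )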

Comment on lines +128 to +135
setMessages((prevState) => {
  prevState[prevState.length - 1] = {
    ...prevState[prevState.length - 1],
    isComplete: false,
  };
  // We only change the metadata at message level and keep the rest.
  return prevState;
});

Might need to be revisited.

};
},
messages: retrievedMessages,
experimental_throttle: 50,

This eliminates the infinite re-render bugs, which are otherwise very frequent. Except when switching windows while the chat was streaming, I never encountered the bug again.

One thing to note: the infinite re-render bug is now caught by our error handler and stops the chat.

},
messages: retrievedMessages,
experimental_throttle: 50,
sendAutomaticallyWhen: lastAssistantHasAllToolOutputs,

Now we have to specify when we want the agent to send a message automatically.

Comment on lines +113 to +122
// Handle chat inputs.
const [input, setInput] = useState("");
const handleSubmit = (
e: React.FormEvent<HTMLFormElement | HTMLTextAreaElement>,
) => {
e.preventDefault();
sendMessage({ text: input });
setInput("");
};


Input changes and submits are now our responsibility, not Vercel's.

Comment on lines -154 to -170
// Handle streaming interruption
useEffect(() => {
  if (stopped) {
    setMessages((prevState) => {
      prevState[prevState.length - 1] = {
        ...prevState[prevState.length - 1],
        annotations: prevState
          .at(-1)
          ?.annotations?.map((ann) =>
            !ann.toolCallId ? { isComplete: false } : ann,
          ),
      };
      // We only change the annotation at message level and keep the rest.
      return prevState;
    });
  }
}, [stopped, setMessages]);

Might want to revisit that ... it works for now.

Comment on lines 172 to +180
useEffect(() => {
  if (isInvalidating || isFetching) return;
  // Set retrieved DB messages as current messages
  if (!stopped) {
    setMessages(() => [
      ...retrievedMessages,
      ...messages.filter(
        (m) => m.id.length !== 36 && !m.id.startsWith("msg"),
      ),
    ]);
  } else {
    setMessages(retrievedMessages);
  }
  // eslint-disable-next-line react-hooks/exhaustive-deps
}, [isInvalidating, isFetching]); // Re-run on new fetching or stop

I wanted to get rid of this. In my testing it works fine. But please do check again.


// Observer to fetch new pages :
useEffect(() => {
const container = containerRef.current;

I did some cleaning.

Comment on lines +48 to 51
root: scrollContainer,
rootMargin: "100px",
threshold: 0.1,
},

Without that there was a bug: sometimes new threads were not fetched.

Comment on lines +1 to +10
export const env = {
// Server
SERVER_SIDE_BACKEND_URL: process.env.SERVER_SIDE_BACKEND_URL,
NEXTAUTH_SECRET: process.env.NEXTAUTH_SECRET,
KEYCLOAK_ID: process.env.KEYCLOAK_ID,
KEYCLOAK_SECRET: process.env.KEYCLOAK_SECRET,
KEYCLOAK_ISSUER: process.env.KEYCLOAK_ISSUER,
// Client
NEXT_PUBLIC_BACKEND_URL: process.env.NEXT_PUBLIC_BACKEND_URL,
};

There was a compatibility issue between the new Vercel version and another package that was very old. I did this as one of the first steps of the PR. It is probably bad, but I did not have many issues when testing.

async def execute_tool_calls(
self,
tool_calls: list[ToolCalls],
tool_calls: list[ResponseFunctionToolCall],

I switched everything to the OpenAI types in the AgentRoutine.

await new_message.awaitable_attrs.token_consumption
# Initialize metadata for previous HIL tool calls
tool_map = {tool.name: tool for tool in active_agent.tools}
metadata_data = get_previous_hil_metadata(new_message, tool_map)

I did that to make the streamed Vercel messages and the messages from the DB as close as possible, so that handling everything in the frontend is easier.

and choice.delta.reasoning
usage_data = None
# for streaming interrupt and handling tool calls
temp_stream_data: dict[str, Any] = {

Now OpenAI gives us the whole part after it finishes streaming; they do the concatenation on their side. We just need to keep the temporary data if the user stops.

IMPORTANT: this is also where we store the tool calls that need to be executed after the stream finishes.

isinstance(event.item, ResponseReasoningItem)
and event.item.id
):
append_part(

We append to both:

  • history (JSON version for OpenAI's next call)
  • new_message (for the database)

Then we immediately delete from temp_stream_data --> so we know what still needs to be added to the DB in case of stopping.

isinstance(event.item, ResponseOutputMessage)
and event.item.id
):
append_part(

Same here.

# Tool call (args) deltas
case ResponseFunctionCallArgumentsDeltaEvent() if event.item_id:
# No call_id in the stream, we have to get it from temp_stream_data
tool_call_id = temp_stream_data["tool_calls"][

For some reason, OpenAI does not stream the call_id in the deltas.

Since the call_id is the only thing we have in the old DB schema before migration, and it is the ONLY mandatory field in the OpenAI Response API for identifying tool calls, I decided to keep the call_id as the main way to identify a tool call.

The frontend works only with this call_id, nothing else. Feel free to challenge that decision.
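
Concretely, the lookup in the delta handler is along these lines (my reading of the truncated hunk above; the exact attribute access is an assumption):

# Deltas only carry item_id, so the call_id comes from the in-flight
# item stored in temp_stream_data when the tool call started.
tool_call_id = temp_stream_data["tool_calls"][event.item_id].call_id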

Comment on lines +437 to +444
append_part(
new_message, history, event.item, PartType.FUNCTION_CALL
)
# Tool call ready --> remove from tool_calls, add to tool_to_execute
temp_stream_data["tool_calls"].pop(event.item.id, None)
temp_stream_data["tool_to_execute"][event.item.id] = (
event.item
)

This is how I track the pending tool calls: I add the item to history and new_message, remove it from tool_calls, and then put it in tool_to_execute.

Here, since these are new items coming from the Response API, we can use item.id to refer to them. The same holds every time I assign something to temp_stream_data.

Comment on lines +453 to +455
# case _:
# print(event.type)
# Some events are not needed. Not sure what we should do with them yet.

Some cases were not taken into account since they were duplicates. For example, the "true reasoning" (and not the summaries) may appear in other event types. Since OpenAI does not show them, I am not sure how to test it...

}
yield f"d:{json.dumps(done_data)}\n"
# End of agent loop. Add new message to DB.
messages.append(new_message)

history and new_message are updated during the loop. We just need to append new_message to the DB messages list.

except asyncio.exceptions.CancelledError:
if isinstance(message["tool_calls"], defaultdict):
message["tool_calls"] = list(message.get("tool_calls", {}).values())
# add parts not appended to `new_message`

We try to append the missing parts to new_message with is_complete = False. Then we simply append new_message, and the background task will write it to the DB.

# add parts not appended to `new_message`
if temp_stream_data["reasoning"]:
for reasoning_item in temp_stream_data["reasoning"].values():
del reasoning_item.id

OpenAI hates lonely reasoning parts. If I do not remove the ID it is not happy ...

content=json.dumps(message),
tool_calls=tool_calls,
is_complete=False,
append_part(

We append the incomplete tool calls plus a matching tool_call_output so that OpenAI is happy.

@BoBer78 BoBer78 marked this pull request as ready for review December 9, 2025 15:16