Hey,
So I've been debugging an issue where I've seen MCP server processes build up on my long-lived copilot-sdk pods.
More context on what I built is here, but at a high level we pop work off a queue and process it with copilot-cli via copilot-sdk.
This is the symptom:
```
[atcloud@atai-765cbcc889-wzlxs atai]$ ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
atcloud 1 0.3 1.6 74733124 262104 ? Ssl 10:58 0:17 /usr/local/autotrader/atai/atai start-worker
atcloud 15 0.4 1.6 74737288 269632 ? Ssl 10:58 0:22 /usr/local/autotrader/atai/atai start-worker-child
atcloud 27 0.1 1.9 18088604 320432 ? Sl 10:58 0:08 node /usr/local/autotrader/atai/node_modules/@github/copilot/index.js --allow-all-tools --allow-all-urls --allow-all-paths --disable-mcp-server=gi
atcloud 140 0.2 2.2 74770060 360172 ? Sl 11:12 0:09 /usr/local/autotrader/atai/atai mcp
atcloud 3434 0.2 2.0 74737356 332484 ? Sl 11:37 0:06 /usr/local/autotrader/atai/atai mcp
atcloud 9684 0.2 1.9 74737292 314188 ? Sl 11:57 0:04 /usr/local/autotrader/atai/atai mcp
atcloud 13970 0.0 0.0 9028 3328 pts/0 R+ 12:27 0:00 ps aux
```
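For illustration, counting the leaked servers can be scripted rather than eyeballed. This is a hypothetical helper (not part of my app): it parses `ps aux` output and counts the `atai mcp` lines.

```javascript
// Count leaked MCP server processes in `ps aux` output.
// Hypothetical helper for illustration; in practice I just read ps by hand.
function countMcpProcesses(psOutput, pattern = /\batai mcp\b/) {
  return psOutput
    .split("\n")
    .slice(1) // drop the USER/PID header row
    .filter((line) => pattern.test(line)).length;
}

// Using an excerpt of the pod's ps output from above:
const sample = `USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
atcloud 1 0.3 1.6 74733124 262104 ? Ssl 10:58 0:17 /usr/local/autotrader/atai/atai start-worker
atcloud 140 0.2 2.2 74770060 360172 ? Sl 11:12 0:09 /usr/local/autotrader/atai/atai mcp
atcloud 3434 0.2 2.0 74737356 332484 ? Sl 11:37 0:06 /usr/local/autotrader/atai/atai mcp
atcloud 9684 0.2 1.9 74737292 314188 ? Sl 11:57 0:04 /usr/local/autotrader/atai/atai mcp`;

console.log(countMcpProcesses(sample)); // 3
```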
So as you can see, a bunch of MCP servers.
Our flow is basically:
- start copilot
- receive a work item
- create a session
- process request
- stop and delete session
The copilot process lives on but sessions are per request. Since MCP configuration is tied to the session, I'd expect session teardown to stop the MCP server, but it does not.
I actually asked Copilot to investigate this across my app, the SDK and the CLI; here's its report, which I think explains it better than I can!
Summary
client.deleteSession() / session.disconnect() leave session-scoped stdio MCP subprocesses running until client.stop() / client.forceStop()
When a session is created with mcpServers, calling the SDK's session cleanup APIs does not tear down the stdio MCP subprocesses started for that session.
In my case the MCP server is atai mcp, but I do not think this is specific to that server. The important part is that the MCP server is configured per session via mcpServers and started as a stdio subprocess.
client.deleteSession(sessionId) succeeds, but the MCP child process remains alive under the long-lived Copilot CLI process. The child only disappears when client.stop() or client.forceStop() terminates the CLI process itself.
Why I am filing this in @github/copilot-sdk
This is observable through the SDK's public lifecycle APIs:
- session.disconnect() sends session.destroy
- client.deleteSession(sessionId) sends session.delete
- both look like the correct session-scoped cleanup APIs for session-owned resources
- neither actually tears down the session-owned MCP subprocesses
So from an SDK consumer's point of view, the session lifecycle contract is incomplete or at least surprising.
That said, after tracing the installed code, the likely implementation fix is in the bundled @github/copilot CLI, because that is where session.destroy / session.delete are handled and where MCP teardown appears to be missing.
Environment
- @github/copilot-sdk 0.3.0
- @github/copilot 1.0.36
- useStdio: true
- session-scoped stdio MCP server configured via mcpServers
- reproduced in a long-lived worker and in a standalone local repro script
Usage Pattern
My real application has:
- one long-lived CopilotClient
- one new session per job
- a session-scoped stdio MCP server configured through mcpServers
- per-job cleanup of session.abort() followed by client.deleteSession(session.sessionId)
- whole-worker cleanup of client.forceStop() only when the worker exits
That initially looked like it might be an application bug, but the same leak reproduces in a standalone script that only creates a client, creates one session, sends one prompt, deletes the session, and inspects the child processes.
Minimal Repro
```js
import { spawnSync } from "node:child_process";
import { CopilotClient, approveAll } from "@github/copilot-sdk";

const client = new CopilotClient({
  useLoggedInUser: true,
  useStdio: true,
  cwd: process.cwd(),
  cliArgs: [
    "--allow-all-tools",
    "--allow-all-urls",
    "--allow-all-paths",
    "--autopilot",
    "--no-ask-user",
    "--no-auto-update"
  ]
});

const session = await client.createSession({
  clientName: "repro",
  model: "gpt-5.4-mini",
  onPermissionRequest: approveAll,
  streaming: false,
  workingDirectory: process.cwd(),
  infiniteSessions: { enabled: true },
  mcpServers: {
    atai: {
      type: "stdio",
      ...whatever,
      tools: ["*"]
    }
  }
});

await session.sendAndWait({ prompt: "Reply with exactly ok." }, 30000);
// At this point the MCP subprocess is running under the Copilot CLI process.

await client.deleteSession(session.sessionId);
// Expected: the session-scoped MCP subprocess exits here.
// Actual: the MCP subprocess is still alive.

await client.forceStop();
// Only here does the MCP subprocess disappear.
```
Observed Behavior
- Create a session with a stdio MCP server in mcpServers.
- Send a prompt so the MCP server is actually started.
- Call client.deleteSession(sessionId).
- Observe that the MCP subprocess is still alive.
- Call client.stop() or client.forceStop().
- Observe that the MCP subprocess finally disappears when the entire Copilot CLI process is terminated.
I also checked session.disconnect(). It behaves the same way, which makes sense if both session.destroy and session.delete share the same CLI cleanup path.
Expected Behavior
I would expect one of these to be true:
- session.disconnect() and client.deleteSession() should fully tear down session-owned MCP servers and subprocesses.
- Or the SDK documentation should explicitly state that these APIs only perform logical session cleanup, and that process cleanup for session-owned MCP servers requires client-level teardown.
I think the first behavior is the correct one, because mcpServers are configured per session and feel like session-owned resources.
What I Found In The Installed Code
SDK Side
From the installed SDK code:
- session.disconnect() sends session.destroy
- session.abort() sends session.abort
- client.deleteSession() sends session.delete
- client.stop() and client.forceStop() are the only SDK paths I found that terminate the spawned Copilot CLI process itself
So on the SDK side, the lifecycle methods are thin RPC wrappers around CLI behavior.
CLI Side (@github/copilot)
From the installed bundled CLI code:
- session.destroy and session.delete both route through the same session cleanup path
- that cleanup path removes listeners, exporters, and bookkeeping
- I could not find session-owned MCP teardown in that cleanup path
- the concrete session dispose() implementation I found shuts down shell context and telemetry, but I could not find MCP teardown there either
- there is also a separate mcpHostCache.cleanup() helper for cached MCP hosts, and I could not find a caller for that either
So the user-visible problem appears at the SDK API boundary, but the likely code change belongs in @github/copilot session teardown.
Why This Matters
In a long-lived worker or daemon process, this leaks one or more MCP subprocesses per completed session.
Over time that can mean:
- stale MCP servers remaining alive after session completion
- unnecessary process and memory growth
- a risk of cross-session state surviving longer than expected
- pressure to recreate the entire CopilotClient per job, which defeats the point of keeping a long-lived client around
Temporary Workaround
The only reliable workaround I found is to tear down the whole client process:
- await client.stop()
- or await client.forceStop()
That works because it terminates the parent Copilot CLI process, which also takes the leaked MCP subprocess with it. But it is a much heavier operation than session cleanup and changes client lifetime semantics.
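A middle ground I've been considering (sketch only, not something the SDK prescribes) is to keep the long-lived client but recycle it after a fixed number of jobs, so the leak stays bounded. The `createClient` and `runJob` hooks and the `restartEvery` threshold here are my own invention; the only real SDK call is the `forceStop()` teardown described above.

```javascript
// Sketch: bound the MCP leak by recycling the client every N jobs.
// createClient / runJob / restartEvery are hypothetical names; forceStop()
// is the heavy-handed cleanup that actually reaps leaked MCP children.
async function workerLoop({ createClient, runJob, jobs, restartEvery = 50 }) {
  let client = await createClient();
  let jobsOnThisClient = 0;
  for (const job of jobs) {
    await runJob(client, job); // create session, process, deleteSession
    jobsOnThisClient += 1;
    if (jobsOnThisClient >= restartEvery) {
      await client.forceStop(); // reaps any leaked MCP subprocesses
      client = await createClient();
      jobsOnThisClient = 0;
    }
  }
  await client.forceStop();
}
```

With `restartEvery: 50`, at most ~50 leaked MCP subprocesses accumulate before the CLI process (and its children) are torn down, instead of growing without bound.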
Overlap With @github/copilot
I think this belongs in both places, but for different reasons:
- @github/copilot-sdk: the problem is exposed through the public SDK lifecycle APIs, and the current behavior is surprising for SDK consumers
- @github/copilot: the implementation fix probably needs to happen in the CLI bundle's session teardown path, because that is where session.destroy / session.delete are handled and where MCP teardown appears to be missing
If you would prefer, I can file the companion issue in @github/copilot instead and link the two.