
feat: send SIGHUP to MCP container after config updates#316

Merged
rshoemaker merged 1 commit into main from feat/PLAT-480/mcp_config_reloads on Mar 25, 2026

Conversation

@rshoemaker (Contributor) commented Mar 24, 2026

Summary

  • When MCPConfigResource.Update() rewrites config.yaml, the running MCP container now receives SIGHUP to trigger a hot config reload — no container restart needed
  • Previously, config changes only took effect after a full container restart (via service spec change or force_update=true)
  • Adds a generic ContainerSignal(ctx, containerID, signal) method to the Docker client wrapper, usable by any service type
  • Signal delivery is best-effort — if the container isn't running yet or the signal fails, a warning is logged and the update still succeeds

Test plan

  • Verified docker kill --signal SIGHUP triggers reload in MCP container logs (Received SIGHUP, reloading configuration...)
  • Verified container start time is unchanged after signal (no restart)
  • Verified full update-database workflow delivers SIGHUP after config write
  • Verified allow_writes flip (RO → RW credential switch) works via SIGHUP — MCP server closes old DB connections and reconnects with new credentials
  • Verified write query succeeds after SIGHUP-triggered credential switch (CREATE TABLE, schema 21 → 22 tables)
  • E2E tests passing: TestProvisionMCPService, TestProvisionMultiHostMCPService, TestUpdateMCPServiceConfig, TestUpdateDatabaseServiceStable, and all other MCP service tests

PLAT-480

After MCPConfigResource.Update() writes a new config.yaml, send SIGHUP
to the running MCP container so it hot-reloads without a full restart.

- Add ContainerSignal() to the Docker client wrapper, using the ContainerKill API with an arbitrary signal
- Add signalConfigReload() to MCPConfigResource, called after config write on Update only (not Create, since the container doesn't exist yet)
- Signal failure is non-fatal: the config file is already correct on disk and will be picked up on next container restart

coderabbitai bot commented Mar 24, 2026

📝 Walkthrough

Adds a Docker client method to send arbitrary signals to containers and changes the MCP config resource to regenerate config.yaml, write it only when changed, and best-effort notify the running MCP container (choosing SIGHUP or SIGTERM) without failing the overall update flow.

Changes

Docker Client Enhancement — server/internal/docker/docker.go
  Added exported method (*Docker).ContainerSignal(ctx context.Context, containerID string, signal string) error that forwards the request to the Docker Engine ContainerKill API to deliver arbitrary signals without stopping the container.

MCP Config generation & reload — server/internal/orchestrator/swarm/mcp_config_resource.go
  Update now generates config.yaml bytes and uses a read-compare-write flow to write only when content differs. Added helpers: generateConfigContent, writeConfigBytes, and writeConfigFileIfChanged. Create still writes unconditionally when no prior file exists (tokens/users left untouched).
Reload decision & signaling — server/internal/orchestrator/swarm/mcp_config_resource.go
  Added reloadSignal decision logic to choose between SIGHUP and SIGTERM by comparing previous vs. current active credentials and DB-related fields. Added signalConfigReload, which obtains *docker.Docker and zerolog.Logger from the injector, looks up the MCP container by ServiceInstanceID, returns nil when no container exists, attempts to deliver the chosen signal via ContainerSignal, logs a warning and returns nil on delivery failure, and logs at info level on success. Injector and container-lookup errors are propagated.
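
The reloadSignal decision can be sketched as a pure function. The struct fields and the exact mapping (credential or DB change → SIGTERM, anything else → SIGHUP) are assumptions inferred from the walkthrough and the review snippet; the real resource compares more fields.

```go
package main

import "fmt"

// mcpConfig holds only the fields this sketch compares; the real resource
// carries more. Field names are illustrative.
type mcpConfig struct {
	ActiveCredential string
	DatabaseHost     string
}

// reloadSignal sketches the decision logic: when active credentials or
// DB-related fields changed, assume SIGTERM is needed (so the server drops
// and rebuilds its DB connections); otherwise a SIGHUP hot reload suffices.
// The exact signal mapping is an assumption, not confirmed by the PR.
func reloadSignal(prev, curr mcpConfig) string {
	if prev.ActiveCredential != curr.ActiveCredential || prev.DatabaseHost != curr.DatabaseHost {
		return "SIGTERM"
	}
	return "SIGHUP"
}

func main() {
	prev := mcpConfig{ActiveCredential: "app_ro", DatabaseHost: "db-1"}
	flipped := mcpConfig{ActiveCredential: "app_rw", DatabaseHost: "db-1"}

	fmt.Println(reloadSignal(prev, prev))    // SIGHUP: nothing credential-related changed
	fmt.Println(reloadSignal(prev, flipped)) // SIGTERM: RO → RW credential switch
}
```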

Poem

🐰 I nibble bytes and write them new,
I check the file — if changed, I strew.
A gentle hop, a signal sent,
SIGHUP or TERM — intent well-meant.
Hooray! The config wakes anew.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 0.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)

  • Title check — ✅ Passed: the title accurately describes the main change, adding SIGHUP signaling to the MCP container after config updates.
  • Description check — ✅ Passed: the PR description covers all required template sections with clear, detailed information about the changes and testing.


coderabbitai bot left a comment

🧹 Nitpick comments (1)
server/internal/docker/docker.go (1)

536-542: Consider wrapping errors for consistency with other methods.

Other methods in this file (e.g., ContainerStop, ContainerRemove, ContainerInspect) wrap errors with errTranslate and contextual messages. While the caller treats errors as non-fatal warnings, consistent error handling would improve debuggability.

♻️ Suggested improvement
 func (d *Docker) ContainerSignal(ctx context.Context, containerID string, signal string) error {
-	return d.client.ContainerKill(ctx, containerID, signal)
+	if err := d.client.ContainerKill(ctx, containerID, signal); err != nil {
+		return fmt.Errorf("failed to signal container: %w", errTranslate(err))
+	}
+	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/internal/docker/docker.go` around lines 536 - 542, ContainerSignal
currently returns the raw error from d.client.ContainerKill; update it to mirror
other methods by capturing the error, wrapping it with errTranslate and a short
contextual message (e.g., "failed to signal container <id> with <signal>")
before returning; locate the ContainerSignal function and replace the direct
return of d.client.ContainerKill with error handling that calls
errTranslate(err, containerID) (and includes the containerID and signal in the
context message) so error reporting is consistent with ContainerStop,
ContainerRemove, and ContainerInspect.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c044bef3-cf90-46fb-8156-6900c8ccbef1

📥 Commits

Reviewing files that changed from the base of the PR and between ada912a and a820845.

📒 Files selected for processing (2)
  • server/internal/docker/docker.go
  • server/internal/orchestrator/swarm/mcp_config_resource.go

coderabbitai bot left a comment

🧹 Nitpick comments (1)
server/internal/orchestrator/swarm/mcp_config_resource.go (1)

282-329: Well-documented best-effort implementation.

The method correctly implements the best-effort semantics with clear differentiation between systemic errors (injector failures) and transient issues (container not found, signal failed). The TODO referencing PLAT-527 for the SIGTERM→SIGHUP transition is helpful for tracking.

Consider adding ServiceInstanceID to log entries for observability.

Adding the service instance ID as a structured field would improve traceability in multi-service environments, especially for the debug log at line 310 where no container ID is available.

♻️ Optional: Add ServiceInstanceID to log entries
 	container, err := GetServiceContainer(ctx, dockerClient, r.ServiceInstanceID)
 	if err != nil {
 		if errors.Is(err, ErrNoServiceContainer) {
-			logger.Debug().Msg("no running MCP container found, skipping config reload signal")
+			logger.Debug().
+				Str("service_instance_id", r.ServiceInstanceID).
+				Msg("no running MCP container found, skipping config reload signal")
 			return nil
 		}
-		logger.Warn().Err(err).Msg("failed to find service container for config reload signal")
+		logger.Warn().Err(err).
+			Str("service_instance_id", r.ServiceInstanceID).
+			Msg("failed to find service container for config reload signal")
 		return nil
 	}

 	if err := dockerClient.ContainerSignal(ctx, container.ID, "SIGTERM"); err != nil {
 		logger.Warn().Err(err).
 			Str("container_id", container.ID).
+			Str("service_instance_id", r.ServiceInstanceID).
 			Msg("failed to send SIGTERM to MCP container")
 		return nil
 	}

 	logger.Info().
 		Str("container_id", container.ID).
+		Str("service_instance_id", r.ServiceInstanceID).
 		Msg("sent SIGTERM to MCP container for config reload")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/internal/orchestrator/swarm/mcp_config_resource.go` around lines 282 -
329, The logs in signalConfigReload omit the ServiceInstanceID, reducing
traceability; update all logger calls in signalConfigReload (notably the
logger.Debug() when ErrNoServiceContainer, the logger.Warn() on
GetServiceContainer failure, the logger.Warn() on ContainerSignal failure, and
the logger.Info() after sending the signal) to include the ServiceInstanceID
(r.ServiceInstanceID) as a structured field so every branch emits the service
instance identifier for observability.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0d1a8c07-a810-45b7-a8ef-f183df05a9cf

📥 Commits

Reviewing files that changed from the base of the PR and between a820845 and 8abac18.

📒 Files selected for processing (1)
  • server/internal/orchestrator/swarm/mcp_config_resource.go

	Str("container_id", container.ID).
	Str("signal", signal).
	Msg("failed to signal MCP container")
	return nil
A reviewer (Contributor) commented:
Are we treating this return as a success?

rshoemaker (Author) replied:

Dave fixed the root issue in MCP, so I was able to revert back to a simpler version of the code with no workarounds. However, sending the signals is still "best-effort" - if it fails, we log a message and move on.

rshoemaker force-pushed the feat/PLAT-480/mcp_config_reloads branch from 03bdcf1 to a820845 on March 25, 2026 at 15:51
@mmols (Member) left a comment

Works from my end - I tested various config updates and updates to target_nodes.

rshoemaker merged commit 0f812ab into main on Mar 25, 2026
3 checks passed