feat: send SIGHUP to MCP container after config updates (#316)
Conversation
After `MCPConfigResource.Update()` writes a new `config.yaml`, send SIGHUP to the running MCP container so it hot-reloads without a full restart.

- Add `ContainerSignal()` to the Docker client wrapper, using the `ContainerKill` API with an arbitrary signal
- Add `signalConfigReload()` to `MCPConfigResource`, called after the config write on Update only (not Create, since the container doesn't exist yet)
- Signal failure is non-fatal: the config file is already correct on disk and will be picked up on the next container restart
📝 Walkthrough: Adds a Docker client method to send arbitrary signals to containers, and changes the MCP config resource to regenerate the config and signal the running container on update.
🧹 Nitpick comments (1)
server/internal/docker/docker.go (1)
536-542: Consider wrapping errors for consistency with other methods.

Other methods in this file (e.g., `ContainerStop`, `ContainerRemove`, `ContainerInspect`) wrap errors with `errTranslate` and contextual messages. While the caller treats errors as non-fatal warnings, consistent error handling would improve debuggability.

♻️ Suggested improvement

```diff
 func (d *Docker) ContainerSignal(ctx context.Context, containerID string, signal string) error {
-	return d.client.ContainerKill(ctx, containerID, signal)
+	if err := d.client.ContainerKill(ctx, containerID, signal); err != nil {
+		return fmt.Errorf("failed to signal container: %w", errTranslate(err))
+	}
+	return nil
 }
```
🧹 Nitpick comments (1)
server/internal/orchestrator/swarm/mcp_config_resource.go (1)
282-329: Well-documented best-effort implementation.

The method correctly implements the best-effort semantics with clear differentiation between systemic errors (injector failures) and transient issues (container not found, signal failed). The TODO referencing PLAT-527 for the SIGTERM→SIGHUP transition is helpful for tracking.

Consider adding `ServiceInstanceID` to log entries for observability. Adding the service instance ID as a structured field would improve traceability in multi-service environments, especially for the debug log at line 310 where no container ID is available.

♻️ Optional: Add ServiceInstanceID to log entries

```diff
 container, err := GetServiceContainer(ctx, dockerClient, r.ServiceInstanceID)
 if err != nil {
 	if errors.Is(err, ErrNoServiceContainer) {
-		logger.Debug().Msg("no running MCP container found, skipping config reload signal")
+		logger.Debug().
+			Str("service_instance_id", r.ServiceInstanceID).
+			Msg("no running MCP container found, skipping config reload signal")
 		return nil
 	}
-	logger.Warn().Err(err).Msg("failed to find service container for config reload signal")
+	logger.Warn().Err(err).
+		Str("service_instance_id", r.ServiceInstanceID).
+		Msg("failed to find service container for config reload signal")
 	return nil
 }

 if err := dockerClient.ContainerSignal(ctx, container.ID, "SIGTERM"); err != nil {
 	logger.Warn().Err(err).
 		Str("container_id", container.ID).
+		Str("service_instance_id", r.ServiceInstanceID).
 		Msg("failed to send SIGTERM to MCP container")
 	return nil
 }

 logger.Info().
 	Str("container_id", container.ID).
+	Str("service_instance_id", r.ServiceInstanceID).
 	Msg("sent SIGTERM to MCP container for config reload")
```
```go
		Str("container_id", container.ID).
		Str("signal", signal).
		Msg("failed to signal MCP container")
	return nil
```
Are we treating this return as a success?
Dave fixed the root issue in MCP, so I was able to revert back to a simpler version of the code with no workarounds. However, sending the signals is still "best-effort" - if it fails, we log a message and move on.
Force-pushed 03bdcf1 to a820845
mmols left a comment:
Works from my end - I tested various config updates and updates to target_nodes.
Summary

- When `MCPConfigResource.Update()` rewrites `config.yaml`, the running MCP container now receives `SIGHUP` to trigger a hot config reload, so no container restart is needed
- Also applies to forced updates (`force_update=true`)
- Adds a `ContainerSignal(ctx, containerID, signal)` method to the Docker client wrapper, usable by any service type

Test plan

- Verified that `docker kill --signal SIGHUP` triggers a reload in the MCP container logs (`Received SIGHUP, reloading configuration...`)
- Verified that an `allow_writes` flip (RO → RW credential switch) works via SIGHUP: the MCP server closes old DB connections and reconnects with new credentials

PLAT-480