fix(monitoring): track real MetricsPort in state, expose monitoring via inspect #199
Merged
kuny0707 merged 1 commit on Jun 18, 2026
- Store MetricsPort in ManagedNode state alongside other ports so network add's Prometheus scrape-target rebuild uses the node's actual metrics port instead of hardcoding 9527. This broke when the user overrode ports.metrics in the intent: reloadNetworkMonitoring would write a wrong target address and Prometheus would scrape nothing.
- Add metricsPort() helper (state-backed, falls back to 9527 for legacy nodes) and wire it into reloadNetworkMonitoring.
- Surface monitoring stack endpoints via `trond inspect`: when Monitoring is enabled, output gains an optional `monitoring` block with enabled + prometheus_port + grafana_port. Agents can discover the stack without re-parsing apply output. Text-mode inspect also prints the ports.
- Update inspect.schema.json (both embedded + public copies) with the monitoring object (additive optional → PATCH).
- Bump SchemaVersion 1.12.0 → 1.12.2 and re-snapshot baseline.
83ea045 to 831c12d
barbatos2011 pushed a commit to barbatos2011/tron-deployment that referenced this pull request on Jun 18, 2026
…issed by the first pass

A doc-consistency audit found the monitoring stack was undocumented for agents: tronprotocol#199 (inspect `monitoring` block + `metrics_port` tracking) merged after this PR opened, and AGENTS.md had zero monitoring coverage at all.

- AGENTS.md: add a `monitoring` bullet to the machine-observable rig-state list (enabled / prometheus_port / grafana_port via status/inspect).
- CHANGELOG: add the tronprotocol#199 entry under the agent-integration arc.
- README: note that status/inspect surface the stack's ports.

Audit also confirmed (no change needed): SchemaVersion is consistent (const = baseline = 1.12.2), and every -o json command has a schema (TestSchemaCoverage green).
What does this PR do?
Two monitoring follow-ups identified during review of PR #185:
- Fix hardcoded metrics port in `network add`. `reloadNetworkMonitoring` was hardcoding `:9527` as the scrape target when rebuilding `prometheus.yml` for a newly added node. If the user overrode `ports.metrics` in the intent, the reloaded config would point at the wrong port and Prometheus would never scrape the new node, leaving dashboards silently empty. This PR stores `MetricsPort` in `ManagedNode` state (same pattern as `HTTPPort`/`GRPCPort`/`P2PPort`) and adds a `metricsPort()` helper that falls back to 9527 for legacy state entries.
- Expose monitoring stack via `trond inspect`. When a node was deployed with `--monitor`, `inspect` output now includes an optional `monitoring` block (`enabled`, `prometheus_port`, `grafana_port`) so agents can discover the Prometheus/Grafana stack without re-parsing the apply result. Host is implicit (127.0.0.1 for local targets, target.host for SSH), so only ports are needed.
- Schema bump. `inspect.schema.json` gains the `monitoring` object (additive optional field). SchemaVersion bumped 1.12.1 → 1.12.2 (PATCH), following the same convention used in "feat(status): emit genesis_block_id (chain identity fingerprint)" #197, which makes an identical class of change.
- txgen Falcon resource leak fix (included on this branch, same module boundary).
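The state-backed fallback described in the first item can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the trimmed-down `ManagedNode` struct, the `Host` field, the `scrapeTarget` function, and the use of a zero value to mark legacy entries are all assumptions made for the example. Only `MetricsPort`, `metricsPort()`, and the 9527 default come from the description above.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// defaultMetricsPort is the legacy port named in the PR description.
const defaultMetricsPort = 9527

// ManagedNode is a hypothetical, trimmed-down stand-in for the real state
// type. Host is an assumed field used only to build the target address.
type ManagedNode struct {
	Host        string
	MetricsPort int // assumed to be zero for entries written before the field existed
}

// metricsPort returns the port recorded in state, falling back to the
// legacy default when the state entry predates the MetricsPort field.
func (n ManagedNode) metricsPort() int {
	if n.MetricsPort > 0 {
		return n.MetricsPort
	}
	return defaultMetricsPort
}

// scrapeTarget (hypothetical name) builds the host:port string that a
// Prometheus scrape config would list for this node.
func scrapeTarget(n ManagedNode) string {
	return net.JoinHostPort(n.Host, strconv.Itoa(n.metricsPort()))
}

func main() {
	legacy := ManagedNode{Host: "10.0.0.5"}
	custom := ManagedNode{Host: "10.0.0.5", MetricsPort: 9600}
	fmt.Println(scrapeTarget(legacy)) // falls back to the default port
	fmt.Println(scrapeTarget(custom)) // uses the overridden port
}
```

Reading the port through a helper rather than the raw field keeps older state files working without a migration step, which appears to be the motivation for the fallback.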
Test plan
- `go build ./...` passes
- `go test ./...` passes (baseline re-snapshot at 1.12.2)

Extra details
Follow-up to PR #185 review feedback: "have `trond inspect` surface the Prometheus/Grafana endpoints so agents can discover them without parsing the apply result." The `MetricsPort` fix was a bug discovered while implementing this: `reloadNetworkMonitoring` was hardcoding 9527, which silently broke when a user customised `ports.metrics` in their intent.
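For agents consuming the new output, the optional `monitoring` block might be modelled along these lines. This is a sketch under stated assumptions: the Go type names, the `name` field, the `render` and `endpoint` helpers, and the sample ports 9090 and 3000 are illustrative and not taken from the PR. Only the keys `monitoring`, `enabled`, `prometheus_port`, and `grafana_port`, and the 127.0.0.1 rule for local targets, come from the description.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
	"strconv"
)

// MonitoringInfo carries the three keys named in the PR description.
// The Go type name itself is hypothetical.
type MonitoringInfo struct {
	Enabled        bool `json:"enabled"`
	PrometheusPort int  `json:"prometheus_port"`
	GrafanaPort    int  `json:"grafana_port"`
}

// InspectOutput is a hypothetical, trimmed-down inspect payload. The
// pointer plus omitempty keeps the block out of the JSON entirely when
// monitoring was not enabled, which is what makes the field additive.
type InspectOutput struct {
	Name       string          `json:"name"`
	Monitoring *MonitoringInfo `json:"monitoring,omitempty"`
}

// render (hypothetical helper) serialises the payload as compact JSON.
func render(out InspectOutput) (string, error) {
	b, err := json.Marshal(out)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// endpoint (hypothetical helper) resolves the implicit host: an empty
// sshHost is treated as a local target and mapped to 127.0.0.1.
func endpoint(sshHost string, port int) string {
	if sshHost == "" {
		sshHost = "127.0.0.1"
	}
	return net.JoinHostPort(sshHost, strconv.Itoa(port))
}

func main() {
	// Sample ports only; the real values come from the deployed stack.
	m := &MonitoringInfo{Enabled: true, PrometheusPort: 9090, GrafanaPort: 3000}
	s, err := render(InspectOutput{Name: "node-1", Monitoring: m})
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Println(s)
	fmt.Println(endpoint("", m.PrometheusPort))
}
```

Because the block is omitted rather than emitted empty, consumers written against the previous schema see no change, which is consistent with treating the bump as a PATCH.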