Commit 77411df

Jim8yNeo Bot authored and committed
Add telemetry documentation and monitoring configs
1 parent dcd2ab8 commit 77411df

4 files changed: +622 -0 lines changed


src/Plugins/Telemetry/README.md

Lines changed: 236 additions & 0 deletions
@@ -0,0 +1,236 @@
# Neo N3 Telemetry Plugin

A comprehensive telemetry and metrics collection plugin for Neo N3 full nodes, providing Prometheus-compatible metrics for monitoring node health, performance, and operational status.

## Features

- **Blockchain Metrics**: Block height, sync status, block processing time, transaction counts
- **Network Metrics**: Peer connections, message statistics, bandwidth usage
- **Mempool Metrics**: Transaction pool size, utilization, add/remove rates
- **System Metrics**: CPU usage, memory consumption, GC statistics, thread pool
- **Plugin Metrics**: Loaded plugins count and status
- **Prometheus Export**: Native Prometheus metrics endpoint

## Installation

1. Build the plugin:
   ```bash
   dotnet build src/Plugins/TelemetryPlugin/TelemetryPlugin.csproj
   ```

2. Copy the output to your Neo node's `Plugins/TelemetryPlugin/` directory:
   ```bash
   cp -r bin/Debug/net10.0/* /path/to/neo-node/Plugins/TelemetryPlugin/
   ```

3. Configure the plugin by editing `config.json` in the plugin directory.

## Configuration

Create or edit `config.json` in the `Plugins/TelemetryPlugin/` directory:

```json
{
  "PluginConfiguration": {
    "Enabled": true,
    "ExceptionPolicy": "StopPlugin",
    "PrometheusPort": 9100,
    "PrometheusHost": "localhost",
    "PrometheusPath": "/metrics",
    "HealthPort": 9100,
    "SystemMetricsIntervalMs": 5000,
    "CollectBlockchainMetrics": true,
    "CollectNetworkMetrics": true,
    "CollectMempoolMetrics": true,
    "CollectSystemMetrics": true,
    "NodeId": "my-neo-node",
    "NetworkName": "mainnet"
  }
}
```

### Configuration Options

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `Enabled` | bool | `true` | Enable/disable the telemetry plugin |
| `ExceptionPolicy` | string | `StopPlugin` | Exception handling policy (`StopNode`, `StopPlugin`, `Ignore`) |
| `PrometheusPort` | int | `9100` | Port for the Prometheus metrics endpoint |
| `PrometheusHost` | string | `localhost` | Host address for the metrics endpoint |
| `PrometheusPath` | string | `/metrics` | URL path for the metrics endpoint |
| `HealthPort` | int | `9100` | Port for `/health`, `/ready`, `/live` endpoints |
| `SystemMetricsIntervalMs` | int | `5000` | Interval for collecting system metrics (ms) |
| `CollectBlockchainMetrics` | bool | `true` | Enable blockchain metrics collection |
| `CollectNetworkMetrics` | bool | `true` | Enable network metrics collection |
| `CollectMempoolMetrics` | bool | `true` | Enable mempool metrics collection |
| `CollectSystemMetrics` | bool | `true` | Enable system resource metrics collection |
| `NodeId` | string | hostname | Unique identifier for this node |
| `NetworkName` | string | auto-detect | Network name label (mainnet, testnet, etc.) |

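The option names, types, and allowed `ExceptionPolicy` values listed above can be checked before the node starts. The following is a minimal Python sketch, not part of the plugin: the function name `validate_config` and the specific checks (port range, unknown keys) are assumptions made for illustration, and the plugin's own validation may differ.

```python
import json

# Option names and expected types follow the Configuration Options table.
# The checks themselves are illustrative assumptions, not the plugin's own validation.
EXPECTED_TYPES = {
    "Enabled": bool,
    "ExceptionPolicy": str,
    "PrometheusPort": int,
    "PrometheusHost": str,
    "PrometheusPath": str,
    "HealthPort": int,
    "SystemMetricsIntervalMs": int,
    "CollectBlockchainMetrics": bool,
    "CollectNetworkMetrics": bool,
    "CollectMempoolMetrics": bool,
    "CollectSystemMetrics": bool,
    "NodeId": str,
    "NetworkName": str,
}
EXCEPTION_POLICIES = {"StopNode", "StopPlugin", "Ignore"}


def validate_config(text):
    """Return a list of human-readable problems found in a config.json string."""
    settings = json.loads(text).get("PluginConfiguration", {})
    problems = []
    for name, value in settings.items():
        expected = EXPECTED_TYPES.get(name)
        if expected is None:
            problems.append(f"unknown option: {name}")
        # bool is a subclass of int in Python, so reject it explicitly for int options.
        elif not isinstance(value, expected) or (expected is int and isinstance(value, bool)):
            problems.append(f"{name}: expected {expected.__name__}")
    for port_option in ("PrometheusPort", "HealthPort"):
        port = settings.get(port_option)
        if isinstance(port, int) and not isinstance(port, bool) and not 1 <= port <= 65535:
            problems.append(f"{port_option}: {port} is outside 1-65535")
    policy = settings.get("ExceptionPolicy")
    if isinstance(policy, str) and policy not in EXCEPTION_POLICIES:
        problems.append(f"ExceptionPolicy: unsupported value {policy!r}")
    return problems
```

An empty list means the sketch found nothing to complain about; it says nothing about whether the node will accept the file.
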
## Metrics Reference

### Blockchain Metrics

| Metric | Type | Description |
|--------|------|-------------|
| `neo_blockchain_height` | Gauge | Current block height |
| `neo_blockchain_header_height` | Gauge | Current header height |
| `neo_blockchain_blocks_persisted_total` | Counter | Total blocks persisted |
| `neo_blockchain_block_persist_duration_milliseconds` | Histogram | Block persist duration |
| `neo_blockchain_block_transactions` | Gauge | Transactions in last block |
| `neo_blockchain_transactions_processed_total` | Counter | Total transactions processed |
| `neo_blockchain_sync_status` | Gauge | Sync status (1=synced, 0=syncing) |
| `neo_blockchain_blocks_behind` | Gauge | Blocks behind network |
| `neo_blockchain_time_since_last_block_seconds` | Gauge | Time since last block |

### Network Metrics

| Metric | Type | Description |
|--------|------|-------------|
| `neo_network_peers_connected` | Gauge | Connected peer count |
| `neo_network_peers_unconnected` | Gauge | Unconnected peer count |
| `neo_network_peer_connections_total` | Counter | Total peer connections |
| `neo_network_peer_disconnections_total` | Counter | Total peer disconnections |
| `neo_network_messages_received_total` | Counter | Messages received by type |
| `neo_network_messages_sent_total` | Counter | Messages sent by type |
| `neo_network_bytes_received_total` | Counter | Total bytes received |
| `neo_network_bytes_sent_total` | Counter | Total bytes sent |

### Mempool Metrics

| Metric | Type | Description |
|--------|------|-------------|
| `neo_mempool_transactions` | Gauge | Current mempool size |
| `neo_mempool_verified_transactions` | Gauge | Verified transaction count |
| `neo_mempool_unverified_transactions` | Gauge | Unverified transaction count |
| `neo_mempool_capacity` | Gauge | Mempool capacity |
| `neo_mempool_utilization_ratio` | Gauge | Mempool utilization (0-1) |
| `neo_mempool_transactions_added_total` | Counter | Total transactions added |
| `neo_mempool_transactions_removed_total` | Counter | Total transactions removed |

### System Metrics

| Metric | Type | Description |
|--------|------|-------------|
| `neo_system_cpu_usage_ratio` | Gauge | CPU usage ratio (0-1) |
| `neo_system_memory_usage_bytes` | Gauge | Memory usage by type |
| `neo_system_gc_collection_count` | Gauge | GC collections by generation |
| `neo_system_threadpool_worker_threads` | Gauge | Active worker threads |
| `neo_system_threadpool_completion_port_threads` | Gauge | Active completion port threads |
| `neo_system_process_uptime_seconds` | Gauge | Process uptime |

### Node Info Metrics

| Metric | Type | Description |
|--------|------|-------------|
| `neo_node_info` | Gauge | Node information (version labels) |
| `neo_node_start_time_seconds` | Gauge | Node start timestamp |
| `neo_plugins_loaded` | Gauge | Number of loaded plugins |
| `neo_plugin_status` | Gauge | Individual plugin status |

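To inspect these metrics outside Prometheus, the text exposition format served by the endpoint can be read line by line. The sketch below is a simplified Python parser written for illustration: `parse_metrics` is a hypothetical helper, the sample text and its label values are made up, and the regular expressions do not handle escaped quotes in label values or trailing timestamps.

```python
import re

# Matches `name{label="value",...} value`; the label block is optional.
SAMPLE_LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Parse exposition-format text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blank, HELP and TYPE lines
            continue
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue
        name, label_text, value = match.groups()
        labels = dict(LABEL.findall(label_text or ""))
        samples.append((name, labels, float(value)))
    return samples


# Hypothetical scrape output; label names and values are assumptions for the example.
SAMPLE = """\
# HELP neo_blockchain_height Current block height
# TYPE neo_blockchain_height gauge
neo_blockchain_height{node_id="my-neo-node",network="mainnet"} 1234567
neo_mempool_transactions 42
"""

for name, labels, value in parse_metrics(SAMPLE):
    print(name, labels, value)
```

For anything beyond a quick check, an established Prometheus client library is a safer choice than a hand-written parser.
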
## Prometheus Integration

### Scrape Configuration

Add to your `prometheus.yml`:

```yaml
scrape_configs:
  - job_name: 'neo-node'
    static_configs:
      - targets: ['localhost:9100']
    scrape_interval: 15s
```

### Example Queries

```promql
# Block height
neo_blockchain_height{node_id="my-neo-node"}

# Sync progress
1 - (neo_blockchain_blocks_behind / neo_blockchain_header_height)

# Mempool utilization
neo_mempool_utilization_ratio{node_id="my-neo-node"}

# Transaction rate (per minute; rate() returns a per-second average, so multiply by 60)
rate(neo_blockchain_transactions_processed_total[1m]) * 60

# Connected peers
neo_network_peers_connected{node_id="my-neo-node"}

# Memory usage
neo_system_memory_usage_bytes{type="working_set"}
```

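The arithmetic behind the sync-progress and transaction-rate queries can be reproduced by hand. The Python sketch below uses hypothetical helper names and made-up sample values. PromQL's `rate()` returns a per-second average over the window, so a per-minute figure needs a factor of 60; the real function also handles counter resets and extrapolation, which this sketch ignores.

```python
def sync_progress(blocks_behind, header_height):
    """Mirror `1 - (blocks_behind / header_height)`, guarding the zero-height case."""
    if header_height <= 0:
        return 0.0
    return 1 - blocks_behind / header_height


def per_minute_rate(first_count, last_count, elapsed_seconds):
    """Approximate `rate(counter[window]) * 60` from two counter samples."""
    per_second = (last_count - first_count) / elapsed_seconds
    return per_second * 60


# Made-up values: 50 blocks behind a header height of 1000.
print(round(sync_progress(50, 1000), 4))
# Made-up values: the counter grew from 12000 to 12090 over 45 seconds.
print(per_minute_rate(12000, 12090, 45))  # 120.0
```
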
## Grafana Dashboard

Import the provided Grafana dashboard JSON for a pre-configured monitoring view:

1. Open Grafana
2. Go to Dashboards → Import
3. Upload `grafana-dashboard.json` or paste the JSON content
4. Select your Prometheus data source
5. Click Import

## Alerting Examples

### Prometheus Alerting Rules

```yaml
groups:
  - name: neo-node-alerts
    rules:
      - alert: NeoNodeOutOfSync
        expr: neo_blockchain_blocks_behind > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Neo node is out of sync"
          description: "Node {{ $labels.node_id }} is {{ $value }} blocks behind"

      - alert: NeoNodeNoPeers
        expr: neo_network_peers_connected == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Neo node has no peers"
          description: "Node {{ $labels.node_id }} has no connected peers"

      - alert: NeoMempoolFull
        expr: neo_mempool_utilization_ratio > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Neo mempool is nearly full"
          description: "Mempool utilization is {{ $value | humanizePercentage }}"
```

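The `for:` clause in these rules means the expression must stay true for the whole duration before the alert fires; a single dip below the threshold resets the wait. The Python sketch below illustrates that idea only. `alert_fires` is a hypothetical helper, the samples are made up, and Prometheus itself evaluates rules on its own evaluation interval with pending and firing states that are not modelled here.

```python
def alert_fires(samples, threshold, for_seconds):
    """Decide whether `value > threshold` has held long enough to fire.

    samples: list of (timestamp_seconds, value) pairs sorted by time.
    Returns True if the condition has been continuously true for at least
    for_seconds, measured up to the last sample.
    """
    pending_since = None
    for timestamp, value in samples:
        if value > threshold:
            if pending_since is None:
                pending_since = timestamp  # condition just became true
        else:
            pending_since = None  # any false evaluation resets the wait
    if pending_since is None:
        return False
    return samples[-1][0] - pending_since >= for_seconds


# Made-up series for `neo_blockchain_blocks_behind`, one sample per minute.
steady = [(t, 12) for t in range(0, 301, 60)]
dipped = [(t, 3 if t == 120 else 12) for t in range(0, 301, 60)]

print(alert_fires(steady, threshold=10, for_seconds=300))
print(alert_fires(dipped, threshold=10, for_seconds=300))
```
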
## Troubleshooting

### Metrics endpoint not accessible

1. Check that the plugin is enabled in `config.json`
2. Verify that no other process is using the port: `netstat -tlnp | grep 9100`
3. Check that firewall rules allow the port
4. Review node logs for plugin startup errors

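A quick way to separate "nothing is listening" from "something is listening but Prometheus cannot reach it" is to attempt a TCP connection from the machine that runs Prometheus. The Python sketch below is illustrative: `port_is_open` is a hypothetical helper, and the host and port are the defaults from the configuration table.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or name resolution failed
        return False


if __name__ == "__main__":
    # Defaults taken from the configuration table; adjust to your config.json.
    print("metrics port reachable:", port_is_open("localhost", 9100))
```

A successful connection only shows that some process accepts connections on the port; it does not confirm that the process is the telemetry plugin.
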
### Missing metrics

1. Ensure the corresponding `Collect*Metrics` option is enabled
2. Check that the node has fully started
3. Verify that `SystemMetricsIntervalMs` is no longer than your Prometheus scrape interval, so that each scrape sees fresh system metrics

### High resource usage

1. Increase `SystemMetricsIntervalMs` to reduce collection frequency
2. Disable unnecessary metric categories
3. Consider using a dedicated metrics aggregation service

## License

This plugin is part of the Neo project and is distributed under the MIT license.
Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
# Neo N3 Node Prometheus Alerting Rules
groups:
  - name: neo-node-critical
    rules:
      - alert: NeoNodeOutOfSync
        expr: neo_blockchain_sync_status == 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Neo node is out of sync"
          description: "Node {{ $labels.node_id }} on {{ $labels.network }} is not synced"

      - alert: NeoNodeNoPeers
        expr: neo_network_peers_connected == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Neo node has no peers"
          description: "Node {{ $labels.node_id }} has no connected peers"

      - alert: NeoNodeFarBehind
        expr: neo_blockchain_blocks_behind > 100
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "Neo node is far behind"
          description: "Node {{ $labels.node_id }} is {{ $value }} blocks behind"

  - name: neo-node-performance
    rules:
      - alert: NeoNodeHighCPU
        expr: neo_system_cpu_usage_ratio > 0.9
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Neo node high CPU usage"
          description: "CPU usage is {{ $value | humanizePercentage }}"

      - alert: NeoMempoolNearlyFull
        expr: neo_mempool_utilization_ratio > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Neo mempool is nearly full"
          description: "Mempool utilization is {{ $value | humanizePercentage }}"

      - alert: NeoNodeLowPeers
        expr: neo_network_peers_connected < 3
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Neo node has low peer count"
          description: "Only {{ $value }} connected peers"
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
# Prometheus scrape configuration for Neo N3 nodes
# Add this to your prometheus.yml scrape_configs section

scrape_configs:
  - job_name: 'neo-node'
    static_configs:
      - targets: ['localhost:9100']
        labels:
          instance: 'neo-node-1'
    scrape_interval: 15s
    scrape_timeout: 10s
    metrics_path: /metrics

# For multiple nodes:
# scrape_configs:
#   - job_name: 'neo-nodes'
#     static_configs:
#       - targets:
#           - 'node1.example.com:9100'
#           - 'node2.example.com:9100'
#           - 'node3.example.com:9100'
