Skip to content

Commit ac0adbc

Browse files
committed
fix(openshift): address PR review comments and improve robustness
- Defer namespace detection to avoid preflight crashes - Fix support for OpenShift cluster and multiple cluster detection logic - Remove generated topology diagram from repository - Standardize Python script filenames according to Google style guide - Extract shared utilities to ovn_utils.py for better code reuse Signed-off-by: Matteo Dallaglio <[email protected]>
1 parent fc40ce3 commit ac0adbc

File tree

8 files changed

+493
-142
lines changed

8 files changed

+493
-142
lines changed

plugins/openshift/commands/visualize-ovn-topology.md

Lines changed: 6 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -19,7 +19,6 @@ The `openshift:visualize-ovn-topology` command generates a comprehensive Mermaid
1919

2020
- Logical switches and routers
2121
- Switch and router ports with MAC/IP addresses
22-
- Load balancers and their VIPs
2322
- Pod connectivity
2423
- External network connections
2524
- Per-node component placement (interconnect mode) or centralized components (default mode)
@@ -43,7 +42,7 @@ This command invokes the `generating-ovn-topology` skill which implements a data
4342
**Key Features:**
4443
- **Data-driven**: Never generates synthetic data - always queries real cluster
4544
- **Architecture-aware**: Correctly handles both interconnect and default deployment modes
46-
- **Complete topology**: Shows all logical switches, routers, ports, and load balancers
45+
- **Complete topology**: Shows all logical switches, routers, and ports
4746
- **Visual clarity**: Uses color-coded components and node subgraphs for organization
4847

4948
**Skill Reference:**
@@ -59,12 +58,12 @@ This command invokes the `generating-ovn-topology` skill which implements a data
5958
## Examples
6059

6160
1. **Basic usage** (generates topology for detected cluster):
62-
```
61+
```shell
6362
/openshift:visualize-ovn-topology
6463
```
6564

6665
Output:
67-
```
66+
```text
6867
✓ Successfully generated OVN-Kubernetes topology diagram
6968
7069
📄 Diagram saved to: ovn-topology-diagram.md
@@ -73,19 +72,19 @@ This command invokes the `generating-ovn-topology` skill which implements a data
7372
- 3 nodes (ovn-control-plane, ovn-worker, ovn-worker2)
7473
- 10 logical switches, 4 logical routers
7574
- 27 logical switch ports, 13 logical router ports
76-
- 9 running pods, 4 load balancers
75+
- 9 running pods
7776
- Mode: Interconnect (distributed control plane)
7877
7978
💡 Open the file in your IDE to view the full rendered Mermaid diagram!
8079
```
8180

8281
2. **With existing file** (prompts for action):
83-
```
82+
```shell
8483
/openshift:visualize-ovn-topology
8584
```
8685

8786
You'll be asked:
88-
```
87+
```text
8988
File ovn-topology-diagram.md already exists. Would you like to:
9089
(1) Overwrite it
9190
(2) Save to a different location

plugins/openshift/skills/generating-ovn-topology/README.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -112,7 +112,7 @@ This ensures **informed consent** - you're always aware when admin credentials a
112112

113113
### Generated Diagram
114114

115-
The command creates a Mermaid diagram in markdown format:
115+
The command creates a Mermaid diagram in Markdown format:
116116

117117
```markdown
118118
# OVN-Kubernetes Network Topology

plugins/openshift/skills/generating-ovn-topology/SKILL.md

Lines changed: 40 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -10,20 +10,41 @@ tools: [Bash, Read, Write]
1010

1111
1. **Detect Cluster**: Find the OVN-Kubernetes cluster kubeconfig
1212

13-
Run: `KC=$(scripts/detect-cluster.sh)`
13+
Run: `scripts/detect-cluster.sh 2>/dev/null`
1414

15-
The script intelligently discovers OVN-Kubernetes clusters:
15+
The script discovers OVN-Kubernetes clusters:
1616
- Scans all kubeconfig files: current KUBECONFIG env, ~/.kube/kind-config, ~/ovn.conf, ~/.kube/config
1717
- Tests ALL contexts in each kubeconfig (not just current-context)
18-
- If one cluster found: automatically selects it
19-
- If multiple clusters found: prompts user to choose
20-
- If selected context isn't current-context: automatically switches it
21-
- Returns kubeconfig path on stdout, diagnostics on stderr
22-
- Exit codes: 0=success, 1=no cluster found, 2=user cancelled
18+
- Returns parseable list to stdout: `index|kubeconfig|cluster_name|node_count|namespace`
19+
- Diagnostics go to stderr
20+
- Exit code: 0=success, 1=no clusters found
21+
22+
**How to handle the output:**
23+
24+
The script returns pipe-delimited lines to stdout, one per cluster found, e.g.:
25+
```
26+
1|/home/user/.kube/kind-config|kind-ovn|3|ovn-kubernetes
27+
2|/home/user/.kube/config|prod-cluster|12|openshift-ovn-kubernetes
28+
```
29+
30+
**Decision logic:**
31+
- If **one cluster** found → automatically use it (extract kubeconfig path from column 2)
32+
- If **multiple clusters** found → show the list to user and ask them to choose by number
33+
- After selection, extract the kubeconfig path from column 2 of the chosen line
34+
- Store the selected kubeconfig path in variable `KC` for use in subsequent steps
35+
36+
**Example output format parsing:**
37+
- Column 1: Index number (for user selection)
38+
- Column 2: Kubeconfig file path (this is what you need for `$KC`)
39+
- Column 3: Cluster display name
40+
- Column 4: Number of nodes
41+
- Column 5: OVN namespace name
42+
43+
**Important**: Parse the output using standard text processing. The exact implementation is up to you - use whatever approach works best (awk, Python, inline parsing, etc.).
2344

2445
2. **Check Permissions**: Verify user's Kubernetes access level and inform about write permissions
2546

26-
Run: `scripts/check-permissions.py "$KC"`
47+
Run: `scripts/check_permissions.py "$KC"`
2748

2849
The script returns:
2950
- **Exit 0**: Read-only access or user confirmed → proceed
@@ -43,7 +64,7 @@ tools: [Bash, Read, Write]
4364
5. If user says yes → continue, if no → stop
4465

4566
**Example of proper user communication:**
46-
```
67+
```text
4768
⚠️ WARNING: Write Permissions Detected
4869
4970
Your kubeconfig has cluster admin permissions:
@@ -72,7 +93,7 @@ tools: [Bash, Read, Write]
7293

7394
4. **Collect OVN Data**: Get full topology data from the cluster
7495

75-
Run: `scripts/collect-ovn-data.py "$KC"`
96+
Run: `scripts/collect_ovn_data.py "$KC"`
7697

7798
Detail files written to `$TMPDIR`:
7899
- `ovn_switches_detail.txt` - node|uuid|name|other_config
@@ -83,7 +104,7 @@ tools: [Bash, Read, Write]
83104

84105
5. **Analyze Placement**: Determine per-node vs cluster-wide components
85106

86-
Run: `scripts/analyze-placement.py "$TMPDIR"`
107+
Run: `scripts/analyze_placement.py "$TMPDIR"`
87108

88109
Placement results written to `$TMPDIR`:
89110
- `ovn_switch_placement.txt` - name|placement (per-node|cluster-wide|cluster-wide-visual)
@@ -150,14 +171,14 @@ In **interconnect mode**, each node runs its own NBDB with local copies of compo
150171

151172
## Helper Scripts
152173

153-
All helper scripts are in the `scripts/` directory.
174+
All helper scripts are in the `scripts/` directory.
154175

155176
| Script | Purpose | Input | Output |
156177
|--------|---------|-------|--------|
157-
| [detect-cluster.sh](scripts/detect-cluster.sh) | Find OVN cluster kubeconfig across all contexts. Scans multiple kubeconfig files and all their contexts. Prompts user if multiple clusters found. | None | Kubeconfig path to stdout, diagnostics to stderr. Exit: 0=success, 1=none found, 2=cancelled |
158-
| [check-permissions.py](scripts/check-permissions.py) | Check user permissions and warn if write access detected. | KUBECONFIG path | Exit: 0=proceed, 1=cancelled/error, 2=write perms (needs user confirmation) |
159-
| [collect-ovn-data.py](scripts/collect-ovn-data.py) | **Data collector**: Queries each node for all data, with **graceful degradation** (continues on node failures). Writes detail files. | KUBECONFIG path | Detail files: `ovn_switches_detail.txt`, `ovn_routers_detail.txt`, `ovn_lsps_detail.txt`, `ovn_lrps_detail.txt`, `ovn_pods_detail.txt` |
160-
| [analyze-placement.py](scripts/analyze-placement.py) | **Placement analyzer**: Analyzes UUID patterns from detail files to determine per-node vs cluster-wide placement. | TMPDIR (reads detail files) | Placement files: `ovn_switch_placement.txt`, `ovn_router_placement.txt` |
178+
| [detect-cluster.sh](scripts/detect-cluster.sh) | Find OVN cluster kubeconfig across all contexts. Scans multiple kubeconfig files and all their contexts. Returns parseable list. | None | Parseable list to stdout: `index|kubeconfig|cluster|nodes|namespace`. Exit: 0=success, 1=none found |
179+
| [check_permissions.py](scripts/check_permissions.py) | Check user permissions and warn if write access detected. | KUBECONFIG path | Exit: 0=proceed, 1=cancelled/error, 2=write perms (needs user confirmation) |
180+
| [collect_ovn_data.py](scripts/collect_ovn_data.py) | **Data collector**: Queries each node for all data, with **graceful degradation** (continues on node failures). Writes detail files. | KUBECONFIG path | Detail files: `ovn_switches_detail.txt`, `ovn_routers_detail.txt`, `ovn_lsps_detail.txt`, `ovn_lrps_detail.txt`, `ovn_pods_detail.txt` |
181+
| [analyze_placement.py](scripts/analyze_placement.py) | **Placement analyzer**: Analyzes UUID patterns from detail files to determine per-node vs cluster-wide placement. | TMPDIR (reads detail files) | Placement files: `ovn_switch_placement.txt`, `ovn_router_placement.txt` |
161182

162183
---
163184

@@ -301,7 +322,7 @@ graph BT
301322

302323
**CRITICAL: Discover pods from `$TMPDIR/ovn_pods_detail.txt`, NOT from LSPs**
303324

304-
The file `$TMPDIR/ovn_pods_detail.txt` contains ALL running pods in the cluster (populated by collect-ovn-data.py).
325+
The file `$TMPDIR/ovn_pods_detail.txt` contains ALL running pods in the cluster (populated by collect_ovn_data.py).
305326
Format: `namespace|pod_name|pod_ip|node_name`
306327

307328
**Pod-to-LSP Mapping Logic:**
@@ -332,7 +353,8 @@ Format: `namespace|pod_name|pod_ip|node_name`
332353
- `breth0_{node}`: LocalNet port (type="localnet")
333354
- `jtor-*`, `etor-*`, `tstor-*`: Router ports (type="router")
334355

335-
**Example: Control Plane Node with Host-Network Pods**
356+
### Example: Control Plane Node with Host-Network Pods
357+
336358
```mermaid
337359
%% 8 host-network pods sharing management port
338360
POD_etcd["Pod: etcd-ovn-control-plane<br/>Namespace: kube-system<br/>IP: 10.89.0.10"]

plugins/openshift/skills/generating-ovn-topology/scripts/analyze-placement.py renamed to plugins/openshift/skills/generating-ovn-topology/scripts/analyze_placement.py

Lines changed: 6 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,8 +1,8 @@
11
#!/usr/bin/env python3
22
"""
3-
analyze-placement.py - Analyze OVN component placement from collected data
3+
analyze_placement.py - Analyze OVN component placement from collected data
44
5-
Usage: analyze-placement.py [TMPDIR]
5+
Usage: analyze_placement.py [TMPDIR]
66
77
This script analyzes UUID patterns from previously collected OVN data to determine
88
component placement (per-node vs cluster-wide).
@@ -18,6 +18,8 @@
1818
Exit codes:
1919
0 - Success
2020
1 - Missing input files or analysis failed
21+
22+
Requirements: Python 3.6+
2123
"""
2224

2325
import os
@@ -71,7 +73,7 @@ def verify_input_files(self) -> bool:
7173
file=sys.stderr,
7274
)
7375
print(
74-
" Run collect-ovn-data.py first to gather data.",
76+
" Run collect_ovn_data.py first to gather data.",
7577
file=sys.stderr,
7678
)
7779
return False
@@ -82,7 +84,7 @@ def verify_input_files(self) -> bool:
8284
file=sys.stderr,
8385
)
8486
print(
85-
" Run collect-ovn-data.py first to gather data.",
87+
" Run collect_ovn_data.py first to gather data.",
8688
file=sys.stderr,
8789
)
8890
return False

plugins/openshift/skills/generating-ovn-topology/scripts/check-permissions.py renamed to plugins/openshift/skills/generating-ovn-topology/scripts/check_permissions.py

Lines changed: 28 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -1,24 +1,30 @@
11
#!/usr/bin/env python3
22
"""
3-
check-permissions.py - Check user permissions and warn if write access detected
3+
check_permissions.py - Check user permissions and warn if write access detected
44
5-
Usage: ./check-permissions.py KUBECONFIG
5+
Usage: ./check_permissions.py KUBECONFIG
66
Returns: 0 if user confirms to proceed, 1 if user cancels or error, 2 if write perms detected
77
8+
Note: This script must be run from the scripts/ directory or have the scripts/
9+
directory in PYTHONPATH for the ovn_utils import to work.
10+
11+
Requirements: Python 3.6+, kubectl in PATH
812
"""
913

1014
import subprocess
1115
import sys
1216
import os
1317
from typing import List, Optional
1418

19+
# Import shared utilities (must be in same directory or in PYTHONPATH)
20+
from ovn_utils import detect_ovn_namespace
21+
1522

1623
class PermissionChecker:
1724
"""Check Kubernetes RBAC permissions for the OVN topology skill."""
1825

1926
# Configuration constants
2027
PERMISSION_CHECK_TIMEOUT = 5 # seconds
21-
OVN_NAMESPACE = "ovn-kubernetes"
2228

2329
# Dangerous write permissions to check
2430
DANGEROUS_PERMS = [
@@ -38,10 +44,14 @@ def __init__(self, kubeconfig: str):
3844
self.kubeconfig = kubeconfig
3945
self.write_perms_found = False
4046
self.write_perms_list: List[str] = []
47+
self.ovn_namespace: Optional[str] = None
48+
4149

4250
def check_kubectl_available(self) -> bool:
4351
"""Check if kubectl is available in PATH."""
4452
try:
53+
env = os.environ.copy()
54+
env["KUBECONFIG"] = self.kubeconfig
4555
subprocess.run(
4656
[
4757
"kubectl",
@@ -53,7 +63,7 @@ def check_kubectl_available(self) -> bool:
5363
],
5464
capture_output=True,
5565
check=True,
56-
env={"KUBECONFIG": self.kubeconfig},
66+
env=env,
5767
)
5868
return True
5969
except FileNotFoundError:
@@ -95,11 +105,13 @@ def check_permission(
95105
cmd.extend(["-n", namespace])
96106

97107
try:
108+
env = os.environ.copy()
109+
env["KUBECONFIG"] = self.kubeconfig
98110
result = subprocess.run(
99111
cmd,
100112
capture_output=True,
101113
timeout=self.PERMISSION_CHECK_TIMEOUT,
102-
env={"KUBECONFIG": self.kubeconfig}
114+
env=env
103115
)
104116
return result.returncode == 0
105117
except subprocess.TimeoutExpired:
@@ -120,11 +132,11 @@ def check_permission(
120132
def check_all_permissions(self) -> None:
121133
"""Check all dangerous write permissions."""
122134
print("🔐 Checking your Kubernetes permissions...\n", file=sys.stderr)
123-
print(f"Checking permissions in '{self.OVN_NAMESPACE}' namespace...\n", file=sys.stderr)
135+
print(f"Checking permissions in '{self.ovn_namespace}' namespace...\n", file=sys.stderr)
124136

125137
# Check dangerous write permissions
126138
for verb, resource, description in self.DANGEROUS_PERMS:
127-
if self.check_permission(resource, verb, self.OVN_NAMESPACE):
139+
if self.check_permission(resource, verb, self.ovn_namespace):
128140
self.write_perms_found = True
129141
self.write_perms_list.append(f" ⚠️ {description} ({verb} {resource})")
130142

@@ -138,11 +150,11 @@ def check_all_permissions(self) -> None:
138150

139151
# Check namespace admin (only add if not already cluster admin)
140152
if not any("CLUSTER ADMIN" in perm for perm in self.write_perms_list):
141-
if self.check_permission("*", "*", self.OVN_NAMESPACE):
153+
if self.check_permission("*", "*", self.ovn_namespace):
142154
self.write_perms_found = True
143155
self.write_perms_list.append(
144156
f" ⚠️ NAMESPACE ADMIN - Full access to "
145-
f"{self.OVN_NAMESPACE} namespace"
157+
f"{self.ovn_namespace} namespace"
146158
)
147159

148160
def display_warning(self) -> None:
@@ -188,6 +200,13 @@ def run(self) -> int:
188200
if not self.check_kubectl_available():
189201
return 1
190202

203+
# Detect OVN namespace after kubectl availability check
204+
try:
205+
self.ovn_namespace = detect_ovn_namespace(self.kubeconfig)
206+
except Exception as exc:
207+
print(f"❌ Error detecting OVN namespace: {exc}", file=sys.stderr)
208+
return 1
209+
191210
self.check_all_permissions()
192211
return self.handle_confirmation()
193212

0 commit comments

Comments
 (0)