must-gather: Add IPsec tunnel analyzer #157

pperiyasamy · 2025-11-14T13:48:18Z

Introduces /must-gather:ipsec command to analyze IPsec configuration and tunnel establishment status in OpenShift clusters. The analyzer:

Checks IPsec enablement in cluster network configuration
Validates ovn-ipsec-host daemonset pod health
Verifies tunnel establishment by parsing connection configs and
status logs for ESTABLISHED_CHILD_SA markers
Supports node-specific filtering for targeted troubleshooting

The analysis script examines network_logs/ipsec/ data to report on all IPsec connections across the cluster mesh.
Assisted-by: Claude [email protected]

Example output 1:

> /must-gather:ipsec is running… /mnt/sdn-issues/ipsec-upgrade/4.21/quay-proxy-ci-openshift-org-openshift-ci-sha256-271512c45b445012535d9370b718c0db2143ce119d3042f0f02fcb1c652c1365 

● I'll analyze the IPsec configuration and tunnel status from the must-gather data.

● Bash(python3 plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_ipsec.py "/mnt/sdn-issues/ipsec-upgrade/4.21/quay-proxy-ci-openshift-org-openshift-ci-sh…)
  ⎿  status log directory                                                                                                                                                                                  
     ovn-ipsec-host-42r4s
     /mnt/sdn-issues/ipsec-upgrade/4.21/quay-proxy-ci-openshift-org-openshift-ci-sha256-271512c45b445012535d9370b718c0db2143ce119d3042f0f02fcb1c652c1365/network_logs/ipsec/status/ovn-ipsec-host-42r4s.log
     … +157 lines (ctrl+o to expand)

● IPsec Analysis Results

  The IPsec analysis shows a healthy cluster configuration:

  Key Findings

  Configuration Status:
  - IPsec is enabled in Full mode
  - All 6 ovn-ipsec-host pods are running with 2/2 containers ready
  - Zero restarts across all pods

  Tunnel Health:
  - 60/60 connections established (100% success rate)
  - Each of the 6 nodes has 10 bidirectional IPsec connections (5 inbound + 5 outbound)
  - All connections show ESTABLISHED_CHILD_SA status, indicating active encrypted tunnels

  Node Coverage:
  The IPsec daemonset is running on all 6 nodes:
  - ip-10-0-90-216.us-west-1.compute.internal
  - ip-10-0-39-115.us-west-1.compute.internal
  - ip-10-0-114-30.us-west-1.compute.internal
  - ip-10-0-48-5.us-west-1.compute.internal
  - ip-10-0-117-247.us-west-1.compute.internal
  - ip-10-0-78-134.us-west-1.compute.internal

  Conclusion

  No issues detected. The IPsec encryption is fully operational across the entire cluster with all node-to-node tunnels successfully established.

Example output 2:

> /must-gather:ipsec is running… /mnt/sdn-issues/egressip-stale-snat-lrp/must-gather.local.1056168555590501156/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-4e4aab0ca41f3711e720fff2d053ebe4e16b7698a8cc171f438e914c844055b9 

● I'll analyze the IPsec configuration and tunnel status from this must-gather data.

● Bash(python3 plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_ipsec.py "/mnt/sdn-issues/egressip-stale-snat-lrp/must-gather.local.1056168555590501156/…)
  ⎿  ====================================================================================================
     IPSEC CONFIGURATION
     ====================================================================================================
     … +21 lines (ctrl+o to expand)

● IPsec Analysis Results

  The analysis shows that IPsec is not enabled on this cluster.

  Key Findings

  Configuration Status:
  - IPsec is DISABLED in the cluster network configuration
  - No ovn-ipsec-host daemonset pods are present (expected when IPsec is disabled)
  - No IPsec tunnel data collected in the must-gather

  Conclusion

  This cluster is not using IPsec encryption for overlay network traffic. This is a valid configuration - IPsec is an optional security feature in OpenShift that provides encryption for pod-to-pod traffic across nodes.

  If you were expecting IPsec to be enabled, you would need to configure it in the cluster's Network operator configuration by adding ipsecConfig under the ovnKubernetesConfig section.

Example output 3:

> /must-gather:ipsec is running… /mnt/sdn-issues/ipsec-failure-lane/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-6a74019b7c3262e3ea77a084c42a21b71abab96c29478cbc2b9bcf8072638d6a 

● I'll analyze the IPsec configuration and tunnel status from this must-gather data.

● Bash(python3 plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_ipsec.py "/mnt/sdn-issues/ipsec-failure-lane/quay-io-openshift-release-dev-ocp-v4-0-art-…)
  ⎿  status log directory                                                                                                                                                                                        
     ovn-ipsec-host-5w5bf
     /mnt/sdn-issues/ipsec-failure-lane/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-6a74019b7c3262e3ea77a084c42a21b71abab96c29478cbc2b9bcf8072638d6a/network_logs/ipsec/status/ovn-ipsec-host-5w5bf.log
     … +158 lines (ctrl+o to expand)

● IPsec Analysis Results

  The analysis reveals one failed IPsec tunnel in an otherwise healthy configuration.

  Key Findings

  Configuration Status:
  - IPsec is enabled in Full mode
  - All 6 ovn-ipsec-host pods are running with 2/2 containers ready
  - Zero restarts across all pods

  Tunnel Health:
  - 59/60 connections established (98.3% success rate)
  - 1 connection failed to establish

  Failed Connection Details

  Pod: ovn-ipsec-host-5w5bf (Node: ip-10-0-127-96.us-east-2.compute.internal)

  Failed Connection: ovn-0f62a7-0-in-1 (inbound direction)
  - The corresponding outbound connection ovn-0f62a7-0-out-1 is established
  - This suggests a unidirectional tunnel failure

  Analysis

  This is an asymmetric tunnel failure where:
  - Outbound traffic from node ip-10-0-127-96 to the remote peer (identified by 0f62a7) works
  - Inbound traffic from that remote peer back to ip-10-0-127-96 is not encrypted

  The remote peer appears to be node ip-10-0-65-62 (pod ovn-ipsec-host-cfqt5), which shows both directions of connection ovn-0f62a7 as established on its side. This indicates a mismatch in tunnel state between the two nodes.

  Recommendations

  1. Check for timing issues: The tunnel may have failed to establish during initial cluster setup or after a network disruption
  2. Review pod logs: Examine logs from ovn-ipsec-host-5w5bf for IKE negotiation failures
  3. Check certificate validity: Verify IPsec certificates haven't expired or become invalid
  4. Consider tunnel restart: Restarting the affected pod may re-establish the tunnel

Summary by CodeRabbit

New Features

Added IPsec analysis functionality to the Must-Gather Analyzer supporting configuration evaluation, pod status monitoring, and tunnel status verification with optional node filtering and summary metrics.

Documentation

Added comprehensive documentation for IPsec analysis commands, including usage examples, expected outputs, and integration with existing must-gather analysis workflow.

Assisted-by: Claude <[email protected]> Signed-off-by: Periyasamy Palanisamy <[email protected]>

openshift-ci · 2025-11-14T13:48:24Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: pperiyasamy
Once this PR has been reviewed and has the lgtm label, please assign brandisher for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai · 2025-11-14T13:48:37Z

Walkthrough

The changes introduce IPsec analysis capabilities to the Must-Gather Analyzer plugin. A new Python analysis script processes must-gather data to evaluate IPsec configuration and tunnel status. Documentation is added across multiple markdown files describing the IPsec analysis command, usage examples, and integration with existing analysis capabilities.

Changes

Cohort / File(s)	Summary
Documentation updates `plugins/must-gather/README.md`, `plugins/must-gather/commands/analyze.md`	Augment existing documentation to include IPsec analysis; document keyword routing, execution order within full analysis workflow, and CLI examples for invoking IPsec analysis.
New IPsec command documentation `plugins/must-gather/commands/ipsec.md`	New file describing the must-gather ipsec command, including synopsis, analysis scope, node filtering, implementation workflow, examples, and error handling guidance.
New IPsec analysis script `plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_ipsec.py`	New Python script implementing IPsec analysis; parses YAML network configuration for enablement/mode, enumerates ovn-ipsec-host pods, parses IPsec connections, evaluates tunnel status, and aggregates results into a comprehensive report with optional node filtering.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Extra attention on analyze_ipsec.py:
- Logic for parsing YAML and openshift.conf files with error handling and missing-file gracefully degradation
- Tunnel status aggregation logic across multiple pod entries and connection definitions
- Node filtering implementation and its integration across multiple analysis functions
- Correctness of pod enumeration and status evaluation from daemonset data

Pre-merge checks and finishing touches

✅ Passed checks (7 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'must-gather: Add IPsec tunnel analyzer' clearly and specifically summarizes the main change—adding IPsec tunnel analysis capability to the must-gather tool.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
No Real People Names In Style References	✅ Passed	No real person names appear in style references, plugin commands, skill documentation, example prompts, or instructions in the added files.
No Assumed Git Remote Names	✅ Passed	The PR does not contain any hardcoded git remote names like 'origin' or 'upstream' in the modified files.
Git Push Safety Rules	✅ Passed	PR contains only documentation and Python analysis script with no git push commands, force operations, or CI/CD automation violations detected.
No Untrusted Mcp Servers	✅ Passed	Pull request introduces only IPsec analysis Python script and documentation with no MCP server installations from any source.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (3)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_ipsec.py (1)
36-43: Narrow overly broad Exception handlers in parsing helpers

parse_yaml_file, parse_ipsec_connections, and check_connection_status all catch bare Exception, which can mask programming errors and makes troubleshooting harder. The failure modes here are primarily file I/O and parsing issues.

Consider tightening these to the expected exception types instead of a blanket Exception, e.g.:
-import yaml
+import yaml
+from yaml import YAMLError
@@
-    try:
-        with open(file_path, 'r') as f:
-            return yaml.safe_load(f)
-    except Exception as e:
+    try:
+        with open(file_path, "r") as f:
+            return yaml.safe_load(f)
+    except (OSError, YAMLError) as e:
         print(f"Warning: Failed to parse {file_path}: {e}", file=sys.stderr)
         return None
@@
-    try:
-        with open(openshift_conf_path, 'r', errors='ignore') as f:
+    try:
+        with open(openshift_conf_path, "r", errors="ignore") as f:
@@
-    except Exception as e:
+    except OSError as e:
         print(f"Warning: Failed to parse {openshift_conf_path}: {e}", file=sys.stderr)
@@
-    try:
-        with open(status_log_path, 'r', errors='ignore') as f:
+    try:
+        with open(status_log_path, "r", errors="ignore") as f:
@@
-    except Exception as e:
+    except OSError as e:
         print(f"Warning: Failed to read {status_log_path}: {e}", file=sys.stderr)
This keeps the robustness but avoids swallowing unexpected bugs.

Also applies to: 166-183, 187-224
plugins/must-gather/commands/ipsec.md (1)
10-12: Add languages to fenced code blocks for better tooling support

Several fenced code blocks here omit a language identifier (markdownlint MD040). Adding explicit languages improves rendering and editor support, e.g.:
-```
+/must-gather:ipsec [must-gather-path] [node-name]
-```
+```text
@@
-```
+```text
 network_logs/
 └── ipsec/
@@
-```
+```
@@
-```
+```bash
 /must-gather:ipsec ./must-gather.local.123456789
@@
-```
+```bash
 /must-gather:ipsec ./must-gather/... worker-1
@@
-```
+```text
 No IPsec tunnel data found in network_logs/ipsec/
@@
-```
+```text
 ⚠ Error: openshift.conf not found
Adjust languages (bash, text, etc.) as appropriate for each block. This keeps markdownlint happy and helps downstream consumers.

Also applies to: 30-37, 87-88, 133-172, 176-199
plugins/must-gather/commands/analyze.md (1)
155-207: Specify a language for the Output Structure code block

markdownlint flags the example output block (including IPSEC STATUS:) for missing a language identifier (MD040). Adding something like text improves tooling support without changing semantics:
-```
+```text
 ================================================================================
 MUST-GATHER ANALYSIS SUMMARY
@@
 Logs to Review:
 - [Specific log files to examine based on findings]
-```
+```
This is purely a formatting/tooling improvement; behavior is unaffected.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between aeea277 and 372b13b.

📒 Files selected for processing (4)

plugins/must-gather/README.md (2 hunks)
plugins/must-gather/commands/analyze.md (6 hunks)
plugins/must-gather/commands/ipsec.md (1 hunks)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_ipsec.py (1 hunks)

🧰 Additional context used

🪛 markdownlint-cli2 (0.18.1)

plugins/must-gather/commands/ipsec.md

10-10: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

30-30: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

42-42: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

87-87: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

133-133: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

139-139: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

145-145: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

151-151: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

157-157: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

163-163: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

169-169: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

177-177: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

184-184: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

190-190: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

196-196: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

plugins/must-gather/commands/analyze.md

177-177: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🪛 Ruff (0.14.4)

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_ipsec.py

41-41: Do not catch blind exception: Exception

(BLE001)

181-181: Do not catch blind exception: Exception

(BLE001)

219-219: Do not catch blind exception: Exception

(BLE001)

🔇 Additional comments (3)

plugins/must-gather/README.md (2)

254-303: IPsec analysis script documentation aligns with implementation

The analyze_ipsec.py section correctly describes the script’s behavior and CLI:

Usage matches main() (./analyze_ipsec.py <must-gather-path> [node-name]).

Reported outputs (configuration status, ovn-ipsec-host pods, tunnel counts and statuses) correspond to print_ipsec_analysis().

This block looks consistent with the Python implementation.

350-377: /must-gather:ipsec command description is consistent with the analyzer

The new slash-command section correctly mirrors what analyze_ipsec.py does:

Arguments and examples line up with the must_gather_path + optional node filter.

“What it provides” matches the script’s sections (config, daemonset pods, tunnel status, per-connection details).

No functional discrepancies spotted here.

plugins/must-gather/commands/analyze.md (1)

25-25: IPsec integration in must-gather:analyze matches the new analyzer

The updates here integrate IPsec cleanly into the existing analysis flow:

Bullet list correctly mentions “IPsec configuration and tunnel status”.

analyze_ipsec.py is listed alongside other scripts in the expected path.

Keyword routing (“ipsec”, “ipsec tunnels”, etc.) maps to analyze_ipsec.py only, which matches the intended behavior.

Full-analysis execution order now includes IPsec between network and events, aligned with the PR description.

The “IPSEC STATUS” section and example usage (analyze ipsec tunnels) match the Python script’s output and CLI.

This doc block is consistent with the implementation.

Also applies to: 59-64, 109-116, 119-130, 177-179, 235-240

coderabbitai · 2025-11-14T13:54:37Z

plugins/must-gather/commands/ipsec.md

+The must-gather should contain:
+```
+network_logs/
+└── ipsec/
+    └── <pod>_ipsec.d/
+        ├── openshift.conf        # IPsec connection configurations
+        └── status/
+            └── <pod>.log         # Connection status logs
+```


⚠️ Potential issue | 🟠 Major

Align documented IPsec log directory layout with implementation

This doc repeatedly states that status logs live under each pod’s directory, e.g.:

network_logs/ipsec/<pod>_ipsec.d/status/<pod>.log (Prerequisites, Implementation step 4, Notes).

In the Python implementation (analyze_ipsec_tunnels() in analyze_ipsec.py), the status directory is currently resolved as network_logs/ipsec/status (sibling of <pod>_ipsec.d) rather than inside <pod>_ipsec.d, and then status/<pod>.log is read from there.

Right now, docs and code disagree on where <pod>.log is expected. Whichever layout is actually produced by must-gather should be the single source of truth; both the documentation here and the script should be updated to match it. The script-side fix can also support both layouts via a small fallback (see comment on analyze_ipsec_tunnels in analyze_ipsec.py).

I’d recommend:

Confirming the real must-gather directory structure for IPsec logs.

Updating both this file and analyze_ipsec.py so the described path and the path used in code are consistent.

Also applies to: 68-77, 204-211

🧰 Tools

🪛 markdownlint-cli2 (0.18.1)

30-30: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

coderabbitai · 2025-11-14T13:54:37Z

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_ipsec.py

+    tunnel_analysis = {
+        'pods': [],
+        'total_connections': 0,
+        'established_connections': 0,
+        'failed_connections': 0
+    }
+
+    # Build a mapping of pod name to node name from the pod analysis
+    # This requires re-reading pod info, so we'll do a lightweight version
+    pod_to_node = {}
+    patterns = [
+        "namespaces/openshift-ovn-kubernetes/pods/ovn-ipsec-host-*/*.yaml",
+        "*/namespaces/openshift-ovn-kubernetes/pods/ovn-ipsec-host-*/*.yaml",
+    ]
+
+    seen_pods = set()
+    for pattern in patterns:
+        for pod_file in must_gather_path.glob(pattern):
+            if pod_file.name == 'pods.yaml':
+                continue
+
+            pod = parse_yaml_file(pod_file)
+            if pod:
+                name = pod.get('metadata', {}).get('name', '')
+                if name.startswith('ovn-ipsec-host-') and name not in seen_pods:
+                    seen_pods.add(name)
+                    node = pod.get('spec', {}).get('nodeName', 'Unknown')
+                    pod_to_node[name] = node
+
+    # Look for ipsec directories
+    ipsec_patterns = [
+        "network_logs/ipsec/*_ipsec.d",
+        "*/network_logs/ipsec/*_ipsec.d"
+    ]
+
+    for pattern in ipsec_patterns:
+        for ipsec_dir in must_gather_path.glob(pattern):
+            if not ipsec_dir.is_dir():
+                continue
+
+            # Extract pod name from directory (e.g., "ovn-ipsec-host-abc_ipsec.d" -> "ovn-ipsec-host-abc")
+            pod_name = ipsec_dir.name.replace('_ipsec.d', '')
+
+            # Apply node filter if specified
+            node_name = pod_to_node.get(pod_name, 'Unknown')
+            if filter_node and filter_node.lower() != 'all' and node_name != filter_node:
+                continue
+
+            openshift_conf = ipsec_dir / 'openshift.conf'
+            # Status directory is at network_logs/ipsec/status, not inside the pod's ipsec.d directory
+            status_dir = ipsec_dir.parent / 'status'
+
+            pod_info = {
+                'pod': pod_name,
+                'node': node_name,
+                'config_found': openshift_conf.exists(),
+                'status_dir_found': status_dir.exists(),
+                'connections': []
+            }
+
+            if not openshift_conf.exists():
+                pod_info['error'] = 'openshift.conf not found'
+                tunnel_analysis['pods'].append(pod_info)
+                continue
+
+            # Parse connections from openshift.conf
+            connections = parse_ipsec_connections(openshift_conf)
+            tunnel_analysis['total_connections'] += len(connections)
+
+            # Status log file follows naming convention: status/<pod_name>.log
+            status_log = status_dir / f"{pod_name}.log" if status_dir.exists() else None
+            print ("status log directory")
+            print (pod_name)
+            print (status_log)
+            # Check status for each connection
+            for conn_name in connections:
+                if status_log:
+
+                    conn_status = check_connection_status(status_log, conn_name)
+                    established = conn_status['established']
+                    info = conn_status['info']
+                else:
+                    established = False
+                    info = 'Status log not found'
+
+                pod_info['connections'].append({
+                    'name': conn_name,
+                    'established': established,
+                    'info': info
+                })
+
+                if established:
+                    tunnel_analysis['established_connections'] += 1
+                else:
+                    tunnel_analysis['failed_connections'] += 1
+
+            if not status_dir.exists():
+                pod_info['status_warning'] = 'status directory not found'
+
+            tunnel_analysis['pods'].append(pod_info)
+
+    return tunnel_analysis


⚠️ Potential issue | 🔴 Critical

Status log directory path and debug prints will break tunnel detection

In analyze_ipsec_tunnels, the status log directory handling and extra prints have two problems:

Path mismatch vs documented layout

The docstring here and plugins/must-gather/commands/ipsec.md both describe the status logs as:

network_logs/ipsec/<pod>_ipsec.d/status/<pod>.log

But the implementation uses:

# Status directory is at network_logs/ipsec/status, not inside the pod's ipsec.d directory status_dir = ipsec_dir.parent / 'status'

This will miss logs stored under <pod>_ipsec.d/status/ and cause all connections to report "Status log not found" even when logs exist.

To support both the per‑pod and shared status layouts safely, you can prefer the per‑pod status/ under <pod>_ipsec.d and fall back to the shared directory if needed:

- # Status directory is at network_logs/ipsec/status, not inside the pod's ipsec.d directory - status_dir = ipsec_dir.parent / 'status' + # Status directory is typically inside the pod's ipsec.d directory: + # network_logs/ipsec/<pod>_ipsec.d/status/<pod>.log + # Fall back to a shared status directory (network_logs/ipsec/status) + # for compatibility with alternative layouts. + pod_status_dir = ipsec_dir / "status" + shared_status_dir = ipsec_dir.parent / "status" + status_dir = pod_status_dir if pod_status_dir.exists() else shared_status_dir @@ - status_log = status_dir / f"{pod_name}.log" if status_dir.exists() else None - print ("status log directory") - print (pod_name) - print (status_log) + status_log = status_dir / f"{pod_name}.log" if status_dir.exists() else None

Leftover debug prints

The print("status log directory"), print(pod_name), and print(status_log) lines will pollute the user-facing report. Removing them (as in the diff above) keeps output aligned with the documented format.

As-is, this combination is likely to misreport tunnel state even when valid logs are present.

🤖 Prompt for AI Agents

In plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_ipsec.py around lines 251 to 352, the code incorrectly assumes the status directory is only at network_logs/ipsec/status and also leaves debug print statements; update the logic to first look for a per-pod status directory at <pod>_ipsec.d/status/<pod>.log (i.e., status_dir = ipsec_dir / 'status' and status_log = status_dir / f"{pod_name}.log" if exists), and if not present fall back to the shared parent status directory (ipsec_dir.parent / 'status') and its <pod>.log; remove the three print(...) lines; ensure status_dir_found reflects the actual directory checked and that status_log is None only when neither location contains the log so connection status falls back to "Status log not found".

coderabbitai · 2025-11-14T13:54:37Z

plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_ipsec.py

+    issues = []
+
+    if ipsec_config and ipsec_config.get('enabled'):
+        if not ipsec_pods:
+            if filter_node and filter_node.lower() != 'all':
+                issues.append(f"No ovn-ipsec-host pods found for node {filter_node}")
+            else:
+                issues.append("IPsec is enabled but no ovn-ipsec-host pods found")
+
+        if any(p['status'] != 'Running' for p in ipsec_pods):
+            issues.append("Some ovn-ipsec-host pods are not in Running state")
+
+        if tunnel_analysis['failed_connections'] > 0:
+            issues.append(f"{tunnel_analysis['failed_connections']} IPsec connections not established")
+
+        if not tunnel_analysis['pods'] and not filter_node:
+            issues.append("No IPsec tunnel data found in network_logs/ipsec/")
+
+    if issues:
+        print("⚠ Issues Detected:")
+        for idx, issue in enumerate(issues, 1):
+            print(f"{idx}. {issue}")
+    else:
+        if ipsec_config and ipsec_config.get('enabled'):
+            if tunnel_analysis['total_connections'] > 0:
+                print(f"✓ All {tunnel_analysis['established_connections']} IPsec connections are established")
+            else:
+                print("✓ IPsec is enabled (no tunnel data available for verification)")
+        else:
+            print("IPsec is not enabled on this cluster")
+


⚠️ Potential issue | 🟠 Major

Avoid treating External IPsec mode as a failure in the summary

get_ipsec_config() sets enabled to True for any mode other than "Disabled" (i.e., both "Full" and "External"). analyze_ipsec() only analyzes pods and tunnels when mode == "Full", which is correct because ovn-ipsec-host is not used in External mode.

However, the summary logic here uses ipsec_config.get('enabled') and therefore:

Treats External mode as “enabled”.

Then flags “IPsec is enabled but no ovn-ipsec-host pods found” and “No IPsec tunnel data found…” even though both are expected in External mode.

This will generate false alarms on clusters using External IPsec.

Consider keying the summary off mode instead, e.g.:

- issues = [] - - if ipsec_config and ipsec_config.get('enabled'): + issues = [] + mode = ipsec_config.get("mode") if ipsec_config else None + + if mode == "Full": @@ - if issues: + if issues: print("⚠ Issues Detected:") @@ - else: - if ipsec_config and ipsec_config.get('enabled'): - if tunnel_analysis['total_connections'] > 0: - print(f"✓ All {tunnel_analysis['established_connections']} IPsec connections are established") - else: - print("✓ IPsec is enabled (no tunnel data available for verification)") - else: - print("IPsec is not enabled on this cluster") + else: + if mode == "Full": + if tunnel_analysis["total_connections"] > 0: + print(f"✓ All {tunnel_analysis['established_connections']} IPsec connections are established") + else: + print("✓ IPsec is enabled in Full mode (no tunnel data available for verification)") + elif mode == "External": + print("IPsec is enabled in External mode (ovn-ipsec-host pods and tunnel data are not expected)") + else: + print("IPsec is not enabled on this cluster")

This keeps the issue list meaningful for Full mode while avoiding spurious warnings for External mode.

🤖 Prompt for AI Agents

In plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_ipsec.py around lines 488 to 518, the summary currently keys off ipsec_config.get('enabled') which treats External mode as enabled and produces false positives; change the conditional logic to check ipsec_config.get('mode') == 'Full' when deciding whether to validate ovn-ipsec-host pods, tunnel data, and when emitting the summary about missing pods/tunnels or established connections so that only Full mode triggers those checks and External mode does not produce those warnings.

Prashanth684 · 2025-11-18T21:33:32Z

lgtm pending the fix for coderabbit suggestions and linting.

Add ipsec analyzer for must-gather

372b13b

Assisted-by: Claude <[email protected]> Signed-off-by: Periyasamy Palanisamy <[email protected]>

openshift-ci bot requested review from enxebre and stleerh November 14, 2025 13:48

pperiyasamy changed the title ~~must-gather: Add IPsec tunnel analyzer for must-gather diagnostics~~ must-gather: Add IPsec tunnel analyzer Nov 14, 2025

coderabbitai bot reviewed Nov 14, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

must-gather: Add IPsec tunnel analyzer #157

must-gather: Add IPsec tunnel analyzer #157

pperiyasamy commented Nov 14, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

openshift-ci bot commented Nov 14, 2025

Uh oh!

coderabbitai bot commented Nov 14, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Nov 14, 2025

Uh oh!

coderabbitai bot Nov 14, 2025

Uh oh!

coderabbitai bot Nov 14, 2025

Uh oh!

Prashanth684 commented Nov 18, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

must-gather: Add IPsec tunnel analyzer #157

Are you sure you want to change the base?

must-gather: Add IPsec tunnel analyzer #157

Conversation

pperiyasamy commented Nov 14, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

New Features

Documentation

Uh oh!

openshift-ci bot commented Nov 14, 2025

Uh oh!

coderabbitai bot commented Nov 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Nov 14, 2025

Choose a reason for hiding this comment

Uh oh!

Prashanth684 commented Nov 18, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

pperiyasamy commented Nov 14, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Nov 14, 2025 •

edited

Loading