Skip to content

Conversation

@pperiyasamy
Copy link

@pperiyasamy pperiyasamy commented Nov 14, 2025

Introduces /must-gather:ipsec command to analyze IPsec configuration and tunnel establishment status in OpenShift clusters. The analyzer:

  • Checks IPsec enablement in cluster network configuration
  • Validates ovn-ipsec-host daemonset pod health
  • Verifies tunnel establishment by parsing connection configs and
    status logs for ESTABLISHED_CHILD_SA markers
  • Supports node-specific filtering for targeted troubleshooting

The analysis script examines network_logs/ipsec/ data to report on all IPsec connections across the cluster mesh.
Assisted-by: Claude [email protected]

Example output 1:

> /must-gather:ipsec is running… /mnt/sdn-issues/ipsec-upgrade/4.21/quay-proxy-ci-openshift-org-openshift-ci-sha256-271512c45b445012535d9370b718c0db2143ce119d3042f0f02fcb1c652c1365 

● I'll analyze the IPsec configuration and tunnel status from the must-gather data.

● Bash(python3 plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_ipsec.py "/mnt/sdn-issues/ipsec-upgrade/4.21/quay-proxy-ci-openshift-org-openshift-ci-sh…)
  ⎿  status log directory                                                                                                                                                                                  
     ovn-ipsec-host-42r4s
     /mnt/sdn-issues/ipsec-upgrade/4.21/quay-proxy-ci-openshift-org-openshift-ci-sha256-271512c45b445012535d9370b718c0db2143ce119d3042f0f02fcb1c652c1365/network_logs/ipsec/status/ovn-ipsec-host-42r4s.log
     … +157 lines (ctrl+o to expand)

● IPsec Analysis Results

  The IPsec analysis shows a healthy cluster configuration:

  Key Findings

  Configuration Status:
  - IPsec is enabled in Full mode
  - All 6 ovn-ipsec-host pods are running with 2/2 containers ready
  - Zero restarts across all pods

  Tunnel Health:
  - 60/60 connections established (100% success rate)
  - Each of the 6 nodes has 10 bidirectional IPsec connections (5 inbound + 5 outbound)
  - All connections show ESTABLISHED_CHILD_SA status, indicating active encrypted tunnels

  Node Coverage:
  The IPsec daemonset is running on all 6 nodes:
  - ip-10-0-90-216.us-west-1.compute.internal
  - ip-10-0-39-115.us-west-1.compute.internal
  - ip-10-0-114-30.us-west-1.compute.internal
  - ip-10-0-48-5.us-west-1.compute.internal
  - ip-10-0-117-247.us-west-1.compute.internal
  - ip-10-0-78-134.us-west-1.compute.internal

  Conclusion

  No issues detected. The IPsec encryption is fully operational across the entire cluster with all node-to-node tunnels successfully established.

Example output 2:

> /must-gather:ipsec is running… /mnt/sdn-issues/egressip-stale-snat-lrp/must-gather.local.1056168555590501156/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-4e4aab0ca41f3711e720fff2d053ebe4e16b7698a8cc171f438e914c844055b9 

● I'll analyze the IPsec configuration and tunnel status from this must-gather data.

● Bash(python3 plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_ipsec.py "/mnt/sdn-issues/egressip-stale-snat-lrp/must-gather.local.1056168555590501156/…)
  ⎿  ====================================================================================================
     IPSEC CONFIGURATION
     ====================================================================================================
     … +21 lines (ctrl+o to expand)

● IPsec Analysis Results

  The analysis shows that IPsec is not enabled on this cluster.

  Key Findings

  Configuration Status:
  - IPsec is DISABLED in the cluster network configuration
  - No ovn-ipsec-host daemonset pods are present (expected when IPsec is disabled)
  - No IPsec tunnel data collected in the must-gather

  Conclusion

  This cluster is not using IPsec encryption for overlay network traffic. This is a valid configuration - IPsec is an optional security feature in OpenShift that provides encryption for pod-to-pod traffic across nodes.

  If you were expecting IPsec to be enabled, you would need to configure it in the cluster's Network operator configuration by adding ipsecConfig under the ovnKubernetesConfig section.

Example output 3:

> /must-gather:ipsec is running… /mnt/sdn-issues/ipsec-failure-lane/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-6a74019b7c3262e3ea77a084c42a21b71abab96c29478cbc2b9bcf8072638d6a 

● I'll analyze the IPsec configuration and tunnel status from this must-gather data.

● Bash(python3 plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_ipsec.py "/mnt/sdn-issues/ipsec-failure-lane/quay-io-openshift-release-dev-ocp-v4-0-art-…)
  ⎿  status log directory                                                                                                                                                                                        
     ovn-ipsec-host-5w5bf
     /mnt/sdn-issues/ipsec-failure-lane/quay-io-openshift-release-dev-ocp-v4-0-art-dev-sha256-6a74019b7c3262e3ea77a084c42a21b71abab96c29478cbc2b9bcf8072638d6a/network_logs/ipsec/status/ovn-ipsec-host-5w5bf.log
     … +158 lines (ctrl+o to expand)

● IPsec Analysis Results

  The analysis reveals one failed IPsec tunnel in an otherwise healthy configuration.

  Key Findings

  Configuration Status:
  - IPsec is enabled in Full mode
  - All 6 ovn-ipsec-host pods are running with 2/2 containers ready
  - Zero restarts across all pods

  Tunnel Health:
  - 59/60 connections established (98.3% success rate)
  - 1 connection failed to establish

  Failed Connection Details

  Pod: ovn-ipsec-host-5w5bf (Node: ip-10-0-127-96.us-east-2.compute.internal)

  Failed Connection: ovn-0f62a7-0-in-1 (inbound direction)
  - The corresponding outbound connection ovn-0f62a7-0-out-1 is established
  - This suggests a unidirectional tunnel failure

  Analysis

  This is an asymmetric tunnel failure where:
  - Outbound traffic from node ip-10-0-127-96 to the remote peer (identified by 0f62a7) works
  - Inbound traffic from that remote peer back to ip-10-0-127-96 is not encrypted

  The remote peer appears to be node ip-10-0-65-62 (pod ovn-ipsec-host-cfqt5), which shows both directions of connection ovn-0f62a7 as established on its side. This indicates a mismatch in tunnel state between the two nodes.

  Recommendations

  1. Check for timing issues: The tunnel may have failed to establish during initial cluster setup or after a network disruption
  2. Review pod logs: Examine logs from ovn-ipsec-host-5w5bf for IKE negotiation failures
  3. Check certificate validity: Verify IPsec certificates haven't expired or become invalid
  4. Consider tunnel restart: Restarting the affected pod may re-establish the tunnel

Summary by CodeRabbit

New Features

  • Added IPsec analysis functionality to the Must-Gather Analyzer supporting configuration evaluation, pod status monitoring, and tunnel status verification with optional node filtering and summary metrics.

Documentation

  • Added comprehensive documentation for IPsec analysis commands, including usage examples, expected outputs, and integration with existing must-gather analysis workflow.

Assisted-by: Claude <[email protected]>
Signed-off-by: Periyasamy Palanisamy <[email protected]>
@openshift-ci openshift-ci bot requested review from enxebre and stleerh November 14, 2025 13:48
@openshift-ci
Copy link

openshift-ci bot commented Nov 14, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: pperiyasamy
Once this PR has been reviewed and has the lgtm label, please assign brandisher for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 14, 2025

Walkthrough

The changes introduce IPsec analysis capabilities to the Must-Gather Analyzer plugin. A new Python analysis script processes must-gather data to evaluate IPsec configuration and tunnel status. Documentation is added across multiple markdown files describing the IPsec analysis command, usage examples, and integration with existing analysis capabilities.

Changes

Cohort / File(s) Summary
Documentation updates
plugins/must-gather/README.md, plugins/must-gather/commands/analyze.md
Augment existing documentation to include IPsec analysis; document keyword routing, execution order within full analysis workflow, and CLI examples for invoking IPsec analysis.
New IPsec command documentation
plugins/must-gather/commands/ipsec.md
New file describing the must-gather ipsec command, including synopsis, analysis scope, node filtering, implementation workflow, examples, and error handling guidance.
New IPsec analysis script
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_ipsec.py
New Python script implementing IPsec analysis; parses YAML network configuration for enablement/mode, enumerates ovn-ipsec-host pods, parses IPsec connections, evaluates tunnel status, and aggregates results into a comprehensive report with optional node filtering.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Extra attention on analyze_ipsec.py:
    • Logic for parsing YAML and openshift.conf files with error handling and missing-file gracefully degradation
    • Tunnel status aggregation logic across multiple pod entries and connection definitions
    • Node filtering implementation and its integration across multiple analysis functions
    • Correctness of pod enumeration and status evaluation from daemonset data

Pre-merge checks and finishing touches

✅ Passed checks (7 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'must-gather: Add IPsec tunnel analyzer' clearly and specifically summarizes the main change—adding IPsec tunnel analysis capability to the must-gather tool.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
No Real People Names In Style References ✅ Passed No real person names appear in style references, plugin commands, skill documentation, example prompts, or instructions in the added files.
No Assumed Git Remote Names ✅ Passed The PR does not contain any hardcoded git remote names like 'origin' or 'upstream' in the modified files.
Git Push Safety Rules ✅ Passed PR contains only documentation and Python analysis script with no git push commands, force operations, or CI/CD automation violations detected.
No Untrusted Mcp Servers ✅ Passed Pull request introduces only IPsec analysis Python script and documentation with no MCP server installations from any source.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@pperiyasamy pperiyasamy changed the title must-gather: Add IPsec tunnel analyzer for must-gather diagnostics must-gather: Add IPsec tunnel analyzer Nov 14, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (3)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_ipsec.py (1)

36-43: Narrow overly broad Exception handlers in parsing helpers

parse_yaml_file, parse_ipsec_connections, and check_connection_status all catch bare Exception, which can mask programming errors and makes troubleshooting harder. The failure modes here are primarily file I/O and parsing issues.

Consider tightening these to the expected exception types instead of a blanket Exception, e.g.:

-import yaml
+import yaml
+from yaml import YAMLError
@@
-    try:
-        with open(file_path, 'r') as f:
-            return yaml.safe_load(f)
-    except Exception as e:
+    try:
+        with open(file_path, "r") as f:
+            return yaml.safe_load(f)
+    except (OSError, YAMLError) as e:
         print(f"Warning: Failed to parse {file_path}: {e}", file=sys.stderr)
         return None
@@
-    try:
-        with open(openshift_conf_path, 'r', errors='ignore') as f:
+    try:
+        with open(openshift_conf_path, "r", errors="ignore") as f:
@@
-    except Exception as e:
+    except OSError as e:
         print(f"Warning: Failed to parse {openshift_conf_path}: {e}", file=sys.stderr)
@@
-    try:
-        with open(status_log_path, 'r', errors='ignore') as f:
+    try:
+        with open(status_log_path, "r", errors="ignore") as f:
@@
-    except Exception as e:
+    except OSError as e:
         print(f"Warning: Failed to read {status_log_path}: {e}", file=sys.stderr)

This keeps the robustness but avoids swallowing unexpected bugs.

Also applies to: 166-183, 187-224

plugins/must-gather/commands/ipsec.md (1)

10-12: Add languages to fenced code blocks for better tooling support

Several fenced code blocks here omit a language identifier (markdownlint MD040). Adding explicit languages improves rendering and editor support, e.g.:

-```
+/must-gather:ipsec [must-gather-path] [node-name]
-```
+```text
@@
-```
+```text
 network_logs/
 └── ipsec/
@@
-```
+```
@@
-```
+```bash
 /must-gather:ipsec ./must-gather.local.123456789
@@
-```
+```bash
 /must-gather:ipsec ./must-gather/... worker-1
@@
-```
+```text
 No IPsec tunnel data found in network_logs/ipsec/
@@
-```
+```text
 ⚠ Error: openshift.conf not found

Adjust languages (bash, text, etc.) as appropriate for each block. This keeps markdownlint happy and helps downstream consumers.

Also applies to: 30-37, 87-88, 133-172, 176-199

plugins/must-gather/commands/analyze.md (1)

155-207: Specify a language for the Output Structure code block

markdownlint flags the example output block (including IPSEC STATUS:) for missing a language identifier (MD040). Adding something like text improves tooling support without changing semantics:

-```
+```text
 ================================================================================
 MUST-GATHER ANALYSIS SUMMARY
@@
 Logs to Review:
 - [Specific log files to examine based on findings]
-```
+```

This is purely a formatting/tooling improvement; behavior is unaffected.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between aeea277 and 372b13b.

📒 Files selected for processing (4)
  • plugins/must-gather/README.md (2 hunks)
  • plugins/must-gather/commands/analyze.md (6 hunks)
  • plugins/must-gather/commands/ipsec.md (1 hunks)
  • plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_ipsec.py (1 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
plugins/must-gather/commands/ipsec.md

10-10: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


30-30: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


42-42: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


87-87: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


133-133: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


139-139: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


145-145: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


151-151: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


157-157: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


163-163: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


169-169: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


177-177: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


184-184: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


190-190: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


196-196: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

plugins/must-gather/commands/analyze.md

177-177: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🪛 Ruff (0.14.4)
plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_ipsec.py

41-41: Do not catch blind exception: Exception

(BLE001)


181-181: Do not catch blind exception: Exception

(BLE001)


219-219: Do not catch blind exception: Exception

(BLE001)

🔇 Additional comments (3)
plugins/must-gather/README.md (2)

254-303: IPsec analysis script documentation aligns with implementation

The analyze_ipsec.py section correctly describes the script’s behavior and CLI:

  • Usage matches main() (./analyze_ipsec.py <must-gather-path> [node-name]).
  • Reported outputs (configuration status, ovn-ipsec-host pods, tunnel counts and statuses) correspond to print_ipsec_analysis().

This block looks consistent with the Python implementation.


350-377: /must-gather:ipsec command description is consistent with the analyzer

The new slash-command section correctly mirrors what analyze_ipsec.py does:

  • Arguments and examples line up with the must_gather_path + optional node filter.
  • “What it provides” matches the script’s sections (config, daemonset pods, tunnel status, per-connection details).

No functional discrepancies spotted here.

plugins/must-gather/commands/analyze.md (1)

25-25: IPsec integration in must-gather:analyze matches the new analyzer

The updates here integrate IPsec cleanly into the existing analysis flow:

  • Bullet list correctly mentions “IPsec configuration and tunnel status”.
  • analyze_ipsec.py is listed alongside other scripts in the expected path.
  • Keyword routing (“ipsec”, “ipsec tunnels”, etc.) maps to analyze_ipsec.py only, which matches the intended behavior.
  • Full-analysis execution order now includes IPsec between network and events, aligned with the PR description.
  • The “IPSEC STATUS” section and example usage (analyze ipsec tunnels) match the Python script’s output and CLI.

This doc block is consistent with the implementation.

Also applies to: 59-64, 109-116, 119-130, 177-179, 235-240

Comment on lines +29 to +37
The must-gather should contain:
```
network_logs/
└── ipsec/
└── <pod>_ipsec.d/
├── openshift.conf # IPsec connection configurations
└── status/
└── <pod>.log # Connection status logs
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Align documented IPsec log directory layout with implementation

This doc repeatedly states that status logs live under each pod’s directory, e.g.:

  • network_logs/ipsec/<pod>_ipsec.d/status/<pod>.log (Prerequisites, Implementation step 4, Notes).

In the Python implementation (analyze_ipsec_tunnels() in analyze_ipsec.py), the status directory is currently resolved as network_logs/ipsec/status (sibling of <pod>_ipsec.d) rather than inside <pod>_ipsec.d, and then status/<pod>.log is read from there.

Right now, docs and code disagree on where <pod>.log is expected. Whichever layout is actually produced by must-gather should be the single source of truth; both the documentation here and the script should be updated to match it. The script-side fix can also support both layouts via a small fallback (see comment on analyze_ipsec_tunnels in analyze_ipsec.py).

I’d recommend:

  • Confirming the real must-gather directory structure for IPsec logs.
  • Updating both this file and analyze_ipsec.py so the described path and the path used in code are consistent.

Also applies to: 68-77, 204-211

🧰 Tools
🪛 markdownlint-cli2 (0.18.1)

30-30: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

Comment on lines +251 to +352
tunnel_analysis = {
'pods': [],
'total_connections': 0,
'established_connections': 0,
'failed_connections': 0
}

# Build a mapping of pod name to node name from the pod analysis
# This requires re-reading pod info, so we'll do a lightweight version
pod_to_node = {}
patterns = [
"namespaces/openshift-ovn-kubernetes/pods/ovn-ipsec-host-*/*.yaml",
"*/namespaces/openshift-ovn-kubernetes/pods/ovn-ipsec-host-*/*.yaml",
]

seen_pods = set()
for pattern in patterns:
for pod_file in must_gather_path.glob(pattern):
if pod_file.name == 'pods.yaml':
continue

pod = parse_yaml_file(pod_file)
if pod:
name = pod.get('metadata', {}).get('name', '')
if name.startswith('ovn-ipsec-host-') and name not in seen_pods:
seen_pods.add(name)
node = pod.get('spec', {}).get('nodeName', 'Unknown')
pod_to_node[name] = node

# Look for ipsec directories
ipsec_patterns = [
"network_logs/ipsec/*_ipsec.d",
"*/network_logs/ipsec/*_ipsec.d"
]

for pattern in ipsec_patterns:
for ipsec_dir in must_gather_path.glob(pattern):
if not ipsec_dir.is_dir():
continue

# Extract pod name from directory (e.g., "ovn-ipsec-host-abc_ipsec.d" -> "ovn-ipsec-host-abc")
pod_name = ipsec_dir.name.replace('_ipsec.d', '')

# Apply node filter if specified
node_name = pod_to_node.get(pod_name, 'Unknown')
if filter_node and filter_node.lower() != 'all' and node_name != filter_node:
continue

openshift_conf = ipsec_dir / 'openshift.conf'
# Status directory is at network_logs/ipsec/status, not inside the pod's ipsec.d directory
status_dir = ipsec_dir.parent / 'status'

pod_info = {
'pod': pod_name,
'node': node_name,
'config_found': openshift_conf.exists(),
'status_dir_found': status_dir.exists(),
'connections': []
}

if not openshift_conf.exists():
pod_info['error'] = 'openshift.conf not found'
tunnel_analysis['pods'].append(pod_info)
continue

# Parse connections from openshift.conf
connections = parse_ipsec_connections(openshift_conf)
tunnel_analysis['total_connections'] += len(connections)

# Status log file follows naming convention: status/<pod_name>.log
status_log = status_dir / f"{pod_name}.log" if status_dir.exists() else None
print ("status log directory")
print (pod_name)
print (status_log)
# Check status for each connection
for conn_name in connections:
if status_log:

conn_status = check_connection_status(status_log, conn_name)
established = conn_status['established']
info = conn_status['info']
else:
established = False
info = 'Status log not found'

pod_info['connections'].append({
'name': conn_name,
'established': established,
'info': info
})

if established:
tunnel_analysis['established_connections'] += 1
else:
tunnel_analysis['failed_connections'] += 1

if not status_dir.exists():
pod_info['status_warning'] = 'status directory not found'

tunnel_analysis['pods'].append(pod_info)

return tunnel_analysis
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Status log directory path and debug prints will break tunnel detection

In analyze_ipsec_tunnels, the status log directory handling and extra prints have two problems:

  1. Path mismatch vs documented layout

The docstring here and plugins/must-gather/commands/ipsec.md both describe the status logs as:

  • network_logs/ipsec/<pod>_ipsec.d/status/<pod>.log

But the implementation uses:

# Status directory is at network_logs/ipsec/status, not inside the pod's ipsec.d directory
status_dir = ipsec_dir.parent / 'status'

This will miss logs stored under <pod>_ipsec.d/status/ and cause all connections to report "Status log not found" even when logs exist.

To support both the per‑pod and shared status layouts safely, you can prefer the per‑pod status/ under <pod>_ipsec.d and fall back to the shared directory if needed:

-            # Status directory is at network_logs/ipsec/status, not inside the pod's ipsec.d directory
-            status_dir = ipsec_dir.parent / 'status'
+            # Status directory is typically inside the pod's ipsec.d directory:
+            #   network_logs/ipsec/<pod>_ipsec.d/status/<pod>.log
+            # Fall back to a shared status directory (network_logs/ipsec/status)
+            # for compatibility with alternative layouts.
+            pod_status_dir = ipsec_dir / "status"
+            shared_status_dir = ipsec_dir.parent / "status"
+            status_dir = pod_status_dir if pod_status_dir.exists() else shared_status_dir
@@
-            status_log = status_dir / f"{pod_name}.log" if status_dir.exists() else None
-            print ("status log directory")
-            print (pod_name)
-            print (status_log)
+            status_log = status_dir / f"{pod_name}.log" if status_dir.exists() else None
  1. Leftover debug prints

The print("status log directory"), print(pod_name), and print(status_log) lines will pollute the user-facing report. Removing them (as in the diff above) keeps output aligned with the documented format.

As-is, this combination is likely to misreport tunnel state even when valid logs are present.

🤖 Prompt for AI Agents
In plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_ipsec.py
around lines 251 to 352, the code incorrectly assumes the status directory is
only at network_logs/ipsec/status and also leaves debug print statements; update
the logic to first look for a per-pod status directory at
<pod>_ipsec.d/status/<pod>.log (i.e., status_dir = ipsec_dir / 'status' and
status_log = status_dir / f"{pod_name}.log" if exists), and if not present fall
back to the shared parent status directory (ipsec_dir.parent / 'status') and its
<pod>.log; remove the three print(...) lines; ensure status_dir_found reflects
the actual directory checked and that status_log is None only when neither
location contains the log so connection status falls back to "Status log not
found".

Comment on lines +488 to +518
issues = []

if ipsec_config and ipsec_config.get('enabled'):
if not ipsec_pods:
if filter_node and filter_node.lower() != 'all':
issues.append(f"No ovn-ipsec-host pods found for node {filter_node}")
else:
issues.append("IPsec is enabled but no ovn-ipsec-host pods found")

if any(p['status'] != 'Running' for p in ipsec_pods):
issues.append("Some ovn-ipsec-host pods are not in Running state")

if tunnel_analysis['failed_connections'] > 0:
issues.append(f"{tunnel_analysis['failed_connections']} IPsec connections not established")

if not tunnel_analysis['pods'] and not filter_node:
issues.append("No IPsec tunnel data found in network_logs/ipsec/")

if issues:
print("⚠ Issues Detected:")
for idx, issue in enumerate(issues, 1):
print(f"{idx}. {issue}")
else:
if ipsec_config and ipsec_config.get('enabled'):
if tunnel_analysis['total_connections'] > 0:
print(f"✓ All {tunnel_analysis['established_connections']} IPsec connections are established")
else:
print("✓ IPsec is enabled (no tunnel data available for verification)")
else:
print("IPsec is not enabled on this cluster")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid treating External IPsec mode as a failure in the summary

get_ipsec_config() sets enabled to True for any mode other than "Disabled" (i.e., both "Full" and "External"). analyze_ipsec() only analyzes pods and tunnels when mode == "Full", which is correct because ovn-ipsec-host is not used in External mode.

However, the summary logic here uses ipsec_config.get('enabled') and therefore:

  • Treats External mode as “enabled”.
  • Then flags “IPsec is enabled but no ovn-ipsec-host pods found” and “No IPsec tunnel data found…” even though both are expected in External mode.

This will generate false alarms on clusters using External IPsec.

Consider keying the summary off mode instead, e.g.:

-    issues = []
-
-    if ipsec_config and ipsec_config.get('enabled'):
+    issues = []
+    mode = ipsec_config.get("mode") if ipsec_config else None
+
+    if mode == "Full":
@@
-    if issues:
+    if issues:
         print("⚠ Issues Detected:")
@@
-    else:
-        if ipsec_config and ipsec_config.get('enabled'):
-            if tunnel_analysis['total_connections'] > 0:
-                print(f"✓ All {tunnel_analysis['established_connections']} IPsec connections are established")
-            else:
-                print("✓ IPsec is enabled (no tunnel data available for verification)")
-        else:
-            print("IPsec is not enabled on this cluster")
+    else:
+        if mode == "Full":
+            if tunnel_analysis["total_connections"] > 0:
+                print(f"✓ All {tunnel_analysis['established_connections']} IPsec connections are established")
+            else:
+                print("✓ IPsec is enabled in Full mode (no tunnel data available for verification)")
+        elif mode == "External":
+            print("IPsec is enabled in External mode (ovn-ipsec-host pods and tunnel data are not expected)")
+        else:
+            print("IPsec is not enabled on this cluster")

This keeps the issue list meaningful for Full mode while avoiding spurious warnings for External mode.

🤖 Prompt for AI Agents
In plugins/must-gather/skills/must-gather-analyzer/scripts/analyze_ipsec.py
around lines 488 to 518, the summary currently keys off
ipsec_config.get('enabled') which treats External mode as enabled and produces
false positives; change the conditional logic to check ipsec_config.get('mode')
== 'Full' when deciding whether to validate ovn-ipsec-host pods, tunnel data,
and when emitting the summary about missing pods/tunnels or established
connections so that only Full mode triggers those checks and External mode does
not produce those warnings.

@Prashanth684
Copy link
Contributor

lgtm pending the fix for coderabbit suggestions and linting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants