Skip to content

Conversation

@usv240
Copy link

@usv240 usv240 commented Sep 20, 2025

PR Title

feat(clickhouse): add volume permissions support to fix issue #623

PR Description

This adds volume permissions support to fix the permission errors that users are hitting when using persistent volumes with ClickHouse.

What's the issue?

When using persistent volumes, ClickHouse crashes on startup with permission denied errors like:

filesystem error: in create_directories: Permission denied ["/var/lib/clickhouse/preprocessed_configs"]

This happens because persistent volumes are mounted as root:root but ClickHouse runs as user 101:101, so it can't create the directories it needs.

How I fixed it

I implemented the same volume permissions pattern that's already used in the Zookeeper chart in this repo:

  • Added a volumePermissions section to values.yaml with all the standard configuration options
  • Added an init container that runs as root and fixes the ownership before ClickHouse starts
  • Added a helper template function for building the image name
  • Wrote comprehensive tests to make sure it works in all the different scenarios

How to use it

Just enable volume permissions when you have persistence enabled:

clickhouse:
  persistence:
    enabled: true
  volumePermissions:
    enabled: true

Tested thoroughly

I ran all the existing tests plus added 12 new test cases specifically for volume permissions. Also created a manual testing guide so others can verify it works.

This should fix #623 and anyone else hitting the same permission issues.

Summary by CodeRabbit

  • New Features

    • Optional volume-permissions init container to fix data directory ownership; new volumePermissions config (image, pullSecrets, securityContext, resources).
    • Startup, liveness, and readiness probe settings.
    • Added Zookeeper and operator chart manifests (services, statefulset, metrics, scripts, templates) and example ClickHouseInstallation manifests.
  • Documentation

    • Manual testing guide for volume permissions (repro, verify, troubleshooting).
  • Tests

    • Helm unit tests for init container, images, mounts, resources, securityContext, and persistence combinations.

Add volumePermissions init container to fix permission denied errors
when ClickHouse creates directories in persistent volumes.

Fixes SigNoz#623
@usv240 usv240 requested a review from a team as a code owner September 20, 2025 05:13
@coderabbitai
Copy link

coderabbitai bot commented Sep 24, 2025

Walkthrough

Adds a volume-permissions initContainer and supporting image helper, Helm unit tests, values for volumePermissions and probes, a manual testing guide, and a secret namespace lookup change in the ClickHouse chart. The initContainer conditionally fixes PV ownership before ClickHouse starts when enabled.

Changes

Cohort / File(s) Summary
Documentation
charts/clickhouse/MANUAL_TESTING.md
New manual testing guide describing the volume-permissions fix, root cause, example initContainer YAML, step-by-step tests, verification commands, expected results, and troubleshooting.
Template helpers
charts/clickhouse/templates/_helper.tpl
Adds define "clickhouse.volumePermissions.image" to render image (registry/tag/digest) and includes volumePermissions.image.pullSecrets into aggregated clickhouse.imagePullSecrets.
ClickHouse pod template
charts/clickhouse/templates/clickhouse-instance/clickhouse-instance.yaml
Adds conditional volume-permissions initContainer (command/args, image, imagePullPolicy, optional securityContext/resources) and mounts for data volumes (handles persistence, existingClaim, volumeClaimTemplates). Preserves existing initContainers and ordering.
Unit tests
charts/clickhouse/tests/clickhouse-instance_volume_permissions_test.yaml
New Helm UniTests covering presence/absence of the initContainer, image/tag/digest/pullPolicy, resource and securityContext propagation, volumeMount permutations (persistence/existingClaim/templates), and init container ordering.
Chart values
charts/clickhouse/values.yaml
Adds volumePermissions block (enabled, image fields, pullSecrets, resources, containerSecurityContext) and top-level probe configs (startupProbe, livenessProbe, readinessProbe).
Secret namespace lookup
charts/clickhouse/templates/clickhouse-operator/secret.yaml
Changes namespace source to prefer .Values.namespace with fallback to .Release.Namespace.
Zookeeper debug/chart additions
.debug/clickhouse/charts/zookeeper/...
Multiple added/updated debug templates and manifests (statefulset, services, scripts ConfigMap, metrics svc, scripts, and other debug placeholders); many files are new or empty debug templates.
ClickHouse operator debug templates
.debug/clickhouse/templates/clickhouse-operator/...
Added many operator ConfigMaps, deployment, RBAC, service, serviceaccount, and related debug manifests (mostly declarative ConfigMaps and templates).
ClickHouse instance debug manifests
.debug/clickhouse/templates/clickhouse-instance/...
Added full ClickHouseInstallation manifest, ConfigMaps (scripts, custom-functions), serviceaccount, and related debug templates.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant U as User (Helm values)
    participant H as Helm renderer
    participant K as Kubernetes API
    participant P as Pod (init phase)
    participant I as InitContainer: volume-permissions
    participant C as ClickHouse main container

    U->>H: enable volumePermissions + set persistence/template values
    H->>K: apply rendered Pod manifest (includes initContainer if enabled)
    K-->>P: create Pod
    alt volumePermissions enabled & volume mounted
        P->>I: start volume-permissions initContainer
        I->>I: mkdir -p /var/lib/clickhouse && chown -R <user>:<group> /var/lib/clickhouse
        I-->>P: exit 0
    else
        Note over P: no volume-permissions initContainer
    end
    P->>C: start ClickHouse main container
    C-->>K: probes/report ready
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • grandwizard28
  • prashant-shahi
  • srikanthccv

Poem

I hop in with tiny paws so neat,
I chown the burrow, fix the seat.
With rooty nudge and nimble art,
The ClickHouse pod gets a fresh start.
No more permission sorrow—hop, restart! 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Out of Scope Changes Check ⚠️ Warning The PR includes a large number of changes under .debug/charts/zookeeper and clickhouse-operator templates that are unrelated to the volume permission fix described in issue #623 and the PR objectives, indicating extraneous content beyond addressing the scoped requirement. Remove or isolate the debug zookeeper and clickhouse-operator template files into separate PRs so that this PR only contains changes relevant to the volumePermissions support.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title “feat(clickhouse): add volume permissions support” succinctly captures the primary change by indicating that volume permission handling is being added to the ClickHouse chart, and it aligns directly with the feature introduced without including extraneous details.
Linked Issues Check ✅ Passed The changes directly address the permission errors described in issue #623 by introducing a new volumePermissions values block, an initContainer to correct ownership before startup, and accompanying tests and documentation to validate the fix under persistence scenarios.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
✨ Finishing touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  • Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
  • Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (7)
charts/clickhouse/templates/_helper.tpl (1)

129-150: Solid image helper; wire pullSecrets support for init image

Helper handles registry/tag/digest well. One gap: volumePermissions.image.pullSecrets is defined in values but never used, so private registries for this init image won’t work unless users rely on global/image-level pull secrets.

  • Either remove volumePermissions.image.pullSecrets from values, or (better) include it in the pod imagePullSecrets.

Example (outside this hunk) to extend clickhouse.imagePullSecrets:

{{- define "clickhouse.imagePullSecrets" -}}
{{- if or .Values.global.imagePullSecrets .Values.imagePullSecrets .Values.volumePermissions.image.pullSecrets }}
imagePullSecrets:
{{- range .Values.global.imagePullSecrets }}
  - name: {{ . }}
{{- end }}
{{- range .Values.imagePullSecrets }}
  - name: {{ . }}
{{- end }}
{{- range .Values.volumePermissions.image.pullSecrets }}
  - name: {{ . }}
{{- end }}
{{- end }}
{{- end }}
charts/clickhouse/MANUAL_TESTING.md (2)

6-9: Add fenced code language to satisfy markdownlint MD040

Annotate the error snippet as plain text.

-```
+```text
 filesystem error: in create_directories: Permission denied ["/var/lib/clickhouse/preprocessed_configs"]
-```
+```

80-86: Add fenced code language to satisfy markdownlint MD040

Annotate the “Expected Output” snippet.

-```
+```console
 + mkdir -p /var/lib/clickhouse
 + chown -R 101:101 /var/lib/clickhouse
 + find /var/lib/clickhouse -mindepth 1 -maxdepth 1 -not -name .snapshot -not -name lost+found
 + xargs -r chown -R 101:101
-```
+```
charts/clickhouse/values.yaml (1)

231-266: Expose is fine; but pullSecrets here are currently unused

volumePermissions.image.pullSecrets is not consumed. Either document that global/image-level pull secrets should be used or wire these into the pod imagePullSecrets (preferred; see helper suggestion in _helper.tpl comment).

charts/clickhouse/templates/clickhouse-instance/clickhouse-instance.yaml (3)

196-205: Avoid emitting empty imagePullPolicy

Current template can render imagePullPolicy: "" if a user unsets the value, which is invalid. Make it conditional (and align with other containers).

-              imagePullPolicy: {{ default "" .Values.volumePermissions.image.pullPolicy | quote }}
+              {{- if .Values.volumePermissions.image.pullPolicy }}
+              imagePullPolicy: {{ .Values.volumePermissions.image.pullPolicy }}
+              {{- end }}

214-231: Unreachable branch for emptyDir mount in this initContainer

This initContainer only renders when persistence is enabled or templates.volumeClaimTemplates is set. The data-volume emptyDir mount under and (not .Values.templates.volumeClaimTemplates) (not .Values.persistence.enabled) cannot be reached. Safe to delete for clarity.


203-208: More robust chown for odd filenames

To be resilient to unusual names, consider null-delimited piping:

-                  find /var/lib/clickhouse -mindepth 1 -maxdepth 1 -not -name ".snapshot" -not -name "lost+found" | xargs -r chown -R {{ .Values.securityContext.runAsUser }}:{{ .Values.securityContext.fsGroup }}
+                  find /var/lib/clickhouse -mindepth 1 -maxdepth 1 -not -name ".snapshot" -not -name "lost+found" -print0 | xargs -0 -r chown -R {{ .Values.securityContext.runAsUser }}:{{ .Values.securityContext.fsGroup }}

Note: update tests accordingly if you adopt this.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2d07150 and c915a36.

📒 Files selected for processing (5)
  • charts/clickhouse/MANUAL_TESTING.md (1 hunks)
  • charts/clickhouse/templates/_helper.tpl (1 hunks)
  • charts/clickhouse/templates/clickhouse-instance/clickhouse-instance.yaml (2 hunks)
  • charts/clickhouse/tests/clickhouse-instance_volume_permissions_test.yaml (1 hunks)
  • charts/clickhouse/values.yaml (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-12-16T16:15:01.460Z
Learnt from: prashant-shahi
PR: SigNoz/charts#565
File: charts/clickhouse/values.yaml:261-264
Timestamp: 2024-12-16T16:15:01.460Z
Learning: In `charts/clickhouse/values.yaml`, the `customStartupProbe`, `customLivenessProbe`, and `customReadinessProbe` configurations are provided for advanced users to specify custom probe types such as HTTP, Exec, TCP, and gRPC.

Applied to files:

  • charts/clickhouse/values.yaml
🪛 markdownlint-cli2 (0.18.1)
charts/clickhouse/MANUAL_TESTING.md

6-6: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


80-80: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (1)
charts/clickhouse/tests/clickhouse-instance_volume_permissions_test.yaml (1)

9-20: Tests read well and cover core conditions

Good coverage for enable/disable matrix and volume mount variants. Nice digest/tag assertions too.

Consider adding one small assertion to catch accidental typos in resource headers (e.g., apiVersion presence) so regressions like a prefixed character get flagged early.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
charts/clickhouse/templates/_helper.tpl (2)

129-150: Harden image string construction: trim slashes to avoid //repo and /@sha256 edge cases

Good addition and digest handling. To make it more robust against user-supplied values with trailing/leading slashes on registry/repo, trim them before composition.

Apply this diff:

 {{- define "clickhouse.volumePermissions.image" -}}
-{{- $registryName := default .Values.volumePermissions.image.registry .Values.global.imageRegistry -}}
-{{- $repositoryName := .Values.volumePermissions.image.repository -}}
-{{- $tag := .Values.volumePermissions.image.tag | toString -}}
-{{- $digest := .Values.volumePermissions.image.digest | toString -}}
-{{- if $registryName -}}
+{{- $registryName := default .Values.volumePermissions.image.registry .Values.global.imageRegistry -}}
+{{- $repositoryName := .Values.volumePermissions.image.repository -}}
+{{- $tag := .Values.volumePermissions.image.tag | toString -}}
+{{- $digest := .Values.volumePermissions.image.digest | toString -}}
+{{- $reg := $registryName | trimSuffix "/" -}}
+{{- $repo := $repositoryName | trimPrefix "/" -}}
+{{- if $registryName -}}
     {{- if $digest -}}
-        {{- printf "%s/%s@%s" $registryName $repositoryName $digest -}}
+        {{- printf "%s/%s@%s" $reg $repo $digest -}}
     {{- else -}}
-        {{- printf "%s/%s:%s" $registryName $repositoryName $tag -}}
+        {{- printf "%s/%s:%s" $reg $repo $tag -}}
     {{- end -}}
 {{- else -}}
     {{- if $digest -}}
-        {{- printf "%s@%s" $repositoryName $digest -}}
+        {{- printf "%s@%s" $repo $digest -}}
     {{- else -}}
-        {{- printf "%s:%s" $repositoryName $tag -}}
+        {{- printf "%s:%s" $repo $tag -}}
     {{- end -}}
 {{- end -}}
 {{- end -}}

Also consider aligning digest support across other image helpers for consistency (follow-up).


192-205: Safer, deduped imagePullSecrets; avoid nil-path pitfalls and duplicates

Accessing .Values.volumePermissions.image.pullSecrets in the or can error if intermediate maps are absent, and duplicates are possible across lists. Use dig, merge, and uniq.

Apply this diff:

 {{- define "clickhouse.imagePullSecrets" -}}
-{{- if or .Values.global.imagePullSecrets .Values.imagePullSecrets .Values.volumePermissions.image.pullSecrets }}
-imagePullSecrets:
-{{- range .Values.global.imagePullSecrets }}
-  - name: {{ . }}
-{{- end }}
-{{- range .Values.imagePullSecrets }}
-  - name: {{ . }}
-{{- end }}
-{{- range .Values.volumePermissions.image.pullSecrets }}
-  - name: {{ . }}
-{{- end }}
-{{- end }}
+{{- $global := .Values.global.imagePullSecrets | default (list) -}}
+{{- $local := .Values.imagePullSecrets | default (list) -}}
+{{- $vp := dig "volumePermissions" "image" "pullSecrets" .Values | default (list) -}}
+{{- $all := concat $global $local $vp -}}
+{{- $pulls := uniq $all -}}
+{{- if $pulls }}
+imagePullSecrets:
+{{- range $pulls }}
+  - name: {{ . }}
+{{- end }}
+{{- end }}
 {{- end }}
charts/clickhouse/templates/clickhouse-operator/secret.yaml (1)

6-6: Use clickhouse.namespace helper in secret.yaml

-  namespace: {{ .Values.namespace | default .Release.Namespace }}
+  namespace: {{ include "clickhouse.namespace" . }}

The helper in _helper.tpl (default .Release.Namespace .Values.namespace) produces identical output, so there’s no upgrade impact and this change is purely for template consistency.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c915a36 and e98c0f8.

📒 Files selected for processing (3)
  • charts/clickhouse/templates/_helper.tpl (2 hunks)
  • charts/clickhouse/templates/clickhouse-instance/clickhouse-instance.yaml (1 hunks)
  • charts/clickhouse/templates/clickhouse-operator/secret.yaml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • charts/clickhouse/templates/clickhouse-instance/clickhouse-instance.yaml

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.debug/clickhouse/templates/clickhouse-operator/configmaps/etc-usersd-files.yaml (1)

1-54: Do not commit .debug artifacts to the repo

This looks like a Helm unittest/debug render. Please remove this file and ignore the entire .debug/ tree.

Suggested follow-ups:

  • Delete this file from the PR.
  • Add to .gitignore:
+.debug/
🧹 Nitpick comments (26)
.debug/clickhouse/templates/clickhouse-instance/serviceaccount.yaml (2)

4-15: Disable token automount by default for least privilege

Unless the pod needs K8s API access, set automountServiceAccountToken: false on the ServiceAccount (and/or pod) to avoid mounting tokens into all containers, including initContainers.

 kind: ServiceAccount
 metadata:
   name: RELEASE-NAME-clickhouse
   namespace: NAMESPACE
   labels:
@@
   annotations:
-    {}
+    {}
+automountServiceAccountToken: false

14-15: Wire annotations from values instead of a hard-coded empty map

Expose .Values.serviceAccount.annotations so operators can add IAM, PSP/PSS, or workload identity annotations when needed.

-  annotations:
-    {}
+  annotations:
+{{- with .Values.serviceAccount.annotations }}
+{{ toYaml . | nindent 4 }}
+{{- end }}
.debug/clickhouse/templates/clickhouse-operator/configmaps/etc-files.yaml (7)

3-6: Debug artifact or real manifest? Hardcoded name/namespace.

Metadata uses placeholders (RELEASE-NAME, NAMESPACE). If this is only a rendered debug/snapshot, ensure the .debug directory is excluded from chart packaging (via .helmignore). If it’s meant to be applied, wire these to Helm templates instead.


7-13: Version drift risk in labels.

Static labels (helm.sh/chart: clickhouse-24.1.18, app.kubernetes.io/version: "24.1.2", chop: 0.21.2) will rot. If this file is ever applied, derive them from chart/app/operator versions; otherwise keep it strictly as debug output and exclude from packages.


64-76: Avoid “password: default” in user template defaults.

Even if comments say ‘default’ user doesn’t use this field, shipping a template default of "default" can lead to unintended weak credentials if reused. Prefer omitting the password field here or set it to empty and require explicit Secrets.


145-151: Secret lookup: confirm name/namespace behavior.

namespace: "" means “operator’s namespace,” while name is hardcoded to "RELEASE-NAME-clickhouse-operator". Verify this matches the actual Secret created by the chart after your recent “secret namespace lookup change,” or wire it to the same helper used elsewhere to avoid mismatches.


268-270: No StatefulSet revision history.

revisionHistoryLimit: 0 hampers rollback/debugging. Recommend a small history (e.g., 5–10).


281-282: Termination grace may be too low for ClickHouse.

30s can be tight for graceful shutdown and flushes under load. Consider 120–300s unless CHI manifests override this reliably.


152-160: Operator→CH timeouts are aggressive.

connect: 2s, query: 5s, and metrics collect: 9s may cause spurious failures on busy clusters or slow networks. Recommend connect 5–10s, query 20–30s, metrics 20–30s, unless you have evidence these values are safe.

.debug/clickhouse/templates/clickhouse-instance/scripts-configmap.yaml (3)

15-36: Make the script POSIX, add retry/timeout, and handle signals (avoid indefinite hangs).

This can block forever if ClickHouse never comes up, and it assumes bash+wget availability. Propose a portable version with curl/wget fallback, env‑overridable HOST/PORT/DELAY, capped retries, and SIGTERM handling.

-    #!/bin/bash
-
-    # Delay between attempts in seconds
-    DELAY=10
-    # Default host and port
-    HOST="localhost"
-    PORT="8123"
-
-    echo "Waiting for Clickhouse to become ready..."
-
-    while true; do
-        response=$(wget --spider -S "http://${HOST}:${PORT}/ping" 2>&1)
-        exit_code=$?
-        
-        if [ $exit_code -eq 0 ]; then
-            echo "Clickhouse is ready!"
-            exit 0
-        else
-            echo "Clickhouse is not ready yet. Retrying in ${DELAY} seconds..."
-            sleep $DELAY
-        fi
-    done
+    #!/bin/sh
+    set -eu
+
+    # Defaults (overridable via env)
+    HOST="${HOST:-127.0.0.1}"
+    PORT="${PORT:-8123}"
+    DELAY="${DELAY:-10}"          # seconds between attempts
+    MAX_ATTEMPTS="${MAX_ATTEMPTS:-60}" # ~10 minutes by default
+
+    trap 'echo "wait-until-ready: terminated"; exit 1' INT TERM
+
+    has_cmd() { command -v "$1" >/dev/null 2>&1; }
+
+    echo "Waiting for ClickHouse at http://${HOST}:${PORT}/ping ..."
+    ATTEMPT=1
+    while [ "$ATTEMPT" -le "$MAX_ATTEMPTS" ]; do
+      if has_cmd curl; then
+        if curl -fsS -m 3 "http://${HOST}:${PORT}/ping" >/dev/null; then
+          echo "ClickHouse is ready!"
+          exit 0
+        fi
+      elif has_cmd wget; then
+        if wget -qO- --timeout=3 --tries=1 "http://${HOST}:${PORT}/ping" >/dev/null; then
+          echo "ClickHouse is ready!"
+          exit 0
+        fi
+      else
+        echo "Neither curl nor wget is available; cannot check readiness."
+        exit 2
+      fi
+      echo "Not ready (attempt ${ATTEMPT}/${MAX_ATTEMPTS}). Retrying in ${DELAY}s..."
+      ATTEMPT=$((ATTEMPT+1))
+      sleep "$DELAY"
+    done
+    echo "Timed out waiting for ClickHouse after $((MAX_ATTEMPTS*DELAY))s."
+    exit 1

26-27: If keeping wget: remove unused capture, add per-attempt timeouts, and quiet output.

Current code stores response but never uses it, and retries can block per attempt. Minimal tweak:

-        response=$(wget --spider -S "http://${HOST}:${PORT}/ping" 2>&1)
-        exit_code=$?
+        wget -q --spider --timeout=3 --tries=1 "http://${HOST}:${PORT}/ping"
+        exit_code=$?

20-21: Allow env overrides for HOST/PORT (useful in debug pods and sidecars).

Small portability win:

-    HOST="localhost"
-    PORT="8123"
+    HOST="${HOST:-127.0.0.1}"
+    PORT="${PORT:-8123}"
.debug/clickhouse/templates/clickhouse-operator/configmaps/etc-usersd-files.yaml (3)

7-13: Label/version consistency for operator vs app

For an operator component, app.kubernetes.io/version typically matches the operator version (here chop 0.21.2). You’ve set 24.1.2, which reads like a ClickHouse server version. Align or drop if unused by tooling.


19-21: Loopback-only user access may block cluster testing

Restricting testuser to 127.0.0.1 likely prevents access from other pods/nodes. If this is intentional for a unit-test fixture, fine; otherwise broaden networks to the cluster CIDR or service mesh ranges.


46-53: Revisit default_database_engine: prefer Atomic unless you have a constraint

Setting default_database_engine to Ordinary can limit modern features and DDL safety. Unless there’s a specific reason, consider Atomic (the common default in recent ClickHouse).

.debug/clickhouse/charts/zookeeper/templates/scripts-configmap.yaml (1)

16-33: Harden setup.sh with strict mode and clearer errors.

Add bash strict flags and guard file reads for better failure surfacing.

   #!/bin/bash
+  set -Eeuo pipefail
+  shopt -s nocasematch || true
@@
-    if [[ -f "/bitnami/zookeeper/data/myid" ]]; then
-        export ZOO_SERVER_ID="$(cat /bitnami/zookeeper/data/myid)"
+    if [[ -r "/bitnami/zookeeper/data/myid" ]]; then
+        if ! ZOO_SERVER_ID="$(cat /bitnami/zookeeper/data/myid)"; then
+          echo "Failed to read /bitnami/zookeeper/data/myid" >&2
+          exit 1
+        fi
+        export ZOO_SERVER_ID
@@
-        else
-            echo "Failed to get index from hostname $HOSTNAME"
+        else
+            echo "Failed to get ordinal index from hostname: ${HOSTNAME}" >&2
             exit 1
         fi
     fi
     exec /entrypoint.sh /run.sh
.debug/clickhouse/charts/zookeeper/templates/statefulset.yaml (1)

35-55: Minor cleanups: serviceAccount and empty initContainers.

  • serviceAccountName: default — consider making this a chart value to avoid binding to cluster defaults.
  • initContainers: present but empty; drop the key until needed to reduce noise.

If these pods will write to pre-provisioned PVs with restrictive ownership, you may still need a volume-permissions initContainer (chown/chmod) despite fsGroup, especially with existing data. Aligns with the broader PR goal around volume permissions.

.debug/clickhouse/templates/clickhouse-instance/configmap.yaml (1)

14-35: Executable UDF needs binary path, perms, and CH setting enabled.

Ensure:

  • The histogramQuantile binary exists at the resolved working dir or use an absolute path (e.g., /var/lib/clickhouse/user_scripts/histogramQuantile) and mount it.
  • File perms allow execution by user 101:101; if placed on a PV, confirm your volume-permissions initContainer also chowns that path.
  • ClickHouse has allow_executable_user_defined_functions enabled; otherwise this config will be ignored.

I can propose a small values + volumeMounts patch to mount the executable into user_scripts with correct ownership.

.debug/clickhouse/templates/clickhouse-operator/configmaps/etc-templatesd-files.yaml (2)

47-47: Template image version to avoid drift with chart AppVersion.

Hardcoding clickhouse/clickhouse-server:22.3 can confuse users when chart AppVersion is different. Prefer a templated value (e.g., include "clickhouse.image" .) or wire to .Chart.AppVersion/.Values.image.tag.


13-13: Template CHOP version label.

clickhouse.altinity.com/chop is pinned to 0.21.2. Consider templating from .Values.clickhouseOperator.version for consistency with operator images.

charts/clickhouse/templates/_helper.tpl (1)

195-211: Aggregate pull secrets across all images; stabilize output order.

Good call adding volumePermissions pullSecrets and deduping. Two improvements:

  • Include initContainers.init.image.pullSecrets and initContainers.udf.image.pullSecrets so the Pod can pull those images without relying on global/local secrets.
  • Optional: sort the final list for deterministic rendering in tests.

Proposed diff:

 {{- define "clickhouse.imagePullSecrets" -}}
-{{- $global := .Values.global.imagePullSecrets | default (list) -}}
-{{- $local := .Values.imagePullSecrets | default (list) -}}
-{{- $vp := list -}}
+{{- $global := .Values.global.imagePullSecrets | default (list) -}}
+{{- $local := .Values.imagePullSecrets | default (list) -}}
+{{- $vp := list -}}
+{{- $init := list -}}
+{{- $udf := list -}}
 {{- if .Values.volumePermissions -}}
   {{- if .Values.volumePermissions.image -}}
     {{- $vp = .Values.volumePermissions.image.pullSecrets | default (list) -}}
   {{- end -}}
 {{- end -}}
-{{- $all := concat $global $local $vp -}}
-{{- $pulls := uniq $all -}}
+{{- if .Values.initContainers.init.image }}
+  {{- $init = .Values.initContainers.init.image.pullSecrets | default (list) -}}
+{{- end -}}
+{{- if .Values.initContainers.udf.image }}
+  {{- $udf = .Values.initContainers.udf.image.pullSecrets | default (list) -}}
+{{- end -}}
+{{- $all := concat $global $local $vp $init $udf -}}
+{{- $pulls := uniq $all | sortAlpha -}}
 {{- if $pulls }}
 imagePullSecrets:
 {{- range $pulls }}
   - name: {{ . }}
 {{- end }}
 {{- end }}
 {{- end }}
.debug/clickhouse/templates/clickhouse-operator/role.yaml (3)

41-44: Broaden Events verbs for standard recorder behavior.

Operators usually need create, patch (and sometimes update) on events. Consider adding patch (and update if used).

   verbs:
     - create
+    - patch
+    - update

68-75: Add list/watch on ReplicaSets if the operator watches them.

You grant get/patch/update/delete but not list/watch. If any RS informer is used, include list/watch.


17-29: Duplicate secrets rules — simplify.

Secrets are granted twice. Keep one block with combined verbs for clarity.

Also applies to: 139-145

.debug/clickhouse/templates/clickhouse-operator/deployment.yaml (1)

36-37: Harden pod/containers: disallow privilege escalation and run as non-root.

Address CKV_K8S_20/23: add per-container securityContext with allowPrivilegeEscalation: false, runAsNonRoot: true, and readOnlyRootFilesystem: true (if compatible).

         - name: operator
           image: docker.io/altinity/clickhouse-operator:0.21.2
           imagePullPolicy: IfNotPresent
+          securityContext:
+            allowPrivilegeEscalation: false
+            runAsNonRoot: true
+            readOnlyRootFilesystem: true
...
         - name: metrics-exporter
           image: docker.io/altinity/metrics-exporter:0.21.2
           imagePullPolicy: IfNotPresent
+          securityContext:
+            allowPrivilegeEscalation: false
+            runAsNonRoot: true
+            readOnlyRootFilesystem: true

Also applies to: 55-69, 122-136

.debug/clickhouse/templates/clickhouse-instance/clickhouse-instance.yaml (1)

100-121: Init container downloads binaries without integrity verification.

Add checksum verification and stricter shell flags, or pre-bake the binary into an image/ConfigMap.

-              command:
+              command:
                 - sh
                 - -c
                 - |
-                  set -e
+                  set -euo pipefail
                   version="v0.0.1"
                   node_os=$(uname -s | tr '[:upper:]' '[:lower:]')
                   node_arch=$(uname -m | sed s/aarch64/arm64/ | sed s/x86_64/amd64/)
                   echo "Fetching histogram-binary for ${node_os}/${node_arch}"
                   cd /tmp
-                  wget -O histogram-quantile.tar.gz "https://github.com/SigNoz/signoz/releases/download/histogram-quantile%2F${version}/histogram-quantile_${node_os}_${node_arch}.tar.gz"
-                  tar -xzf histogram-quantile.tar.gz
+                  url="https://github.com/SigNoz/signoz/releases/download/histogram-quantile%2F${version}/histogram-quantile_${node_os}_${node_arch}.tar.gz"
+                  wget -O histogram-quantile.tar.gz "${url}"
+                  # Optionally verify checksum: echo "<sha256>  histogram-quantile.tar.gz" | sha256sum -c -
+                  tar -xzf histogram-quantile.tar.gz
                   chmod +x histogram-quantile
                   mv histogram-quantile /var/lib/clickhouse/user_scripts/histogramQuantile
                   echo "histogram-quantile installed successfully"
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e98c0f8 and bdd09dc.

📒 Files selected for processing (34)
  • .debug/clickhouse/charts/zookeeper/templates/NOTES.txt (1 hunks)
  • .debug/clickhouse/charts/zookeeper/templates/configmap.yaml (1 hunks)
  • .debug/clickhouse/charts/zookeeper/templates/extra-list.yaml (1 hunks)
  • .debug/clickhouse/charts/zookeeper/templates/metrics-svc.yaml (1 hunks)
  • .debug/clickhouse/charts/zookeeper/templates/networkpolicy.yaml (1 hunks)
  • .debug/clickhouse/charts/zookeeper/templates/pdb.yaml (1 hunks)
  • .debug/clickhouse/charts/zookeeper/templates/prometheusrule.yaml (1 hunks)
  • .debug/clickhouse/charts/zookeeper/templates/scripts-configmap.yaml (1 hunks)
  • .debug/clickhouse/charts/zookeeper/templates/secrets.yaml (1 hunks)
  • .debug/clickhouse/charts/zookeeper/templates/serviceaccount.yaml (1 hunks)
  • .debug/clickhouse/charts/zookeeper/templates/servicemonitor.yaml (1 hunks)
  • .debug/clickhouse/charts/zookeeper/templates/statefulset.yaml (1 hunks)
  • .debug/clickhouse/charts/zookeeper/templates/svc-headless.yaml (1 hunks)
  • .debug/clickhouse/charts/zookeeper/templates/svc.yaml (1 hunks)
  • .debug/clickhouse/charts/zookeeper/templates/tls-secrets.yaml (1 hunks)
  • .debug/clickhouse/templates/clickhouse-instance/clickhouse-instance.yaml (1 hunks)
  • .debug/clickhouse/templates/clickhouse-instance/configmap.yaml (1 hunks)
  • .debug/clickhouse/templates/clickhouse-instance/scripts-configmap.yaml (1 hunks)
  • .debug/clickhouse/templates/clickhouse-instance/serviceaccount.yaml (1 hunks)
  • .debug/clickhouse/templates/clickhouse-instance/storage-class.yaml (1 hunks)
  • .debug/clickhouse/templates/clickhouse-operator/configmaps/etc-confd-files.yaml (1 hunks)
  • .debug/clickhouse/templates/clickhouse-operator/configmaps/etc-configd-files.yaml (1 hunks)
  • .debug/clickhouse/templates/clickhouse-operator/configmaps/etc-files.yaml (1 hunks)
  • .debug/clickhouse/templates/clickhouse-operator/configmaps/etc-templatesd-files.yaml (1 hunks)
  • .debug/clickhouse/templates/clickhouse-operator/configmaps/etc-usersd-files.yaml (1 hunks)
  • .debug/clickhouse/templates/clickhouse-operator/deployment.yaml (1 hunks)
  • .debug/clickhouse/templates/clickhouse-operator/namespace.yaml (1 hunks)
  • .debug/clickhouse/templates/clickhouse-operator/role.yaml (1 hunks)
  • .debug/clickhouse/templates/clickhouse-operator/rolebinding.yaml (1 hunks)
  • .debug/clickhouse/templates/clickhouse-operator/secret.yaml (1 hunks)
  • .debug/clickhouse/templates/clickhouse-operator/service.yaml (1 hunks)
  • .debug/clickhouse/templates/clickhouse-operator/serviceaccount.yaml (1 hunks)
  • .debug/clickhouse/templates/clickhouse-operator/servicemonitor.yaml (1 hunks)
  • charts/clickhouse/templates/_helper.tpl (2 hunks)
✅ Files skipped from review due to trivial changes (18)
  • .debug/clickhouse/charts/zookeeper/templates/extra-list.yaml
  • .debug/clickhouse/templates/clickhouse-operator/servicemonitor.yaml
  • .debug/clickhouse/charts/zookeeper/templates/prometheusrule.yaml
  • .debug/clickhouse/charts/zookeeper/templates/serviceaccount.yaml
  • .debug/clickhouse/charts/zookeeper/templates/secrets.yaml
  • .debug/clickhouse/templates/clickhouse-operator/service.yaml
  • .debug/clickhouse/charts/zookeeper/templates/pdb.yaml
  • .debug/clickhouse/templates/clickhouse-instance/storage-class.yaml
  • .debug/clickhouse/charts/zookeeper/templates/tls-secrets.yaml
  • .debug/clickhouse/charts/zookeeper/templates/NOTES.txt
  • .debug/clickhouse/templates/clickhouse-operator/secret.yaml
  • .debug/clickhouse/charts/zookeeper/templates/networkpolicy.yaml
  • .debug/clickhouse/charts/zookeeper/templates/svc.yaml
  • .debug/clickhouse/templates/clickhouse-operator/serviceaccount.yaml
  • .debug/clickhouse/charts/zookeeper/templates/servicemonitor.yaml
  • .debug/clickhouse/charts/zookeeper/templates/configmap.yaml
  • .debug/clickhouse/templates/clickhouse-operator/configmaps/etc-confd-files.yaml
  • .debug/clickhouse/templates/clickhouse-operator/configmaps/etc-configd-files.yaml
🧰 Additional context used
🪛 Checkov (3.2.334)
.debug/clickhouse/templates/clickhouse-operator/deployment.yaml

[medium] 1-190: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[medium] 1-190: Minimize the admission of root containers

(CKV_K8S_23)

🔇 Additional comments (12)
.debug/clickhouse/templates/clickhouse-instance/serviceaccount.yaml (3)

2-15: LGTM on adding a dedicated ServiceAccount

Good addition; aligns with separating identity for the ClickHouse instance/init containers.


5-6: Verify name/namespace templating and conditional creation

Confirm the real template uses your fullname helper and honors values.serviceAccount.create/values.serviceAccount.name to support existing SAs and custom names; ensure serviceAccountName in the Pod is wired accordingly.


12-12: Check app.kubernetes.io/version matches the ClickHouse image tag

Please verify "24.1.2" reflects the actual ClickHouse image version used by the chart to keep labels accurate for selectors/ops inventory.

.debug/clickhouse/templates/clickhouse-instance/scripts-configmap.yaml (1)

14-14: Confirm executable permissions or invoke via sh -c when mounting this ConfigMap.

ConfigMap files are 0644 by default; binaries won’t be executable unless you set defaultMode or call via the shell.

  • Ensure volume has an executable mode or invoke explicitly:
volumes:
  - name: scripts
    configMap:
      name: {{ include "clickhouse.fullname" . }}-scripts
      defaultMode: 0555
containers:
  - name: clickhouse
    volumeMounts:
      - name: scripts
        mountPath: /scripts
    command: ["/scripts/wait-until-ready.sh"]
# or
    command: ["sh", "-c", "/scripts/wait-until-ready.sh"]
.debug/clickhouse/templates/clickhouse-operator/configmaps/etc-usersd-files.yaml (1)

36-45: Default profile tweaks LGTM

Settings look reasonable for tests. No blocking issues.

If these are production-facing, confirm they match your performance/SLO targets.

.debug/clickhouse/charts/zookeeper/templates/svc-headless.yaml (1)

12-26: Headless Service wiring looks correct.

Port names align with the StatefulSet; headless + publishNotReadyAddresses is appropriate for ZK STS discovery.

.debug/clickhouse/charts/zookeeper/templates/statefulset.yaml (1)

124-138: Confirm probe utilities exist in the image.

Liveness/readiness use nc and timeout. Ensure signoz/zookeeper:3.7.1 includes these; otherwise probes will fail despite a healthy process. If unsure, switch to zkCli.sh/ruok via built-in scripts or ship a lightweight toolbox.

.debug/clickhouse/templates/clickhouse-operator/namespace.yaml (1)

1-2: No functional change.

Whitespace-only; safe to ignore.

.debug/clickhouse/templates/clickhouse-operator/rolebinding.yaml (1)

15-22: Confirm scope: RoleBinding vs ClusterRoleBinding.

If the operator is configured to watch multiple namespaces or cluster-wide, this should be a ClusterRoleBinding with a ClusterRole. If it’s strictly single-namespace, RoleBinding + Role is correct.

charts/clickhouse/templates/_helper.tpl (1)

133-151: Volume-permissions image helper looks solid.

Handles registry, tag vs digest, and trims slashes consistently. No blockers.

Add a unit test asserting digest form rendering (with and without global.imageRegistry) to lock this in.

.debug/clickhouse/templates/clickhouse-instance/clickhouse-instance.yaml (2)

83-90: Good: pod runs as 101 with fsGroup and OnRootMismatch.

This aligns with ClickHouse UID/GID and mitigates many PV permission issues. Combined with the new initContainer it should satisfy issue #623.


176-185: Include initContainers.enabled when testing volume-permissions
The volume-permissions initContainer is gated by both initContainers.enabled and volumePermissions.enabled (plus persistence or templates). Be sure to render with

--set initContainers.enabled=true,volumePermissions.enabled=true

across your scenarios.

Comment on lines +20 to +22
- name: tcp-metrics
port: 9141
targetPort: metrics
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Service targets a non-existent pod port name 'metrics'.

StatefulSet exposes ports named client/follower/election only. targetPort: metrics won’t resolve, so scraping on 9141 will fail. Align one of:

  • Add a metrics endpoint (sidecar or container) and expose a containerPort named metrics: 9141.
  • Or change Service to target an existing port, or drop the metrics Service until an exporter is added.
🤖 Prompt for AI Agents
.debug/clickhouse/charts/zookeeper/templates/metrics-svc.yaml lines 20-22: the
Service is targeting a non-existent container port name "metrics" (StatefulSet
only exposes client/follower/election) so targetPort: metrics will not resolve;
either add a metrics exporter container/sidecar and declare a containerPort:
9141 with name: metrics in the Pod/StatefulSet, or update this Service to use
one of the existing port names (client/follower/election) or remove the metrics
Service entirely until an exporter is added—make the change so targetPort
references a real containerPort name or numeric port.

Comment on lines +116 to +122
ports:
- name: client
containerPort: 2181
- name: follower
containerPort: 2888
- name: election
containerPort: 3888
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Expose a 'metrics' port or adjust/remove the metrics Service.

Pods don’t declare a containerPort named metrics, while metrics-svc targets it. Add the port (and an actual metrics endpoint/exporter), or update/remove the Service to avoid a dead endpoint.

           ports:
             - name: client
               containerPort: 2181
             - name: follower
               containerPort: 2888
             - name: election
               containerPort: 3888
+            # If a metrics exporter is present and listens on 9141:
+            # - name: metrics
+            #   containerPort: 9141
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
ports:
- name: client
containerPort: 2181
- name: follower
containerPort: 2888
- name: election
containerPort: 3888
ports:
- name: client
containerPort: 2181
- name: follower
containerPort: 2888
- name: election
containerPort: 3888
# If a metrics exporter is present and listens on 9141:
# - name: metrics
# containerPort: 9141
🤖 Prompt for AI Agents
In .debug/clickhouse/charts/zookeeper/templates/statefulset.yaml around lines
116-122 the Pod spec does not expose a containerPort named "metrics" while the
metrics Service targets that port; add a container port entry named "metrics"
(e.g., containerPort: <port-number>) and ensure the container includes a metrics
exporter or endpoint bound to that port, or alternatively remove or update the
metrics Service to point to an existing port/name; make the change so Service
targetPort/name matches an actual containerPort and verify the exporter is
configured and listening.

Comment on lines +15 to +34
01-clickhouse-user.xml: |
<yandex>
<users>
<testuser>
<networks>
<ip>127.0.0.1</ip>
</networks>
<password>${CLICKHOUSE_PASSWORD}</password>
<profile>default</profile>
<quota>default</quota>
</testuser>
</users>
<profiles>
<testuser>
<log_queries>0</log_queries>
<skip_unavailable_shards>1</skip_unavailable_shards>
<http_connection_timeout>10</http_connection_timeout>
</testuser>
</profiles>
</yandex>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid passwords in ConfigMaps; fix profile mismatch

  • Storing ${CLICKHOUSE_PASSWORD} in a ConfigMap is a security smell. Prefer a Secret and hashed value (password_sha256_hex), or the operator’s supported Secret mechanism if available.
  • User references default while you also define a separate …. Either switch the user’s profile to testuser or drop the unused profile block to avoid confusion.

Minimal corrective diff for the profile mismatch (if you keep this file locally for tests):

-          <profile>default</profile>
+          <profile>testuser</profile>
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
01-clickhouse-user.xml: |
<yandex>
<users>
<testuser>
<networks>
<ip>127.0.0.1</ip>
</networks>
<password>${CLICKHOUSE_PASSWORD}</password>
<profile>default</profile>
<quota>default</quota>
</testuser>
</users>
<profiles>
<testuser>
<log_queries>0</log_queries>
<skip_unavailable_shards>1</skip_unavailable_shards>
<http_connection_timeout>10</http_connection_timeout>
</testuser>
</profiles>
</yandex>
01-clickhouse-user.xml: |
<yandex>
<users>
<testuser>
<networks>
<ip>127.0.0.1</ip>
</networks>
<password>${CLICKHOUSE_PASSWORD}</password>
<profile>testuser</profile>
<quota>default</quota>
</testuser>
</users>
<profiles>
<testuser>
<log_queries>0</log_queries>
<skip_unavailable_shards>1</skip_unavailable_shards>
<http_connection_timeout>10</http_connection_timeout>
</testuser>
</profiles>
</yandex>
🤖 Prompt for AI Agents
In
.debug/clickhouse/templates/clickhouse-operator/configmaps/etc-usersd-files.yaml
around lines 15-34, the ConfigMap contains a plaintext
<password>${CLICKHOUSE_PASSWORD}</password> (security risk) and the user
declares <profile>default</profile> while a <profiles><testuser>…</testuser>
block exists (mismatch); fix by moving the password into a Kubernetes Secret or
the operator-supported secret mechanism and store a hashed password
(password_sha256_hex) instead of plaintext in the ConfigMap, and resolve the
profile mismatch by either changing the user's <profile> to testuser or removing
the unused testuser profile block so profiles align.

Comment on lines +119 to +121
resources:
null

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

resources: null will fail validation — omit when unset.

Render resources only when values are provided, or use {}. In Helm, guard with with to avoid emitting null.

-          resources:
-            null
+{{- /* In template: only render when non-empty */ -}}
+{{- with .Values.clickhouseOperator.resources }}
+          resources:
+{{ toYaml . | nindent 12 }}
+{{- end }}

Repeat for metrics-exporter.

Also applies to: 189-190

🤖 Prompt for AI Agents
In .debug/clickhouse/templates/clickhouse-operator/deployment.yaml around lines
119-121 (and similarly for metrics-exporter at lines 189-190), the template
renders "resources: null" which fails validation; change the template to
conditionally render the resources block only when resource values are provided
(use a Helm "with" or "if" guard) or render an empty map "{}" instead of null;
update both locations so that resources are omitted when unset or output as {}
to satisfy validation.

Comment on lines +166 to +185
- name: OPERATOR_CONTAINER_CPU_REQUEST
valueFrom:
resourceFieldRef:
containerName: operator
resource: requests.cpu
- name: OPERATOR_CONTAINER_CPU_LIMIT
valueFrom:
resourceFieldRef:
containerName: operator
resource: limits.cpu
- name: OPERATOR_CONTAINER_MEM_REQUEST
valueFrom:
resourceFieldRef:
containerName: operator
resource: requests.memory
- name: OPERATOR_CONTAINER_MEM_LIMIT
valueFrom:
resourceFieldRef:
containerName: operator
resource: limits.memory
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Downward API env in metrics-exporter references the operator’s resources — use self container.

resourceFieldRef.containerName should refer to the same container. Point these four env vars at metrics-exporter.

-  containerName: operator
+  containerName: metrics-exporter

Apply to CPU request/limit and memory request/limit.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- name: OPERATOR_CONTAINER_CPU_REQUEST
valueFrom:
resourceFieldRef:
containerName: operator
resource: requests.cpu
- name: OPERATOR_CONTAINER_CPU_LIMIT
valueFrom:
resourceFieldRef:
containerName: operator
resource: limits.cpu
- name: OPERATOR_CONTAINER_MEM_REQUEST
valueFrom:
resourceFieldRef:
containerName: operator
resource: requests.memory
- name: OPERATOR_CONTAINER_MEM_LIMIT
valueFrom:
resourceFieldRef:
containerName: operator
resource: limits.memory
- name: OPERATOR_CONTAINER_CPU_REQUEST
valueFrom:
resourceFieldRef:
containerName: metrics-exporter
resource: requests.cpu
- name: OPERATOR_CONTAINER_CPU_LIMIT
valueFrom:
resourceFieldRef:
containerName: metrics-exporter
resource: limits.cpu
- name: OPERATOR_CONTAINER_MEM_REQUEST
valueFrom:
resourceFieldRef:
containerName: metrics-exporter
resource: requests.memory
- name: OPERATOR_CONTAINER_MEM_LIMIT
valueFrom:
resourceFieldRef:
containerName: metrics-exporter
resource: limits.memory
🤖 Prompt for AI Agents
In .debug/clickhouse/templates/clickhouse-operator/deployment.yaml around lines
166 to 185 the Downward API env vars for OPERATOR_CONTAINER_CPU_REQUEST,
OPERATOR_CONTAINER_CPU_LIMIT, OPERATOR_CONTAINER_MEM_REQUEST and
OPERATOR_CONTAINER_MEM_LIMIT incorrectly reference containerName: operator;
update each resourceFieldRef.containerName to the metrics-exporter container
(containerName: metrics-exporter) so the env vars pull that container’s
requests.cpu, limits.cpu, requests.memory and limits.memory respectively.

Comment on lines +1 to +5
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: RELEASE-NAME-clickhouse-operator
namespace: NAMESPACE
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Cluster-scoped resources in a namespaced Role — switch to ClusterRole (and bind accordingly).

This Role includes cluster-scoped resources (persistentvolumes, customresourcedefinitions). A namespaced Role cannot grant these. Use a ClusterRole and a ClusterRoleBinding, or drop those resources from this Role.

 apiVersion: rbac.authorization.k8s.io/v1
-kind: Role
+kind: ClusterRole
 metadata:
-  namespace: NAMESPACE
   name: RELEASE-NAME-clickhouse-operator

Also ensure you use a ClusterRoleBinding targeting the operator ServiceAccount.

Also applies to: 45-55, 146-151

Comment on lines +129 to +137
- clickhouseinstallations/status
- clickhouseinstallationtemplates/status
- clickhouseoperatorconfigurations/status
verbs:
- get
- update
- patch
- create
- delete
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Remove unsupported verbs on status subresources.

Status subresources typically allow get/patch/update only. Drop create/delete on …/status to avoid invalid RBAC.

   verbs:
-    - get
-    - update
-    - patch
-    - create
-    - delete
+    - get
+    - update
+    - patch
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- clickhouseinstallations/status
- clickhouseinstallationtemplates/status
- clickhouseoperatorconfigurations/status
verbs:
- get
- update
- patch
- create
- delete
- clickhouseinstallations/status
- clickhouseinstallationtemplates/status
- clickhouseoperatorconfigurations/status
verbs:
- get
- update
- patch
🤖 Prompt for AI Agents
.debug/clickhouse/templates/clickhouse-operator/role.yaml lines 129-137: the
RBAC rule includes unsupported verbs for status subresources; remove the create
and delete verbs so the verbs list for the clickhouseinstallations/status,
clickhouseinstallationtemplates/status, and
clickhouseoperatorconfigurations/status entries contains only get, update, and
patch. Ensure no other rules grant create/delete to these /*/status resources
and save the file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Simple install fails: create_directories: Permission denied

1 participant