-
Notifications
You must be signed in to change notification settings - Fork 129
feat(clickhouse): add volume permissions support #752
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Add volumePermissions init container to fix permission denied errors when ClickHouse creates directories in persistent volumes. Fixes SigNoz#623
WalkthroughAdds a volume-permissions initContainer and supporting image helper, Helm unit tests, values for volumePermissions and probes, a manual testing guide, and a secret namespace lookup change in the ClickHouse chart. The initContainer conditionally fixes PV ownership before ClickHouse starts when enabled. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant U as User (Helm values)
participant H as Helm renderer
participant K as Kubernetes API
participant P as Pod (init phase)
participant I as InitContainer: volume-permissions
participant C as ClickHouse main container
U->>H: enable volumePermissions + set persistence/template values
H->>K: apply rendered Pod manifest (includes initContainer if enabled)
K-->>P: create Pod
alt volumePermissions enabled & volume mounted
P->>I: start volume-permissions initContainer
I->>I: mkdir -p /var/lib/clickhouse && chown -R <user>:<group> /var/lib/clickhouse
I-->>P: exit 0
else
Note over P: no volume-permissions initContainer
end
P->>C: start ClickHouse main container
C-->>K: probes/report ready
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing touches🧪 Generate unit tests
Tip 👮 Agentic pre-merge checks are now available in preview!Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
Please see the documentation for more information. Example: reviews:
pre_merge_checks:
custom_checks:
- name: "Undocumented Breaking Changes"
mode: "warning"
instructions: |
Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).Please share your feedback with us on this Discord post. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (7)
charts/clickhouse/templates/_helper.tpl (1)
129-150: Solid image helper; wire pullSecrets support for init imageHelper handles registry/tag/digest well. One gap:
volumePermissions.image.pullSecretsis defined in values but never used, so private registries for this init image won’t work unless users rely on global/image-level pull secrets.
- Either remove
volumePermissions.image.pullSecretsfrom values, or (better) include it in the podimagePullSecrets.Example (outside this hunk) to extend
clickhouse.imagePullSecrets:{{- define "clickhouse.imagePullSecrets" -}} {{- if or .Values.global.imagePullSecrets .Values.imagePullSecrets .Values.volumePermissions.image.pullSecrets }} imagePullSecrets: {{- range .Values.global.imagePullSecrets }} - name: {{ . }} {{- end }} {{- range .Values.imagePullSecrets }} - name: {{ . }} {{- end }} {{- range .Values.volumePermissions.image.pullSecrets }} - name: {{ . }} {{- end }} {{- end }} {{- end }}charts/clickhouse/MANUAL_TESTING.md (2)
6-9: Add fenced code language to satisfy markdownlint MD040Annotate the error snippet as plain text.
-``` +```text filesystem error: in create_directories: Permission denied ["/var/lib/clickhouse/preprocessed_configs"] -``` +```
80-86: Add fenced code language to satisfy markdownlint MD040Annotate the “Expected Output” snippet.
-``` +```console + mkdir -p /var/lib/clickhouse + chown -R 101:101 /var/lib/clickhouse + find /var/lib/clickhouse -mindepth 1 -maxdepth 1 -not -name .snapshot -not -name lost+found + xargs -r chown -R 101:101 -``` +```charts/clickhouse/values.yaml (1)
231-266: Expose is fine; but pullSecrets here are currently unused
volumePermissions.image.pullSecretsis not consumed. Either document that global/image-level pull secrets should be used or wire these into the podimagePullSecrets(preferred; see helper suggestion in _helper.tpl comment).charts/clickhouse/templates/clickhouse-instance/clickhouse-instance.yaml (3)
196-205: Avoid emitting empty imagePullPolicyCurrent template can render
imagePullPolicy: ""if a user unsets the value, which is invalid. Make it conditional (and align with other containers).- imagePullPolicy: {{ default "" .Values.volumePermissions.image.pullPolicy | quote }} + {{- if .Values.volumePermissions.image.pullPolicy }} + imagePullPolicy: {{ .Values.volumePermissions.image.pullPolicy }} + {{- end }}
214-231: Unreachable branch for emptyDir mount in this initContainerThis initContainer only renders when persistence is enabled or
templates.volumeClaimTemplatesis set. Thedata-volumeemptyDir mount underand (not .Values.templates.volumeClaimTemplates) (not .Values.persistence.enabled)cannot be reached. Safe to delete for clarity.
203-208: More robust chown for odd filenamesTo be resilient to unusual names, consider null-delimited piping:
- find /var/lib/clickhouse -mindepth 1 -maxdepth 1 -not -name ".snapshot" -not -name "lost+found" | xargs -r chown -R {{ .Values.securityContext.runAsUser }}:{{ .Values.securityContext.fsGroup }} + find /var/lib/clickhouse -mindepth 1 -maxdepth 1 -not -name ".snapshot" -not -name "lost+found" -print0 | xargs -0 -r chown -R {{ .Values.securityContext.runAsUser }}:{{ .Values.securityContext.fsGroup }}Note: update tests accordingly if you adopt this.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
charts/clickhouse/MANUAL_TESTING.md(1 hunks)charts/clickhouse/templates/_helper.tpl(1 hunks)charts/clickhouse/templates/clickhouse-instance/clickhouse-instance.yaml(2 hunks)charts/clickhouse/tests/clickhouse-instance_volume_permissions_test.yaml(1 hunks)charts/clickhouse/values.yaml(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-12-16T16:15:01.460Z
Learnt from: prashant-shahi
PR: SigNoz/charts#565
File: charts/clickhouse/values.yaml:261-264
Timestamp: 2024-12-16T16:15:01.460Z
Learning: In `charts/clickhouse/values.yaml`, the `customStartupProbe`, `customLivenessProbe`, and `customReadinessProbe` configurations are provided for advanced users to specify custom probe types such as HTTP, Exec, TCP, and gRPC.
Applied to files:
charts/clickhouse/values.yaml
🪛 markdownlint-cli2 (0.18.1)
charts/clickhouse/MANUAL_TESTING.md
6-6: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
80-80: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (1)
charts/clickhouse/tests/clickhouse-instance_volume_permissions_test.yaml (1)
9-20: Tests read well and cover core conditionsGood coverage for enable/disable matrix and volume mount variants. Nice digest/tag assertions too.
Consider adding one small assertion to catch accidental typos in resource headers (e.g.,
apiVersionpresence) so regressions like a prefixed character get flagged early.
charts/clickhouse/templates/clickhouse-instance/clickhouse-instance.yaml
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
charts/clickhouse/templates/_helper.tpl (2)
129-150: Harden image string construction: trim slashes to avoid//repoand/@sha256edge casesGood addition and digest handling. To make it more robust against user-supplied values with trailing/leading slashes on registry/repo, trim them before composition.
Apply this diff:
{{- define "clickhouse.volumePermissions.image" -}} -{{- $registryName := default .Values.volumePermissions.image.registry .Values.global.imageRegistry -}} -{{- $repositoryName := .Values.volumePermissions.image.repository -}} -{{- $tag := .Values.volumePermissions.image.tag | toString -}} -{{- $digest := .Values.volumePermissions.image.digest | toString -}} -{{- if $registryName -}} +{{- $registryName := default .Values.volumePermissions.image.registry .Values.global.imageRegistry -}} +{{- $repositoryName := .Values.volumePermissions.image.repository -}} +{{- $tag := .Values.volumePermissions.image.tag | toString -}} +{{- $digest := .Values.volumePermissions.image.digest | toString -}} +{{- $reg := $registryName | trimSuffix "/" -}} +{{- $repo := $repositoryName | trimPrefix "/" -}} +{{- if $registryName -}} {{- if $digest -}} - {{- printf "%s/%s@%s" $registryName $repositoryName $digest -}} + {{- printf "%s/%s@%s" $reg $repo $digest -}} {{- else -}} - {{- printf "%s/%s:%s" $registryName $repositoryName $tag -}} + {{- printf "%s/%s:%s" $reg $repo $tag -}} {{- end -}} {{- else -}} {{- if $digest -}} - {{- printf "%s@%s" $repositoryName $digest -}} + {{- printf "%s@%s" $repo $digest -}} {{- else -}} - {{- printf "%s:%s" $repositoryName $tag -}} + {{- printf "%s:%s" $repo $tag -}} {{- end -}} {{- end -}} {{- end -}}Also consider aligning digest support across other image helpers for consistency (follow-up).
192-205: Safer, deduped imagePullSecrets; avoid nil-path pitfalls and duplicatesAccessing
.Values.volumePermissions.image.pullSecretsin theorcan error if intermediate maps are absent, and duplicates are possible across lists. Usedig, merge, anduniq.Apply this diff:
{{- define "clickhouse.imagePullSecrets" -}} -{{- if or .Values.global.imagePullSecrets .Values.imagePullSecrets .Values.volumePermissions.image.pullSecrets }} -imagePullSecrets: -{{- range .Values.global.imagePullSecrets }} - - name: {{ . }} -{{- end }} -{{- range .Values.imagePullSecrets }} - - name: {{ . }} -{{- end }} -{{- range .Values.volumePermissions.image.pullSecrets }} - - name: {{ . }} -{{- end }} -{{- end }} +{{- $global := .Values.global.imagePullSecrets | default (list) -}} +{{- $local := .Values.imagePullSecrets | default (list) -}} +{{- $vp := dig "volumePermissions" "image" "pullSecrets" .Values | default (list) -}} +{{- $all := concat $global $local $vp -}} +{{- $pulls := uniq $all -}} +{{- if $pulls }} +imagePullSecrets: +{{- range $pulls }} + - name: {{ . }} +{{- end }} +{{- end }} {{- end }}charts/clickhouse/templates/clickhouse-operator/secret.yaml (1)
6-6: Useclickhouse.namespacehelper in secret.yaml- namespace: {{ .Values.namespace | default .Release.Namespace }} + namespace: {{ include "clickhouse.namespace" . }}The helper in
_helper.tpl(default .Release.Namespace .Values.namespace) produces identical output, so there’s no upgrade impact and this change is purely for template consistency.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
charts/clickhouse/templates/_helper.tpl(2 hunks)charts/clickhouse/templates/clickhouse-instance/clickhouse-instance.yaml(1 hunks)charts/clickhouse/templates/clickhouse-operator/secret.yaml(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- charts/clickhouse/templates/clickhouse-instance/clickhouse-instance.yaml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.debug/clickhouse/templates/clickhouse-operator/configmaps/etc-usersd-files.yaml (1)
1-54: Do not commit .debug artifacts to the repoThis looks like a Helm unittest/debug render. Please remove this file and ignore the entire .debug/ tree.
Suggested follow-ups:
- Delete this file from the PR.
- Add to .gitignore:
+.debug/
🧹 Nitpick comments (26)
.debug/clickhouse/templates/clickhouse-instance/serviceaccount.yaml (2)
4-15: Disable token automount by default for least privilegeUnless the pod needs K8s API access, set
automountServiceAccountToken: falseon the ServiceAccount (and/or pod) to avoid mounting tokens into all containers, including initContainers.kind: ServiceAccount metadata: name: RELEASE-NAME-clickhouse namespace: NAMESPACE labels: @@ annotations: - {} + {} +automountServiceAccountToken: false
14-15: Wire annotations from values instead of a hard-coded empty mapExpose
.Values.serviceAccount.annotationsso operators can add IAM, PSP/PSS, or workload identity annotations when needed.- annotations: - {} + annotations: +{{- with .Values.serviceAccount.annotations }} +{{ toYaml . | nindent 4 }} +{{- end }}.debug/clickhouse/templates/clickhouse-operator/configmaps/etc-files.yaml (7)
3-6: Debug artifact or real manifest? Hardcoded name/namespace.Metadata uses placeholders (RELEASE-NAME, NAMESPACE). If this is only a rendered debug/snapshot, ensure the .debug directory is excluded from chart packaging (via .helmignore). If it’s meant to be applied, wire these to Helm templates instead.
7-13: Version drift risk in labels.Static labels (helm.sh/chart: clickhouse-24.1.18, app.kubernetes.io/version: "24.1.2", chop: 0.21.2) will rot. If this file is ever applied, derive them from chart/app/operator versions; otherwise keep it strictly as debug output and exclude from packages.
64-76: Avoid “password: default” in user template defaults.Even if comments say ‘default’ user doesn’t use this field, shipping a template default of "default" can lead to unintended weak credentials if reused. Prefer omitting the password field here or set it to empty and require explicit Secrets.
145-151: Secret lookup: confirm name/namespace behavior.
namespace: ""means “operator’s namespace,” whilenameis hardcoded to "RELEASE-NAME-clickhouse-operator". Verify this matches the actual Secret created by the chart after your recent “secret namespace lookup change,” or wire it to the same helper used elsewhere to avoid mismatches.
268-270: No StatefulSet revision history.
revisionHistoryLimit: 0hampers rollback/debugging. Recommend a small history (e.g., 5–10).
281-282: Termination grace may be too low for ClickHouse.30s can be tight for graceful shutdown and flushes under load. Consider 120–300s unless CHI manifests override this reliably.
152-160: Operator→CH timeouts are aggressive.
connect: 2s,query: 5s, and metricscollect: 9smay cause spurious failures on busy clusters or slow networks. Recommend connect 5–10s, query 20–30s, metrics 20–30s, unless you have evidence these values are safe..debug/clickhouse/templates/clickhouse-instance/scripts-configmap.yaml (3)
15-36: Make the script POSIX, add retry/timeout, and handle signals (avoid indefinite hangs).This can block forever if ClickHouse never comes up, and it assumes bash+wget availability. Propose a portable version with curl/wget fallback, env‑overridable HOST/PORT/DELAY, capped retries, and SIGTERM handling.
- #!/bin/bash - - # Delay between attempts in seconds - DELAY=10 - # Default host and port - HOST="localhost" - PORT="8123" - - echo "Waiting for Clickhouse to become ready..." - - while true; do - response=$(wget --spider -S "http://${HOST}:${PORT}/ping" 2>&1) - exit_code=$? - - if [ $exit_code -eq 0 ]; then - echo "Clickhouse is ready!" - exit 0 - else - echo "Clickhouse is not ready yet. Retrying in ${DELAY} seconds..." - sleep $DELAY - fi - done + #!/bin/sh + set -eu + + # Defaults (overridable via env) + HOST="${HOST:-127.0.0.1}" + PORT="${PORT:-8123}" + DELAY="${DELAY:-10}" # seconds between attempts + MAX_ATTEMPTS="${MAX_ATTEMPTS:-60}" # ~10 minutes by default + + trap 'echo "wait-until-ready: terminated"; exit 1' INT TERM + + has_cmd() { command -v "$1" >/dev/null 2>&1; } + + echo "Waiting for ClickHouse at http://${HOST}:${PORT}/ping ..." + ATTEMPT=1 + while [ "$ATTEMPT" -le "$MAX_ATTEMPTS" ]; do + if has_cmd curl; then + if curl -fsS -m 3 "http://${HOST}:${PORT}/ping" >/dev/null; then + echo "ClickHouse is ready!" + exit 0 + fi + elif has_cmd wget; then + if wget -qO- --timeout=3 --tries=1 "http://${HOST}:${PORT}/ping" >/dev/null; then + echo "ClickHouse is ready!" + exit 0 + fi + else + echo "Neither curl nor wget is available; cannot check readiness." + exit 2 + fi + echo "Not ready (attempt ${ATTEMPT}/${MAX_ATTEMPTS}). Retrying in ${DELAY}s..." + ATTEMPT=$((ATTEMPT+1)) + sleep "$DELAY" + done + echo "Timed out waiting for ClickHouse after $((MAX_ATTEMPTS*DELAY))s." + exit 1
26-27: If keeping wget: remove unused capture, add per-attempt timeouts, and quiet output.Current code stores response but never uses it, and retries can block per attempt. Minimal tweak:
- response=$(wget --spider -S "http://${HOST}:${PORT}/ping" 2>&1) - exit_code=$? + wget -q --spider --timeout=3 --tries=1 "http://${HOST}:${PORT}/ping" + exit_code=$?
20-21: Allow env overrides for HOST/PORT (useful in debug pods and sidecars).Small portability win:
- HOST="localhost" - PORT="8123" + HOST="${HOST:-127.0.0.1}" + PORT="${PORT:-8123}".debug/clickhouse/templates/clickhouse-operator/configmaps/etc-usersd-files.yaml (3)
7-13: Label/version consistency for operator vs appFor an operator component, app.kubernetes.io/version typically matches the operator version (here chop 0.21.2). You’ve set 24.1.2, which reads like a ClickHouse server version. Align or drop if unused by tooling.
19-21: Loopback-only user access may block cluster testingRestricting testuser to 127.0.0.1 likely prevents access from other pods/nodes. If this is intentional for a unit-test fixture, fine; otherwise broaden networks to the cluster CIDR or service mesh ranges.
46-53: Revisit default_database_engine: prefer Atomic unless you have a constraintSetting default_database_engine to Ordinary can limit modern features and DDL safety. Unless there’s a specific reason, consider Atomic (the common default in recent ClickHouse).
.debug/clickhouse/charts/zookeeper/templates/scripts-configmap.yaml (1)
16-33: Harden setup.sh with strict mode and clearer errors.Add bash strict flags and guard file reads for better failure surfacing.
#!/bin/bash + set -Eeuo pipefail + shopt -s nocasematch || true @@ - if [[ -f "/bitnami/zookeeper/data/myid" ]]; then - export ZOO_SERVER_ID="$(cat /bitnami/zookeeper/data/myid)" + if [[ -r "/bitnami/zookeeper/data/myid" ]]; then + if ! ZOO_SERVER_ID="$(cat /bitnami/zookeeper/data/myid)"; then + echo "Failed to read /bitnami/zookeeper/data/myid" >&2 + exit 1 + fi + export ZOO_SERVER_ID @@ - else - echo "Failed to get index from hostname $HOSTNAME" + else + echo "Failed to get ordinal index from hostname: ${HOSTNAME}" >&2 exit 1 fi fi exec /entrypoint.sh /run.sh.debug/clickhouse/charts/zookeeper/templates/statefulset.yaml (1)
35-55: Minor cleanups: serviceAccount and empty initContainers.
- serviceAccountName: default — consider making this a chart value to avoid binding to cluster defaults.
- initContainers: present but empty; drop the key until needed to reduce noise.
If these pods will write to pre-provisioned PVs with restrictive ownership, you may still need a volume-permissions initContainer (chown/chmod) despite fsGroup, especially with existing data. Aligns with the broader PR goal around volume permissions.
.debug/clickhouse/templates/clickhouse-instance/configmap.yaml (1)
14-35: Executable UDF needs binary path, perms, and CH setting enabled.Ensure:
- The histogramQuantile binary exists at the resolved working dir or use an absolute path (e.g., /var/lib/clickhouse/user_scripts/histogramQuantile) and mount it.
- File perms allow execution by user 101:101; if placed on a PV, confirm your volume-permissions initContainer also chowns that path.
- ClickHouse has allow_executable_user_defined_functions enabled; otherwise this config will be ignored.
I can propose a small values + volumeMounts patch to mount the executable into user_scripts with correct ownership.
.debug/clickhouse/templates/clickhouse-operator/configmaps/etc-templatesd-files.yaml (2)
47-47: Template image version to avoid drift with chart AppVersion.Hardcoding clickhouse/clickhouse-server:22.3 can confuse users when chart AppVersion is different. Prefer a templated value (e.g., include "clickhouse.image" .) or wire to .Chart.AppVersion/.Values.image.tag.
13-13: Template CHOP version label.clickhouse.altinity.com/chop is pinned to 0.21.2. Consider templating from .Values.clickhouseOperator.version for consistency with operator images.
charts/clickhouse/templates/_helper.tpl (1)
195-211: Aggregate pull secrets across all images; stabilize output order.Good call adding volumePermissions pullSecrets and deduping. Two improvements:
- Include initContainers.init.image.pullSecrets and initContainers.udf.image.pullSecrets so the Pod can pull those images without relying on global/local secrets.
- Optional: sort the final list for deterministic rendering in tests.
Proposed diff:
{{- define "clickhouse.imagePullSecrets" -}} -{{- $global := .Values.global.imagePullSecrets | default (list) -}} -{{- $local := .Values.imagePullSecrets | default (list) -}} -{{- $vp := list -}} +{{- $global := .Values.global.imagePullSecrets | default (list) -}} +{{- $local := .Values.imagePullSecrets | default (list) -}} +{{- $vp := list -}} +{{- $init := list -}} +{{- $udf := list -}} {{- if .Values.volumePermissions -}} {{- if .Values.volumePermissions.image -}} {{- $vp = .Values.volumePermissions.image.pullSecrets | default (list) -}} {{- end -}} {{- end -}} -{{- $all := concat $global $local $vp -}} -{{- $pulls := uniq $all -}} +{{- if .Values.initContainers.init.image }} + {{- $init = .Values.initContainers.init.image.pullSecrets | default (list) -}} +{{- end -}} +{{- if .Values.initContainers.udf.image }} + {{- $udf = .Values.initContainers.udf.image.pullSecrets | default (list) -}} +{{- end -}} +{{- $all := concat $global $local $vp $init $udf -}} +{{- $pulls := uniq $all | sortAlpha -}} {{- if $pulls }} imagePullSecrets: {{- range $pulls }} - name: {{ . }} {{- end }} {{- end }} {{- end }}.debug/clickhouse/templates/clickhouse-operator/role.yaml (3)
41-44: Broaden Events verbs for standard recorder behavior.Operators usually need create, patch (and sometimes update) on events. Consider adding patch (and update if used).
verbs: - create + - patch + - update
68-75: Add list/watch on ReplicaSets if the operator watches them.You grant get/patch/update/delete but not list/watch. If any RS informer is used, include list/watch.
17-29: Duplicate secrets rules — simplify.Secrets are granted twice. Keep one block with combined verbs for clarity.
Also applies to: 139-145
.debug/clickhouse/templates/clickhouse-operator/deployment.yaml (1)
36-37: Harden pod/containers: disallow privilege escalation and run as non-root.Address CKV_K8S_20/23: add per-container securityContext with
allowPrivilegeEscalation: false,runAsNonRoot: true, andreadOnlyRootFilesystem: true(if compatible).- name: operator image: docker.io/altinity/clickhouse-operator:0.21.2 imagePullPolicy: IfNotPresent + securityContext: + allowPrivilegeEscalation: false + runAsNonRoot: true + readOnlyRootFilesystem: true ... - name: metrics-exporter image: docker.io/altinity/metrics-exporter:0.21.2 imagePullPolicy: IfNotPresent + securityContext: + allowPrivilegeEscalation: false + runAsNonRoot: true + readOnlyRootFilesystem: trueAlso applies to: 55-69, 122-136
.debug/clickhouse/templates/clickhouse-instance/clickhouse-instance.yaml (1)
100-121: Init container downloads binaries without integrity verification.Add checksum verification and stricter shell flags, or pre-bake the binary into an image/ConfigMap.
- command: + command: - sh - -c - | - set -e + set -euo pipefail version="v0.0.1" node_os=$(uname -s | tr '[:upper:]' '[:lower:]') node_arch=$(uname -m | sed s/aarch64/arm64/ | sed s/x86_64/amd64/) echo "Fetching histogram-binary for ${node_os}/${node_arch}" cd /tmp - wget -O histogram-quantile.tar.gz "https://github.com/SigNoz/signoz/releases/download/histogram-quantile%2F${version}/histogram-quantile_${node_os}_${node_arch}.tar.gz" - tar -xzf histogram-quantile.tar.gz + url="https://github.com/SigNoz/signoz/releases/download/histogram-quantile%2F${version}/histogram-quantile_${node_os}_${node_arch}.tar.gz" + wget -O histogram-quantile.tar.gz "${url}" + # Optionally verify checksum: echo "<sha256> histogram-quantile.tar.gz" | sha256sum -c - + tar -xzf histogram-quantile.tar.gz chmod +x histogram-quantile mv histogram-quantile /var/lib/clickhouse/user_scripts/histogramQuantile echo "histogram-quantile installed successfully"
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (34)
.debug/clickhouse/charts/zookeeper/templates/NOTES.txt(1 hunks).debug/clickhouse/charts/zookeeper/templates/configmap.yaml(1 hunks).debug/clickhouse/charts/zookeeper/templates/extra-list.yaml(1 hunks).debug/clickhouse/charts/zookeeper/templates/metrics-svc.yaml(1 hunks).debug/clickhouse/charts/zookeeper/templates/networkpolicy.yaml(1 hunks).debug/clickhouse/charts/zookeeper/templates/pdb.yaml(1 hunks).debug/clickhouse/charts/zookeeper/templates/prometheusrule.yaml(1 hunks).debug/clickhouse/charts/zookeeper/templates/scripts-configmap.yaml(1 hunks).debug/clickhouse/charts/zookeeper/templates/secrets.yaml(1 hunks).debug/clickhouse/charts/zookeeper/templates/serviceaccount.yaml(1 hunks).debug/clickhouse/charts/zookeeper/templates/servicemonitor.yaml(1 hunks).debug/clickhouse/charts/zookeeper/templates/statefulset.yaml(1 hunks).debug/clickhouse/charts/zookeeper/templates/svc-headless.yaml(1 hunks).debug/clickhouse/charts/zookeeper/templates/svc.yaml(1 hunks).debug/clickhouse/charts/zookeeper/templates/tls-secrets.yaml(1 hunks).debug/clickhouse/templates/clickhouse-instance/clickhouse-instance.yaml(1 hunks).debug/clickhouse/templates/clickhouse-instance/configmap.yaml(1 hunks).debug/clickhouse/templates/clickhouse-instance/scripts-configmap.yaml(1 hunks).debug/clickhouse/templates/clickhouse-instance/serviceaccount.yaml(1 hunks).debug/clickhouse/templates/clickhouse-instance/storage-class.yaml(1 hunks).debug/clickhouse/templates/clickhouse-operator/configmaps/etc-confd-files.yaml(1 hunks).debug/clickhouse/templates/clickhouse-operator/configmaps/etc-configd-files.yaml(1 hunks).debug/clickhouse/templates/clickhouse-operator/configmaps/etc-files.yaml(1 hunks).debug/clickhouse/templates/clickhouse-operator/configmaps/etc-templatesd-files.yaml(1 hunks).debug/clickhouse/templates/clickhouse-operator/configmaps/etc-usersd-files.yaml(1 hunks).debug/clickhouse/templates/clickhouse-operator/deployment.yaml(1 hunks).debug/clickhouse/templates/clickhouse-operator/namespace.yaml(1 hunks).debug/clickhouse/templates/clickhouse-operator/role.yaml(1 hunks).debug/clickhouse/templates/clickhouse-operator/rolebinding.yaml(1 hunks).debug/clickhouse/templates/clickhouse-operator/secret.yaml(1 hunks).debug/clickhouse/templates/clickhouse-operator/service.yaml(1 hunks).debug/clickhouse/templates/clickhouse-operator/serviceaccount.yaml(1 hunks).debug/clickhouse/templates/clickhouse-operator/servicemonitor.yaml(1 hunks)charts/clickhouse/templates/_helper.tpl(2 hunks)
✅ Files skipped from review due to trivial changes (18)
- .debug/clickhouse/charts/zookeeper/templates/extra-list.yaml
- .debug/clickhouse/templates/clickhouse-operator/servicemonitor.yaml
- .debug/clickhouse/charts/zookeeper/templates/prometheusrule.yaml
- .debug/clickhouse/charts/zookeeper/templates/serviceaccount.yaml
- .debug/clickhouse/charts/zookeeper/templates/secrets.yaml
- .debug/clickhouse/templates/clickhouse-operator/service.yaml
- .debug/clickhouse/charts/zookeeper/templates/pdb.yaml
- .debug/clickhouse/templates/clickhouse-instance/storage-class.yaml
- .debug/clickhouse/charts/zookeeper/templates/tls-secrets.yaml
- .debug/clickhouse/charts/zookeeper/templates/NOTES.txt
- .debug/clickhouse/templates/clickhouse-operator/secret.yaml
- .debug/clickhouse/charts/zookeeper/templates/networkpolicy.yaml
- .debug/clickhouse/charts/zookeeper/templates/svc.yaml
- .debug/clickhouse/templates/clickhouse-operator/serviceaccount.yaml
- .debug/clickhouse/charts/zookeeper/templates/servicemonitor.yaml
- .debug/clickhouse/charts/zookeeper/templates/configmap.yaml
- .debug/clickhouse/templates/clickhouse-operator/configmaps/etc-confd-files.yaml
- .debug/clickhouse/templates/clickhouse-operator/configmaps/etc-configd-files.yaml
🧰 Additional context used
🪛 Checkov (3.2.334)
.debug/clickhouse/templates/clickhouse-operator/deployment.yaml
[medium] 1-190: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 1-190: Minimize the admission of root containers
(CKV_K8S_23)
🔇 Additional comments (12)
.debug/clickhouse/templates/clickhouse-instance/serviceaccount.yaml (3)
2-15: LGTM on adding a dedicated ServiceAccountGood addition; aligns with separating identity for the ClickHouse instance/init containers.
5-6: Verify name/namespace templating and conditional creationConfirm the real template uses your
fullnamehelper and honorsvalues.serviceAccount.create/values.serviceAccount.nameto support existing SAs and custom names; ensureserviceAccountNamein the Pod is wired accordingly.
12-12: Check app.kubernetes.io/version matches the ClickHouse image tagPlease verify
"24.1.2"reflects the actual ClickHouse image version used by the chart to keep labels accurate for selectors/ops inventory..debug/clickhouse/templates/clickhouse-instance/scripts-configmap.yaml (1)
14-14: Confirm executable permissions or invoke via sh -c when mounting this ConfigMap.ConfigMap files are 0644 by default; binaries won’t be executable unless you set defaultMode or call via the shell.
- Ensure volume has an executable mode or invoke explicitly:
volumes: - name: scripts configMap: name: {{ include "clickhouse.fullname" . }}-scripts defaultMode: 0555 containers: - name: clickhouse volumeMounts: - name: scripts mountPath: /scripts command: ["/scripts/wait-until-ready.sh"] # or command: ["sh", "-c", "/scripts/wait-until-ready.sh"].debug/clickhouse/templates/clickhouse-operator/configmaps/etc-usersd-files.yaml (1)
36-45: Default profile tweaks LGTMSettings look reasonable for tests. No blocking issues.
If these are production-facing, confirm they match your performance/SLO targets.
.debug/clickhouse/charts/zookeeper/templates/svc-headless.yaml (1)
12-26: Headless Service wiring looks correct.Port names align with the StatefulSet; headless + publishNotReadyAddresses is appropriate for ZK STS discovery.
.debug/clickhouse/charts/zookeeper/templates/statefulset.yaml (1)
124-138: Confirm probe utilities exist in the image.Liveness/readiness use nc and timeout. Ensure signoz/zookeeper:3.7.1 includes these; otherwise probes will fail despite a healthy process. If unsure, switch to zkCli.sh/ruok via built-in scripts or ship a lightweight toolbox.
.debug/clickhouse/templates/clickhouse-operator/namespace.yaml (1)
1-2: No functional change.Whitespace-only; safe to ignore.
.debug/clickhouse/templates/clickhouse-operator/rolebinding.yaml (1)
15-22: Confirm scope: RoleBinding vs ClusterRoleBinding.If the operator is configured to watch multiple namespaces or cluster-wide, this should be a ClusterRoleBinding with a ClusterRole. If it’s strictly single-namespace, RoleBinding + Role is correct.
charts/clickhouse/templates/_helper.tpl (1)
133-151: Volume-permissions image helper looks solid.Handles registry, tag vs digest, and trims slashes consistently. No blockers.
Add a unit test asserting digest form rendering (with and without global.imageRegistry) to lock this in.
.debug/clickhouse/templates/clickhouse-instance/clickhouse-instance.yaml (2)
83-90: Good: pod runs as 101 with fsGroup and OnRootMismatch.This aligns with ClickHouse UID/GID and mitigates many PV permission issues. Combined with the new initContainer it should satisfy issue #623.
176-185: Include initContainers.enabled when testing volume-permissions
Thevolume-permissionsinitContainer is gated by bothinitContainers.enabledandvolumePermissions.enabled(plus persistence or templates). Be sure to render with--set initContainers.enabled=true,volumePermissions.enabled=trueacross your scenarios.
| - name: tcp-metrics | ||
| port: 9141 | ||
| targetPort: metrics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Service targets a non-existent pod port name 'metrics'.
StatefulSet exposes ports named client/follower/election only. targetPort: metrics won’t resolve, so scraping on 9141 will fail. Align one of:
- Add a metrics endpoint (sidecar or container) and expose a containerPort named metrics: 9141.
- Or change Service to target an existing port, or drop the metrics Service until an exporter is added.
🤖 Prompt for AI Agents
.debug/clickhouse/charts/zookeeper/templates/metrics-svc.yaml lines 20-22: the
Service is targeting a non-existent container port name "metrics" (StatefulSet
only exposes client/follower/election) so targetPort: metrics will not resolve;
either add a metrics exporter container/sidecar and declare a containerPort:
9141 with name: metrics in the Pod/StatefulSet, or update this Service to use
one of the existing port names (client/follower/election) or remove the metrics
Service entirely until an exporter is added—make the change so targetPort
references a real containerPort name or numeric port.
| ports: | ||
| - name: client | ||
| containerPort: 2181 | ||
| - name: follower | ||
| containerPort: 2888 | ||
| - name: election | ||
| containerPort: 3888 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expose a 'metrics' port or adjust/remove the metrics Service.
Pods don’t declare a containerPort named metrics, while metrics-svc targets it. Add the port (and an actual metrics endpoint/exporter), or update/remove the Service to avoid a dead endpoint.
ports:
- name: client
containerPort: 2181
- name: follower
containerPort: 2888
- name: election
containerPort: 3888
+ # If a metrics exporter is present and listens on 9141:
+ # - name: metrics
+ # containerPort: 9141📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ports: | |
| - name: client | |
| containerPort: 2181 | |
| - name: follower | |
| containerPort: 2888 | |
| - name: election | |
| containerPort: 3888 | |
| ports: | |
| - name: client | |
| containerPort: 2181 | |
| - name: follower | |
| containerPort: 2888 | |
| - name: election | |
| containerPort: 3888 | |
| # If a metrics exporter is present and listens on 9141: | |
| # - name: metrics | |
| # containerPort: 9141 |
🤖 Prompt for AI Agents
In .debug/clickhouse/charts/zookeeper/templates/statefulset.yaml around lines
116-122 the Pod spec does not expose a containerPort named "metrics" while the
metrics Service targets that port; add a container port entry named "metrics"
(e.g., containerPort: <port-number>) and ensure the container includes a metrics
exporter or endpoint bound to that port, or alternatively remove or update the
metrics Service to point to an existing port/name; make the change so Service
targetPort/name matches an actual containerPort and verify the exporter is
configured and listening.
| 01-clickhouse-user.xml: | | ||
| <yandex> | ||
| <users> | ||
| <testuser> | ||
| <networks> | ||
| <ip>127.0.0.1</ip> | ||
| </networks> | ||
| <password>${CLICKHOUSE_PASSWORD}</password> | ||
| <profile>default</profile> | ||
| <quota>default</quota> | ||
| </testuser> | ||
| </users> | ||
| <profiles> | ||
| <testuser> | ||
| <log_queries>0</log_queries> | ||
| <skip_unavailable_shards>1</skip_unavailable_shards> | ||
| <http_connection_timeout>10</http_connection_timeout> | ||
| </testuser> | ||
| </profiles> | ||
| </yandex> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid passwords in ConfigMaps; fix profile mismatch
- Storing ${CLICKHOUSE_PASSWORD} in a ConfigMap is a security smell. Prefer a Secret and hashed value (password_sha256_hex), or the operator’s supported Secret mechanism if available.
- User references default while you also define a separate …. Either switch the user’s profile to testuser or drop the unused profile block to avoid confusion.
Minimal corrective diff for the profile mismatch (if you keep this file locally for tests):
- <profile>default</profile>
+ <profile>testuser</profile>📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| 01-clickhouse-user.xml: | | |
| <yandex> | |
| <users> | |
| <testuser> | |
| <networks> | |
| <ip>127.0.0.1</ip> | |
| </networks> | |
| <password>${CLICKHOUSE_PASSWORD}</password> | |
| <profile>default</profile> | |
| <quota>default</quota> | |
| </testuser> | |
| </users> | |
| <profiles> | |
| <testuser> | |
| <log_queries>0</log_queries> | |
| <skip_unavailable_shards>1</skip_unavailable_shards> | |
| <http_connection_timeout>10</http_connection_timeout> | |
| </testuser> | |
| </profiles> | |
| </yandex> | |
| 01-clickhouse-user.xml: | | |
| <yandex> | |
| <users> | |
| <testuser> | |
| <networks> | |
| <ip>127.0.0.1</ip> | |
| </networks> | |
| <password>${CLICKHOUSE_PASSWORD}</password> | |
| <profile>testuser</profile> | |
| <quota>default</quota> | |
| </testuser> | |
| </users> | |
| <profiles> | |
| <testuser> | |
| <log_queries>0</log_queries> | |
| <skip_unavailable_shards>1</skip_unavailable_shards> | |
| <http_connection_timeout>10</http_connection_timeout> | |
| </testuser> | |
| </profiles> | |
| </yandex> |
🤖 Prompt for AI Agents
In
.debug/clickhouse/templates/clickhouse-operator/configmaps/etc-usersd-files.yaml
around lines 15-34, the ConfigMap contains a plaintext
<password>${CLICKHOUSE_PASSWORD}</password> (security risk) and the user
declares <profile>default</profile> while a <profiles><testuser>…</testuser>
block exists (mismatch); fix by moving the password into a Kubernetes Secret or
the operator-supported secret mechanism and store a hashed password
(password_sha256_hex) instead of plaintext in the ConfigMap, and resolve the
profile mismatch by either changing the user's <profile> to testuser or removing
the unused testuser profile block so profiles align.
| resources: | ||
| null | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resources: null will fail validation — omit when unset.
Render resources only when values are provided, or use {}. In Helm, guard with with to avoid emitting null.
- resources:
- null
+{{- /* In template: only render when non-empty */ -}}
+{{- with .Values.clickhouseOperator.resources }}
+ resources:
+{{ toYaml . | nindent 12 }}
+{{- end }}Repeat for metrics-exporter.
Also applies to: 189-190
🤖 Prompt for AI Agents
In .debug/clickhouse/templates/clickhouse-operator/deployment.yaml around lines
119-121 (and similarly for metrics-exporter at lines 189-190), the template
renders "resources: null" which fails validation; change the template to
conditionally render the resources block only when resource values are provided
(use a Helm "with" or "if" guard) or render an empty map "{}" instead of null;
update both locations so that resources are omitted when unset or output as {}
to satisfy validation.
| - name: OPERATOR_CONTAINER_CPU_REQUEST | ||
| valueFrom: | ||
| resourceFieldRef: | ||
| containerName: operator | ||
| resource: requests.cpu | ||
| - name: OPERATOR_CONTAINER_CPU_LIMIT | ||
| valueFrom: | ||
| resourceFieldRef: | ||
| containerName: operator | ||
| resource: limits.cpu | ||
| - name: OPERATOR_CONTAINER_MEM_REQUEST | ||
| valueFrom: | ||
| resourceFieldRef: | ||
| containerName: operator | ||
| resource: requests.memory | ||
| - name: OPERATOR_CONTAINER_MEM_LIMIT | ||
| valueFrom: | ||
| resourceFieldRef: | ||
| containerName: operator | ||
| resource: limits.memory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Downward API env in metrics-exporter references the operator’s resources — use self container.
resourceFieldRef.containerName should refer to the same container. Point these four env vars at metrics-exporter.
- containerName: operator
+ containerName: metrics-exporterApply to CPU request/limit and memory request/limit.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - name: OPERATOR_CONTAINER_CPU_REQUEST | |
| valueFrom: | |
| resourceFieldRef: | |
| containerName: operator | |
| resource: requests.cpu | |
| - name: OPERATOR_CONTAINER_CPU_LIMIT | |
| valueFrom: | |
| resourceFieldRef: | |
| containerName: operator | |
| resource: limits.cpu | |
| - name: OPERATOR_CONTAINER_MEM_REQUEST | |
| valueFrom: | |
| resourceFieldRef: | |
| containerName: operator | |
| resource: requests.memory | |
| - name: OPERATOR_CONTAINER_MEM_LIMIT | |
| valueFrom: | |
| resourceFieldRef: | |
| containerName: operator | |
| resource: limits.memory | |
| - name: OPERATOR_CONTAINER_CPU_REQUEST | |
| valueFrom: | |
| resourceFieldRef: | |
| containerName: metrics-exporter | |
| resource: requests.cpu | |
| - name: OPERATOR_CONTAINER_CPU_LIMIT | |
| valueFrom: | |
| resourceFieldRef: | |
| containerName: metrics-exporter | |
| resource: limits.cpu | |
| - name: OPERATOR_CONTAINER_MEM_REQUEST | |
| valueFrom: | |
| resourceFieldRef: | |
| containerName: metrics-exporter | |
| resource: requests.memory | |
| - name: OPERATOR_CONTAINER_MEM_LIMIT | |
| valueFrom: | |
| resourceFieldRef: | |
| containerName: metrics-exporter | |
| resource: limits.memory |
🤖 Prompt for AI Agents
In .debug/clickhouse/templates/clickhouse-operator/deployment.yaml around lines
166 to 185 the Downward API env vars for OPERATOR_CONTAINER_CPU_REQUEST,
OPERATOR_CONTAINER_CPU_LIMIT, OPERATOR_CONTAINER_MEM_REQUEST and
OPERATOR_CONTAINER_MEM_LIMIT incorrectly reference containerName: operator;
update each resourceFieldRef.containerName to the metrics-exporter container
(containerName: metrics-exporter) so the env vars pull that container’s
requests.cpu, limits.cpu, requests.memory and limits.memory respectively.
| apiVersion: rbac.authorization.k8s.io/v1 | ||
| kind: Role | ||
| metadata: | ||
| name: RELEASE-NAME-clickhouse-operator | ||
| namespace: NAMESPACE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cluster-scoped resources in a namespaced Role — switch to ClusterRole (and bind accordingly).
This Role includes cluster-scoped resources (persistentvolumes, customresourcedefinitions). A namespaced Role cannot grant these. Use a ClusterRole and a ClusterRoleBinding, or drop those resources from this Role.
apiVersion: rbac.authorization.k8s.io/v1
-kind: Role
+kind: ClusterRole
metadata:
- namespace: NAMESPACE
name: RELEASE-NAME-clickhouse-operatorAlso ensure you use a ClusterRoleBinding targeting the operator ServiceAccount.
Also applies to: 45-55, 146-151
| - clickhouseinstallations/status | ||
| - clickhouseinstallationtemplates/status | ||
| - clickhouseoperatorconfigurations/status | ||
| verbs: | ||
| - get | ||
| - update | ||
| - patch | ||
| - create | ||
| - delete |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove unsupported verbs on status subresources.
Status subresources typically allow get/patch/update only. Drop create/delete on …/status to avoid invalid RBAC.
verbs:
- - get
- - update
- - patch
- - create
- - delete
+ - get
+ - update
+ - patch📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - clickhouseinstallations/status | |
| - clickhouseinstallationtemplates/status | |
| - clickhouseoperatorconfigurations/status | |
| verbs: | |
| - get | |
| - update | |
| - patch | |
| - create | |
| - delete | |
| - clickhouseinstallations/status | |
| - clickhouseinstallationtemplates/status | |
| - clickhouseoperatorconfigurations/status | |
| verbs: | |
| - get | |
| - update | |
| - patch |
🤖 Prompt for AI Agents
.debug/clickhouse/templates/clickhouse-operator/role.yaml lines 129-137: the
RBAC rule includes unsupported verbs for status subresources; remove the create
and delete verbs so the verbs list for the clickhouseinstallations/status,
clickhouseinstallationtemplates/status, and
clickhouseoperatorconfigurations/status entries contains only get, update, and
patch. Ensure no other rules grant create/delete to these /*/status resources
and save the file.
PR Title
feat(clickhouse): add volume permissions support to fix issue #623
PR Description
This adds volume permissions support to fix the permission errors that users are hitting when using persistent volumes with ClickHouse.
What's the issue?
When using persistent volumes, ClickHouse crashes on startup with permission denied errors like:
This happens because persistent volumes are mounted as root:root but ClickHouse runs as user 101:101, so it can't create the directories it needs.
How I fixed it
I implemented the same volume permissions pattern that's already used in the Zookeeper chart in this repo:
volumePermissionssection to values.yaml with all the standard configuration optionsHow to use it
Just enable volume permissions when you have persistence enabled:
Tested thoroughly
I ran all the existing tests plus added 12 new test cases specifically for volume permissions. Also created a manual testing guide so others can verify it works.
This should fix #623 and anyone else hitting the same permission issues.
Summary by CodeRabbit
New Features
Documentation
Tests