-
Notifications
You must be signed in to change notification settings - Fork 46
[DO NOT MERGE] for building the images #560
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Skipping CI for Draft Pull Request. |
Reviewer's GuideThis PR implements end-to-end OCI registry support for LMEvalJobs by extending the API and CRD schema, enhancing the controller to inject OCI configuration into pods and command flags, integrating an OCI upload step in the driver, adding user input validation for OCI settings, and covering all new behaviors with comprehensive unit tests. Sequence diagram for OCI upload flow in LMEvalJob driversequenceDiagram
participant Controller
participant Pod
participant Driver
participant "OCI Registry"
Controller->>Pod: Inject OCI env vars and volume mounts
Pod->>Driver: Start with --upload-to-oci flag
Driver->>Driver: On job completion, call uploadToOCI()
Driver->>"OCI Registry": Execute oci.py to upload results
"OCI Registry"-->>Driver: Respond with upload status
Driver->>Controller: Update job status
ER diagram for new OCI output fields in LMEvalJob CRDerDiagram
OUTPUTS {
string id
PersistentVolumeClaimManaged pvcManaged
OCISpec oci
}
OCISPEC {
string path
string tag
string subject
bool verifySSL
}
SECRETKEYSELECTOR {
string name
string key
bool optional
}
OUTPUTS ||--o| OCISPEC : has
OCISPEC ||--o| SECRETKEYSELECTOR : registry
OCISPEC ||--o| SECRETKEYSELECTOR : repository
OCISPEC ||--o| SECRETKEYSELECTOR : username
OCISPEC ||--o| SECRETKEYSELECTOR : password
OCISPEC ||--o| SECRETKEYSELECTOR : token
OCISPEC ||--o| SECRETKEYSELECTOR : caBundle
Class diagram for new and updated OCI output types in LMEvalJobclassDiagram
class Outputs {
+PersistentVolumeClaimManaged pvcManaged
+OCISpec oci
}
class OCISpec {
+SecretKeySelector registry
+SecretKeySelector repository
+string tag
+string path
+string subject
+SecretKeySelector username
+SecretKeySelector password
+SecretKeySelector token
+bool verifySSL
+SecretKeySelector caBundle
+HasCertificates()
+HasUsernamePassword()
+HasToken()
}
Outputs --> OCISpec
OCISpec --> "1" SecretKeySelector : registry
OCISpec --> "1" SecretKeySelector : repository
OCISpec --> "0..1" SecretKeySelector : username
OCISpec --> "0..1" SecretKeySelector : password
OCISpec --> "0..1" SecretKeySelector : token
OCISpec --> "0..1" SecretKeySelector : caBundle
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/ok-to-test |
|
@tarilabs: you cannot LGTM your own PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
PR image build and manifest generation completed successfully! 📦 PR image: 📦 LMES driver image: 📦 LMES job image: 📦 Guardrails orchestrator image: 🗂️ CI manifests |
…y#530) * feat: Add subject to LMEval CRD's OCI outputs * chore: Add more test cases
Signed-off-by: tarilabs <[email protected]>
Co-Authored-By: Claude <[email protected]> Signed-off-by: tarilabs <[email protected]>
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughOCI registry output support is added to LMEval job specifications and drivers. A new OCISpec type defines OCI artifact configuration with registry, repository, path, tag, and subject fields alongside credential and SSL verification options. The driver gains an UploadToOCI flag and method to execute OCI uploads. Pod creation is extended with OCI environment variables, volume handling, and command flags. Validation logic and test coverage for OCI configurations are introduced. Changes
Sequence DiagramsequenceDiagram
participant User
participant Controller
participant Driver
participant OCI Registry
participant Python Script
User->>Controller: Create LMEvalJob with OCI output
Controller->>Controller: Validate OCI path & auth
Controller->>Controller: CreatePod with OCI env vars
Controller->>Driver: Launch driver with --upload-to-oci
Driver->>Driver: Run evaluation
Driver->>Driver: Check UploadToOCI flag
alt OCI Upload Enabled
Driver->>Python Script: Execute OCI upload script
Python Script->>OCI Registry: Upload results
OCI Registry-->>Python Script: Success
Python Script-->>Driver: Return status
else OCI Upload Disabled
Driver->>Driver: Skip upload
end
Driver-->>Controller: Job complete
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Co-Authored-By: Claude <[email protected]> Signed-off-by: tarilabs <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- The CreatePod and generateCmd functions have grown very large with OCI‐specific logic—consider extracting the OCI env var/volume injection and flag handling into dedicated helper functions to improve readability and reuse.
- The driver.uploadToOCI method uses fmt.Println for debug output and doesn’t capture stderr—switch to structured logging and capture both stdout and stderr to provide clearer error messages and avoid noisy console logs.
- The SetupWithManager method logs the entire Options object (including potentially sensitive values)—you may want to remove or sanitize that log to avoid leaking secrets in controller logs.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The CreatePod and generateCmd functions have grown very large with OCI‐specific logic—consider extracting the OCI env var/volume injection and flag handling into dedicated helper functions to improve readability and reuse.
- The driver.uploadToOCI method uses fmt.Println for debug output and doesn’t capture stderr—switch to structured logging and capture both stdout and stderr to provide clearer error messages and avoid noisy console logs.
- The SetupWithManager method logs the entire Options object (including potentially sensitive values)—you may want to remove or sanitize that log to avoid leaking secrets in controller logs.
## Individual Comments
### Comment 1
<location> `controllers/lmes/lmevaljob_controller_test.go:3866` </location>
<code_context>
}
+
+// Test OCI controller functionality
+func Test_OCICommandGeneration(t *testing.T) {
+ svcOpts := &serviceOptions{
+ PodImage: "podimage:latest",
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding negative and malformed OCI spec cases to command generation tests.
Including tests for malformed OCISpecs, such as missing or invalid fields, will improve error handling and flag generation coverage.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| } | ||
|
|
||
| // Test OCI controller functionality | ||
| func Test_OCICommandGeneration(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (testing): Consider adding negative and malformed OCI spec cases to command generation tests.
Including tests for malformed OCISpecs, such as missing or invalid fields, will improve error handling and flag generation coverage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
config/overlays/odh/params.env (1)
6-6: Inconsistent image source change - see comment on config/base/params.env.This overlay switches the image source from
opendatahubback totrustyai, which contradicts the base configuration change. Please refer to the review comment onconfig/base/params.envline 6 for details.
🧹 Nitpick comments (10)
config/crd/bases/trustyai.opendatahub.io_lmevaljobs.yaml (5)
323-326: Prevent empty string in required 'path'Pattern allows empty; with required, empty strings still pass. Enforce non‑empty.
- path: - description: Path within the results to package as artifact - pattern: ^[a-zA-Z0-9._/-]*$ - type: string + path: + description: Path within the results to package as artifact + type: string + pattern: ^[a-zA-Z0-9._/-]+$ + minLength: 1
371-374: Tighten Docker tag validationAdd max length and start‑char constraint to match Docker tag conventions.
- tag: - description: Optional tag for the artifact (defaults to job name if not specified) - pattern: ^[a-zA-Z0-9._-]*$ - type: string + tag: + description: Optional tag for the artifact (defaults to job name if not specified) + type: string + pattern: ^[A-Za-z0-9_][A-Za-z0-9._-]{0,127}$ + maxLength: 128
416-418: Set a sane default for verifySSLDefault to true to avoid accidental insecure pushes.
- verifySSL: - description: Whether to verify SSL certificates - type: boolean + verifySSL: + description: Whether to verify SSL certificates + type: boolean + default: true
279-423: Add CEL validations: auth XOR and required combosEnforce either token or username+password (but not both), and require auth if anonymous pushes are not supported.
oci: description: Upload results to OCI registry properties: ... required: - path - registry - repository type: object + x-kubernetes-validations: + - rule: "has(self.token) != (has(self.username) && has(self.password))" + message: "Specify either token or username+password, not both." + - rule: "has(self.token) || (has(self.username) && has(self.password))" + message: "Authentication required: provide token or username+password." + - rule: "size(self.path) > 0" + message: "path must be non-empty."If anonymous push must be allowed, drop the second rule. Please confirm expected behavior.
282-302: Consider allowing CA bundle via ConfigMap as well as SecretCA bundles are not secrets by default. Allow ConfigMapKeySelector alternative for flexibility.
caBundle: description: CA bundle for custom certificates - properties: + oneOf: + - properties: + key: { type: string, description: The key of the secret to select from. Must be a valid secret key. } + name: { type: string, description: Name of the Secret. } + optional: { type: boolean, description: Specify whether the Secret or its key must be defined } + required: [ key ] + type: object + x-kubernetes-map-type: atomic + - properties: + key: { type: string, description: The key of the config map to select from. } + name: { type: string, description: Name of the ConfigMap. } + optional: { type: boolean, description: Specify whether the ConfigMap or its key must be defined } + required: [ key ] + type: object + x-kubernetes-map-type: atomiccontrollers/lmes/driver/driver_test.go (1)
447-593: OCI driver tests cover key flows; consider minor naming and env-handling tweaksThe new OCI tests exercise the flag wiring, env requirements, and upload error propagation well. Two small, optional improvements:
Test_OCIUploadSuccessactually asserts a failure due to missing script; a name likeTest_OCIUploadFlagTriggersUploadFailurewould better describe behavior.- Using
t.Setenvinstead ofos.Setenv/defer Unsetenvwould simplify env cleanup and make the tests safer against future parallelization.controllers/lmes/validation.go (1)
115-123: OCI validation wiring looks correct; consider tightening path semanticsThe OCI validation hook in
ValidateUserInputand the newValidateOCIPath/ValidateOCIAuthhelpers are consistent with the existing S3/URL checks and correctly guard against traversal and shell metacharacters.Given the comment “Path within the results to package as artifact”, you may want to additionally reject absolute paths to avoid any future misuse as a raw filesystem path, e.g.:
func ValidateOCIPath(path string) error { if path == "" { return nil // Empty path is valid for root } + // Disallow absolute paths to ensure the path stays within the results tree + if strings.HasPrefix(path, "/") { + return fmt.Errorf("OCI path must be relative, not start with '/'") + }This would align the implementation more tightly with the intended “within results” semantics without affecting current valid examples like
"results".Also applies to: 543-595
controllers/lmes/lmevaljob_controller.go (1)
914-950: OCI output wiring in pod/command looks sound; only minor cleanup possible
- Restricting the “outputs” PVC mount/volume to cases where a managed or existing PVC is actually configured avoids spurious volumes when only OCI output is set — this is a good fix.
- The OCI env/volume block correctly reflects
OCISpec(registry, repository, path, optional tag/subject, auth mode, SSL/CA bundle) and is independent of offline/online model IO, which matches the design of “upload results after run”.generateCmdappending--upload-to-ociwhenHasOCIOutput()is true ties the controller cleanly to the new driver flag.If you want to simplify slightly, the extra
job.Spec.Outputs != nil && job.Spec.Outputs.OCISpec != nilchecks inside theHasOCIOutput()condition are redundant and could be dropped, but that’s purely cosmetic.Also applies to: 1168-1280, 1541-1556
controllers/lmes/lmevaljob_controller_test.go (1)
3925-4413: OCI pod configuration tests are comprehensive; consider small DRY/robustness tweaksThe OCI pod tests do a nice job exhaustively checking env vars, auth modes, subject/tag handling, and certificate volumes against
OCISpec.Two optional test-only refinements:
- Instead of assuming the main container is
pod.Spec.Containers[0], you could reuse the existinggetContainer(pod)helper to make the tests resilient to any future container ordering changes.- The env→map conversion pattern is repeated in each subtest; factoring that into a small helper (e.g.,
envByName(t, pod, "OCI_REGISTRY")) would reduce duplication and make intent clearer.api/lmes/v1alpha1/lmevaljob_types.go (1)
396-406: OCI API additions are consistent; slight redundancy in helper predicatesThe
OCISpectype and its addition toOutputsline up cleanly with how the controller and validation use these fields (env construction,ValidateOCIPath/ValidateOCIAuth, and the driver’s--upload-to-ociflag). The helper methods onOCISpec(HasCertificates,HasUsernamePassword,HasToken) also match the controller logic.There is a small redundancy between:
(*LMEvalJobSpec).HasOCIOutput()and(*Outputs).HasOCI(),both effectively checking for a non-nil
OCISpec. Not an issue, but if you find only one style being used in practice, you could consolidate on that to keep the API surface a bit leaner.Also applies to: 469-501, 625-627, 633-643, 660-663
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
Dockerfile.lmes-job(1 hunks)api/lmes/v1alpha1/lmevaljob_types.go(4 hunks)api/lmes/v1alpha1/zz_generated.deepcopy.go(2 hunks)cmd/lmes_driver/main.go(2 hunks)config/base/params.env(1 hunks)config/crd/bases/trustyai.opendatahub.io_lmevaljobs.yaml(1 hunks)config/overlays/odh/params.env(1 hunks)controllers/lmes/driver/driver.go(3 hunks)controllers/lmes/driver/driver_test.go(1 hunks)controllers/lmes/lmevaljob_controller.go(6 hunks)controllers/lmes/lmevaljob_controller_test.go(1 hunks)controllers/lmes/lmevaljob_controller_validation_test.go(2 hunks)controllers/lmes/validation.go(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (7)
controllers/lmes/validation.go (1)
api/lmes/v1alpha1/lmevaljob_types.go (2)
Outputs(396-406)OCISpec(469-501)
controllers/lmes/driver/driver_test.go (3)
controllers/lmes/driver/driver.go (2)
NewDriver(117-141)DriverOption(58-77)cmd/lmes_driver/main.go (1)
OutputPath(37-37)controllers/lmes/constants.go (1)
OutputPath(28-28)
api/lmes/v1alpha1/zz_generated.deepcopy.go (1)
api/lmes/v1alpha1/lmevaljob_types.go (1)
OCISpec(469-501)
controllers/lmes/lmevaljob_controller_test.go (3)
api/lmes/v1alpha1/lmevaljob_types.go (3)
LMEvalJob(701-707)Outputs(396-406)OCISpec(469-501)controllers/lmes/config.go (1)
NewDefaultPermissionConfig(180-185)controllers/lmes/lmevaljob_controller.go (1)
CreatePod(890-1355)
controllers/lmes/lmevaljob_controller_validation_test.go (2)
api/lmes/v1alpha1/lmevaljob_types.go (5)
LMEvalJob(701-707)LMEvalJobSpec(553-610)Outputs(396-406)OCISpec(469-501)TaskList(244-254)controllers/lmes/validation.go (3)
ValidateOCIPath(544-568)ValidateOCIAuth(571-595)ValidateUserInput(76-156)
controllers/lmes/lmevaljob_controller.go (2)
controllers/lmes/config.go (1)
Options(37-48)api/lmes/v1alpha1/lmevaljob_types.go (2)
Outputs(396-406)OCISpec(469-501)
controllers/lmes/driver/driver.go (2)
cmd/lmes_driver/main.go (1)
OutputPath(37-37)controllers/lmes/constants.go (1)
OutputPath(28-28)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: build
- GitHub Check: deploy
- GitHub Check: Sourcery review
- GitHub Check: Sourcery review
- GitHub Check: build-and-push-ci
🔇 Additional comments (18)
config/base/params.env (1)
6-6: Verify image source consistency between base and overlay.The base configuration switches from
trustyaitoopendatahubfor the lmes-pod-image, but the overlay inconfig/overlays/odh/params.env(line 6) switches it back totrustyai. This appears contradictory. Please confirm this is intentional for the build process, or if one of these changes should be reverted.controllers/lmes/driver/driver.go (3)
70-70: LGTM - Clean addition of OCI upload flag.The UploadToOCI field is properly added to DriverOption struct.
505-509: LGTM - OCI upload integrated correctly into completion flow.The OCI upload is appropriately called only after successful result retrieval, and errors are properly propagated. The placement ensures uploads happen only for completed jobs with valid results.
246-302: The review comment is based on a misunderstanding of Go's exec security model.When you call exec.Command(name, args...), Go does not invoke a shell, so shell metacharacters in those argument strings are not interpreted by a shell and cannot be used for command injection. The
OCI_REGISTRYvalue is passed as an array element toexec.Command(), not as part of a shell command string, which means shell metacharacters pose no injection risk.Additionally, the value originates from a Kubernetes Secret (controllers/lmes/lmevaljob_controller.go:1172-1174), which is a trusted configuration source, not untrusted user input.
While other sensitive inputs in this file (git URLs, branches, commits) are validated and documented with
#nosec G204comments, those validations address which program to execute, not shell interpretation risks. The suggested validation forOCI_REGISTRYis unnecessary and appears to conflate shell execution security with exec.Command() array argument safety.Likely an incorrect or invalid review comment.
api/lmes/v1alpha1/zz_generated.deepcopy.go (2)
436-476: LGTM - Auto-generated deepcopy implementation for OCISpec.The generated code correctly implements deep copy logic for all OCISpec fields, including proper handling of optional pointer fields.
562-566: LGTM - Auto-generated deepcopy handling for OCISpec in Outputs.The generated code correctly handles the optional OCISpec field in Outputs.
controllers/lmes/lmevaljob_controller_validation_test.go (6)
10-10: LGTM - Required import for OCI secret references.The corev1 import is needed for SecretKeySelector types used in OCI configuration.
1493-1524: LGTM - Comprehensive helper method tests.Good test coverage for HasOCIOutput and OutputsHasOCI helper methods, including nil checks and positive cases.
1526-1575: LGTM - Thorough OCISpec helper tests.Excellent coverage of OCISpec helper methods with edge cases like partial username/password combinations.
1577-1657: LGTM - Strong path validation tests.Comprehensive test coverage for OCI path validation including:
- Path traversal attempts (../)
- Shell metacharacters
- Valid patterns with dots, underscores, hyphens
1659-1742: LGTM - Thorough authentication validation tests.Excellent test coverage for OCI authentication including:
- Mutual exclusion of username/password and token
- Incomplete credential pairs
- Nil spec handling
1744-1817: LGTM - Good integration test coverage.Integration tests properly validate the complete OCI configuration flow including path and authentication validation.
cmd/lmes_driver/main.go (2)
63-63: LGTM - Clean CLI flag addition.The
--upload-to-ociflag is properly defined with a clear description and safe default (false).
133-133: LGTM - Proper flag propagation to driver.The flag value is correctly passed to the DriverOption struct.
Dockerfile.lmes-job (1)
15-18: Branch is actively maintained and current—concern about staleness is not substantiated.The verification reveals:
- Branch
tarilabs-20250917exists and was last updated on November 18, 2025 (1 day ago)- The date in the branch name (
20250917) follows an intentional naming convention and does not indicate staleness- The fork is actively maintained with recent commits
The original concern about the branch name being incorrect or the branch being stale is not supported by the evidence. Since the PR is marked "DO NOT MERGE," using a personal fork for testing is appropriate. However, ensure the fork is replaced with the official upstream source before merging to production.
config/crd/bases/trustyai.opendatahub.io_lmevaljobs.yaml (1)
279-423: Review comment is incorrect; no action required.The Go OCISpec definition explicitly types both
RegistryandRepositoryascorev1.SecretKeySelector—not strings. The CRD schema correctly reflects this design. The controller code and test fixtures confirm this pattern throughout. The review's assumption of a schema mismatch is unfounded; the CRD and Go implementation are already aligned.Likely an incorrect or invalid review comment.
controllers/lmes/lmevaljob_controller.go (1)
261-291: Config/DSC logging is useful and appears safeLogging
Optionsafter configmap and DSC application is helpful for diagnosing misconfig; givenserviceOptionsonly carries non-secret config (images, intervals, flags), this shouldn’t expose sensitive data. Just keep this in mind if new secret-like fields are ever added toserviceOptions.controllers/lmes/lmevaljob_controller_test.go (1)
3865-3923: OCI command-generation tests align well with controller behaviorThe
Test_OCICommandGenerationsubtests correctly assert that--upload-to-ociis included only when anOCISpecis present onOutputs, matching theHasOCIOutput()guard ingenerateCmd. This should catch future regressions in CLI wiring.
|
@tarilabs: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
for some reasons, the images are not re-created. I'm going to close and re-open. |
@ruivieira can you kindly add the labels for me please?
I can't
Summary by Sourcery
Add support for uploading evaluation results to OCI registries throughout the operator and driver
New Features:
Enhancements:
Tests:
Summary by CodeRabbit
New Features
--upload-to-ocicommand-line flag to enable OCI result uploads.Chores