fix: Prevent lm-eval command escape #503

ruivieira · 2025-06-20T15:32:36Z

Summary by Sourcery

Implement strict input validation and secure command generation for LMEvalJob: introduce ValidateUserInput, integrate it into the reconciler to reject malicious CRs, harden CRD schema with patterns/enums, refactor CLI construction to use argument slices, and expand test coverage accordingly.

New Features:

Integrate ValidateUserInput into the controller to reject invalid LMEvalJob specs before execution
Enforce CRD-level validation via kubebuilder tags and schema patterns/enums on model names, arguments, and other fields
Provide structured command argument generation that returns []string instead of concatenated shell commands

Bug Fixes:

Prevent command injection by removing sh -ec and sanitizing all CLI inputs

Enhancements:

Refactor generateArgs and generateCmd to remove shell wrappers and build safe argument slices
Extend API types and CRD YAML with Pattern and Enum constraints to harden input schemas
Update controller to mark jobs as failed on validation errors

Tests:

Update existing controller tests to use new argument format and default model values
Add comprehensive validation tests for models, args, limits, system instructions, chat templates, artifacts, Git sources, and boundary cases

sourcery-ai · 2025-06-20T15:32:41Z

Reviewer's Guide

This PR refactors command construction to eliminate shell invocation, integrates comprehensive input validation in the controller and CRD schemas to block unsafe patterns, and updates tests to reflect these changes.

Sequence diagram for LMEvalJob input validation in controller

sequenceDiagram
    participant Controller
    participant Validation
    participant StatusUpdate
    Controller->>Validation: ValidateUserInput(job)
    alt Validation fails
        Validation-->>Controller: error
        Controller->>StatusUpdate: Update job.Status (State=Complete, Reason=Failed, Message)
        Controller-->>Controller: Log error
        Controller-->>Controller: Return error
    else Validation passes
        Validation-->>Controller: nil
        Controller->>Controller: Continue with job processing
    end

Class diagram for new validation logic and patterns

classDiagram
    class LMEvalJob {
        +LMEvalJobSpec Spec
        +LMEvalJobStatus Status
    }
    class LMEvalJobSpec {
        +string Model
        +[]Arg ModelArgs
        +TaskList TaskList
        +string Limit
        +[]Arg GenArgs
        +*OfflineSpec Offline
        +*int BatchSize
        +string SystemInstruction
        +*ChatTemplate ChatTemplate
    }
    class TaskList {
        +[]string TaskNames
        +[]TaskRecipe TaskRecipes
        +*CustomArtifacts CustomArtifacts
        +*CustomTasks CustomTasks
    }
    class Validation {
        +ValidateUserInput(job)
        +ValidateModelName(model)
        +ValidateArgs(args, argType)
        +ValidateTaskNames(taskNames)
        +ValidateTaskRecipes(args, argType)
        +ValidateCustomArtifactValue(input)
        +ValidateLimit(limit)
        +ValidateS3Path(path)
        +ValidateBatchSizeInput(batchSize)
        +ValidateSystemInstruction(instruction)
        +ValidateChatTemplateName(name)
        +ValidateGitURL(url)
        +ValidateGitPath(path)
        +ValidateGitBranch(branch)
        +ValidateGitCommit(commit)
        +ContainsShellMetacharacters(input)
    }
    class Patterns {
        +PatternModelName
        +PatternTemplateName
        +PatternArgName
        +PatternArgValue
        +PatternLimit
        +PatternTaskName
        +AllowedModels
    }
    LMEvalJobSpec o-- TaskList
    TaskList o-- CustomArtifacts
    TaskList o-- CustomTasks
    LMEvalJob o-- LMEvalJobSpec
    LMEvalJob o-- LMEvalJobStatus
    Validation ..> Patterns : uses

File-Level Changes

Change	Details	Files
Switch command generation to build argument slices and remove shell wrapper	Removed use of strings.Builder and "sh -ec" wrapper Split the lm_eval command and flags into separate slice elements Appended each flag and its value via cmds = append Updated generateCmd/generateArgs to return []string without shell invocation	`controllers/lmes/lmevaljob_controller.go` `controllers/lmes/lmevaljob_controller_test.go`
Introduce ValidateUserInput logic and integrate it into the reconciler	New validation.go with functions for model, args, tasks, artifacts, limit, S3 path, batch size, system instruction, chat template, and Git source patterns.go defining regex patterns and allowed model map handleNewCR now calls ValidateUserInput and marks job as failed on validation error	`controllers/lmes/validation.go` `controllers/lmes/patterns.go` `controllers/lmes/lmevaljob_controller.go`
Annotate API types and CRD YAML with kubebuilder validation tags	Added +kubebuilder:validation:Pattern and Enum tags to Arg, Card, Template, SystemPrompt, Task, CustomArtifact, GitSource, ChatTemplate, and LMEvalJobSpec fields Updated CRD base YAML to match new patterns and enforce field constraints	`api/lmes/v1alpha1/lmevaljob_types.go` `config/crd/bases/trustyai.opendatahub.io_lmevaljobs.yaml`
Expand test coverage for validation and command generation	Updated existing controller tests to expect argument slices instead of shell strings Added validation tests in lmevaljob_controller_validation_test.go covering ValidateUserInput and individual validators Added integration test Test_ControllerIntegration to verify validation and command generation in the controller	`controllers/lmes/lmevaljob_controller_test.go` `controllers/lmes/lmevaljob_controller_validation_test.go`
Suppress false-positive security warnings on exec.Command	Added // #nosec G204 comments before exec.Command calls for git clone, checkout, and file copy steps Rationale: inputs are validated upstream	`controllers/lmes/driver/driver.go`

Tips and commands

Interacting with Sourcery

Trigger a new review: Comment @sourcery-ai review on the pull request.
Continue discussions: Reply directly to Sourcery's review comments.
Generate a GitHub issue from a review comment: Ask Sourcery to create an
issue from a review comment by replying to it. You can also reply to a
review comment with @sourcery-ai issue to create an issue from it.
Generate a pull request title: Write @sourcery-ai anywhere in the pull
request title to generate a title at any time. You can also comment
@sourcery-ai title on the pull request to (re-)generate the title at any time.
Generate a pull request summary: Write @sourcery-ai summary anywhere in
the pull request body to generate a PR summary at any time exactly where you
want it. You can also comment @sourcery-ai summary on the pull request to
(re-)generate the summary at any time.
Generate reviewer's guide: Comment @sourcery-ai guide on the pull
request to (re-)generate the reviewer's guide at any time.
Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
pull request to resolve all Sourcery comments. Useful if you've already
addressed all the comments and don't want to see them anymore.
Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
request to dismiss all existing Sourcery reviews. Especially useful if you
want to start fresh with a new review - don't forget to comment
@sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

Enable or disable review features such as the Sourcery-generated pull request
summary, the reviewer's guide, and others.
Change the review language.
Add, remove or edit custom review instructions.
Adjust other review settings.

Getting Help

Contact our support team for questions or feedback.
Visit our documentation for detailed guides and information.
Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

sourcery-ai

Hey @ruivieira - I've reviewed your changes - here's some feedback:

The PR mixes shell‐escaping fixes and extensive input validation/CRD changes — consider splitting it into smaller, focused PRs (one for command generation and one for validation/schema updates) for easier review and maintainability.
The validation logic and kubebuilder regex tags are duplicated across patterns.go, validation.go and the CRD YAML — consider centralizing these patterns to avoid drift and ensure consistency.
Injecting ValidateUserInput in the controller is helpful, but it might be cleaner to move input validation into a webhook (or a dedicated admission step) so invalid CRs are rejected before reconciliation.

Prompt for AI Agents

Please address the comments from this code review:
## Overall Comments
- The PR mixes shell‐escaping fixes and extensive input validation/CRD changes — consider splitting it into smaller, focused PRs (one for command generation and one for validation/schema updates) for easier review and maintainability.
- The validation logic and kubebuilder regex tags are duplicated across patterns.go, validation.go and the CRD YAML — consider centralizing these patterns to avoid drift and ensure consistency.
- Injecting ValidateUserInput in the controller is helpful, but it might be cleaner to move input validation into a webhook (or a dedicated admission step) so invalid CRs are rejected before reconciliation.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

github-actions · 2025-06-20T15:47:44Z

PR image build and manifest generation completed successfully!

📦 PR image: quay.io/trustyai/trustyai-service-operator-ci:87b2673381de5a5ce1f121aad52066362800e103

📦 LMES driver image: quay.io/trustyai/ta-lmes-driver-ci:87b2673381de5a5ce1f121aad52066362800e103

📦 LMES job image: quay.io/trustyai/ta-lmes-job-ci:87b2673381de5a5ce1f121aad52066362800e103

📦 Guardrails orchestrator image: quay.io/trustyai/ta-guardrails-orchestrator-ci:87b2673381de5a5ce1f121aad52066362800e103

🗂️ CI manifests

      devFlags:
        manifests:
          - contextDir: config
            sourcePath: ''
            uri: https://api.github.com/repos/trustyai-explainability/trustyai-service-operator-ci/tarball/operator-87b2673381de5a5ce1f121aad52066362800e103

openshift-ci · 2025-06-20T16:42:13Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: christinaexyou

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci · 2025-06-20T17:30:13Z

New changes are detected. LGTM label has been removed.

openshift-ci · 2025-06-20T18:14:28Z

@ruivieira: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
ci/prow/trustyai-service-operator-e2e	`87b2673`	link	true	`/test trustyai-service-operator-e2e`

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-merge-robot · 2025-06-25T09:25:26Z

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

fix: Prevent lm-eval command escape

0a62ed6

ruivieira requested review from RobGeada and adolfo-ab June 20, 2025 15:32

ruivieira self-assigned this Jun 20, 2025

ruivieira added the kind/enhancement New feature or request label Jun 20, 2025

sourcery-ai bot reviewed Jun 20, 2025

View reviewed changes

ruivieira added the ok-to-test label Jun 20, 2025

fix: Args test expectation

35b67f3

christinaexyou approved these changes Jun 20, 2025

View reviewed changes

openshift-ci bot assigned christinaexyou Jun 20, 2025

openshift-ci bot added the lgtm label Jun 20, 2025

fix: Device patching

87b2673

openshift-ci bot removed the lgtm label Jun 20, 2025

openshift-merge-robot added the needs-rebase label Jun 25, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: Prevent lm-eval command escape #503

fix: Prevent lm-eval command escape #503

Uh oh!

ruivieira commented Jun 20, 2025 •

edited by sourcery-ai bot

Loading

Uh oh!

sourcery-ai bot commented Jun 20, 2025 •

edited

Loading

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

sourcery-ai bot left a comment

Uh oh!

github-actions bot commented Jun 20, 2025 •

edited

Loading

Uh oh!

openshift-ci bot commented Jun 20, 2025

Uh oh!

openshift-ci bot commented Jun 20, 2025

Uh oh!

openshift-ci bot commented Jun 20, 2025

Uh oh!

openshift-merge-robot commented Jun 25, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

fix: Prevent lm-eval command escape #503

Are you sure you want to change the base?

fix: Prevent lm-eval command escape #503

Uh oh!

Conversation

ruivieira commented Jun 20, 2025 • edited by sourcery-ai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by Sourcery

Uh oh!

sourcery-ai bot commented Jun 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviewer's Guide

Sequence diagram for LMEvalJob input validation in controller

Class diagram for new validation logic and patterns

File-Level Changes

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Jun 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

openshift-ci bot commented Jun 20, 2025

Uh oh!

openshift-ci bot commented Jun 20, 2025

Uh oh!

openshift-ci bot commented Jun 20, 2025

Uh oh!

openshift-merge-robot commented Jun 25, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

ruivieira commented Jun 20, 2025 •

edited by sourcery-ai bot

Loading

sourcery-ai bot commented Jun 20, 2025 •

edited

Loading

github-actions bot commented Jun 20, 2025 •

edited

Loading