Skip to content

Conversation

@ruivieira
Copy link
Member

@ruivieira ruivieira commented Jun 20, 2025

Summary by Sourcery

Implement strict input validation and secure command generation for LMEvalJob: introduce ValidateUserInput, integrate it into the reconciler to reject malicious CRs, harden CRD schema with patterns/enums, refactor CLI construction to use argument slices, and expand test coverage accordingly.

New Features:

  • Integrate ValidateUserInput into the controller to reject invalid LMEvalJob specs before execution
  • Enforce CRD-level validation via kubebuilder tags and schema patterns/enums on model names, arguments, and other fields
  • Provide structured command argument generation that returns []string instead of concatenated shell commands

Bug Fixes:

  • Prevent command injection by removing sh -ec and sanitizing all CLI inputs

Enhancements:

  • Refactor generateArgs and generateCmd to remove shell wrappers and build safe argument slices
  • Extend API types and CRD YAML with Pattern and Enum constraints to harden input schemas
  • Update controller to mark jobs as failed on validation errors

Tests:

  • Update existing controller tests to use new argument format and default model values
  • Add comprehensive validation tests for models, args, limits, system instructions, chat templates, artifacts, Git sources, and boundary cases

@ruivieira ruivieira requested review from RobGeada and adolfo-ab June 20, 2025 15:32
@ruivieira ruivieira self-assigned this Jun 20, 2025
@ruivieira ruivieira added the kind/enhancement New feature or request label Jun 20, 2025
@sourcery-ai
Copy link
Contributor

sourcery-ai bot commented Jun 20, 2025

Reviewer's Guide

This PR refactors command construction to eliminate shell invocation, integrates comprehensive input validation in the controller and CRD schemas to block unsafe patterns, and updates tests to reflect these changes.

Sequence diagram for LMEvalJob input validation in controller

sequenceDiagram
    participant Controller
    participant Validation
    participant StatusUpdate
    Controller->>Validation: ValidateUserInput(job)
    alt Validation fails
        Validation-->>Controller: error
        Controller->>StatusUpdate: Update job.Status (State=Complete, Reason=Failed, Message)
        Controller-->>Controller: Log error
        Controller-->>Controller: Return error
    else Validation passes
        Validation-->>Controller: nil
        Controller->>Controller: Continue with job processing
    end
Loading

Class diagram for new validation logic and patterns

classDiagram
    class LMEvalJob {
        +LMEvalJobSpec Spec
        +LMEvalJobStatus Status
    }
    class LMEvalJobSpec {
        +string Model
        +[]Arg ModelArgs
        +TaskList TaskList
        +string Limit
        +[]Arg GenArgs
        +*OfflineSpec Offline
        +*int BatchSize
        +string SystemInstruction
        +*ChatTemplate ChatTemplate
    }
    class TaskList {
        +[]string TaskNames
        +[]TaskRecipe TaskRecipes
        +*CustomArtifacts CustomArtifacts
        +*CustomTasks CustomTasks
    }
    class Validation {
        +ValidateUserInput(job)
        +ValidateModelName(model)
        +ValidateArgs(args, argType)
        +ValidateTaskNames(taskNames)
        +ValidateTaskRecipes(args, argType)
        +ValidateCustomArtifactValue(input)
        +ValidateLimit(limit)
        +ValidateS3Path(path)
        +ValidateBatchSizeInput(batchSize)
        +ValidateSystemInstruction(instruction)
        +ValidateChatTemplateName(name)
        +ValidateGitURL(url)
        +ValidateGitPath(path)
        +ValidateGitBranch(branch)
        +ValidateGitCommit(commit)
        +ContainsShellMetacharacters(input)
    }
    class Patterns {
        +PatternModelName
        +PatternTemplateName
        +PatternArgName
        +PatternArgValue
        +PatternLimit
        +PatternTaskName
        +AllowedModels
    }
    LMEvalJobSpec o-- TaskList
    TaskList o-- CustomArtifacts
    TaskList o-- CustomTasks
    LMEvalJob o-- LMEvalJobSpec
    LMEvalJob o-- LMEvalJobStatus
    Validation ..> Patterns : uses
Loading

File-Level Changes

Change Details Files
Switch command generation to build argument slices and remove shell wrapper
  • Removed use of strings.Builder and "sh -ec" wrapper
  • Split the lm_eval command and flags into separate slice elements
  • Appended each flag and its value via cmds = append
  • Updated generateCmd/generateArgs to return []string without shell invocation
controllers/lmes/lmevaljob_controller.go
controllers/lmes/lmevaljob_controller_test.go
Introduce ValidateUserInput logic and integrate it into the reconciler
  • New validation.go with functions for model, args, tasks, artifacts, limit, S3 path, batch size, system instruction, chat template, and Git source
  • patterns.go defining regex patterns and allowed model map
  • handleNewCR now calls ValidateUserInput and marks job as failed on validation error
controllers/lmes/validation.go
controllers/lmes/patterns.go
controllers/lmes/lmevaljob_controller.go
Annotate API types and CRD YAML with kubebuilder validation tags
  • Added +kubebuilder:validation:Pattern and Enum tags to Arg, Card, Template, SystemPrompt, Task, CustomArtifact, GitSource, ChatTemplate, and LMEvalJobSpec fields
  • Updated CRD base YAML to match new patterns and enforce field constraints
api/lmes/v1alpha1/lmevaljob_types.go
config/crd/bases/trustyai.opendatahub.io_lmevaljobs.yaml
Expand test coverage for validation and command generation
  • Updated existing controller tests to expect argument slices instead of shell strings
  • Added validation tests in lmevaljob_controller_validation_test.go covering ValidateUserInput and individual validators
  • Added integration test Test_ControllerIntegration to verify validation and command generation in the controller
controllers/lmes/lmevaljob_controller_test.go
controllers/lmes/lmevaljob_controller_validation_test.go
Suppress false-positive security warnings on exec.Command
  • Added // #nosec G204 comments before exec.Command calls for git clone, checkout, and file copy steps
  • Rationale: inputs are validated upstream
controllers/lmes/driver/driver.go

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @ruivieira - I've reviewed your changes - here's some feedback:

  • The PR mixes shell‐escaping fixes and extensive input validation/CRD changes — consider splitting it into smaller, focused PRs (one for command generation and one for validation/schema updates) for easier review and maintainability.
  • The validation logic and kubebuilder regex tags are duplicated across patterns.go, validation.go and the CRD YAML — consider centralizing these patterns to avoid drift and ensure consistency.
  • Injecting ValidateUserInput in the controller is helpful, but it might be cleaner to move input validation into a webhook (or a dedicated admission step) so invalid CRs are rejected before reconciliation.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The PR mixes shell‐escaping fixes and extensive input validation/CRD changes — consider splitting it into smaller, focused PRs (one for command generation and one for validation/schema updates) for easier review and maintainability.
- The validation logic and kubebuilder regex tags are duplicated across patterns.go, validation.go and the CRD YAML — consider centralizing these patterns to avoid drift and ensure consistency.
- Injecting ValidateUserInput in the controller is helpful, but it might be cleaner to move input validation into a webhook (or a dedicated admission step) so invalid CRs are rejected before reconciliation.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@github-actions
Copy link

github-actions bot commented Jun 20, 2025

PR image build and manifest generation completed successfully!

📦 PR image: quay.io/trustyai/trustyai-service-operator-ci:87b2673381de5a5ce1f121aad52066362800e103

📦 LMES driver image: quay.io/trustyai/ta-lmes-driver-ci:87b2673381de5a5ce1f121aad52066362800e103

📦 LMES job image: quay.io/trustyai/ta-lmes-job-ci:87b2673381de5a5ce1f121aad52066362800e103

📦 Guardrails orchestrator image: quay.io/trustyai/ta-guardrails-orchestrator-ci:87b2673381de5a5ce1f121aad52066362800e103

🗂️ CI manifests

      devFlags:
        manifests:
          - contextDir: config
            sourcePath: ''
            uri: https://api.github.com/repos/trustyai-explainability/trustyai-service-operator-ci/tarball/operator-87b2673381de5a5ce1f121aad52066362800e103

@openshift-ci
Copy link

openshift-ci bot commented Jun 20, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: christinaexyou

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot removed the lgtm label Jun 20, 2025
@openshift-ci
Copy link

openshift-ci bot commented Jun 20, 2025

New changes are detected. LGTM label has been removed.

@openshift-ci
Copy link

openshift-ci bot commented Jun 20, 2025

@ruivieira: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/trustyai-service-operator-e2e 87b2673 link true /test trustyai-service-operator-e2e

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-robot
Copy link
Collaborator

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants