Skip to content

Conversation

@ysinghc
Copy link
Contributor

@ysinghc ysinghc commented Nov 11, 2025

SUMMARY

This PR adds comprehensive validation for the query_context field in both ChartPostSchema and ChartPutSchema to ensure that when provided, it contains the required metadata fields (datasource and queries) needed for chart data retrieval.

Changes:

  • Created a new validate_query_context_metadata() function in superset/utils/schema.py that validates the structure of query_context JSON
  • Updated ChartPostSchema to use the new validator (previously used generic validate_json)
  • Updated ChartPutSchema to add the new validator (previously had no validation)
  • Added comprehensive unit tests for the new validator in tests/unit_tests/utils/test_schema.py
  • Added integration tests in tests/unit_tests/charts/test_schemas.py to verify schema-level validation

Rationale:
Previously, the query_context field only validated that the value was valid JSON, but didn't ensure it contained the necessary structure for chart operations. This could lead to runtime errors when charts are rendered with incomplete query context data. The new validation ensures data integrity at the API level.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

N/A - Backend validation change with no UI impact

TESTING INSTRUCTIONS

  1. Run the unit tests:

    pytest tests/unit_tests/utils/test_schema.py -v
    pytest tests/unit_tests/charts/test_schemas.py::test_chart_post_schema_query_context_validation -v
    pytest tests/unit_tests/charts/test_schemas.py::test_chart_put_schema_query_context_validation -v
  2. Manual API testing (optional):

    • Start Superset in development mode
    • Try to create a chart (POST /api/v1/chart/) with invalid query_context (missing datasource or queries)
    • Verify that the API returns a validation error
    • Try with valid query_context containing both fields - should succeed
    • Try with null query_context - should succeed (field is nullable)
  3. Verify pre-commit hooks pass:

    git add .
    pre-commit run

ADDITIONAL INFORMATION

@bito-code-review
Copy link
Contributor

bito-code-review bot commented Nov 11, 2025

Code Review Agent Run #50c92e

Actionable Suggestions - 0
Additional Suggestions - 10
  • tests/unit_tests/charts/test_schemas.py - 2
    • Docstring missing period at end · Line 161-161
      Docstring should end with a period. This issue also appears on line 246.
      Code suggestion
       @@ -161,1 +161,1 @@
      -    """Test that ChartPostSchema validates query_context contains required metadata"""
      +    """Test that ChartPostSchema validates query_context contains required metadata."""
    • Missing trailing comma in dictionary · Line 169-169
      Missing trailing comma in dictionary/list literals. This issue appears on lines 169, 192, 254, and 265.
      Code suggestion
       @@ -168,2 +168,2 @@
      -            "queries": [{"metrics": ["count"], "columns": []}],
      -        }
      +            "queries": [{"metrics": ["count"], "columns": []}],
      +        }
  • superset/utils/schema.py - 3
    • Docstring format and imperative mood · Line 57-57
      Docstring should start summary on first line and use imperative mood. Change `"Validator for query_context field"` to `"Validate query_context field"`.
      Code suggestion
       @@ -56,4 +56,3 @@
       def validate_query_context_metadata(value: bytes | bytearray | str | None) -> None:
      -    """
      -    Validator for query_context field to ensure it contains required metadata.
      +    """Validate query_context field to ensure it contains required metadata.
      
           Validates that the query_context JSON contains the required 'datasource' and
    • String literal in exception message · Line 72-72
      Exception message `"JSON not valid"` should be assigned to a variable first. This is part of multiple similar issues (lines 53, 72, 76) that should be addressed consistently.
      Code suggestion
       @@ -69,3 +69,4 @@
           try:
               parsed_data = json.loads(value)
           except json.JSONDecodeError as ex:
      -        raise ValidationError("JSON not valid") from ex
      +        error_msg = "JSON not valid"
      +        raise ValidationError(error_msg) from ex
    • String literal in exception message · Line 76-76
      Exception message `"Query context must be a valid JSON object"` should be assigned to a variable first. This is part of multiple similar issues that should be addressed consistently.
      Code suggestion
       @@ -74,2 +74,3 @@
           # Validate required fields exist in the query_context
           if not isinstance(parsed_data, dict):
      -        raise ValidationError("Query context must be a valid JSON object")
      +        error_msg = "Query context must be a valid JSON object"
      +        raise ValidationError(error_msg)
  • tests/unit_tests/utils/test_schema.py - 5
    • Missing module docstring at file start · Line 1-1
      Add a module-level docstring to describe the purpose of this test file. This improves code documentation and follows Python conventions.
      Code suggestion
       @@ -17,6 +17,9 @@
      -
      -import pytest
      +"""
      +Unit tests for superset.utils.schema module.
      +"""
      +
      +import pytest
    • Docstring should end with period punctuation · Line 30-30
      Multiple docstrings throughout the file are missing proper punctuation. Function docstrings should end with a period. This affects lines 30, 37, 44, 52, 58, 65, 73, 79, 91, 103, 111, 119, 131, 141, 152, 166, 178, 190, 202, 208, 215, 223, 231, 239, 270, 277, 284, 291, 298, 305, 312, and 321.
      Code suggestion
       @@ -29,2 +29,2 @@
      -def test_validate_json_valid() -> None:
      -    """Test validate_json with valid JSON string"""
      +def test_validate_json_valid() -> None:
      +    """Test validate_json with valid JSON string."""
    • Missing trailing comma in dictionary literal · Line 84-84
      Multiple dictionary literals are missing trailing commas. This affects lines 84, 96, 121, 159, 171, 183, 195, 259, and 263. Adding trailing commas improves code consistency and makes diffs cleaner.
      Code suggestion
       @@ -81,5 +81,5 @@
      -        {
      -            "datasource": {"type": "table", "id": 1},
      -            "queries": [{"metrics": ["count"], "columns": []}],
      -        }
      +        {
      +            "datasource": {"type": "table", "id": 1},
      +            "queries": [{"metrics": ["count"], "columns": []}],
      +        }
    • Boolean positional argument in function call · Line 232-232
      Boolean value `True` is passed as a positional argument to `json.dumps()`. Consider using a keyword argument for better readability.
      Code suggestion
       @@ -231,2 +231,2 @@
      -    """Test validate_query_context_metadata with boolean instead of JSON object"""
      -    bool_value = json.dumps(True)
      +    """Test validate_query_context_metadata with boolean instead of JSON object"""
      +    bool_value = json.dumps(obj=True)
    • Magic number used in comparison statement · Line 301-301
      Magic numbers `2` are used in comparisons on lines 301 and 317. Consider defining these as named constants for better code maintainability.
      Code suggestion
       @@ -298,4 +298,5 @@
      -def test_one_of_case_insensitive_non_string_valid() -> None:
      -    """Test OneOfCaseInsensitive validator with non-string valid value"""
      -    validator = OneOfCaseInsensitive([1, 2, 3])
      -    result = validator(2)
      +def test_one_of_case_insensitive_non_string_valid() -> None:
      +    """Test OneOfCaseInsensitive validator with non-string valid value"""
      +    VALID_NUMBER = 2
      +    validator = OneOfCaseInsensitive([1, VALID_NUMBER, 3])
      +    result = validator(VALID_NUMBER)
Review Details
  • Files reviewed - 4 · Commit Range: 287674d..287674d
    • superset/charts/schemas.py
    • superset/utils/schema.py
    • tests/unit_tests/charts/test_schemas.py
    • tests/unit_tests/utils/test_schema.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Default Agent You can customize the agent settings here or contact your Bito workspace admin at [email protected].

Documentation & Help

AI Code Review powered by Bito Logo

@dosubot dosubot bot added the api Related to the REST API label Nov 11, 2025
@ysinghc ysinghc changed the title feat(schema): add validation for query_context metadata in ChartPostSchema and ChartPutSchema fix(charts): add query_context validation to prevent incomplete metadata Nov 11, 2025
Copy link

@korbit-ai korbit-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category Issue Status
Design Duplicate JSON validation logic ▹ view
Design Imperative field validation ▹ view
Functionality Falsy JSON strings incorrectly bypass validation ▹ view
Performance Inefficient field validation with list collection ▹ view
Files scanned
File Path Reviewed
superset/utils/schema.py
superset/charts/schemas.py

Explore our documentation to understand the languages and file types we support and the files we ignore.

Check out our docs on how you can make Korbit work best for you and your team.

Loving Korbit!? Share us on LinkedIn Reddit and X

@codecov
Copy link

codecov bot commented Nov 11, 2025

Codecov Report

❌ Patch coverage is 20.00000% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 69.11%. Comparing base (d123249) to head (b267f2f).
⚠️ Report is 24 commits behind head on master.

Files with missing lines Patch % Lines
superset/utils/schema.py 20.00% 12 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #36076       +/-   ##
===========================================
+ Coverage        0   69.11%   +69.11%     
===========================================
  Files           0      632      +632     
  Lines           0    47959    +47959     
  Branches        0     5467     +5467     
===========================================
+ Hits            0    33146    +33146     
- Misses          0    13536    +13536     
- Partials        0     1277     +1277     
Flag Coverage Δ
hive 44.21% <6.66%> (?)
mysql 68.03% <20.00%> (?)
postgres 68.08% <20.00%> (?)
presto 48.34% <6.66%> (?)
python 69.07% <20.00%> (?)
sqlite 67.70% <20.00%> (?)
unit 100.00% <ø> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions github-actions bot removed the api Related to the REST API label Nov 11, 2025
ysinghc

This comment was marked as duplicate.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds validation to the query_context field in chart schemas to ensure it contains required metadata (datasource and queries) when provided, preventing runtime errors from incomplete query context data.

  • Created a new validate_query_context_metadata() function that validates both JSON structure and required fields
  • Updated ChartPostSchema to use the new validator (replacing generic validate_json)
  • Updated ChartPutSchema to add the new validator (previously had no validation)
  • Added comprehensive unit and integration tests

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
superset/utils/schema.py Adds validate_query_context_metadata() function to validate query_context JSON structure and required fields
superset/charts/schemas.py Updates ChartPostSchema and ChartPutSchema to use the new validator for query_context field
tests/unit_tests/utils/test_schema.py Adds comprehensive unit tests for the new validator covering valid/invalid cases, edge cases, and error messages
tests/unit_tests/charts/test_schemas.py Adds integration tests verifying schema-level validation for both POST and PUT operations

@ysinghc ysinghc marked this pull request as draft November 16, 2025 13:03
@ysinghc ysinghc marked this pull request as ready for review November 16, 2025 13:03
@dosubot dosubot bot added the api Related to the REST API label Nov 16, 2025
Copy link

@korbit-ai korbit-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've completed my review and didn't find any issues.

Files scanned
File Path Reviewed
superset/utils/schema.py
superset/charts/schemas.py

Explore our documentation to understand the languages and file types we support and the files we ignore.

Check out our docs on how you can make Korbit work best for you and your team.

Loving Korbit!? Share us on LinkedIn Reddit and X

@rusackas
Copy link
Member

Running CI, but I think i defer to @betodealmeida here :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api Related to the REST API size/L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

"error": "Dataset schema is invalid, caused by: QueryContextFactory.create() missing 2 required keyword-only arguments: 'datasource' and 'queries'"

2 participants