Skip to content

Conversation

@gabotorresruiz
Copy link
Contributor

SUMMARY

This PR adds error logging around CSV chunk concatenation to improve debugging of upload failures.

When uploading CSVs with date columns, pandas emits a UserWarning about date format inference, which is followed by an exception during chunk concatenation.

This is just some defensive code to see if we can actually catch what is causing the error. This provides visibility into the root cause of concatenation failures without changing functionality.

@bito-code-review
Copy link
Contributor

bito-code-review bot commented Nov 13, 2025

Code Review Agent Run #cd4045

Actionable Suggestions - 0
Additional Suggestions - 1
  • superset/commands/database/uploaders/csv_reader.py - 1
    • Line exceeds maximum length limit · Line 422-422
      Line 422 exceeds the 88-character limit (90 characters). Consider breaking the string into multiple lines or using a shorter variable name to improve readability.
      Code suggestion
       @@ -421,3 +421,4 @@
                                logger.warning(
      -                            "Error concatenating CSV chunks: %s. "
      -                            "This may be due to inconsistent date parsing across chunks.",
      +                            "Error concatenating CSV chunks: %s. "
      +                            "This may be due to inconsistent date parsing "
      +                            "across chunks.",
                                    str(ex),
Review Details
  • Files reviewed - 1 · Commit Range: da7c8a6..da7c8a6
    • superset/commands/database/uploaders/csv_reader.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Default Agent You can customize the agent settings here or contact your Bito workspace admin at [email protected].

Documentation & Help

AI Code Review powered by Bito Logo

Copy link

@korbit-ai korbit-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've completed my review and didn't find any issues.

Files scanned
File Path Reviewed
superset/commands/database/uploaders/csv_reader.py

Explore our documentation to understand the languages and file types we support and the files we ignore.

Check out our docs on how you can make Korbit work best for you and your team.

Loving Korbit!? Share us on LinkedIn Reddit and X

@dosubot dosubot bot added the data:csv Related to import/export of CSVs label Nov 13, 2025
@codecov
Copy link

codecov bot commented Nov 13, 2025

Codecov Report

❌ Patch coverage is 40.00000% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.21%. Comparing base (fb8eb2a) to head (9fef1d9).
⚠️ Report is 7 commits behind head on master.

Files with missing lines Patch % Lines
superset/commands/database/uploaders/csv_reader.py 40.00% 3 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #36108       +/-   ##
===========================================
+ Coverage        0   68.21%   +68.21%     
===========================================
  Files           0      629      +629     
  Lines           0    46094    +46094     
  Branches        0     4996     +4996     
===========================================
+ Hits            0    31442    +31442     
- Misses          0    13405    +13405     
- Partials        0     1247     +1247     
Flag Coverage Δ
hive 43.80% <0.00%> (?)
mysql 67.32% <40.00%> (?)
postgres 67.37% <40.00%> (?)
presto 47.42% <0.00%> (?)
python 68.18% <40.00%> (?)
sqlite 66.99% <40.00%> (?)
unit 100.00% <ø> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@gabotorresruiz gabotorresruiz force-pushed the fix/user-warning-raising-error branch from da7c8a6 to f9d7ec4 Compare November 15, 2025 03:10
@gabotorresruiz gabotorresruiz force-pushed the fix/user-warning-raising-error branch from f9d7ec4 to f210704 Compare November 17, 2025 17:19
@sadpandajoe
Copy link
Member

Superset uses Git pre-commit hooks courtesy of pre-commit. To install run the following:

pip3 install -r requirements/development.txt
pre-commit install

A series of checks will now run when you make a git commit.

Alternatively it is possible to run pre-commit by running pre-commit manually:

pre-commit run --all-files

Comment on lines 417 to 426
try:
result = pd.concat(chunks, ignore_index=False)
except Exception as ex:
logger.warning(
"Error concatenating CSV chunks: %s. "
"This may be due to inconsistent date parsing across chunks.",
str(ex),
)
raise

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have repro steps, can we not just run this locally to get the logs for root cause or are we saying this is intermittent? Also if we are merging this, can we just have a quick test to make sure exception is thrown and is still raised afterwards.

@pull-request-size pull-request-size bot added size/M and removed size/S labels Nov 17, 2025
@sadpandajoe sadpandajoe merged commit 9d06a58 into apache:master Nov 17, 2025
67 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

data:csv Related to import/export of CSVs size/M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants