fix(csv-upload): log detailed errors during chunk concatenation for debugging #36108
I've completed my review and didn't find any issues.
Files scanned
| File Path | Reviewed |
|---|---|
| superset/commands/database/uploaders/csv_reader.py | ✅ |
Codecov Report❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #36108 +/- ##
===========================================
+ Coverage 0 68.21% +68.21%
===========================================
Files 0 629 +629
Lines 0 46094 +46094
Branches 0 4996 +4996
===========================================
+ Hits 0 31442 +31442
- Misses 0 13405 +13405
- Partials 0 1247 +1247
Superset uses Git pre-commit hooks courtesy of pre-commit. To install run the following: A series of checks will now run when you make a git commit. Alternatively it is possible to run pre-commit by running pre-commit manually:
    try:
        result = pd.concat(chunks, ignore_index=False)
    except Exception as ex:
        logger.warning(
            "Error concatenating CSV chunks: %s. "
            "This may be due to inconsistent date parsing across chunks.",
            str(ex),
        )
        raise
If we have repro steps, can't we just run this locally to get the logs and find the root cause, or are we saying this is intermittent? Also, if we are merging this, can we add a quick test to make sure the exception is thrown and is still raised after it is logged?
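One possible shape for the requested test, as a rough sketch rather than the project's actual test code: the helper `concat_with_logging`, the logger name, and the `failing_concat` stub are all hypothetical, and the concat function is injected so the example runs without pandas. In the real code path the injected callable would correspond to `pd.concat`.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("csv_reader_sketch")


def concat_with_logging(chunks, concat):
    """Hypothetical stand-in for the patched block; `concat` plays the role of pd.concat."""
    try:
        return concat(chunks)
    except Exception as ex:
        logger.warning(
            "Error concatenating CSV chunks: %s. "
            "This may be due to inconsistent date parsing across chunks.",
            str(ex),
        )
        raise  # the original exception must still propagate


def failing_concat(chunks):
    # Stub that simulates a concatenation failure.
    raise ValueError("simulated concat failure")


def check_logs_and_reraises():
    """Return (raised, messages): whether the error propagated and what was logged."""
    records = []

    class _Collector(logging.Handler):
        def emit(self, record):
            records.append(record)

    handler = _Collector(level=logging.WARNING)
    logger.addHandler(handler)
    raised = False
    try:
        concat_with_logging([1, 2], failing_concat)
    except ValueError:
        raised = True
    finally:
        logger.removeHandler(handler)
    return raised, [r.getMessage() for r in records]
```

In a pytest suite the same check would more likely use `pytest.raises` together with the `caplog` fixture; the hand-rolled handler here only keeps the sketch dependency-free.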
SUMMARY
This PR adds error logging around CSV chunk concatenation to improve debugging of upload failures.
When uploading CSVs with date columns, pandas emits a UserWarning about date format inference, which is followed by an exception during chunk concatenation. This is defensive code to help us catch what is actually causing the error. It provides visibility into the root cause of concatenation failures without changing functionality.
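Since the failure is preceded by a UserWarning, one complementary idea (not part of this PR, and only a sketch) would be to record warnings raised during parsing so they land in the same log as the concatenation error. The function `run_and_log_warnings`, its `parse` argument, and the logger name are hypothetical; only the standard-library `warnings` and `logging` modules are used.

```python
import logging
import warnings

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("csv_warning_sketch")


def run_and_log_warnings(parse):
    """Call `parse()` and forward any UserWarning it emits to the logger.

    `parse` is a hypothetical zero-argument callable standing in for the
    chunked CSV read. Returns (result, list of captured warning messages).
    """
    with warnings.catch_warnings(record=True) as caught:
        # Make sure repeated warnings are not suppressed while recording.
        warnings.simplefilter("always")
        result = parse()
    for w in caught:
        if issubclass(w.category, UserWarning):
            logger.warning("Warning during CSV parsing: %s", w.message)
    return result, [str(w.message) for w in caught]
```

Whether this is worth adding depends on what the logs from the change above reveal; it is shown only to illustrate how the warning and the exception could be correlated.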