Replace Common Mistakes list with structured Common Issues table in spark-python-data-source#253
Closed
CheeYuTan wants to merge 1 commit into databricks-solutions:main from
Conversation
Collaborator
Closing — we'd love to have Common Issues tables, but we'd prefer these consolidated into a single PR rather than one per skill. Feel free to resubmit as a single combined PR if you're up for it!
Summary
Converts the bullet-point "Common Mistakes to Avoid" section into a proper Common Issues table matching the skill template format. Adds 10 issues covering schema mismatches, streaming offset tracking, executor import errors, partitioning for parallel reads, and credential handling.
Test proof
Issues documented are based on common patterns from the PySpark DataSource API documentation and real-world connector development:
- `DataSource` class not found
- `schema()` vs `read()` output
- `commit()`
- `requests` on executor → confirmed failure
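As a hedged illustration of the "`schema()` vs `read()` output" issue the PR documents, the sketch below (not part of the PR itself; all names are hypothetical) shows a plain-Python guard that catches the common mismatch where a data source reader yields rows whose field count differs from the declared schema:

```python
def declared_fields(schema_ddl):
    """Parse a simple DDL-style schema string like 'id INT, name STRING'
    into a list of field names (illustrative only, not a full DDL parser)."""
    return [part.strip().split()[0] for part in schema_ddl.split(",")]


def validate_rows(schema_ddl, rows):
    """Raise ValueError if any row's arity differs from the declared schema.

    In a real PySpark DataSource, a mismatch like this typically surfaces
    only at execution time on the executors; checking eagerly makes the
    error obvious and local.
    """
    n = len(declared_fields(schema_ddl))
    for i, row in enumerate(rows):
        if len(row) != n:
            raise ValueError(
                f"row {i} has {len(row)} fields, schema declares {n}"
            )
    return True
```

For example, `validate_rows("id INT, name STRING", [(1, "a")])` passes, while yielding a three-field row against that two-field schema raises immediately instead of failing deep inside a Spark job.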