refactor: Remove unused confusing code #463
@mbasmanova will you review, please?
/// Distribution of data.
/// There is copartitioning if the DistributionType is the same on both sides
/// and both sides have an equal number of 1:1 type matched partitioning keys.
struct DistributionType {
DistributionType is incomplete, but removing numPartitions is not going to fix that. It needs more information about partitioning. It looks like @hdikeman is wrapping up Connector API changes for Table Write and I should have bandwidth to work on adding TableWrite support to the optimizer. As part of that work I expect to come and revisit DistributionType struct.
I think I completed it in a different PR, where partitioning is a function.
But that is a fairly nontrivial PR with several separate parts, so I decided to start with the simple part.
The number of partitions is not needed, because the partition count is controlled by optimizer options (numDrivers/numWorkers) when we plan a partition.
It also isn't needed for broadcast/gather, because they have specific behavior.
The only case where it is needed is table write, but for that case it should be implemented in a more abstract way, with a partition type/function (that internally holds the number of buckets, for Hive for example).
Table scan would need it too, but it doesn't implement partitioning right now.
About @hdikeman's work: I have a PR that implements TableWrite in the optimizer, and I plan to rebase it after Henry's PR with the connector API changes is merged.
So that PR will contain the implementation of TableWrite in the optimizer.
It will cover TestConnector and LocalHiveConnectorMetadata, but with copartitioning for Hive disabled, because that requires some changes in the runner.
Does that sound OK to you?
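To make the point about dropping the partition count concrete, here is a rough sketch of the copartitioning rule from the doc comment in the diff, checked without any numPartitions field. All names here (PartitionKind, KeyType, DistributionSketch, isCopartitioned) are hypothetical and chosen only for illustration; this is not the project's actual DistributionType or its API.

```cpp
#include <string>
#include <vector>

// Hypothetical names throughout; not the project's real types.
enum class PartitionKind { kHash, kBroadcast, kGather };

struct KeyType {
  std::string name;  // Type of one partitioning key, e.g. "BIGINT".
  bool operator==(const KeyType& other) const { return name == other.name; }
};

// A distribution described only by its kind and the types of its keys.
// There is deliberately no partition count here: the assumption, taken
// from the comment above, is that the planner derives it from its own
// options (numDrivers/numWorkers) when it plans the partitioning.
struct DistributionSketch {
  PartitionKind kind;
  std::vector<KeyType> keyTypes;
};

// Copartitioning as the doc comment states it: the same distribution
// type on both sides, and an equal number of partitioning keys whose
// types match 1:1 (vector equality checks both size and elements).
inline bool isCopartitioned(
    const DistributionSketch& left,
    const DistributionSketch& right) {
  return left.kind == right.kind && left.keyTypes == right.keyTypes;
}
```

A table-write case that needs a bucket count would, under this assumption, carry it inside a partition type/function object rather than as a field on the distribution itself.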
@MBkkt Any chance you could rebase it now? I'd like to start reading it without waiting for Henry's PR to land.
I'm closing this PR because this change is covered by my other, more complete PR: #498
We don't read from this variable except in to string. And we never write to this variable except in one case.
So in most cases to string will produce incorrect information.
Also, this variable isn't really needed, so I think it's better to remove it.
Even for partitioned write etc., it isn't used in any ongoing PR.