Skip to content

Conversation

@oerling
Copy link
Contributor

@oerling oerling commented Aug 9, 2025

No description provided.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Aug 9, 2025
@facebook-github-bot
Copy link
Contributor

@oerling has imported this pull request. If you are a Meta employee, you can view this in D79934460.

@mbasmanova mbasmanova changed the title fix: Add commentts to clarify Distribution fix: Add comments to clarify Distribution Aug 9, 2025
@mbasmanova mbasmanova changed the title fix: Add comments to clarify Distribution docs: Add comments to clarify Distribution Aug 9, 2025
@mbasmanova
Copy link
Contributor

CC: @feilong-liu @MBkkt

Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@oerling Thank you for adding these comments. Super helpful. Please, fix typos before landing.

/// data at rest and workers for data in flight based on values of
/// special columns. We use the word partitioning to mean that where
/// a row of data at rest or in flight is found depends on a
/// function of one or more columns of the row. Hash partitioning nd
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: partitioning nd -> partitioning and

/// rows of a partition are the rows of a dataset for which the
/// partitioning function satisfy some criteria, e.g. has % number
/// of partitions == id of partition. The corresponding concept in
/// Hive, e.g. Presto or Spark is bucketing. the word partitioning
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: the word -> The word


/// Distribution of data. 'numPartitions' is 1 if the data is not partitioned.
/// There is copartitioning if the DistributionType is the same on both sides
/// There is copartitioning if the DistributionType iscompatible on both sides
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: iscompatible -> is compatible

Should there be a function that takes 2 DistributionType's and returns true if they are compatible?

ShuffleMode mode{ShuffleMode::kNone};
int32_t numPartitions{1};
LocusCP locus{nullptr};
bool isGather{false};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you document locus and isGather properties? Why do we need 'locus'?

/// True if 'other' has the same ordering columns and order type.
bool isSameOrder(const Distribution& other) const;

/// makes a new Distribution where 'exprs' are replaced with the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: makes -> Makes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Meta Open Source bot.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants