I believe the upstream work I am taking on will help or modify how this is approached: apache/datafusion#19973
As in this should be provided as a ColumnStatistic by single-node df
yes, completely agree
```
ProjectionExec: task_count=Some(1) output_rows=184 cost_class=XS accumulated_cost=11592 output_bytes=4784
  SortPreservingMergeExec: task_count=Some(1) output_rows=184 cost_class=L accumulated_cost=8372 output_bytes=6440
    [NetworkBoundary] Coalesce: task_count=Some(1) output_rows=184
      SortExec: task_count=Some(4) output_rows=184 cost_class=XL accumulated_cost=18915 output_bytes=6440
```
haha, cool to see scale-up on compute 😄
Yes! The compute cost calculation is very rough right now; I want to improve it before moving this out of draft, and ideally all the stats used for determining how many bytes will flow through the graph would be provided by DF upstream.
```rust
/// Given a list of children, each with a different compute cost, and a restriction on the maximum
/// number of tasks they are allowed to run in, assigns task counts to them so that the following
/// conditions are met:
```
oh man, I see what you are saying. This is really tricky
```rust
// Adjust total subtasks to match budget (or get as close as possible)
let mut total_subtasks: usize = child_subtask_counts.iter().sum();

// Trim if over budget: reduce from children with the highest subtask count
```
Build a heap, then pop the top, decrement, and push it back? It would reduce time complexity, but I don't know if it's worth the legibility cost.
I do remember some unions being quite large though 🤔
🙈 This does look like a heap-based problem, although I bet the code can get pretty complicated if we go down that path.
I think unless a heap approach clearly demonstrates better benchmarks, we should probably keep this simple.
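For concreteness, a minimal sketch of the heap variant being discussed, for the trim side. The function name and the "keep at least one subtask per child" rule are assumptions, not taken from the PR:

```rust
use std::collections::BinaryHeap;

/// Hypothetical helper: trim `counts` until their sum fits `budget`,
/// always decrementing the child that currently has the most subtasks
/// (the heap top). Assumes each child keeps at least one subtask.
fn trim_to_budget(counts: &mut [usize], budget: usize) {
    // Max-heap of (subtask_count, child_index); ties broken by index.
    let mut heap: BinaryHeap<(usize, usize)> = counts
        .iter()
        .copied()
        .enumerate()
        .map(|(i, c)| (c, i))
        .collect();
    let mut total: usize = counts.iter().sum();
    while total > budget {
        let Some((count, i)) = heap.pop() else { break };
        if count <= 1 {
            break; // every child is already at 1; "as close as possible"
        }
        counts[i] = count - 1;
        total -= 1;
        heap.push((count - 1, i));
    }
}
```

Each pop/push is O(log n), so trimming k subtasks costs O(k log n) instead of the O(k·n) of rescanning for the maximum on every iteration.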
```rust
}

// Expand if under budget: add to children with the highest cost that can expand
while total_subtasks < task_count_budget {
```
Another heap could be used here, I believe: pop the top (highest cost), check if it can expand, increment it, and push it back on.
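Sketching that suggestion too, under the same caveat: the function name, the `max_tasks` per-child cap, and the exact greedy policy are assumptions for illustration, not the PR's code:

```rust
use std::collections::BinaryHeap;

/// Hypothetical helper: grow `counts` toward `budget`, always incrementing
/// the child with the highest compute cost that can still expand
/// (i.e. whose count is below its `max_tasks` cap).
fn expand_to_budget(counts: &mut [usize], costs: &[u64], max_tasks: &[usize], budget: usize) {
    // Max-heap keyed by cost; carries the child index.
    let mut heap: BinaryHeap<(u64, usize)> =
        costs.iter().copied().enumerate().map(|(i, c)| (c, i)).collect();
    let mut total: usize = counts.iter().sum();
    while total < budget {
        let Some((cost, i)) = heap.pop() else { break };
        if counts[i] < max_tasks[i] {
            counts[i] += 1;
            total += 1;
            heap.push((cost, i)); // may receive more budget later
        }
        // children already at their cap are simply dropped from the heap
    }
}
```

Note this greedy policy fills the most expensive child up to its cap before moving on; whether that is preferable to spreading budget more evenly is exactly the kind of thing benchmarks would have to settle.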
I added some tests that check the difference between the estimated number of rows and the actual number of rows here: Conclusion: even if the

nice, let's work to get some of that plumbing in place 😄

No longer working on this
Reworks the distributed planner so that it integrates with DataFusion's upstream statistics system instead of having users provide a custom `TaskEstimator`.
There are some key changes in this PR that rework how this project assigns tasks and stages to a plan:

Remove `TaskEstimator`

Users are no longer free to specify how many tasks should be used for a distributed query: task counts are now derived from the statistics each node reports through the `ExecutionPlan::partition_statistics()` method. Based on these statistics, Distributed DataFusion calculates how many tasks are appropriate.

Rely on upstream statistics system
This PR heavily consumes the `Statistics` provided by the different nodes in order to estimate how much data is going to flow through them. There are still some gaps in upstream statistics that are bridged in this project with some sane defaults.
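As a rough illustration of how a task count could be derived from such statistics, here is a sketch using simplified stand-ins for DataFusion's `Statistics`/`Precision` types (the real ones live in `datafusion::common`); the helper name, the `bytes_per_task` threshold, and the fallback-to-one-task default are all invented:

```rust
/// Simplified stand-in for DataFusion's `Precision<usize>`.
#[derive(Clone, Copy)]
enum Precision {
    Exact(usize),
    Inexact(usize),
    Absent,
}

/// Simplified stand-in for DataFusion's `Statistics`.
struct Statistics {
    total_byte_size: Precision,
}

/// Sketch: derive a task count from the estimated bytes flowing through
/// a node, falling back to a conservative default when no estimate exists.
fn task_count_for(stats: &Statistics, bytes_per_task: usize, max_tasks: usize) -> usize {
    let bytes = match stats.total_byte_size {
        Precision::Exact(b) | Precision::Inexact(b) => b,
        Precision::Absent => return 1, // no estimate: stay conservative
    };
    (bytes / bytes_per_task).clamp(1, max_tasks)
}
```

The real planner would feed this from `ExecutionPlan::partition_statistics()` per node and combine it with the cost model described below.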
Compute cost assignment
One of the biggest additions in this PR is the compute complexity estimation for the different nodes. Each node has a cost attached that is estimated based on how compute-heavy it is.
The cost is measured in "bytes processed", which estimates how many bytes are expected to be processed by the node given the node itself, plus the estimated rows and bytes that are going to flow through it as reported by upstream's `ExecutionPlan::partition_statistics`. The computational complexity is taken into account for the different operators with the following enum:
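The enum itself is not captured in this excerpt. What follows is a plausible reconstruction, inferred only from the `cost_class=XS/L/XL` values visible in the example plan above; the exact variant set and the multipliers are guesses:

```rust
/// Hypothetical reconstruction of the cost-class enum; variant names are
/// inferred from `cost_class=XS/L/XL` in the example plan, and the
/// multipliers below are invented for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
enum CostClass {
    XS, // near-free passthrough, e.g. ProjectionExec
    S,
    M,
    L,  // e.g. SortPreservingMergeExec
    XL, // compute-heavy, e.g. SortExec
}

impl CostClass {
    /// Invented multiplier applied to the bytes flowing through a node
    /// to produce its "bytes processed" cost.
    fn multiplier(self) -> u64 {
        match self {
            CostClass::XS => 1,
            CostClass::S => 2,
            CostClass::M => 4,
            CostClass::L => 8,
            CostClass::XL => 16,
        }
    }
}
```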
Results
TODO: choose better defaults in order to improve performance