Commit c08950b
authored
perf: Improve parquet writing with plain pyarrow (#134)
* perf: Improve parquet writing with plain pyarrow
Signed-off-by: Christoph Auer <[email protected]>
* Smaller fixes
Signed-off-by: Christoph Auer <[email protected]>
* Add pyarrow dep
Signed-off-by: Christoph Auer <[email protected]>
* Fix circular import
Signed-off-by: Christoph Auer <[email protected]>
---------
Signed-off-by: Christoph Auer <[email protected]>1 parent a34f264 commit c08950b
File tree
10 files changed
+189
-510
lines changed- docling_eval
- cli
- datamodels
- dataset_builders
- prediction_providers
- utils
- docs/examples
10 files changed
+189
-510
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
910 | 910 | | |
911 | 911 | | |
912 | 912 | | |
| 913 | + | |
| 914 | + | |
| 915 | + | |
| 916 | + | |
| 917 | + | |
| 918 | + | |
913 | 919 | | |
914 | 920 | | |
915 | 921 | | |
916 | 922 | | |
917 | 923 | | |
918 | 924 | | |
919 | 925 | | |
| 926 | + | |
920 | 927 | | |
921 | 928 | | |
922 | 929 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | 1 | | |
| 2 | + | |
2 | 3 | | |
3 | 4 | | |
4 | 5 | | |
| |||
19 | 20 | | |
20 | 21 | | |
21 | 22 | | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
22 | 78 | | |
23 | 79 | | |
24 | 80 | | |
| |||
51 | 107 | | |
52 | 108 | | |
53 | 109 | | |
| 110 | + | |
| 111 | + | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
| 123 | + | |
| 124 | + | |
| 125 | + | |
54 | 126 | | |
55 | 127 | | |
56 | | - | |
57 | | - | |
58 | | - | |
59 | | - | |
60 | | - | |
61 | | - | |
62 | | - | |
63 | | - | |
64 | | - | |
65 | | - | |
66 | | - | |
67 | | - | |
68 | | - | |
69 | | - | |
70 | | - | |
71 | | - | |
72 | | - | |
73 | | - | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
74 | 134 | | |
75 | 135 | | |
76 | 136 | | |
| |||
207 | 267 | | |
208 | 268 | | |
209 | 269 | | |
| 270 | + | |
| 271 | + | |
| 272 | + | |
| 273 | + | |
| 274 | + | |
| 275 | + | |
| 276 | + | |
| 277 | + | |
| 278 | + | |
| 279 | + | |
| 280 | + | |
| 281 | + | |
| 282 | + | |
| 283 | + | |
| 284 | + | |
| 285 | + | |
| 286 | + | |
210 | 287 | | |
211 | 288 | | |
212 | | - | |
213 | | - | |
214 | | - | |
215 | | - | |
216 | | - | |
217 | | - | |
218 | | - | |
219 | | - | |
220 | | - | |
221 | | - | |
222 | | - | |
223 | | - | |
224 | | - | |
225 | | - | |
226 | | - | |
227 | | - | |
228 | | - | |
229 | | - | |
230 | | - | |
231 | | - | |
232 | | - | |
233 | | - | |
234 | | - | |
235 | | - | |
236 | | - | |
237 | | - | |
238 | | - | |
239 | | - | |
240 | | - | |
| 289 | + | |
| 290 | + | |
| 291 | + | |
| 292 | + | |
| 293 | + | |
| 294 | + | |
241 | 295 | | |
242 | 296 | | |
243 | 297 | | |
| |||
Lines changed: 5 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
47 | 47 | | |
48 | 48 | | |
49 | 49 | | |
| 50 | + | |
50 | 51 | | |
51 | 52 | | |
52 | 53 | | |
| |||
55 | 56 | | |
56 | 57 | | |
57 | 58 | | |
| 59 | + | |
| 60 | + | |
58 | 61 | | |
59 | 62 | | |
60 | 63 | | |
61 | 64 | | |
| 65 | + | |
62 | 66 | | |
63 | 67 | | |
64 | 68 | | |
| |||
799 | 803 | | |
800 | 804 | | |
801 | 805 | | |
802 | | - | |
| 806 | + | |
803 | 807 | | |
804 | 808 | | |
805 | 809 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
305 | 305 | | |
306 | 306 | | |
307 | 307 | | |
| 308 | + | |
308 | 309 | | |
309 | | - | |
310 | 310 | | |
311 | 311 | | |
312 | 312 | | |
| |||
Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
398 | 398 | | |
399 | 399 | | |
400 | 400 | | |
| 401 | + | |
401 | 402 | | |
402 | | - | |
403 | 403 | | |
404 | 404 | | |
405 | 405 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
7 | 7 | | |
8 | 8 | | |
9 | 9 | | |
10 | | - | |
| 10 | + | |
11 | 11 | | |
12 | 12 | | |
13 | 13 | | |
14 | 14 | | |
15 | 15 | | |
16 | 16 | | |
17 | 17 | | |
| 18 | + | |
18 | 19 | | |
19 | 20 | | |
20 | 21 | | |
| |||
78 | 79 | | |
79 | 80 | | |
80 | 81 | | |
81 | | - | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
82 | 85 | | |
83 | 86 | | |
84 | 87 | | |
85 | 88 | | |
86 | | - | |
| 89 | + | |
87 | 90 | | |
88 | 91 | | |
89 | 92 | | |
| |||
97 | 100 | | |
98 | 101 | | |
99 | 102 | | |
100 | | - | |
| 103 | + | |
101 | 104 | | |
102 | 105 | | |
103 | 106 | | |
| |||
106 | 109 | | |
107 | 110 | | |
108 | 111 | | |
109 | | - | |
| 112 | + | |
| 113 | + | |
| 114 | + | |
| 115 | + | |
| 116 | + | |
110 | 117 | | |
111 | 118 | | |
112 | 119 | | |
| |||
489 | 496 | | |
490 | 497 | | |
491 | 498 | | |
| 499 | + | |
| 500 | + | |
| 501 | + | |
| 502 | + | |
| 503 | + | |
| 504 | + | |
| 505 | + | |
492 | 506 | | |
493 | 507 | | |
494 | 508 | | |
| 509 | + | |
495 | 510 | | |
496 | 511 | | |
497 | | - | |
498 | | - | |
499 | 512 | | |
500 | | - | |
| 513 | + | |
501 | 514 | | |
502 | 515 | | |
503 | 516 | | |
504 | | - | |
505 | | - | |
506 | | - | |
507 | | - | |
508 | | - | |
509 | | - | |
510 | | - | |
511 | | - | |
512 | | - | |
513 | | - | |
| 517 | + | |
| 518 | + | |
514 | 519 | | |
515 | | - | |
| 520 | + | |
| 521 | + | |
| 522 | + | |
516 | 523 | | |
517 | 524 | | |
518 | 525 | | |
519 | 526 | | |
| 527 | + | |
| 528 | + | |
| 529 | + | |
| 530 | + | |
| 531 | + | |
| 532 | + | |
| 533 | + | |
| 534 | + | |
| 535 | + | |
| 536 | + | |
| 537 | + | |
| 538 | + | |
| 539 | + | |
| 540 | + | |
| 541 | + | |
| 542 | + | |
| 543 | + | |
| 544 | + | |
| 545 | + | |
| 546 | + | |
| 547 | + | |
| 548 | + | |
| 549 | + | |
| 550 | + | |
| 551 | + | |
| 552 | + | |
| 553 | + | |
| 554 | + | |
| 555 | + | |
| 556 | + | |
| 557 | + | |
| 558 | + | |
| 559 | + | |
| 560 | + | |
| 561 | + | |
| 562 | + | |
| 563 | + | |
| 564 | + | |
| 565 | + | |
| 566 | + | |
| 567 | + | |
| 568 | + | |
| 569 | + | |
| 570 | + | |
520 | 571 | | |
521 | 572 | | |
522 | 573 | | |
| |||
0 commit comments