Conversation
3c12ccf to
00c533b
Compare
paleolimbot
left a comment
There was a problem hiding this comment.
Thank you!
I don't think it is necessary for this PR, but the ObjectStore may be a good future choice (here it would let you unify the local and s3 code path and make it easier to add future gcs or azure support).
It may be worth setting up a test for this in CI using MinIO or another S3 alternative as otherwise I don't think there are any tests here to validate the behaviour.
|
@prantogg Does this work with the |
james-willis
left a comment
There was a problem hiding this comment.
LGTM. I think Dewey's comment about objectstore makes a lot of sense though.
Thanks @paleolimbot! Agreed on On the MinIO CI test - my thinking was that since |
|
Adding a small fix for S3 error propagation for things like token expiry. |
There's nothing in this PR to indicate that |
d28fb32 to
c0550a7
Compare
c0550a7 to
534a46c
Compare
I see your point @paleolimbot! Thanks for that. Just added MinIO tests |
This PR addresses #70 - enables direct S3 streaming for generated data, eliminating local storage requirements for large-scale dataset generation. S3 URIs in
--output-dirare automatically detected and routed to a streaming multipart upload implementation.--output-dir s3://bucket/prefix/for all output formats (Parquet, CSV, TBL) and zone tablesAWS_ACCESS_KEY_ID,AWS_SECRET_ACCESS_KEY,AWS_REGION,AWS_SESSION_TOKEN, etc.) viaAmazonS3Builder::from_env()