
feat: bulk data import/export commands #7

Closed
chrisaddams wants to merge 2 commits into main from feature/bulk-operations-framework

Conversation

@chrisaddams
Contributor

Summary

  • data import command: import CSV/JSON files into entities with validation, progress tracking, and retry logic
  • data export command: export entity data to CSV/JSON with streaming and pagination
  • Bulk processing framework with parallel execution and configurable concurrency
  • Data validation framework with type checking, pattern matching, and range validation

Based on #3 by @nivedhapalani96 with the following fixes:

  • Removed duplicate test files and test data from repo root
  • Fixed elapsed time tracking in BulkDataProcessor (was producing nonsensical values)
  • Fixed export pipeline (was a no-op — records are now streamed through the exporter)
  • Replaced full-file double stream reads with a file-size-based record-count estimate for CSV
  • Removed full JSON double-parse that defeated streaming
  • Added 1s regex timeout in validator to prevent ReDoS
  • Fixed JsonValue type unwrapping in validator
  • Resolved all compiler warnings
  • Converted tests to xUnit (13 tests, all passing alongside existing 192)
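
The ReDoS fix above relies on .NET's built-in match-timeout support. A minimal sketch of the idea follows — the `PatternGuard` class and `TryMatch` wrapper are illustrative names, not the PR's actual code:

```csharp
using System;
using System.Text.RegularExpressions;

// Illustrative sketch: bound regex matching with a 1-second timeout so a
// pathological pattern/input pair cannot hang the validator (ReDoS).
static class PatternGuard
{
    // Caps worst-case backtracking; .NET throws RegexMatchTimeoutException
    // when a single match attempt exceeds this budget.
    static readonly TimeSpan MatchTimeout = TimeSpan.FromSeconds(1);

    // Returns false if matching timed out; otherwise reports the match result.
    public static bool TryMatch(string pattern, string input, out bool isMatch)
    {
        try
        {
            var regex = new Regex(pattern, RegexOptions.None, MatchTimeout);
            isMatch = regex.IsMatch(input);
            return true;
        }
        catch (RegexMatchTimeoutException)
        {
            // Treat a timeout as a validation failure rather than hanging.
            isMatch = false;
            return false;
        }
    }
}
```

The `Regex(string, RegexOptions, TimeSpan)` constructor and `RegexMatchTimeoutException` are standard .NET APIs; how the real validator surfaces the failure is an assumption here.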

Test plan

  • dotnet build — 0 warnings, 0 errors
  • dotnet test — 205/205 passing (13 new bulk operations tests)
  • Manual test: data import test.csv entity_name --dry-run
  • Manual test: data export entity_name output.csv

nivedhapalani08 and others added 2 commits April 6, 2026 00:45
- Add comprehensive bulk operations namespace with core interfaces
- Implement streaming CSV and JSON data readers with robust parsing
- Add progress tracking infrastructure with rich console output
- Create bulk import command with validation and error handling
- Add bulk export functionality with multiple format support
- Implement resilient bulk processor with parallel execution
- Add data validation framework with configurable rules
- Create comprehensive test suite for all components
- Support for large datasets with memory-efficient streaming
- Rich CLI integration with progress reporting and error handling

This framework provides enterprise-grade bulk data operations with
streaming support, parallel processing, and comprehensive error handling.
Addresses critical user needs for importing/exporting large datasets
efficiently and reliably.

- Remove duplicate test files, test data, and standalone test project
- Fix BulkDataProcessor elapsed time calculation (was nonsensical)
- Fix export pipeline: stream through exporter directly instead of no-op processor
- Replace full-file stream reads with file-size estimate for CSV record count
- Remove double-parse in JsonDataReader that defeated streaming
- Add 1s regex timeout in BasicDataValidator to prevent ReDoS
- Fix JsonValue unwrapping in validator (int/long/bool were not extracted)
- Fix all compiler warnings (async without await, nullable reference)
- Convert tests to proper xUnit with FluentAssertions (13 tests, all passing)
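
The JsonValue unwrapping fix (int/long/bool not being extracted) can be sketched with `System.Text.Json.Nodes` as below. The `JsonUnwrap` helper and the exact ordering are illustrative assumptions, not the validator's actual code:

```csharp
using System.Text.Json.Nodes;

// Illustrative sketch: unwrap a JsonValue into a CLR primitive so the
// validator compares real ints/longs/bools instead of opaque JSON nodes.
static class JsonUnwrap
{
    public static object? ToClrValue(JsonNode? node)
    {
        // Objects and arrays pass through unchanged; only leaf values unwrap.
        if (node is not JsonValue value) return node;

        // Try the common primitives; JsonValue.TryGetValue performs the
        // type-specific extraction and fails cleanly on a mismatch.
        if (value.TryGetValue<bool>(out var b)) return b;
        if (value.TryGetValue<int>(out var i)) return i;
        if (value.TryGetValue<long>(out var l)) return l;
        if (value.TryGetValue<double>(out var d)) return d;
        if (value.TryGetValue<string>(out var s)) return s;
        return value.ToJsonString();
    }
}
```

`JsonValue.TryGetValue<T>` is the standard .NET API for this; trying `int` before `long` before `double` keeps small integers as `int` while still handling larger and fractional numbers.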
@chrisaddams
Contributor Author

Closing in favour of keeping the fixes on #3 (the original contributor's PR).
