-
-
Notifications
You must be signed in to change notification settings - Fork 11
Assembly Conflict Handling & Documentation + CSV stuff #38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Introduces CsvSchemaInference and ColumnTypeAnalyzer classes to analyze CSV data and infer optimal SQL Server column types, supporting both sample-based and full-file scans. Adds InferredColumn model, utilities for generating CREATE TABLE statements and column type mappings, and comprehensive tests covering various data scenarios, options, and edge cases. Updates package metadata and version to 1.1.10.
Expanded the README with detailed documentation and examples for defining strongly typed columns, using built-in and custom type converters, and combining with schema inference. Updated CsvSchemaInference methods to require List<InferredColumn> for improved type safety and consistency.
Clarifies usage of the -AvoidConflicts parameter in README, emphasizing the need to use a boolean value instead of a hashtable. Updates dbatools.library.psm1 to correctly detect and skip loading Microsoft.Data.SqlClient when already present. Expands and restructures test-avoidconflicts.ps1 for more robust, order-sensitive testing of conflict scenarios and correct parameter usage. Adds verify-fix.ps1 to quickly confirm the boolean fix for AvoidConflicts.
Introduces an AssemblyLoadContext.Resolving handler in dbatools.library.psm1 to resolve assembly version mismatches when importing with AvoidConflicts on PowerShell Core. Updates tests and documentation to validate resolver behavior, clarifies expected outcomes, and adds System.ClientModel dependency to csproj for compatibility with Azure.Core. Also bumps module version to 2025.12.25 and improves test coverage for conflict scenarios.
Bumped System.ClientModel dependency from version 1.1.0 to 1.8.0 in dbatools.csproj to ensure compatibility and access to newer features.
The PowerShell script block resolver was causing "no Runspace available" errors because AssemblyLoadContext.Resolving events fire on .NET threads that don't have PowerShell runspaces. This caused FileLoadException for assemblies like Microsoft.Extensions.Caching.Abstractions. The fix rewrites the Core resolver as a compiled C# class (CoreRedirector) using Add-Type, matching the pattern already used for the Desktop Redirector. This allows the resolver to run on any thread without requiring PowerShell. Additionally, the new resolver loads ANY assembly from the lib folder if the file exists, rather than checking against a hardcoded list. This ensures all transitive dependencies are properly resolved. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
The Desktop Redirector was using a hardcoded list of assembly names, which caused FileNotFoundException for assemblies like System.Memory.Data that were present in the lib folder but not in the list. This change rewrites the Desktop Redirector to match the CoreRedirector pattern: 1. Check if any version of the assembly is already loaded (return it) 2. If not loaded, try to load from lib folder if the file exists 3. No more hardcoded assembly list - handles all DLLs dynamically Also passes the lib path as a constructor parameter instead of using string interpolation in the C# code. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
Pull Request Review - Assembly Conflict Handling & CSV Schema InferenceThis is an excellent PR that addresses critical functionality for the dbatools.library. The implementation is well-thought-out and demonstrates strong engineering practices. Below is my comprehensive review: 🎯 Overall AssessmentRating: ✅ Strongly Approve with Minor Suggestions This PR successfully delivers two major features:
The code quality is high, test coverage is comprehensive, and documentation is thorough. 🔍 Detailed Findings1. Assembly Conflict Handling (dbatools.library.psm1)✅ Strengths:
💡 Suggestions:
|
Added 'shell: bash' to the 'Check test results' step in the test-avoidconflicts workflow to ensure consistent shell execution.
Pull Request Review: Assembly Conflict Handling & CSV Schema InferenceI've completed a comprehensive review of PR #38. Overall, this is a well-structured PR with strong test coverage and documentation. Here are my findings: ✅ Strengths1. Excellent Documentation
2. Robust Test Coverage
3. Strong Architecture
🔍 Code Quality IssuesCritical Issues1. Assembly Resolver Uses Reflection on Private Fields (dbatools.library.psm1:308-324)var optionsField = typeof(CsvDataReader).GetField("_options",
System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance);Issue: Accessing private fields via reflection is fragile and breaks encapsulation. If Recommendation:
2. Redirector Class Constructor Accepts libPath but Original Doesn't (dbatools.library.psm1:80-105)The Desktop PowerShell Recommendation: Verify the Moderate Issues3. Progress Calculation Logic May Report Incorrect Values (CsvSchemaInference.cs:330-365)progress = Math.Min(0.99, rowsRead * 100.0 / totalSize);Issue:
Recommendation: Add comments explaining this is a rough estimate, or improve the algorithm to track bytes read if possible. 4. Silent Catch Blocks in Redirector/CoreRedirector (dbatools.library.psm1:67-71, 86)catch
{
// Failed to load, return null to let default resolution continue
}Issue: Swallows all exceptions without logging. If assembly loading fails due to corruption or permissions, debugging will be difficult. Recommendation: Log to verbose stream or use Write-Verbose in try-catch wrapper. 5. Boolean Type Priority May Cause Issues (ColumnTypeAnalyzer.cs:354-406)The type inference priority is: GUID > Int > BigInt > Decimal > Boolean > DateTime > String Issue: A column with values // Note: Int/BigInt are checked before Boolean because "1" and "0" are valid for both,
// and integer types are more restrictiveThis is correct behavior, but the actual priority in 6. Decimal Precision Calculation Complexity (ColumnTypeAnalyzer.cs:200-266)The Recommendation:
🐛 Potential Bugs1. Count Significant Digits May Return 0 (ColumnTypeAnalyzer.cs:300)return count > 0 ? count : 1; // At least 1 digitThis ensures at least 1, but in Test case to add: 2. Leading Zeros Test Has Incorrect Comment (CsvSchemaInferenceTest.cs:528-529)// Leading zeros should preserve as integer since parsing ignores them
// but this tests that we handle them gracefullyIssue: The comment says "preserve as integer" but integers like Recommendation: Update test to verify that leading zeros result in ⚡ Performance ConsiderationsGood Practices✅ Early exit optimization in Potential Optimizations1. String Allocations in Decimal Analysis (ColumnTypeAnalyzer.cs:205-221)Multiple string operations ( Impact: Low-moderate. Only affects decimal columns. 2. Unicode Detection Scans Every Character (ColumnTypeAnalyzer.cs:125-132)Once unicode is detected, all subsequent rows still check Impact: Low. Early exit works well. 🔒 Security Considerations✅ Good Security Practices
|
When SqlServer module loads SqlClient 5.x, it uses an older System.ClientModel version (< 1.1.0) that's incompatible with our Azure.Core.dll which expects System.ClientModel 1.1+ APIs. This fix detects the version conflict and skips loading Azure.Core and Azure.Identity when AvoidConflicts is set and either: - System.ClientModel < 1.1.0 is already loaded, or - SqlClient < 6.0 is loaded (which bundles the old System.ClientModel) This prevents the MissingMethodException for IPersistableModel.Write() and allows dbatools.library to coexist with SqlServer module. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
Pull Request Review: Assembly Conflict Handling & CSV Schema InferenceSummaryThis is a substantial and well-executed PR that addresses critical assembly loading issues for PowerShell module interoperability and adds valuable CSV schema inference capabilities. The implementation shows excellent engineering with comprehensive testing and documentation. Code Quality: Excellent (5/5 stars)Assembly Conflict Handling (dbatools.library.psm1)Strengths:
Minor Issues:
CSV Schema Inference FeatureStrengths:
Potential Issues:
Security: No Concerns
Performance: ExcellentCSV Library shows 1.8x faster than LumenWorks for full row processing. Assembly loading uses efficient caching and lookup patterns. Test Coverage: Exceptional (5/5 stars)
Missing: Performance regression tests, large file handling tests, cancellation token tests Documentation: Very Good (4/5 stars)Strengths: Clear examples, explains rationale, good warnings about usage Breaking Changes: Minor
Specific Observationsdbatools.library.psm1:138 - String concatenation should use Path.Combine for cross-platform compatibility ColumnTypeAnalyzer.cs:200-266 - Excellent decimal precision handling with scientific notation and SQL Server limits RecommendationsHigh Priority:
Medium Priority: Low Priority: Final Verdict: APPROVEDThis PR demonstrates deep understanding of .NET assembly loading mechanics, thoughtful API design, excellent test coverage, and clear documentation. The assembly conflict resolution is exactly what this library needs for real-world PowerShell usage, and the CSV schema inference feature adds significant value for database import workflows. Great work! |
Escapes closing brackets in schema, table, and column names to prevent SQL injection in generated SQL statements. Simplifies CsvReaderOptions access in CsvSchemaInference by using the public property instead of reflection.
Renamed and updated the test for leading zeros to clarify that such values are parsed as integers, not strings, and added assertions for inferred types. Introduced a new test to verify that decimal values without an integer part are correctly inferred as decimals and that their scale is accurately determined.
Pull Request Review - PR #38I've completed a thorough review of this PR. Overall, this is excellent work with significant improvements to assembly conflict handling and a powerful new CSV schema inference feature. The code quality is high, with comprehensive testing and documentation. SummaryThis PR delivers three major improvements:
Code Quality: Excellent ✅Strengths1. Assembly Conflict Handling (dbatools.library.psm1) The dual resolver approach is well-designed:
2. CSV Schema Inference Feature Exceptionally well-implemented:
3. Documentation
4. Testing
Potential Issues & Recommendations🔴 CriticalNone identified - No blocking issues 🟡 Medium Priority1. Missing null checks in assembly resolvers Both // dbatools.library.psm1:49
if (assembly.GetName().Name == assemblyName)Issue: While wrapped in try-catch, some assemblies can throw from Recommendation: Consider logging which assembly caused the exception for debugging: catch (Exception ex)
{
// Log: "Assembly.GetName() failed for <assembly info>: {ex.Message}"
}2. Incompatible ClientModel detection The logic for detecting incompatible Observations:
Recommendation: Consider adding a warning message when Azure assemblies are skipped: Write-Warning "Azure.Core and Azure.Identity skipped due to incompatible System.ClientModel version. Azure authentication will not be available."3. Progress callback in CSV schema inference The progress estimation for streams is rough - CsvSchemaInference.cs:334: progress = Math.Min(0.99, rowsRead * 100.0 / totalSize);Issue: Assumes bytes ≈ rows, which is inaccurate for variable-length rows. Recommendation: Document this limitation in XML comments (it's already noted for TextReader but not for Stream). 🟢 Low Priority / Nitpicks1. Magic number in ColumnTypeAnalyzer // ColumnTypeAnalyzer.cs:300
return count > 0 ? count : 1; // At least 1 digitSuggestion: Add a named constant 2. Redundant string check // CsvSchemaInference.cs:422-423
if (string.IsNullOrEmpty(sqlType))
return typeof(string);This is defensive but likely never triggers since SqlDataType is always set by analyzers. 3. Test file cleanup // CsvSchemaInferenceTest.cs:30
try { Directory.Delete(_tempDir, true); } catch { }Silent catch could hide issues. Consider logging failures in test diagnostics. Performance Considerations: Good ✅1. Assembly Loading
2. CSV Schema Inference
3. Decimal Precision Calculation
Security: Excellent ✅1. SQL Injection Prevention Properly escapes identifiers: string escapedSchema = schemaName?.Replace("]", "]]") ?? "dbo";
string escapedTable = tableName.Replace("]", "]]");2. Input Validation Good parameter validation:
3. No Command Injection Risks No shell commands constructed from user input. Test Coverage: Excellent ✅CSV Schema Inference: 705 lines covering:
Assembly Conflict Resolution: Dedicated CI workflow with:
Excellent coverage for new features. Best Practices Compliance: Excellent ✅
Recommendations SummaryMust Address Before MergeNone - this PR is ready to merge as-is. Should Consider (Post-Merge or Future PR)
Nice to Have
Final Verdict: APPROVED ✅This is production-ready code with:
The assembly conflict resolution is particularly impressive, handling both PowerShell editions with appropriate platform-specific solutions. The CSV schema inference feature is well-designed and will be very useful for database import scenarios. Great work! 🎉 Testing VerificationThe CI workflow Recommendation: The current Windows-only testing is probably sufficient given this feature's primary use case. |
Bumped dbatools.library.psd1 version to 2025.12.26. Updated the Dataplat.Dbatools.Csv changelog with new features and fixes, including MoneyConverter, VectorConverter, improved decimal parsing, and SQL identifier escaping.
Pull Request Review: Assembly Conflict Handling & CSV Schema InferenceOverviewThis PR introduces two major enhancements:
Code Quality Assessment✅ StrengthsPowerShell Module (dbatools.library.psm1)
CSV Schema Inference (C#)
Testing
|
Adds enhanced assembly conflict handling for PowerShell Core, comprehensive documentation updates for correct module usage, and the addition of a powerful SQL Server schema inference feature to the CSV library.