Conversation


@sbryngelson commented Dec 18, 2025

PR Type

Enhancement, Bug fix, Tests, Documentation


Description

  • Added trained neural network models: Introduced two production-ready TBNN models (tbnn_channel_caseholdout and tbnn_phll_caseholdout) with complete weight matrices, biases, and normalization parameters for channel flow and periodic hills cases

  • Removed legacy example models: Cleaned up deprecated example_tbnn and example_scalar_nut models and their associated weight files from the repository

  • Enhanced configuration validation: Implemented explicit NN model specification requirements in config.cpp with improved error messages and automatic path mirroring, removing legacy fallback to default data/ directory

  • Fixed GEP parameter order bug: Corrected parameter order in set_reference() calls in turbulence_gep.cpp from (nu_, u_ref_, delta_) to (nu_, delta_, u_ref_)

  • Expanded test coverage: Added comprehensive test suites for feature computation (test_features.cpp), turbulence model validation (test_turbulence_features.cpp), and generalized backend execution tests supporting both CPU and GPU

  • Improved backend flexibility: Refactored GPU-specific tests to support graceful CPU fallback when GPU is unavailable, with clear backend status reporting

  • Updated all tests to use trained models: Migrated test suite from legacy example models to trained tbnn_channel_caseholdout weights across integration, backend, and perturbed channel tests

  • Enhanced documentation: Created comprehensive usage guides and metadata files for trained models, including architecture details, training configuration, feature descriptions, and usage examples

  • Improved dataset handling: Enhanced McConkey dataset download robustness with better Kaggle CLI invocation and fallback unzip mechanisms; added local training script for TBNN/MLP models


Diagram Walkthrough

flowchart LR
  A["Legacy Models<br/>example_tbnn<br/>example_scalar_nut"] -->|"Removed"| B["Trained Models<br/>tbnn_channel_caseholdout<br/>tbnn_phll_caseholdout"]
  B -->|"Used in"| C["Updated Tests<br/>test_features<br/>test_turbulence_features<br/>test_backend_execution"]
  D["Config Validation<br/>Explicit Model Spec"] -->|"Enforces"| B
  E["GEP Bug Fix<br/>Parameter Order"] -->|"Corrects"| F["Feature Computation"]
  C -->|"Validates"| F
  G["Enhanced Documentation<br/>USAGE.md<br/>metadata.json"] -->|"Describes"| B

File Walkthrough

Relevant files
Tests
7 files
test_features.cpp
Expand feature tests with analytic velocity field verification

tests/test_features.cpp

  • Replaced generic velocity gradient tests with analytic verification
    tests for pure shear, pure strain, and solid body rotation flows
  • Added comprehensive feature computation tests including scalar
    features, TBNN features, and tensor basis operations
  • Implemented anisotropy tensor construction and Reynolds stress
    conversion tests
  • Added GPU/CPU gradient computation backend tests with conditional
    compilation
+413/-49
test_turbulence_features.cpp
Add comprehensive turbulence model feature tests                 

tests/test_turbulence_features.cpp

  • New test file exercising turbulence model computation paths with
    nontrivial velocity gradients
  • Tests EARSM Re_t-based blending, baseline model response, GEP model,
    and multiple EARSM variants (Wallin-Johansson, Gatski-Speziale, Pope)
  • Includes batch feature computation tests, realizability constraint
    verification, and solver backend execution
  • Supports both CPU and GPU backends with conditional compilation
+539/-0 
test_cpu_gpu_consistency.cpp
Improve GPU/CPU consistency tests with fallback paths       

tests/test_cpu_gpu_consistency.cpp

  • Refactored to gracefully handle GPU unavailability in GPU-enabled
    builds
  • Changed from skip-on-no-GPU to CPU-fallback behavior for consistency
    tests
  • Updated test titles and output messages to reflect CPU/GPU backend
    status
  • Improved model loading to use trained tbnn_channel_caseholdout instead
    of legacy example models
+132/-85
test_backend_execution.cpp
Generalize backend execution tests for CPU and GPU             

tests/test_backend_execution.cpp

  • Renamed from GPU-specific to backend-agnostic execution tests
  • Added CPU path support for all tests with conditional compilation
  • Updated model loading to use trained TBNN weights
    (tbnn_channel_caseholdout)
  • Improved test output to clearly indicate which backend (CPU/GPU) is
    being tested
+128/-131
test_nn_integration.cpp
Update NN integration tests to use trained models               

tests/test_nn_integration.cpp

  • Updated model paths from legacy example_scalar_nut and example_tbnn to
    trained tbnn_channel_caseholdout
  • Simplified model loading to use consistent trained weights across all
    NN integration tests
+10/-9   
test_perturbed_channel.cpp
Update perturbed channel test with trained model paths     

tests/test_perturbed_channel.cpp

  • Updated turbulence model creation to use trained
    tbnn_channel_caseholdout model
  • Simplified model path logic to select appropriate trained weights
    based on model type
+5/-1     
CMakeLists.txt
Test infrastructure updates and new turbulence features test

CMakeLists.txt

  • Renamed test_gpu_execution to test_backend_execution for
    backend-agnostic testing
  • Renamed test case GPUExecutionTest to BackendExecutionTest
  • Renamed test case CPUGPUConsistencyTest to ConsistencyTest
  • Added new test executable test_turbulence_features for analytic
    validation of features, invariants, and model response
+9/-4     
Configuration changes
2 files
config.cpp
Add NN model configuration validation and error handling 

src/config.cpp

  • Added validation requiring explicit NN model specification (preset or
    weights/scaling paths)
  • Improved error messages listing available presets and configuration
    options
  • Implemented automatic path mirroring when only weights or scaling is
    provided
  • Removed legacy fallback to default data/ directory
+31/-9   
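The validation and path-mirroring flow described above can be sketched as follows. This is a hypothetical Python analogue of the C++ logic, not the actual `Config::finalize()` code; the option names (`preset`, `weights_path`, `scaling_path`) and the preset set are assumptions based on the PR description:

```python
# Illustrative sketch of explicit NN model validation with path mirroring.
# Names and preset list are assumptions, not the exact src/config.cpp code.
PRESETS = {"tbnn_channel_caseholdout", "tbnn_phll_caseholdout"}

def finalize_nn_config(preset="", weights_path="", scaling_path=""):
    if preset:
        if preset not in PRESETS:
            raise ValueError(
                f"Unknown NN preset '{preset}'. Available: {sorted(PRESETS)}")
        base = f"data/models/{preset}/"
        return base, base
    # Path mirroring: if only one of weights/scaling is given, reuse it.
    if weights_path and not scaling_path:
        scaling_path = weights_path
    elif scaling_path and not weights_path:
        weights_path = scaling_path
    if not (weights_path and scaling_path):
        raise ValueError(
            "NN model must be specified explicitly: pass a preset "
            f"({sorted(PRESETS)}) or weights/scaling paths. "
            "No legacy fallback to data/ exists.")
    return weights_path, scaling_path
```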
config.hpp
Update config header with explicit NN model requirements 

include/config.hpp

  • Changed NN model path defaults from data/ to empty strings (require
    explicit specification)
  • Updated documentation to indicate preset model names (e.g.,
    tbnn_channel_caseholdout)
  • Clarified that no legacy fallback paths are available
+4/-4     
Bug fix
1 file
turbulence_gep.cpp
Fix parameter order in GEP feature computer initialization

src/turbulence_gep.cpp

  • Fixed parameter order in set_reference() calls from (nu_, u_ref_,
    delta_) to (nu_, delta_, u_ref_)
  • Applied fix in both ensure_initialized() and update() methods
+2/-2     
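The bug fixed here is a positional-argument swap: both orderings compile and run, which is why it went unnoticed. A minimal Python illustration of why the order matters (the `set_reference(nu, delta, u_ref)` signature is taken from the fix; the derived shear scale is an assumed example, not the actual GEP code):

```python
class FeatureComputer:
    """Minimal stand-in for the GEP feature computer's reference state."""
    def set_reference(self, nu, delta, u_ref):
        # Positional order matters: swapping delta and u_ref silently
        # corrupts any derived, non-symmetric scale.
        self.nu, self.delta, self.u_ref = nu, delta, u_ref

    def shear_scale(self):
        # Example reference shear rate used to nondimensionalize gradients
        return self.u_ref / self.delta

fc = FeatureComputer()
nu, delta, u_ref = 1e-5, 1.0, 2.0
fc.set_reference(nu, u_ref, delta)   # buggy order: (nu_, u_ref_, delta_)
buggy = fc.shear_scale()
fc.set_reference(nu, delta, u_ref)   # fixed order: (nu_, delta_, u_ref_)
fixed = fc.shear_scale()
```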
Enhancement
17 files
train_realdata_local.sh
Add local training script for real McConkey dataset           

scripts/train_realdata_local.sh

  • New comprehensive training script for local TBNN/MLP training on
    McConkey dataset
  • Handles venv creation in scratch, dataset download via Kaggle API, and
    model export
  • Supports configurable training parameters (epochs, batch size,
    learning rate, device)
  • Includes dataset validation and helpful error messages for missing
    credentials
+259/-0 
download_mcconkey_data.sh
Improve McConkey dataset download robustness                         

scripts/download_mcconkey_data.sh

  • Improved Kaggle CLI invocation to handle versions without __main__
    support
  • Added explicit zip file handling with fallback to pure-Python unzip
  • Removed redundant kaggle CLI availability check
  • Enhanced error messages for missing credentials
+18/-19 
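The "fallback to pure-Python unzip" mentioned above can be done entirely with the standard library; a hedged sketch of the idea (the actual script is bash, and this helper name is illustrative):

```python
import subprocess
import zipfile

def unzip_with_fallback(zip_path, out_dir):
    """Try the system `unzip` binary first; fall back to Python's zipfile
    module on hosts where `unzip` is missing or fails."""
    try:
        subprocess.run(
            ["unzip", "-q", str(zip_path), "-d", str(out_dir)],
            check=True)
    except (FileNotFoundError, subprocess.CalledProcessError):
        # Pure-Python fallback; creates out_dir as needed.
        with zipfile.ZipFile(zip_path) as z:
            z.extractall(out_dir)
```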
layer0_W.txt
TBNN channel flow model input layer weights                           

data/models/tbnn_channel_caseholdout/layer0_W.txt

  • Added new neural network weight matrix file for TBNN channel flow
    model
  • Contains 64 rows of 5-dimensional weight vectors (input layer: 5 → 64
    neurons)
  • Weights stored in scientific notation format for numerical precision
+64/-0   
layer3_W.txt
TBNN periodic hills model output layer weights                     

data/models/tbnn_phll_caseholdout/layer3_W.txt

  • Added output layer weight matrix for TBNN periodic hills model
  • Contains 4 rows of 64-dimensional vectors (output layer: 64 → 4
    coefficients)
  • Represents final transformation to tensor basis coefficients
+4/-0     
layer3_W.txt
TBNN channel flow model output layer weights                         

data/models/tbnn_channel_caseholdout/layer3_W.txt

  • Added output layer weight matrix for TBNN channel flow model
  • Contains 4 rows of 64-dimensional vectors (output layer: 64 → 4
    coefficients)
  • Represents final transformation to tensor basis coefficients
+4/-0     
layer0_b.txt
TBNN periodic hills model input layer biases                         

data/models/tbnn_phll_caseholdout/layer0_b.txt

  • Added bias vector for input layer of periodic hills TBNN model
  • Contains 64 bias values (one per neuron in first hidden layer)
  • Stored in scientific notation for numerical precision
+64/-0   
layer1_b.txt
TBNN periodic hills model first hidden layer biases           

data/models/tbnn_phll_caseholdout/layer1_b.txt

  • Added bias vector for first hidden layer of periodic hills TBNN model
  • Contains 64 bias values for hidden layer neurons
  • Stored in scientific notation format
+64/-0   
layer2_b.txt
TBNN periodic hills model second hidden layer biases         

data/models/tbnn_phll_caseholdout/layer2_b.txt

  • Added bias vector for second hidden layer of periodic hills TBNN model
  • Contains 64 bias values for hidden layer neurons
  • Stored in scientific notation format
+64/-0   
layer0_b.txt
TBNN channel flow model input layer biases                             

data/models/tbnn_channel_caseholdout/layer0_b.txt

  • Added bias vector for input layer of channel flow TBNN model
  • Contains 64 bias values (one per neuron in first hidden layer)
  • Stored in scientific notation for numerical precision
+64/-0   
layer1_b.txt
TBNN channel flow model first hidden layer biases               

data/models/tbnn_channel_caseholdout/layer1_b.txt

  • Added bias vector for first hidden layer of channel flow TBNN model
  • Contains 64 bias values for hidden layer neurons
  • Stored in scientific notation format
+64/-0   
layer2_b.txt
TBNN channel flow model second hidden layer biases             

data/models/tbnn_channel_caseholdout/layer2_b.txt

  • Added bias vector for second hidden layer of channel flow TBNN model
  • Contains 64 bias values for hidden layer neurons
  • Stored in scientific notation format
+64/-0   
input_means.txt
Input normalization means for periodic hills TBNN model   

data/models/tbnn_phll_caseholdout/input_means.txt

  • Added input feature normalization means for periodic hills TBNN model
  • Contains 5 mean values for z-score normalization of input features
  • Stored in scientific notation for numerical precision
+5/-0     
input_stds.txt
Input normalization standard deviations for periodic hills TBNN model

data/models/tbnn_phll_caseholdout/input_stds.txt

  • Added input feature normalization standard deviations for periodic
    hills TBNN model
  • Contains 5 standard deviation values for z-score normalization
  • Stored in scientific notation format
+5/-0     
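The z-score normalization these files support is simply (x - mean) / std applied per feature; a minimal sketch with made-up values (the real five means and standard deviations are read from the `input_means.txt` / `input_stds.txt` files above):

```python
def zscore(features, means, stds):
    """Normalize each input feature: (x - mean) / std."""
    return [(x - m) / s for x, m, s in zip(features, means, stds)]

# Made-up example values; the trained models ship the real ones
# as 5 values each in input_means.txt and input_stds.txt.
means = [0.1, -0.2, 0.0, 1.5, 0.3]
stds  = [1.0,  0.5, 2.0, 3.0, 0.1]
x     = [0.1,  0.3, 4.0, 1.5, 0.4]
normalized = zscore(x, means, stds)
```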
input_means.txt
Input normalization means for channel flow TBNN model       

data/models/tbnn_channel_caseholdout/input_means.txt

  • Added input feature normalization means for channel flow TBNN model
  • Contains 5 mean values for z-score normalization of input features
  • Stored in scientific notation for numerical precision
+5/-0     
input_stds.txt
Input normalization standard deviations for channel flow TBNN model

data/models/tbnn_channel_caseholdout/input_stds.txt

  • Added input feature normalization standard deviations for channel flow
    TBNN model
  • Contains 5 standard deviation values for z-score normalization
  • Stored in scientific notation format
+5/-0     
layer3_b.txt
TBNN periodic hills model output layer biases                       

data/models/tbnn_phll_caseholdout/layer3_b.txt

  • Added bias vector for output layer of periodic hills TBNN model
  • Contains 4 bias values (one per output coefficient)
  • Stored in scientific notation format
+4/-0     
layer3_b.txt
TBNN channel flow model output layer biases                           

data/models/tbnn_channel_caseholdout/layer3_b.txt

  • Added bias vector for output layer of channel flow TBNN model
  • Contains 4 bias values (one per output coefficient)
  • Stored in scientific notation format
+4/-0     
Dependencies
1 file
layer0_W.txt
Add trained TBNN weights for periodic hills case                 

data/models/tbnn_phll_caseholdout/layer0_W.txt

  • New trained weight matrix for periodic hills TBNN model (64 hidden
    units, 5 input features)
  • Contains 64×5 matrix of floating-point weights
+64/-0   
Documentation
6 files
USAGE.md
Add usage guide for trained channel TBNN model                     

data/models/tbnn_channel_caseholdout/USAGE.md

  • Comprehensive documentation for trained channel flow TBNN model
  • Includes architecture details, training configuration, feature
    descriptions, and usage examples
  • Provides comparison with periodic hills variant and references to
    original papers
  • Documents file structure and training metadata
+134/-0 
README.md
Comprehensive documentation for trained neural network models

data/models/README.md

  • Completely restructured documentation to focus on trained models
    instead of legacy examples
  • Added detailed sections for two trained TBNN models:
    tbnn_channel_caseholdout and tbnn_phll_caseholdout
  • Included training details, architecture specifications, validation
    losses, and usage examples
  • Added model selection guide table comparing performance metrics
  • Removed legacy weight export instructions and replaced with
    model-specific usage guidance
+83/-12 
USAGE.md
Detailed usage guide for periodic hills TBNN model             

data/models/tbnn_phll_caseholdout/USAGE.md

  • Created comprehensive usage guide for periodic hills TBNN model
  • Documented case-holdout training split (train: case_0p5/0p8/1p5, val:
    case_1p0, test: case_1p2)
  • Detailed feature definitions (5 Ling et al. 2016 invariants) and
    output tensor basis
  • Provided command-line usage examples and configuration file templates
  • Included training hyperparameters, performance metrics, and file
    structure reference
+119/-0 
README.md
Simplified data directory documentation with model focus 

data/README.md

  • Simplified root data directory documentation to redirect to
    data/models/README.md
  • Removed legacy weight export instructions for PyTorch and TensorFlow
  • Removed detailed feature ordering documentation
  • Added quick-start examples for using trained models with --nn_preset
    flag
  • Clarified that NN models now require explicit selection and no
    defaults exist
+21/-93 
metadata.json
Metadata documentation for periodic hills TBNN model         

data/models/tbnn_phll_caseholdout/metadata.json

  • Added metadata file documenting TBNN periodic hills model architecture
    and training
  • Specifies 5-64-64-64-4 layer configuration with tanh/linear
    activations
  • Documents 5 Ling et al. 2016 invariant features with z-score
    normalization
  • Includes training framework (PyTorch), optimizer (Adam), and dataset
    reference
+53/-0   
metadata.json
Metadata documentation for channel flow TBNN model             

data/models/tbnn_channel_caseholdout/metadata.json

  • Added metadata file documenting TBNN channel flow model architecture
    and training
  • Specifies 5-64-64-64-4 layer configuration with tanh/linear
    activations
  • Documents 5 Ling et al. 2016 invariant features with z-score
    normalization
  • Includes training framework (PyTorch), optimizer (Adam), and dataset
    reference
+53/-0   
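The 5-64-64-64-4 architecture documented in both metadata files corresponds to three tanh hidden layers and a linear output mapping five invariant features to four tensor-basis coefficients. A pure-Python sketch of the forward pass (weight shapes match the exported `layer{0..3}_{W,b}.txt` files; random values stand in for the trained weights):

```python
import math
import random

random.seed(0)

def linear(W, b, x):
    """y = W x + b, with W stored row-major as in the layer*_W.txt files."""
    return [sum(wij * xj for wij, xj in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def tbnn_forward(layers, x):
    """Three tanh hidden layers, linear output: 5 -> 64 -> 64 -> 64 -> 4."""
    for i, (W, b) in enumerate(layers):
        x = linear(W, b, x)
        if i < len(layers) - 1:          # tanh on hidden layers only
            x = [math.tanh(v) for v in x]
    return x

def rand_layer(n_out, n_in):
    # Stand-in initialization; trained values come from the text files.
    return ([[random.uniform(-0.1, 0.1) for _ in range(n_in)]
             for _ in range(n_out)],
            [0.0] * n_out)

layers = [rand_layer(64, 5), rand_layer(64, 64),
          rand_layer(64, 64), rand_layer(4, 64)]
coeffs = tbnn_forward(layers, [0.1, -0.3, 0.05, 0.2, -0.1])
```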
Additional files
34 files
input_means.txt +0/-5     
input_stds.txt +0/-5     
layer0_W.txt +0/-64   
layer0_b.txt +0/-64   
layer1_W.txt +0/-64   
layer1_b.txt +0/-64   
layer2_W.txt +0/-64   
layer2_b.txt +0/-64   
layer3_W.txt +0/-4     
layer3_b.txt +0/-4     
input_means.txt +0/-6     
input_stds.txt +0/-6     
layer0_W.txt +0/-32   
layer0_b.txt +0/-32   
layer1_W.txt +0/-32   
layer1_b.txt +0/-32   
layer2_W.txt +0/-1     
layer2_b.txt +0/-1     
metadata.json +0/-52   
input_means.txt +0/-5     
input_stds.txt +0/-5     
layer0_W.txt +0/-64   
layer0_b.txt +0/-64   
layer1_W.txt +0/-64   
layer1_b.txt +0/-64   
layer2_W.txt +0/-64   
layer2_b.txt +0/-64   
layer3_W.txt +0/-4     
layer3_b.txt +0/-4     
metadata.json +0/-66   
layer1_W.txt +64/-0   
layer2_W.txt +64/-0   
layer1_W.txt +64/-0   
layer2_W.txt +64/-0   

- Add trained TBNN models for channel and periodic hills flows
  - data/models/tbnn_channel_caseholdout/ (trained on McConkey channel data)
  - data/models/tbnn_phll_caseholdout/ (trained on McConkey periodic hills data)
  - Both models include weights, biases, normalization, and metadata

- Remove all example/dummy weights
  - Deleted legacy root-level layer files from data/
  - Removed example_scalar_nut/ and example_tbnn/ placeholder models
  - No more random/untrained weights in codebase

- Update model loading to require explicit selection
  - Modified Config to require --nn_preset or --weights/--scaling
  - Updated default paths from 'data/' to empty strings
  - Added validation in Config::finalize() for NN models

- Update all GPU CI tests to use trained models
  - test_nn_integration.cpp: Use tbnn_channel_caseholdout
  - test_gpu_execution.cpp: Use tbnn_channel_caseholdout
  - test_cpu_gpu_consistency.cpp: Use tbnn_channel_caseholdout
  - test_perturbed_channel.cpp: Dynamically select trained models

- Update documentation
  - Rewrote data/README.md to direct users to data/models/
  - Updated data/models/README.md with current model structure
  - Added USAGE.md files for each trained model

- Update .gitignore to track trained models
  - Keep *_caseholdout/ directories in version control
  - Ignore transient training outputs (job-ID suffixes)

- Add training infrastructure
  - scripts/train_realdata_local.sh for local training
  - Updated download_mcconkey_data.sh to use python -m kaggle

All tests pass on GPU hardware (V100) with trained weights.
@qodo-code-review
Copy link

qodo-code-review bot commented Dec 18, 2025

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
Zip Slip extraction

Description: The script downloads an external zip and extracts it with unzip -q into ${OUTPUT_DIR}
without validating archive entry paths, which can enable a Zip Slip path traversal (e.g.,
../ or absolute paths inside the zip) if the downloaded dataset zip is malicious or
tampered with.
download_mcconkey_data.sh [40-57]

Referred Code
# Download dataset using Kaggle CLI
# Note: some Kaggle versions do NOT support `python -m kaggle` (no __main__).
export KAGGLE_CONFIG_DIR="${HOME}/.kaggle"

ZIP_DIR="${OUTPUT_DIR}/_kaggle_zip"
mkdir -p "${ZIP_DIR}"

kaggle datasets download -d "$DATASET_NAME" -p "$ZIP_DIR"

ZIP_FILE="$(ls -1t "${ZIP_DIR}"/*.zip 2>/dev/null | head -n 1 || true)"
if [[ -z "${ZIP_FILE}" ]]; then
    echo "ERROR: Kaggle download completed but no .zip was found in ${ZIP_DIR}" >&2
    exit 1
fi

echo "Unzipping ${ZIP_FILE} -> ${OUTPUT_DIR}"
unzip -q "${ZIP_FILE}" -d "${OUTPUT_DIR}"
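A minimal illustration of the entry-path validation that mitigates this class of issue (this is only a sketch of the core idea, not the PR's actual `safe_extract.py`, and it does not cover every edge case such as symlink entries):

```python
import pathlib
import zipfile

def check_zip_entries(zip_path, out_dir):
    """Reject archive entries that would escape out_dir (Zip Slip)."""
    out_dir = pathlib.Path(out_dir).resolve()
    with zipfile.ZipFile(zip_path) as z:
        for name in z.namelist():
            # Absolute paths and parent traversal are never acceptable.
            if name.startswith(("/", "\\")):
                raise RuntimeError(f"Absolute path in zip entry: {name!r}")
            if ".." in pathlib.PurePosixPath(name).parts:
                raise RuntimeError(f"Parent traversal in zip entry: {name!r}")
            # Belt-and-braces: resolved destination must stay inside out_dir.
            dest = (out_dir / name).resolve()
            if not dest.is_relative_to(out_dir):
                raise RuntimeError(f"Entry escapes output dir: {name!r}")
```

Only after every entry passes such a check should the archive be handed to `extractall`.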
Sensitive path disclosure

Description: The documentation includes a contributor-specific absolute filesystem path (e.g.,
/storage/home/...) that may reveal internal environment/user information when the
repository is shared publicly.
USAGE.md [125-132]

Referred Code
## Training Location

This model was trained in:
- Workspace: `/storage/home/hcoda1/6/sbryngelson3/scratch/nn-cfd`
- Training script: `scripts/run_channel_train.sbatch`
- Job ID: 2998847
- Dataset generator: `generate_channel_npz.py`
Ticket Compliance
🎫 No ticket provided
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed


Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed


Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed


Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status: Passed


🔴
Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Missing CLI validation: The script invokes kaggle and unzip without validating the commands exist or checking
their exit status, which can cause confusing failures instead of actionable error
handling.

Referred Code
# Download dataset using Kaggle CLI
# Note: some Kaggle versions do NOT support `python -m kaggle` (no __main__).
export KAGGLE_CONFIG_DIR="${HOME}/.kaggle"

ZIP_DIR="${OUTPUT_DIR}/_kaggle_zip"
mkdir -p "${ZIP_DIR}"

kaggle datasets download -d "$DATASET_NAME" -p "$ZIP_DIR"

ZIP_FILE="$(ls -1t "${ZIP_DIR}"/*.zip 2>/dev/null | head -n 1 || true)"
if [[ -z "${ZIP_FILE}" ]]; then
    echo "ERROR: Kaggle download completed but no .zip was found in ${ZIP_DIR}" >&2
    exit 1
fi

echo "Unzipping ${ZIP_FILE} -> ${OUTPUT_DIR}"
unzip -q "${ZIP_FILE}" -d "${OUTPUT_DIR}"


Compliance status legend:
🟢 - Fully Compliant
🟡 - Partially Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label


qodo-code-review bot commented Dec 18, 2025

PR Code Suggestions ✨

Latest suggestions up to 282ad8d

Category | Suggestion | Impact
Security
Harden path traversal validation

Harden the path validation in safe_extract.py to prevent bypasses on Windows.
Normalize path separators, explicitly check for Windows absolute paths, and
reject paths containing NUL bytes.

scripts/safe_extract.py [53-73]

 for info in z.infolist():
     name = info.filename
 
-    # Check for absolute paths
-    if name.startswith(("/", "\\")):
+    # Reject NUL bytes (can confuse path handling)
+    if "\x00" in name:
+        unsafe_entries.append(f"NUL byte in path: {name!r}")
+        continue
+
+    # Normalize separators so traversal checks can't be bypassed with backslashes
+    norm_name = name.replace("\\", "/")
+
+    # Reject absolute paths and Windows drive/UNC paths
+    if norm_name.startswith("/"):
         unsafe_entries.append(f"Absolute path: {name}")
+        continue
+    if pathlib.PureWindowsPath(norm_name).is_absolute() or pathlib.PureWindowsPath(norm_name).drive:
+        unsafe_entries.append(f"Absolute/drive path: {name}")
         continue
 
     # Check for parent directory traversal
-    parts = pathlib.PurePosixPath(name).parts
+    parts = pathlib.PurePosixPath(norm_name).parts
     if ".." in parts:
         unsafe_entries.append(f"Parent traversal: {name}")
         continue
 
     # Verify resolved path is within output directory
-    dest = (out_dir / name).resolve()
+    dest = (out_dir / norm_name).resolve()
     try:
         dest.relative_to(out_dir)
     except ValueError:
         unsafe_entries.append(f"Escapes output dir: {name}")
         continue

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 10


Why: This suggestion identifies a critical security flaw in the new safe_extract function, where the path validation can be bypassed on Windows, defeating the purpose of the Zip Slip protection.

High
Block symlink-based zip slip

Prevent a symlink-based Zip Slip vulnerability by adding a check to reject ZIP
archives that contain symbolic links before calling z.extractall(out_dir).

scripts/safe_extract.py [80-85]

 # All entries validated - safe to extract
 if verbose:
     print(f"Validated {len(z.infolist())} entries - all safe")
     print("Extracting...")
 
+# Block symlink entries to prevent "symlink zip slip" (write-through symlink)
+for info in z.infolist():
+    is_symlink = (info.external_attr >> 16) & 0o170000 == 0o120000
+    if is_symlink:
+        raise RuntimeError(f"Unsafe ZIP archive detected:\n  - Symlink entry: {info.filename}")
+
 z.extractall(out_dir)


Suggestion importance[1-10]: 10


Why: This suggestion points out a critical security vulnerability (symlink-based Zip Slip) that is not handled by the current implementation, allowing an attacker to write files outside the intended extraction directory.

High
Possible issue
Make tests fail correctly

Refactor the test functions in test_zip_slip_protection to use assert statements
instead of return False. This ensures that test failures are correctly reported
by pytest.

tests/test_security_fixes.py [63-143]

 def test_zip_slip_protection():
     """Test that Zip Slip protection works."""
     print("\n" + "="*70)
     print("TEST: Zip Slip Protection")
     print("="*70)
 
     with tempfile.TemporaryDirectory() as tmpdir:
         tmpdir = pathlib.Path(tmpdir)
 
         # Test 1: Safe ZIP (should succeed)
         safe_zip = tmpdir / "safe.zip"
         with zipfile.ZipFile(safe_zip, 'w') as z:
             z.writestr("data/file1.txt", "content1")
             z.writestr("data/subdir/file2.txt", "content2")
 
         out_dir = tmpdir / "output_safe"
-        try:
-            safe_extract(safe_zip, out_dir, verbose=False)
-            assert (out_dir / "data" / "file1.txt").exists()
-            assert (out_dir / "data" / "subdir" / "file2.txt").exists()
-            print("✓ Safe ZIP extracted successfully")
-        except Exception as e:
-            print(f"✗ Safe ZIP failed: {e}")
-            return False
-        ...
+        safe_extract(safe_zip, out_dir, verbose=False)
+        assert (out_dir / "data" / "file1.txt").exists()
+        assert (out_dir / "data" / "subdir" / "file2.txt").exists()
+
+        # Test 2: ZIP with parent traversal (should fail)
+        traversal_zip = tmpdir / "traversal.zip"
+        with zipfile.ZipFile(traversal_zip, 'w') as z:
+            z.writestr("data/file1.txt", "safe")
+            z.writestr("../evil.txt", "malicious")
+
+        out_dir = tmpdir / "output_traversal"
         try:
             safe_extract(traversal_zip, out_dir, verbose=False)
-            print("✗ Parent traversal ZIP was not blocked!")
-            return False
+            assert False, "Parent traversal ZIP was not blocked"
         except RuntimeError as e:
-            if "Parent traversal" in str(e):
-                print("✓ Parent traversal blocked")
-            else:
-                print(f"✗ Wrong error: {e}")
-                return False
-        ...
-    print("\n✓ All Zip Slip protection tests passed!")
-    return True
+            assert "Parent traversal" in str(e)
 
+        # Test 3: ZIP with absolute path (should fail)
+        absolute_zip = tmpdir / "absolute.zip"
+        with zipfile.ZipFile(absolute_zip, 'w') as z:
+            z.writestr("data/file1.txt", "safe")
+            z.writestr("/tmp/evil.txt", "malicious")
+
+        out_dir = tmpdir / "output_absolute"
+        try:
+            safe_extract(absolute_zip, out_dir, verbose=False)
+            assert False, "Absolute path ZIP was not blocked"
+        except RuntimeError as e:
+            assert "Absolute path" in str(e)
+
+        # Test 4: ZIP with complex traversal (should fail)
+        complex_zip = tmpdir / "complex.zip"
+        with zipfile.ZipFile(complex_zip, 'w') as z:
+            z.writestr("data/file1.txt", "safe")
+            z.writestr("data/../../evil.txt", "malicious")
+
+        out_dir = tmpdir / "output_complex"
+        try:
+            safe_extract(complex_zip, out_dir, verbose=False)
+            assert False, "Complex traversal ZIP was not blocked"
+        except RuntimeError as e:
+            assert ("Parent traversal" in str(e)) or ("Escapes" in str(e))
+


Suggestion importance[1-10]: 8


Why: This suggestion correctly identifies that the test functions use return False instead of assertions, which would cause pytest to incorrectly report failing tests as passing, making the test suite unreliable.

Medium
Unify strain magnitude definition
Suggestion Impact: Updated S_mag in the GPU postprocessing fallback path from sqrt(2.0*(...)) to sqrt(...), matching the Frobenius norm definition; also updated comments to reflect this.

code diff:

-                    // Fallback: use strain magnitude
+                    // Fallback: use strain magnitude (Frobenius norm)
                     const double dudx_v = dudx[idx_cell];
                     const double dvdy_v = dvdy[idx_cell];
                     const double Sxx = dudx_v;
                     const double Syy = dvdy_v;
-                    const double S_mag = sqrt(2.0 * (Sxx*Sxx + Syy*Syy + 2.0*Sxy*Sxy));
+                    const double S_mag = sqrt(Sxx*Sxx + Syy*Syy + 2.0*Sxy*Sxy);
                     if (S_mag > 1e-10) {
                         const double b_mag = sqrt(b_xx*b_xx + 2.0*b_xy*b_xy + b_yy*b_yy);
                         nu_t_val = k_val * b_mag / S_mag;
@@ -508,12 +508,12 @@
                 if (fabs(Sxy) > 1e-10) {
                     nu_t_val = fabs(-b_xy * k_val / Sxy);
                 } else {
-                    // Fallback: use strain magnitude
+                    // Fallback: use strain magnitude (Frobenius norm)
                     const double dudx_v = dudx[idx_cell];
                     const double dvdy_v = dvdy[idx_cell];
                     const double Sxx = dudx_v;
                     const double Syy = dvdy_v;
-                    const double S_mag = sqrt(2.0 * (Sxx*Sxx + Syy*Syy + 2.0*Sxy*Sxy));
+                    const double S_mag = sqrt(Sxx*Sxx + Syy*Syy + 2.0*Sxy*Sxy);
                     if (S_mag > 1e-10) {
                         const double b_mag = sqrt(b_xx*b_xx + 2.0*b_xy*b_xy + b_yy*b_yy);
                         nu_t_val = k_val * b_mag / S_mag;

Unify the strain magnitude (S_mag) calculation in postprocess_nn_outputs_gpu to
use the Frobenius norm, matching the definition in other GPU kernels for
consistency.

src/gpu_kernels.cpp [516-520]

-const double S_mag = sqrt(2.0 * (Sxx*Sxx + Syy*Syy + 2.0*Sxy*Sxy));
+const double S_mag = sqrt(Sxx*Sxx + Syy*Syy + 2.0*Sxy*Sxy);
 if (S_mag > 1e-10) {
     const double b_mag = sqrt(b_xx*b_xx + 2.0*b_xy*b_xy + b_yy*b_yy);
     nu_t_val = k_val * b_mag / S_mag;
 }


Suggestion importance[1-10]: 7


Why: This is a valid correctness fix, as it points out an inconsistent formula for S_mag between feature generation and post-processing on the GPU, which could lead to incorrect physical results.

Medium
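The factor-of-√2 discrepancy is easy to check numerically. A minimal sketch (strain components are made up) confirming that the corrected expression is the Frobenius norm of the symmetric 2D strain tensor, and that the old formula overshoots by exactly √2:

```python
import math

# Hypothetical strain components for a symmetric 2D tensor S
Sxx, Syy, Sxy = 0.3, -0.1, 0.25

# Corrected formula: Frobenius norm; the off-diagonal Sxy
# appears twice in the matrix, hence the factor of 2.
S_mag = math.sqrt(Sxx*Sxx + Syy*Syy + 2.0*Sxy*Sxy)

# Same thing computed elementwise over the full 2x2 matrix
S = [[Sxx, Sxy], [Sxy, Syy]]
frob = math.sqrt(sum(e*e for row in S for e in row))

# Old formula with the extra leading 2: larger by sqrt(2)
old = math.sqrt(2.0 * (Sxx*Sxx + Syy*Syy + 2.0*Sxy*Sxy))
```

Since nu_t is computed as k·|b|/S_mag in the fallback branch, the √2 mismatch between kernels would bias the eddy viscosity by a constant factor.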
Make reference loading deterministic
Suggestion Impact:The commit removes the O(N^2) nearest-neighbor scan and replaces it with direct i/j index computation using (x-x0)/dx and (y-y0)/dy with llround, adds bounds checking, and adds a tolerance-based sanity check before setting the field value.

code diff:

+    // Direct indexing for uniform mesh (much faster than nearest-neighbor)
+    const double x0 = mesh.x(mesh.i_begin());
+    const double y0 = mesh.y(mesh.j_begin());
+    const double inv_dx = 1.0 / mesh.dx;
+    const double inv_dy = 1.0 / mesh.dy;
+    
     while (std::getline(file, line)) {
         // Skip comments and blank lines
         if (line.empty() || line[0] == '#') continue;
@@ -52,28 +58,28 @@
         double x, y, value;
         if (!(iss >> x >> y >> value)) continue;
         
-        // Find closest mesh point (simple nearest-neighbor)
-        int best_i = -1, best_j = -1;
-        double min_dist = 1e30;
-        for (int j = mesh.j_begin(); j < mesh.j_end(); ++j) {
-            for (int i = mesh.i_begin(); i < mesh.i_end(); ++i) {
-                const double dx = mesh.x(i) - x;
-                const double dy = mesh.y(j) - y;
-                const double dist = dx*dx + dy*dy;
-                if (dist < min_dist) {
-                    min_dist = dist;
-                    best_i = i;
-                    best_j = j;
-                }
-            }
-        }
-        if (best_i >= 0 && best_j >= 0) {
-            // Count only if this cell wasn't already set
-            if (!std::isfinite(field(best_i, best_j))) {
-                ++num_set;
-            }
-            field(best_i, best_j) = value;
-        }
+        // Direct index calculation for uniform mesh
+        const int i = mesh.i_begin() + static_cast<int>(std::llround((x - x0) * inv_dx));
+        const int j = mesh.j_begin() + static_cast<int>(std::llround((y - y0) * inv_dy));
+        
+        // Check bounds
+        if (i < mesh.i_begin() || i >= mesh.i_end() || j < mesh.j_begin() || j >= mesh.j_end()) {
+            continue; // out-of-domain line
+        }
+        
+        // Optional sanity: ensure the file point matches the chosen cell center
+        // Use a tolerance that accounts for typical printf/iostream rounding
+        const double dx_err = std::abs(mesh.x(i) - x);
+        const double dy_err = std::abs(mesh.y(j) - y);
+        if (dx_err > 0.01 * mesh.dx || dy_err > 0.01 * mesh.dy) {
+            continue;
+        }
+        
+        // Count only if this cell wasn't already set
+        if (!std::isfinite(field(i, j))) {
+            ++num_set;
+        }
+        field(i, j) = value;
     }

Refactor the read_scalar_field_from_dat function to improve performance and
robustness. Replace the slow nearest-neighbor search with a direct index
calculation for the uniform mesh.

tests/test_cpu_gpu_consistency.cpp [36-87]

 ScalarField read_scalar_field_from_dat(const std::string& filename, const Mesh& mesh) {
     std::ifstream file(filename);
     if (!file) {
         throw std::runtime_error("Cannot open reference file: " + filename);
     }
-    
-    // Initialize with NaN to detect unpopulated cells
+
     ScalarField field(mesh, std::numeric_limits<double>::quiet_NaN());
     std::string line;
     int num_set = 0;
-    
+
+    const double x0 = mesh.x(mesh.i_begin());
+    const double y0 = mesh.y(mesh.j_begin());
+    const double inv_dx = 1.0 / mesh.dx;
+    const double inv_dy = 1.0 / mesh.dy;
+
     while (std::getline(file, line)) {
-        // Skip comments and blank lines
         if (line.empty() || line[0] == '#') continue;
-        
+
         std::istringstream iss(line);
         double x, y, value;
         if (!(iss >> x >> y >> value)) continue;
-        
-        // Find closest mesh point (simple nearest-neighbor)
-        int best_i = -1, best_j = -1;
-        double min_dist = 1e30;
-        for (int j = mesh.j_begin(); j < mesh.j_end(); ++j) {
-            for (int i = mesh.i_begin(); i < mesh.i_end(); ++i) {
-                const double dx = mesh.x(i) - x;
-                const double dy = mesh.y(j) - y;
-                const double dist = dx*dx + dy*dy;
-                if (dist < min_dist) {
-                    min_dist = dist;
-                    best_i = i;
-                    best_j = j;
-                }
-            }
+
+        const int i = mesh.i_begin() + static_cast<int>(std::llround((x - x0) * inv_dx));
+        const int j = mesh.j_begin() + static_cast<int>(std::llround((y - y0) * inv_dy));
+
+        if (i < mesh.i_begin() || i >= mesh.i_end() || j < mesh.j_begin() || j >= mesh.j_end()) {
+            continue;  // out-of-domain line
         }
-        if (best_i >= 0 && best_j >= 0) {
-            // Count only if this cell wasn't already set
-            if (!std::isfinite(field(best_i, best_j))) {
-                ++num_set;
-            }
-            field(best_i, best_j) = value;
+
+        // Optional sanity: ensure (x,y) matches the target cell center closely
+        const double dx_err = std::abs(mesh.x(i) - x);
+        const double dy_err = std::abs(mesh.y(j) - y);
+        if (dx_err > 1e-8 || dy_err > 1e-8) {
+            continue;
         }
+
+        if (!std::isfinite(field(i, j))) {
+            ++num_set;
+        }
+        field(i, j) = value;
     }
-    
-    // Verify all interior cells were populated
+
     const int expected = (mesh.i_end() - mesh.i_begin()) * (mesh.j_end() - mesh.j_begin());
     if (num_set != expected) {
         throw std::runtime_error("Reference file did not populate all interior cells: " +
                                  std::to_string(num_set) + "/" + std::to_string(expected));
     }
-    
+
     return field;
 }

Suggestion importance[1-10]: 6

Why: The suggestion correctly identifies a performance and robustness issue in a test helper function and proposes a much better implementation, improving the quality and reliability of the test suite.

Low
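The direct-index lookup from the applied commit can be sketched in a few lines. This is a hypothetical stand-alone version (the function name and the 0.01·dx tolerance mirror the commit but are illustrative), assuming a uniform mesh whose first interior cell center sits at (x0, y0):

```python
# Map a point (x, y) to mesh indices by rounding, instead of an
# O(N^2) nearest-neighbor scan. Returns None for out-of-domain
# points or points that don't land near a cell center.
def locate(x, y, x0, y0, dx, dy, i_begin, i_end, j_begin, j_end):
    i = i_begin + int(round((x - x0) / dx))
    j = j_begin + int(round((y - y0) / dy))
    if not (i_begin <= i < i_end and j_begin <= j < j_end):
        return None  # out-of-domain line
    # Sanity check: the file point must match the chosen cell center
    if abs((x0 + (i - i_begin) * dx) - x) > 0.01 * dx:
        return None
    if abs((y0 + (j - j_begin) * dy) - y) > 0.01 * dy:
        return None
    return i, j
```

This is O(1) per point versus O(Nx·Ny) for the scan, and unlike nearest-neighbor it rejects rather than silently snaps points that belong to no cell.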
Remove hard-coded basis sizing
Suggestion Impact:Updated basis_flat_ resize to use TensorBasis::NUM_BASIS * 3 instead of the hard-coded 12, making the buffer sizing depend on the actual number of basis tensors.

code diff:

     features_flat_.resize(n_cells * feature_dim);
-    basis_flat_.resize(n_cells * 12);  // 4 basis tensors × 3 components
+    basis_flat_.resize(n_cells * TensorBasis::NUM_BASIS * 3);  // NUM_BASIS tensors × 3 components

Replace the hard-coded size 12 for basis_flat_ with a calculation using
TensorBasis::NUM_BASIS to avoid potential memory corruption if the number of
basis tensors changes.

src/turbulence_nn_tbnn.cpp [72-76]

 // Allocate CPU buffers
 features_flat_.resize(n_cells * feature_dim);
-basis_flat_.resize(n_cells * 12);  // 4 basis tensors × 3 components
+basis_flat_.resize(n_cells * TensorBasis::NUM_BASIS * 3);  // NUM_BASIS tensors × 3 components
 outputs_flat_.resize(n_cells * output_dim);
 workspace_.resize(workspace_size);

Suggestion importance[1-10]: 6

Why: The suggestion correctly identifies a hard-coded value (12) that should be derived from the TensorBasis::NUM_BASIS constant to improve maintainability and prevent future bugs.

Low
Possible issue
Map correct device buffer size
Suggestion Impact:The commit changes total_field_size from (Nx + 2*Ng) * (Ny + 2*Ng) to stride * (Ny + 2*Ng), matching the suggestion to correctly size the mapped nu_t_field buffer on the GPU.

code diff:

-    const int total_field_size = (Nx + 2*Ng) * (Ny + 2*Ng);
+    const int total_field_size = stride * (Ny + 2*Ng);

In postprocess_mlp_outputs_gpu, correct the calculation of total_field_size to
use the passed stride parameter, ensuring the OpenMP target mapping for
nu_t_field is large enough to prevent out-of-bounds writes on the GPU.

src/gpu_kernels.cpp [225-250]

-const int total_field_size = (Nx + 2*Ng) * (Ny + 2*Ng);
+const int total_field_size = stride * (Ny + 2*Ng);
 
 // CRITICAL: map(present:...) indicates these arrays are already mapped
 #pragma omp target teams distribute parallel for collapse(2) \
     map(present: nn_outputs[0:(Nx*Ny)], nu_t_field[0:total_field_size])
 for (int jj = 0; jj < Ny; ++jj) {
     for (int ii = 0; ii < Nx; ++ii) {
         const int i = ii + Ng;
         const int j = jj + Ng;
         const int idx_in = jj * Nx + ii;
         const int idx_out = j * stride + i;
         
-        // Get NN prediction
         double nu_t_val = nn_outputs[idx_in];
         
-        // Apply realizability and clipping
-        if (nu_t_val != nu_t_val || nu_t_val < 0.0) {  // NaN or negative
+        if (nu_t_val != nu_t_val || nu_t_val < 0.0) {
             nu_t_val = 0.0;
         }
         if (nu_t_val > nu_t_max) {
             nu_t_val = nu_t_max;
         }
         
         nu_t_field[idx_out] = nu_t_val;
     }
 }

Suggestion importance[1-10]: 8

Why: The suggestion correctly identifies a critical bug where the allocated size for nu_t_field on the GPU is calculated with an incorrect stride, which could lead to out-of-bounds memory writes and undefined behavior.

Medium
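The hazard is easy to see with concrete numbers. A sketch (dimensions and padding are made up) showing that the highest index the post-processing loop writes exceeds the unpadded size whenever the row stride is larger than Nx + 2*Ng, while the stride-based size always covers it:

```python
# Hypothetical field dimensions with a padded row stride
Nx, Ny, Ng = 8, 4, 2
stride = Nx + 2*Ng + 6  # padded layout: stride > Nx + 2*Ng

wrong_size = (Nx + 2*Ng) * (Ny + 2*Ng)  # buggy mapped size
right_size = stride * (Ny + 2*Ng)        # stride-aware size

# Largest index written by the loop: j = Ny-1+Ng, i = Nx-1+Ng
max_idx = (Ny - 1 + Ng) * stride + (Nx - 1 + Ng)
```

With the buggy size, the device mapping for nu_t_field is too small and the final rows of the loop write past the mapped region, which is undefined behavior under OpenMP target offload.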
Use correct stride parameters
Suggestion Impact:The commit replaces the hard-coded u_stride and v_stride arguments with vel.u_stride() and vel.v_stride(), improving robustness as suggested.

code diff:

-        mesh.Nx + 2*mesh.Nghost + 1,  // u_stride
-        mesh.Nx + 2*mesh.Nghost,       // v_stride
-        mesh.total_Nx(),                // cell_stride
+        vel.u_stride(),        // u_stride
+        vel.v_stride(),        // v_stride
+        mesh.total_Nx(),       // cell_stride

Replace hard-coded stride calculations with accessor methods like vel.u_stride()
when calling gpu_kernels::compute_gradients_from_mac_gpu to improve code
robustness.

tests/test_features.cpp [397-406]

 gpu_kernels::compute_gradients_from_mac_gpu(
     u_ptr, v_ptr,
     dudx_ptr, dudy_ptr, dvdx_ptr, dvdy_ptr,
     mesh.Nx, mesh.Ny, mesh.Nghost,
     mesh.dx, mesh.dy,
-    mesh.Nx + 2*mesh.Nghost + 1,  // u_stride
-    mesh.Nx + 2*mesh.Nghost,       // v_stride
-    mesh.total_Nx(),                // cell_stride
+    vel.u_stride(),        // u_stride
+    vel.v_stride(),        // v_stride
+    mesh.total_Nx(),       // cell_stride
     u_total, v_total, total_cells
 );

Suggestion importance[1-10]: 7

Why: The suggestion correctly points out that using accessor methods like vel.u_stride() is more robust than hardcoding stride calculations, improving maintainability and reducing the risk of future bugs if data structures change.

Medium
Detect incomplete layer file pairs

Modify the file validation loop to detect and report an error if a model layer
is incomplete (i.e., a weight file exists without its corresponding bias file,
or vice-versa).

scripts/validate_model_files.py [110-115]

 while True:
     W_file = model_dir / f'layer{layer_idx}_W.txt'
     b_file = model_dir / f'layer{layer_idx}_b.txt'
-    
-    if not W_file.exists() or not b_file.exists():
+
+    has_W = W_file.exists()
+    has_b = b_file.exists()
+
+    if not has_W and not has_b:
         break
 
+    if has_W != has_b:
+        issues.append(
+            f"Incomplete layer {layer_idx} files: "
+            f"W exists={has_W}, b exists={has_b}"
+        )
+        break
+

Suggestion importance[1-10]: 7

Why: The suggestion correctly identifies a logic flaw where incomplete model layer files are not detected, and the proposed change improves the validation script's correctness.

Medium
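The proposed pairing check can be exercised in isolation. A minimal sketch (scan_layers is a hypothetical stand-in for the validation loop) with one complete layer and one orphaned weight file:

```python
import tempfile
from pathlib import Path

# A weight file without its bias file (or vice versa) is reported
# as an issue instead of being silently treated as "end of layers".
def scan_layers(model_dir: Path):
    issues, layer_idx = [], 0
    while True:
        has_W = (model_dir / f'layer{layer_idx}_W.txt').exists()
        has_b = (model_dir / f'layer{layer_idx}_b.txt').exists()
        if not has_W and not has_b:
            break
        if has_W != has_b:
            issues.append(f"Incomplete layer {layer_idx}: W={has_W}, b={has_b}")
            break
        layer_idx += 1
    return layer_idx, issues

with tempfile.TemporaryDirectory() as tmp:
    d = Path(tmp)
    (d / 'layer0_W.txt').write_text('1 2\n')
    (d / 'layer0_b.txt').write_text('0\n')
    (d / 'layer1_W.txt').write_text('3 4\n')  # no layer1_b.txt
    n_layers, issues = scan_layers(d)
```

Without the `has_W != has_b` branch, the original loop would stop at layer 1 and report a one-layer model with no error, masking a corrupted export.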
Use real strides and sizes

Replace hard-coded stride and size calculations with accessor methods from the
mesh and velocity objects to improve code robustness and consistency.

src/turbulence_nn_mlp.cpp [192-197]

-const int total_cells = (Nx + 2*Ng) * (Ny + 2*Ng);
+const int total_cells = mesh.total_cells();
 const int u_total = velocity.u_total_size();
 const int v_total = velocity.v_total_size();
-const int cell_stride = Nx + 2*Ng;
-const int u_stride = Nx + 2*Ng + 1;
-const int v_stride = Nx + 2*Ng;
+const int cell_stride = mesh.total_Nx();
+const int u_stride = velocity.u_stride();
+const int v_stride = velocity.v_stride();

Suggestion importance[1-10]: 6

Why: The suggestion correctly identifies that hard-coding stride and size calculations is brittle. Using accessor methods from mesh and velocity makes the code more robust and maintainable, as demonstrated by the consistent approach in turbulence_nn_tbnn.cpp.

Low
Handle single-value stats safely

Ensure np.loadtxt results are always arrays to prevent a TypeError when a file
contains a single value, and add a shape mismatch check for robustness.

scripts/fix_normalization_stats.py [108-136]

-means = np.loadtxt(means_file)
-stds = np.loadtxt(stds_file)
+means = np.atleast_1d(np.loadtxt(means_file)).astype(float).ravel()
+stds = np.atleast_1d(np.loadtxt(stds_file)).astype(float).ravel()
+if means.shape != stds.shape:
+    print(f"ERROR: Means/stds shape mismatch: means={means.shape}, stds={stds.shape}")
+    return False
 ...
-n_features = len(means)
+n_features = means.size

Suggestion importance[1-10]: 6

Why: The suggestion correctly identifies a TypeError edge case when np.loadtxt reads a single value and provides a robust fix, improving code correctness.

Low
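The underlying NumPy behavior is worth demonstrating: np.loadtxt returns a 0-d array for a single-value file, so len() on it raises TypeError, while np.atleast_1d(...).ravel() normalizes both the single-value and multi-value cases:

```python
import io
import numpy as np

# Single-value file -> 0-d array; len(single) would raise TypeError
single = np.loadtxt(io.StringIO("0.5\n"))

# atleast_1d + ravel gives a uniform 1-D shape either way
means = np.atleast_1d(single).astype(float).ravel()
multi = np.atleast_1d(np.loadtxt(io.StringIO("1.0 2.0 3.0\n"))).ravel()
```

Using `means.size` instead of `len(means)` is then safe for any feature count, including the degenerate one-feature model.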
Require minimum model files exist

Strengthen the resolve_model_dir test helper by checking for both layer0_W.txt
and layer0_b.txt to ensure a minimal set of model files exists, preventing flaky
test failures.

tests/test_backend_execution.cpp [31-51]

 static std::string resolve_model_dir(const std::string& p) {
     // Strip trailing slashes
     std::string path = p;
     while (!path.empty() && path.back() == '/') {
         path.pop_back();
     }
-    
+
+    auto has_min_files = [&](const std::string& dir) {
+        return file_exists(dir + "/layer0_W.txt") && file_exists(dir + "/layer0_b.txt");
+    };
+
     // Try relative to current directory (when running from repo root)
-    if (file_exists(path + "/layer0_W.txt")) {
+    if (has_min_files(path)) {
         return path;
     }
-    
+
     // Try relative to build directory (when running from build/)
-    if (file_exists("../" + path + "/layer0_W.txt")) {
+    if (has_min_files("../" + path)) {
         return "../" + path;
     }
-    
+
     throw std::runtime_error(
         "NN model files not found. Tried: " + path + " and ../" + path
     );
 }

Suggestion importance[1-10]: 5

Why: The suggestion correctly points out a weakness in the model directory validation logic within the test suite. Checking for both weights and biases files makes the test helper more robust and provides clearer failure modes, improving the reliability of the tests.

Low
General
Fail fast when CLI missing

Add a check to ensure the kaggle command-line tool is installed before
attempting to use it, providing a clearer error message to the user if it is
missing.

scripts/download_mcconkey_data.sh [47]

+if ! command -v kaggle >/dev/null 2>&1; then
+    echo "ERROR: Kaggle CLI not found on PATH." >&2
+    echo "Install it with: pip install kaggle" >&2
+    exit 1
+fi
+
 kaggle datasets download -d "$DATASET_NAME" -p "$ZIP_DIR"

Suggestion importance[1-10]: 5

Why: The suggestion correctly points out that a check for the kaggle CLI was removed and suggests re-adding it for better user experience, which is a valid improvement.

Low

Previous suggestions

✅ Suggestions up to commit ddc4146
Category | Suggestion | Impact
Possible issue
Fix ghosted indexing and mapping
Suggestion Impact:The function signature was changed from n_cells to Nx/Ny/Ng plus cell_stride/total_cells, and the kernel was rewritten from a 1D loop to a collapsed 2D (jj, ii) loop computing idx_out (interior) and idx_cell (ghosted). Additionally, when compute_tau is true, tau_xx/tau_xy/tau_yy are included in the OpenMP target map(present: ...) clause and written using idx_cell.

code diff:

@@ -385,7 +385,9 @@
     const double* dvdx, const double* dvdy,
     double* nu_t,
     double* tau_xx, double* tau_xy, double* tau_yy,
-    int n_cells, int output_dim,
+    int Nx, int Ny, int Ng,
+    int cell_stride, int total_cells,
+    int output_dim,
     double nu_ref)
 {
 #ifdef USE_GPU_OFFLOAD
@@ -393,78 +395,151 @@
     const bool compute_tau = (tau_xx != nullptr);
     
     // CRITICAL: map(present:...) indicates these arrays are already mapped by turbulence model
-    // Note: tau arrays may be nullptr, but map clause is safe (runtime ignores null pointers)
-    #pragma omp target teams distribute parallel for \
-        map(present: nn_outputs[0:(n_cells*output_dim)], basis[0:(n_cells*12)], \
-                     k[0:n_cells], dudx[0:n_cells], dudy[0:n_cells], \
-                     dvdx[0:n_cells], dvdy[0:n_cells], nu_t[0:n_cells])
-    for (int idx = 0; idx < n_cells; ++idx) {
-        // Extract G coefficients from NN output
-        double G[4] = {0.0, 0.0, 0.0, 0.0};
-        int out_base = idx * output_dim;
-        for (int n = 0; n < NUM_BASIS && n < output_dim; ++n) {
-            G[n] = nn_outputs[out_base + n];
-        }
-        
-        // Get basis tensors
-        int basis_base = idx * 12;
-        
-        // Construct anisotropy tensor: b_ij = sum_n G_n * T^n_ij
-        double b_xx = 0.0, b_xy = 0.0, b_yy = 0.0;
-        for (int n = 0; n < NUM_BASIS; ++n) {
-            b_xx += G[n] * basis[basis_base + n*3 + 0];
-            b_xy += G[n] * basis[basis_base + n*3 + 1];
-            b_yy += G[n] * basis[basis_base + n*3 + 2];
-        }
-        
-        // Compute Reynolds stresses if requested
-        double k_val = k[idx];
-        if (compute_tau) {
-            double k_safe = (k_val > 0.0) ? k_val : 0.0;
-            tau_xx[idx] = 2.0 * k_safe * (b_xx + 1.0/3.0);
-            tau_xy[idx] = 2.0 * k_safe * b_xy;
-            tau_yy[idx] = 2.0 * k_safe * (b_yy + 1.0/3.0);
-        }
-        
-        // Compute equivalent eddy viscosity
-        // nu_t = -b_xy * k / S_xy (when S_xy is significant)
-        double dudy_v = dudy[idx];
-        double dvdx_v = dvdx[idx];
-        double Sxy = 0.5 * (dudy_v + dvdx_v);
-        
-        double nu_t_val = 0.0;
-        if (fabs(Sxy) > 1e-10) {
-            nu_t_val = fabs(-b_xy * k_val / Sxy);
-        } else {
-            // Fallback: use strain magnitude
-            double dudx_v = dudx[idx];
-            double dvdy_v = dvdy[idx];
-            double Sxx = dudx_v;
-            double Syy = dvdy_v;
-            double S_mag = sqrt(2.0 * (Sxx*Sxx + Syy*Syy + 2.0*Sxy*Sxy));
-            if (S_mag > 1e-10) {
-                double b_mag = sqrt(b_xx*b_xx + 2.0*b_xy*b_xy + b_yy*b_yy);
-                nu_t_val = k_val * b_mag / S_mag;
+    // nn_outputs/basis are interior-only (Nx*Ny), k/gradients/nu_t/tau are ghosted (total_cells)
+    if (!compute_tau) {
+        #pragma omp target teams distribute parallel for collapse(2) \
+            map(present: nn_outputs[0:(Nx*Ny*output_dim)], basis[0:(Nx*Ny*12)], \
+                         k[0:total_cells], dudx[0:total_cells], dudy[0:total_cells], \
+                         dvdx[0:total_cells], dvdy[0:total_cells], nu_t[0:total_cells])
+        for (int jj = 0; jj < Ny; ++jj) {
+            for (int ii = 0; ii < Nx; ++ii) {
+                const int i = ii + Ng;
+                const int j = jj + Ng;
+                const int idx_cell = j * cell_stride + i;   // ghosted field index
+                const int idx_out  = jj * Nx + ii;          // interior index for nn/basis
+                
+                // Extract G coefficients from NN output
+                double G[NUM_BASIS] = {0.0, 0.0, 0.0, 0.0};
+                const int out_base = idx_out * output_dim;
+                for (int n = 0; n < NUM_BASIS && n < output_dim; ++n) {
+                    G[n] = nn_outputs[out_base + n];
+                }
+                
+                // Get basis tensors
+                const int basis_base = idx_out * 12;
+                
+                // Construct anisotropy tensor: b_ij = sum_n G_n * T^n_ij
+                double b_xx = 0.0, b_xy = 0.0, b_yy = 0.0;
+                for (int n = 0; n < NUM_BASIS; ++n) {
+                    b_xx += G[n] * basis[basis_base + n*3 + 0];
+                    b_xy += G[n] * basis[basis_base + n*3 + 1];
+                    b_yy += G[n] * basis[basis_base + n*3 + 2];
+                }
+                
+                // Compute equivalent eddy viscosity
+                const double k_val = k[idx_cell];
+                const double dudy_v = dudy[idx_cell];
+                const double dvdx_v = dvdx[idx_cell];
+                const double Sxy = 0.5 * (dudy_v + dvdx_v);
+                
+                double nu_t_val = 0.0;
+                if (fabs(Sxy) > 1e-10) {
+                    nu_t_val = fabs(-b_xy * k_val / Sxy);
+                } else {
+                    // Fallback: use strain magnitude
+                    const double dudx_v = dudx[idx_cell];
+                    const double dvdy_v = dvdy[idx_cell];
+                    const double Sxx = dudx_v;
+                    const double Syy = dvdy_v;
+                    const double S_mag = sqrt(2.0 * (Sxx*Sxx + Syy*Syy + 2.0*Sxy*Sxy));
+                    if (S_mag > 1e-10) {
+                        const double b_mag = sqrt(b_xx*b_xx + 2.0*b_xy*b_xy + b_yy*b_yy);
+                        nu_t_val = k_val * b_mag / S_mag;
+                    }
+                }
+                
+                // Clip to reasonable bounds
+                const double max_nu_t = 10.0 * nu_ref;
+                nu_t_val = (nu_t_val > 0.0) ? nu_t_val : 0.0;
+                nu_t_val = (nu_t_val < max_nu_t) ? nu_t_val : max_nu_t;
+                
+                // Check for NaN/Inf
+                if (nu_t_val != nu_t_val || nu_t_val > 1e30) {
+                    nu_t_val = 0.0;
+                }
+                
+                nu_t[idx_cell] = nu_t_val;
             }
         }
-        
-        // Clip to reasonable bounds
-        double max_nu_t = 10.0 * nu_ref;
-        nu_t_val = (nu_t_val > 0.0) ? nu_t_val : 0.0;
-        nu_t_val = (nu_t_val < max_nu_t) ? nu_t_val : max_nu_t;
-        
-        // Check for NaN/Inf
-        if (nu_t_val != nu_t_val || nu_t_val > 1e30) {  // NaN check: x != x is true only for NaN
-            nu_t_val = 0.0;
-        }
-        
-        nu_t[idx] = nu_t_val;
+    } else {
+        #pragma omp target teams distribute parallel for collapse(2) \
+            map(present: nn_outputs[0:(Nx*Ny*output_dim)], basis[0:(Nx*Ny*12)], \
+                         k[0:total_cells], dudx[0:total_cells], dudy[0:total_cells], \
+                         dvdx[0:total_cells], dvdy[0:total_cells], nu_t[0:total_cells], \
+                         tau_xx[0:total_cells], tau_xy[0:total_cells], tau_yy[0:total_cells])
+        for (int jj = 0; jj < Ny; ++jj) {
+            for (int ii = 0; ii < Nx; ++ii) {
+                const int i = ii + Ng;
+                const int j = jj + Ng;
+                const int idx_cell = j * cell_stride + i;
+                const int idx_out  = jj * Nx + ii;
+                
+                // Extract G coefficients from NN output
+                double G[NUM_BASIS] = {0.0, 0.0, 0.0, 0.0};
+                const int out_base = idx_out * output_dim;
+                for (int n = 0; n < NUM_BASIS && n < output_dim; ++n) {
+                    G[n] = nn_outputs[out_base + n];
+                }
+                
+                // Get basis tensors
+                const int basis_base = idx_out * 12;
+                
+                // Construct anisotropy tensor: b_ij = sum_n G_n * T^n_ij
+                double b_xx = 0.0, b_xy = 0.0, b_yy = 0.0;
+                for (int n = 0; n < NUM_BASIS; ++n) {
+                    b_xx += G[n] * basis[basis_base + n*3 + 0];
+                    b_xy += G[n] * basis[basis_base + n*3 + 1];
+                    b_yy += G[n] * basis[basis_base + n*3 + 2];
+                }
+                
+                // Compute Reynolds stresses
+                const double k_val = k[idx_cell];
+                const double k_safe = (k_val > 0.0) ? k_val : 0.0;
+                tau_xx[idx_cell] = 2.0 * k_safe * (b_xx + 1.0/3.0);
+                tau_xy[idx_cell] = 2.0 * k_safe * b_xy;
+                tau_yy[idx_cell] = 2.0 * k_safe * (b_yy + 1.0/3.0);
+                
+                // Compute equivalent eddy viscosity
+                const double dudy_v = dudy[idx_cell];
+                const double dvdx_v = dvdx[idx_cell];
+                const double Sxy = 0.5 * (dudy_v + dvdx_v);
+                
+                double nu_t_val = 0.0;
+                if (fabs(Sxy) > 1e-10) {
+                    nu_t_val = fabs(-b_xy * k_val / Sxy);
+                } else {
+                    // Fallback: use strain magnitude
+                    const double dudx_v = dudx[idx_cell];
+                    const double dvdy_v = dvdy[idx_cell];
+                    const double Sxx = dudx_v;
+                    const double Syy = dvdy_v;
+                    const double S_mag = sqrt(2.0 * (Sxx*Sxx + Syy*Syy + 2.0*Sxy*Sxy));
+                    if (S_mag > 1e-10) {
+                        const double b_mag = sqrt(b_xx*b_xx + 2.0*b_xy*b_xy + b_yy*b_yy);
+                        nu_t_val = k_val * b_mag / S_mag;
+                    }
+                }
+                
+                // Clip to reasonable bounds
+                const double max_nu_t = 10.0 * nu_ref;
+                nu_t_val = (nu_t_val > 0.0) ? nu_t_val : 0.0;
+                nu_t_val = (nu_t_val < max_nu_t) ? nu_t_val : max_nu_t;
+                
+                // Check for NaN/Inf
+                if (nu_t_val != nu_t_val || nu_t_val > 1e30) {
+                    nu_t_val = 0.0;
+                }
+                
+                nu_t[idx_cell] = nu_t_val;
+            }
+        }
     }
 #else
     (void)nn_outputs; (void)basis; (void)k;
     (void)dudx; (void)dudy; (void)dvdx; (void)dvdy;
     (void)nu_t; (void)tau_xx; (void)tau_xy; (void)tau_yy;
-    (void)n_cells; (void)output_dim; (void)nu_ref;
+    (void)Nx; (void)Ny; (void)Ng; (void)cell_stride; (void)total_cells;
+    (void)output_dim; (void)nu_ref;
 #endif

Refactor postprocess_nn_outputs_gpu to correctly handle ghosted fields by using
a 2D loop with strides, similar to other GPU kernels. Additionally, ensure tau
buffers are correctly mapped to the device when compute_tau is true.

src/gpu_kernels.cpp [380-405]

 void postprocess_nn_outputs_gpu(
     const double* nn_outputs,
     const double* basis,
     const double* k,
     const double* dudx,
     const double* dudy,
     const double* dvdx,
     const double* dvdy,
     double* nu_t,
     double* tau_xx,
     double* tau_xy,
     double* tau_yy,
-    int n_cells,
+    int Nx, int Ny, int Ng,
+    int cell_stride, int total_cells,
     int output_dim,
     double nu)
 {
 #ifdef USE_GPU_OFFLOAD
-    const int NUM_BASIS = 4;
     const bool compute_tau = (tau_xx != nullptr);
-    
-    // CRITICAL: map(present:...) indicates these arrays are already mapped by turbulence model
-    // Note: tau arrays may be nullptr, but map clause is safe (runtime ignores null pointers)
-    #pragma omp target teams distribute parallel for \
-        map(present: nn_outputs[0:(n_cells*output_dim)], basis[0:(n_cells*12)], \
-                     k[0:n_cells], dudx[0:n_cells], dudy[0:n_cells], \
-                     dvdx[0:n_cells], dvdy[0:n_cells], nu_t[0:n_cells])
-    for (int idx = 0; idx < n_cells; ++idx) {
-        // Extract G coefficients from NN output
-        double G[4] = {0.0, 0.0, 0.0, 0.0};
 
+    if (!compute_tau) {
+        #pragma omp target teams distribute parallel for collapse(2) \
+            map(present: nn_outputs[0:(Nx*Ny*output_dim)], basis[0:(Nx*Ny*12)], \
+                         k[0:total_cells], dudx[0:total_cells], dudy[0:total_cells], \
+                         dvdx[0:total_cells], dvdy[0:total_cells], nu_t[0:total_cells])
+        for (int jj = 0; jj < Ny; ++jj) {
+            for (int ii = 0; ii < Nx; ++ii) {
+                const int i = ii + Ng;
+                const int j = jj + Ng;
+                const int idx_cell = j * cell_stride + i;   // ghosted field index
+                const int idx_out  = jj * Nx + ii;          // interior index for nn/basis
+
+                // ... use idx_out for nn_outputs/basis, idx_cell for k/gradients/nu_t ...
+            }
+        }
+    } else {
+        #pragma omp target teams distribute parallel for collapse(2) \
+            map(present: nn_outputs[0:(Nx*Ny*output_dim)], basis[0:(Nx*Ny*12)], \
+                         k[0:total_cells], dudx[0:total_cells], dudy[0:total_cells], \
+                         dvdx[0:total_cells], dvdy[0:total_cells], nu_t[0:total_cells], \
+                         tau_xx[0:total_cells], tau_xy[0:total_cells], tau_yy[0:total_cells])
+        for (int jj = 0; jj < Ny; ++jj) {
+            for (int ii = 0; ii < Nx; ++ii) {
+                const int i = ii + Ng;
+                const int j = jj + Ng;
+                const int idx_cell = j * cell_stride + i;
+                const int idx_out  = jj * Nx + ii;
+
+                // ... same, plus write tau_* at idx_cell ...
+            }
+        }
+    }
+#else
+    (void)nn_outputs; (void)basis; (void)k; (void)dudx; (void)dudy; (void)dvdx; (void)dvdy;
+    (void)nu_t; (void)tau_xx; (void)tau_xy; (void)tau_yy;
+    (void)Nx; (void)Ny; (void)Ng; (void)cell_stride; (void)total_cells;
+    (void)output_dim; (void)nu;
+#endif
+}
+
Suggestion importance[1-10]: 10


Why: This suggestion identifies two critical bugs: incorrect memory indexing for ghosted fields and missing map clauses for tau buffers, both of which would cause memory corruption or crashes in the GPU pipeline.

High
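The two index spaces can be sketched with toy dimensions to show why the distinction matters: NN outputs and basis tensors are packed interior-only (Nx·Ny), while ghosted fields such as k and nu_t carry Ng ghost layers and use a row stride (values here are made up):

```python
# Interior-only vs ghosted indexing, as in the fixed kernel
Nx, Ny, Ng = 4, 3, 2
cell_stride = Nx + 2*Ng

pairs = []
for jj in range(Ny):
    for ii in range(Nx):
        idx_out = jj * Nx + ii                          # nn_outputs/basis
        idx_cell = (jj + Ng) * cell_stride + (ii + Ng)  # k/gradients/nu_t
        pairs.append((idx_out, idx_cell))
```

Using the flat interior index to address a ghosted field (as the original 1D loop did) would read the wrong cells and write nu_t into the ghost band, which is exactly the corruption the refactor eliminates.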
Write outputs to device buffer
Suggestion Impact:The commit removed the host-side nu_t_ptr and updated the GPU kernel call to use device_view->nu_t, aligning output writes with the device buffer and avoiding invalid host-pointer access on GPU.

code diff:

@@ -241,11 +241,10 @@
         {
             TIMED_SCOPE("nn_mlp_postprocess_gpu");
             double* out_ptr = outputs_flat_.data();
-            double* nu_t_ptr = nu_t.data().data();
             
             gpu_kernels::postprocess_mlp_outputs_gpu(
-                out_ptr,      // NN outputs (n_cells * 1)
-                nu_t_ptr,     // nu_t field with ghosts
+                out_ptr,             // NN outputs (n_cells * 1)
+                device_view->nu_t,   // nu_t field on device (with ghosts)
                 Nx, Ny, Ng,
                 cell_stride,
                 nu_t_max_

In the TurbulenceNNMLP::update GPU path, pass the device-mapped buffer
device_view->nu_t to postprocess_mlp_outputs_gpu instead of the host-side
pointer nu_t.data().data() to prevent memory access errors on the GPU.

src/turbulence_nn_mlp.cpp [244-252]

-double* nu_t_ptr = nu_t.data().data();
-
 gpu_kernels::postprocess_mlp_outputs_gpu(
-    out_ptr,      // NN outputs (n_cells * 1)
-    nu_t_ptr,     // nu_t field with ghosts
+    out_ptr,             // NN outputs (n_cells * 1)
+    device_view->nu_t,   // nu_t field on device (with ghosts)
     Nx, Ny, Ng,
     cell_stride,
     nu_t_max_
 );
Suggestion importance[1-10]: 9

__

Why: This suggestion correctly identifies a critical bug where a host pointer nu_t.data().data() is passed to a GPU kernel expecting a device pointer, which would lead to memory errors or crashes.

High
Validate reference field completeness
Suggestion Impact: The commit adds `<limits>`, initializes the ScalarField with quiet_NaN(), tracks how many unique cells are set (using isfinite checks), and throws if the number set doesn't match the expected interior cell count. It also slightly refactors parsing to skip invalid lines.

code diff:

@@ -18,6 +18,7 @@
 #include <fstream>
 #include <sstream>
 #include <cstring>
+#include <limits>
 
 #ifdef USE_GPU_OFFLOAD
 #include <omp.h>
@@ -38,8 +39,10 @@
         throw std::runtime_error("Cannot open reference file: " + filename);
     }
     
-    ScalarField field(mesh);
+    // Initialize with NaN to detect unpopulated cells
+    ScalarField field(mesh, std::numeric_limits<double>::quiet_NaN());
     std::string line;
+    int num_set = 0;
     
     while (std::getline(file, line)) {
         // Skip comments and blank lines
@@ -47,26 +50,37 @@
         
         std::istringstream iss(line);
         double x, y, value;
-        if (iss >> x >> y >> value) {
-            // Find closest mesh point (simple nearest-neighbor)
-            int best_i = -1, best_j = -1;
-            double min_dist = 1e30;
-            for (int j = mesh.j_begin(); j < mesh.j_end(); ++j) {
-                for (int i = mesh.i_begin(); i < mesh.i_end(); ++i) {
-                    double dx = mesh.x(i) - x;
-                    double dy = mesh.y(j) - y;
-                    double dist = dx*dx + dy*dy;
-                    if (dist < min_dist) {
-                        min_dist = dist;
-                        best_i = i;
-                        best_j = j;
-                    }
+        if (!(iss >> x >> y >> value)) continue;
+        
+        // Find closest mesh point (simple nearest-neighbor)
+        int best_i = -1, best_j = -1;
+        double min_dist = 1e30;
+        for (int j = mesh.j_begin(); j < mesh.j_end(); ++j) {
+            for (int i = mesh.i_begin(); i < mesh.i_end(); ++i) {
+                const double dx = mesh.x(i) - x;
+                const double dy = mesh.y(j) - y;
+                const double dist = dx*dx + dy*dy;
+                if (dist < min_dist) {
+                    min_dist = dist;
+                    best_i = i;
+                    best_j = j;
                 }
             }
-            if (best_i >= 0 && best_j >= 0) {
-                field(best_i, best_j) = value;
+        }
+        if (best_i >= 0 && best_j >= 0) {
+            // Count only if this cell wasn't already set
+            if (!std::isfinite(field(best_i, best_j))) {
+                ++num_set;
             }
-        }
+            field(best_i, best_j) = value;
+        }
+    }
+    
+    // Verify all interior cells were populated
+    const int expected = (mesh.i_end() - mesh.i_begin()) * (mesh.j_end() - mesh.j_begin());
+    if (num_set != expected) {
+        throw std::runtime_error("Reference file did not populate all interior cells: " +
+                                 std::to_string(num_set) + "/" + std::to_string(expected));
     }

Improve the robustness of read_scalar_field_from_dat by initializing the field
with a sentinel value (e.g., NaN) and verifying that all expected cells are
populated from the input file to prevent uninitialized data issues.

tests/test_cpu_gpu_consistency.cpp [35-73]

 ScalarField read_scalar_field_from_dat(const std::string& filename, const Mesh& mesh) {
     std::ifstream file(filename);
     if (!file) {
         throw std::runtime_error("Cannot open reference file: " + filename);
     }
-    
-    ScalarField field(mesh);
+
+    ScalarField field(mesh, std::numeric_limits<double>::quiet_NaN());
     std::string line;
-    
+
+    int num_set = 0;
+
     while (std::getline(file, line)) {
-        // Skip comments and blank lines
         if (line.empty() || line[0] == '#') continue;
-        
+
         std::istringstream iss(line);
         double x, y, value;
         if (iss >> x >> y >> value) {
-            // Find closest mesh point (simple nearest-neighbor)
             int best_i = -1, best_j = -1;
             double min_dist = 1e30;
+
             for (int j = mesh.j_begin(); j < mesh.j_end(); ++j) {
                 for (int i = mesh.i_begin(); i < mesh.i_end(); ++i) {
-                    double dx = mesh.x(i) - x;
-                    double dy = mesh.y(j) - y;
-                    double dist = dx*dx + dy*dy;
+                    const double dx = mesh.x(i) - x;
+                    const double dy = mesh.y(j) - y;
+                    const double dist = dx*dx + dy*dy;
                     if (dist < min_dist) {
                         min_dist = dist;
                         best_i = i;
                         best_j = j;
                     }
                 }
             }
+
             if (best_i >= 0 && best_j >= 0) {
+                if (!std::isfinite(field(best_i, best_j))) {
+                    ++num_set;
+                }
                 field(best_i, best_j) = value;
             }
         }
     }
-    
+
+    const int expected = mesh.Nx * mesh.Ny;
+    if (num_set != expected) {
+        throw std::runtime_error("Reference file did not populate all interior cells: " +
+                                 std::to_string(num_set) + "/" + std::to_string(expected));
+    }
+
     return field;
 }
Suggestion importance[1-10]: 8

__

Why: This suggestion correctly identifies a potential for non-deterministic behavior if the input .dat file doesn't populate all mesh cells, proposing a robust solution by initializing with NaN and verifying completeness.

Medium
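The NaN-sentinel pattern proposed above can be distilled into a standalone sketch: initialize with NaN, scatter values in, then verify completeness. All names here are illustrative, not the repository's actual API.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Fill the field with NaN, scatter (index, value) samples in, then throw
// if any cell was left unpopulated. Duplicate indices count only once.
std::vector<double> scatter_with_check(
    int n_cells, const std::vector<std::pair<int, double>>& samples) {
    std::vector<double> field(n_cells, std::numeric_limits<double>::quiet_NaN());
    int num_set = 0;
    for (const auto& s : samples) {
        if (!std::isfinite(field[s.first])) ++num_set;  // count unique cells only
        field[s.first] = s.second;
    }
    if (num_set != n_cells) {
        throw std::runtime_error("unpopulated cells: " + std::to_string(num_set) +
                                 "/" + std::to_string(n_cells));
    }
    return field;
}
```

The same idea works for any scatter-style loader where silent gaps would otherwise surface as nondeterministic comparisons downstream.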
Clear stale comparison outputs
Suggestion Impact: The script now deletes the cpu_gpu_comparison directory (rm -rf cpu_gpu_comparison) before recreating it, ensuring a clean reference output directory and preventing stale files from previous runs.

code diff:

 echo "--- Step 1: Generate CPU reference outputs ---"
+rm -rf cpu_gpu_comparison
 mkdir -p cpu_gpu_comparison

Add rm -rf cpu_gpu_comparison to ensure the CPU reference directory is clean
before generating new outputs, preventing interference from stale files from
previous runs.

.github/scripts/compare_cpu_gpu_builds.sh [28-33]

 echo "--- Step 1: Generate CPU reference outputs ---"
+rm -rf cpu_gpu_comparison
 mkdir -p cpu_gpu_comparison
 ./test_cpu_gpu_consistency --dump-prefix cpu_gpu_comparison/cpu_ref || {
     echo "[FAIL] CPU reference generation failed!"
     exit 1
 }

[Suggestion processed]

Suggestion importance[1-10]: 7

__

Why: This is a valid and important suggestion for CI script reliability, as it prevents stale artifacts from a previous run from causing false passes or failures, which improves the test's integrity.

Medium
Avoid unzipping stale archives

Add a command to remove stale zip files before downloading and re-introduce a
check to ensure the kaggle CLI is installed for better robustness.

scripts/download_mcconkey_data.sh [42-53]

 export KAGGLE_CONFIG_DIR="${HOME}/.kaggle"
+
+if ! command -v kaggle >/dev/null 2>&1; then
+    echo "ERROR: Kaggle CLI not found in PATH (required to download ${DATASET_NAME})." >&2
+    exit 1
+fi
 
 ZIP_DIR="${OUTPUT_DIR}/_kaggle_zip"
 mkdir -p "${ZIP_DIR}"
+rm -f "${ZIP_DIR}"/*.zip
 
 kaggle datasets download -d "$DATASET_NAME" -p "$ZIP_DIR"
 
 ZIP_FILE="$(ls -1t "${ZIP_DIR}"/*.zip 2>/dev/null | head -n 1 || true)"
 if [[ -z "${ZIP_FILE}" ]]; then
     echo "ERROR: Kaggle download completed but no .zip was found in ${ZIP_DIR}" >&2
     exit 1
 fi
Suggestion importance[1-10]: 6

__

Why: The suggestion correctly identifies a potential issue where stale .zip files could be used, and rm -f is a good fix. It also improves error handling by checking for the kaggle CLI, making the script more robust.

Low
✅ Suggestions up to commit ab96b34
Category | Suggestion | Impact
Possible issue
Write GPU outputs to mapped buffers
Suggestion Impact: The commit replaced host pointers (nu_t.data(), tau_ij->*_data()) with device_view->nu_t and device_view->tau_* pointers for GPU postprocessing, and added runtime checks ensuring the required device_view buffers are present.

code diff:

+    // Validate device_view has all required buffers
+    if (!device_view->u_face || !device_view->v_face ||
+        !device_view->k || !device_view->omega ||
+        !device_view->dudx || !device_view->dudy || !device_view->dvdx || !device_view->dvdy ||
+        !device_view->wall_distance ||
+        !device_view->nu_t) {
+        throw std::runtime_error("NN-TBNN GPU pipeline: device_view missing required buffers");
+    }
+    if (tau_ij && (!device_view->tau_xx || !device_view->tau_xy || !device_view->tau_yy)) {
+        throw std::runtime_error("NN-TBNN GPU pipeline: tau_ij requested but tau buffers not provided in device_view");
+    }
+    
     {
         TIMED_SCOPE("nn_tbnn_full_gpu");
         
@@ -757,10 +769,12 @@
             double* dudy_ptr = device_view->dudy;
             double* dvdx_ptr = device_view->dvdx;
             double* dvdy_ptr = device_view->dvdy;
-            double* nu_t_ptr = nu_t.data().data();
-            double* tau_xx_ptr = tau_ij ? tau_ij->xx_data().data() : nullptr;
-            double* tau_xy_ptr = tau_ij ? tau_ij->xy_data().data() : nullptr;
-            double* tau_yy_ptr = tau_ij ? tau_ij->yy_data().data() : nullptr;
+            
+            // IMPORTANT: write results to device-mapped buffers (host pointers are not valid on device)
+            double* nu_t_ptr = device_view->nu_t;
+            double* tau_xx_ptr = device_view->tau_xx;
+            double* tau_xy_ptr = device_view->tau_xy;
+            double* tau_yy_ptr = device_view->tau_yy;
             

In the GPU post-processing step, use device-mapped pointers from device_view for
nu_t and tau outputs instead of host-side pointers to prevent memory errors.

src/turbulence_nn_tbnn.cpp [753-772]

 double* out_ptr = outputs_flat_.data();
 double* basis_ptr = basis_flat_.data();
 double* k_ptr = device_view->k;
 double* dudx_ptr = device_view->dudx;
 double* dudy_ptr = device_view->dudy;
 double* dvdx_ptr = device_view->dvdx;
 double* dvdy_ptr = device_view->dvdy;
-double* nu_t_ptr = nu_t.data().data();
-double* tau_xx_ptr = tau_ij ? tau_ij->xx_data().data() : nullptr;
-double* tau_xy_ptr = tau_ij ? tau_ij->xy_data().data() : nullptr;
-double* tau_yy_ptr = tau_ij ? tau_ij->yy_data().data() : nullptr;
+
+// IMPORTANT: write results to device-mapped buffers (host pointers are not valid on device)
+double* nu_t_ptr = device_view->nu_t;
+double* tau_xx_ptr = device_view->tau_xx;
+double* tau_xy_ptr = device_view->tau_xy;
+double* tau_yy_ptr = device_view->tau_yy;
 
 gpu_kernels::postprocess_nn_outputs_gpu(
     out_ptr, basis_ptr,
     k_ptr, dudx_ptr, dudy_ptr, dvdx_ptr, dvdy_ptr,
     nu_t_ptr,
     tau_xx_ptr, tau_xy_ptr, tau_yy_ptr,
     n_cells, mlp_.output_dim(),
     nu_
 );
Suggestion importance[1-10]: 9

__

Why: This suggestion correctly identifies a critical bug where unmapped host pointers are used on the GPU, which would lead to crashes or memory corruption.

High
Preserve strain magnitude convention

Revert the change to the S_mag calculation. The new definition is a breaking
change that will silently invalidate existing trained models by altering feature
values.

include/features.hpp [35-38]

-/// Strain rate magnitude |S| = sqrt(S_ij * S_ij) (Frobenius norm)
+/// Strain rate magnitude |S| = sqrt(2 * S_ij * S_ij)
 double S_mag() const {
-    return std::sqrt(Sxx()*Sxx() + Syy()*Syy() + 2.0*Sxy()*Sxy());
+    return std::sqrt(2.0 * (Sxx()*Sxx() + Syy()*Syy() + 2.0*Sxy()*Sxy()));
 }
Suggestion importance[1-10]: 9

__

Why: The suggestion correctly identifies that changing the mathematical definition of the strain rate magnitude S_mag is a breaking change that will invalidate all previously trained models by silently altering feature inputs.

High
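The convention difference flagged above can be checked numerically. A minimal standalone sketch (function names are illustrative, not from `features.hpp`): for a 2D symmetric tensor the two definitions differ by exactly a factor of sqrt(2), so every feature built on |S| shifts if the convention changes.

```cpp
#include <cassert>
#include <cmath>

// Frobenius convention: |S| = sqrt(S_ij S_ij)
double s_mag_frobenius(double Sxx, double Syy, double Sxy) {
    return std::sqrt(Sxx*Sxx + Syy*Syy + 2.0*Sxy*Sxy);
}

// Doubled convention (common in SST-style limiters): |S| = sqrt(2 S_ij S_ij)
double s_mag_doubled(double Sxx, double Syy, double Sxy) {
    return std::sqrt(2.0 * (Sxx*Sxx + Syy*Syy + 2.0*Sxy*Sxy));
}
```

For pure shear du/dy = 1 (so Sxy = 0.5), the Frobenius form gives 1/sqrt(2) while the doubled form gives exactly 1 — a silent sqrt(2) rescaling of any feature that a model was trained against.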
Validate device buffers before kernels
Suggestion Impact: The commit added pre-kernel validation of required device_view pointers and added an additional check/error for missing tau buffers when tau_ij is provided, matching the intent of preventing null-pointer GPU crashes. It also adjusted output pointers to use device_view buffers.

code diff:

+    // Validate device_view has all required buffers
+    if (!device_view->u_face || !device_view->v_face ||
+        !device_view->k || !device_view->omega ||
+        !device_view->dudx || !device_view->dudy || !device_view->dvdx || !device_view->dvdy ||
+        !device_view->wall_distance ||
+        !device_view->nu_t) {
+        throw std::runtime_error("NN-TBNN GPU pipeline: device_view missing required buffers");
+    }
+    if (tau_ij && (!device_view->tau_xx || !device_view->tau_xy || !device_view->tau_yy)) {
+        throw std::runtime_error("NN-TBNN GPU pipeline: tau_ij requested but tau buffers not provided in device_view");
+    }

Before launching GPU kernels, add runtime checks to verify that all required
pointers within the device_view struct are non-null to prevent crashes.

src/turbulence_nn_tbnn.cpp [692-707]

 // GPU path: require device_view and gpu_ready (no fallback)
 if (!device_view || !gpu_ready_) {
     throw std::runtime_error("NN-TBNN GPU pipeline requires device_view and GPU buffers initialized");
 }
+if (!device_view->u_face || !device_view->v_face ||
+    !device_view->k || !device_view->omega ||
+    !device_view->dudx || !device_view->dudy || !device_view->dvdx || !device_view->dvdy ||
+    !device_view->wall_distance ||
+    !device_view->nu_t) {
+    throw std::runtime_error("NN-TBNN GPU pipeline: device_view missing required buffers");
+}
+if (tau_ij && (!device_view->tau_xx || !device_view->tau_xy || !device_view->tau_yy)) {
+    throw std::runtime_error("NN-TBNN GPU pipeline: tau_ij requested but tau buffers not provided");
+}
 
 {
     TIMED_SCOPE("nn_tbnn_full_gpu");
-    
+
     // Ensure GPU buffers are allocated
     allocate_gpu_buffers(n_cells);
-    
-    const int total_cells = mesh.total_cells();  // (Nx+2Ng)*(Ny+2Ng) - was incorrectly using Nx twice
+
+    const int total_cells = mesh.total_cells();
     ...

@sbryngelson
Member Author

/improve

@qodo-code-review

Persistent suggestions updated to latest commit c9f0ab7

@sbryngelson
Member Author

/improve

@qodo-code-review

Persistent suggestions updated to latest commit ab96b34


Copilot AI left a comment


Pull request overview

This PR introduces production-ready trained TBNN models while cleaning up legacy example code and significantly enhancing the test infrastructure. The changes include two fully-trained neural network models (channel flow and periodic hills), comprehensive test coverage for turbulence features, improved GPU/CPU backend handling, and better configuration validation.

Key Changes:

  • Added two trained TBNN models with complete weights, biases, and normalization parameters
  • Fixed parameter order bug in GEP turbulence model initialization
  • Enhanced configuration validation requiring explicit NN model specification
  • Expanded test suite with analytic verification tests and comprehensive turbulence model validation
  • Improved backend flexibility with graceful CPU fallback when GPU unavailable

Reviewed changes

Copilot reviewed 74 out of 75 changed files in this pull request and generated 3 comments.

Summary per file:

| File | Description |
| --- | --- |
| `src/turbulence_gep.cpp` | Fixed parameter order from `(nu_, u_ref_, delta_)` to `(nu_, delta_, u_ref_)` in `set_reference()` calls |
| `src/config.cpp` | Added validation requiring explicit NN model specification with improved error messages |
| `include/config.hpp` | Changed default NN paths from `"data/"` to empty strings |
| `tests/test_turbulence_features.cpp` | New comprehensive test suite for EARSM, GEP, baseline models with analytic flow verification |
| `tests/test_features.cpp` | Replaced generic tests with analytic verification for pure shear, strain, and rotation flows |
| `tests/test_perturbed_channel.cpp` | Updated to use trained tbnn_channel_caseholdout model |
| `tests/test_nn_integration.cpp` | Updated to use trained tbnn_channel_caseholdout model |
| `tests/test_cpu_gpu_consistency.cpp` | Refactored for graceful CPU fallback when GPU unavailable |
| `tests/test_backend_execution.cpp` | Renamed from GPU-specific to backend-agnostic with CPU support |
| `src/turbulence_nn_tbnn.cpp` | Implemented full GPU pipeline with feature computation and postprocessing |
| `src/turbulence_nn_mlp.cpp` | Implemented full GPU pipeline with MLP-specific feature computation |
| `src/gpu_kernels.cpp` | Added GPU kernels for MLP features, TBNN features, and output postprocessing |
| `include/gpu_kernels.hpp` | Added declarations for new GPU kernel functions |
| `data/models/tbnn_channel_caseholdout/*` | CRITICAL ISSUE: Invalid normalization statistics with inf/extreme values |
| `data/models/tbnn_phll_caseholdout/*` | Added trained periodic hills model with valid normalization statistics |
| `scripts/train_realdata_local.sh` | New comprehensive training script for local TBNN/MLP training |
| `scripts/download_mcconkey_data.sh` | Improved Kaggle dataset download robustness |

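The `set_reference()` argument-order bug noted in the summary above is a classic hazard of same-typed positional parameters: swapping two `double`s compiles cleanly. One hedged sketch of a guard (illustrative names only, not the solver's actual API) wraps each physical quantity in a distinct struct so a swapped call becomes a compile error:

```cpp
#include <cassert>

// Distinct wrapper types make argument order a compile-time property:
// set_reference(Nu{...}, Delta{...}, URef{...}) cannot be called with
// the delta and u_ref operands transposed.
struct Nu    { double value; };
struct Delta { double value; };
struct URef  { double value; };

struct ReferenceScales {
    double nu = 0.0, delta = 0.0, u_ref = 0.0;
    void set_reference(Nu n, Delta d, URef u) {
        nu    = n.value;
        delta = d.value;
        u_ref = u.value;
    }
};

// Small helper used to exercise the sketch.
double delta_after_set(double nu, double delta, double u_ref) {
    ReferenceScales r;
    r.set_reference(Nu{nu}, Delta{delta}, URef{u_ref});
    return r.delta;
}
```

The cost is a little ceremony at call sites; the benefit is that the class of bug fixed in `turbulence_gep.cpp` cannot recur silently.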

@sbryngelson
Member Author

/improve

@qodo-code-review

Persistent suggestions updated to latest commit ddc4146

@sbryngelson
Member Author

/improve

@qodo-code-review

Persistent suggestions updated to latest commit 282ad8d

@sbryngelson sbryngelson merged commit 7dc0496 into master Dec 22, 2025
3 checks passed
@sbryngelson sbryngelson deleted the neuralnets branch December 22, 2025 23:01