-
Notifications
You must be signed in to change notification settings - Fork 2
Add units column to commodities input file #1110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1110 +/- ##
==========================================
+ Coverage 87.52% 87.54% +0.02%
==========================================
Files 55 55
Lines 7429 7452 +23
Branches 7429 7452 +23
==========================================
+ Hits 6502 6524 +22
Misses 630 630
- Partials 297 298 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request adds a units column to the commodities input file (commodities.csv) to enable validation that all SED/SVD output flows from the same process have consistent units. This is important because annual fixed costs are distributed proportionally between outputs based on flow coefficients, which only makes sense if all outputs share the same units.
Changes:
- Added
units: Stringfield to theCommoditystruct for storing unit information - Implemented
validate_output_flows_units()function to check that all SED/SVD outputs from each process have matching units - Updated all example commodities.csv files with a units column (using "PJ" for energy commodities, "kt" for CO2, and "NOUNIT" as a placeholder)
- Added new test fixtures and test cases to verify the validation works correctly
- Created a utility shell script to help add columns to CSV files across all examples
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/commodity.rs | Added units field to the Commodity struct with documentation |
| src/input/commodity.rs | Updated CSV reading logic to parse the units field and updated test helper functions |
| src/input/process/flow.rs | Implemented validation function to check SED/SVD output units consistency and added comprehensive test cases |
| src/fixture.rs | Added new test fixtures for commodities with different units and updated existing fixtures |
| src/process.rs | Updated test commodity constructors to include units field |
| src/input/agent/commodity_portion.rs | Updated test commodity constructors to include units field |
| schemas/input/commodities.yaml | Added schema definition for the units field with clear documentation |
| examples/*/commodities.csv | Added units column to all example commodity files with appropriate values |
| tests/add_column.sh | Added utility script to automatically add columns to CSV files across examples |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| #[rstest] | ||
| fn single_sed_svd_output(svd_commodity: Commodity, process: Process) { | ||
| let commodity = Rc::new(svd_commodity); | ||
| let (_, flows_map) = build_maps( | ||
| process, | ||
| std::iter::once((commodity.id.clone(), flow(commodity.clone(), 1.0))), | ||
| None, | ||
| ); | ||
|
|
||
| // Single output should always pass validation | ||
| validate_output_flows_units(&flows_map).unwrap(); | ||
| } |
Copilot
AI
Jan 28, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a test case for processes that have no SED/SVD outputs (e.g., a process that only produces OTH commodities). This would help catch the validation bug identified on lines 107-111 where the validation would fail for processes with zero SED/SVD outputs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree
Co-authored-by: Copilot <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| #[rstest] | ||
| fn output_flows_matching_units( | ||
| svd_commodity: Commodity, | ||
| sed_commodity: Commodity, | ||
| process: Process, | ||
| ) { | ||
| // Both commodities have the same units | ||
| assert_eq!(svd_commodity.units, sed_commodity.units); | ||
|
|
||
| let commodity1 = Rc::new(svd_commodity); | ||
| let commodity2 = Rc::new(sed_commodity); | ||
| let (_, flows_map) = build_maps( | ||
| process, | ||
| [ | ||
| (commodity1.id.clone(), flow(commodity1.clone(), 1.0)), | ||
| (commodity2.id.clone(), flow(commodity2.clone(), 2.0)), | ||
| ] | ||
| .into_iter(), | ||
| None, | ||
| ); | ||
|
|
||
| // Validation should pass since the units are the same | ||
| validate_output_flows_units(&flows_map).unwrap(); | ||
| } | ||
|
|
||
| #[rstest] | ||
| fn output_flows_mismatched_units( | ||
| sed_commodity_pj: Commodity, | ||
| sed_commodity_tonnes: Commodity, | ||
| process: Process, | ||
| ) { | ||
| // Ensure the two commodities have different units | ||
| assert_ne!(sed_commodity_pj.units, sed_commodity_tonnes.units); | ||
|
|
||
| let commodity1 = Rc::new(sed_commodity_pj); | ||
| let commodity2 = Rc::new(sed_commodity_tonnes); | ||
| let (_, flows_map) = build_maps( | ||
| process, | ||
| [ | ||
| (commodity1.id.clone(), flow(commodity1.clone(), 1.0)), | ||
| (commodity2.id.clone(), flow(commodity2.clone(), 2.0)), | ||
| ] | ||
| .into_iter(), | ||
| None, | ||
| ); | ||
|
|
||
| // Different units should cause validation to fail | ||
| let result = validate_output_flows_units(&flows_map); | ||
| assert!(result.is_err()); | ||
| assert!( | ||
| result | ||
| .unwrap_err() | ||
| .to_string() | ||
| .contains("has SED/SVD outputs with different units") | ||
| ); | ||
| } | ||
|
|
||
| #[rstest] | ||
| fn output_flows_other_commodity_ignored( | ||
| sed_commodity_pj: Commodity, | ||
| other_commodity: Commodity, | ||
| process: Process, | ||
| ) { | ||
| // Modify OTH commodity to have different units | ||
| let mut other_commodity = other_commodity; | ||
| other_commodity.units = "tonnes".into(); | ||
| assert_ne!(sed_commodity_pj.units, other_commodity.units); | ||
|
|
||
| let sed_commodity = Rc::new(sed_commodity_pj); | ||
| let oth_commodity = Rc::new(other_commodity); | ||
|
|
||
| let (_, flows_map) = build_maps( | ||
| process, | ||
| [ | ||
| (sed_commodity.id.clone(), flow(sed_commodity.clone(), 1.0)), | ||
| (oth_commodity.id.clone(), flow(oth_commodity.clone(), 2.0)), | ||
| ] | ||
| .into_iter(), | ||
| None, | ||
| ); | ||
|
|
||
| // OTH commodity should be ignored, validation should pass | ||
| validate_output_flows_units(&flows_map).unwrap(); | ||
| } | ||
|
|
||
| #[rstest] | ||
| fn single_sed_svd_output(svd_commodity: Commodity, process: Process) { | ||
| let commodity = Rc::new(svd_commodity); | ||
| let (_, flows_map) = build_maps( | ||
| process, | ||
| std::iter::once((commodity.id.clone(), flow(commodity.clone(), 1.0))), | ||
| None, | ||
| ); | ||
|
|
||
| // Single output should always pass validation | ||
| validate_output_flows_units(&flows_map).unwrap(); | ||
| } |
Copilot
AI
Jan 28, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding a test case that verifies inputs (negative coefficients) are correctly ignored during unit validation. For example, a process with SED/SVD inputs having different units but SED/SVD outputs with the same units should pass validation. This would explicitly demonstrate that the validation logic correctly filters flows by checking flow.coeff.value() > 0.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree.
You may be able to make some of these tests parametrised to make life easier. For example, you could take coeffs for process flows and expected error message as arguments, then you could check for a few different kinds of errors with the same test. Just a thought!
alexdewar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nearly there! There is one bug which needs fixing: validation will currently fail for processes without SED/SVD outputs.
Also, I don't think we want to allow users to supply an empty string for the unit name. Let's make them put something for every commodity.
I think we should add units to all the examples in this PR. @ahawkes can you advise what the units should be for the commodities in the simple, two_outputs and circularity examples? There's a lot of overlap between them -- presumably the units are the same where the commodity ID is the same?
| This field is used for validation only and is not used for internal unit conversions. MUSE | ||
| will validate that all SED/SVD output flows from the same process have the same units, | ||
| as the annual fixed costs are distributed proportionally between outputs based on flow coefficients. | ||
| E.g. `Petajoules`, `Tonnes`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd give the abbreviations because that's what users will probably do:
| E.g. `Petajoules`, `Tonnes`. | |
| E.g. "PJ" (petajoules) or "t" (tonnes) |
| .unwrap_err() | ||
| .to_string() | ||
| .contains("has SED/SVD outputs with different units") | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd just use the assert_error! macro in fixture.rs for this and match the whole message.
| let sed_svd_output_units: HashSet<&str> = flows | ||
| .values() | ||
| .filter_map(|flow| { | ||
| let commodity = &flow.commodity; | ||
| (flow.coeff.value() > 0.0 | ||
| && matches!( | ||
| commodity.kind, | ||
| CommodityType::ServiceDemand | CommodityType::SupplyEqualsDemand | ||
| )) | ||
| .then_some(commodity.units.as_str()) | ||
| }) | ||
| .collect(); | ||
|
|
||
| ensure!( | ||
| sed_svd_output_units.len() <= 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Validation will fail if there are any processes without SED/SVD outputs. How about:
| let sed_svd_output_units: HashSet<&str> = flows | |
| .values() | |
| .filter_map(|flow| { | |
| let commodity = &flow.commodity; | |
| (flow.coeff.value() > 0.0 | |
| && matches!( | |
| commodity.kind, | |
| CommodityType::ServiceDemand | CommodityType::SupplyEqualsDemand | |
| )) | |
| .then_some(commodity.units.as_str()) | |
| }) | |
| .collect(); | |
| ensure!( | |
| sed_svd_output_units.len() <= 1, | |
| let mut sed_svd_output_units = flows | |
| .values() | |
| .filter_map(|flow| { | |
| let commodity = &flow.commodity; | |
| (flow.coeff.value() > 0.0 | |
| && matches!( | |
| commodity.kind, | |
| CommodityType::ServiceDemand | CommodityType::SupplyEqualsDemand | |
| )) | |
| .then_some(commodity.units.as_str()) | |
| }); | |
| let Some(first_unit) = sed_svd_output_units.next() else { | |
| // No SED/SVD output flows | |
| continue; | |
| }; | |
| ensure!( | |
| sed_svd_output_units.all(|unit|unit == first_unit), |
It would be good to have a test for this case too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I belatedly noticed that your way also works 😛
| #[rstest] | ||
| fn single_sed_svd_output(svd_commodity: Commodity, process: Process) { | ||
| let commodity = Rc::new(svd_commodity); | ||
| let (_, flows_map) = build_maps( | ||
| process, | ||
| std::iter::once((commodity.id.clone(), flow(commodity.clone(), 1.0))), | ||
| None, | ||
| ); | ||
|
|
||
| // Single output should always pass validation | ||
| validate_output_flows_units(&flows_map).unwrap(); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree
| #[rstest] | ||
| fn output_flows_matching_units( | ||
| svd_commodity: Commodity, | ||
| sed_commodity: Commodity, | ||
| process: Process, | ||
| ) { | ||
| // Both commodities have the same units | ||
| assert_eq!(svd_commodity.units, sed_commodity.units); | ||
|
|
||
| let commodity1 = Rc::new(svd_commodity); | ||
| let commodity2 = Rc::new(sed_commodity); | ||
| let (_, flows_map) = build_maps( | ||
| process, | ||
| [ | ||
| (commodity1.id.clone(), flow(commodity1.clone(), 1.0)), | ||
| (commodity2.id.clone(), flow(commodity2.clone(), 2.0)), | ||
| ] | ||
| .into_iter(), | ||
| None, | ||
| ); | ||
|
|
||
| // Validation should pass since the units are the same | ||
| validate_output_flows_units(&flows_map).unwrap(); | ||
| } | ||
|
|
||
| #[rstest] | ||
| fn output_flows_mismatched_units( | ||
| sed_commodity_pj: Commodity, | ||
| sed_commodity_tonnes: Commodity, | ||
| process: Process, | ||
| ) { | ||
| // Ensure the two commodities have different units | ||
| assert_ne!(sed_commodity_pj.units, sed_commodity_tonnes.units); | ||
|
|
||
| let commodity1 = Rc::new(sed_commodity_pj); | ||
| let commodity2 = Rc::new(sed_commodity_tonnes); | ||
| let (_, flows_map) = build_maps( | ||
| process, | ||
| [ | ||
| (commodity1.id.clone(), flow(commodity1.clone(), 1.0)), | ||
| (commodity2.id.clone(), flow(commodity2.clone(), 2.0)), | ||
| ] | ||
| .into_iter(), | ||
| None, | ||
| ); | ||
|
|
||
| // Different units should cause validation to fail | ||
| let result = validate_output_flows_units(&flows_map); | ||
| assert!(result.is_err()); | ||
| assert!( | ||
| result | ||
| .unwrap_err() | ||
| .to_string() | ||
| .contains("has SED/SVD outputs with different units") | ||
| ); | ||
| } | ||
|
|
||
| #[rstest] | ||
| fn output_flows_other_commodity_ignored( | ||
| sed_commodity_pj: Commodity, | ||
| other_commodity: Commodity, | ||
| process: Process, | ||
| ) { | ||
| // Modify OTH commodity to have different units | ||
| let mut other_commodity = other_commodity; | ||
| other_commodity.units = "tonnes".into(); | ||
| assert_ne!(sed_commodity_pj.units, other_commodity.units); | ||
|
|
||
| let sed_commodity = Rc::new(sed_commodity_pj); | ||
| let oth_commodity = Rc::new(other_commodity); | ||
|
|
||
| let (_, flows_map) = build_maps( | ||
| process, | ||
| [ | ||
| (sed_commodity.id.clone(), flow(sed_commodity.clone(), 1.0)), | ||
| (oth_commodity.id.clone(), flow(oth_commodity.clone(), 2.0)), | ||
| ] | ||
| .into_iter(), | ||
| None, | ||
| ); | ||
|
|
||
| // OTH commodity should be ignored, validation should pass | ||
| validate_output_flows_units(&flows_map).unwrap(); | ||
| } | ||
|
|
||
| #[rstest] | ||
| fn single_sed_svd_output(svd_commodity: Commodity, process: Process) { | ||
| let commodity = Rc::new(svd_commodity); | ||
| let (_, flows_map) = build_maps( | ||
| process, | ||
| std::iter::once((commodity.id.clone(), flow(commodity.clone(), 1.0))), | ||
| None, | ||
| ); | ||
|
|
||
| // Single output should always pass validation | ||
| validate_output_flows_units(&flows_map).unwrap(); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree.
You may be able to make some of these tests parametrised to make life easier. For example, you could take coeffs for process flows and expected error message as arguments, then you could check for a few different kinds of errors with the same test. Just a thought!
| demand: DemandMap::new(), | ||
| units: "tonnes".into(), | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we probs won't want these fixtures anywhere outside of input/commodity.rs? In which case that's probably where they should live
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this was committed by accident? Not sure we need it in any case.
You can also add columns with spreadsheet software etc.
| GASNAT,Natural gas,sed,season,NOUNIT | ||
| ELCTRI,Electricity,sed,daynight,NOUNIT | ||
| H2YPRD,Hydrogen,sed,season,NOUNIT | ||
| TPASKM,Passenger kms,svd,daynight,NOUNIT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Presumably the unit of this one is km 😄
|
Oh, I'm being stupid. You already fixed that bug. Nvm. But let's still add units to the examples. |
|
Simple: Processes |
|
Two_outputs: Processes |
|
Actually I guess you don't need process capacity units as it comes out in the wash. |
|
Thanks @ahawkes!
|
Description
Adds a units column to commodities.csv which allows us to validate that all associated output flows of a commodity for each process have the same units.
For the
muse1_defaultandtwo_regionsmodels I have added the units specified from here #1035 (comment)For the other models I've put in placeholders which will pass the validation since all units are the same. We can either get all the right ones in this PR or add them in future.
Fixes #1035
Type of change
Key checklist
$ cargo test$ cargo docpresent in the previous release
Further checks