Skip to content

Conversation

@Aurashk
Copy link
Collaborator

@Aurashk Aurashk commented Jan 28, 2026

Description

Adds a units column to commodities.csv which allows us to validate that all associated output flows of a commodity for each process have the same units.
For the muse1_default and two_regions models I have added the units specified from here #1035 (comment)
For the other models I've put in placeholders which will pass the validation since all units are the same. We can either get all the right ones in this PR or add them in future.

Fixes #1035

Type of change

  • Bug fix (non-breaking change to fix an issue)
  • New feature (non-breaking change to add functionality)
  • Refactoring (non-breaking, non-functional change to improve maintainability)
  • Optimization (non-breaking change to speed up the code)
  • Breaking change (whatever its nature)
  • Documentation (improve or add documentation)

Key checklist

  • All tests pass: $ cargo test
  • The documentation builds and looks OK: $ cargo doc
  • Update release notes for the latest release if this PR adds a new feature or fixes a bug
    present in the previous release

Further checks

  • Code is commented, particularly in hard-to-understand areas
  • Tests added that prove fix is effective or that feature works

Copilot AI review requested due to automatic review settings January 28, 2026 12:28
@codecov
Copy link

codecov bot commented Jan 28, 2026

Codecov Report

❌ Patch coverage is 96.00000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 87.54%. Comparing base (ff15327) to head (7c7e19c).

Files with missing lines Patch % Lines
src/input/process/flow.rs 94.44% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1110      +/-   ##
==========================================
+ Coverage   87.52%   87.54%   +0.02%     
==========================================
  Files          55       55              
  Lines        7429     7452      +23     
  Branches     7429     7452      +23     
==========================================
+ Hits         6502     6524      +22     
  Misses        630      630              
- Partials      297      298       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request adds a units column to the commodities input file (commodities.csv) to enable validation that all SED/SVD output flows from the same process have consistent units. This is important because annual fixed costs are distributed proportionally between outputs based on flow coefficients, which only makes sense if all outputs share the same units.

Changes:

  • Added units: String field to the Commodity struct for storing unit information
  • Implemented validate_output_flows_units() function to check that all SED/SVD outputs from each process have matching units
  • Updated all example commodities.csv files with a units column (using "PJ" for energy commodities, "kt" for CO2, and "NOUNIT" as a placeholder)
  • Added new test fixtures and test cases to verify the validation works correctly
  • Created a utility shell script to help add columns to CSV files across all examples

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/commodity.rs Added units field to the Commodity struct with documentation
src/input/commodity.rs Updated CSV reading logic to parse the units field and updated test helper functions
src/input/process/flow.rs Implemented validation function to check SED/SVD output units consistency and added comprehensive test cases
src/fixture.rs Added new test fixtures for commodities with different units and updated existing fixtures
src/process.rs Updated test commodity constructors to include units field
src/input/agent/commodity_portion.rs Updated test commodity constructors to include units field
schemas/input/commodities.yaml Added schema definition for the units field with clear documentation
examples/*/commodities.csv Added units column to all example commodity files with appropriate values
tests/add_column.sh Added utility script to automatically add columns to CSV files across examples

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +504 to +515
#[rstest]
fn single_sed_svd_output(svd_commodity: Commodity, process: Process) {
let commodity = Rc::new(svd_commodity);
let (_, flows_map) = build_maps(
process,
std::iter::once((commodity.id.clone(), flow(commodity.clone(), 1.0))),
None,
);

// Single output should always pass validation
validate_output_flows_units(&flows_map).unwrap();
}
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a test case for processes that have no SED/SVD outputs (e.g., a process that only produces OTH commodities). This would help catch the validation bug identified on lines 107-111 where the validation would fail for processes with zero SED/SVD outputs.

Copilot uses AI. Check for mistakes.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree

@Aurashk Aurashk requested a review from alexdewar January 28, 2026 13:33
Copilot AI review requested due to automatic review settings January 28, 2026 13:35
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +419 to +515
#[rstest]
fn output_flows_matching_units(
svd_commodity: Commodity,
sed_commodity: Commodity,
process: Process,
) {
// Both commodities have the same units
assert_eq!(svd_commodity.units, sed_commodity.units);

let commodity1 = Rc::new(svd_commodity);
let commodity2 = Rc::new(sed_commodity);
let (_, flows_map) = build_maps(
process,
[
(commodity1.id.clone(), flow(commodity1.clone(), 1.0)),
(commodity2.id.clone(), flow(commodity2.clone(), 2.0)),
]
.into_iter(),
None,
);

// Validation should pass since the units are the same
validate_output_flows_units(&flows_map).unwrap();
}

#[rstest]
fn output_flows_mismatched_units(
sed_commodity_pj: Commodity,
sed_commodity_tonnes: Commodity,
process: Process,
) {
// Ensure the two commodities have different units
assert_ne!(sed_commodity_pj.units, sed_commodity_tonnes.units);

let commodity1 = Rc::new(sed_commodity_pj);
let commodity2 = Rc::new(sed_commodity_tonnes);
let (_, flows_map) = build_maps(
process,
[
(commodity1.id.clone(), flow(commodity1.clone(), 1.0)),
(commodity2.id.clone(), flow(commodity2.clone(), 2.0)),
]
.into_iter(),
None,
);

// Different units should cause validation to fail
let result = validate_output_flows_units(&flows_map);
assert!(result.is_err());
assert!(
result
.unwrap_err()
.to_string()
.contains("has SED/SVD outputs with different units")
);
}

#[rstest]
fn output_flows_other_commodity_ignored(
sed_commodity_pj: Commodity,
other_commodity: Commodity,
process: Process,
) {
// Modify OTH commodity to have different units
let mut other_commodity = other_commodity;
other_commodity.units = "tonnes".into();
assert_ne!(sed_commodity_pj.units, other_commodity.units);

let sed_commodity = Rc::new(sed_commodity_pj);
let oth_commodity = Rc::new(other_commodity);

let (_, flows_map) = build_maps(
process,
[
(sed_commodity.id.clone(), flow(sed_commodity.clone(), 1.0)),
(oth_commodity.id.clone(), flow(oth_commodity.clone(), 2.0)),
]
.into_iter(),
None,
);

// OTH commodity should be ignored, validation should pass
validate_output_flows_units(&flows_map).unwrap();
}

#[rstest]
fn single_sed_svd_output(svd_commodity: Commodity, process: Process) {
let commodity = Rc::new(svd_commodity);
let (_, flows_map) = build_maps(
process,
std::iter::once((commodity.id.clone(), flow(commodity.clone(), 1.0))),
None,
);

// Single output should always pass validation
validate_output_flows_units(&flows_map).unwrap();
}
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a test case that verifies inputs (negative coefficients) are correctly ignored during unit validation. For example, a process with SED/SVD inputs having different units but SED/SVD outputs with the same units should pass validation. This would explicitly demonstrate that the validation logic correctly filters flows by checking flow.coeff.value() > 0.0.

Copilot uses AI. Check for mistakes.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree.

You may be able to make some of these tests parametrised to make life easier. For example, you could take coeffs for process flows and expected error message as arguments, then you could check for a few different kinds of errors with the same test. Just a thought!

Copy link
Collaborator

@alexdewar alexdewar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nearly there! There is one bug which needs fixing: validation will currently fail for processes without SED/SVD outputs.

Also, I don't think we want to allow users to supply an empty string for the unit name. Let's make them put something for every commodity.

I think we should add units to all the examples in this PR. @ahawkes can you advise what the units should be for the commodities in the simple, two_outputs and circularity examples? There's a lot of overlap between them -- presumably the units are the same where the commodity ID is the same?

This field is used for validation only and is not used for internal unit conversions. MUSE
will validate that all SED/SVD output flows from the same process have the same units,
as the annual fixed costs are distributed proportionally between outputs based on flow coefficients.
E.g. `Petajoules`, `Tonnes`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd give the abbreviations because that's what users will probably do:

Suggested change
E.g. `Petajoules`, `Tonnes`.
E.g. "PJ" (petajoules) or "t" (tonnes)

.unwrap_err()
.to_string()
.contains("has SED/SVD outputs with different units")
);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd just use the assert_error! macro in fixture.rs for this and match the whole message.

Comment on lines +94 to +108
let sed_svd_output_units: HashSet<&str> = flows
.values()
.filter_map(|flow| {
let commodity = &flow.commodity;
(flow.coeff.value() > 0.0
&& matches!(
commodity.kind,
CommodityType::ServiceDemand | CommodityType::SupplyEqualsDemand
))
.then_some(commodity.units.as_str())
})
.collect();

ensure!(
sed_svd_output_units.len() <= 1,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Validation will fail if there are any processes without SED/SVD outputs. How about:

Suggested change
let sed_svd_output_units: HashSet<&str> = flows
.values()
.filter_map(|flow| {
let commodity = &flow.commodity;
(flow.coeff.value() > 0.0
&& matches!(
commodity.kind,
CommodityType::ServiceDemand | CommodityType::SupplyEqualsDemand
))
.then_some(commodity.units.as_str())
})
.collect();
ensure!(
sed_svd_output_units.len() <= 1,
let mut sed_svd_output_units = flows
.values()
.filter_map(|flow| {
let commodity = &flow.commodity;
(flow.coeff.value() > 0.0
&& matches!(
commodity.kind,
CommodityType::ServiceDemand | CommodityType::SupplyEqualsDemand
))
.then_some(commodity.units.as_str())
});
let Some(first_unit) = sed_svd_output_units.next() else {
// No SED/SVD output flows
continue;
};
ensure!(
sed_svd_output_units.all(|unit|unit == first_unit),

It would be good to have a test for this case too.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I belatedly noticed that your way also works 😛

Comment on lines +504 to +515
#[rstest]
fn single_sed_svd_output(svd_commodity: Commodity, process: Process) {
let commodity = Rc::new(svd_commodity);
let (_, flows_map) = build_maps(
process,
std::iter::once((commodity.id.clone(), flow(commodity.clone(), 1.0))),
None,
);

// Single output should always pass validation
validate_output_flows_units(&flows_map).unwrap();
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree

Comment on lines +419 to +515
#[rstest]
fn output_flows_matching_units(
svd_commodity: Commodity,
sed_commodity: Commodity,
process: Process,
) {
// Both commodities have the same units
assert_eq!(svd_commodity.units, sed_commodity.units);

let commodity1 = Rc::new(svd_commodity);
let commodity2 = Rc::new(sed_commodity);
let (_, flows_map) = build_maps(
process,
[
(commodity1.id.clone(), flow(commodity1.clone(), 1.0)),
(commodity2.id.clone(), flow(commodity2.clone(), 2.0)),
]
.into_iter(),
None,
);

// Validation should pass since the units are the same
validate_output_flows_units(&flows_map).unwrap();
}

#[rstest]
fn output_flows_mismatched_units(
sed_commodity_pj: Commodity,
sed_commodity_tonnes: Commodity,
process: Process,
) {
// Ensure the two commodities have different units
assert_ne!(sed_commodity_pj.units, sed_commodity_tonnes.units);

let commodity1 = Rc::new(sed_commodity_pj);
let commodity2 = Rc::new(sed_commodity_tonnes);
let (_, flows_map) = build_maps(
process,
[
(commodity1.id.clone(), flow(commodity1.clone(), 1.0)),
(commodity2.id.clone(), flow(commodity2.clone(), 2.0)),
]
.into_iter(),
None,
);

// Different units should cause validation to fail
let result = validate_output_flows_units(&flows_map);
assert!(result.is_err());
assert!(
result
.unwrap_err()
.to_string()
.contains("has SED/SVD outputs with different units")
);
}

#[rstest]
fn output_flows_other_commodity_ignored(
sed_commodity_pj: Commodity,
other_commodity: Commodity,
process: Process,
) {
// Modify OTH commodity to have different units
let mut other_commodity = other_commodity;
other_commodity.units = "tonnes".into();
assert_ne!(sed_commodity_pj.units, other_commodity.units);

let sed_commodity = Rc::new(sed_commodity_pj);
let oth_commodity = Rc::new(other_commodity);

let (_, flows_map) = build_maps(
process,
[
(sed_commodity.id.clone(), flow(sed_commodity.clone(), 1.0)),
(oth_commodity.id.clone(), flow(oth_commodity.clone(), 2.0)),
]
.into_iter(),
None,
);

// OTH commodity should be ignored, validation should pass
validate_output_flows_units(&flows_map).unwrap();
}

#[rstest]
fn single_sed_svd_output(svd_commodity: Commodity, process: Process) {
let commodity = Rc::new(svd_commodity);
let (_, flows_map) = build_maps(
process,
std::iter::once((commodity.id.clone(), flow(commodity.clone(), 1.0))),
None,
);

// Single output should always pass validation
validate_output_flows_units(&flows_map).unwrap();
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree.

You may be able to make some of these tests parametrised to make life easier. For example, you could take coeffs for process flows and expected error message as arguments, then you could check for a few different kinds of errors with the same test. Just a thought!

demand: DemandMap::new(),
units: "tonnes".into(),
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we probs won't want these fixtures anywhere outside of input/commodity.rs? In which case that's probably where they should live

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this was committed by accident? Not sure we need it in any case.

You can also add columns with spreadsheet software etc.

GASNAT,Natural gas,sed,season,NOUNIT
ELCTRI,Electricity,sed,daynight,NOUNIT
H2YPRD,Hydrogen,sed,season,NOUNIT
TPASKM,Passenger kms,svd,daynight,NOUNIT
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably the unit of this one is km 😄

@alexdewar
Copy link
Collaborator

Oh, I'm being stupid. You already fixed that bug. Nvm.

But let's still add units to the examples.

@ahawkes
Copy link
Contributor

ahawkes commented Jan 29, 2026

Simple:
Commodities
GASPRD - PJ (petajoules)
GASNAT - PJ
ELCTRI - PJ
RSHEAT - PJ
CO2EMT - ktCO2 (kilo tonnes CO2)

Processes
GASDRV PJ/year (enough capacity to produce 1 PJ per year if operating at 100% the whole time)
GASPRC PJ/year
WNDFRM GW (gigawatts - note cap2act converts output to PJ)
GASCGT - GW
RGASBR - PJ/year
RELCHP - PJ/year

@ahawkes
Copy link
Contributor

ahawkes commented Jan 29, 2026

Two_outputs:
Commodities
OILCRD - PJ
GASOLI - PJ
DIESEL - PJ
TPASKM - bpkm (billions of passenger kilometers)
BIOPRD - PJ
BIOPEL - PJ

Processes
OAGRSV - PJ/year
OILREF - PJ/year
OILRF2 - PJ/year
TPETCR - bpkm/year
TDIECR - bpkm/year
TELCCR - bpkm/year
THYBCR - bpkm/year
BIOPRO - PJ/year
BIOPLL - PJ/year
RBIOBL - PJ/year

@ahawkes
Copy link
Contributor

ahawkes commented Jan 29, 2026

Actually I guess you don't need process capacity units as it comes out in the wash.
Final commodity for circularity model - H2YPRD - PJ

@alexdewar
Copy link
Collaborator

alexdewar commented Jan 29, 2026 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add a units column to commodities.csv

4 participants