Improve diagnostics test speed #1526

imreddyTeja · 2025-11-03T19:40:40Z

The diagnostics tests can take a very long
time (8 min on gha). This commit makes two
changes to speed this test up.

Override some constructors inside the @testset

The diagnostics tests do not need a real jacobian, and they do not need to perform a real step. This commit adds constructors for the implicit and explicit tendency updates, cache update, boundary fluxes update, and jacobian compuation which do nothing. It also uses a fake
field matrix and solver.

Change define_diagnsotics! and default_diagnostics to only add valid diagnostics. Constructing a DiagnosticVariable is expensive, and this reduces the number of useless ones created.

The speedup I get from this PR is much larger when I run locally. It makes gha ci a few minutes faster, and the buildkite gpu tests ~8 min faster.

I have read and checked the items on the review checklist.

@testset

The diagnostics tests can take a very long time (8 min on gha). This commit makes two changes to speed this test up. 1. Override some constructors inside the @testset The diagnostics tests do not need a real jacobian, and they do not need to perform a real step. This commit adds constructors for the implicit and explicit tendency updates, cache update, boundary fluxes update, and jacobian compuation which do nothing. It also uses a fake field matrix and solver. 2. Change define_diagnsotics! and default_diagnostics to only add valid diagnostics. Constructing a DiagnosticVariable is expensive, and this reduces the number of useless ones created. use dummy integrator for diag test extend overriden constructors in test conditionally add diagnostics

kmdeck

The speed up is great!

Conceptually I'm a little confused about how defining diagnostics should depend on requested diagnostics. Requested diagnostics seems like more of what you want to get out of a simulation, while the definitions should be independent of that?

E.g. here we request only two diagnostics as output:

ClimaLand.jl/docs/src/tutorials/standalone/Canopy/default_canopy.jl

Lines 93 to 99 in 8fa1333

    
           diagnostics = ClimaLand.Diagnostics.default_diagnostics( 
        
               model, 
        
               start_date; 
        
               output_vars = ["ct", "trans"], 
        
               output_writer = diag_writer, 
        
               reduction_period = :hourly, 
        
           );

Is this doing something different?

imreddyTeja · 2025-11-05T21:07:08Z

The speed up is great!

Conceptually I'm a little confused about how defining diagnostics should depend on requested diagnostics. Requested diagnostics seems like more of what you want to get out of a simulation, while the definitions should be independent of that?

E.g. here we request only two diagnostics as output:

ClimaLand.jl/docs/src/tutorials/standalone/Canopy/default_canopy.jl

Lines 93 to 99 in 8fa1333

diagnostics = ClimaLand.Diagnostics.default_diagnostics(

model,

start_date;

output_vars = ["ct", "trans"],

output_writer = diag_writer,

reduction_period = :hourly,

);

Is this doing something different?

Requested diagnostics is not a great name for it. When default_diagnostics is used, requested_diagnostics would always be equal to output_vars. I'll rename it to output_vars if that makes more sense.

I think the name of define_diagnostics is a bit misleading. The diagnostic compute methods are always defined. define_diagnostics creates diagnostic variables and puts them in the ALL_DIAGNOSTICS dict. I'm not sure why, but creating diagnostic variables seems to take a very long time (maybe because the type is parameterized by the compute function, which is a closure). The current setup creates diagnostic variables that are not relevant for the model (for example the sif diagnostic variable is created for the bucket). If a user doesn't want to use every possible diagnostic, the diagnostic variable for it does not need to be created. Maybe a better solution would be to create diagnostic variables for all possible diagnostics, rather than for every single diagnostic (done currently), or only for the output vars (what this pr does)

juliasloan25 · 2025-11-05T21:32:06Z

The speed up is great!
Conceptually I'm a little confused about how defining diagnostics should depend on requested diagnostics. Requested diagnostics seems like more of what you want to get out of a simulation, while the definitions should be independent of that?
E.g. here we request only two diagnostics as output:

ClimaLand.jl/docs/src/tutorials/standalone/Canopy/default_canopy.jl

Lines 93 to 99 in 8fa1333

diagnostics = ClimaLand.Diagnostics.default_diagnostics(

model,

start_date;

output_vars = ["ct", "trans"],

output_writer = diag_writer,

reduction_period = :hourly,

);

Is this doing something different?

Requested diagnostics is not a great name for it. When default_diagnostics is used, requested_diagnostics would always be equal to output_vars. I'll rename it to output_vars if that makes more sense.

I think the name of define_diagnostics is a bit misleading. The diagnostic compute methods are always defined. define_diagnostics creates diagnostic variables and puts them in the ALL_DIAGNOSTICS dict. I'm not sure why, but creating diagnostic variables seems to take a very long time (maybe because the type is parameterized by the compute function, which is a closure). The current setup creates diagnostic variables that are not relevant for the model (for example the sif diagnostic variable is created for the bucket). If a user doesn't want to use every possible diagnostic, the diagnostic variable for it does not need to be created. Maybe a better solution would be to create diagnostic variables for all possible diagnostics, rather than for every single diagnostic (done currently), or only for the output vars (what this pr does)

Thanks Teja, this is helpful! It definitely makes sense to avoid constructing DiagnosticVariables we aren't using, especially if the construction is time consuming. Can we rename define_diagnostics to construct_diagnostics? I think that more explicitly shows that the DiagnosticVariables are being constructed via that method.

To me the name requested_diags does make sense. It's equivalent to diagnostics rather than output_vars in default_diagnostics (though if the user passes in a vector of output_vars we also have diagnostics == output_vars so all 3 are the same).

juliasloan25 · 2025-11-05T21:28:30Z

src/diagnostics/diagnostic.jl

-        comments,
-        compute!,
-    )
+    if isnothing(requested_diags) || short_name in requested_diags


Instead of using requested_diags == nothing, we can use requested_diags = get_possible_diagnostics(model) to get the entire set of diagnostics available for a model. Then we can just have:

Suggested change

if isnothing(requested_diags) || short_name in requested_diags

if short_name in requested_diags

and this will have the same effect of not restricting the DiagnosticVariables that are constructed

That is what I had originally, but docs/src/diagnostics/make_diagnostic_table.jl relies on adding all possible diagnostic variables while passing in a nothing model. I can't think of a nice way to deal with this. Maybe adding
get_possible_diagnostics(::Nothing) = :all_diagnostics or =[] and then checking for that.

juliasloan25 · 2025-11-05T21:29:13Z

src/diagnostics/construct_diagnostics.jl

+compute function for `land_model`. If `requested_diags` is not `nothing`, diagnostics are
+only added if their `short_name` is in `requested_diags`.


Suggested change

compute function for `land_model`. If `requested_diags` is not `nothing`, diagnostics are

only added if their `short_name` is in `requested_diags`.

compute function for `land_model`. Diagnostics are

only constructed if their `short_name` is in `requested_diags`.

kmdeck

Thanks for explaining more.

Renaming define_diagnostics! to construct_diagnostics! is fine - or even construct_requested_diagnostics!? and we can keep the name requested_diags, and I like Julia's idea of the default being get_possible_diagnostics(land_model).

I stumbled a bit seeing that define_diagnostics! takes the model as input, but then does things internally that apply to all models - like, as you wrote, define_diagnostics!(bucket_model) adds canopy diagnostics, etc. so i almost wonder if a better solution might be to have different methods for different models, but this is more work for the same result, so I think what you have here is good!

Rename to `construct_diagnostics`. This name is more reflective of its functionality and purpose. The ! is dropped because none of the args are modified.

imreddyTeja force-pushed the tr/test-integrator branch 2 times, most recently from ae018aa to a36b12b Compare November 4, 2025 18:52

imreddyTeja requested review from juliasloan25 and kmdeck November 4, 2025 18:53

kmdeck reviewed Nov 4, 2025

View reviewed changes

juliasloan25 reviewed Nov 5, 2025

View reviewed changes

kmdeck approved these changes Nov 6, 2025

View reviewed changes

Rename define_diagnostics!

6d4426e

Rename to `construct_diagnostics`. This name is more reflective of its functionality and purpose. The ! is dropped because none of the args are modified.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Improve diagnostics test speed #1526

Improve diagnostics test speed #1526

Uh oh!

imreddyTeja commented Nov 3, 2025 •

edited

Loading

Uh oh!

kmdeck left a comment

Uh oh!

imreddyTeja commented Nov 5, 2025

Uh oh!

juliasloan25 commented Nov 5, 2025

Uh oh!

juliasloan25 Nov 5, 2025

Uh oh!

imreddyTeja Nov 6, 2025

Uh oh!

juliasloan25 Nov 5, 2025

Uh oh!

kmdeck left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

	diagnostics = ClimaLand.Diagnostics.default_diagnostics(
	model,
	start_date;
	output_vars = ["ct", "trans"],
	output_writer = diag_writer,
	reduction_period = :hourly,
	);

	if isnothing(requested_diags) \|\| short_name in requested_diags
	if short_name in requested_diags

		compute function for `land_model`. If `requested_diags` is not `nothing`, diagnostics are
		only added if their `short_name` is in `requested_diags`.

Improve diagnostics test speed #1526

Are you sure you want to change the base?

Improve diagnostics test speed #1526

Uh oh!

Conversation

imreddyTeja commented Nov 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kmdeck left a comment

Choose a reason for hiding this comment

Uh oh!

imreddyTeja commented Nov 5, 2025

Uh oh!

juliasloan25 commented Nov 5, 2025

Uh oh!

juliasloan25 Nov 5, 2025

Choose a reason for hiding this comment

Uh oh!

imreddyTeja Nov 6, 2025

Choose a reason for hiding this comment

Uh oh!

juliasloan25 Nov 5, 2025

Choose a reason for hiding this comment

Uh oh!

kmdeck left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

imreddyTeja commented Nov 3, 2025 •

edited

Loading