New ht thcovmat by achiefa · Pull Request #2126 · NNPDF/nnpdf

achiefa · 2024-07-15T10:11:47Z

This PR allows for the inclusion of theory uncertainties due to the effect of power corrections. The theory covariance matrix is constructed by computing the shifts for the theoretical predictions as done for the MHOUs. The shift is computed at the level of the structure functions. Then, the shifts for the structure functions are combined to reconstruct the shift for the xsec. For this reason, the calculation of the shift depends on the dataset. Currently, only 1-JET and DIS (NC and CC) are supported.

At the level of the runcard, power corrections are specified as follows

theorycovmatconfig:
  point_prescriptions: ["power corrections"]
  pc_parameters:
  - {ht: H2p, yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0], nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1]}
  - {ht: H2d, yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0], nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1]}
  - {ht: HLp, yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0], nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1]}
  - {ht: HLd, yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0], nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1]}
  - {ht: H3p, yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0], nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1]}
  - {ht: H3d, yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0], nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1]}
  - {ht: Hj, yshift: [2.0, 2.0, 2.0, 2.0, 2.0, 2.0], nodes: [0.25, 0.75, 1.25, 1.75, 2.25, 2.75]}
  pc_included_procs: ["JETS", "DIS NC", "DIS CC"]
  pc_excluded_exps: [HERA_NC_318GEV_EAVG_CHARM-SIGMARED,
                     HERA_NC_318GEV_EAVG_BOTTOM-SIGMARED]
  pdf: 240701-02-rs-nnpdf40-baseline
  use_thcovmat_in_fitting: true
  use_thcovmat_in_sampling: true

For each process/sf implemented, we need to specify a series of information:

The name of the power correction, under the key ht. This is useful to identify the type of power correction, in particular when computing the posterior.
The values in yshift are the magnitudes of the prior.
nodes contains the points where the prior is shifted

The array pc_included_procs specifies the processes for which the shifts are computed. I've also implemented the possibility to exclude some particular datasets within the selected processes, and this can be done by specifying the names in pc_excluded_exps.

~~The key func_type is temporary and will be deleted once we decide which function to use to construct the prior.~~

All the relevant details for this PR are contained in the module higher_twist_functions.py. For each observable for which the shift has to be computed, I implemented a factory that constructs a function which will then compute the shift. I thought it was kind of necessary to make the shift dependent on the shift parameters (yshift and nodes) and on the prescription according to which we vary the parameters (which for now is fixed), and not on the kinematics (I think this is known as currying in computer science).

TO DO

Add documentation
~~[x] Remove func_type~~
What happens if power_corrections is specified but the parameters are not given?
Change the name ht to name (?)
Something else?

github-actions · 2024-07-15T11:33:42Z

Greetings from your nice fit 🤖 !
I have good news for you, I just finished my tasks:

Fit Name: NNBOT-2ad60a7b9-2024-07-15
Fit Report wrt master: https://vp.nnpdf.science/f0TNPeX3SlO5uwbJkr7I8Q==
Fit Report wrt latest stable reference: https://vp.nnpdf.science/X6jx-wHfT5iJuz14nPatnQ==
Fit Data: https://data.nnpdf.science/fits/NNBOT-2ad60a7b9-2024-07-15.tar.gz

Check the report carefully, and please buy me a ☕ , or better, a GPU 😉!

achiefa · 2025-01-14T14:14:04Z

Hi @roy, I've updated the code so that now the covmat for power corrections can be constructed as in the case of scale variations. Please, look and let me know if something is unclear. I'd like if you could double-check the functions that compute the shifts, in particular when normalisation factors and conversion factors are used.

There are few things that I still don't like here and there, but you're free to propose modifications.

github-actions · 2025-01-14T15:29:47Z

Greetings from your nice fit 🤖 !
I have good news for you, I just finished my tasks:

Fit Name: NNBOT-701fd3dd4-2025-01-14
Fit Report wrt master: https://vp.nnpdf.science/LhL3COHERwuMwHEEeGvymg==
Fit Report wrt latest stable reference: https://vp.nnpdf.science/-y7-UpCPSFGPxzP_41VXLA==
Fit Data: https://data.nnpdf.science/fits/NNBOT-701fd3dd4-2025-01-14.tar.gz

Check the report carefully, and please buy me a ☕ , or better, a GPU 😉!

scarlehoff

Hi @achiefa thanks for this.

I've left a few comments on the code. The 1000line higher_twist_functions I haven't look into in much detail, but one thing that I'd like to mention is please avoid using commondata_table. This is basically an object containing all commondata information like in the old days which was necessary to add for some of validphys functions but the idea is to instead get only the information you want.

For instance, the process type is part of the metadata now (no need to load the data and uncertainties!)
Same for the kinematics (although by the time you need the kinematics you often have already read the whole commondata). You can get the kinematics directly with cd.metadata.load_kinematics() and then you already have x, Q, and y for DIS as god intended!

achiefa · 2026-02-12T10:48:44Z

@scarlehoff @RoyStegeman Given that we agreed in Amsterdam to use power corrections in the studies for 4.1, I believe this PR must be reviewed and possibly merged into master. I've polished the code and added a doc page that explains what power corrections are in the context of NNPDF and how they can be included in the fit. Let me know if that is enough.

scarlehoff

Thanks. We can talk on Tuesday to have someone go through it in more detail.

I've just looked at the changes very quickly and left a few comments, but since you have been rebasing this can probably be merged relatively quickly.

Btw, I think it might be a good idea to add a trimmed down version of the runcard you have added to the regression tests folder (in the root), without the "9 point" only with the power corrections.
That way we'll be testing at least partially the theory covmat part (which we are not doing right now to avoid downloading too many fktables).

scarlehoff · 2026-02-12T11:12:47Z

+    """
+    # Check that `deltas1` and `deltas2` have the same shifts
+    if deltas1.keys() != deltas2.keys():
+        raise RuntimeError('The two dictionaries do not contain the same shifts.')


Since you already added a check, this can be checked beforehand as well? Or is it not possible?

~~Where did I add a check?~~

Do you mean the check check_pc_parameters? I'm not sure that the check you pointed out can be included in check_pc_parameters. Actually, I think the check above is a bit superfluous.

In the checks.py there's a new check_pc_parameters. You can also add a check for the deltas.

I think here there is a more subtle problem. compute_covs_pt_prescrip expects deltas1 and deltas2 to be lists of np.ndarrays, while here I'm using dicts. I think thist must be changed.

RoyStegeman · 2026-02-19T06:18:24Z

In the end, have the theories on the storage server been updated with the k-factors?

achiefa · 2026-02-19T07:35:43Z

Hi @RoyStegeman, what do you mean? Do you mean the non-perturbative c-factors provided by the experimentalists? We agreed in Amsterdam not to consider them, as the information provided by the experimentalists is, most of the time, sparse, lacking, or missing. For this reason, we will use the PC analysis as a diagnostic tool for the data selection criteria. We also agreed to consider dijet data only. If corrections turn out to be compatible with zero, then these datasets pass the selection criteria and possibly we will add the PC uncertainties as an additional covmat. If corrections are compatible with the experimental uncertainties, the dataset will be discarded, and we reassess the single-jet data. Anyway, the experimental NP corrections have now been added to theory_slim. See this PR. If instead you meant the k-factors for the PC the we determined in our analysis, then the answer is no.

RoyStegeman · 2026-02-19T09:07:13Z

instead you meant the k-factors for the PC

Indeed, this is what I meant. But from your message I understand that if they significantly deviate from 1, then the data will not be included so there is no need for the k-factors anyway

scarlehoff

Left a few comments but haven't run the code yet. I've also look at higher_twist_functions very superficially for now, just got curious about the cuts.

I'm a bit worried about all the if conditions that this one prescription adds and whether it breaks some assumptions in the code. In particular the one where theoryid(s) is a one-item list...

scarlehoff

I tried to run the exmaple runcard and got:

~ vp-setupfit Basic_runcard_pc_covmat.yml  
ValueError: Shape of passed values is (3148, 3148), indices imply (3418, 3418)

This does look like a problem with the cuts (with the funny coincidence of 14 -> 41)

Also, a lot of errors like this one before crashing:

[INFO]: Summary: 3238/3850 datapoints passed kinematic cuts.
/Volumes/csDisk/Repositories/NNPDF/nnpdf/validphys2/src/validphys/covmats_utils.py:88: RuntimeWarning: divide by zero encountered in matmul
  return np.diag(diagonal) + corr_sys_mat @ corr_sys_mat.T

For the rest, I've had a look at the higher_twist_functions.py‎ module and I've flagged a few things. It's probably not comprehensive enough, it is very long 😅 but anyway it can probably be merged once the issue above gets solved.

scarlehoff · 2026-05-20T10:28:24Z

+        )
+        cond_high = np.multiply(
+            a >= bin_mid, a < bin_high if shift_pos != len(y_shift) - 1 else a <= bin_high
+        )


I'm trying to wrap my head around these two statements but I'm a bit confused. I think you cannot get a True * True as long as shift_pos != len(y_shift) - 1

But if not shift_pos != len(y_shift) - 1 then you can get True in both if a == bin_mid (because the second one is <= while I think it should be just <?

But maybe I'm understanding this wrong.

(also, why instead of multiplying Trues and Falses don't you use an & here?)

Tbg, I wrote this function almost two years ago and I detested it ever since. I'm pretty sure I can come up with a shorter version that does exactly the same thing.

Something like this looks much better and cleaner.

def linear_bin_function_proposed( a: npt.ArrayLike, y_shift: npt.ArrayLike, nodes: npt.ArrayLike ) -> np.ndarray: a = np.asarray(a, dtype=float) res = np.interp(a, nodes, y_shift) res[(a < nodes[0]) | (a > nodes[-1])] = 0.0 return res

I'm not that worried about the complexity of the function or the length, I'm just not convinced it is correct ^^U

achiefa · 2026-05-21T14:19:16Z

Hi @scarlehoff, thanks for your comments and suggestions. I went through them all and I'll push the new fixes. I also fixed the basic runcard: now it works without crashing. I just don't understand how to reproduce this warning

[INFO]: Summary: 3238/3850 datapoints passed kinematic cuts.
/Volumes/csDisk/Repositories/NNPDF/nnpdf/validphys2/src/validphys/covmats_utils.py:88: RuntimeWarning: divide by zero encountered in matmul
  return np.diag(diagonal) + corr_sys_mat @ corr_sys_mat.T

which btw is the same as #2455.

scarlehoff · 2026-05-21T14:32:10Z

If you are not getting the warning perhaps it is an issue with my version of numpy...

achiefa · 2026-05-21T15:38:11Z

If you are not getting the warning perhaps it is an issue with my version of numpy...

I'm not able to reproduce this warning.

scarlehoff · 2026-05-25T10:19:12Z

Indeed, it seems it was my version of numpy.

scarlehoff

Thanks, I'm not going to go through the whole of higher_twist_functions.py again (also because I think you played god with git's history so it is even more difficult to know what is the different with respect to the previous version) but I'll trust the changes are basically those suggested 😅

vp-setupfit at least now works

scarlehoff · 2026-05-25T10:24:04Z

-      log.info(f'{luxset} Lux pdf checked.')
+        luxset = fiatlux['luxset']
+        luxset.load()
+        log.info(f'{luxset} Lux pdf checked.')


I'm a bit worried about these files having been touched. They were not in the previous version I reviewed were there?

achiefa added the run-fit-bot Starts fit bot from a PR. label Jul 15, 2024

achiefa requested a review from RoyStegeman July 15, 2024 10:11

achiefa self-assigned this Jul 15, 2024

scarlehoff marked this pull request as draft July 17, 2024 14:26

achiefa force-pushed the new_ht_thcovmat branch from 58ffdd3 to 89aa862 Compare July 29, 2024 10:09

RoyStegeman reviewed Oct 4, 2024

View reviewed changes

achiefa force-pushed the new_ht_thcovmat branch 2 times, most recently from 28662e8 to 6b71fae Compare October 11, 2024 22:16

RoyStegeman force-pushed the new_ht_thcovmat branch from d75f1b9 to 8e1dc74 Compare November 14, 2024 16:20

achiefa force-pushed the new_ht_thcovmat branch 2 times, most recently from 348be41 to 9ac473c Compare January 14, 2025 14:06

achiefa added run-fit-bot Starts fit bot from a PR. and removed run-fit-bot Starts fit bot from a PR. labels Jan 14, 2025

achiefa commented Jan 14, 2025

View reviewed changes

Comment thread validphys2/src/validphys/theorycovariance/construction.py Outdated

achiefa commented Jan 14, 2025

View reviewed changes

Comment thread validphys2/src/validphys/theorycovariance/construction.py Outdated

achiefa removed the run-fit-bot Starts fit bot from a PR. label Jan 14, 2025

achiefa force-pushed the new_ht_thcovmat branch from 9ac473c to 84642cb Compare January 14, 2025 14:18

achiefa commented Jan 14, 2025

View reviewed changes

Comment thread validphys2/src/validphys/theorycovariance/construction.py Outdated

achiefa force-pushed the new_ht_thcovmat branch from 39f5640 to ef59af1 Compare January 30, 2025 13:14

achiefa marked this pull request as ready for review January 30, 2025 14:13

RoyStegeman force-pushed the new_ht_thcovmat branch from fdab4c9 to 9f286c6 Compare February 14, 2025 14:35

achiefa force-pushed the new_ht_thcovmat branch from 3c8b923 to de6497e Compare February 21, 2025 14:47

achiefa requested a review from scarlehoff February 21, 2025 15:37

achiefa force-pushed the new_ht_thcovmat branch from a7e265a to 49bca4f Compare February 28, 2025 12:12

scarlehoff reviewed Mar 2, 2025

View reviewed changes

achiefa force-pushed the new_ht_thcovmat branch from 32314ea to d47da56 Compare April 14, 2025 09:16

achiefa force-pushed the new_ht_thcovmat branch from d47da56 to f709249 Compare April 23, 2025 14:12

achiefa added the regenerate-data label Jan 12, 2026

achiefa force-pushed the new_ht_thcovmat branch from 0dd7e0a to aa7c97b Compare January 12, 2026 11:31

achiefa force-pushed the new_ht_thcovmat branch from ea9411e to f033542 Compare February 12, 2026 10:45

scarlehoff reviewed Feb 12, 2026

View reviewed changes

scarlehoff requested a review from andrpie February 17, 2026 14:06

achiefa force-pushed the new_ht_thcovmat branch from 7178843 to 50c2fba Compare February 18, 2026 12:47

achiefa force-pushed the new_ht_thcovmat branch from 50c2fba to 9616c40 Compare February 19, 2026 08:33

achiefa removed the regenerate-data label Mar 2, 2026

achiefa force-pushed the new_ht_thcovmat branch 2 times, most recently from 964badf to fa0ba36 Compare March 2, 2026 15:56

achiefa force-pushed the new_ht_thcovmat branch from 931ea73 to bbf9a4a Compare March 9, 2026 14:09

scarlehoff reviewed May 12, 2026

View reviewed changes

scarlehoff reviewed May 20, 2026

View reviewed changes

achiefa added regenerate-data and removed regenerate-data labels May 20, 2026

achiefa added 4 commits May 21, 2026 16:28

Implementation of HT and power corrections

63d39f1

Update docs

1cfe40f

Update kinematics labels

7253dde

Add kinematic plot with color-scheme for the expected HT correction

c20c0ff

achiefa force-pushed the new_ht_thcovmat branch from 801b8b1 to c20c0ff Compare May 21, 2026 15:32

Apply suggestion

ce5494b

scarlehoff approved these changes May 25, 2026

View reviewed changes

Conversation

achiefa commented Jul 15, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

TO DO

Uh oh!

github-actions Bot commented Jul 15, 2024

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

achiefa commented Jan 14, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

github-actions Bot commented Jan 14, 2025

Uh oh!

scarlehoff left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

achiefa commented Feb 12, 2026

Uh oh!

scarlehoff left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

achiefa Feb 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

RoyStegeman commented Feb 19, 2026

Uh oh!

achiefa commented Feb 19, 2026

Uh oh!

RoyStegeman commented Feb 19, 2026

Uh oh!

scarlehoff left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

scarlehoff left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

achiefa commented Jul 15, 2024 •

edited

Loading

achiefa Feb 12, 2026 •

edited

Loading