Skip to content

New ht thcovmat#2126

Open
achiefa wants to merge 5 commits into
masterfrom
new_ht_thcovmat
Open

New ht thcovmat#2126
achiefa wants to merge 5 commits into
masterfrom
new_ht_thcovmat

Conversation

@achiefa
Copy link
Copy Markdown
Contributor

@achiefa achiefa commented Jul 15, 2024

This PR allows for the inclusion of theory uncertainties due to the effect of power corrections. The theory covariance matrix is constructed by computing the shifts for the theoretical predictions as done for the MHOUs. The shift is computed at the level of the structure functions. Then, the shifts for the structure functions are combined to reconstruct the shift for the xsec. For this reason, the calculation of the shift depends on the dataset. Currently, only 1-JET and DIS (NC and CC) are supported.

At the level of the runcard, power corrections are specified as follows

theorycovmatconfig:
  point_prescriptions: ["power corrections"]
  pc_parameters:
  - {ht: H2p, yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0], nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1]}
  - {ht: H2d, yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0], nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1]}
  - {ht: HLp, yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0], nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1]}
  - {ht: HLd, yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0], nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1]}
  - {ht: H3p, yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0], nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1]}
  - {ht: H3d, yshift: [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.0], nodes: [0.0, 0.001, 0.01, 0.1, 0.3, 0.5, 0.7, 0.9, 1]}
  - {ht: Hj, yshift: [2.0, 2.0, 2.0, 2.0, 2.0, 2.0], nodes: [0.25, 0.75, 1.25, 1.75, 2.25, 2.75]}
  pc_included_procs: ["JETS", "DIS NC", "DIS CC"]
  pc_excluded_exps: [HERA_NC_318GEV_EAVG_CHARM-SIGMARED,
                     HERA_NC_318GEV_EAVG_BOTTOM-SIGMARED]
  pdf: 240701-02-rs-nnpdf40-baseline
  use_thcovmat_in_fitting: true
  use_thcovmat_in_sampling: true

For each process/sf implemented, we need to specify a series of information:

  1. The name of the power correction, under the key ht. This is useful to identify the type of power correction, in particular when computing the posterior.
  2. The values in yshift are the magnitudes of the prior.
  3. nodes contains the points where the prior is shifted

The array pc_included_procs specifies the processes for which the shifts are computed. I've also implemented the possibility to exclude some particular datasets within the selected processes, and this can be done by specifying the names in pc_excluded_exps.

The key func_type is temporary and will be deleted once we decide which function to use to construct the prior.

All the relevant details for this PR are contained in the module higher_twist_functions.py. For each observable for which the shift has to be computed, I implemented a factory that constructs a function which will then compute the shift. I thought it was kind of necessary to make the shift dependent on the shift parameters (yshift and nodes) and on the prescription according to which we vary the parameters (which for now is fixed), and not on the kinematics (I think this is known as currying in computer science).

TO DO

  • Add documentation
  • [x] Remove func_type
  • What happens if power_corrections is specified but the parameters are not given?
  • Change the name ht to name (?)
  • Something else?

@achiefa achiefa added the run-fit-bot Starts fit bot from a PR. label Jul 15, 2024
@achiefa achiefa requested a review from RoyStegeman July 15, 2024 10:11
@achiefa achiefa self-assigned this Jul 15, 2024
@github-actions
Copy link
Copy Markdown

Greetings from your nice fit 🤖 !
I have good news for you, I just finished my tasks:

Check the report carefully, and please buy me a ☕ , or better, a GPU 😉!

@scarlehoff scarlehoff marked this pull request as draft July 17, 2024 14:26
Comment thread n3fit/src/n3fit/layers/DIS.py Outdated
Comment thread n3fit/src/n3fit/layers/DIS.py Outdated
Comment thread n3fit/src/n3fit/layers/DIS.py Outdated
Comment thread n3fit/src/n3fit/scripts/vp_setupfit.py Outdated
Comment thread n3fit/src/n3fit/scripts/n3fit_exec.py Outdated
Comment thread validphys2/src/validphys/results.py Outdated
Comment thread validphys2/src/validphys/theorycovariance/construction.py Outdated
@achiefa achiefa force-pushed the new_ht_thcovmat branch 2 times, most recently from 28662e8 to 6b71fae Compare October 11, 2024 22:16
@achiefa achiefa force-pushed the new_ht_thcovmat branch 2 times, most recently from 348be41 to 9ac473c Compare January 14, 2025 14:06
@achiefa
Copy link
Copy Markdown
Contributor Author

achiefa commented Jan 14, 2025

Hi @roy, I've updated the code so that now the covmat for power corrections can be constructed as in the case of scale variations. Please, look and let me know if something is unclear. I'd like if you could double-check the functions that compute the shifts, in particular when normalisation factors and conversion factors are used.

There are few things that I still don't like here and there, but you're free to propose modifications.

@achiefa achiefa added run-fit-bot Starts fit bot from a PR. and removed run-fit-bot Starts fit bot from a PR. labels Jan 14, 2025
Comment thread validphys2/src/validphys/theorycovariance/construction.py Outdated
Comment thread validphys2/src/validphys/theorycovariance/construction.py Outdated
@achiefa achiefa removed the run-fit-bot Starts fit bot from a PR. label Jan 14, 2025
Comment thread validphys2/src/validphys/theorycovariance/construction.py Outdated
@github-actions
Copy link
Copy Markdown

Greetings from your nice fit 🤖 !
I have good news for you, I just finished my tasks:

Check the report carefully, and please buy me a ☕ , or better, a GPU 😉!

@achiefa achiefa marked this pull request as ready for review January 30, 2025 14:13
@achiefa achiefa requested a review from scarlehoff February 21, 2025 15:37
Copy link
Copy Markdown
Member

@scarlehoff scarlehoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @achiefa thanks for this.

I've left a few comments on the code. The 1000line higher_twist_functions I haven't look into in much detail, but one thing that I'd like to mention is please avoid using commondata_table. This is basically an object containing all commondata information like in the old days which was necessary to add for some of validphys functions but the idea is to instead get only the information you want.

For instance, the process type is part of the metadata now (no need to load the data and uncertainties!)
Same for the kinematics (although by the time you need the kinematics you often have already read the whole commondata). You can get the kinematics directly with cd.metadata.load_kinematics() and then you already have x, Q, and y for DIS as god intended!

Comment thread n3fit/runcards/examples/Basic_runcard_pc_covmat.yml
Comment thread validphys2/examples/theory_covariance/chi2table_ht.yaml Outdated
Comment thread validphys2/src/validphys/commondata.py Outdated
Comment thread validphys2/src/validphys/config.py Outdated
Comment thread validphys2/src/validphys/config.py Outdated
Comment thread validphys2/src/validphys/theorycovariance/construction.py Outdated
Comment thread validphys2/src/validphys/theorycovariance/construction.py Outdated
Comment thread validphys2/src/validphys/theorycovariance/higher_twist_functions.py Outdated
Comment thread validphys2/src/validphys/theorycovariance/higher_twist_functions.py Outdated
Comment thread validphys2/src/validphys/theorycovariance/higher_twist_functions.py Outdated
@achiefa
Copy link
Copy Markdown
Contributor Author

achiefa commented Feb 12, 2026

@scarlehoff @RoyStegeman Given that we agreed in Amsterdam to use power corrections in the studies for 4.1, I believe this PR must be reviewed and possibly merged into master. I've polished the code and added a doc page that explains what power corrections are in the context of NNPDF and how they can be included in the fit. Let me know if that is enough.

Copy link
Copy Markdown
Member

@scarlehoff scarlehoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. We can talk on Tuesday to have someone go through it in more detail.

I've just looked at the changes very quickly and left a few comments, but since you have been rebasing this can probably be merged relatively quickly.

Btw, I think it might be a good idea to add a trimmed down version of the runcard you have added to the regression tests folder (in the root), without the "9 point" only with the power corrections.
That way we'll be testing at least partially the theory covmat part (which we are not doing right now to avoid downloading too many fktables).

Comment thread n3fit/src/n3fit/performfit.py Outdated
Comment thread validphys2/src/validphys/theorycovariance/construction.py Outdated
Comment thread validphys2/src/validphys/theorycovariance/construction.py Outdated
"""
# Check that `deltas1` and `deltas2` have the same shifts
if deltas1.keys() != deltas2.keys():
raise RuntimeError('The two dictionaries do not contain the same shifts.')
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you already added a check, this can be checked beforehand as well? Or is it not possible?

Copy link
Copy Markdown
Contributor Author

@achiefa achiefa Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where did I add a check?

Do you mean the check check_pc_parameters? I'm not sure that the check you pointed out can be included in check_pc_parameters. Actually, I think the check above is a bit superfluous.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the checks.py there's a new check_pc_parameters. You can also add a check for the deltas.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here there is a more subtle problem. compute_covs_pt_prescrip expects deltas1 and deltas2 to be lists of np.ndarrays, while here I'm using dicts. I think thist must be changed.

Comment thread validphys2/src/validphys/theorycovariance/construction.py Outdated
Comment thread validphys2/src/validphys/config.py Outdated
Comment thread validphys2/src/validphys/config.py Outdated
Comment thread validphys2/src/validphys/config.py Outdated
Comment thread validphys2/src/validphys/config.py Outdated
Comment thread n3fit/runcards/examples/Basic_runcard_pc_covmat.yml Outdated
@RoyStegeman
Copy link
Copy Markdown
Member

In the end, have the theories on the storage server been updated with the k-factors?

@achiefa
Copy link
Copy Markdown
Contributor Author

achiefa commented Feb 19, 2026

Hi @RoyStegeman, what do you mean? Do you mean the non-perturbative c-factors provided by the experimentalists? We agreed in Amsterdam not to consider them, as the information provided by the experimentalists is, most of the time, sparse, lacking, or missing. For this reason, we will use the PC analysis as a diagnostic tool for the data selection criteria. We also agreed to consider dijet data only. If corrections turn out to be compatible with zero, then these datasets pass the selection criteria and possibly we will add the PC uncertainties as an additional covmat. If corrections are compatible with the experimental uncertainties, the dataset will be discarded, and we reassess the single-jet data. Anyway, the experimental NP corrections have now been added to theory_slim. See this PR. If instead you meant the k-factors for the PC the we determined in our analysis, then the answer is no.

@RoyStegeman
Copy link
Copy Markdown
Member

instead you meant the k-factors for the PC

Indeed, this is what I meant. But from your message I understand that if they significantly deviate from 1, then the data will not be included so there is no need for the k-factors anyway

@achiefa achiefa force-pushed the new_ht_thcovmat branch 2 times, most recently from 964badf to fa0ba36 Compare March 2, 2026 15:56
Copy link
Copy Markdown
Member

@scarlehoff scarlehoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments but haven't run the code yet. I've also look at higher_twist_functions very superficially for now, just got curious about the cuts.

I'm a bit worried about all the if conditions that this one prescription adds and whether it breaks some assumptions in the code. In particular the one where theoryid(s) is a one-item list...

Comment thread nnpdf_data/nnpdf_data/commondata/CMS_2JET_7TEV/metadata.yaml
Comment thread validphys2/src/validphys/theorycovariance/construction.py Outdated
Comment thread validphys2/src/validphys/theorycovariance/construction.py
Comment thread validphys2/src/validphys/theorycovariance/construction.py
Comment thread validphys2/src/validphys/theorycovariance/construction.py Outdated
Comment thread validphys2/src/validphys/theorycovariance/higher_twist_functions.py
Comment thread validphys2/src/validphys/config.py
Copy link
Copy Markdown
Member

@scarlehoff scarlehoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to run the exmaple runcard and got:

~ vp-setupfit Basic_runcard_pc_covmat.yml  
ValueError: Shape of passed values is (3148, 3148), indices imply (3418, 3418)

This does look like a problem with the cuts (with the funny coincidence of 14 -> 41)

Also, a lot of errors like this one before crashing:

[INFO]: Summary: 3238/3850 datapoints passed kinematic cuts.
/Volumes/csDisk/Repositories/NNPDF/nnpdf/validphys2/src/validphys/covmats_utils.py:88: RuntimeWarning: divide by zero encountered in matmul
  return np.diag(diagonal) + corr_sys_mat @ corr_sys_mat.T

For the rest, I've had a look at the higher_twist_functions.py‎ module and I've flagged a few things. It's probably not comprehensive enough, it is very long 😅 but anyway it can probably be merged once the issue above gets solved.

Comment thread doc/sphinx/source/vp/theorycov/power_corrections.rst Outdated
Comment thread doc/sphinx/source/vp/theorycov/power_corrections.rst
Comment thread nnpdf_data/nnpdf_data/filter_utils/legacy_jets_utils.py
Comment thread validphys2/src/validphys/theorycovariance/construction.py
Comment thread validphys2/src/validphys/theorycovariance/construction.py Outdated
Comment thread validphys2/src/validphys/theorycovariance/construction.py Outdated
Comment thread validphys2/src/validphys/theorycovariance/higher_twist_functions.py Outdated
Comment thread validphys2/src/validphys/theorycovariance/higher_twist_functions.py
Comment thread validphys2/src/validphys/theorycovariance/higher_twist_functions.py Outdated
)
cond_high = np.multiply(
a >= bin_mid, a < bin_high if shift_pos != len(y_shift) - 1 else a <= bin_high
)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to wrap my head around these two statements but I'm a bit confused. I think you cannot get a True * True as long as shift_pos != len(y_shift) - 1

But if not shift_pos != len(y_shift) - 1 then you can get True in both if a == bin_mid (because the second one is <= while I think it should be just <?

But maybe I'm understanding this wrong.

(also, why instead of multiplying Trues and Falses don't you use an & here?)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tbg, I wrote this function almost two years ago and I detested it ever since. I'm pretty sure I can come up with a shorter version that does exactly the same thing.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like this looks much better and cleaner.

def linear_bin_function_proposed(
     a: npt.ArrayLike, y_shift: npt.ArrayLike, nodes: npt.ArrayLike
) -> np.ndarray:
    a = np.asarray(a, dtype=float)
    res = np.interp(a, nodes, y_shift)
    res[(a < nodes[0]) | (a > nodes[-1])] = 0.0
    return res

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not that worried about the complexity of the function or the length, I'm just not convinced it is correct ^^U

@achiefa
Copy link
Copy Markdown
Contributor Author

achiefa commented May 21, 2026

Hi @scarlehoff, thanks for your comments and suggestions. I went through them all and I'll push the new fixes. I also fixed the basic runcard: now it works without crashing. I just don't understand how to reproduce this warning

[INFO]: Summary: 3238/3850 datapoints passed kinematic cuts.
/Volumes/csDisk/Repositories/NNPDF/nnpdf/validphys2/src/validphys/covmats_utils.py:88: RuntimeWarning: divide by zero encountered in matmul
  return np.diag(diagonal) + corr_sys_mat @ corr_sys_mat.T

which btw is the same as #2455.

@scarlehoff
Copy link
Copy Markdown
Member

If you are not getting the warning perhaps it is an issue with my version of numpy...

@achiefa achiefa force-pushed the new_ht_thcovmat branch from 801b8b1 to c20c0ff Compare May 21, 2026 15:32
@achiefa
Copy link
Copy Markdown
Contributor Author

achiefa commented May 21, 2026

If you are not getting the warning perhaps it is an issue with my version of numpy...

I'm not able to reproduce this warning.

@scarlehoff
Copy link
Copy Markdown
Member

Indeed, it seems it was my version of numpy.

Copy link
Copy Markdown
Member

@scarlehoff scarlehoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I'm not going to go through the whole of higher_twist_functions.py again (also because I think you played god with git's history so it is even more difficult to know what is the different with respect to the previous version) but I'll trust the changes are basically those suggested 😅

vp-setupfit at least now works

log.info(f'{luxset} Lux pdf checked.')
luxset = fiatlux['luxset']
luxset.load()
log.info(f'{luxset} Lux pdf checked.')
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit worried about these files having been touched. They were not in the previous version I reviewed were there?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants