Allow AP_SAT, AP_RND for 'maximum' precision in HLS Config #1422
morunner wants to merge 2 commits into fastmachinelearning:main
Conversation
Generally, setting the maximum precision is not something we recommend using much, I don't think. It's better to either quantize the values in the training, or, if doing PTQ, explicitly set certain widths to more reasonable values in the configuration. The maximum width is not granular enough for that. One can see what width one gets without the maximum setting and modify the configuration until it is satisfactory. Also, rounding and saturation for the accumulator often make the accumulation much slower. It is better to keep it wider and, if needed, use saturation and rounding in the activation step right after it, where its cost is insignificant. This more fine-grained way of doing things is recommended instead of using the maximum precision.
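A minimal sketch of the fine-grained approach described above, assuming hls4ml's name-granularity config layout; the model and layer names here are placeholders, not taken from this PR:

```python
import hls4ml
from tensorflow.keras import layers, models

# Tiny stand-in model; the layer names below are referenced in the config.
model = models.Sequential([
    layers.Dense(16, name='dense1', input_shape=(8,)),
    layers.Activation('relu', name='relu1'),
])

# Name-granularity config exposes per-layer precision fields.
config = hls4ml.utils.config_from_keras_model(model, granularity='name')

# Keep the accumulator wide so the running sum cannot overflow ...
config['LayerName']['dense1']['Precision']['accum'] = 'ap_fixed<24,12>'

# ... and round/saturate on the activation output instead, where it is cheap.
config['LayerName']['relu1']['Precision']['result'] = 'ap_fixed<16,8,AP_RND,AP_SAT>'
```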
@morunner do you have some results (in terms of resource usage) before and after this change? @jmitrevs and @calad0i mentioned in our last dev meeting that AP_SAT may not be the most resource-friendly option for accumulators (due to the underlying implementation of the saturation operation). Most times, the recommended way is to simply increase the bit width of the variable. If that's the case from your results, we should keep this PR open as a reference (in case someone is interested in using similar functions), but not merge it.
Sorry for the late reply. I wanted to finalize the model architecture first before optimizing this. I have re-run synthesis with the Vitis backend for two copies of the same model: one with the AP_RND and AP_SAT modes set for the default, maximum, and dense-layer weight and bias precisions, and one with the default modes (AP_TRN, AP_WRAP) for those parameters. The model with rounding and saturation indeed consumes more LUTs (25% instead of 20%, on the Alveo U55C) and has slightly higher latency. But using truncation and wraparound of course comes with a decrease in accuracy. Hence, I agree with @bo3z not to merge this PR. For the time being, using rounding and saturation is still a viable approach for us, since we achieve satisfactory resource utilization and latency and meet timing closure with the current setup. But it is good to know that there is still some performance to be gained from the model by investing more time in tuning precisions.
You can always increase the bitwidth to match or get better accuracy, since the accumulator is very cheap in comparison to the other operations happening there. In most cases, one should be able to see no real performance difference between the Python model and the HLS model by allocating sufficient bitwidths in the accumulator to avoid overflows/underflows. Unless there is a very specific hardware restriction (usually not), I would suggest against the use of maximum precision.
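To make "sufficient bitwidth" concrete, a small sketch of the standard worst-case bound for a dot product of `n_in` terms; this is the textbook conservative bound, not code from this PR, and it glosses over sign-bit bookkeeping:

```python
import math

def safe_accum_bits(in_int: int, in_frac: int, w_int: int, w_frac: int, n_in: int):
    """Overflow-free accumulator size for summing n_in input*weight products.

    Each product needs the combined integer and fractional bits of its
    operands; summing n_in of them can grow the integer part by another
    ceil(log2(n_in)) bits.
    """
    int_bits = in_int + w_int + math.ceil(math.log2(n_in))
    frac_bits = in_frac + w_frac
    return int_bits + frac_bits, int_bits  # (total width, integer width)

# e.g. ap_fixed<16,8> inputs times ap_fixed<16,8> weights over 64 inputs:
print(safe_accum_bits(8, 8, 8, 8, 64))  # (38, 22), i.e. ap_fixed<38,22>
```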
Description
During the development of a GravNet model for hls4ml, I found that setting rounding and saturation for the maximum allowed precision can be beneficial for increasing accuracy. Below you can see histograms and the mean difference with one standard deviation when the maximum precision is `ap_fixed<16,8,AP_RND,AP_SAT,0>` and when rounding and saturation are not enabled (`ap_fixed<16,8>`).

In the current upstream main branch of hls4ml, rounding and saturation modes set through the 'maximum' field in the HLS config are ignored during precision inference (see e.g. here). I thus propose a single function `_apply_max_precision_constraints`, to be applied where necessary in the `infer_precision.py` module, which adheres to a small set of rules, e.g. leaving rounding and saturation modes untouched where they already differ from the defaults (meaning the user likely set them explicitly).
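For context, this is the kind of configuration the change targets; a sketch assuming a model-level Precision dict with the 'maximum' key described above (treat the exact layout as an assumption):

```python
config = {
    'Model': {
        'Precision': {
            'default': 'ap_fixed<16,6>',
            # The cap that precision inference respects; with this PR the
            # AP_RND/AP_SAT modes given here would no longer be ignored.
            'maximum': 'ap_fixed<16,8,AP_RND,AP_SAT>',
        },
        'ReuseFactor': 1,
    },
}
```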
We can of course discuss what the preferred ruleset should be here.
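A minimal sketch of what such a helper might look like, assuming hls4ml's `FixedPrecisionType` with `width`/`integer`/`rounding_mode`/`saturation_mode` attributes; this is illustrative only, not the PR's actual implementation:

```python
from hls4ml.model.types import FixedPrecisionType, RoundingMode, SaturationMode

def _apply_max_precision_constraints(inferred, maximum):
    """Clip an inferred precision to the configured maximum (hypothetical ruleset).

    Widths are capped at the maximum, and the maximum's rounding/saturation
    modes are adopted only where the inferred type still carries the defaults
    (AP_TRN/AP_WRAP), since non-default modes were likely set by the user.
    """
    width = min(inferred.width, maximum.width)
    integer = min(inferred.integer, maximum.integer)

    rounding = inferred.rounding_mode
    if rounding == RoundingMode.TRN:
        rounding = maximum.rounding_mode

    saturation = inferred.saturation_mode
    if saturation == SaturationMode.WRAP:
        saturation = maximum.saturation_mode

    return FixedPrecisionType(width=width, integer=integer, signed=inferred.signed,
                              rounding_mode=rounding, saturation_mode=saturation)
```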
No additional dependencies are required for this change.
Type of change
Tests
Pytest
Added a new pytest module, `test_max_precision.py`, which tests the newly added `_apply_max_precision_constraints` function both in isolation and within the `_infer_precision` function, using mocks.
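A hedged sketch of what such an isolated test could look like, written against the hypothetical helper sketched above rather than the PR's actual test code:

```python
from hls4ml.model.types import FixedPrecisionType, RoundingMode, SaturationMode
# from hls4ml.model.optimizer.passes.infer_precision import \
#     _apply_max_precision_constraints  # hypothetical location

def test_max_precision_caps_width_and_adopts_modes():
    # The inferred type still carries the default modes, so the maximum's
    # AP_RND/AP_SAT should be adopted and the widths capped.
    inferred = FixedPrecisionType(width=24, integer=12)
    maximum = FixedPrecisionType(width=16, integer=8,
                                 rounding_mode=RoundingMode.RND,
                                 saturation_mode=SaturationMode.SAT)

    clipped = _apply_max_precision_constraints(inferred, maximum)

    assert (clipped.width, clipped.integer) == (16, 8)
    assert clipped.rounding_mode == RoundingMode.RND
    assert clipped.saturation_mode == SaturationMode.SAT
```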
Conversion to HLS
Ran the full Jupyter notebook for GravNet Keras conversion to HLS at hls4ml-gravnet (Link) to generate the plots listed below, with the proposed change enabled and disabled. The profiling section was run with this fix applied. We currently do not provide the fully trained model open-source, since it is not finalized. @bo3z please contact me directly, also regarding the dataset, if needed.
Checklist
Ran `pre-commit` on the files I edited or added.

GravNet plots showing accuracies and bias across layers
With rounding and saturation enabled for the maximum precision.