oneAPI backend update: kernel and layer optimizations #1246
Conversation
// and send to the sink. Adaptive to SYCL HLS and hardware acceleration flow.
template <class src_T, class dest_pipe> struct DMA_convert_data {
#if !defined(IS_BSP)
// When targeting a device family, we instantiate an Avalon Memory Mapped Host for
I wonder if all the DMA_convert_data things should be moved to a different file. In the SYCL HLS style they are effectively part of the testbench, so I think they should be in a different file. In the accelerator flow they are still different kernels, utility kernels in a way, so I think they should be separate.
@@ -13,22 +13,28 @@
namespace nnet {

template <class srcType, class dest_pipe, size_t SIZE> void convert_data(sycl::queue &q, srcType *src) {
We should discuss what happens with this function vs the new DMA versions of these.
I noticed, by the way, that ReLU uses blocking reads, and all the components use blocking writes. Is there a requirement to use nonblocking reads and writes? Note, we do need to handle back-pressure, which is much more natural to do with blocking I/O.
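To make the back-pressure point concrete, here is a minimal host-side sketch in plain C++. It is not the SYCL pipe API and not code from this PR: `BoundedFifo` and `ReluStageNonBlocking` are hypothetical names standing in for a fixed-depth pipe and a ReLU stage written with non-blocking I/O. It shows the extra state a stage has to carry so that a computed value is not lost when the downstream FIFO is full.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical software stand-in for a fixed-capacity hardware pipe.
// Illustration only; this is not the SYCL pipe interface.
template <class T> class BoundedFifo {
  public:
    explicit BoundedFifo(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0 && "a pipe needs at least one slot");
    }

    // Non-blocking write: returns false when the FIFO is full (back-pressure).
    bool try_write(const T &value) {
        if (buf_.size() >= capacity_) return false;
        buf_.push_back(value);
        return true;
    }

    // Non-blocking read: returns false when the FIFO is empty.
    bool try_read(T &out) {
        if (buf_.empty()) return false;
        out = buf_.front();
        buf_.pop_front();
        return true;
    }

  private:
    std::size_t capacity_;
    std::deque<T> buf_;
};

// ReLU stage using only non-blocking I/O. Because a write can fail, the stage
// must remember the pending result and must not read new input until that
// result has been accepted downstream.
template <class T> struct ReluStageNonBlocking {
    bool holding = false; // true while a result is waiting for the output FIFO
    T held{};

    // One "cycle" of the stage; returns true if a value was forwarded.
    bool step(BoundedFifo<T> &in, BoundedFifo<T> &out) {
        if (!holding) {
            T x{};
            if (in.try_read(x)) {
                held = x > T(0) ? x : T(0);
                holding = true;
            }
        }
        if (holding && out.try_write(held)) {
            holding = false;
            return true;
        }
        return false;
    }
};
```

With blocking reads and writes the `holding`/`held` bookkeeping disappears, since the stage simply stalls inside the write until space is available; that is the sense in which back-pressure is more natural with blocking I/O.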
Closing this, with work going into #1370.
Description
This is a replacement of #1218, moving the branch to the main repository for easier contribution by others.
Type of change
This PR introduces improvements to the oneAPI inference backend, focusing on:
Sideband Signal Support
Updated Dense and ReLU Layer for Always-Running Execution
sop/eop sideband signals for synchronization.
while loop for always-on kernel execution.
Added DMA Kernels for Hardware Execution
DMA_convert_data and DMA_convert_data_back move data between host and FPGA efficiently.
Utility Functions for Compile-Time Type Extraction
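As a rough illustration of how the items above fit together, the following plain C++ sketch emulates a streaming ReLU on the host. It is not the PR's implementation: `StreamBeat`, `ExtractDataType`, `relu_stream`, and `make_packet` are hypothetical names, and a `std::vector` stands in for the pipes. The payload type is recovered at compile time from the beat type, and the sop/eop flags are forwarded unchanged alongside the data.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical streaming beat: payload plus start/end-of-packet sideband flags.
template <class T> struct StreamBeat {
    T data;
    bool sop; // start of packet
    bool eop; // end of packet
};

// Compile-time extraction of the payload type from a beat type.
template <class Beat> struct ExtractDataType;
template <class T> struct ExtractDataType<StreamBeat<T>> { using type = T; };

// ReLU over a stream of beats. A hardware kernel would spin in `while (true)`
// on a pipe read and never return; this host-side sketch stops once the input
// vector is exhausted.
template <class Beat> std::vector<Beat> relu_stream(const std::vector<Beat> &in) {
    using data_T = typename ExtractDataType<Beat>::type;
    std::vector<Beat> out;
    out.reserve(in.size());
    std::size_t next = 0;
    bool in_packet = false;
    while (next < in.size()) {
        Beat beat = in[next++];
        if (beat.sop) in_packet = true;
        assert(in_packet && "data beat received outside a packet");
        beat.data = beat.data > data_T(0) ? beat.data : data_T(0);
        out.push_back(beat); // sop/eop are forwarded untouched
        if (beat.eop) in_packet = false;
    }
    return out;
}

// Frames a list of values as one packet: sop on the first beat, eop on the last.
template <class T> std::vector<StreamBeat<T>> make_packet(const std::vector<T> &values) {
    std::vector<StreamBeat<T>> beats;
    for (std::size_t i = 0; i < values.size(); ++i) {
        beats.push_back({values[i], i == 0, i + 1 == values.size()});
    }
    return beats;
}
```

Carrying sop/eop with every beat lets an always-running stage find packet boundaries without a separate size argument. The actual kernels in the PR may structure this differently.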
Tests
The updated layers were tested in emulation, in simulation, and on hardware. Tests were conducted by generating the project files with the oneAPI backend code generator and building the binaries with CMake.
Test Configuration:
setvars script.
Checklist
pre-commit on the files I edited or added.