DotProd Autotiler
This test computes a vector x matrix product with two data types (float32 and float16). It runs both sequentially on the fabric controller (FC) and, using Autotiler-generated code, in parallel on the 8 cluster cores.
The matrices are initially stored in L2, and the generated code tiles the data and double-buffers its movement between L2 and L1. To check the results, a sequential, non-optimized vector x matrix product is computed by the FC in the main code.
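As a rough illustration of the sequential check described above, the following C sketch computes a reference vector x matrix product and counts the elements that differ from another result by more than a tolerance. The function names (`vecmat_ref`, `count_mismatches`), the row-major `[in_ch x out_ch]` layout of W, and the tolerance-based comparison are assumptions made for this example; they are not taken from the test's actual source.

```c
#include <stddef.h>

/* Hypothetical reference kernel: out[j] = sum over i of in[i] * w[i][j].
 * Assumes W is stored row-major with shape [in_ch x out_ch]. */
static void vecmat_ref(const float *in, const float *w, float *out,
                       size_t in_ch, size_t out_ch)
{
    for (size_t j = 0; j < out_ch; j++) {
        float acc = 0.0f;
        for (size_t i = 0; i < in_ch; i++)
            acc += in[i] * w[i * out_ch + j];
        out[j] = acc;
    }
}

/* Hypothetical checker: returns how many elements of got differ from
 * ref by more than tol (absolute difference). */
static size_t count_mismatches(const float *ref, const float *got,
                               size_t n, float tol)
{
    size_t errors = 0;
    for (size_t k = 0; k < n; k++) {
        float d = ref[k] - got[k];
        if (d < 0.0f)
            d = -d;
        if (d > tol)
            errors++;
    }
    return errors;
}
```

A tolerance is used instead of exact equality because a float16 result, or a float32 result accumulated in a different order, will generally not match the reference bit for bit.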
How to run
To run the code with In [64] and W [64x64] -> Out [64]:
mkdir build && cd build && cmake ../
make run -j
To change the size of the matrices, modify the following line in the CMakeLists.txt file:
list(APPEND DOTPROD_OPT_FLAG -DIN_CH=128 -DOUT_CH=128)
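For context, flags such as -DIN_CH and -DOUT_CH are ordinary preprocessor definitions, so C code can size its buffers from them. The sketch below shows one plausible way this might be consumed; the fallback value of 64 and the helper names (`in_elems`, `w_elems`, `out_elems`) are assumptions for illustration and are not taken from the test's source.

```c
#include <stddef.h>

/* Fallbacks used only when the build passes no -DIN_CH / -DOUT_CH.
 * The value 64 is an assumption chosen to match the sizes quoted above. */
#ifndef IN_CH
#define IN_CH 64
#endif
#ifndef OUT_CH
#define OUT_CH 64
#endif

/* Hypothetical helpers: element counts derived from the two macros. */
static size_t in_elems(void)  { return (size_t)IN_CH; }
static size_t w_elems(void)   { return (size_t)IN_CH * (size_t)OUT_CH; }
static size_t out_elems(void) { return (size_t)OUT_CH; }
```

With -DIN_CH=128 -DOUT_CH=128, these helpers would report 128 input elements, 16384 weights, and 128 output elements.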