This section reports the performance and energy results measured on the GAPuino Board. The following sections explain how to reproduce these results on your own GAPuino Board.
The following chart shows the execution-time speed-up of GAP8. The baseline is the pure RISC-V ISA. All experiments were conducted on a GAPuino Board.
As the chart shows, GAP8 benefits from all three optimization steps. The combined speed-up from the GAP8 ISA extensions, vectorization and parallelization is between 14.5x and 53.9x. For instance, at the maximum GAP8 frequency, a 5×5 byte convolutional layer on a 112×112 grayscale input image with 100 output filters takes only 12.8 ms on GAP8, compared with 729.4 ms on a single core running the standard RISC-V ISA.
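As a minimal sketch of the arithmetic behind a speed-up figure, the snippet below divides the baseline execution time by the optimized execution time, using the two times quoted for the convolutional layer example. The function and variable names are hypothetical and chosen only for illustration. The resulting ratio is derived solely from those two quoted times and is not one of the data points reported in the chart.

```python
def speedup(baseline_ms: float, optimized_ms: float) -> float:
    """Return how many times faster the optimized run is than the baseline."""
    if baseline_ms <= 0 or optimized_ms <= 0:
        raise ValueError("execution times must be positive")
    return baseline_ms / optimized_ms


# Times quoted in the text for the 5x5 convolutional layer example.
riscv_single_core_ms = 729.4  # single core, standard RISC-V ISA
gap8_optimized_ms = 12.8      # GAP8 with ISA extensions, vectorization, parallelization

ratio = speedup(riscv_single_core_ms, gap8_optimized_ms)
print(f"speed-up: {ratio:.1f}x")  # prints "speed-up: 57.0x"
```

The same helper can be applied to any pair of timings you measure when reproducing the benchmarks on your own board.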
The next analysis shows the energy efficiency gain with respect to the pure standard RISC-V ISA.
Here the results show that, for a given benchmark, energy efficiency benefits from the GAP8 ISA extensions (blue bar), from the vector units (red bar) and from parallelism (yellow bar). Parallelism, which exploits the shared instruction cache and shared memory, gives an additional ~2x gain in energy efficiency.