Software Development Tools

The GAP SDK tool chain makes it easy for developers to extract the maximum energy efficiency and performance from the GAP architecture while staying productive. We provide multiple examples covering different tasks, as well as full end-to-end application examples.

The GAP SDK includes an open-source, extended version of the RISC-V GCC compiler that adds optimization and code generation steps for all the RISC-V ISA extensions implemented by GAP processors. It also includes all the tools necessary to flash, run, and debug GAP applications. There is no need to use assembly language to extract maximum performance from GAP processors.

GCC C/C++ toolchain

Standard GNU compiler toolchain ported to GAP.


Open-source JTAG bridge supporting many cables and chips, including Olimex and J-Link. It also supports semihosting and debugging.


GVSOC is a near-cycle-accurate, full SoC simulator that runs GAP executables on a Linux PC. The tool's code and signal tracing functions give the developer detailed information on code execution, speeding up the debugging and optimization of applications. A tutorial on how to use GVSOC is available on our blog.


Profiler is a visualization tool for profiling and debugging GAP applications. It is used together with GVSOC, our full-system SoC simulator. Profiler gives you a visual view of what is happening inside the chip and lets you control the simulator through a graphical interface.

NN Menu

NN Menu is a GreenWaves GitHub repository that includes NN examples and benchmarks of well-known architectures to help you build your application and run it on a GAP processor. A full tutorial on NN Menu is available on our blog.

NNTool

NNTool is a Python package that facilitates the porting of NN graphs from high-level deep learning frameworks to a GAP processor. NNTool implements model optimizations, quantization, and AutoTiler model generation. Extensive testing functionality is available via a Python library of GAP-like computational kernels. More details are available on our GitHub.
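As a rough illustration of what the quantization step involves, the sketch below maps floating-point weights to 8-bit integers using a single symmetric per-tensor scale. It is not NNTool's API: the function names `quantize_symmetric_int8` and `dequantize` are hypothetical, and the scheme shown is one common approach, assumed here for illustration only.

```python
def quantize_symmetric_int8(values):
    """Map floats to the int8 range with one shared scale (symmetric scheme).

    Hypothetical helper for illustration; not part of NNTool.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # Choose the scale so the largest magnitude maps to 127.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the symmetric int8 range.
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the quantized integers."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_symmetric_int8(weights)
restored = dequantize(q, scale)
```

Each restored value differs from the original by at most about half a quantization step (`scale / 2`), which is the accuracy cost traded for smaller, cheaper integer arithmetic.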

NN/DSP Generators

Library of pre-optimized NN kernels


The AutoTiler optimizes data movement between memory areas inside and outside GAP processors. It generates readable, debuggable C code that implements the memory movement and calls the defined kernels on the GAP cluster. The AutoTiler-generated code minimizes memory accesses and overlaps memory movement with computation, reducing the memory bottleneck found in many neural network and DSP operations.
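The idea of overlapping memory movement with computation can be sketched as a tiled loop with two alternating buffers (double buffering). The example below is a conceptual model only, not AutoTiler output: the function `tiled_sum_of_squares` and its inner `fetch` helper are hypothetical, the "transfer" is a plain list copy, and nothing actually runs in parallel in this sequential Python code. On real hardware the fetch of the next tile would be an asynchronous transfer running while the kernel processes the current tile.

```python
def tiled_sum_of_squares(data, tile_size):
    """Sum of squares over `data`, processed tile by tile with two buffers.

    Hypothetical illustration of a double-buffered tiling schedule.
    """
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    n_tiles = (len(data) + tile_size - 1) // tile_size
    # Two tile-sized buffers standing in for fast local memory.
    buffers = [None, None]

    def fetch(tile_index):
        # Stand-in for an asynchronous copy from external to local memory.
        start = tile_index * tile_size
        return data[start:start + tile_size]

    total = 0
    if n_tiles > 0:
        buffers[0] = fetch(0)  # prologue: load the first tile
    for i in range(n_tiles):
        current = i % 2
        if i + 1 < n_tiles:
            # On hardware, this transfer would overlap with the compute below.
            buffers[1 - current] = fetch(i + 1)
        # Kernel: operate only on data already in the local buffer.
        total += sum(x * x for x in buffers[current])
    return total


result = tiled_sum_of_squares(list(range(10)), tile_size=4)
```

The result is independent of the tile size; only the schedule of transfers changes, which is what a tiling tool tunes to fit the available local memory.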


GAPflow includes tools that automatically convert quantized and floating-point TensorFlow Lite graphs to readable C code ready for execution on GAP. The process can be fully automated as part of a program build, but the tool-based approach means that each step remains inspectable and debuggable. GAPflow includes the GAP AutoTiler, which optimizes the movement of data across the memory hierarchy at compile time, overlapping computation with memory movement to ensure the best possible performance. The GAP AutoTiler makes it easy for a developer to extract the maximum benefit from the GAP architecture, avoiding the need for energy-hungry data caching for streamed workloads.