Using ELF overlays for data and instruction load

Introduction

ELF overlays are a mechanism that allows multiple ELF sections to share the same VMA (Virtual Memory Address). In the context of GAP, a virtual memory address is not a virtual address in the sense of an MMU-backed CPU. It is the address the code is generated against. For example, a variable whose virtual address is 0x1C001000 is accessed at that address throughout the code, even if its load address (its actual physical address) is different.

In the context of GAP, a load address is one of the following: identical to the virtual address (sections loaded into L2), a 0x0-based address (aliased addresses), or, as is the case in this section, a 0xC0000000-based address, that is, a flash address.

Overlays make it possible to load multiple sections in a time-sharing fashion: sections sharing the same virtual memory addresses are NOT accessible at the same time. This is particularly useful for applications that have multiple modes with mutually exclusive sections.
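
The time-sharing idea can be illustrated on a host machine without any SDK. The sketch below is only a simulation under stated assumptions: the two `flash_overlay_*` arrays stand in for load addresses in flash, `shared_region` stands in for the single address range both overlays are linked at, and `mount_overlay` is a hypothetical helper, not an SDK function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define OVERLAY_SIZE 4

/* Stand-ins for flash: each overlay has its own load address (distinct storage). */
static const uint8_t flash_overlay_0[OVERLAY_SIZE] = {0x0, 0x1, 0x2, 0x3};
static const uint8_t flash_overlay_1[OVERLAY_SIZE] = {0x4, 0x5, 0x6, 0x7};

/* Stand-in for the shared run-time range: both overlays are "linked" here. */
static uint8_t shared_region[OVERLAY_SIZE];

/* Hypothetical helper: mounting copies a load image to the shared range,
 * overwriting whatever overlay was there before. */
static void mount_overlay(const uint8_t *load_image)
{
    memcpy(shared_region, load_image, OVERLAY_SIZE);
}

/* Small self-check: the same address yields different data depending on
 * which overlay was mounted last. */
static void overlay_simulation_check(void)
{
    mount_overlay(flash_overlay_0);
    assert(shared_region[1] == 0x1);

    mount_overlay(flash_overlay_1);
    assert(shared_region[1] == 0x5); /* overlay 0's data is no longer there */
}
```

The point of the simulation is the last assertion: once overlay 1 is mounted, the data of overlay 0 can only be reached again by mounting overlay 0 again.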

Advantages over XIP

An alternative to overlays is XIP (execute in place), which can also be used to share memory dynamically. The advantage of overlays is that the programmer keeps full control of memory. With XIP, a cache miss can happen at any point in time, which can break real-time behavior or at least make real-time programming more complex. Instruction cache refills, and thus instruction cache miss penalties, are also larger, and data accesses take longer. Overlay mounts and unmounts, by contrast, are software controlled, so their occurrence is completely predictable. Once an overlay is mounted, accessing it incurs no further cost.

XIP can still be the better choice if the section to be mounted is larger than the available L2 memory. If data access penalties and real-time issues are not a concern, it also provides seamless memory management.

Prerequisites

As a prerequisite to use overlays, it is necessary to:

  • Use an application flash section of the type app binary. This is already the case with default layouts provided by the SDK.

  • Declare a custom ldscript (linker script) to be used, in which the overlaid sections are declared. See the Kconfig options to do so.

  • Ensure these sections do not cross-reference each other. See the NOCROSSREFS keyword in the ldscript language.

How to use it

First, data and code must be declared with a section attribute that names the desired section.

volatile uint8_t overlayed_data_0[4] __attribute__((section(".layer_trial_0"))) = {0x0,0x1,0x2,0x3};
volatile uint8_t overlayed_data_1[4] __attribute__((section(".layer_trial_1"))) = {0x4,0x5,0x6,0x7};
volatile uint8_t overlayed_data_2[4] __attribute__((section(".layer_trial_2"))) = {0x8,0x9,0xa,0xb};
volatile uint8_t overlayed_data_3[4] __attribute__((section(".layer_trial_3"))) = {0xc,0xd,0xe,0x0f};

__attribute__((section(".layer_code_0"))) void try_overlay_code0(void)
{
    layer_code_0_called = 1;
    printf("I came from code overlay 0\n");
}
__attribute__((section(".layer_code_1"))) void try_overlay_code1(void)
{
    layer_code_1_called = 1;
    printf("I came from code overlay 1\n");
}

Then, the main code must load the overlay before accessing it.

pi_app_bin_section_load_by_name(flash_ref, ".layer_trial_0", &app_description);

uint32_t cur = 0;
for(volatile int i = 0; i < 4; i++)
{
    printf("value of overlay0[%i]=%.2i (addr:0x%x)\n",i,overlayed_data_0[i],(unsigned int)&overlayed_data_0[i]);
    if(overlayed_data_0[i] != (cur + i))
    {
        printf("wrong value, expected: %i\n",(int)(cur+i));
        return -1;
    }
}
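
Because only one overlay of a group can be mounted at a time, it can help to track which one is currently mounted and to reload only when switching. The following is a minimal host-side sketch of that idea, not part of the SDK: `overlay_slot_t`, `overlay_ensure` and the loader signature are hypothetical names introduced for illustration, and `stub_loader` merely counts load requests so the logic can be exercised off-target.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical loader signature: returns 0 on success. */
typedef int (*overlay_loader_t)(const char *section_name);

/* One slot per group of sections that share the same address range. */
typedef struct {
    overlay_loader_t load;  /* function that performs the actual load */
    const char *mounted;    /* name of the mounted section, NULL if none;
                               must point to a string that outlives the slot */
} overlay_slot_t;

/* Make sure `section_name` is the overlay mounted in this slot.
 * The loader is called only when the requested section differs. */
static int overlay_ensure(overlay_slot_t *slot, const char *section_name)
{
    if (slot->mounted != NULL && strcmp(slot->mounted, section_name) == 0) {
        return 0; /* already mounted, nothing to do */
    }
    int err = slot->load(section_name);
    /* After a failed load the previous contents can no longer be trusted. */
    slot->mounted = (err == 0) ? section_name : NULL;
    return err;
}

/* Host-side stub used to exercise the logic: counts load requests. */
static int stub_load_count = 0;
static int stub_loader(const char *section_name)
{
    (void)section_name;
    stub_load_count++;
    return 0;
}

static void overlay_ensure_check(void)
{
    overlay_slot_t slot = { stub_loader, NULL };
    assert(overlay_ensure(&slot, ".layer_trial_0") == 0);
    assert(overlay_ensure(&slot, ".layer_trial_0") == 0); /* no reload */
    assert(stub_load_count == 1);
    assert(overlay_ensure(&slot, ".layer_trial_1") == 0); /* switch: reload */
    assert(stub_load_count == 2);
}
```

On target, the loader would presumably wrap the load call shown above with the application's own `flash_ref` and `app_description`. That wrapper is left out here because its exact types are not given in this section.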

A full example can be found in the SDK, examples/gap9/basic/overlayed_binary/.