AddressIndependent ScatterLoading

Analysis of the scatter-loading mechanism and address-independent compilation on microcontrollers (2)

Created at 2020-12-10 19:03:34

//Assembly code
//Already known at this point: r10 is the table start address, r11 is the end address + 1 (compared against r10 to detect the end), and r7 is the start address - 1 (it actually serves as a base address)
 __scatterload_null
        0x000000d6:    45da        .E      CMP      r10,r11
        0x000000d8:    d101        ..      BNE      0xde ; __scatterload_null + 8
        0x000000da:    f000f878    ..x.    BL       __rt_entry ; 0x1ce
        0x000000de:    f2af0e09    ....    ADR      lr,{pc}-7 ; 0xd7
        0x000000e2:    e8ba000f    ....    LDM      r10!,{r0-r3}
        0x000000e6:    f0130f01    ....    TST      r3,#1
        0x000000ea:    bf18        ..      IT       NE
        0x000000ec:    1afb        ..      SUBNE    r3,r7,r3
        0x000000ee:    f0430301    C...    ORR      r3,r3,#1
        0x000000f2:    4718        .G      BX       r3
;;The code in the middle is omitted...
Region$$Table$$Base
        0x00038060:    00000055    U...    DCD    85
        0x00038064:    00000002    ....    DCD    2
        0x00038068:    00001b38    8...    DCD    6968
        0x0003806c:    00037f63    c...    DCD    229219
        0x00038070:    00000819    ....    DCD    2073
        0x00038074:    00001b3a    :...    DCD    6970
        0x00038078:    0000f5d0    ....    DCD    62928
        0x0003807c:    00037eeb    .~..    DCD    229099
        
//Corresponding C code (really pseudocode: the function arguments actually live in r0~r3, so some details of the C do not map one-to-one onto the assembly)
//Let us use the following names: start=r10, end=r11, base=r7
//There is no particular reason for using int32_t: on a 32-bit machine an address is 32 bits wide, and it can be treated as positive or negative.
            
void __scatterload_null(int32_t start, int32_t end, int32_t base)
{
    TRegionTable OnceOption;
    
    while(1) 
    {
        if(start == end)
        {
            break;
        }
        else
        {
            memcpy(&OnceOption, (const void *)start, 16);//16 = 4 registers (r0~r3) x 4 bytes each
            if( OnceOption.func & 0x1 )//pseudocode: func is treated as a 32-bit integer here
            {
                //bit[0] set: the stored value is an offset, so the real address is base - offset
                OnceOption.func = base - OnceOption.func;
            }
            //The assembly also sets bit[0] (ORR r3,r3,#1) before BX so that the call stays in Thumb state
            OnceOption.func( OnceOption.load_addr,
                             OnceOption.run_addr,
                             OnceOption.size,
                             base );
            start += 16;//LDM r10!,{r0-r3} advances r10 by one 16-byte entry
        }
    }
    //Execution actually jumps to __rt_entry() at this point.
    //There is no direct way to express that in C, so just treat it as: __rt_entry() runs once __scatterload_null() has finished
         
}
typedef struct
{
    int32_t     load_addr; //Load view: where the RAM data sits in the .bin file once it has been downloaded to the microcontroller, usually placed after code+ro
    int32_t        run_addr;  //Execution view: the actual address in RAM at run time
    int32_t        size;
    void (*func)(int32_t load_addr, int32_t run_addr, int32_t size, int32_t base);
}TRegionTable;
TRegionTable RegionTable[] @"Region$$Table$$Base" =
{
    //Note: in the three fields other than size, bits [1:0] are not part of the address; these two bits serve only as flags (the reasons are discussed later)
    { 85,    2,        6968,    __decompress$$rt2veneer,            },
    { 2073,    6970,    62928,    __scatterload_zeroinit$$rt2veneer,    },
};

//Where do load_addr and size come from? See below:
//This is also extracted from the same .S file
** Section #3 'data' (SHT_PROGBITS) [SHF_ALLOC + SHF_WRITE]
    Size   : 1988 bytes (alignment 4)
    Address: 0x000380b4
** Section #4 'bss' (SHT_NOBITS) [SHF_ALLOC + SHF_WRITE]
    Size   : 62928 bytes (alignment 4)
    Address: 0x00039bec
//Notice that 6968 ≠ 1988, while 62928 = 62928
//The reason is RW data compression. The .map file makes this clear:
      Code (inc. data)   RO Data    RW Data    ZI Data      Debug
    202804      13872      26752       6968      62928    1059524   Grand Totals
    202804      13872      26752       1988      62928    1059524   ELF Image Totals (compressed)
    202804      13872      26752       1988          0          0   ROM Totals
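
//As a quick cross-check of the totals above, here is the arithmetic redone in C. This is only a sketch: it assumes ROM holds Code + RO Data + compressed RW Data and RAM holds RW Data + ZI Data, and the constant and function names are made up for illustration.

```c
#include <stdint.h>

/* Figures copied from the .map totals above. */
enum {
    CODE_SIZE          = 202804, /* Code column (already includes inline data) */
    RO_DATA_SIZE       = 26752,
    RW_DATA_SIZE       = 6968,   /* RW data as it exists in RAM at run time */
    RW_COMPRESSED_SIZE = 1988,   /* the same RW data as stored in the image */
    ZI_DATA_SIZE       = 62928
};

/* Assumption: ROM = Code + RO Data + compressed RW Data. */
uint32_t rom_footprint(void)
{
    return CODE_SIZE + RO_DATA_SIZE + RW_COMPRESSED_SIZE;
}

/* Assumption: RAM = RW Data + ZI Data. */
uint32_t ram_footprint(void)
{
    return RW_DATA_SIZE + ZI_DATA_SIZE;
}

/* Bytes of ROM saved by storing the RW data compressed. */
uint32_t compression_saving(void)
{
    return RW_DATA_SIZE - RW_COMPRESSED_SIZE;
}
```

//rom_footprint() gives 231544 = 0x38878, which equals the 'data' section address 0x000380b4 plus its 1988 compressed bytes, so the numbers are consistent with each other.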

//Now an introduction to the structure of Region$$Table$$Base
//Every 16 bytes form one entry: the load-time address, the run-time address, the number of bytes to process, and the function that processes them
//The entries are assembled into a table according to what is needed
//"What is needed" depends on whether a variable has an initial value or not, and this can be subdivided further, as shown below:

//ram --- | Variables without an initial value ------------------------------ .bss (calls __scatterload_zeroinit())
//        |                                 | Initial values all 0 ---------- .bss (calls __scatterload_zeroinit())
//        | Variables with an initial value | Initial values non-0 ---------- .rw (calls __decompress())
//        |                                 | Initial values irregular ------ .rw (calls __scatterload_copy())
//This classification may not be exact, because __decompress() is a special case: the details of compression and decompression are hard to pin down here
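
//To make the "one entry = load address, run address, size, handler" idea concrete, here is a small host-side model of the table walk. It is a sketch, not the library code: SimRegion, sim_copy, sim_zeroinit, sim_scatterload and sim_demo are hypothetical names, plain byte arrays stand in for ROM and RAM, and __decompress() is left out because its format is not described here.

```c
#include <stdint.h>
#include <string.h>

/* One table entry. On the target each field is one 32-bit word (16 bytes in
 * total); on a host the function pointer may be wider. */
typedef struct {
    uint32_t load_off;  /* offset of the stored data in the simulated ROM */
    uint32_t run_off;   /* offset of the destination in the simulated RAM */
    uint32_t size;      /* number of bytes to process */
    void (*handler)(const uint8_t *load, uint8_t *run, uint32_t size);
} SimRegion;

/* Stand-in for a copy handler: initial values are copied verbatim. */
void sim_copy(const uint8_t *load, uint8_t *run, uint32_t size)
{
    memcpy(run, load, size);
}

/* Stand-in for a zero-init handler: the load address is not used. */
void sim_zeroinit(const uint8_t *load, uint8_t *run, uint32_t size)
{
    (void)load;
    memset(run, 0, size);
}

/* Walk the entries from start up to (not including) end, the same way the
 * r10/r11 pair is used. Returns the number of entries processed. */
int sim_scatterload(const SimRegion *start, const SimRegion *end,
                    const uint8_t *rom, uint8_t *ram)
{
    int count = 0;
    for (const SimRegion *r = start; r != end; ++r) {
        r->handler(rom + r->load_off, ram + r->run_off, r->size);
        ++count;
    }
    return count;
}

/* Demo: 4 initialised bytes followed by 4 zeroed bytes.
 * Returns 1 if RAM ends up with the expected contents, 0 otherwise. */
int sim_demo(void)
{
    static const uint8_t rom[4]    = { 1, 2, 3, 4 };
    static const uint8_t expect[8] = { 1, 2, 3, 4, 0, 0, 0, 0 };
    uint8_t ram[8];
    memset(ram, 0xAA, sizeof ram);      /* pretend RAM holds garbage at reset */

    const SimRegion table[2] = {
        { 0, 0, 4, sim_copy },          /* variables with initial values */
        { 4, 4, 4, sim_zeroinit },      /* variables without initial values */
    };
    int n = sim_scatterload(table, table + 2, rom, ram);
    return n == 2 && memcmp(ram, expect, sizeof expect) == 0;
}
```

//Calling sim_demo() runs both entries once; the real table differs in that the addresses and the handler are encoded with flag bits, as discussed further on.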

//Next, how Region$$Table$$Base obtains the handler functions
//First, here is how these functions appear in the assembly listing (from the same program as the earlier excerpts):


__decompress$$rt2veneer
    0x000000fc:    f0100f01    ....    TST      r0,#1
    0x00000100:    bf18        ..      IT       NE
    0x00000102:    19c0        ..      ADDNE    r0,r0,r7
    0x00000104:    f0110f01    ....    TST      r1,#1
    0x00000108:    bf18        ..      IT       NE
    0x0000010a:    19c9        ..      ADDNE    r1,r1,r7
    0x0000010c:    f0110f02    ....    TST      r1,#2
    0x00000110:    bf18        ..      IT       NE
    0x00000112:    4449        ID      ADDNE    r1,r1,r9
    0x00000114:    f0210103    !...    BIC      r1,r1,#3
!!dczerorl2
__decompress
__decompress1
    0x00000118:    440a        .D      ADD      r2,r2,r1

;;The code in the middle is omitted...
__scatterload_zeroinit$$rt2veneer
     0x00000174:    f0100f01    ....    TST      r0,#1
     0x00000178:    bf18        ..      IT       NE
     0x0000017a:    19c0        ..      ADDNE    r0,r0,r7
     0x0000017c:    f0110f01    ....    TST      r1,#1
     0x00000180:    bf18        ..      IT       NE
     0x00000182:    19c9        ..      ADDNE    r1,r1,r7
     0x00000184:    f0110f02    ....    TST      r1,#2
     0x00000188:    bf18        ..      IT       NE
     0x0000018a:    4449        ID      ADDNE    r1,r1,r9
     0x0000018c:    f0210103    !...    BIC      r1,r1,#3
 !!handler_zi
 __scatterload_zeroinit
     0x00000190:    2300        .#      MOVS     r3,#0
     
//Now take several values from Region$$Table$$Base
        0x0003805f:    XXXXXXXX    ....    DCD    XXXXXX ;;This is the r7 (base) mentioned earlier; the address value itself is what gets used.
Region$$Table$$Base
        0x00038060:    00000055    U...    DCD    85 ;;Note that this load_addr depends on the data layout; it is an offset
        0x00038064:    00000002    ....    DCD    2
        0x00038068:    00001b38    8...    DCD    6968
        0x0003806c:    00037f63    c...    DCD    229219 ;;Call this value func1
        0x00038070:    00000819    ....    DCD    2073
        0x00038074:    00001b3a    :...    DCD    6970
        0x00038078:    0000f5d0    ....    DCD    62928
        0x0003807c:    00037eeb    .~..    DCD    229099 ;;Call this value func2
        
//__scatterload_null() then performs r3 = r7 - r3: 0x000000ec: 1afb .. SUBNE r3,r7,r3
//r3 holds the func1 / func2 value, so let's calculate:
//func1 first: 0x0003805f - 0x00037f63 = 0x000000fc, which is __decompress$$rt2veneer()
//then func2: 0x0003805f - 0x00037eeb = 0x00000174, which is __scatterload_zeroinit$$rt2veneer()
//Of course, without address-independent compilation the details will differ somewhat, but the pattern can still be traced (or, more simply, single-step in a simulator to follow the whole execution flow)
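
//The subtraction above can be replayed on a PC. The sketch below mirrors the three instructions TST r3,#1 / SUBNE r3,r7,r3 / ORR r3,r3,#1; the function names resolve_handler and handler_entry are made up for illustration.

```c
#include <stdint.h>

/* Returns the value that ends up in r3 just before BX r3. */
uint32_t resolve_handler(uint32_t base, uint32_t func_field)
{
    uint32_t target = func_field;
    if (target & 1u) {
        target = base - target;   /* SUBNE r3,r7,r3: the field is an offset */
    }
    return target | 1u;           /* ORR r3,r3,#1: set the Thumb bit for BX */
}

/* Address of the first instruction of the handler (Thumb bit removed). */
uint32_t handler_entry(uint32_t bx_target)
{
    return bx_target & ~1u;
}
```

//With base = 0x0003805f, handler_entry(resolve_handler(base, 0x00037f63)) is 0x000000fc and handler_entry(resolve_handler(base, 0x00037eeb)) is 0x00000174, matching the two veneer addresses in the listing.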

//You will notice:
//bit[0] and bit[1] of func1 and func2 (and likewise of the two address fields in Region$$Table$$Base, load_addr and run_addr) have special meanings.
//bit[0]: =1 means the value is an offset that must be combined with the base address before use (by subtraction or by addition; the base address is obtained as described earlier)
//bit[1]: =1 means the value is address-independent, i.e. RWPI, and a base address must be supplied at run time, which is r9.
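
//The same bit rules can be checked against the load_addr and run_addr fields. The sketch below follows the veneer instructions (TST/ADDNE with r7, TST/ADDNE with r9, then BIC #3); the function names are hypothetical, and the r9 value 0x20000000 used in the note afterwards is only an assumed RAM base, not something taken from the listing.

```c
#include <stdint.h>

/* r0 path of the veneer: TST r0,#1 / ADDNE r0,r0,r7 */
uint32_t resolve_load_addr(uint32_t field, uint32_t r7)
{
    return (field & 1u) ? field + r7 : field;
}

/* r1 path of the veneer:
 *   TST r1,#1 / ADDNE r1,r1,r7   bit[0]: offset from the table base
 *   TST r1,#2 / ADDNE r1,r1,r9   bit[1]: offset from the RWPI base
 *   BIC r1,r1,#3                 drop the two flag bits */
uint32_t resolve_run_addr(uint32_t field, uint32_t r7, uint32_t r9)
{
    uint32_t addr = field;
    if (addr & 1u) {
        addr += r7;
    }
    if (addr & 2u) {
        addr += r9;
    }
    return addr & ~3u;
}
```

//With r7 = 0x0003805f, resolve_load_addr(85, r7) is 0x000380b4 (the address of the 'data' section) and resolve_load_addr(2073, r7) is 0x00038878 (right after its 1988 compressed bytes). With the assumed r9 = 0x20000000, resolve_run_addr(2, r7, r9) is 0x20000000 and resolve_run_addr(6970, r7, r9) is 0x20001b38, i.e. the zero-initialised area starts 6968 bytes after the RW data.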

That is the complete execution flow of scatter-loading. There is, however, one case worth looking at: an array defined in RAM that holds function pointers and is given initial values, as shown below:

typedef void (*Func)( void* arg);
Func func[2] = {StartThread_Entry, StartThread_Entry};
    
int main( void )
{
    func[0](0);
    func[1](0);
    //...Other code
}

The above code compiles successfully. Address-independent compilation is introduced next.