//Open the .S file and locate the corresponding code (the .map file tells you where it is)
//1. The code generated for the main function in main.c (recall the $Super$$main mechanism mentioned earlier: the main the compiler sees differs from the main we see)
//The main seen by the compiler = $Sub$$main + $Super$$main + main; it combines the three, and KEIL exposes all three to us
//Of these, $Super$$main contains no code of its own and only serves to jump back to main, while $Sub$$main and main carry the real functionality
//Note: every section name prefixed with i., such as i.$Sub$$main or i.main, is a function compiled from a .c/.s file
i.main
$Super$$main
0x00030868: b51c .. PUSH {r2-r4,lr}
0x0003086a: 482c ,H LDR r0,[pc,#176] ; [0x3091c] = 0x1b0c
0x0003086c: 4448 HD ADD r0,r0,r9
0x0003086e: 6801 .h LDR r1,[r0,#0]
0x00030870: 2000 . MOVS r0,#0
0x00030872: 4788 .G BLX r1
0x00030874: 4829 )H LDR r0,[pc,#164] ; [0x3091c] = 0x1b0c
0x00030876: 4448 HD ADD r0,r0,r9
0x00030878: 6841 Ah LDR r1,[r0,#4]
0x0003087a: 2000 . MOVS r0,#0
0x0003087c: 4788 .G BLX r1
;; Omit other codes
$d
0x0003091c: 00001b0c .... DCD 6924
;; Omit other codes
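As a rough host-side model of the listing above (a sketch, not the compiler's output): each LDR / ADD r0,r0,r9 / LDR / BLX group reads a function pointer from the RW data at offset 0x1b0c from the static base held in r9, then calls it with argument 0. The names `rw_region`, `fill_slots`, `call_slot`, and `demo_callee` are hypothetical, and the callee's signature is an assumption; only the offset comes from the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed signature: the listing only shows that r0 = 0 is passed. */
typedef int (*slot_fn)(int);

#define RW_OFFSET 0x1b0cu   /* literal loaded from 0x3091c in the listing */

/* Hypothetical RW region; on the target, r9 would hold its base address. */
static unsigned char rw_region[RW_OFFSET + 2 * sizeof(slot_fn)];

static int demo_calls;      /* counts calls, for illustration only */
static int demo_callee(int arg) { demo_calls++; return arg + 1; }

/* Models what the dynamic initializer does: fill both slots at run time. */
static void fill_slots(unsigned char *static_base)
{
    slot_fn fn = demo_callee;
    assert(static_base != NULL);
    memcpy(static_base + RW_OFFSET, &fn, sizeof fn);
    memcpy(static_base + RW_OFFSET + sizeof fn, &fn, sizeof fn);
}

/* Models: r0 = 0x1b0c + r9; r1 = slot[index]; r0 = 0; BLX r1.
   On the target each slot is 4 bytes; on a host it is sizeof(slot_fn). */
static int call_slot(const unsigned char *static_base, size_t index)
{
    slot_fn fn;
    assert(static_base != NULL);
    memcpy(&fn, static_base + RW_OFFSET + index * sizeof fn, sizeof fn);
    return fn(0);
}
```

Calling `fill_slots(rw_region)` and then `call_slot(rw_region, 0)` and `call_slot(rw_region, 1)` mirrors the two calls in the listing. The code never names the variable's absolute address, only its offset from the base.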
//2. The dynamic initialization code generated for main.c (it performs the last step of RWPI at run time: materializing the actual addresses)
.text
__sta__dyninit
0x00001d40: 4803 .H LDR r0,[pc,#12] ; [0x1d50] = 0x21ac3
0x00001d42: 4478 xD ADD r0,r0,pc
0x00001d44: 4903 .I LDR r1,[pc,#12] ; [0x1d54] = 0x1b0c
0x00001d46: 4449 ID ADD r1,r1,r9
0x00001d48: 6008 .` STR r0,[r1,#0]
0x00001d4a: 6048 H` STR r0,[r1,#4]
0x00001d4c: 4770 pG BX lr
$d
0x00001d4e: 0000 .. DCW 0
0x00001d50: 00021ac3 .... DCD 137923
0x00001d54: 00001b0c .... DCD 6924
;; Omit other codes
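The arithmetic in __sta__dyninit can be checked on a host with a small model. This is a sketch under assumptions: `load_bias` (how far the image was moved from its link address) and `static_base` are hypothetical parameters, `fake_rw` stands in for the RAM that r9 points at, and the constants are copied from the listing above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RW_OFFSET 0x1b0cu   /* literal at 0x1d54 */

/* Hypothetical RAM image standing in for the region r9 points at. */
static uint8_t fake_rw[RW_OFFSET + 8u];

/* Returns the value left in r0 after storing it to both slots. */
static uint32_t sta_dyninit_model(uint32_t load_bias, uint8_t *static_base)
{
    assert(static_base != NULL);
    /* ADD r0,r0,pc at 0x1d42: in Thumb state pc reads as instruction address + 4 */
    uint32_t pc = 0x00001d42u + 4u + load_bias;
    uint32_t r0 = 0x00021ac3u + pc;                        /* literal at 0x1d50 */
    memcpy(static_base + RW_OFFSET + 0u, &r0, sizeof r0);  /* STR r0,[r1,#0] */
    memcpy(static_base + RW_OFFSET + 4u, &r0, sizeof r0);  /* STR r0,[r1,#4] */
    return r0;
}

/* Reads back one of the two 32-bit slots. */
static uint32_t read_slot(const uint8_t *static_base, unsigned index)
{
    uint32_t v;
    assert(static_base != NULL && index < 2u);
    memcpy(&v, static_base + RW_OFFSET + 4u * index, sizeof v);
    return v;
}
```

With `load_bias` set to 0 the model yields 0x00023809, which suggests a Thumb function at 0x23808 in the link-time layout. With a bias of 0x10000 it yields 0x00033809: the stored pointer follows the image wherever it is loaded, which is the point of computing it at run time.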
//3. How __sta__dyninit gets called
//__main -> __rt_entry -> __rt_lib_init -> __cpp_initialize__aeabi_
//This happens after __scatterload has run
.text
__cpp_initialize__aeabi_
0x00002598: b570 p. PUSH {r4-r6,lr}
0x0000259a: 4c06 .L LDR r4,[pc,#24] ; [0x25b4] = 0x35ae0
0x0000259c: 447c |D ADD r4,r4,pc
0x0000259e: 4d06 .M LDR r5,[pc,#24] ; [0x25b8] = 0x35b10
0x000025a0: 447d }D ADD r5,r5,pc
0x000025a2: e003 .. B 0x25ac ; __cpp_initialize__aeabi_ + 20
0x000025a4: 6820 h LDR r0,[r4,#0]
0x000025a6: 4420 D ADD r0,r0,r4
0x000025a8: 4780 .G BLX r0
0x000025aa: 1d24 $. ADDS r4,r4,#4
0x000025ac: 42ac .B CMP r4,r5
0x000025ae: d1f9 .. BNE 0x25a4 ; __cpp_initialize__aeabi_ + 12
0x000025b0: bd70 p. POP {r4-r6,pc}
$d
0x000025b2: 0000 .. DCW 0
0x000025b4: 00035ae0 .Z.. DCD 219872
0x000025b8: 00035b10 .[.. DCD 219920
;; Omit other codes
.init_array
Region$$Table$$Limit
SHT$$INIT_ARRAY$$Base
0x00038080: fffc8161 a... DCD 4294738273
.init_array
0x00038084: fffc81a1 .... DCD 4294738337
.init_array
0x00038088: fffc838d .... DCD 4294738829
.init_array
0x0003808c: fffc83bd .... DCD 4294738877
.init_array
0x00038090: fffc8425 %... DCD 4294738981
.init_array
0x00038094: fffc8d3d =... DCD 4294741309
.init_array
0x00038098: fffc9109 .... DCD 4294742281
.init_array
0x0003809c: fffc9481 .... DCD 4294743169
.init_array
0x000380a0: fffc95fd .... DCD 4294743549
.init_array
0x000380a4: fffc96fd .... DCD 4294743805
.init_array
0x000380a8: fffc9881 .... DCD 4294744193
.init_array
0x000380ac: fffc9c29 )... DCD 4294745129
.init_array
0x000380b0: fffc9c91 .... DCD 4294745233 ;; Note: this is the entry for the dynamic initialization needed by my main.c
.init_array
SHT$$INIT_ARRAY$$Limit
//4. Start analysis
a. Because address-independent (position-independent) compilation is used at compile and link time, the actual run address of our functions is unknown. The scatter loading seen earlier therefore cannot simply "relocate" data to initialize these variables in RAM. Their values can only be determined once the .bin file has been loaded into flash and is executing.
b. So the linker links in a __cpp_initialize__aeabi_() function (located in the library "../clib/arm_runtime.c" provided by KEIL).
The __scatter() responsible for scatter loading is located in "../clib/angel/scatter.s", which is an assembly file. As a result, __scatter() does not need the stack and can run early, whereas __cpp_initialize__aeabi_() has to wait until the stack has been initialized, so it must run after __scatter().
c. Next, let us look at how __cpp_initialize__aeabi_() reaches the __sta__dyninit() of each file.
It is the same routine as before, analyzed once more, but this time we write the equivalent C code directly. The function was originally compiled from C, and the assembly shows clear traces of that: on entry it saves the registers it will use together with lr, it uses r0~r3 as scratch registers throughout, and on return it pops the saved lr straight into pc.
int32_t FuncTable[] @"SHT$$INIT_ARRAY$$Base" =
{
0xfffc8161,
0xfffc81a1,
0xfffc838d,
0xfffc83bd,
0xfffc8425,
0xfffc8d3d,
0xfffc9109,
0xfffc9481,
0xfffc95fd,
0xfffc96fd,
0xfffc9881,
0xfffc9c29,
0xfffc9c91, //Offset of __sta__dyninit() in main.c relative to the address of this entry
};
void __cpp_initialize__aeabi_(void)
{
    const int32_t *start; //r4
    const int32_t *end;   //r5
    void (*func)(void);
    //In Thumb state on Cortex-M, reading pc returns the address of the current instruction + 4
    start = (const int32_t *)(0x00035ae0 + 0x000025a0); //=0x00038080 = &FuncTable[0]
    end = (const int32_t *)(0x00035b10 + 0x000025a4);   //=0x000380b4 = one past the last entry of FuncTable
    while (start != end)
    {
        //Each entry in FuncTable stores the offset of its function relative to the address of the entry itself
        func = (void (*)(void))((intptr_t)start + *start);
        func();
        start++; //advance by one 4-byte entry (ADDS r4,r4,#4)
    }
}
Let us work through the calculation for the last entry:
.init_array
0x000380b0: fffc9c91 .... DCD 4294745233 ;; Note: this is the entry for the dynamic initialization needed by my main.c
Both values are in two's complement, so simply add them: 0xfffc9c91 + 0x000380b0 = 0x00001d41 (the carry out of bit 31 is discarded). This is the start address of __sta__dyninit(), 0x00001d40, with bit[0] set to 1.
As for bit[0] being 0 or 1: bit[0] of a BLX target address selects the instruction set, and 1 means Thumb.
d. This completes the dynamic initialization of RAM. In summary, the actual address of the function is only known at run time, so these variables can only be assigned once the program is running.
e. Using such a variable in the main function is then simple: add the base address held in r9 to its offset and access it.
f. Dynamic loading is explained in detail in the section on address-independent compilation. Here we only sketch the process. It is enough to know that all data can be allocated and initialized in a well-defined way, and that this work is done before the main function is executed.
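The address arithmetic of step c can be replayed on a host. The sketch below only computes the call targets instead of calling them. `resolve_entry`, `resolve_init_array`, and `INIT_ARRAY_BASE` are illustrative names, and the table values are copied from the .init_array dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INIT_ARRAY_BASE 0x00038080u   /* SHT$$INIT_ARRAY$$Base in the dump */

static const uint32_t init_array[] = {
    0xfffc8161u, 0xfffc81a1u, 0xfffc838du, 0xfffc83bdu, 0xfffc8425u,
    0xfffc8d3du, 0xfffc9109u, 0xfffc9481u, 0xfffc95fdu, 0xfffc96fdu,
    0xfffc9881u, 0xfffc9c29u, 0xfffc9c91u,
};

#define INIT_ARRAY_COUNT (sizeof init_array / sizeof init_array[0])

/* target = address of the entry + value of the entry (32-bit wrap-around) */
static uint32_t resolve_entry(size_t index)
{
    assert(index < INIT_ARRAY_COUNT);
    uint32_t entry_addr = INIT_ARRAY_BASE + 4u * (uint32_t)index;
    return entry_addr + init_array[index];
}

/* Fills out[] with the resolved targets; returns the number written. */
static size_t resolve_init_array(uint32_t *out, size_t capacity)
{
    size_t n = 0;
    assert(out != NULL);
    for (size_t i = 0; i < INIT_ARRAY_COUNT && i < capacity; i++)
        out[n++] = resolve_entry(i);
    return n;
}
```

The last entry resolves to 0x00001d41, that is __sta__dyninit at 0x00001d40 with the Thumb bit set. The unsigned wrap-around does the same job as adding a negative two's complement offset.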
Strictly speaking, scatter loading refers only to __scatter(), but I am used to counting dynamic loading as part of scatter loading as well, because both initialize RAM. How you classify it does not really matter.
Note:
-KEIL is configured through scatter-loading files. Generally RW, ZI, heap, and stack are allocated in one go (where zi+heap+stack=.bss). The stack is placed after the heap because on ARM the stack generally grows downward, while the heap grows upward toward it and is rarely used up, so the two can share the space between them as fully as possible.
-The above is an analysis of KEIL. I have not analyzed IAR and GCC specifically, but the essence is the same.
-The above only analyzes the scatter-loading process without address-independent compilation (my sample code does use address-independent compilation, but there is no essential difference; try it yourself to deepen your understanding).
ROPI mainly concerns const variables and the data initialized by the scatter loader (content placed in flash, generally not including functions).
The official explanation:
ROPI = Read-Only Position Independence. This concerns everything that is readonly in the ELF output from the linker. Note that this includes const data and data initializers, i.e. typically everything that is put in FLASH.
This explanation covers the following 3 points:
1. const data: variables qualified with const;
2. Data initializers: the initial values of data (as I understand it, this is the dynamic initializer __sta__dyninit());
3. Functions.
Together, these 3 points cover all the data in flash ("everything that is put in FLASH").
So there is no problem on bare metal, because there we generally specify where in flash the program runs. Think of a .hex file: every record carries an address. That is why a .hex file can be converted to a .bin file, whereas the reverse generally cannot be done without extra information, because a .bin file is a raw image with no address information.
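To make the .hex/.bin difference concrete, the sketch below decodes one Intel HEX data record and exposes its address field, which a raw .bin image has no place to store. `parse_hex_record` and `hex_digit` are illustrative helpers, not part of any toolchain, and only the basic record layout (length, 16-bit address, type, data, checksum) is handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Converts one hex digit; returns -1 for anything else. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Parses ":LLAAAATT<data>CC" (no trailing newline).
   data must have room for 255 bytes. Returns 0 on success, -1 on error. */
static int parse_hex_record(const char *line, uint16_t *addr, uint8_t *type,
                            uint8_t *data, size_t *len)
{
    uint8_t bytes[5 + 255];
    size_t n = 0;
    assert(line && addr && type && data && len);
    if (line[0] != ':') return -1;
    size_t chars = strlen(line + 1);
    if (chars < 10 || chars % 2 != 0 || chars / 2 > sizeof bytes) return -1;
    for (size_t i = 0; i < chars; i += 2) {
        int hi = hex_digit(line[1 + i]);
        int lo = hex_digit(line[2 + i]);
        if (hi < 0 || lo < 0) return -1;
        bytes[n++] = (uint8_t)(hi * 16 + lo);
    }
    if (n != (size_t)bytes[0] + 5u) return -1;       /* LL must match the data length */
    uint8_t sum = 0;
    for (size_t i = 0; i < n; i++) sum = (uint8_t)(sum + bytes[i]);
    if (sum != 0) return -1;                         /* all bytes sum to 0 mod 256 */
    *len = bytes[0];
    *addr = (uint16_t)((bytes[1] << 8) | bytes[2]);  /* 16-bit load offset */
    *type = bytes[3];                                /* 0x00 = data record */
    memcpy(data, bytes + 4, *len);
    return 0;
}
```

For a generic sample record such as ":10010000214601360121470136007EFE09D2190140", the parser reports 16 data bytes destined for offset 0x0100. Converting to .bin keeps only those 16 bytes; the 0x0100 is dropped, so going back requires someone to supply it again.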
Once an operating system is involved, things are different. Where the program runs is decided entirely by the operating system. Moreover, programs are usually stored on media such as disks or external flash and are only fetched when needed, then placed either in internal RAM or in internal flash.
The prerequisite for implementing ROPI is that the positions of functions and variables within the program block are fixed relative to one another; otherwise none of this is possible.
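This prerequisite can be illustrated on a host: if a block stores only offsets between its own parts, copying the whole block elsewhere leaves every reference valid. The sketch below is a toy model with a hypothetical layout (`REF_POS`, `DATA_POS`, `build_image`, `resolve` are made-up names); it is not how the linker lays out a real ROPI image.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define IMAGE_SIZE 32u
#define REF_POS    0u    /* where the self-relative reference lives */
#define DATA_POS   16u   /* where the referenced constant lives */

/* Builds a tiny "image": a 32-bit offset at REF_POS that points, relative
   to its own position, at a string stored at DATA_POS. */
static void build_image(uint8_t *image)
{
    int32_t rel = (int32_t)DATA_POS - (int32_t)REF_POS;
    assert(image != NULL);
    memset(image, 0, IMAGE_SIZE);
    memcpy(image + REF_POS, &rel, sizeof rel);
    memcpy(image + DATA_POS, "ropi", 5);
}

/* Resolves the reference using only the image's current location. */
static const char *resolve(const uint8_t *image)
{
    int32_t rel;
    assert(image != NULL);
    memcpy(&rel, image + REF_POS, sizeof rel);
    return (const char *)(image + REF_POS + rel);
}
```

Building the image in one buffer, copying it byte for byte into another, and calling `resolve` on each copy finds the string in both, at different absolute addresses. If the string had been moved independently of the reference, the stored offset would be wrong, which is why the relative positions must stay fixed.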