C Pointer and Array Cross Reference with Mismatched Type
So far, we've discussed C Pointers and Arrays and C Character Pointer and String Manipulation. These are confusingly named concepts, and it's very easy to mix them up.
In this article, we'll see what happens when an external reference is declared with a wrong but seemingly interchangeable type. There are two cases: referencing an array as a pointer, and the opposite.
Ref array as pointer
Suppose the character array is defined in file 1 as char a[100];
The external variable declaration referenced in file 2 is extern char *a;
Here, array a is defined in file 1 but declared as a pointer in file 2. Is there anything wrong with this? Isn't it often said that pointers and arrays are similar and can even be used interchangeably? Unfortunately, that is wrong. You may still remember what I said previously: arrays are arrays, and pointers are pointers. They are two completely different things! There is no relationship between them, but they are often dressed in similar clothes to confuse you. Let's analyze this problem.
At the beginning of Chapter 1, I emphasized the difference between definitions and declarations. Definitions allocate memory, while declarations do not. A definition can appear only once, while declarations can appear multiple times. Here, extern promises the compiler that the name a, as used in the code that follows, has been defined in another file. Recall the discussion of lvalues and rvalues in the previous article: for an array, the compiler already knows the address (plus, possibly, an offset), so it can read or write the memory at that address directly by unlocking it (using the key *), without first having to find the place where the address is stored. For a pointer, by contrast, it must first find the place where the address is stored, take out the address value, and only then unlock that address (still using the key *).
This is why extern char a[] is equivalent to extern char a[100]. Because it is only a declaration and allocates no space, the compiler does not need to know how many elements the array has. Both declarations tell the compiler that a is an array defined in another file, and that a also represents the address of the first element of the array, that is, the starting address of this memory block. The address of any other element can be expressed as base address + offset.
However, when you declare it as extern char *a, the compiler naturally treats a as a pointer variable, which occupies 4 bytes on a 32-bit system and 8 bytes on a 64-bit system. Those 4/8 bytes are expected to store an address, and the character data is expected to live at that address. The compiler knows that a is an array while compiling file 1, but not while compiling file 2: each source file is a separate translation unit, and the compiler works only from the types declared in the unit it is processing. Therefore, although the actual size of a is 100 bytes, in file 2 the compiler believes that a occupies only 4/8 bytes.
The compiler treats whatever is stored in a pointer variable as an address. Therefore, to access the character data, file 2 must first fetch the address stored in the pointer variable a.
Assume that the original array a stores 100 characters, e.g. {'B', 'A', 'A', 'D', 'D', 'A', 'A', 'D', 'F', 'E', 'E', 'D', 'B', 'A', 'B', 'E', ...}. On x86_64/AArch64 platforms, code that treats a as a pointer looks only at the first 8 bytes (the ASCII codes of the first eight characters).
It fetches those 8 bytes in a single load, as if they were a uintptr_t, and obtains 0x4441414444414142 in little-endian order. This is most likely not a valid address, and even if it happened to be valid, it would not be the address we want.
Ref pointer as array
According to the analysis above, declaring the array defined in file 1 as a pointer in file 2 goes wrong. Likewise, defining a pointer in file 1 and declaring it as an array in file 2 goes wrong too.
Suppose the character pointer is defined in file 1 as char *p = "abcdefg";
The external char array declaration referenced in file 2 is extern char p[];
In file 1, the compiler allocates 4/8 bytes of space and names it p; p stores the address of the first character of the string literal "abcdefg". The literal itself is stored in the static area of memory (probably the .rodata section), and its content must not be modified. In file 2, the compiler considers p to be an array of char, and the only storage actually behind it is those 4/8 bytes.
In file 2, the opposite of the array-to-pointer decay rule effectively takes place: the block of memory (one pointer wide) to which p is bound is treated as a char array of unknown size. C has no real pointer-to-array conversion, and this is not a variable-length array; the mismatched declaration simply makes file 2 reinterpret the storage. As a convenient mnemonic, call the address of that block _p, which satisfies _p = &p and *_p = p.
Suppose that on AArch64 the value stored in p is 0xaaaab4461078, which is laid out in memory as the byte sequence {0x78, 0x10, 0x46, 0xb4, 0xaa, 0xaa, 0x00, 0x00}. In file 2, the compiler treats this value (an address) as an array of 8 chars and reads the bytes one by one as char. They are not all valid characters, nor are they the contents of the memory block we want. If p[i] (actually _p[i]) is assigned a value (i ∈ [0,7]), the valid address originally saved in p is destroyed, making it impossible to find the memory it pointed to, and a later access through p may even trigger an illegal memory access.
test programs
Based on the analysis above, we can design a comprehensive test program.
In file 1 (crossdef.c), we define a char array and a pointer to a string literal. The function disclose_a() prints the address of a and its first character/byte; disclose_p() does the same for p.
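#include <stdio.h>

char a[100] = {'B', 'A', 'A', 'D', 'D', 'A', 'A', 'D', 'F', 'E', 'E', 'D', 'B', 'A', 'B', 'E'};
char *p = "abcdefg";

// Print the address of the array and its first byte.
void disclose_a(void) {
    printf("a = %p, *a = (%c, %#x)\n", (void *)a, *a, *a);
}

// Print the address stored in the pointer and the byte it points to.
void disclose_p(void) {
    printf("p = %p, *p = (%c, %#x)\n", (void *)p, *p, *p);
}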
File 2 (crossref.c) contains the main routine. It declares external references to a and p defined in file 1, but with deliberately mismatched types.
#include <stdio.h>
#include <stdint.h> // uintptr_t

// Deliberately mismatched: crossdef.c defines a as char[100] and p as char *.
extern char *a;
extern char p[];

extern void disclose_a(void);
extern void disclose_p(void);

void ref_array_as_pointer(void) {
    disclose_a();
    printf("a = %p\n", (void *)a);
    printf("*a = %#x\n", *a);
    // for (int i=0; i<8; i++) {
    //     printf("a[%d] = %#x\n", i, a[i]);
    // }
}

void ref_pointer_as_array(void) {
    disclose_p();
    printf("p = %p\n", (void *)p);
    printf("*p = %#x\n", *p);
    // for (int i=0; i<8; i++) {
    //     printf("p[%d] = %#x\n", i, p[i]);
    // }
}

int main(void) {
    int c;
    printf("input 'a' or 'p' to choose test case: ");
    // Skip any other input until a valid choice or EOF arrives.
    while ((c = getchar()) != EOF) {
        if (c == 'a') {
            ref_array_as_pointer();
            break;
        } else if (c == 'p') {
            ref_pointer_as_array();
            break;
        }
    }
    return 0;
}
Compile the two source files and link them into a single executable named array-pointer-crossref.
Run the ELF binary and enter a at the prompt.
$ ./array-pointer-crossref
input 'a' or 'p' to choose test case: a
a = 0xaaaab51e1010, *a = (B, 0x42)
a = 0x4441414444414142
[1] 44102 segmentation fault (core dumped) ./array-pointer-crossref
The following ASCII graph illustrates the address and layout of array a.
0xaaaab51e1010                          a
      |      +---------------------------^----------------------------+
       \    /                                                          \
       --------------------------------------------------------------------------
    ⚡️ | 'B' | 'A' | 'A' | 'D' | 'D' | 'A' | 'A' | 'D' | ... | xxx | ⚡️
       --------------------------------------------------------------------------
         a[0]  a[1]  a[2]  a[3]  a[4]  a[5]  a[6]  a[7]  ...  a[99]
Cross reference array as pointer
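In fact, the actual type of a is char[100]. File 2 (crossref.c) effectively forces a type reinterpretation, although no cast appears in the source. When a is treated as an ordinary pointer, whatever it contains is taken as an address. Since the pointer width (__SIZEOF_POINTER__) is 8 on AArch64, the first 8 characters/bytes are grouped as one unit, forming the address 0x4441414444414142.
Accessing this invalid address raises a segmentation fault (SIGSEGV). Strictly speaking, declaring the same object with incompatible types in different translation units is undefined behavior in C, so the crash is what happens on this platform rather than something the language guarantees.
Run the ELF binary again and enter p: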
$ ./array-pointer-crossref
input 'a' or 'p' to choose test case: p
p = 0xaaaacf3009d8, *p = (a, 0x61)
p = 0xaaaacf311078
*p = 0xd8
Type rax2 -x 0xaaaacf3009d8 to output it in hexpairs, or rax2 -c 0xaaaacf3009d8 to output it as a C string.
The following ASCII graph illustrates the address and layout of the string literal (the character array labeled a) and the pointer p.
                                                        a
       +-------------------+          +-----------------^-----------------+
       ↑                   ↓         /                                     \
+--------------+           -----------------------------------------------------
|0xaaaacf3009d8|        ⚡️ | 'a' | 'b' | 'c' | 'd' | 'e' | 'f' | 'g' | ⚡️
+--------------+           -----------------------------------------------------
       p       ↑          /  a[0]  a[1]  a[2]  a[3]  a[4]  a[5]  a[6]
0xaaaacf311078 |      0xaaaacf3009d8
               |
+--------------+
|0xaaaacf311078|
+--------------+
       _p
Cross reference pointer as array
The pointer variable p lives at 0xaaaacf311078. When it is treated as an array, its content 0xaaaacf3009d8 (the address of the string literal) is split into individual chars. By the usual rule for an array name used as an rvalue, p yields the address of its first element, which is 0xaaaacf311078. So *p reads the first character/byte stored there and returns 0xd8, because the least significant byte comes first on a little-endian machine.
This case is all about p and has nothing to do with the array a from the first test.
The content of this article may be a bit confusing and brain-bending, but if you understand it thoroughly, you will really get to grips with the concepts and essence of arrays and pointers. At least, that's how it feels to me. I hope it helps you a little.
As usual, let's see what the compiler does under the hood. This will help our understanding from the ground up.
Type r2 -Ad array-pointer-crossref to launch debugging with radare2. See radare2 basics - embark, reloc puts@plt via GOT - r2 debug and C Pointer Explanation in armasm for some references if you're not familiar with r2.
Aside: The Address Space Layout Randomisation (ASLR) mechanism ensures that the program is loaded at a different address each time it is run, to mitigate exploits.
list sections
Use readelf, objdump or rabin2 to display the section headers statically.
readelf -SW array-pointer-crossref
objdump -hw array-pointer-crossref
rabin2 -S array-pointer-crossref
After starting a debug session with r2, we can use the iS command to list sections:
[0xaaaad08408c0]> iS
[Sections]
nth paddr size vaddr vsize perm type name
―――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
0 0x00000000 0x0 0x00000000 0x0 ---- NULL
1 0x00000238 0x1b 0xaaaad0840238 0x1b -r-- PROGBITS .interp
2 0x00000254 0x24 0xaaaad0840254 0x24 -r-- NOTE .note.gnu.build-id
3 0x00000278 0x20 0xaaaad0840278 0x20 -r-- NOTE .note.ABI-tag
4 0x00000298 0x1c 0xaaaad0840298 0x1c -r-- GNU_HASH .gnu.hash
5 0x000002b8 0x108 0xaaaad08402b8 0x108 -r-- DYNSYM .dynsym
6 0x000003c0 0x9c 0xaaaad08403c0 0x9c -r-- STRTAB .dynstr
7 0x0000045c 0x16 0xaaaad084045c 0x16 -r-- GNU_VERSYM .gnu.version
8 0x00000478 0x30 0xaaaad0840478 0x30 -r-- GNU_VERNEED .gnu.version_r
9 0x000004a8 0x108 0xaaaad08404a8 0x108 -r-- RELA .rela.dyn
10 0x000005b0 0x90 0xaaaad08405b0 0x90 -r-- RELA .rela.plt
11 0x00000640 0x18 0xaaaad0840640 0x18 -r-x PROGBITS .init
12 0x00000660 0x80 0xaaaad0840660 0x80 -r-x PROGBITS .plt
13 0x00000700 0x2b8 0xaaaad0840700 0x2b8 -r-x PROGBITS .text
14 0x000009b8 0x14 0xaaaad08409b8 0x14 -r-x PROGBITS .fini
15 0x000009d0 0x97 0xaaaad08409d0 0x97 -r-- PROGBITS .rodata
16 0x00000a68 0x5c 0xaaaad0840a68 0x5c -r-- PROGBITS .eh_frame_hdr
17 0x00000ac8 0x12c 0xaaaad0840ac8 0x12c -r-- PROGBITS .eh_frame
18 0x00000d78 0x8 0xaaaad0850d78 0x8 -rw- INIT_ARRAY .init_array
19 0x00000d80 0x8 0xaaaad0850d80 0x8 -rw- FINI_ARRAY .fini_array
20 0x00000d88 0x1f0 0xaaaad0850d88 0x1f0 -rw- DYNAMIC .dynamic
21 0x00000f78 0x88 0xaaaad0850f78 0x88 -rw- PROGBITS .got
22 0x00001000 0x80 0xaaaad0851000 0x80 -rw- PROGBITS .data
23 0x00001080 0x0 0xaaaad0851080 0x8 -rw- NOBITS .bss
24 0x00001080 0x2b 0x00000000 0x2b ---- PROGBITS .comment
25 0x000010ab 0x60 0x00000000 0x60 ---- PROGBITS .debug_aranges
26 0x0000110b 0x251 0x00000000 0x251 ---- PROGBITS .debug_info
27 0x0000135c 0x1a1 0x00000000 0x1a1 ---- PROGBITS .debug_abbrev
28 0x000014fd 0x10e 0x00000000 0x10e ---- PROGBITS .debug_line
29 0x0000160b 0x120 0x00000000 0x120 ---- PROGBITS .debug_str
30 0x0000172b 0x68 0x00000000 0x68 ---- PROGBITS .debug_line_str
31 0x00001798 0xa08 0x00000000 0xa08 ---- SYMTAB .symtab
32 0x000021a0 0x2ac 0x00000000 0x2ac ---- STRTAB .strtab
33 0x0000244c 0x14a 0x00000000 0x14a ---- STRTAB .shstrtab
telescope .rodata
Use readelf or objdump to display the content of a specific section statically.
While debugging with radare2, use xr to telescope the .rodata section.
[0xaaaad08408c0]> xr $w*(`iS,name/eq/.rodata ~.rodata[2]`/8) @ `iS,name/eq/.rodata ~.rodata[3]`
0xaaaad08409d0 0x0000000000020001 ........ @ obj._IO_stdin_used 131073
0xaaaad08409d8 0x0067666564636261 abcdefg. @ str.abcdefg ascii ('a')
0xaaaad08409e0 0x202c7025203d2061 a = %p, @ str.a___p__a____c___x__n ascii ('a')
0xaaaad08409e8 0x632528203d20612a *a = (%c ascii ('*')
0xaaaad08409f0 0x000a29782325202c , %#x)..
0xaaaad08409f8 0x202c7025203d2070 p = %p, @ str.p___p__p____c___x__n ascii ('p')
0xaaaad0840a00 0x632528203d20702a *p = (%c ascii ('*')
0xaaaad0840a08 0x000a29782325202c , %#x)..
0xaaaad0840a10 0x000a7025203d2061 a = %p.. @ str.a___p_n
0xaaaad0840a18 0x782325203d20612a *a = %#x @ str.a___x_n ascii ('*')
0xaaaad0840a20 0x000000000000000a ........ 10 x10,d10
0xaaaad0840a28 0x000a7025203d2070 p = %p.. @ str.p___p_n
0xaaaad0840a30 0x782325203d20702a *p = %#x @ str.p___x_n ascii ('*')
0xaaaad0840a38 0x000000000000000a ........ 10 x10,d10
0xaaaad0840a40 0x6127207475706e69 input 'a @ str.input_a_or_p_to_choose_test_case: ascii ('i')
0xaaaad0840a48 0x27702720726f2027 ' or 'p' ascii (''')
0xaaaad0840a50 0x6f6f6863206f7420 to choo ascii (' ')
0xaaaad0840a58 0x2074736574206573 se test ascii ('s')
telescope .data
Use readelf or objdump to display the content of a specific section statically.
While debugging with radare2, use xr to telescope the .data section, see reloc puts@plt via GOT - r2 debug.
[0xaaaad08408c0]> xr $w*(`iS,name/eq/.data ~.data[2]`/8) @ `iS,name/eq/.data ~.data[3]`
0xaaaad0851000 ..[ null bytes ].. 00000000 loc.__data_start
0xaaaad0851008 0x0000aaaad0851008 ........ @ obj.__dso_handle /home/pifan/Projects/cpp/pointer/array-pointer-crossref .data __dso_handle program R W 0xaaaad0851008
0xaaaad0851010 0x4441414444414142 BAADDAAD @ obj.a x0,d0 ascii ('B')
0xaaaad0851018 0x4542414244454546 FEEDBABE ascii ('F')
0xaaaad0851020 ..[ null bytes ].. 00000000
0xaaaad0851078 0x0000aaaad08409d8 ........ @ obj.p /home/pifan/Projects/cpp/pointer/array-pointer-crossref .rodata str.abcdefg program R X 'invalid' 'array-pointer-crossref' abcdefg
dm memory maps
List the memory maps of the current/target process.
loc.__data_start corresponds to the .data section.
[0xaaaad08408c0]> dm
# segment.LOAD0
0x0000aaaad0840000 - 0x0000aaaad0841000 * usr 4K s r-x /home/pifan/Projects/cpp/pointer/array-pointer-crossref /home/pifan/Projects/cpp/pointer/array-pointer-crossref ; map._home_pifan_Projects_cpp_pointer_array_pointer_crossref.r_x
# segment.LOAD1
0x0000aaaad0850000 - 0x0000aaaad0851000 - usr 4K s r-- /home/pifan/Projects/cpp/pointer/array-pointer-crossref /home/pifan/Projects/cpp/pointer/array-pointer-crossref ; map._home_pifan_Projects_cpp_pointer_array_pointer_crossref.rw_
0x0000aaaad0851000 - 0x0000aaaad0852000 - usr 4K s rw- /home/pifan/Projects/cpp/pointer/array-pointer-crossref /home/pifan/Projects/cpp/pointer/array-pointer-crossref ; loc.__data_start
0x0000aaaaf3293000 - 0x0000aaaaf32b4000 - usr 132K s rw- [heap] [heap]
0x0000ffffa9f50000 - 0x0000ffffaa0d8000 - usr 1.5M s r-x /usr/lib/aarch64-linux-gnu/libc.so.6 /usr/lib/aarch64-linux-gnu/libc.so.6
0x0000ffffaa0d8000 - 0x0000ffffaa0e7000 - usr 60K s --- /usr/lib/aarch64-linux-gnu/libc.so.6 /usr/lib/aarch64-linux-gnu/libc.so.6
0x0000ffffaa0e7000 - 0x0000ffffaa0eb000 - usr 16K s r-- /usr/lib/aarch64-linux-gnu/libc.so.6 /usr/lib/aarch64-linux-gnu/libc.so.6
0x0000ffffaa0eb000 - 0x0000ffffaa0ed000 - usr 8K s rw- /usr/lib/aarch64-linux-gnu/libc.so.6 /usr/lib/aarch64-linux-gnu/libc.so.6
0x0000ffffaa0ed000 - 0x0000ffffaa0f9000 - usr 48K s rw- unk0 unk0
0x0000ffffaa116000 - 0x0000ffffaa141000 - usr 172K s r-x /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1 /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1 ; map._usr_lib_aarch64_linux_gnu_ld_linux_aarch64.so.1.r_x
0x0000ffffaa14b000 - 0x0000ffffaa14d000 - usr 8K s rw- unk1 unk1
0x0000ffffaa14d000 - 0x0000ffffaa14f000 - usr 8K s r-- [vvar] [vvar] ; map._vvar_.r__
0x0000ffffaa14f000 - 0x0000ffffaa150000 - usr 4K s r-x [vdso] [vdso] ; map._vdso_.r_x
0x0000ffffaa150000 - 0x0000ffffaa152000 - usr 8K s r-- /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1 /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1 ; map._usr_lib_aarch64_linux_gnu_ld_linux_aarch64.so.1.rw_
0x0000ffffaa152000 - 0x0000ffffaa154000 - usr 8K s rw- /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1 /usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1
0x0000ffffc678d000 - 0x0000ffffc67ae000 - usr 132K s rw- [stack] [stack] ; map._stack_.rw_
sym.disclose_a
printf("a = %p, *a = (%c, %#x)\n", a, *a, *a);: rvalue of array name and dereference.
- +12/+28/+48: x0/x0/x1 = loc.data_start+0x10 = 0x0000aaaad0851010, see dm
- +16/+32: w0 = ldrb [0x0000aaaad0851010] = 0x42('B'), see .data
- +20/+36~+40: w1 = w0; w3 = w0, w2 = w1 (=w0);
- +52/+56: x0 = segment.LOAD0+0x9e0 = 0xaaaad08409e0, see dm and .rodata
params: x0, x1, w2, w3
[0xaaaad08408c0]> pdf @ sym.disclose_a
┌ 76: sym.disclose_a ();
│ sym.disclose_a + 0 0xaaaad0840814 b fd7bbfa9 stp x29, x30, [sp, -0x10]!
│ sym.disclose_a + 4 0xaaaad0840818 fd030091 mov x29, sp
│ sym.disclose_a + 8 0xaaaad084081c 800000b0 adrp x0, loc.data_start
│ sym.disclose_a + 12 0xaaaad0840820 00400091 add x0, x0, 0x10
│ sym.disclose_a + 16 0xaaaad0840824 00004039 ldrb w0, [x0]
│ sym.disclose_a + 20 0xaaaad0840828 e103002a mov w1, w0
│ sym.disclose_a + 24 0xaaaad084082c 800000b0 adrp x0, loc.data_start
│ sym.disclose_a + 28 0xaaaad0840830 00400091 add x0, x0, 0x10
│ sym.disclose_a + 32 0xaaaad0840834 00004039 ldrb w0, [x0]
│ sym.disclose_a + 36 0xaaaad0840838 e303002a mov w3, w0
│ sym.disclose_a + 40 0xaaaad084083c e203012a mov w2, w1
│ sym.disclose_a + 44 0xaaaad0840840 800000b0 adrp x0, loc.data_start
│ sym.disclose_a + 48 0xaaaad0840844 01400091 add x1, x0, 0x10
│ sym.disclose_a + 52 0xaaaad0840848 00000090 adrp x0, segment.LOAD0
│ sym.disclose_a + 56 0xaaaad084084c 00802791 add x0, x0, 0x9e0
│ sym.disclose_a + 60 0xaaaad0840850 a0ffff97 bl sym.imp.printf
│ sym.disclose_a + 64 0xaaaad0840854 1f2003d5 nop
│ sym.disclose_a + 68 0xaaaad0840858 fd7bc1a8 ldp x29, x30, [sp], 0x10
└ sym.disclose_a + 72 0xaaaad084085c c0035fd6 ret
sym.ref_array_as_pointer
- extern char *a;: see the ASCII graph above. char **_a = &a = 0x0000aaaad0850000+0xfe8 = 0x0000aaaad0850fe8
- printf("a = %p\n", a);: rvalue of pointer
  - +16: x0 = *_a = a = 0x0000aaaad0851010
  - +20: x0 = *a = 0x4441414444414142("BAADDAAD")
  - +24: x1 = x0
  - +28~+32: x0 = segment.LOAD0+0xa10 = 0xaaaad0840a10
- printf("*a = %#x\n", *a);: dereference pointer
  - +44: x0 = *_a = a = 0x0000aaaad0851010
  - +48: x0 = *a = 0x4441414444414142("BAADDAAD")
  - +52: w0 = ldrb [0x4441414444414142] => SIGNAL 11 (SIGSEGV), segmentation fault
[0xaaaad08408c0]> pdf @ sym.ref_array_as_pointer
┌ 84: sym.ref_array_as_pointer ();
│ sym.ref_array_as_pointer + 0 0xaaaad08408b4 b fd7bbfa9 stp x29, x30, [sp, -0x10]!
│ sym.ref_array_as_pointer + 4 0xaaaad08408b8 fd030091 mov x29, sp
│ sym.ref_array_as_pointer + 8 0xaaaad08408bc d6ffff97 bl sym.disclose_a
│ sym.ref_array_as_pointer + 12 ;-- x30:
│ sym.ref_array_as_pointer + 12 ;-- d30:
│ sym.ref_array_as_pointer + 12 0xaaaad08408c0 b 80000090 adrp x0, map._home_pifan_Projects_cpp_pointer_array_pointer_crossref.rw_
│ sym.ref_array_as_pointer + 16 0xaaaad08408c4 00f447f9 ldr x0, [x0, 0xfe8]
│ sym.ref_array_as_pointer + 20 0xaaaad08408c8 000040f9 ldr x0, [x0]
│ sym.ref_array_as_pointer + 24 ;-- pc:
│ sym.ref_array_as_pointer + 24 0xaaaad08408cc e10300aa mov x1, x0
│ sym.ref_array_as_pointer + 28 0xaaaad08408d0 00000090 adrp x0, segment.LOAD0
│ sym.ref_array_as_pointer + 32 0xaaaad08408d4 00402891 add x0, x0, 0xa10
│ sym.ref_array_as_pointer + 36 0xaaaad08408d8 7effff97 bl sym.imp.printf
│ sym.ref_array_as_pointer + 40 0xaaaad08408dc 80000090 adrp x0, map._home_pifan_Projects_cpp_pointer_array_pointer_crossref.rw_
│ sym.ref_array_as_pointer + 44 0xaaaad08408e0 00f447f9 ldr x0, [x0, 0xfe8]
│ sym.ref_array_as_pointer + 48 0xaaaad08408e4 000040f9 ldr x0, [x0]
│ sym.ref_array_as_pointer + 52 0xaaaad08408e8 00004039 ldrb w0, [x0]
│ sym.ref_array_as_pointer + 56 0xaaaad08408ec e103002a mov w1, w0
│ sym.ref_array_as_pointer + 60 0xaaaad08408f0 00000090 adrp x0, segment.LOAD0
│ sym.ref_array_as_pointer + 64 0xaaaad08408f4 00602891 add x0, x0, 0xa18
│ sym.ref_array_as_pointer + 68 0xaaaad08408f8 76ffff97 bl sym.imp.printf
│ sym.ref_array_as_pointer + 72 0xaaaad08408fc 1f2003d5 nop
│ sym.ref_array_as_pointer + 76 0xaaaad0840900 fd7bc1a8 ldp x29, x30, [sp], 0x10
└ sym.ref_array_as_pointer + 80 0xaaaad0840904 c0035fd6 ret
sym.disclose_p
printf("p = %p, *p = (%c, %#x)\n", p, *p, *p);: rvalue of pointer and dereference.
- +12/+24/+44: x0 = loc.data_start+0x78 = 0x0000aaaad0851078, see dm
- +16/+28/+48: x1/x0/x0 = ldr [0x0000aaaad0851078] = 0x0000aaaad08409d8, see .data, link to .rodata
- +32/+52: w0 = ldrb [0x0000aaaad08409d8] = 0x61('a'), see .rodata
- +36/+56: w2 = w0; w3 = w0
- +60/+64: x0 = segment.LOAD0+0x9f8 = 0xaaaad08409f8, see dm and .rodata
params: x0, x1, w2, w3
[0xaaaad08408c0]> pdf @ sym.disclose_p
┌ 84: sym.disclose_p ();
│ sym.disclose_p + 0 0xaaaad0840860 b fd7bbfa9 stp x29, x30, [sp, -0x10]!
│ sym.disclose_p + 4 0xaaaad0840864 fd030091 mov x29, sp
│ sym.disclose_p + 8 0xaaaad0840868 800000b0 adrp x0, loc.data_start
│ sym.disclose_p + 12 0xaaaad084086c 00e00191 add x0, x0, 0x78
│ sym.disclose_p + 16 0xaaaad0840870 010040f9 ldr x1, [x0]
│ sym.disclose_p + 20 0xaaaad0840874 800000b0 adrp x0, loc.data_start
│ sym.disclose_p + 24 0xaaaad0840878 00e00191 add x0, x0, 0x78
│ sym.disclose_p + 28 0xaaaad084087c 000040f9 ldr x0, [x0]
│ sym.disclose_p + 32 0xaaaad0840880 00004039 ldrb w0, [x0]
│ sym.disclose_p + 36 0xaaaad0840884 e203002a mov w2, w0
│ sym.disclose_p + 40 0xaaaad0840888 800000b0 adrp x0, loc.data_start
│ sym.disclose_p + 44 0xaaaad084088c 00e00191 add x0, x0, 0x78
│ sym.disclose_p + 48 0xaaaad0840890 000040f9 ldr x0, [x0]
│ sym.disclose_p + 52 0xaaaad0840894 00004039 ldrb w0, [x0]
│ sym.disclose_p + 56 0xaaaad0840898 e303002a mov w3, w0
│ sym.disclose_p + 60 0xaaaad084089c 00000090 adrp x0, segment.LOAD0
│ sym.disclose_p + 64 0xaaaad08408a0 00e02791 add x0, x0, 0x9f8
│ sym.disclose_p + 68 0xaaaad08408a4 8bffff97 bl sym.imp.printf
│ sym.disclose_p + 72 0xaaaad08408a8 1f2003d5 nop
│ sym.disclose_p + 76 0xaaaad08408ac fd7bc1a8 ldp x29, x30, [sp], 0x10
└ sym.disclose_p + 80 0xaaaad08408b0 c0035fd6 ret
sym.ref_pointer_as_array
- extern char p[];: see the ASCII graph above. char **_p = &p = segment.LOAD1+0xfd8 = 0x0000aaaad0850fd8
- printf("p = %p\n", p);: rvalue of array name = &p[0]
  - +16: x1 = ldr [_p] = 0xaaaad0851078
- printf("*p = %#x\n", *p);: dereference array name
  - +36: x0 = ldr [_p] = 0xaaaad0851078
  - +40: w0 = ldrb [0xaaaad0851078] = first byte of 0x0000aaaad08409d8 = 0xd8
  - +44: w1 = w0
[0xaaaad08408c0]> pdf @ sym.ref_pointer_as_array
┌ 72: sym.ref_pointer_as_array ();
│ sym.ref_pointer_as_array + 0 0xaaaad0840908 b fd7bbfa9 stp x29, x30, [sp, -0x10]!
│ sym.ref_pointer_as_array + 4 0xaaaad084090c fd030091 mov x29, sp
│ sym.ref_pointer_as_array + 8 0xaaaad0840910 d4ffff97 bl sym.disclose_p
│ sym.ref_pointer_as_array + 12 0xaaaad0840914 80000090 adrp x0, map._home_pifan_Projects_cpp_pointer_array_pointer_crossref.rw_
│ sym.ref_pointer_as_array + 16 0xaaaad0840918 01ec47f9 ldr x1, [x0, 0xfd8]
│ sym.ref_pointer_as_array + 20 0xaaaad084091c 00000090 adrp x0, segment.LOAD0
│ sym.ref_pointer_as_array + 24 0xaaaad0840920 00a02891 add x0, x0, 0xa28
│ sym.ref_pointer_as_array + 28 0xaaaad0840924 6bffff97 bl sym.imp.printf
│ sym.ref_pointer_as_array + 32 0xaaaad0840928 80000090 adrp x0, map._home_pifan_Projects_cpp_pointer_array_pointer_crossref.rw_
│ sym.ref_pointer_as_array + 36 0xaaaad084092c 00ec47f9 ldr x0, [x0, 0xfd8]
│ sym.ref_pointer_as_array + 40 0xaaaad0840930 00004039 ldrb w0, [x0]
│ sym.ref_pointer_as_array + 44 0xaaaad0840934 e103002a mov w1, w0
│ sym.ref_pointer_as_array + 48 0xaaaad0840938 00000090 adrp x0, segment.LOAD0
│ sym.ref_pointer_as_array + 52 0xaaaad084093c 00c02891 add x0, x0, 0xa30
│ sym.ref_pointer_as_array + 56 0xaaaad0840940 64ffff97 bl sym.imp.printf
│ sym.ref_pointer_as_array + 60 0xaaaad0840944 1f2003d5 nop
│ sym.ref_pointer_as_array + 64 0xaaaad0840948 fd7bc1a8 ldp x29, x30, [sp], 0x10
└ sym.ref_pointer_as_array + 68 0xaaaad084094c c0035fd6 ret
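The subject matter of this article is drawn from the classic book 《C语言深度解剖(第3版)》 (roughly, "C Language In-Depth Dissection", 3rd edition), Chapter 4 Pointers and Arrays - 4.3 The Love-Hate Relationship Between Pointers and Arrays - 4.3.3 Definitions and Declarations of Pointers and Arrays.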
Sincere thanks to the original author!