2021

Struct Alignment Specify

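In the previous post, Struct Alignment Rule, we went through the "address boundary alignment" rules that govern struct storage layout.
This post shows how to change the default alignment parameters using the extensions provided by the gcc and msvc compilers and some C/C++ language features, then tests and analyzes their effects.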

  1. The packed attribute specifies that a structure member should have the smallest possible alignment.
  2. The aligned attribute specifies a minimum alignment for the variable or structure field, measured in bytes.
  3. -fpack-struct[=n] and #pragma pack(n) specify the maximum alignment; as a result, structure members can potentially be unaligned.

Struct Alignment Rule

An object doesn't just need enough storage to hold its representation. In addition, on some machine architectures, the bytes used to hold it must have proper alignment for the hardware to access it efficiently.

Where alignment most often becomes visible is in object layouts: sometimes structs contain "holes" to improve alignment.

C/C++ Memory Alignment

One of the low-level features of C/C++ is the ability to specify the precise alignment of objects in memory to take maximum advantage of a specific hardware architecture. By default, the compiler aligns class and struct members on their size value.

Memory Address Alignment

One of the low-level features of C/C++ is the ability to specify the precise alignment of objects in memory to take maximum advantage of a specific hardware architecture. By default, the compiler aligns class and struct members on their size value.

x86's ill-timed WORD

The word size is the computer's preferred size for moving units of information around; technically it's the width of your processor's registers. It reflects the amount of data that can be transmitted between memory and the processor in one chunk. Likewise, it may reflect the size of data that can be manipulated by the CPU's ALU in one cycle.

In the x86 world, however, "word" continues to designate a 16-bit quantity. Microsoft's Windows API maintains the programming language definition of WORD as 16 bits, despite the fact that the API may be used on a 32- or 64-bit x86 processor.

Machine Word

In computing, a word is the natural unit of data used by a particular processor design. The term word refers to the standard number of bits that are manipulated as a unit by any particular CPU.

The word size is the computer's preferred size for moving units of information around; technically it's the width of your processor's registers. It reflects the amount of data that can be transmitted between memory and the processor in one chunk. Likewise, it may reflect the size of data that can be manipulated by the CPU's ALU in one cycle.

Data Models

In 32-bit programs, pointers and data types such as integers generally have the same length. This is not necessarily true on 64-bit machines. Mixing data types in programming languages such as C and its descendants such as C++ and Objective-C may thus work on 32-bit implementations but not on 64-bit implementations.

Byte Order (Endianness)

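In Binary —— Bitset & Bytes, we printed a bitset and a byte array to get a direct feel for binary bit patterns (bit pattern & binary representation). That left three open questions:

  1. hex(2010) prints the hexadecimal value 0x7da, while hexdump(2010) prints the in-memory byte array {0xda, 0x7}. Why is the byte order different?
  2. Why is short array [0], the short integer formed from each pair of bytes, 0x3130 rather than 0x3031?
  3. What does the program test-conversion-narrow-down.c output?

The answers to all three questions come down to byte order.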

The C Memory Model

Pointers present us with a certain abstraction of the environment and state in which our program is executed, the C memory model. We may apply the unary operator & to (almost) all objects to retrieve their address and use it to inspect and change the state of our execution.

Binary —— Bitset & Bytes

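A numeral system is a method of counting. For humans with 10 fingers, decimal notation comes naturally; modern computers instead use the binary system, storing and processing information as two-valued signals.

A string of binary digits is grouped at a fixed length (8) into the basic meaningful unit of storage, the byte. Multiple bytes (typically 1, 2, 4, or 8) can be combined into the basic arithmetic types char, short, int/float, and long/double, or into compound and user-defined types.

In memory or on disk, instructions and data are indistinguishable: both are binary information. While running, the CPU interprets some of that information as instructions and some as data, giving different meanings to the same kind of information.

This relates to the Abstract State Machine: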

  • a value : what state are we in
  • the type : what this state represents
  • the representation : how state is distinguished