# HPC
# cache line size
Cache lines are assumed to be N bytes long, depending on the architecture:
- On x86-64, aarch64, and powerpc64, N = 128.
- On arm, mips, mips64, and riscv64, N = 32.
- On s390x, N = 256.
- On all others, N = 64.
Note that N is just a reasonable guess and is not guaranteed to match the actual cache line
length of the machine the program is running on. On modern Intel architectures, the spatial
prefetcher pulls pairs of 64-byte cache lines at a time, so we pessimistically assume that
cache lines are 128 bytes long.
https://zhuanlan.zhihu.com/cpu-cache
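
As a rough illustration of why the assumed line size matters, the sketch below pads two counters to 128 bytes each (the value listed above for x86-64) so that two threads updating them do not share a cache line. `Padded`, `Counters`, and `count_in_parallel` are hypothetical names made up for this example, and the hard-coded 128 is an assumption taken from the list above, not a value detected at runtime.

```rust
use std::mem::{align_of, size_of};
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

// Hypothetical wrapper: forces 128-byte alignment, so each wrapped value
// starts on its own (assumed) cache line and its size is rounded up to 128.
#[repr(align(128))]
struct Padded<T>(T);

// Two counters that would otherwise sit next to each other in memory.
struct Counters {
    a: Padded<AtomicU64>,
    b: Padded<AtomicU64>,
}

// Each thread increments only its own counter n times.
fn count_in_parallel(n: u64) -> (u64, u64) {
    let c = Counters {
        a: Padded(AtomicU64::new(0)),
        b: Padded(AtomicU64::new(0)),
    };
    thread::scope(|s| {
        s.spawn(|| {
            for _ in 0..n {
                c.a.0.fetch_add(1, Ordering::Relaxed);
            }
        });
        s.spawn(|| {
            for _ in 0..n {
                c.b.0.fetch_add(1, Ordering::Relaxed);
            }
        });
    });
    (c.a.0.load(Ordering::Relaxed), c.b.0.load(Ordering::Relaxed))
}

fn main() {
    // The padding is visible in the layout: 128 bytes per counter.
    assert_eq!(align_of::<Padded<AtomicU64>>(), 128);
    assert_eq!(size_of::<Counters>(), 256);
    assert_eq!(count_in_parallel(100_000), (100_000, 100_000));
}
```

The result is the same with or without padding; only the amount of cache-line contention between the two threads should differ, and that depends on the actual hardware.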
# latency numbers every programmer should know
# memory order
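The memory ordering rule that compiler developers and processor manufacturers follow is: the behavior of a single-threaded program must not change.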
std::memory_order - cppreference.com
# acquire and release
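acquire applies to read (load) operations, and release applies to write (store) operations.
In multithreaded programming, a common pattern is to use one variable (a flag) to indicate whether some data is ready, for example a variable that records whether a global variable (the data) has been initialized.
acquire guarantees that the instructions reading the data are not reordered before the read of the flag.
release guarantees that the instructions writing the data are not reordered after the write of the flag.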
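
The flag pattern described above can be sketched as follows. This is a minimal example, not a definitive implementation: `Shared` and `publish_and_read` are hypothetical names, and the data is stored in an atomic with `Relaxed` ordering only so that the example needs no `unsafe` code.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical shared state: `data` is the payload, `ready` is the flag.
struct Shared {
    data: AtomicU64,
    ready: AtomicBool,
}

// One thread writes the data and then sets the flag; another waits for the
// flag and then reads the data. Returns what the reader observed.
fn publish_and_read(value: u64) -> u64 {
    let shared = Arc::new(Shared {
        data: AtomicU64::new(0),
        ready: AtomicBool::new(false),
    });

    let writer = {
        let s = Arc::clone(&shared);
        thread::spawn(move || {
            // Write the data first ...
            s.data.store(value, Ordering::Relaxed);
            // ... then set the flag. Release keeps the data write from
            // being reordered after this store.
            s.ready.store(true, Ordering::Release);
        })
    };

    let reader = {
        let s = Arc::clone(&shared);
        thread::spawn(move || {
            // Acquire keeps the data read below from being reordered
            // before this load of the flag.
            while !s.ready.load(Ordering::Acquire) {
                std::hint::spin_loop();
            }
            s.data.load(Ordering::Relaxed)
        })
    };

    writer.join().unwrap();
    reader.join().unwrap()
}

fn main() {
    // Once the reader sees the flag set, it also sees the published value.
    assert_eq!(publish_and_read(42), 42);
}
```

If both flag operations were `Relaxed` instead, the reader could in principle see `ready == true` and still read the old value of `data`.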