# HPC

# cache line size

Cache lines are assumed to be N bytes long, depending on the architecture:

  • On x86-64, aarch64, and powerpc64, N = 128.
  • On arm, mips, mips64, and riscv64, N = 32.
  • On s390x, N = 256.
  • On all others, N = 64.

Note that N is just a reasonable guess and is not guaranteed to match the actual cache line
length of the machine the program is running on. On modern Intel architectures, the spatial
prefetcher pulls pairs of 64-byte cache lines at a time, so we pessimistically assume that
cache lines are 128 bytes long.
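
The padding idea described above can be sketched in C++ as follows. This is a minimal illustration, not the implementation of any particular library: `kAssumedCacheLine`, `CachePadded`, `Counters`, and `field_distance` are hypothetical names, and 128 is simply the pessimistic guess listed above for x86-64.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Assumption: 128 bytes, the pessimistic guess for x86-64 from the list above.
constexpr std::size_t kAssumedCacheLine = 128;

// Hypothetical wrapper: aligns T to the assumed cache line size, which also
// rounds sizeof up to a multiple of that size.
template <typename T>
struct alignas(kAssumedCacheLine) CachePadded {
    T value;
};

// Two counters intended to be updated by different threads. Because each one
// occupies its own assumed cache line, they should not share a line.
struct Counters {
    CachePadded<std::atomic<std::uint64_t>> produced;
    CachePadded<std::atomic<std::uint64_t>> consumed;
};

static_assert(alignof(CachePadded<std::atomic<std::uint64_t>>) == kAssumedCacheLine,
              "wrapper must be aligned to the assumed cache line size");
static_assert(sizeof(Counters) == 2 * kAssumedCacheLine,
              "each counter must occupy exactly one assumed cache line");

// Byte distance between the two fields of a Counters object.
inline std::size_t field_distance(const Counters& c) {
    auto a = reinterpret_cast<std::uintptr_t>(&c.produced);
    auto b = reinterpret_cast<std::uintptr_t>(&c.consumed);
    return static_cast<std::size_t>(b - a);
}
```

C++17 also defines `std::hardware_destructive_interference_size` in `<new>`, but not every standard library ships it, so a hard-coded guess like the one above is still common.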

Reference table

Commands for checking physical cores, logical cores, and so on

https://zhuanlan.zhihu.com/cpu-cache

SMP vs AMP

# latency numbers every programmer should know

latency numbers

# memory order

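The memory-ordering rule that compiler developers and processor manufacturers follow is: reordering must not change the behavior of a single-threaded program.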

std::memory_order - cppreference.com

# acquire and release

acquire applies to read (load) operations, and release applies to write (store) operations.

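In multithreaded programming, a common pattern is to use one variable (a flag) to indicate whether some data is ready, for example a variable that indicates whether a global variable (the data) has been initialized.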

acquire guarantees that the instructions reading the data are not reordered before the read of the flag.

release guarantees that the instructions writing the data are not reordered after the write of the flag.
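
The flag-and-data pattern can be sketched with `std::atomic` as follows. This is a minimal example under stated assumptions: `payload`, `ready`, `producer`, `consumer`, and `run_once` are hypothetical names chosen for illustration.

```cpp
#include <atomic>
#include <thread>

int payload = 0;                  // the data: a plain, non-atomic variable
std::atomic<bool> ready{false};   // the flag that says the data is ready

void producer() {
    payload = 42;                                  // write the data
    // Write the flag. release: the data write above cannot be reordered
    // after this store.
    ready.store(true, std::memory_order_release);
}

int consumer() {
    // Read the flag. acquire: the data read below cannot be reordered
    // before this load.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;                                // read the data
}

// Resets the state, runs one producer and one consumer thread, and returns
// the value the consumer observed.
int run_once() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = 0;
    std::thread c([&seen] { seen = consumer(); });
    std::thread p(producer);
    p.join();
    c.join();
    return seen;
}
```

Once the acquire load observes the value written by the release store, the write to `payload` is guaranteed to be visible to the consumer, so reading the plain variable is not a data race. With `std::memory_order_relaxed` on both sides, that guarantee would not hold.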

# lock vs atomic performance comparison

lock vs atomic
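
One rough way to compare the two on a given machine is to time the same shared-counter workload with a `std::mutex` and with a `std::atomic`. The sketch below is an assumption-laden micro-benchmark, not a rigorous measurement: `time_increments`, `Comparison`, and `compare_counters` are hypothetical names, and the timings depend heavily on thread count, contention, compiler flags, and hardware.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Calls `increment` `iters` times on each of `threads` threads and returns
// the elapsed wall-clock time in milliseconds.
template <typename Increment>
double time_increments(int threads, int iters, Increment increment) {
    auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&increment, iters] {
            for (int i = 0; i < iters; ++i) increment();
        });
    }
    for (auto& th : pool) th.join();
    std::chrono::duration<double, std::milli> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

struct Comparison {
    std::uint64_t locked_total;   // final value of the mutex-protected counter
    std::uint64_t atomic_total;   // final value of the atomic counter
    double locked_ms;
    double atomic_ms;
};

Comparison compare_counters(int threads, int iters) {
    std::mutex m;
    std::uint64_t locked = 0;
    double locked_ms = time_increments(threads, iters, [&] {
        std::lock_guard<std::mutex> guard(m);   // lock-based increment
        ++locked;
    });

    std::atomic<std::uint64_t> counter{0};
    double atomic_ms = time_increments(threads, iters, [&] {
        counter.fetch_add(1, std::memory_order_relaxed);  // atomic increment
    });

    return {locked, counter.load(), locked_ms, atomic_ms};
}
```

Both variants must produce the same total (`threads * iters`); only the timings differ. Calling `compare_counters(4, 1000000)` and printing `locked_ms` and `atomic_ms` gives a first impression, which should be repeated several times before drawing conclusions.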