ksco's work log
The content mostly depends on whatever I'm working on; right now that's mainly scattered topics like emulators / DBT.
Super Morning-O Bros! 🌞
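New desk
Happy
This machine feels really powerful
Planning to grind through all 8 days of the National Day holiday. I'll out-grind you all.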
http://www.cs.columbia.edu/~luca/research/cota_CGO17.pdf

Cross-ISA Machine Emulation for Multicores
^ If the host also supports LL/SC, one may think of using them for emulating the target’s LL/SC. This is dangerous, however, because most processors constrain the instructions that can appear between an LL/SC pair. If these restrictions are not respected, the store might fail spuriously. The extra overhead of dynamic translation, such as TLB lookups and register spills, may thus cause the store to fail forever.
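There are roughly four ways to emulate LL/SC across architectures:

1. Stop all other CPUs while the sequence executes, then resume them afterwards (the paper says this is what QEMU "currently" does)
2. Emulate with CAS; this has the ABA problem, but reportedly it "almost never matters for real programs"
3. Monitor stores from all CPUs
4. If the host hardware supports hardware transactional memory, LL/SC can be emulated precisely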
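To make option 2 and its ABA caveat concrete, here is a small single-threaded Python model. It is only a sketch: the `Memory` and `Cpu` classes and `aba_demo` are made-up names for illustration, and the "CAS" is atomic only because nothing runs concurrently in the toy.

```python
class Memory:
    """Toy word-addressed guest memory shared by all emulated CPUs."""

    def __init__(self):
        self.words = {}

    def load(self, addr):
        return self.words.get(addr, 0)

    def compare_and_swap(self, addr, expected, new):
        """Stand-in for a host CAS (atomic only because the toy is single-threaded)."""
        if self.words.get(addr, 0) == expected:
            self.words[addr] = new
            return True
        return False


class Cpu:
    """Emulated CPU whose LL/SC is built on top of CAS (option 2)."""

    def __init__(self, mem):
        self.mem = mem
        self.ll_addr = None   # address recorded by the last LL
        self.ll_value = None  # value observed by the last LL

    def load_linked(self, addr):
        self.ll_addr = addr
        self.ll_value = self.mem.load(addr)
        return self.ll_value

    def store_conditional(self, addr, new):
        """Return 0 on success, 1 on failure."""
        ok = (self.ll_addr == addr
              and self.mem.compare_and_swap(addr, self.ll_value, new))
        self.ll_addr = None  # the reservation is consumed either way
        return 0 if ok else 1


def aba_demo():
    """CPU B changes the word 1 -> 2 -> 1 between CPU A's LL and SC."""
    mem = Memory()
    mem.words[0x100] = 1
    a, b = Cpu(mem), Cpu(mem)
    a.load_linked(0x100)
    b.load_linked(0x100)
    b.store_conditional(0x100, 2)
    b.load_linked(0x100)
    b.store_conditional(0x100, 1)
    # Real LL/SC would fail here; the CAS version only sees the same value again.
    return a.store_conditional(0x100, 3), mem.load(0x100)
```

In this model A's SC succeeds even though the word was written twice in between, which is exactly the ABA case the paper says almost never matters in practice.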
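Same-architecture emulation is basically similar. DynamoRIO AArch64 offers two approaches:

1. Default: emulate with CAS
2. Optional: a super-instruction that packs the whole LL/SC block into a single instruction (precise and fast, but with plenty of drawbacks otherwise)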
Basic codegen, not yet accounting for the stolen reg and tp reg; things get a bit more complicated once those two are taken into account.


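# ---> lr.w/d.aq?.rl? rd, (rs1)
sd scratch1, [scratch_1_slot]   # spill scratch1
fence rl?
ld rd, 0(rs1)                   # plain load stands in for the lr (lw for lr.w)
fence aq?
sd rs1, [tls_lrsc_addr]         # record the reserved address (assumes rd != rs1)
li scratch1, SIZE
sd scratch1, [tls_lrsc_size]    # record the access size
sd rd, [tls_lrsc_value]         # record the loaded value
ld scratch1, [scratch_1_slot]   # restore scratch1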


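# ---> sc.w/d.aq?.rl? rd, rs2, (rs1)
sd scratch1, [scratch_1_slot]   # spill scratch registers
sd scratch2, [scratch_2_slot]
ld scratch1, [tls_lrsc_addr]
bne scratch1, rs1, fail         # address differs from the lr
ld scratch1, [tls_lrsc_size]
li scratch2, SIZE
bne scratch1, scratch2, fail    # size differs from the lr
amoswap.aq?.rl? rd, rs2, (rs1)  # rd = old value, memory = rs2
ld scratch1, [tls_lrsc_value]
xor rd, rd, scratch1
snez rd, rd                     # rd = (old value != value seen by the lr)
j finally
fail:
fence aq?rl?
li rd, 1                        # report failure
finally:
li scratch1, -1
sd scratch1, [tls_lrsc_addr]    # clear the reservation
ld scratch1, [scratch_1_slot]   # restore scratch registers
ld scratch2, [scratch_2_slot]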
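For checking the behavior of the two sequences above, here is a toy Python model of them. It is a sketch under assumptions: the `Hart` class is a made-up name, memory is a plain dict, fences and scratch spills are ignored, and nothing runs concurrently. The field names mirror the TLS slots in the codegen.

```python
class Hart:
    """Toy model of one guest hart running the lr/sc sequences sketched above."""

    def __init__(self, mem):
        self.mem = mem            # shared dict: address -> value
        self.tls_lrsc_addr = -1   # -1 means "no reservation"
        self.tls_lrsc_size = 0
        self.tls_lrsc_value = 0

    def lr(self, addr, size):
        value = self.mem.get(addr, 0)   # the plain load
        self.tls_lrsc_addr = addr
        self.tls_lrsc_size = size
        self.tls_lrsc_value = value
        return value

    def sc(self, addr, size, new):
        """Return the value written to rd: 0 on success, 1 on failure."""
        if self.tls_lrsc_addr == addr and self.tls_lrsc_size == size:
            old = self.mem.get(addr, 0)  # amoswap: unconditional exchange
            self.mem[addr] = new
            rd = int(old != self.tls_lrsc_value)
        else:
            rd = 1                       # the "fail" path
        self.tls_lrsc_addr = -1          # the "finally" path clears the reservation
        return rd


mem = {0x10: 5}
hart = Hart(mem)
hart.lr(0x10, 8)
status = hart.sc(0x10, 8, 6)   # undisturbed pair: status 0, mem[0x10] becomes 6
```

One thing the model makes visible, if I am reading the sequence right: when another hart changes the word between the lr and the sc, rd reports failure, but the amoswap has already written rs2 to memory. A real CAS (or an lr/sc loop on the host) would avoid that.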
Unlucky day. My thumb is out of commission, so how am I supposed to type 😢
https://c9x.me/compile/

QBE's code is so clean
Just noticed that Codemasters has a new game out: WRC licensed, Unreal Engine, and only 200 yuan!
🤯