
Are you sure the translators don't insert code necessary to maintain ordering? I would be shocked if most threaded code works when you throw out the x86 memory model. Managed runtimes like .NET definitely generate code for each target designed to maintain the correct memory model.


https://docs.microsoft.com/en-us/windows/uwp/porting/apps-on...

> You can also select multi-core settings, as shown here... These settings change the number of memory barriers used to synchronize memory accesses between cores in apps during emulation. Fast is the default mode, but the strict and very strict options will increase the number of barriers. This slows down the app, but reduces the risk of app errors. The single-core option removes all barriers but forces all app threads to run on a single core.

https://news.ycombinator.com/item?id=28732273

zamadatix interprets this as Microsoft saying that, by default, Windows on ARM runs x86 apps without enforcing x86 TSO, and turns on extra memory barriers only through per-app compatibility settings. If an app needs TSO but isn't in Windows' database, it can crash or silently corrupt data.


They'd better, but then how would an automatic translator know that this is a "release semantics" atomic store operation?

Because on x86 every ordinary store already has release semantics; no special barriers or instructions are necessary.

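mov dword [shared_data], 1    ; plain store of the payload

mov dword [release_flag], 1   ; plain store; x86 TSO keeps it ordered after the store above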
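
For comparison, here is a minimal C++ sketch of the same two stores with the ordering written out at source level, which is exactly the information a binary translator does not get to see. The names `producer`, `consumer` and `run_once` are made up for illustration; `shared_data` and `release_flag` just mirror the two stores above.

```cpp
#include <atomic>
#include <thread>

// Hypothetical globals mirroring the two x86 stores in the comment above.
int shared_data = 0;
std::atomic<int> release_flag{0};

void producer() {
    shared_data = 1;                                   // plain store of the payload
    release_flag.store(1, std::memory_order_release);  // release store: publishes shared_data
}

int consumer() {
    // Acquire load: once the flag is seen as 1, the earlier payload store is visible too.
    while (release_flag.load(std::memory_order_acquire) == 0) {
        std::this_thread::yield();
    }
    return shared_data;
}

// Illustrative helper: runs one producer/consumer round and returns what the consumer read.
int run_once() {
    shared_data = 0;
    release_flag.store(0, std::memory_order_relaxed);
    int seen = -1;
    std::thread reader([&seen] { seen = consumer(); });
    std::thread writer(producer);
    writer.join();
    reader.join();
    return seen;
}
```

On x86 both stores in `producer` typically compile to the same plain `mov`, so the binary no longer says which one was the release store. On ARM the compiler has to emit something stronger for the release store, and a translator working from the x86 binary has nothing to tell it where.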


It's pessimistic: it converts a lot of memory accesses to RCpc instructions or atomics.

(On ARMv8.0, where you don't have those, barriers are used more.)

TSO pessimization is the only way to make the thing work at a translation-time cost that isn't too high.
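
A rough source-level analogy of that pessimization, as a sketch only: since the translator can't tell which accesses are used for synchronisation, it treats every load as an acquire and every store as a release. `PessimisticCell` and `translated_sequence` are hypothetical names, and this is not how any particular emulator is implemented; C++ acquire/release is also not identical to x86 TSO, so take it as an illustration of the cost model rather than an equivalence.

```cpp
#include <atomic>

// Hypothetical wrapper: every access is ordered, whether or not the
// original program actually relied on that ordering.
template <typename T>
class PessimisticCell {
public:
    explicit PessimisticCell(T initial) : value_(initial) {}

    // Every load is an acquire load.
    T load() const { return value_.load(std::memory_order_acquire); }

    // Every store is a release store.
    void store(T v) { value_.store(v, std::memory_order_release); }

private:
    std::atomic<T> value_;
};

// The two x86 stores from earlier in the thread, "translated" pessimistically:
// both get release ordering because the translator cannot know which one matters.
int translated_sequence() {
    PessimisticCell<int> shared_data{0};
    PessimisticCell<int> release_flag{0};
    shared_data.store(1);
    release_flag.store(1);
    return shared_data.load() + release_flag.load();
}
```

When compiling for AArch64 with the RCpc extension enabled, compilers typically lower such acquire loads to `ldapr` and release stores to `stlr`; without it they fall back to `ldar`, and an emulator may use explicit `dmb` barriers instead, which matches the ARMv8.0 remark above.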


Or you support TSO directly on your CPU, like Apple does on the M1.


Sure, but Windows on ARM has to run on many ARM processors, not a specific one designed by MS. They could detect if the processor has non-standard TSO support and use that when running an x86 app, but they still have to do something to run the x86 app on a standard ARM processor.


Maintaining the memory model guarantees is what causes the steep cost in performance when using x86 apps on Windows on Arm.

That said, heuristics are used to speed it up. For example, I would recommend not sharing values on the stack between threads for synchronisation.



