> most of those languages (1) have a GC that makes boxing a much cheaper operation than in Rust
Why is that? Intuitively I would expect the opposite. (Isn't a malloc/free pair cheaper than allocating on the GC heap and later collecting the garbage?)
As always, it depends on a lot of details. A generational garbage collector can be made to allocate extremely quickly, since allocating in the young generation is essentially a pointer bump; IIRC for the JVM it's something like seven instructions? For short-lived allocations, it sort of acts like an arena, which is very high performance. malloc/free, by contrast, need to be quite general.
It's always about the details, though. Even if GC allocation is faster than malloc/free, a language that doesn't tend to allocate much to begin with can still end up faster overall despite the slower malloc. It always depends.
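To make the arena comparison concrete, here is a minimal sketch of bump allocation in Rust. It is not how the JVM or any real collector is implemented; the `Bump` type and its methods are made up for illustration, and it hands out offsets into a buffer rather than real pointers. The point is only that the fast path is an align, an add, and a bounds check, and that "freeing" everything is a single store.

```rust
/// Toy bump ("arena") allocator. Hypothetical type, for illustration only.
/// It returns offsets into its buffer instead of raw pointers, so alignment
/// is relative to the start of the buffer, not to an absolute address.
struct Bump {
    buf: Vec<u8>,
    top: usize, // offset of the next free byte
}

impl Bump {
    fn with_capacity(cap: usize) -> Self {
        Bump { buf: vec![0; cap], top: 0 }
    }

    /// Reserve `size` bytes aligned to `align` (must be a power of two).
    /// Returns the starting offset, or `None` when the arena is full.
    fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        debug_assert!(align.is_power_of_two());
        // Round `top` up to the next multiple of `align`.
        let start = self.top.checked_add(align - 1)? & !(align - 1);
        let end = start.checked_add(size)?;
        if end > self.buf.len() {
            return None; // a real GC would trigger a minor collection here
        }
        self.top = end;
        Some(start)
    }

    /// Release every allocation at once by resetting the bump offset.
    fn reset(&mut self) {
        self.top = 0;
    }
}
```

A general-purpose `malloc` can't take this shortcut, because it has to support freeing individual blocks in any order.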
Doesn't something like jemalloc basically give you this, but without pauses? Thread-local freelists allow quick recycling of small allocations without synchronization. Funnily enough, jemalloc even uses some garbage collection mechanisms internally.
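As a rough illustration of the thread-local freelist idea (not jemalloc's actual implementation; `BLOCK`, `FREE`, `take_block`, `give_back`, and `cached_blocks` are names invented for this sketch), a per-thread cache of fixed-size blocks might look like this in Rust. Because the list lives in thread-local storage, recycling a block needs no lock or atomic operation.

```rust
use std::cell::RefCell;

/// Size of the one size class this sketch handles (an arbitrary choice).
const BLOCK: usize = 64;

thread_local! {
    // Each thread gets its own list of recycled blocks.
    static FREE: RefCell<Vec<Box<[u8; BLOCK]>>> = RefCell::new(Vec::new());
}

/// Fast path: pop a recycled block from this thread's list.
/// Slow path: fall back to the general-purpose allocator.
fn take_block() -> Box<[u8; BLOCK]> {
    FREE.with(|f| f.borrow_mut().pop())
        .unwrap_or_else(|| Box::new([0u8; BLOCK]))
}

/// Instead of freeing, park the block on this thread's list for reuse.
fn give_back(block: Box<[u8; BLOCK]>) {
    FREE.with(|f| f.borrow_mut().push(block));
}

/// Number of blocks currently cached by the calling thread.
fn cached_blocks() -> usize {
    FREE.with(|f| f.borrow().len())
}
```

A block returned with `give_back` is handed out again by the next `take_block` on the same thread, with the same address and whatever contents it last held.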
I don't know a ton about jemalloc internals, but it is true that a lot of modern mallocs use some mechanisms similar to GCs. There are some pretty major differences in constraints, though.
`malloc` and `free` are opaque function calls: they can't be inlined, they don't understand the semantics of your language, the strategy behind them is quite generic, etc.
A GC that's integrated with a programming language can do much better (separate heaps for short-lived and long-lived allocations, for example).
One can do even better by supporting custom "pluggable" allocators, rather than just a single global allocator as Rust does at present. Some of these allocators could even implement GC-like logic internally.