
15% faster is great. But at what cost?

> Since Ruby 3.3.0-preview2 YJIT generates more code than Ruby 3.2.2 YJIT, this can result in YJIT having a higher memory overhead. We put a lot of effort into making metadata more space-efficient, but it still uses more memory than Ruby 3.2.2 YJIT.

I'm hoping/assuming the increased memory usage is trivial compared to the cpu-efficiency gains, but it would be nice to see some memory-overhead numbers as part of this analysis.



This is a particularly valid concern given ruby+rails seems quite memory inefficient to begin with. I've sometimes had smallish apps on 500 MB heroku dynos crash because memory climbs slowly, the dyno starts using swap and slows down, and eventually it is using 500 MB of swap as well. IME ruby+rails doesn't seem to free up memory after it uses it, and that causes problems as the hours go by until the pod/dyno crashes or is restarted.


Ruby processes don't return the memory to the system; they reuse memory already allocated. This is for efficiency - allocating and freeing system memory isn't free. Even if it did, your peak memory usage would be the same. It doesn't allocate memory it doesn't need.

If your memory usage doesn't plateau you have a memory leak which would be caused by a bug in your code or a dependency.

But 500 MB to 1 GB of memory for a production rails app isn't unusual. Heroku knows this, which explains their bonkers pricing for 2 GB of memory. They know where to stick the knife.


> Ruby processes don't return the memory to the system

That is not correct. Ruby does unmap pages when it has too many free pages, and it obviously calls `free` on memory it allocated once it no longer uses it.

What happens sometimes though is that because of fragmentation you have many free slots but no completely free pages. That is one of the reasons why GC compaction was implemented, but it's not enabled by default.
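For reference, a minimal sketch of turning on the compaction mentioned above, assuming Ruby 3.0 or later on a platform that supports it. `GC.compact`, `GC.auto_compact=` and `GC.stat` are core Ruby methods; the helper name `enable_compaction` is made up for illustration, and whether compaction actually helps depends on the app.

```ruby
# Hypothetical helper: turn on automatic compaction and run one manual pass.
# Returns true if compaction is available, false otherwise.
def enable_compaction
  return false unless GC.respond_to?(:compact) && GC.respond_to?(:auto_compact=)

  GC.auto_compact = true # compact as part of every major GC from now on
  GC.compact             # one explicit pass right away
  true
rescue NotImplementedError
  # Some platforms build Ruby without compaction support.
  false
end

before = GC.stat(:heap_free_slots)
supported = enable_compaction
after = GC.stat(:heap_free_slots)
puts "compaction supported: #{supported}, free slots: #{before} -> #{after}"
```

The free-slot counts are only a rough signal: many free slots spread over pages that are still partly in use is the fragmentation case described above.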

But in most cases I've seen, the memory bloat of Ruby applications was caused by glibc malloc, and the solution was either to set MALLOC_ARENA_MAX or to switch to jemalloc.
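A rough way to see which of those two options a running process is actually using, as a sketch only. The helper name `allocator_hints` is hypothetical; it assumes that a Ruby built with jemalloc lists it in `RbConfig::CONFIG["MAINLIBS"]` or `["LIBS"]`, and that a preloaded jemalloc shows up in `LD_PRELOAD`, which may not hold on every build or platform.

```ruby
require "rbconfig"

# Hypothetical helper: collect allocator-related hints for this process.
# env and config are parameters so the logic can be exercised with fake values.
def allocator_hints(env = ENV, config = RbConfig::CONFIG)
  linked = [config["MAINLIBS"], config["LIBS"]].compact.join(" ")
  preload = env["LD_PRELOAD"].to_s
  {
    malloc_arena_max: env["MALLOC_ARENA_MAX"],        # nil when unset
    jemalloc_linked: linked.include?("jemalloc"),     # built with jemalloc
    jemalloc_preloaded: preload.include?("jemalloc")  # injected via LD_PRELOAD
  }
end

p allocator_hints
```

Note that MALLOC_ARENA_MAX and LD_PRELOAD have to be set in the environment before the Ruby process starts; changing `ENV` from inside the process does not affect the allocator that is already loaded.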


I'm correct in practice. There are scenarios where ruby might free memory, but ruby is mostly used for rails, and you won't ever see that under a standard rails workload. It will plateau and stay there until a restart. When people see this they think it's a "bug" or a "leak" but it isn't.

On the last fairly large rails app I tried to use jemalloc on there was no change in memory usage. I believe that advice is a bit outdated. Also note using jemalloc doesn't cause memory to be freed to the system. It reduces fragmentation, at the cost of cpu cycles. There's no free lunch.


> It will plateau

Yes, because extra empty pages are released at the end of major GC, which is occasional, and most web applications will cyclically use enough memory that they will stabilize / plateau at some point.

> I believe that advice is a bit outdated.

It absolutely isn't. Your anecdote doesn't mean much compared to the countless reports you can find out there.

> Also note using jemalloc doesn't cause memory to be freed to the system.

Yes it does, it has a decay mechanism, most allocators do. https://jemalloc.net/jemalloc.3.html

> It reduces fragmentation

Yes, and that allows it to have more free pages that it can release.

> at the cost of cpu cycles

Compared to glibc, not so much.


That kind of thinking is a bit flawed unfortunately. You might hit your peak for 20 minutes a day, but you’ve provisioned your system for that temporary worst case for the entire day, and other services are paying that penalty. If it’s the only thing you’re running, maybe. But in practice there are other things you want to run on the machine to improve utilization (since services generally don’t all hit their peak simultaneously).

That’s why good modern allocators like mimalloc and tcmalloc return memory when they notice it’s going unused, so that other services running on the machine can access resources. And this is in c++ land where things are even more perf sensitive.


Theoretically virtual memory and swap solve this problem really well. The OS is free to write the unused pages to disc to let other programs use the real memory.


Swap is horribly expensive and most hyperscalers run their servers without swap and set per-process memory limits, automatically killing workloads that go above their threshold.


Swap is only expensive if you are using the swapped-out memory. If you are in a case where a program is just holding on to pages it isn't using, swap is basically free. For most users, turning off swap just loses performance, since with swap the OS can always use all of your RAM to cache disk access.


Swap is expensive compared to releasing unused memory back to the OS. The reason is that you spend memory and disk bandwidth writing “unused” data to disk. And that data could very well be unused RAM just sitting around in a memory allocator, which is effectively useless memory that you’re swapping because the allocator didn’t release it.

Zswap is always performance increasing. Swap to disk can be performance degrading (good implementations generally are not unless your working set is larger than your memory and you’re thrashing) and certainly expensive $$ wise in that it wears out your SSD faster.

You seem to be thinking I’m arguing in absolute terms where all I’m saying is that swapping is a more expensive technique to try to reclaim that unused RAM vs the memory allocator doing it. It can be a useful low-effort technique, but it’s very coarse and more of a stop gap to recover inefficiencies in the system at a global level. Global inefficiency mitigation is generally not as optimally effective as more localized approaches.

Consider also that before the OS starts swapping, the OS is going to start purging disk caches since those are free for it to reload (executable code backed by file, page caches etc). These are second order system effects that are hard to reason about abstractly when you have a greedy view of “my application is the only one that matters”. This means that before you even hit swap, your large dark matter of dirty memory sitting in your allocator is making your disk access slower. And the kernel’s swap doesn’t distinguish working set memory from memory idling in the allocator, so you’re hoping inherent temporal and spatial locality patterns interplay well so that you’re not trying to hand out an allocation for a swapped out block too frequently.


What if the other thing you're trying to run runs at the same time that your rails app is using peak memory? You have no choice but to have enough memory for peak load.

But if you really do need to cheap out you can generally configure your app server to kill idle worker processes, or bounce them on a schedule to return memory to the system, and hope.


So that’s generally not very likely. You’re going to have some time of day effects that are shared but true “peak” tends to be service dependent rather than something all your services experience simultaneously from what I’ve seen (YMMV).

Killing “idle” processes is also extremely expensive because you have to restart the process, reload all state, and doing graceful handoff is tricky.

It’s good to have graceful handoff for zero downtime upgrades, but I still say having your allocator return RAM is the cheapest and easiest option and something good modern allocators do for you automatically.


There is no one size fits all memory management technique. There are always tradeoffs. The scenario you are describing is not common for ruby apps. Ruby uses a memory management style that is suitable for most ruby workloads.

All the production quality app servers handle killing and starting new worker processes gracefully and efficiently by forking a running process. Certainly there is some overhead, but that's why you don't underprovision memory, so you don't need to resort to that.


> If your memory usage doesn't plateau you have a memory leak which would be caused by a bug in your code or a dependency.

Extremely bold claim for a framework the size of ruby on rails. I would trot out my own evidence but the receipts are lost with time.

Also, why isn't the allocation behavior tweakable at runtime? Seems pretty trivial with no downsides. It's not difficult to think of a scenario where a non-monotonically-increasing heap size is desirable.


This person is incorrect, but even if they were correct, that wouldn't be a framework thing.

Memory management is handled by the language.


Many types of memory leaks are simply because you're holding on to data you don't need to hold onto anymore. Languages cannot prevent this, at least not that I've seen.
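A toy illustration of that kind of leak, not taken from any real app: the class names `LeakyCache` and `BoundedCache` are invented for this sketch. The first keeps every value reachable forever, so the GC can never reclaim it; the second caps its size by evicting the oldest entry.

```ruby
# Hypothetical cache that only ever grows: every value stays reachable,
# so the garbage collector is not allowed to free it.
class LeakyCache
  def initialize
    @store = {}
  end

  def fetch(key)
    @store[key] ||= yield
  end

  def size
    @store.size
  end
end

# Same idea with a cap: once full, drop the oldest entry before adding.
class BoundedCache
  def initialize(max_entries)
    @max_entries = max_entries
    @store = {}
  end

  def fetch(key)
    return @store[key] if @store.key?(key)

    # Hash preserves insertion order, so shift removes the oldest entry.
    @store.shift if @store.size >= @max_entries
    @store[key] = yield
  end

  def size
    @store.size
  end
end

leaky = LeakyCache.new
bounded = BoundedCache.new(10)
100.times do |i|
  leaky.fetch(i) { "value #{i}" }
  bounded.fetch(i) { "value #{i}" }
end
puts "leaky: #{leaky.size} entries, bounded: #{bounded.size} entries"
```

In a long-running web process the leaky version keyed on something unbounded (user ids, request paths) looks exactly like memory that never plateaus.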


Sure, but the person I responded to was suggesting that Rails was deliberately holding onto memory to re-use it.

That's absolutely not something Rails does, but it is something that some managed languages and some (most?) allocators do.


I’ve observed the same, and every time I switched to jemalloc the issue was fixed.


Was it difficult to switch? What were the downsides / tradeoffs? (I read about jemalloc recently but don't know enough about it to confidently pursue it, but may try it on a small app if it's straightforward).


Super easy and have not had an issue with it in over 10 years of using it. There is an example here on how to do it with docker image. https://mailsnag.com/blog/optimized-ruby-dockerfile/.


Going to try this right now! Will report back.

OOC, why isn't this a ruby default? Isn't it always better to be more memory efficient? (I'm trying to understand what the trade-offs are, if any)

EDIT: well, exactly 6 minutes later, I'm done. I followed these instructions: https://elements.heroku.com/buildpacks/gaffneyc/heroku-build...

The app seems to work like usual, I'll just have to wait and see what happens to memory use.

I will reply here in 12 hours with a screenshot showing the results (before/after memory use), whatever they may be.

Also, for reference, here are the metrics for the past 24 hours (LOTS of memory problems): https://imgur.com/a/M8IHd5z


For anyone interested, here's the result: https://imgur.com/a/c62gjKQ (the red vertical line is the point from which jemalloc was used).

It looks like memory usage did indeed go down, and critical errors fell by about 84%.


For completeness, here are the metrics a full 24 hours after the change: https://imgur.com/a/lbdzFvN


Yeah, I had the same reaction last week. It's not a Ruby default because some versions of Linux can't use it. ¯\_(ツ)_/¯


Have you compared it against newer allocators like mimalloc or the rewritten tcmalloc (not the one in gperftools)? Jemalloc is a bit long in the tooth now.



