Yeah, I was also wondering how meaningful ~20MB of memory use really would be in this context, or how badly a racy token bucket would actually perform in the real world. Still, enjoyed the read.
I think this is an important point. Trying to store all of these in RAM means you can only have so many, which is why I really like something that can use a more cost-efficient DB as a backing store. Once you start thinking about what you could do with 1000s of rate limits per user, you end up thinking of lots of interesting ways to use them, like limiting how often you log or track usage to 1/hr per event per user. That's saved me a ton of money.
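For anyone curious what the 1/hr-per-event-per-user idea might look like, here is a rough sketch. Everything in it is my own naming (`HourlyGate`, `allow`, the `store` argument), and the dict is just a stand-in for whatever DB you would actually use.

```python
import time


class HourlyGate:
    """Allow at most one action per (user, event) pair per window.

    `store` is any dict-like mapping. A plain dict is used here as a
    stand-in for a cheaper backing store such as a DB table (assumption).
    """

    def __init__(self, store, window_seconds=3600, clock=time.time):
        self.store = store
        self.window = window_seconds
        self.clock = clock  # injectable so the gate can be tested

    def allow(self, user_id, event):
        key = f"{user_id}:{event}"
        now = self.clock()
        last = self.store.get(key)
        if last is not None and now - last < self.window:
            return False  # already logged within the window, skip it
        self.store[key] = now  # remember when we last let this through
        return True


gate = HourlyGate(store={})
if gate.allow("user-42", "page_view"):
    print("log the event")  # runs at most once per hour for this pair
```

The read-then-write here is not atomic, so against a real shared store you would probably want a conditional write or a key with a TTL instead.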
Second thought: token buckets have the nice property of being really cacheable once they're empty. You can push down a "won't refill until timestamp" and then clients can skip checking altogether until that time.
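A minimal sketch of that "won't refill until timestamp" idea, assuming a continuously refilling bucket where the pushed-down timestamp is the moment the next whole token becomes available. The class and method names (`TokenBucket`, `CachingClient`, `take`, `blocked_until`) are made up for illustration.

```python
import time


class TokenBucket:
    """Server side: a simple continuously refilling token bucket."""

    def __init__(self, capacity, refill_per_second, clock=time.time):
        self.capacity = capacity
        self.rate = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def take(self):
        """Return (allowed, blocked_until); blocked_until is None if allowed."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True, None
        # Earliest time at which one whole token will exist again.
        return False, now + (1 - self.tokens) / self.rate


class CachingClient:
    """Client side: caches a denial until the timestamp the server gave."""

    def __init__(self, bucket, clock=time.time):
        self.bucket = bucket
        self.clock = clock
        self.blocked_until = 0.0
        self.server_calls = 0  # counts how often we actually asked

    def allow(self):
        if self.clock() < self.blocked_until:
            return False  # answered locally, no round trip
        self.server_calls += 1
        allowed, blocked_until = self.bucket.take()
        if not allowed:
            self.blocked_until = blocked_until
        return allowed
```

Caching the denial is safe even with several clients sharing a bucket, since no token can appear before that timestamp. Someone else may still grab it first, in which case the client just gets a fresh timestamp on its next check.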