
I've actually found this to be completely untrue in practice. Just about every web service I've dealt with in production, if not all of them, has been CPU bound. This is for a number of reasons:

1) Network speeds have increased dramatically compared to CPU speeds.

2) People don't optimize code very much.

3) Web apps tend to do more work per request than they did in the 90s.

Regardless, I've never seen an app saturate its network pipe, but I've seen plenty saturate all their CPU cores, including relatively well-tuned ones. For instance, I once wrote a Netty-based reverse proxy app, and while I got it to run far faster than the typical app in that company, it was still CPU-bound in all my tests.



+1 to this.

We picked asyncio-based Tornado for our Python services. As soon as we hit scale, we became CPU bound.

A deep dive with a profiler showed JSON parsing to be the culprit.

It's a painful problem to crack once you hit those limits. In some cases you can genuinely skip parsing the JSON by returning the raw JSON string to the client, such as when you are simply fetching data from a cache or DB (see the sketch below).
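To make that skip-parsing case concrete, here's a minimal sketch (Tornado; the async cache client and key scheme are made up for illustration) of passing a cached JSON document straight through without ever touching a parser:

    import tornado.web

    class UserHandler(tornado.web.RequestHandler):
        async def get(self, user_id):
            # Hypothetical async cache client stored in application settings;
            # the cached value is the JSON document exactly as stored upstream.
            raw_json = await self.settings["cache"].get(f"user:{user_id}")
            if raw_json is None:
                raise tornado.web.HTTPError(404)
            # Write the stored bytes/str verbatim: no json.loads/json.dumps
            # on the hot path.
            self.set_header("Content-Type", "application/json")
            self.write(raw_json)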

In other cases you simply can't skip it, e.g. when you are interfacing with a third-party library that only speaks JSON. At that point you are stuck.

You could use a wrapper around a faster native JSON parser (say ujson), but then you are trading parsing time against the time taken to copy the string across the FFI boundary and copy the results back, plus all the complexity that entails.
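For what it's worth, that route can be as small as an import swap when the faster parser mirrors the stdlib API (ujson is close but not a perfect drop-in; check any kwargs you rely on):

    # Prefer the C-backed parser when available, fall back to the stdlib.
    try:
        import ujson as json
    except ImportError:
        import json

    payload = '{"user": {"id": 42, "tags": ["a", "b"]}}'
    doc = json.loads(payload)  # parse
    blob = json.dumps(doc)     # serialize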

Or you could hand it off as a job to an async queue (arguably the canonical architectural approach to avoid blocking the event loop), but then you have just shifted the problem elsewhere, where you'll still need to throw more instances at it. And it adds extra latency.
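A minimal in-process version of that hand-off looks like this (a real job queue such as Celery behaves similarly, just across machines); note how the parsing cost moves rather than disappears:

    import asyncio
    import json
    from concurrent.futures import ProcessPoolExecutor

    # Worker processes keep json.loads off the event loop, at the cost of
    # pickling the payload and result across process boundaries (extra latency).
    _pool = ProcessPoolExecutor(max_workers=4)

    async def parse_offloaded(payload: str) -> dict:
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(_pool, json.loads, payload)

    async def main():
        print(await parse_offloaded('{"n": 1}'))

    asyncio.run(main())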

I too was in the "don't optimize prematurely" camp, but picking Python today for new services would IMO be taking that principle a bit too far.

Especially considering the ergonomics that modern languages like Golang or Rust offer.


> It's a painful problem to crack once you hit those limits.

If you hit that doing something novel and obscure, sure. If it is literally parsing JSON, you spend a little effort researching non-stdlib JSON parsing libraries, pick one of the several stable much-faster-than-stdlib ones, and move on.
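For example (orjson stands in here for any of those libraries; the exact ratio depends on your payloads and Python version):

    import json
    import timeit

    import orjson  # pip install orjson

    payload = json.dumps({"ids": list(range(1000)), "name": "x" * 64})

    stdlib = timeit.timeit(lambda: json.loads(payload), number=10_000)
    fast = timeit.timeit(lambda: orjson.loads(payload), number=10_000)
    print(f"stdlib: {stdlib:.3f}s  orjson: {fast:.3f}s")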


Like I mentioned, we did that. You get a speedup of a small multiple. But moving these services to Rust basically made our instance count drop from 32 to 2.

Once the team has the know-how to write a service in Rust/Golang, or even modern, pleasant-to-write Java, it becomes hard to justify picking Python for a new service at all.


He explicitly mentioned that path…


He explicitly mentioned some of the troubles you might have implementing and maintaining that yourself via FFI.

He very much did not explicitly mention that it's already done, with established results, and that the described “pain” isn’t something you have to take on at all.


And then there is the resource usage... a typical dynamic interpreted language uses orders of magnitude more CPU and memory than code compiled to machine code. Enough of these apps running in a data center can add up to a lot of wasted resources!


I'd argue a lot of web apps are just serializing to JSON after doing minimal business logic, which likely has to make DB requests.

Once again, I said to look at what your load actually is. If you're CPU bound, then yes, buy a bigger instance, optimize, or switch languages.

It's like how, if you are putting up a blog, you probably shouldn't be looking at running Kubernetes clusters just for that blog.


You might be surprised to find out how CPU-intensive even something as simple as a DB request can be. Your driver has to do the work of communicating with the DB server via its protocol, which is similar in cost to any other network request. More importantly, it has to deserialize the result and marshal the data into objects on the managed heap. Most DB drivers don't support streaming, so for larger requests you have to read all the data into memory, then convert it all into objects at once.
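To illustrate with one driver that does support streaming (psycopg2; the table and per-row handler below are made up), the difference is whether the marshaling cost lands all at once or is spread over the iteration:

    import psycopg2

    conn = psycopg2.connect("dbname=app")  # hypothetical DSN

    # Default client-side cursor: the driver reads the entire result set
    # off the wire and marshals every row into Python objects at once.
    with conn.cursor() as cur:
        cur.execute("SELECT id, payload FROM events")
        rows = cur.fetchall()  # all rows materialized in memory here

    # Named (server-side) cursor: rows are fetched and deserialized in
    # batches, spreading the CPU and memory cost.
    with conn.cursor(name="events_stream") as cur:
        cur.itersize = 1000
        cur.execute("SELECT id, payload FROM events")
        for row in cur:
            handle(row)  # hypothetical per-row handler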

And that's not even taking into account frameworks and ORMs like Hibernate, which itself can multiply the CPU used several-fold on top of JDBC, or whatever lower-level interface it wraps. I've never measured frameworks in dynamic languages, but I have no reason to believe they aren't similarly inefficient.

And, yes, one of the optimizations I did for my reverse proxy app was to upgrade the JSON library, which brought a significant performance boost. But it's not the only source of CPU usage for apps, nor was it the only major optimization I successfully applied.


Yeah, I’ve profiled slow Python apps before, and it’s almost always serialization that’s the sticking point. The default implementation of date parsing in particular is really bad.
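One common instance: datetime.strptime runs through a pure-Python format-string path, while fromisoformat is implemented in C (rough sketch; exact numbers vary by CPython version):

    from datetime import datetime
    import timeit

    ts = "2023-05-17T12:34:56"

    slow = timeit.timeit(
        lambda: datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S"), number=100_000
    )
    fast = timeit.timeit(lambda: datetime.fromisoformat(ts), number=100_000)
    print(f"strptime: {slow:.2f}s  fromisoformat: {fast:.2f}s")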


> I've never seen an app saturate its network pipe

True, but that has nothing to do with what OP said:

> you spend a large amount of time waiting on database responses.



