Practical report: the OpenAI API is a bad joke. If you think you can build a production app against it, think again. I've been trying to use it for the past 6 weeks or so. If you use tiny prompts, you'll generally be fine (that's why you always get people commenting that it works for them), but just try to get closer to the limits, especially with GPT-4.
The API will make you wait up to 10 minutes, and then time out. What's worse, it will time out between their edge servers (Cloudflare) and their internal servers, and the way OpenAI implemented their billing you will get a 4xx/5xx response code, but you will still get billed for the request and whatever the servers generated and you didn't get. That's borderline fraudulent.
Meanwhile, their status page will happily show all green, so don't believe that. It seems to be manually updated and does not reflect the truth.
Could it be that it works better in another region? Could it be just my region that is affected? Perhaps — but I won't know, because support is non-existent and hidden behind a moat. You need to jump through hoops and talk to bots, and then you eventually get a bot reply. That you can't respond to.
My support requests about being charged for data I didn't have a chance to get have been unanswered for more than 5 weeks now.
There is no way to contact OpenAI, no way to report problems, the API sometimes kind-of works, but mostly doesn't, and if you comment in the developer forums, you'll mostly get replies from apologists that explain that OpenAI is "growing quickly". I'd say you either provide a production paid API or you don't. At the moment, this looks very much like amateur hour, and charging for requests that were never fulfilled seems like a fraud to me.
So, consider carefully whether you want to build against all that.
Very sorry to hear about these issues, particularly the timeouts. Latency is top of mind for us and something we are continuing to push on. Does streaming work for your use case?
We definitely want to investigate these and the billing issues further. Would you consider emailing me your org ID and any request IDs (if you have them) at atty@openai.com?
Thank you for using the API, and really appreciate the honest feedback.
It's kind of incredible how fast OpenAI (now also known as ClosedAI) is going through the enshittification process. Even Facebook took around a decade to reach this level.
OpenAI has an amazing core product, but in the span of six months:
* Went from an amazing and inspiring open company that even put "Open" in their name to a fully locked up commercial beast.
* Non-existent customer support and all kinds of borderline-illegal billing practices. You guys are definitely aware that when there's a network error on the API or ChatGPT, the user still gets charged. And there are a lot of these errors. I get roughly one every hour or two.
* Frustratingly loose interpretation of EU data protection rules. For example, the setting to say "don't use my personal chat data" is connected to the setting to save conversations. So you can't disable it without losing all your chat history.
* Clearly nerfing the ChatGPT v4 product, at least according to hundreds or even thousands of commenters here and on Reddit, while denying having made any changes.
* Use of cheap human labor in developing countries through shady anonymous companies (look up the company Sama who pay Kenyan workers about $1.5 an hour).
* Not to mention the huge questions around the secret training dataset and whether large portions of it consist of illegally obtained private data (see the recent class-action case in California).
> Use of cheap human labor in developing countries through shady anonymous companies (look up the company Sama who pay Kenyan workers about $1.5 an hour).
What is wrong about injecting millions into developing nations?
The rest I agree with, although I don't think it was ever really 'open', so it's not getting shitty; it always was. Thankfully, "there is no moat" and other LLMs will be open, just a few months behind OpenAI.
> * Use of cheap human labor in developing countries through shady anonymous companies (look up the company Sama who pay Kenyan workers about $1.5 an hour).
If you pay a developing country developed country wages what you'll get is 1. inflation and 2. the government mad at you because all their essential workers/doctors/government officials are quitting to work for you.
This is a terrible excuse that I see trotted out far too often to justify going to developing countries and barely even paying workers that country's minimum wage. You absolutely can pay considerably more than minimum wage without disrupting the local economy. They're paying people as low as $1.32 per hour for an absolutely horrible job. I'm not expecting them to pay western wages. But even bumping that up to $2.50 or $3 an hour would make an incredible difference to the local workers' lives. The fact that they don't do that is exploitation, pure and simple.
Note that I feel I have quite deep understanding of this issue, and feel strongly about it, because I live and work in a developing country and I see this happening a lot. Westerners come over here and treat local workers like shit, pay them peanuts for 80 hour weeks while making loads of money themselves and then justify it because "it's the local norm". It's sickening, frankly. We westerners doing business in developing countries are in a position of privilege and should be leading by example, not jumping on the first excuse to dump a hundred years worth of the fight for workers rights.
I'm curious. When you buy a loaf of bread from the local market, are they cheaper than first world prices? If so, do you pay double the listed price and demand the shop pay double the price to hire workers so as to not exploit them? Are your expenses in said developing country lower than what you would have paid if you were in a richer country? Are you donating the difference to the local community?
Hi, I've been to Kenya and Tanzania, and while basic staples are cheaper than in developed countries, they're not that much cheaper these days. If you watch travelogue videos where they ask locals how they're doing, many developing countries are struggling with massive inflation, caused partly by volatile energy prices (many people can no longer afford gas) and partly by food shortages from the Ukraine War.
It's weird how people always trot out phrases like "I'm just curious" or "I'm just asking questions here" when they try to justify exploitation. Is it so that you have plausible deniability when you inevitably get called on it? Because that doesn't work.
I see that you have pretty extreme takes on what constitutes "exploitation". It's one thing to pretend that you're not part of it if you live on another part of the planet and pretend globalization doesn't exist, but I was wondering how you'd avoid participating in it if you lived in the same country and economic bubble as the ones you claim are exploited.
If you had a morally consistent way to live that life, you'd have my respect. But no, you had to deflect the topic to a phrase I wrote and make presumptions about what I really meant.
FYI, I'm morally at ease with myself, I don't need to justify anything to anyone.
Not to nitpick, but if you're able to name the company employing Kenyans, Sama, whose homepage is at https://www.sama.com/, with a team page at https://www.sama.com/our-team/ , I'm not sure you can complain that they're being shady and anonymous.
It's pretty shady. They have been fully exposed at this stage but from what I understand they were trying to keep a very low profile, going to efforts to make sure the Kenyan workers didn't know they were working for a company called Sama but instead using sub companies to sign the worker contracts.
Since ChatGPT-4 is now useless for advanced coding because of their black-box sudden nerfing, can anyone guess how long before I can run something similar to the original version privately?
Are the newer 64B models up there? 1 year, 2 years? Can't wait until I get back the crazy quality of the original model.
We need something open source fast. Thanks, OpenAI, for giving us a glimpse of the crazy possibilities; too crazy for the public, I guess.
They also no longer support data exports for many users (including myself) - at one point it worked but now it says you'll receive an email to download your data, which never arrives.
While you're here, you should know that the logic for enabling GPT-4 API access is excluding Microsoft for Startups (https://openai.com/microsoft-for-startups) orgs which have valid billing against Microsoft-provided credits. Presumably, this is an oversight as it wouldn't make sense to exclude pre-existing Microsoft partners. Would you mind escalating this?
This is precisely what is wrong with OpenAI. This. Right here.
"Complaining on HN will get you access. You have to know people or complain in the right forums."
THERE SHOULD BE NO LOGIC.
No qualifying rules. No access checks. No gates. No hoops.
Sam Altman has gone on a worldwide interview tour claiming he wants to "democratise access to AI", meanwhile OpenAI is the least open company I have ever dealt with, or even heard of.
Oracle is more "democratic" and open, for crying out loud.
Streaming has no value for me, I need the entire response before I can do anything. I looked at streaming, but it seemed like a significant effort to implement, and with no obvious benefit — if I get 75% of my response through streaming and then something breaks, it doesn't get me anywhere.
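To make the all-or-nothing point concrete, here's a small sketch of what a broken stream leaves you with. The chunk iterator is simulated; a real SDK stream behaves similarly when the connection drops mid-response, and all names here are illustrative:

```python
def consume_stream(chunks):
    """Accumulate streamed text chunks into one response.
    `chunks` stands in for the iterator an SDK returns with stream=True."""
    parts = []
    try:
        for delta in chunks:  # a dropped connection raises mid-loop
            parts.append(delta)
    except ConnectionError:
        return "".join(parts), False  # partial result, unusable here
    return "".join(parts), True

def broken_stream():
    """Simulate a stream that dies after delivering part of the answer."""
    yield from ["The answer ", "is that ", "you should "]
    raise ConnectionError("upstream timeout")

text, complete = consume_stream(broken_stream())
# You end up with a prefix and complete == False; for a consumer that
# needs the entire response, that's no better than getting nothing.
```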
Thanks for offering help, I will contact you directly.
Quick note: your domain doesn't appear to have an A record. I was hoping to follow the link in your profile and see if you have anything interesting written about LLMs.
I know you guys are busy literally building the future but could you consider adding a search field in ChatGPT so that users can search their previous chats?
> We definitely want to investigate these and the billing issues further.
How hard would it be for OpenAI engineers to pull the web access logs and grep for 4xx/5xx errors?
> you will get a 4xx/5xx response code, but you will still get billed for the request and whatever the servers generated and you didn't get. That's borderline fraudulent.
Borderline!? They're regularly charging customers for products they know weren't delivered. That sounds like straight-up fraud to me, no borderline about it.
You mean it's not normal to tell people that it's their fault for driving their $80,000 electric car in heavy rain, because for many years you haven't bothered to properly seal your transmission's speed sensor?
I've heard LLMs described as "setting money on fire" from people that work in the actually-running-these-things-in-prod industry. Ballpark numbers of $10-20/query in hardware costs. Right now Microsoft (through its OpenAI investment) and Google are subsidizing these costs, and I've heard it's costing Microsoft literally billions a year. But both companies are clearly betting on hardware or software breakthroughs to bring the cost down. If it doesn't come down there's a good chance that it'll remain more economical to pay someone in the Philippines or India to write all the stuff you would have ChatGPT write.
Yeah, this isn't close. Sam Altman is on record saying it's single-digit cents per query, and then they took a massively dilutive $10B investment from Microsoft. Even if GPT-4 is 8 models in a trenchcoat, they wouldn't raise it on themselves by 4 orders of magnitude like that.
Single-digit cents per query (let's say 2) is A LOT. Say the service runs at 10k requests per second (made up, we can discuss this): that's $200 a second, i.e. $20M a day (oversimplifying a day as 100k seconds, but close enough for the ballpark), which means running the model for a year (400 days, to keep the arithmetic simple) is around $8B. So at 10k rps we're in the order of billions per year. We can argue about some of the assumptions, but if we're in the ballpark of cents per query, the infrastructure costs are significant.
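Sanity-checking that back-of-envelope arithmetic (every input here is one of the comment's rough assumptions, not a measured figure):

```python
# Back-of-envelope cost model from the comment above; all inputs are
# deliberately rounded assumptions, not real OpenAI figures.
cost_per_query = 0.02        # dollars: "single-digit cents, let's say 2"
queries_per_second = 10_000  # made-up load figure
seconds_per_day = 100_000    # ~86,400, rounded up for easy math
days_per_year = 400          # rounded up, as in the comment

per_second = cost_per_query * queries_per_second  # $/s
per_day = per_second * seconds_per_day            # $/day
per_year = per_day * days_per_year                # $/year

print(f"${per_second:,.0f}/s  ${per_day:,.0f}/day  ${per_year:,.0f}/year")
# -> $200/s  $20,000,000/day  $8,000,000,000/year
```

So the comment's numbers are internally consistent: 2 cents per query at that invented load does land in the billions per year.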
Note that /r/ChatGPT is mostly nontechnical people using the web UI, not developers using the API.
It's very possible the web UI is using a nerfed version of the model, as evidenced by its different versioning, but not the API, which has more distinct versioning.
I’m pretty sure they tuned the Cloudflare WAF rules on GPT-3 and forgot to increase the request size limits when they added the bigger models with longer context windows.
I understand your general point and am sympathetic to it; if you're a 10/10 on some scale, I'm about a 3-4. I've never seen billing for failures, but the billing stuff is crazy: no stats if you do streamed chat, and the only tokenizer available is in Python and only for GPT-3.0.
However, I'm virtually certain something's wrong on your end; I've never seen a wait even close to that unless it was completely down. Also, the thing about "small prompts"... it sounds to me like you're overflowing the context window, they're returning an error, and something's retrying.
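If that's the failure mode, a cheap pre-flight check catches it before the request ever leaves your machine. Rough sketch only: the 4-characters-per-token ratio is a crude heuristic (a real tokenizer would be exact), and the function names and reserved-reply figure are mine, not from any SDK:

```python
# Context window sizes for the 2023-era chat models (tokens).
MODEL_CONTEXT_LIMITS = {
    "gpt-3.5-turbo": 4096,
    "gpt-4": 8192,
}

def estimate_tokens(text: str) -> int:
    """Very rough estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def fits_in_context(prompt: str, model: str, reserved_for_reply: int = 500) -> bool:
    """Return False if the prompt (plus room for the reply) likely
    overflows the model's context window."""
    limit = MODEL_CONTEXT_LIMITS[model]
    return estimate_tokens(prompt) + reserved_for_reply <= limit

print(fits_in_context("hello" * 100, "gpt-4"))     # short prompt: True
print(fits_in_context("hello" * 10_000, "gpt-4"))  # ~12.5k tokens: False
```

Rejecting oversized prompts locally at least turns a mysterious upstream error (and a retry storm) into an immediate, explainable failure.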
I can vouch for this. The GPT-4 API dies a lot if you use it for a big concurrent project. And of course it’s rate-limited like crazy, with certain hours being so bad you can’t even run it for any business purpose.
I built a production app on top of OpenAI and yeah there are frequent errors and timeouts but you literally just have to add some code to account for these and it works fine...
For example, exponential backoff is the first step, then retrying on timeouts (we use streaming, and if 30 seconds pass without getting any data back we retry the whole request; rare, but it happens), then fixing anything else that pops up.
It is possible to have a stable production app on top of it
Just fix your code and stop expecting OpenAI to hold your hand
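For anyone who wants the gist of that retry layer, here's a minimal sketch. `make_request`, the retry count, and the delays are all illustrative, not from any particular SDK:

```python
import random
import time

def call_with_backoff(make_request, max_retries=5, base_delay=1.0):
    """Retry a flaky request with exponential backoff plus jitter.
    `make_request` is any zero-argument callable that raises on a
    timeout or 5xx error (a placeholder, not tied to a specific SDK)."""
    for attempt in range(max_retries):
        try:
            return make_request()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the last error
            # Wait 1s, 2s, 4s, ... scaled by base_delay, with jitter
            # so many clients don't retry in lockstep.
            time.sleep((2 ** attempt + random.random()) * base_delay)
```

Layer the streaming stall detection on top by having `make_request` itself raise when too much time passes between chunks.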
> Just fix your code and stop expecting OpenAI to hold your hand
:-)
My code does retry and the entire application is written to detect and work around breakage. But eventually I do need to get enough content from OpenAI API to be able to make progress, and I am not.
At the moment, for example, all requests just time out after 12 minutes (on my side). No amount of "fixing my code" will help, and I don't want OpenAI to hold my hand, I just want it to a) return some data at least sometimes, b) not charge me for data not delivered.
Let's look at my billing page: over the last hour it shows 8 requests. A total of 52584 tokens. Not a single response made it back to me.
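For what it's worth, a client-side deadline can at least cap those multi-minute waits without any SDK support. A generic sketch, with `make_request` as a placeholder for the actual API call:

```python
import concurrent.futures

def call_with_deadline(make_request, timeout_s=60.0):
    """Run a blocking request in a worker thread and give up after
    timeout_s seconds instead of waiting out a server-side hang.
    Note: the worker thread itself can't be killed; we just stop
    waiting on it and move on."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(make_request)
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        raise TimeoutError(f"no response within {timeout_s}s") from None
    finally:
        pool.shutdown(wait=False)
```

Of course, none of this addresses the original complaint: you still pay for whatever the server generated after you gave up.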
After one of the Ubuntu snap updates, Firefox stopped working with the OpenAI API playground, even though it still worked with every other site. I retried and restarted so many times and it didn't work. Eventually I switched to Chromium and it worked. I still don't know what the problem was, and it was unnerving; I would have a lot of anxiety building something important on top of it.
I tried again just now and got "Oops! We ran into an issue while authenticating you.", but it works on Chromium.
Lmao. You had a browser issue when running Firefox on Linux (.000001% of users) and now you are making connections between that and their API stability?
I’m only using them as a stop-gap / for prototyping with the intent to move to a locally hosted fine-tuned (and ideally 7B parameter) model further down the road.
> the way OpenAI implemented their billing you will get a 4xx/5xx response code, but you will still get billed for the request and whatever the servers generated and you didn't get. That's borderline fraudulent.
It's fraudulent, full stop. Maybe they're able to weasel out of it with credit card companies because you're buying "credits."
I suspect it was done this way out of pure incompetence; the OpenAI team handling the customer-facing infrastructure has a pretty poor track record. As far as I know, you still can't do something simple like change your email address.
You should apply for and use OpenAI on Azure. We’ve got close to 1M tokens per minute of capacity across 3 instances, and the latency is totally fine, like 800 ms average (with big prompts). They’ve just got the new 0613 models as well (they seem to be about 2 weeks behind OpenAI). We’ve been in production for about 3 months, have some massive clients with a lot of traffic, and our GPT bill is way under £100 per month. This is all 3.5 Turbo though, not 4 (that’s available on application, but we don’t need it).
Probably compute-bound for inference which they've probably built in an arch-specific way, right? This sort of thing happens. You can't use AVX-512 in Alibaba Cloud cn-hongkong, for instance, because there's no processor available there that can reliably do that (no Genoa CPUs there). I imagine OpenAI has a similar constraint here.
In our own tracking, the P99 isn't exactly great, but this is groundbreaking tech we're dealing with here, and our dissatisfaction with the high end of latency is well worth the value we get in our product: https://twitter.com/_cartermp/status/1674092825053655040/