I understand your general point and am sympathetic to it: if you're a 10/10 on some scale, I'm about a 3-4. I've never seen billing for failures, but the billing stuff is crazy: no usage stats if you do streamed chat, and the only tokenizer available is in Python and for GPT-3.0.

However, I'm virtually certain something's wrong on your end. I've never seen a wait even close to that unless the service was completely down. Also, the thing about "small prompts"... it sounds to me like you're overflowing the context, they're returning an error, and something's retrying.
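
To make that failure mode concrete, here's a rough sketch of a retry wrapper that would not turn a context overflow into a long wait. Everything in it is hypothetical: `ContextOverflowError`, `TransientError`, and `send` are stand-ins I made up for illustration, not names from any real client library, and I have no idea what your actual retry code looks like.

```python
import time


class ContextOverflowError(Exception):
    """Hypothetical stand-in for a 'prompt exceeds the context window' error."""


class TransientError(Exception):
    """Hypothetical stand-in for timeouts, rate limits, and server-side errors."""


def call_with_retry(send, prompt, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send(prompt), retrying only on errors that might succeed next time."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send(prompt)
        except ContextOverflowError:
            # The same oversized prompt will fail every time, so re-raise
            # immediately instead of burning attempts (and wall-clock time).
            raise
        except TransientError:
            if attempt == max_attempts:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

If whatever sits between you and the API retries on every error, an overflow would look from the outside exactly like one very slow request, which is my guess at what you're seeing.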