Really? If this were Apple it might make sense, but for OpenAI it feels like a demo that's not particularly aligned with their core competency (at least by reputation) of building the most performant AI models. Or put another way, it says to me they're done building models and are now wading into territory where there are strong incumbents.
All the recent OpenAI talk had me concerned that the tech has peaked for now and that expectations are going to be reset.
What strong incumbents are there in conversational voice models? Siri? Google Assistant? This is in a completely different league. I can see from the reaction here that people don't understand. But they will when they try it.
Did you see it translate Italian? Have you ever tried the Google Translate/Assistant features for real time translation? They didn't train it to be a translator. They didn't make a translation feature. They just asked it. It's instantly better than every translation feature Google ever released.
What Siri, Google Assistant, Alexa and ChatGPT have in common is the perception that the same product actually gets worse over time.
Whether that's real or not is a reasonably interesting question, because it's possible that the only thing that advances is our perception of how things should be. My gut feeling is that it has been a bit of both, though: the decline is real, and we also expect things to improve.
Who can forget the demo Google showed at I/O many years ago, with their AI making a call to a restaurant? Everyone, apparently.
What OpenAI has done time and time again is completely change the landscape just when the competitors have caught up and everyone thinks their lead is gone. They made image generation a thing. When GPT-3 became outdated they released ChatGPT. Instead of trying to keep DALL-E competitive they released Sora. Now they change the game again with live audio+video.
That's only really true on the surface. So far the template is: amazing demos create hype -> once public it turns out to be underwhelming.
Sora is not yet released, and it's not clear when it will be. DALL-E is worse than Midjourney in most cases. GPT-4 has either gotten worse or stayed the same. GPT-4 vision is not really usable for anything practical. Voice is cool but not that useful, especially given the lack of strong reasoning from the base model.