
And yet, my go-to example of being personally unimpressed with ChatGPT 3 and 4 is trying to get it to output a valid NGINX configuration without hallucinating, despite feeding it the required documentation, the original Apache .htaccess file, and telling it which URL rewrites were required. This should not be a difficult task, but it constantly hallucinates things that aren't there, because it's an LLM, and that's what LLMs do: they predict the next word.

To the untrained eye, sure, the NGINX config looked great. But it didn't function: it didn't include the URL rewrites as instructed, it hallucinated a bunch of rewrites that didn't exist and would serve no purpose, and despite refining the prompts over many days and consulting with self-proclaimed "prompt engineers", it still didn't produce the expected output.
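For readers unfamiliar with the task: translating an Apache .htaccess rewrite into its nginx equivalent typically looks something like the following (a hypothetical blog rule for illustration, not the actual config from this thread):

```nginx
server {
    listen 80;
    server_name example.com;

    # Apache: RewriteRule ^blog/([0-9]+)$ /article.php?id=$1 [L]
    # nginx equivalent; "last" stops processing further rewrite directives,
    # roughly analogous to Apache's [L] flag.
    rewrite ^/blog/([0-9]+)$ /article.php?id=$1 last;
}
```

Each rule is a mechanical-looking translation, but the flag semantics differ between the two servers, which is exactly where an LLM predicting plausible-looking directives can go wrong.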

It's neat. Reliable? Not in my experience, for my needs, but I am genuinely glad you're making it work for what you need; that's definitely cool, provided it isn't hallucinating in some unnoticed capacity. It's a lot of trust to place in an LLM.



These things have limits, definitely. In almost every use-case I've tried, I've bumped up against them.

The trick is to know what they can and can't do, and use them where they're useful enough.

Someone here on HN quipped that ChatGPT "isn't intelligent" because it couldn't come up with a revolutionary new battery chemistry.

Like... what are you expecting? A god?

LLMs are useful when you can validate the output yourself. Similarly, they're useful when the output doesn't have to be precise but the inputs are English.

For example, they're awesome "filters" for human input as measured against human metrics. "Is the following text rude? Output YES or NO only." is very useful and works well enough, right now.
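That filter pattern can be sketched in a few lines of Python. The `complete` callback and the exact reply normalisation are assumptions here; wire `complete` to whatever chat API you actually use:

```python
def build_rudeness_prompt(text):
    # The prompt from the comment above: constrain the model to a binary answer.
    return "Is the following text rude? Output YES or NO only.\n\n" + text

def parse_verdict(reply):
    # Models sometimes add whitespace or trailing punctuation; normalise first.
    verdict = reply.strip().upper().rstrip(".!")
    if verdict == "YES":
        return True
    if verdict == "NO":
        return False
    raise ValueError(f"unexpected reply: {reply!r}")

def is_rude(text, complete):
    # `complete` is any prompt -> reply function, e.g. a thin wrapper
    # around a chat-completion API call (hypothetical, not shown here).
    return parse_verdict(complete(build_rudeness_prompt(text)))
```

The point of the strict "YES or NO only" instruction is that it makes the output machine-parseable, so a fuzzy language model can sit behind a boolean function.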

They're also useful when you need to iterate to find something, or when your search parameters are incredibly vague but have a narrow Venn diagram intersection.

I've been using ChatGPT as a replacement for the /r/tipofmytongue sub-Reddit. It knows everything well enough to be able to interactively find what I'm looking for to an extent that is super-individual-human, and in some ways beyond what even a very large collection of humans can achieve.


I think you and I are pretty aligned in our thoughts on ChatGPT. I think it's revolutionary, and I use it every day. But I don't consider it generally intelligent. I do think that novel thoughts of demonstrable value are the real test for that, and a useful invention would convince me beyond all doubt.


> a useful invention

In my opinion, the mental model you should use when evaluating LLM-vs-human capability is this: when asked a question, the LLM has to answer without being able to iterate, back-track, or even use a scratch pad such as pen and paper or a text editor. Basically, it's the same as an oral exam, where you have to stand in the middle of a room and get grilled by a professor to determine your knowledge of a subject.

Don't compare "human with tools and unlimited time" to "LLM with no tools and seconds of time". Compare it to a "human being interrogated in an empty room", and then it is much clearer where an LLM rates.

GPT 4 is definitely super-human in some areas, such as general knowledge and translation between languages.

No human knows as much, or can speak as many languages.

Ask yourself this: can any human, when asked to "invent something", just do it, then and there?


Give it a billion subjective hours. Give it databases and network access. Do you really think it will produce something? You definitely do a good job of explaining one of the limitations an LLM faces compared to a general super-intelligence.


Nginx/Apache config is not an easy task. I would call this an incredibly hard problem to solve. I certainly never trust that I've done the config correctly until I've tested it carefully. The way this eventually works is you tell the AI "I want Nginx config that has these properties" and the AI generates config and tests it against Nginx until it has the requisite config and it's tested.
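A minimal sketch of that generate-and-test loop, assuming you can shell out to `nginx -t` (which parses a config and reports errors without starting the server) and have some `generate` callback wrapping the model; the function names and the fixed round limit are my own illustration, not an existing tool:

```python
import subprocess

def validate_nginx_config(path):
    # `nginx -t -c <file>` checks the config for syntax errors without
    # starting the server; return code 0 means it parsed cleanly.
    result = subprocess.run(["nginx", "-t", "-c", path],
                            capture_output=True, text=True)
    return result.returncode == 0, result.stderr

def refine_until_valid(generate, validate, max_rounds=5):
    # Feed the validator's error output back to the generator each round,
    # stopping as soon as the config passes.
    feedback = None
    for _ in range(max_rounds):
        path = generate(feedback)
        ok, feedback = validate(path)
        if ok:
            return path
    return None  # gave up; a human needs to look at it
```

Note that `nginx -t` only proves the config parses; whether the rewrites actually do what you asked for still needs real request-level testing, which is the harder half of the loop.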


I'm really surprised it cannot output a valid NGINX config when you give it that amount of context.

Can you share the prompt and your expected output? I'm interested to see where it goes wrong.


Here's an experiment that showcases GPT-4's deep (meta)semantic abilities:

https://pastebin.com/8FwQzDiE

TL;DR. I start by asking it to generate sentences where subject-object inversion yields a meaningful sentence where the verb's meaning is shifted metaphorically. For instance "I smoke the cigarette vs the cigarette is smoking me". After some back and forth it comes up with:

The painter captures the landscape vs. The landscape captures the painter.

The gardener shapes the garden vs. The garden shapes the gardener.

The chef creates the dish vs. The dish creates the chef.

Maybe that will change your mind.



