
And yet, my go-to example of being personally unimpressed with ChatGPT 3 and 4 is trying to get it to output a valid NGINX configuration without hallucinating, despite feeding it the required documentation, the original Apache .htaccess file, and telling it which URL rewrites were required. This should not be a difficult task, but it constantly hallucinates things that aren't there, because it's an LLM, and that's what LLMs do: they predict the next word.

To the untrained eye, sure, the NGINX config looked great. But it didn't function: it didn't include the URL rewrites as instructed, it hallucinated a bunch of rewrites that didn't exist and would serve no purpose, and despite refining the prompts over many days and consulting with self-proclaimed "prompt engineers", it still didn't produce the expected output.
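For readers unfamiliar with the task: translating an Apache .htaccess rewrite into its nginx equivalent typically looks something like the following (a hypothetical blog rule for illustration, not the actual config from this thread):

```nginx
server {
    listen 80;
    server_name example.com;

    # Apache: RewriteRule ^blog/([0-9]+)$ /article.php?id=$1 [L]
    # nginx equivalent; "last" stops processing further rewrite directives,
    # roughly analogous to Apache's [L] flag.
    rewrite ^/blog/([0-9]+)$ /article.php?id=$1 last;
}
```

Each rule is a mechanical-looking translation, but the flag semantics differ between the two servers, which is exactly where an LLM predicting plausible-looking directives can go wrong.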

It's neat. Reliable? Not in my experience, for my needs, but I am genuinely glad you're making it work for what you need; that's definitely cool, provided it isn't hallucinating in some unnoticed capacity. It's a lot of trust to place in an LLM.



These things have limits, definitely. In almost every use-case I've tried, I've bumped up against them.

The trick is to know what they can and can't do, and use them where they're useful enough.

Someone here on HN quipped that ChatGPT "isn't intelligent" because it couldn't come up with a revolutionary new battery chemistry.

Like... what are you expecting? A god?

LLMs are useful when you can validate the output yourself. Similarly, they're useful when the output doesn't have to be precise but the inputs are English.

For example, they're awesome "filters" for human input as measured against human metrics. "Is the following text rude? Output YES or NO only." is very useful and works well enough, right now.
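That filter pattern can be sketched in a few lines of Python. The `complete` callback and the exact reply normalisation are assumptions here; wire `complete` to whatever chat API you actually use:

```python
def build_rudeness_prompt(text):
    # The prompt from the comment above: constrain the model to a binary answer.
    return "Is the following text rude? Output YES or NO only.\n\n" + text

def parse_verdict(reply):
    # Models sometimes add whitespace or trailing punctuation; normalise first.
    verdict = reply.strip().upper().rstrip(".!")
    if verdict == "YES":
        return True
    if verdict == "NO":
        return False
    raise ValueError(f"unexpected reply: {reply!r}")

def is_rude(text, complete):
    # `complete` is any prompt -> reply function, e.g. a thin wrapper
    # around a chat-completion API call (hypothetical, not shown here).
    return parse_verdict(complete(build_rudeness_prompt(text)))
```

The point of the strict "YES or NO only" instruction is that it makes the output machine-parseable, so a fuzzy language model can sit behind a boolean function.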

They're also useful when you need to iterate to find something, or when your search parameters are incredibly vague but have a narrow Venn diagram intersection.

I've been using ChatGPT as a replacement for the /r/tipofmytongue sub-Reddit. It knows everything well enough to be able to interactively find what I'm looking for to an extent that is super-individual-human, and in some ways beyond what even a very large collection of humans can achieve.


I think you and I are pretty aligned in our thoughts on ChatGPT. I think it's revolutionary, and I use it every day. But I don't consider it generally intelligent. I do think that novel thoughts of demonstrable value are the real test for that, and a useful invention would convince me beyond all doubt.


> a useful invention

In my opinion, the mental model you should use when evaluating LLM-vs-human capability is this: when asked a question, the LLM has to answer without being able to iterate, back-track, or even use a scratch pad such as pen and paper or a text editor. Basically, it's the same as an oral exam, where you have to stand in the middle of a room and get grilled by a professor to determine your knowledge of a subject.

Don't compare "human with tools and unlimited time" to "LLM with no tools and seconds of time". Compare it to a "human being interrogated in an empty room", and then it is much clearer where an LLM rates.

GPT 4 is definitely super-human in some areas, such as general knowledge and translation between languages.

No human knows as much, or can speak as many languages.

Ask yourself this: can any human, when asked to "invent something", just do it, then and there?


Give it a billion subjective hours. Give it databases and network access. Do you really think it will produce something? You definitely do a good job of explaining one of the limitations an LLM faces compared to a general super-intelligence.


Nginx/Apache config is not an easy task. I would call this an incredibly hard problem to solve. I certainly never trust that I've done the config correctly until I've tested it carefully. The way this eventually works is you tell the AI "I want Nginx config that has these properties" and the AI generates config and tests it against Nginx until it has the requisite config and it's tested.
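A minimal sketch of that generate-and-test loop, assuming you can shell out to `nginx -t` (which parses a config and reports errors without starting the server) and have some `generate` callback wrapping the model; the function names and the fixed round limit are my own illustration, not an existing tool:

```python
import subprocess

def validate_nginx_config(path):
    # `nginx -t -c <file>` checks the config for syntax errors without
    # starting the server; return code 0 means it parsed cleanly.
    result = subprocess.run(["nginx", "-t", "-c", path],
                            capture_output=True, text=True)
    return result.returncode == 0, result.stderr

def refine_until_valid(generate, validate, max_rounds=5):
    # Feed the validator's error output back to the generator each round,
    # stopping as soon as the config passes.
    feedback = None
    for _ in range(max_rounds):
        path = generate(feedback)
        ok, feedback = validate(path)
        if ok:
            return path
    return None  # gave up; a human needs to look at it
```

Note that `nginx -t` only proves the config parses; whether the rewrites actually do what you asked for still needs real request-level testing, which is the harder half of the loop.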


I'm really surprised it cannot output a valid NGINX config when you give it that amount of context.

Can you share the prompt and your expected output? I'm interested to see where it goes wrong.


Here's an experiment that showcases GPT-4's deep (meta)semantic abilities:

https://pastebin.com/8FwQzDiE

TL;DR. I start by asking it to generate sentences where subject-object inversion yields a meaningful sentence where the verb's meaning is shifted metaphorically. For instance "I smoke the cigarette vs the cigarette is smoking me". After some back and forth it comes up with:

The painter captures the landscape vs. The landscape captures the painter.

The gardener shapes the garden vs. The garden shapes the gardener.

The chef creates the dish vs. The dish creates the chef.

Maybe that will change your mind.



