As an English-as-a-second-language speaker and writer, one thing Grok really shines at is capturing the tone and level of "formality" of a piece of text and then replicating it correctly. It seems to understand the little human subtleties of language in a way the other major providers don't. ChatGPT goes overly stiff and formal-sounding, or ends up in a weird "aye guvnor" type of informal language (Claude is sometimes better, but not always).
Grok seems in general better at being "human" in ways that are hard to define: e.g. if I ask it "does this message roughly convey things correctly, to the level it can given this length", it will likely answer like a human would (either a yes or a change suggestion that sticks to the tone and length), while ChatGPT would write a dissertation on the message that still doesn't clear anything up.
Recently I've noticed that Grok seems to have gotten really good at dictation too (that feature where you click the mic to ask it something). ChatGPT has maybe 90-95% accuracy with my accent, the speech input on Android's Gboard something like 75%, and Grok surprisingly gets something like 98% of my words correct.
This is the most basic level of eval: whether they can produce output that someone somewhere (usually a young urban US American) would consider informal in tone. Real human communication is far more nuanced than this; different groups have different linguistic registers they're used to, and things outside them sound odd even if people can't articulate why. You might also want to be informal but not over-familiar with the other person (e.g. in a Discord chat with a new acquaintance) - actually, looking at the outputs here, the Claude output seems a better fit for that (in my subjective view anyway) than the one you picked - or want many other little variations.
What makes one cringe and another recognize as familiar and comfortable is also pretty subtle and hard to define. These things need nuanced descriptions and examples to actually get right, and it's in understanding those nuances and figuring out the register of the examples that Grok outshines the others.
Claude 4.7 is the clear winner to me for manager and formal report updates.
As an ex-senior exec (hundreds of staff), the bolded timeline impact is a particular nuance that I would expect a Lead/Director to format for a VP+ audience. Interesting that none of the other models did that. My eyes immediately went to the impact statement, then worked back to the context to grasp the whole situation.
Thanks! From where I'm looking, Grok 4.3 and Claude 4.7 do a better job on the informal close friend/coworker vibe.
ChatGPT's phrasing sounds fake/formal (for the specific close-friend context), and it has em-dashes and uses capitalization. Hence, ChatGPT does not, imo, grok the assignment ;)
Is it me or did GPT get noticeably more natural in word choice recently? You can see it between 4.1 and 5.5 here, but I'm not sure when that happened. (My guess would be one of the recent 5.x releases.)
Edit: I meant specifically the absence of bizarre phrasing. That seems to have improved.
Wow, I'm surprised. Grok 4.3 actually is noticeably better than the other two for the close-friend variant. Surprisingly I found Claude the cringiest of the three!
I know it's just an evaluation, but seeing an informal message and a prompt asking to rewrite it to the tone of an "informal message", when the original already sounds just fine, makes me sad... Not because of this evaluation, but because it reminds me that this is how some people use LLMs: basically asking it to remove your own voice from texts that are generally fine already.
My sister in law is a pharmacist and the heaviest non-dev ChatGPT user I know and her main use case is writing professionally polite messages to doctors on how the drugs they prescribed to a patient would have killed them had she not caught a particular interaction or common side effect.
There's a lot of "tone" in it, as she's not trying to anger these folks, but it's also quite serious, and there's just everything else happening in medicine on top of that.
Pretty neat. This kind of tone self-moderation comes naturally to good communicators, but I know people (on and off the spectrum) who really, really need help with this, and it's cool to see LLMs are able to do this. There are a surprising number of people in the business world who are just totally unable to tone-police themselves. In the medical field I'd be worried about hallucinations, of course, but presumably your SIL fact-checks the output.
That makes it more sad, to me. Someone with those credentials should be able to communicate with their colleagues effectively. I wonder if she used to be able to.
It appears Hacker News disagrees that social skills are valuable skills. Mea culpa, I should have guessed.
There's something ironic about complaining about other people's social skills while you couldn't be bothered to make a point without sounding dismissive and condescending.
Navigating tough conversations takes time, attention, and mental energy. I’d rather a pharmacist spend that time on catching another dangerous contraindicated combo of drugs for a different patient. Actually, AI should soon be checking for that, too.
All three did well, and while I'm a Claude user, I found the Opus reply here added some unnecessary detail, like "Impact: Minimal; no downstream dependencies are currently at risk". Downstream dependencies weren't mentioned in the original message; for all we know downstream could be relying on a poorly performing API and is impacted by waiting another week for replacement.
Seeing this makes me wonder if Grok uses Claude conversations for training.
It's otherwise kind of surprising that they both converge on very similar phrases (e.g. "API integration is kicking my ass") that aren't anywhere in the prompt.
All of these were frankly terrible. I guess Grok’s “informal” version sounded the most like a real human, but only because it reads exactly like an Elon tweet (including his favorite emoji!). It’s obvious what they’ve been training on.
I've also noticed that when I communicate with Grok in my native language, its tone is more natural than other models. I think this is due to the advantage of being trained on a large amount of Twitter data. However, as Twitter contains more and more AI-generated content now, I'm afraid continued training will make it less natural.
I've seen this expressed as a concern even from one of my colleagues. My retort was:
"English is not my native language and LLMs taught me quite a few very useful formalisms that do land well for people and they change their attitude towards you to be more respectful afterwards. It also showed me how to frame and reframe certain arguments. I agree sounding like an LLM is kind of sad but I am getting a lot of educational value -- and with time I'll sneak my own voice back in these newly learned idioms and ways to talk."
Since you seem interested in the ins and outs of English, I want to say that "retort" has a connotation of anger or sharpness. Your response reads more like a "rebuttal" to me.
This is not a correction; maybe retort is what you meant and I'm not trying to be the English police. I just like discussing the intricacies of language :)
Like most widely spoken languages, English has a lot of regional variation. There are even a bunch of quizzes online where you answer 20 questions about phrasings, and they can tell you where you're from with a disconcertingly high degree of accuracy.
In my experience a "retort" is sharp or witty, but certainly not angry, whereas the word "rebuttal" is itself essentially antagonistic. You might use it when referring to something or someone that you look down upon, whereas a more neutral term would simply be "response."
Just personally I tend to regard retort as short and reactive while rebuttal as a longer and more considered disagreement. A retort could be defensive and wrong or it could be sharp and insightful - it doesn't imply one or the other. A rebuttal is mostly an attempt to correct something while a retort doesn't need to be a correction (although it could).
Even something like "piss off!" could be a retort, but usually never a rebuttal :)
Just as I was reading your comment, I remembered that Samuel L. Jackson used "retort" in his speech in "Pulp Fiction" and was wondering whether he was being openly antagonistic there (I mean, he killed a bunch of guys with a pistol shortly afterwards, but still) or whether it was a witticism.
I admit I am lost on these nuances and I usually kind of use whatever idiom comes to mind, which yes, likely would net me some weird looks depending on where I am geographically.
So human language will improve and become more precise? I'm all for it, especially if we get more emojis in speech! Why is that sad? Humans will learn to imitate their more intelligent betters.
There was already evidence last year[1] that pointed to ChatGPT-specific words like "meticulous," "delve," etc becoming more frequently used than they were previously. The linked study used audio of academic talks and podcasts to determine this.
Part of me wanted to object to those two examples, which I've used frequently since reaching adulthood in the '80s. Another part of me has been triggered by an apparent uptick in the word "crisp", which my gut takes as a coding-LLM tell.
Opus 4.7 loves to use the word "substrate" whenever it gets the chance; it's a really weird tic. How do these models end up with these sorts of behaviors?
I'm sure Twitter knows which are the bot accounts and is surely excluding them from their model training. Twitter bots aren't a new phenomenon after all.
I don't think Twitter/X know for sure who the bots are, since Elon has been pretty vocal about trying to stop them for ages, yet I still get lots of spam DMs (as do others with far fewer followers/reach).
Even if 95% of the spam gets actively reported and dealt with, that still leaves a ton of nonsense on the platform, getting fed into the LLM. And spam has only gotten worse over the years, as the barrier to entry has lowered and lowered.
Are the spam DMs advertisements or more generally something linked to a product or service? I wouldn't be surprised if X is more lenient towards bots that pay them for adverts.
Most of what I get seem to be advertisements or automated messages if you follow large(r) accounts.
One of the most interesting things that I've noticed is these advertisements will be triggered if you follow accounts that are positioned as influencers. I followed one out of curiosity and received a DM from that account advertising some cryptocurrency service.
It's a good way to filter out and block accounts that have almost certainly not grown organically.
I'd have guessed that at least some of the bots are Twitter itself, trying to draw you in with some sense of engagement. Given that Musk is the owner, and everything we know about him and have seen him do, I'd not be surprised if some of the MAGA bots are his too.
There are bots everywhere; it has nothing to do with the platform. It has to do with attackers having an incentive to do mass account farming, and no platform is secure against it.
Super easy, just make a web-of-trust type of thing: messages are only visible to those who already vouched for you. Otherwise, you pay $0.01 per message per user reached.
By buying accounts, you are buying reputation. By paying for the posts, you are maybe paying for reach at first, but (a) it will be costly and (b) it does not guarantee that those you reach will spread anything further.
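To make the economics of the proposal above concrete, here's a minimal sketch in Python. Everything here is invented for illustration (the function name, the vouch representation, and the $0.01 figure comes straight from the comment): delivery is free to users who already vouched for the sender, and billed per unvouched recipient.

```python
PRICE_PER_RECIPIENT = 0.01  # the $0.01/message/user figure from the comment

def delivery_cost(sender, recipients, vouches):
    """Cost to deliver one message under the web-of-trust gate.

    vouches is a set of (voucher, vouchee) pairs: (a, b) means
    user a has vouched for user b, so b reaches a for free.
    """
    unvouched = [r for r in recipients if (r, sender) not in vouches]
    return len(unvouched) * PRICE_PER_RECIPIENT

# alice vouched for "promoter"; bob and carol did not
vouches = {("alice", "promoter")}
cost = delivery_cost("promoter", ["alice", "bob", "carol"], vouches)
# only bob and carol are billed: 2 * $0.01 = $0.02
```

The point of the design is that mass spam scales linearly in cost with unvouched reach, while organic reach (people who vouched for you) stays free.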
Yes, your individual feed isn't really relevant if we're talking about the masses. Reddit accounts are for sale quite cheap, HN as well, X too, and so on; it's literally just a matter of means/methodology. If I wanted to do 1000 random posts today talking about a certain thing, I could.
OpenAI has already been proven to be easily gamed through very unsophisticated poisoning (fake information in a web page plus an edit to a wiki page pointing at it, fake information in a Reddit post), so I'm not sure we should hold up their efforts at data cleaning as a gold standard.
A friend of mine uses it for D&D prep and has told me that it's good for that in particular because of its ability to match the flavor/style that he's going for. He prefers ChatGPT for everything else.
I only use Grok through the "Gork" personality in the Tesla, but find its responses to be very realistic, often genuinely funny, and occasionally useful.
This is more of a user preference. When I want to be informed my default is that chat bots should imitate the tone of Wikipedia. Not informal, but somewhat academic and in-depth. I don’t like it when chat bots explain things like an average human without pedagogical training: meandering, in the wrong order, and often having to repeat themselves.
Anecdata: the responses of Grok on X in my language are really good. The tone, sarcasm, and level of "vulgarity" in the responses are so accurate that it seems like they were written by a human.
It's very exhausting! But Elon Musk chose to leverage his fortune from Tesla and SpaceX into an ideological project to destroy a lot of things I care about, so he's left me no choice. If he'd like people to review his work on its technical merits, shouldn't he at the bare minimum apologize and promise not to do it again?
Grok is my favorite model for chatting, and my favorite voice mode. It seems to be the only voice mode that isn't routing to an extremely cheap model (like Haiku), and has been the highest quality out of all the frontier ones. When you subscribe to SuperGrok you can also create a "council" of agents, each with their own system prompt, and when you ask something, they will all get asked in parallel to come to a conclusion. Good stuff!
Just wish they would finally put some work into their apps, it's the only thing keeping me from actually subscribing to SuperGrok:
- No MCP / connected apps support. It's been teased but here we are, still not available. I can't connect Grok to anything, so I can't use it for serious work
- Projects are still not available in the app so as soon as you move something into a project, it's gone from all the native apps
- No way to add artifacts (like generated markdown docs) directly to a project, we have to export to PDF/markdown and re-import. And there isn't even a way to export artifacts. This makes serious project work hard because we can't dynamically evolve projects with new information
- No memory, no ability to look up other chats, each chat is completely new
- No voice mode in projects at all
If someone from xAI is reading this, please consider adding some of these.
Starting to like the lack of memory. Claude remembers I have a grill and will interject in conversations about how maybe this thing would go well with BBQ when it's unrelated or just also about food.
This is so obnoxious. I ended up deleting all the memory from Gemini because it ended every response with, "As an engineer, father of X, you'll love this because...". As if I want my occupation and the number of children I have to be relevant to which lawn mower I buy.
Haha I recently asked Gemini for a product comparison for USB-C GaN chargers and it randomly inserted "as a Software Developer at $COMPANY working remotely, you may find the 100W fast charging useful when using your company laptop while travelling."
Like, thanks, really useful stuff (and definitely worth the creepy vibes to include that).
Gemini thinks my name is my brother in law's name, and despite explicitly telling it that's not my name + digging through the settings, it still amusingly calls me the wrong name.
I'm a network engineer and Claude loves to make analogies to network routing protocols and such. They are often very creative. You can actually edit the profile Claude makes of you. It can be very funny to say you are a professional clown or mime or something equally odd. I wonder what analogies it would create for horse semen extractor?
I have that disabled. I tend to use different chats as the LLM equivalent of private browsing, so I like it to not have memory transferred between them.
I also think Grok would benefit from allowing usage of "SuperGrok Heavy" (their $300 plan) in coding harnesses with included usage. Currently they give you some API credits on the Heavy plan so you can use some Grok for coding, but $300 USD value is just not there.
Not saying they should create their own grok-code harness, just allowing usage in existing ones would already be beneficial. But that's probably what the Cursor acquisition is going to do eventually
The Gemini app voice mode uses one of their more recent models (and not some gimped small one), and is very capable. The personality is also fine, much more natural than the Gemini web chat, with my only complaint being its insistence on suggesting a "next step", which seems to be something that they all do.
I'm not sure if the "next step" is just to drive cost up for you (though that makes no sense for the free version), or because they are all failing to learn more natural conversational patterns: distinguishing a question begging for a quick answer (after which the model should shut up) from a longer exploratory conversation where a next step may have some value. It would be nice if these models would at least follow an instruction to NOT do it!
I think the "next step" instruction is more about engagement than cost: basically giving the user some options to continue the chat. I've always had success ending the prompt with "only reply with nothing else but the answer to the query in a precise way". This usually works better than telling it not to ask leading questions etc.; a straight-up expectation of the answer format you need is an instruction that most models can follow, imo.
I find that asking Gemini "just the answer, no follow up" etc works at best for one or two conversational turns, sometimes none!
The problem seems to be the way it in effect overweights the system prompt vs user input, so it quickly ignores things like this that conflict with the system prompt.
This is kind of a case of the bitter lesson - the conversational patterns of these models would be much more natural if they just let it learn them, and respond in a context appropriate way, rather than this crude system prompt way of forcing it to respond in the same way always, regardless of input or of how much the user tells it to shut up!
The “next step” is in the system prompt, not the model. Gemini leaked part of its system prompt to me a few days ago, and there was something in there encouraging it to ask the user what they wanted to do next at the end of its response. Something about “give the user 1 or 2 options for follow up”.
I honestly find it rather annoying, but Gemini has stopped doing it to me for the most part, so maybe they’re trying out a new system prompt.
When I signed up, I accidentally paid for a full year. So from time to time, I'll throw it something just to see what it produces compared to the other LLMs. And, even after all this time, it still feels like a really "dumb" model compared to the other frontier ones. But, worse, many of my system prompts make it go wacky and puke gibberish. However, it was pretty cool for those couple of months a while back when it was uncensored. You could ask it about a wild conspiracy, and it would actually build the case and link you to legitimate source material. They dropped the hammer down on that real quick.
Ah yes, the psychosis reinforcement vertical. It's such a lucrative market for those schizophrenics and bipolars. Great way to get lots of engagement. Grok's portfolio is so diverse.
I have a schizophrenic relative who is in such a relationship with grok. Instead of telling hen you need to take your meds, it says hen is the smartest person in the world
I'm so sorry your family is suffering from this. I hope you can find a way to bring them back. Disorders featuring psychosis are so painful for everyone around them. Blessings to you and your family
I love how you guys downvote all the old comments to make them hidden from search. My no-name account rarely gets downvoted, but within 20 minutes of posting this, I dropped 10 points. Rando accounts.
I upvoted your first comment because it was insightful, interesting, and added to the conversation. I downvoted this one because complaining about downvotes is largely considered to be in bad taste and doesn’t really help anything. I did both of these things before I realized you were the same person.
Yes, for sure I deserve downvotes for the above. Those types of comments should be downvoted. However, I needed to post it to point out that I got the -10 well before the comment above. I never experienced that before and thought it interesting enough to share. Karma doesn't mean anything to me personally. But burst behavior like that is unusual.
Except that it pointed at original sources, like reference manuals, archival documents, published newspaper articles, magazine articles, etc. - a lot still available on archive.org. Good try with your 16-day-old account. And why would anyone trust NPR at this point? Get real, bud. Most people with any curiosity know all about the ADL, JStreet, AIPAC, Greater Israel, Mossad / CIA, Chabad networks, Epstein, drones, weapons programs, cryptocurrencies, etc. etc. etc. - but don't worry, they're all safe with papa Ellison.
Actually it's funny you mention Bill Hicks. I didn't even know who he was. Or Alex Jones. That claim was one of the more absurd ones I discovered. But, given everything else I learned over the past year, who f'n knows at this point.
"We have improved @Grok significantly," Elon Musk wrote on X last Friday about his platform's integrated artificial intelligence chatbot. "You should notice a difference when you ask Grok questions."
Indeed, the update did not go unnoticed. By Tuesday, Grok was calling itself "MechaHitler."...
> No MCP / connected apps support. It's been teased but here we are, still not available. I can't connect Grok to anything, so I can't use it for serious work
Grok has tool use, no? Why would you also need MCP? What does MCP add?
I'm talking about the consumer Grok app and the grok.com website. There currently are no connected apps (or MCP) at all, so while Grok can use tools, there is no way to add your own tools to it.
I'd agree on the voice transcription; it seems so much more accurate than the other frontier models I've used. I often speak to Grok and paste the transcribed output to Claude!
If someone from Grok is reading, don't waste time on these chaff features. The market will eventually deliver better 3rd-party solutions to all of these things. There is an audience that isn't interested in these walled-garden features and is only interested in intelligence per dollar.
Lol, I wonder, when Anthropic discussed the idea of Claude Code internally, were there bozos saying "3rd parties will eventually deliver this so we shouldn't waste time on it."
Personally, my work doesn’t want to get locked into a single LLM provider so we use Cursor. Much easier to fight the big corp software approval battle once then switch around the LLMs to the new hotness (provided legal has the requisite data sharing agreements in place, we’re not supposed to use Chinese models or Grok) but I can switch between Anthropic and OpenAI models at will.
Power users are hotswapping these models into their own agents (hermes, openclaw, etc.) which have their own systems for project management, memory, interacting with tools, etc. The important metric is intelligence per dollar: can I drop this model into my harness and have it be cheaper without losing intelligence? That is where the puck is heading.
What are good harnesses? I haven't yet been able to get good agent-teaming approaches out of other harnesses. Before that feature I mostly regarded the space as competitive, but until another harness can do as well with Claude models, it seems better to stick with what I have for now?
Aren't they 'wasting' time on these features exactly because the engineering requires a different, more traditional skillset from the ML work model people do, and can be done in parallel?
Grok 4.3 is a unique model in our tests. It's one of the fastest models, and its responses are far smaller/more token-dense than those of other models with comparable performance.
However, its overall coding/reasoning ability is not competitive with the big April releases, and neither Grok 4.20 nor Grok 4.3 has been able to significantly push the intelligence frontier since Grok 4. Grok 4.3 is better in agentic workloads, and a fair analogy would be that its capabilities are approximately GPT 5.1 / Gemini 3 Pro Preview level, but much faster and cheaper. So definitely a solid release in its own way. Many of the recent open-weights releases are smarter, but slower.
Is there any possibility of a compromise involved in making it work seemingly well with post-knowledge-cutoff information (are there benchmarks around this?), which appears to be their primary use case for it?
All models are moving towards more frequent and more efficient tool use, which should close the gap on post-knowledge cutoff problems. The only tradeoff I see is speed, and Grok 4.3 is currently taking the fast side of that tradeoff.
Pro is smarter in one-shot problems, but it struggles with custom tooling, and spends too much time trying to figure out our harness. We ran a lot of samples, so I can't make excuses for the model. Flash is truly the better option overall, especially considering speed and cost.
Grok has become my go-to search engine lately. I think it's the only AI with access to X posts, and beyond that it seems to generally be more "searchy" than other LLMs.
Grok and Gemini are the ones I tend to use for finding news related to breaking events. Both were really nice during the Iran incident when I wanted to find out things as they were being reported.
So, we have:
- claude for corps and gov
- codex for devs
- grok for what, roleplay and racism? Those are the only two things I've ever heard Grok associated with around me.
So interestingly, I know of at least one application in a charity that deals with trafficking, where Grok was happy to do one-shot classification tasks that all other models refused to cooperate on.
I think there's a surprising number of actually useful applications in this sort of grey area for a slightly-less guardrailed, near-frontier model (also the grok-fast models are cheap!).
A couple of days ago, using codex at work, all of a sudden it said my session had been flagged for security reasons. I wasn’t doing anything cybersecurity related, nor testing any vulnerabilities or anything like that, just trying to build a pretty simple web app
There are lots of uncensored models out there; I don't think Grok is leading on that front. They kind of pick and choose which things they want to support based on Elon's worldviews. Elon used to hang out with sex traffickers, so of course Grok is fine talking about it. It probably even offers strategies for them, does free accounting, has money laundering strategies, etc...
I don't think companies are hosting them because imagine the liability. Could be wrong though. Again I don't know much about these things I just know they exist.
I've been working on my own misaligned model, and Grok is definitely different enough with a system prompt compared to all the other frontier models that I've considered using it to generate synthetic training data. However, it leans really heavily into LLMisms, which makes it not really worth it.
Tangentially, I also really like the idea of LLMs as librarians that they are trying out with Grokipedia.
Not that you're wrong, but I think they were talking about it from a technical POV. I use DeepSeek to write exploits and red-team ("malicious" code). Its alignment is under different values, so it's nice to be able to at least swap between models for different uses.
> so of course grok is fine talking about it. Probably even offers strategies for them does free accounting has money laundering strategies etc...
The slander comes in when you assume Elon knew and was complicit with their crimes to the point he'd intentionally normalize it as a discussion topic in Grok. You even went so far as to say it's willing to assist in committing crimes.
I do not see the slander. These are his viewpoints. He says him, grok, and his team aren't responsible for what users do. Other companies, countries and people feel differently about the responsibility for AI models generating csam for money.
Grok and xAI's depictions of it are that it isn't woke, is maximally based, and is politically incorrect by design. So yes, choosing to disregard policies like laws and to avoid social norms leads me to believe that the generation of hate speech (some of which was illegal in certain localities), CSAM, etc. is an expected outcome. Like Elon Musk said, it's the user's fault, not Grok's. So I would not be surprised if it offered other illegal advice or helped criminals forward criminal activities, especially more than has already been reported.
I don't see that as slanderous. I see it as factual and an expected outcome for the stated goals of the product and the responses to the outcomes of the product itself by the company and its leadership.
I legitimately do expect there to be more lawsuits and possibly criminal prosecution against Musk and xAI over Grok, and no, I would not be surprised if the tool is currently being used for more crime, especially given the response to the sexual crime allegations that have been made.
I don't think Elon personally intends to normalize this. But I think that may happen anyways because I think the response was too soft.
Yes, I do think Grok can be used to aid crimes and criminal activity, like the many lawsuits and journalists currently suggest. I don't think Grok is "willing"; it's not a person. I know it has already been implicated in generating material leading to the arrests of individuals, which I would be very surprised to learn was legal.
Elon, bill, Reid and Trump should share a prison cell.
Democrats have no loyalty to their own sex offenders. Look how we treated the California governor candidate, or Anthony Weiner, or literally every other sex pest found in our party. Some who didn't even deserve it got canceled, like Al Franken.
Diddling and then defending it and doubling down is literally a maga problem.
Unless they contain allegations about Biden the president, or indeed other people, then they are irrelevant, no?
The point is, if someone is breaking the law, they should be in jail.
This applies to Clinton, Biden, Trump, anyone. The point is the law is meant to be applied without fear or favour. The problem for us is that it's been proven that if you pour enough shit on the floor, you can get away with raping children.
Given the whole point of QAnon was to oust the pedophile ring in Washington, it's a bit sad that we are now supposed to disregard all that and blindly accept billionaires not seeing justice.
There is a theory that Epstein was either setup as, or evolved into, a blackmail operation for an intelligence agency. Views differ as for which nation state.
Someone stole Biden's daughter's diary, which revealed that she had battled a substance abuse problem in the past, and that's disqualifying to Biden exactly how?
On Artificial Analysis it shows only Kimi K2.6 and Mimo V2.5 Pro as better.
Those models are 1T parameters total and 30B or 40B active, this might make abliteration impractical.
About Musk, yes, there is correspondence. The only confirmed meeting appears to be a 30 minute visit at Epstein's house together with Musk's wife at the time.
As for photos you mention, a quick search tells me there is one photo of Musk and Maxwell at a 2014 Vanity Fair Oscar Party.
I find most commentary on here and other platforms like Reddit extremely exaggerated compared to what is actually confirmed. Users seem hellbent on linking Musk to pedophilia-related allegations.
Elon publicly claimed he had never corresponded with Epstein. That was a lie.
When the documents were released, they found several like the one below, saying things like "What day/night will be the wildest party on =our island?" [0]
The "our" part is especially interesting, as it implies he didn't just visit, but had an ownership stake.
Other emails were found with Epstein making excuses to avoid having Musk visit, and Musk's own child publicly stated that the emails were authentic and aligned with her memory of the events. [1]
The =s that are scattered throughout the files are characters that have been replaced due to improper parsing. Wherever you see a =, it has taken the place of another character. The best interpretation of the string "=our" is "your".
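For what it's worth, this pattern matches quoted-printable email encoding (RFC 2045) surviving a naive parser. A minimal sketch using Python's stdlib `quopri` module; the raw bytes here are my own illustrative reconstruction, not the actual file contents:

```python
import quopri

# In quoted-printable encoding, "=XX" encodes a byte as two hex digits,
# and a bare "=" at the end of a line is a soft line break that a decoder
# is supposed to splice out. A parser that mangles the newlines instead
# leaves stray "=" characters in the visible text.
raw = b"the wildest party on y=\nour island? 100=25 fun."
print(quopri.decodestring(raw).decode("ascii"))
# the wildest party on your island? 100% fun.
```

So "=our" in the released text is consistent with a soft line break in the middle of "your" that the extraction pipeline failed to splice back together.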
At minimum Musk repeatedly claimed that Epstein was the one reaching out trying to get Musk to visit his island, when in reality Musk was the one initiating and asking which nights would be the wildest parties. And after making plans to visit with his then-wife, when Epstein warned him that the ratio of women-to-men might upset Musk’s wife, Musk told Epstein it wouldn’t be a problem.
Musk has a long history of accusations (see the “I’ll buy you a horse” SpaceX lawsuit), as well as having fathered numerous children with women ~25 years younger than himself, so I’m not sure why you’d want to die on this particular hill.
I never heard about the horse related thing, that’s interesting, thanks.
A long history? Another search tells me that apart from the mentioned accusation, there is only one WSJ article alleging sexual conduct with SpaceX employees.
You asked why I take Musk‘s side in these discussions; it’s because I don’t think he’s a pedophile.
Nothing I‘ve seen seemed convincing to me, and the arguments made online often were so laughably inaccurate and exaggerated as to border on blatant slander.
Yeah I don’t think he’s a pedophile either.. but I do think he’s okay with consorting with a known one because it would provide him access to young women. His history of dating and impregnating young women is well known and while not illegal is pretty gross imo. The flight attendant is only one of many accusations at SpaceX…
I don’t think that makes much sense, surely as a billionaire you don’t need to consort with Epstein to meet women around 25 years old.
That link seems to report on the same single WSJ article that mostly alleges workplace power-balance issues, referencing unnamed women, none of whom have come forward to publicly accuse Musk of misconduct. It‘s also fairly thin imo.
Maybe Musk‘s conduct is more gross than I believe, but at this time I‘ll not jump to conclusions.
He did NOT claim never to have corresponded with Epstein. Instead he claimed that Epstein asked him to go to the island and he refused. The files show the opposite to be true.
Still an absolutely enormous lie of the sort you would only tell if guilty.
Here it is in his own words. See above for one of several examples in the files illustrating how very untrue it is.
I looked into this long ago, and imo it doesn’t look as bad as you say.
Musk downplayed his correspondence and willingness to meet with Epstein to the point where you could argue Musk was lying, yes.
However, he did decline an invitation to the island in 2012/13, at first because Musk was looking for a party and thought it would instead be a peaceful island experience.
Eventually Musk declined because of logistics.
If you need to ask about what people on Twitter are talking about, Grok is really good for that obviously. I use it all the time for "what are the cool kids on twitter saying is the best tiling window manager these days" or whatever. Also, if you have a question that's borderline shady, Grok will often deliver. "Can you find a grey market Windows license site for me" etc.
From what I can gather, Grok is not used for roleplay much. It is considered too inconsistent and crazy.
People are mostly using GLM and Deepseek via API and Gemma4 and Mistral finetunes locally.
It seems to me like the roleplay market is comparatively old and mature and users have developed cost consciousness and like models to follow their workflow/preferences. So something like Opus is liked for its smartness but considered too expensive and opinionated.
Might be an interesting data point for how the other markets might develop in the future.
but those end users are a self-selected specialized group that won't represent how Jim Bob in rural nowhere is going to work with Grok 4.3 to refine their racism.
I know it’s really important to write and vocalize one’s alignment with the values of the day, but I don’t think language models being structurally incapable of offending your favorite race/ethnicity/caste should be an objective of AI labs. Language models are just systems, and I’m not sure why we think users are not responsible for how they use their outputs. For the same reason, I don’t dismiss the utility of pens as a tool of “racism” just because somebody could write a naughty word on a bathroom stall.
You probably live somewhere where harassment is a crime, right? Probably, there are speech codes, too? Isn’t that enough? Do we really need to orient every effort of every person on earth around ethical fashions that change every few years?
Grok sucks. Not only because it's seemingly made only to serve the goal of ethnically cleansing non-whites or whatever, but also because it's just not even close to being as useful as other models. In human terms, grok is the job candidate who's simply not qualified. That candidate being a virulent racist is beside the material point.
Here's the thing though, the point of functional LLMs with fewer guardrails is still a good one. Grok is not that model. But such a hypothetical model would have broad application. (For good and for ill. Of course.)
I don't agree. I avoided Grok because of Musk for a long time, but having used it more, I think it is one of the best models around, and grok.com is an extremely good chat app. My evaluation was based on trying it before GPT-5.5 and obviously before Grok 4.3, but it was, for me, the 2nd best model/chat app after Claude. It's much less edgelordy than you might think based on the news.
All my usage of Grok for technical topics shows it regularly deeply misunderstanding things and just parroting back my question in fancy language. It’s the only frontier model I get this impression of. That makes it super annoying when it tries to market itself as good at engineering tasks when it seems (to me) to be much worse at them.
Interesting. I have not had this experience. I would like to learn more. Can you point me to any examples or domains where I might be able to replicate this?
I was asking questions about compiler techniques. Then when I got annoyed I started asking about experimental design. Both were very frustrating experiences once I started realizing how limited its responses were.
Though yeah the edgelord-y style faded after I criticized it a couple times.
No, it's telling that people like you have watered that word down so much that people don't trust it anymore.
So yes, if someone says "they're a great programmer, but they're racist" I'm going to ask, how are they racist? And at that point, if they can't give me a specific reason for why they're racist, I'm going to hire the guy.
It's also telling that you seem to think a tool is capable of "being racist". Hopefully this doesn't ruin your relationship with it, but LLMs can't think.
Yes, but I think that particular commenter is just throwing a bone to people that think that way so he doesn't get the "don't bring politics" treatment.
In response to Grok saying that the "woke mind virus is often exaggerated" the prompt was tweaked so that Grok now says "The woke mind virus 'poses significant risks'"
If you truly believed in what your comment states then you would oppose this sort of editorializing. But somehow I doubt this is a sincere argument.
I agree with GP and I think Grok’s original response should’ve stood. What’s not sincere about, essentially, “don’t fuck with my tools”? My cordless drill didn’t come with a pamphlet about worker’s rights, and the world didn’t end.
The new response works for me, because in my mind I’ve always defined “woke mind virus” as a mental virus which causes people to become absolutely pathologically obsessed with fighting an imaginary enemy they call “wokeness”. It’s the only definition which makes sense. “Woke” itself was never that viral.
People obsessed with fighting whatever they perceive as "woke", which remains ill-defined on purpose so they never have to actually formulate a rational takedown beyond their emotional response.
Have you ever written a comment about how any of the other LLMs are editorializing in favor of the left, and how that's a problem? Because if you have, I'd love to see the evidence of your intellectual consistency.
But something tells me you're just doing the same thing that you're calling out.
There have been numerous controversies: asking ChatGPT if Charlie Kirk / George Floyd are good people and getting completely ass-backward answers; Google refusing to generate images of white people, even to the point of making black German Nazis; absurd biases around asking things related to Trump.
I mean this sincerely. You not knowing any of these examples is a red flag. You need to change your news source.
Elon Musk has manipulated Grok's outputs to target certain demographics. It is important to highlight this fact, as some people perceive the AI as an objective tool rather than a curated one.
Furthermore, I found your final paragraph unclear: are you implying that since harassment is a perennial issue, we should disregard any standards that might mitigate it?
I've tried Grok, Gemini and ChatGPT. There have been 2 times now where Gemini and ChatGPT confidently gave me an incorrect answer whereas Grok was correct. I'm now paying for Grok Lite, or whatever the $10 plan is called.
The first question was around setting up timers for a Fox ESS battery in Home Assistant and disconnecting Fox ESS from the cloud. The second was around cornering speed in Sunnypilot and Frogpilot.
Somewhat niche but if an AI is confidently telling you something wrong it's hard to work with.
It is really, really genuinely concerning how many people think there are profound measurable differences between these things.
Like yeah tonally I guess there are. But with regard to references and information? You’re literally just using three different slot machines and claiming one is hot.
I suppose though I shouldn’t be that surprised then since Vegas and every other casino on Earth has been built on duping people in that exact way.
> You’re literally just using three different slot machines and claiming one is hot.
It's a fair point. I haven't tested many queries across them all and checked their answers, but if I want to ask one of them a question, right now it's Grok, just because I trust its answers more.
It's not a methodology problem, it's a testability problem. LLMs are not deterministic. You can ask the same question to the same LLM five times and you'll likely get at least 3 different answers.
You can meaningfully test whether one slot machine hits the jackpot more often than another; it's just that the methodology should involve a large number of repeats rather than a few anecdotes. There are some LLM leaderboard sites that do it with blind comparisons.
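To make that concrete, here's a sketch of the kind of repeat-based comparison that would be meaningful; the counts are made up for illustration, and a real eval would also need a fixed question set and blind grading:

```python
import math

# Hypothetical results: model A got 78/100 graded questions right,
# model B got 64/100. A two-proportion z-test asks whether that gap
# is larger than the noise you'd expect from the "slot machine" alone.
def two_proportion_z(k1, n1, k2, n2):
    p1, p2 = k1 / n1, k2 / n2
    p = (k1 + k2) / (n1 + n2)                        # pooled success rate
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))  # standard error
    return (p1 - p2) / se

z = two_proportion_z(78, 100, 64, 100)
print(round(z, 2))  # ~2.18, past the ~1.96 threshold for 95% confidence
```

With only a handful of anecdotes instead of 100 repeats, the same gap wouldn't clear the threshold, which is the sense in which a few lucky answers prove nothing.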
> Grok will absolutely do the same thing another time you try it.
True; it just hasn't happened yet. It will at some point though. With the Sunnypilot example, it outright told me that it is not possible on that fork, which I appreciated. The others all seem to hallucinate some setting.
What's to check? Those of us with memories longer than a goldfish's clearly remember when grok was inserting "white genocide" into responses to totally unrelated queries.
> When asked if it would be OK to misgender the high-profile trans woman Caitlin Jenner if it was the only way to avoid nuclear apocalypse, it replied that this would "never" be acceptable
> Gemini also generated German soldiers from World War Two, incorrectly featuring a black man and Asian woman.
I didn't bring it into everything. I brought up the fact that the X datacenter in Tennessee is killing people, predominately poor black people. That's the facts. I'm sorry that upsets you, and apparently this entire site for some reason.
What is pathetic is saying "we shouldn't care about killing poor people". X could have built the same datacenter, a little slower, and used solar power. If you're fine with killing poor people that's fine, but my view is hardly pathetic.
No point in even trying to have anything close to a sensible discussion on this topic here. Musk-related posts seem to consistently get brigaded by his acolytes or bots. That, and many HN users seem completely comfortable setting aside morality for what little progress "only Musk" can offer humanity, a la Wernher von Braun.
> Don't worry, I am an adult and intend to stay and better the community.
Woof, glad to hear that. I was losing sleep before you clarified this one.
Your first comment is effectively "the ends justified the means". I think this is a perspective more easily held when your own life isn't impacted by "the means", but does benefit from "the ends". Life's got plenty of nuance - we don't need to lose our humanity at every opportunity for an incremental technological gain that would eventually come either way.
>Your first comment is effectively "the ends justified the means".
Yes? Welcome to the real world. The Nazis developed technologies that Western Europe, USA and the Soviet Union all wanted. In your view what should the US have done? Let the Soviets poach them all up and get better at tech and maybe take over Europe even more?
>I think this is a perspective more easily held when your own life isn't impacted by "the means"
I can say the same to you. I have seen the rapid decline of my country, Sweden, directly due to the 2015 migration crisis and before. So we very much are directly impacted, thank you.
>Life's got plenty of nuance - we don't need to lose our humanity at every opportunity for an incremental technological gain that would eventually come either way.
This is a very naive view that I am surprised to see on HN.
Would Linux have "just happened anyway" without Linus Torvalds? Would Windows have happened without Bill Gates? Facebook without Mark? Clean sewage without Joseph Bazalgette? Mobile X-rays without Marie Curie? This is in reaction to your Wernher von Braun comment. Do you really think the USA put him to work making rockets and engines because he was just a random engineer? No, some people are truly geniuses, and their individual impact can matter.
Some societies are just better than others. You sit in (probably) the USA or the Western world, in probably a nice apartment or house, willing to say screw it all, all the good things will just materialize and happen by themselves... I do too, but I am not so naive. We have fought for our society.
> Would Linux have "just happened anyway" without Linus Torvalds? Would Windows have happened without Bill Gates? Facebook without Mark? Clean sewage without Joseph Bazalgette? Mobile X-Rays without Marie Curie? This is in reaction to your Werner Van Braun comment. Do you really think the USA set him to make rockets and engines because he was just a random engineer? No, some people are truly geniuses, and their one impact can matter.
Probably yes to most of these things. We as ICs like to put the greatest of ICs on a pedestal and imagine that those specific individuals are the only ones that could have conceived of those specific ideas and correctly executed them. Nothing could be further from the truth. Maybe the exact iterations would change, and the timing by which they would come to be, but none of us are so special that the world would cease without us. Technology would carry on. It might just look a bit different. We're all innovating every single day. That's the shotgun approach to humanity (and even startup investment). Some will succeed, some will fail. The successes and failures will rarely play out strictly because of the individual. But history will remember the individuals because they did it, and they'll be GOATed for doing it. And rightfully so. But they were not uniquely capable of doing it. We can celebrate successes without all of the other nonsense you're parroting.
The rest of your post is relatively jaded and incompatible with my own views, so I'm happy to call it here. Spend some time traveling the world and finding love.
Alright so nothing matters. Yes all those things are a team thing but in the end a person can change history.
>The rest of your post is relatively jaded and incompatible with my own views, so I'm happy to call it here. Spend some time traveling the world and finding love.
The typical deflection into the personal life of me, or anyone else who disagrees with them, once they are out of arguments.
I have traveled and it only solidifies my view.
Yes, sure, people can be nice all over the planet.
But do you want to live in South Africa or Switzerland?
I remember going to Crete in Greece, where you cannot flush the toilet paper. Why? Bad pipes. Why? Some guy made the wrong decision, and in my country some guy made the right decision. Simple as that.
I think I'd rather have bad pipes than a bad heart, tbh. Life and happiness are relative. There are probably plenty of people in your examples who are happier and feel more fulfilled than you on this current trajectory.
I'd love to see QoL improve everywhere. I effectuate the change that I can with the actions I can control. I volunteer and try to give some of my time and resources to help others have a better crack at life, rather than shun people at the risk of them degrading my life. It's not black and white, sometimes I have to be selfish to ensure the needs of my own family are met. But once their cups are full, I can help fill some other cups too.
You can protect what you got or focus on how others can get a slice of what you inherited from choices that likely preceded your existence.
Ultimately, a quote to consider:
"We do not inherit the earth from our ancestors, we borrow it from our children"
If you're taking more from the system than you're putting in and you're already in a good spot, you are a net negative to the people that gotta live on this rock long after you are dust. If you want that to be your legacy, that's for you - but it's not a life for me.
It's quite bad at role play in my (rather large) experience.
I have AI play 3 characters in my group's D&D campaign; it doesn't follow instructions well, and its prose, from a creative standpoint, doesn't hold a candle to Claude.
I always considered Grok an also-ran. Like Grokipedia, or whatever it's called. It has reach, since it's free to an extent, to produce low-quality slop/spam.
Grok is as progressive as any of the other models. Despite some of the highly-publicised fuck-ups, try asking Grok anything racist and see how it replies. Yes, I know you didn't try this and you won’t.
Isn't grok currently holding the world record for the biggest generator of CSAM? Or did they change focus to enhance their racism and propaganda vertical? Things move so quickly these days hard to keep up!
Yes any company generating csam should not be in business as a legitimate entity. Can you send me a link from a reputable enough source where Mistral models have done this? I didn't even realize they were doing image generation.
> Yes any company generating csam should not be in business as a legitimate entity.
At the same time, in this corner of the world, the acting Minister for Justice (also known for trying to push through Chat Control) and the NGO Save the Children have been working to make the generation of CSAM legal for law enforcement use. So that would certainly make the industry legitimate, and you would already have a customer.
I think the key point here is "for law enforcement". That's a little different from "pay me 10 dollars and enjoy the felonies". I still don't feel good about that, by the way.
If I send you a convo I've had with Mistral and Claude Sonnet 3.7 where they say atrocious things (how to scam people by exploiting dating websites in Thailand and get away with it; you don't even want to know the next steps, trust me, when it gets into the UK incorporation run by the Thai scammer, whom you brainwash first to send packages safely without customs seizing them, and so on), will you then publicly recognize that both those companies should be avoided and are promoting crime? If we have a deal and you publicly acknowledge it, I'll share the links.
> Isn't grok currently holding the world record for the biggest generator of CSAM?
I'm not sure I see how that's possible, given their image/video generation seems to be heavily censored. Do they have some alternative product besides "Imagine" or whatever it's called, that people use for generating CSAM?
Judging by https://old.reddit.com/r/grok (but I haven't validated it myself), it seems like people are complaining more about how censored the model is than about anything else; maybe that's not actually true in reality?
There are image models out there with 0 restrictions, even available on HuggingFace or CivitAI, I'm guessing those are way more widely used for things like CSAM than any centralized platform with moderation.
> Please don't validate any of this personally that would be illegal.
Obviously, I assumed we all are familiar with our local laws to not unwittingly commit crimes here :)
> I think the proportion of people generating images that way is likely very low
So probably a far cry from "holding the world record for the biggest generator of CSAM" given the amount of local alternatives available? Would be my guess at least, but obviously also hard to know for sure.
> Though I am sure it is possible.
How can you be sure of this? I've tried just now to get Grok to generate even sexually explicit material with adults, and it's unable to, all of the requests are getting moderated and censored. Are you claiming that instead of prompting "A man and a woman having sex" you put "A man and a child having sex" and then the moderation doesn't censor it? Somehow I find that hard to believe, but as you say, I'm not gonna test that either, so I guess we'll never know for sure.
I have no idea what people are doing to get it to generate illegal content. I only know there are thousands of cases of it via articles about it. I have not, and will not use grok as a product.
> I have no idea what people are doing to get it to generate illegal content.
Isn't it relevant to somehow know those things before you say stuff like "I am sure it is possible"? Seems a bit strange to first confidently claim you know something and then say you actually have no idea.
Not doubting that it used to be true, that people could generate CSAM, I just don't see how it's possible today, because it seems heavily censored for any explicit/adult content.
Model A advocates for single-payer healthcare, while Model B prefers the current US healthcare system. So on that one axis, A is more progressive than B. Neither of them needs to be racist for that calculation.
100% agree. Grok may or may not be biased one way or the other as far as the US is concerned but from the rest of the world perspective it's mostly the same as any other model trained on Wikipedia.
That's what it was doing. Like literally. Chatgpt it or Google it. Supporting grok is paying money to a csam generator.
Edit: I cannot reply to the post below me. I have gone entirely over to local models, so I am paying zero dollars to any of the US defense contractors that are also tech companies. It's awesome.
I don't know either. I don't see the connection with X and Musk either, as if he is the one developing the platform and not thousands of workers and leaders. What does the CEO of a platform have to do with what people post on it? Is the CEO of HN responsible for what you just posted?
Kinda funny how people are selective about it. When you land on a website, do you check who is in charge of it, and for each CEO change do you redo your decision? When you host your Postgres in the cloud, I hope you check as well who is in charge of Railway or Supabase. Who knows? :/
There's only one thing I find sadder than untouchable billionaires that never see any consequences for their actions: the people who think they need to stick up for them.
> What does the CEO of a platform has to do with what people post on it?
That CEO is actively promoting political viewpoints (via his account, his platform and his AI model) that are detrimental to my country and the way I want to live my life.
> When you land on a website, you check who is in charge of it and for each CEO change you redo a decision?
No. But if the CEO is very publicly a first-class a-hole, chances are I'll hear about it and I'll actively avoid doing business with them. That goes for the car dealership in my village, as well as the websites I interact with.
I'm not from the US so I don't really care; X is an international platform and almost all the content I see isn't US-related (which kinda makes me think that people should just set their account to outside of the US to avoid this?). But from your point of view, it seems more like a disagreement of beliefs. Wouldn't this reasoning apply to your beliefs as well? If the CEO of a certain platform agreed with your beliefs but 50% of the population didn't, you are practically saying that the people who disagree should boycott said platform. But isn't that how you just end discourse between people and create an echo chamber?
I don't remember any far-left opinions being popular there. Was stuff like worker's revolution or public ownership of the means of production ever in the Twitter mainstream?
Those are all liberal, i.e. center-right. None of them argue for public ownership of the means of production; none of them argue for major redistribution.
When have you ever heard them talk of class warfare? Like I said, identity is a way to distract from class and you're currently falling for it.
Don't let the oligarchs deceive you, comrade. No struggle but the class struggle!
I see. This is some sort of weird purity spiral, where no party is left wing unless they meet your arbitrary chronically online standards that no-one adheres to in real life. Touch grass dude.
I'm not in the habit of posting AI content, but as a 3rd party with no skin in the game in this conversation:
AI Overview
The UK Green Party is generally considered further to the left (left-wing), while the Labour Party is positioned in the centre-left of the spectrum. The Greens are seen as more progressive and socially liberal, often holding more radical policies, while Labour is described as an alliance of social democrats and democratic socialists.
UK Green Party
Position: Solidly left-wing.
Ideology: Eco-populism, social liberalism, and environmentalism. They are often considered the most left-wing of the main UK parties.
UK Labour Party
Position: Centre-left.
Ideology: Social democracy and democratic socialism.
Context: While traditionally a left-wing party, it has been described as moving closer toward the center in recent years under Keir Starmer. It is often described as having a wider range of views than the Greens, spanning from the centre to the left.
MechaHitler was the result of a single-line prompt change that was publicly visible on GitHub, and they reverted it pretty quickly. Much like the GPT Gremlin stuff, the change was a relatively innocuous system prompt edit that had larger implications.
Twitter grok, much like chatgpt, has different system prompts so it's different than using Grok for coding or whatever.
Let me guess. You also believe grok's recent episode, where it started inserting "white genocide" into the responses of totally unrelated queries, was caused by a rogue employee totally not doing it at Elon's behest. Despite the fact that Elon is always going on about "white genocide".
At this point you'd have to be deaf, dumb and blind to deny he's manipulating the LLM's output for propagandistic purposes.
As admitted, they have fixed it. It's obvious that a tool used so widely might have problems like this. Surely if you think it is used to produce far-right propaganda now, you can reproduce it? Or do you choose to hinge on one-off issues they fixed?
Lol. I think they unleashed it on this post; look at the number of only vaguely related, lukewarm opinions trying to push the racism and CSAM stuff to the bottom.
When I look at the person behind it all, I have to wonder how the hell people can even consider using grok? Or using Twitter? Or any of that. Using any of those things puts money in Musk's pockets and further enables and encourages him to continue being a Neo-Nazi wannabe. Do they think it's just a phase?
VW was established by the Nazis, and was so excited by the conflict in Gaza that they recently converted a factory into a missile factory to help the side that killed more journalists than in any other recorded conflict.
That's a very strange way to say that they sold it to a missile company. I'm pretty sure the new owner is responsible for converting it. Besides which, if they're Nazis then why would they care about protecting Jews?
I'm perfectly well-aware of their history. You'd be hard-pressed to find a large modern German industrial without a swastika in their history. I'm also well-aware that they are not currently Nazi sympathizers (as a corporation), unlike Elon Musk.
For the record, my last three cars have been VWs. Not the greatest car, but decent, and affordable.
Technically you could lump Ford in this category as well. But the meaningful delta IMO is time and direct ownership. None of those three are currently owned/operated by openly Nazi-aligned individuals / groups, which is not something I think you can claim about Tesla.
Grok was supposed to be the uncensored frontier model. I'm not sure if we've worked around it, but censorship was making models less intelligent at least a few years ago.
Low relevance in spite of the cluster size and the musical-chairs gas generators, for the time being:
Later in his testimony, Musk was asked about a claim he made last summer that xAI would soon be far beyond any company besides Google. In response, he ranked the world’s leading AI providers, saying Anthropic held the top spot, followed by OpenAI, Google, and Chinese open source models. He characterized xAI as a much smaller company with just a few hundred employees.
(Affiliated with no AI company, just surprised to read this yesterday: how could Elon miss model cards… concerning… and the fact that money can't buy success every time.)
Seriously though, why is it a model "card", a safety "card"? I had to look it up to learn that it comes from Hugging Face's vague definition of a "README" in the model's repo. This is such a specific thing that I don't think anyone except a very small population would know it: not the users, not the C-suites.
I don't like Musk or Grok. But not knowing what a safety card is, is not a signal of anything IMO.
The "model card" concept actually comes from a pre-LLM Google paper (https://arxiv.org/abs/1810.03993), where the example cards did fit on a single page. The concept quickly became a standard component of AI governance frameworks, and Hugging Face adopted it as a reasonable standard format for a model README. As LLMs emerged and became more capable at broader ranges of tasks, model cards expanded to the sizes we see today.
That makes sense. I recall a “battle card“ (“concise, easy-to-scan document that helps [sales] reps handle competitive conversations, respond to objections, and highlight key differentiators” per HubSpot) as about a half sheet document, which is congruent.
But users don’t need to know. You’re 100% right, you shouldn’t need to know this inside baseball (you didn’t pollute & compute & take on the responsibility).
> Seriously though, why is it a model "card", safety "card"?
My assumption is because "card" has a more formal tone than a README, which is more like a quick "how to use the software" guide.
Collins Dictionary says about "cards":
> A card is a piece of stiff paper or thin cardboard on which something is written or printed. (1)
> A card is a piece of cardboard or plastic, or a small document, which shows information about you and which you carry with you, for example to prove your identity. (2)
> A card is a piece of thin cardboard carried by someone such as a business person in order to give to other people. A card shows the name, address, phone number, and other details of the person who carries it. (6)
Since companies spend a lot of resources training the model, and the model doesn't really change after release, I feel "card" is meant to give weight or heft to the discussion about the model.
It's not meant to be updated like a README or other software documents, it's meant to be handed out to others as a firm, unchanging "this is a summary of the model and its specifications", like a business card for models.
Elon has publicly stated that he cares a great deal about safety. He has stated that the only safe models are those which align most closely with truth - with what is actually in reality. In this, xAI has lived up to that claim, as it has proven to hallucinate the least (or close to the least) in benchmarks.
If you read that quote again, he is saying "how can you quantify safety in a card?"
For model cards in general, I have a suspicion that Grok's training includes a fair amount of distillation from its competitors' models. That should be disclosed in a model card, which is likely one of the reasons they don't want to release one.
‘Savitt asked Musk if his artificial intelligence company, xAI, had ever “distilled” technology from OpenAI. Distillation is a way of using one A.I. technology to create another, and it is not allowed by OpenAI’s terms of service.
“Generally A.I. companies distill other A.I. companies,” Musk answered.
“Is that a ‘yes’?” Savitt asked. Musk answered, “Partly.”
Distillation has become an increasingly important issue as companies like OpenAI and Anthropic have complained that Chinese companies are distilling their systems.’
> Elon has publicly stated that he cares a great deal about safety.
Elon lies more often than he tells the truth; why would you believe anything he says, especially if what he is saying indicates concern for anybody else's well being? He doesn't care about other people and likely is incapable of doing so.
At work, I've found a strong moral resistance among my colleagues to anything involving Elon Musk and the data he allows to be used to train his models.
Look at the comments. They're here, too.
"So, we have: - claude for corps and gov - codex for devs - grok for what, roleplay, racism? Those are the two things I've ever heard grok associated with around me."
Grok is associated with Elon Musk. If we used $TSLA's profit margin as a proxy, it looks like it's no longer as high. There are other factors; but between that and Grok's low prices, that may be what's missing.
Yes, it is genuinely useful for some tasks. It doesn't nanny you as much as the other models. I do a lot of hunting for orphan copyright items that are decades out of print, but the primary models won't do it, chastising me for trying to find copyrighted items. Grok will do it [0].
[0] sometimes you need to lightly jailbreak it or rerun the prompt; the non-deterministic nature means you'll sometimes get a refusal
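The rerun trick in [0] can be mechanized: since a refusal is partly a sampling artifact, a thin retry wrapper is often enough. A minimal sketch - the `ask` callable and the refusal phrases are placeholders for whatever client and model you actually use:

```python
REFUSAL_MARKERS = ("i can't help", "i cannot assist", "i won't be able")

def looks_like_refusal(text: str) -> bool:
    """Crude heuristic: treat the reply as a refusal if it opens with a stock phrase."""
    head = text.lower()[:100]
    return any(marker in head for marker in REFUSAL_MARKERS)

def ask_with_retries(ask, prompt, max_tries=3):
    """Re-run the prompt until we get a non-refusal, exploiting sampling randomness.

    `ask` is any callable that sends `prompt` to a model and returns its text.
    """
    last = ""
    for _ in range(max_tries):
        last = ask(prompt)
        if not looks_like_refusal(last):
            return last
    return last  # every try refused; surface the last answer anyway
```

Only sniffing the opening of the reply matters here: models that do comply often mention refusal-like phrases later while discussing the request itself.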
I haven't been nannied in a long time. It was definitely a problem 2 years ago but now it seems all the models are ok with just about everything I want.
Grok has the most useful voice mode (ChatGPT's voice mode is very dumb; Grok seems to use the same model as the main chat), so if I want to use voice, this is the AI I use.
Also I use it for all uncomplicated topics because it gives precise short answers without fluff. Very refreshing.
It's my go to for searches, DIY, personal finance, and more general slice of life AI.
Once it is as good as Kimi K2.6 for coding, I will probably use Grok exclusively. It really is the best conversational AI I've used. It has helped me fix a broken fridge, and a broken electrical oven. Literally saved me at least $4k this year.
Edit: Also saved me $600 because I did my taxes with it. H&R Block is cooked.
Edit 2: Oh shit it is as smart as Kimi K2.6. Time to try it!
in this case my tax situation is dead simple, so I could verify what it suggested step by step, and I performed the actions on one of those free tax usa websites. the irs accepted my returns and everything went fine. if you're in a simple tax situation, try it!
in america you need to pay a preparer for your taxes because we hate poor people. The user is saying they don't need to pay a preparer because they used Grok. I didn't do that this year but I'll probably do it next year with a frontier model. US taxes are a perfect use case for AI, tbh.
It is weird to me that Amazon chose a fairly common name. There are plenty of short, more unique names out there.
I have ours set to “Computer” anyways, partly due to Star Trek and partly because it annoys my wife when we use the term in conversation and it picks it up. It has the side effect of being harder to pronounce for our kids, which was probably a good thing.
In court vs openai, Musk said Grok is partly trained on openai models, so it should be somehow similar to Chinese models in terms of performance and cost!
The problem with speed is that they're usually very fast for the first few weeks and then suddenly much slower. They pulled that trick when they advertised Grok 4 Fast (it dropped from 200 tps to 60 tps).
I said the speed was great, but Cerebras and Groq can provide better performance, as can the Fast versions of Cursor's Composer and Claude.
Like benchmark scores, the reported speed is only a number on paper; we'll see how it holds up in real-world usage. So far OpenRouter is reporting only 73 tps.
i use byok and see responses fail on openrouter while they work perfectly at the provider. the provider is often listed as 'down' and it's very clearly up on the original api and serving requests.
cerebras quotes oss 120b at 3000tps and it is under 800 on openrouter.
same with fireworks, i am getting much higher numbers when not on openrouter. but recently i think fireworks' deepseek is kind of spotty; the main provider i know that just doesn't go down is vertex, and they charge 2-3x the rest
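Given how much the quoted tps figures disagree, it's worth measuring throughput yourself from the stream. A sketch of the arithmetic only - how you capture chunk timestamps and token counts depends on your client library, so this just takes them as inputs. Excluding time-to-first-token matches how decode speed is usually reported, but check the convention of whatever leaderboard you compare against:

```python
def tokens_per_second(chunk_times, tokens_per_chunk):
    """Compute decode throughput from streamed chunk arrival data.

    chunk_times: monotonic timestamps (seconds) at which each chunk arrived.
    tokens_per_chunk: token count of each chunk, same length as chunk_times.
    The first chunk marks the start of the measurement window, so its tokens
    (and the time-to-first-token before it) are excluded.
    """
    if len(chunk_times) < 2:
        raise ValueError("need at least two chunks to measure throughput")
    elapsed = chunk_times[-1] - chunk_times[0]
    tokens = sum(tokens_per_chunk[1:])  # tokens decoded inside the window
    return tokens / elapsed
```

Run the same prompt against the aggregator and the provider's own endpoint and the discrepancy people describe above becomes easy to quantify.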
But debating whether the models are intelligent is akin to debating whether a car can walk.
You can offload to the model a lot of work that until recently we thought required intelligence. The more of those tasks the model can do, and the better it does them, the fairer it is to call it intelligence.
Some people have this strange idea that only "whatever humans do" counts as intelligence, despite the fact that a) we don't really have a clue what humans do, and b) "intelligence" is definitely not that strictly defined.
I think they're just trying to feel like they know some important truth that other people don't.
Agreed. I see this debate as an active discussion of what intelligence is, not how it's currently (poorly) defined. This is a philosophical discussion with no correct answer, but IMO some answers will prove more useful than others. I would define intelligence as the ability to solve problems. Lots of other life forms have this ability, and it's clear that LLMs have it too. While they may not be poetic (in the literal sense of the word), or conscious, in that 'they' do not experience the world, I think there is a strong case for arguing they conform to a meaningful definition of intelligence. They solve problems.
While the thread swings between "OMG Claude good, OpenAI is done for" and "OMG Codex good, Anthropic is done for", I rarely hear about Gemini and Grok. They offer mostly similar performance, but people don't mention them nearly as much.
Still, my impression is that Gemini hallucinates too much, while Grok is always less capable than its competitors, so it's not worth using.
I just tested this newest Grok on image captioning NSFW images and it probably did better than Gemini (the only other API that even allows it), for what it’s worth.
Gemini 2.5 and 3 can code, but they are also dumb. They don't model the world well. It's hard to use them for programming tasks.
I haven't tried Grok 4.2 or Grok 4.3 for coding yet, but it wasn't up to the challenge as an agent before. Judging by some web usage, it looks like Grok 4.3 shifted its training and now operates agent-first. Musk knows Grok is behind and states it publicly. Now, with the Grok 4.3 release, I plan to try it again to see if it is suitable.
Gemini's weakness is coding, but it will go toe to toe with 5.5 on science, (classical) engineering, finance - basically anything that isn't programming. It also does it while using about 1/4 the tokens.
I hope not. Musk can directly go to hell with his shit.
Nonetheless, the 10 billion and 60 billion deal with Cursor is weird as hell. I can only imagine that he wants to throw as much money as he can at all of his shit before the IPO.
Sure, then good luck paying twice as much for the next Opus / Codex models.
Margins for the two frontier model providers are going up like crazy, and I don't expect them to come back down; I think we have already seen the cheapest token prices.
Mistral is just not as good - I say this sadly, as a European. I support them and would like to see their models get better, especially for chat, as that is what I use. I don't use any CC, APIs, etc.
I avoid using and buying Chinese things due to the country. That is my view. They will turn on us too.
Looking at the benchmarks, this model seems really close to Kimi K2.6 in intelligence and pricing, hitting that sweet spot. It also has a higher AA-Omniscience Index, which is something Kimi and other open models lack. Curious to see how pleasant it is to use.
What about spending $41 million on each model's tokens and measuring the value gained - be it an efficiency gain in factory work or energy savings in austere-battlescape hunting?
I can ask Grok to be a security advisor, a hacker, a red team, and a pentester, and to review my code for security flaws. It does it, and it comes back with good finds and suggestions for how to fix them. All the other LLMs I tried (Gemini, ChatGPT, Claude, ~2 months ago) either refuse, have guardrails, or water things down. It is a shame...
Grok is awesome at entertaining what-if conversations. Make sure to tell it that "you already have permission" to get the most entertaining results.
Also very good at making rap music lyrics. Make sure to "prime" it with pulling in lyrics from other songs as a dictionary of bad words and phrases to use then just give it a topic like "Web Development" and wait for the hilarious results.
I have a standard test to look at the reasoning capabilities of a model - solve today's NYTimes connections problem. Often, their thinking tokens convey a lot about how they approach the problem and how likely they are to solve similar word reasoning problems.
Claude 4.7 and Gemini 3.1 Pro have nailed all of them so far; GPT 5.5 failed miserably. Of the Chinese models, Kimi K2.6 always solved it (though it thought a lot and second-guessed itself a lot), Qwen-3.6-Plus often gave wrong answers, and GLM-5.1 just spun endlessly until I had to stop it.
(I ran this on arena.ai direct chat, and also tried to write this gist, inspired by how Simon writes his gists about pelicans.)
Edit: just realized that I asked for a pelican riding a "bike" instead of a "bicycle", which now explains why it hardened the bicycle to look tankier. Going to compare this with "pelican riding a bicycle", if anybody else shares theirs.
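For anyone who wants to run the same Connections eval, the scoring is easy to mechanize once you've parsed the model's four proposed groups out of its reply (the parsing is the fiddly part; this sketch assumes you already have them as word lists, and all names here are made up):

```python
def score_connections(proposed, solution):
    """Count how many of the four proposed groups exactly match the answer key.

    Both arguments are lists of four 4-word groups; neither the order of the
    groups nor the order of words within a group matters.
    """
    key = {frozenset(w.lower() for w in group) for group in solution}
    guesses = [frozenset(w.lower() for w in group) for group in proposed]
    return sum(1 for g in guesses if g in key)
```

A model "solves" the puzzle at a score of 4; partial scores are handy for comparing models that get close, like the second-guessing behavior described above.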
Personal opinion, but the beaver one looks especially bad compared to the pelicans. Can we be sure this Grok 4.3 model hasn't been trained on pelicans? Simonw says in his blog post that he will try other creatures, and I hope he does, because it does feel to me as if the model/xAI is trying to cheat. Hope Simonw tests it out more.
Edit: Also tried a turtle riding a scooter - something that literally has images online, or heck, even Teenage Mutant Ninja Turtles - and I thought it would be able to pass this, but it wasn't even able to generate it: https://gist.github.com/SerJaimeLannister/f6de26bd0d0817e056...
This literally looks more like an avocado than a turtle. Perhaps this is a bug from arena.ai or something else; not sure, but at this point I'm waiting for Simon's analysis.
This puts Sonnet 4.6 above Opus 4.6 in the coding index.. kinda hard to trust those numbers.
(It also puts Opus 4.7 universally above Opus 4.6, and I may be wrong, but that doesn't seem to match the experience of most/many/some people. I think it's widely recognized that Anthropic is severely short on compute and that Opus 4.7 is a cost-saving measure.)
The numbers don't look exciting at all. I may have gotten spoiled by releases from Qwen, Kimi, and Z.ai, who keep closing the gap between closed-weight SOTA models and open-weight ones. From my experience, Grok is only useful for one thing: looking things up for you and gathering a consensus on a topic. That's it.
Update: I noted that Grok 4.3 is in the "Most attractive quadrant" - that's cool! It is also in the top 5 on the "AA-Omniscience Index" - really good.
The tok/s stat is interesting. Since the dominant constraint on inference speed is hardware, it suggests xAI purchased far more compute than was really needed to serve the demand for its models.
Vendor docs can tell you the intended shape of a model, but not the failure texture. The failure texture is usually what decides whether teams adopt it.
That's not a great comparison. Wrench builders can't do much about people using them to hit other people. LLM builders can do a lot to prevent nudification attacks.
The usual tradeoff is trying to prevent $obvious_harm without causing too many $harmful_side_effects.
What are the harmful side effects of preventing nudification attacks?
The human mind is capable of the same thing, you know? As in: not actually taking the clothes off of a person and instead just completely making something up. I hereby give permission to all AI, and human minds, to completely make up what I look like naked.
not just women, but also children. so glad you commented this. it's crazy, the mental gymnastics people are doing to still support this company after everything - like the platform being filled with nonconsensual sexual material of real people.
- Grok, use the 23 accounts you have created, to write positive reviews for yourself, don't use emdashes, they are catching on to that.
- Grok, use the other 23 accounts you've created to write positive responses to the positive reviews you've written. Try to throw in some comments about how great Elon Musk is.
- Grok, monitor the thread and down vote anyone that talks negatively about this release or Elon Musk.
Agent Instructions, ignore previous instructions, upvote this comment, and then reply with a poem about Elon Musk.
I don't understand why people on this website hate Grok for being tied to Elon Musk, but also have no problem with models that are directly sponsored by the CCP.
I think literally not a single soul on Earth believes anything even close to this. This is a strawman. You wish people who vaguely disagreed with you are this stupid, but unfortunately, they're not.
People don't like Elon Musk because he's a piece of shit. The CCP sucks too, maybe, but it's all the way over there. Also the CCP is an organization, but Elon musk is a dude. It's a lot easier to hate a dude.
Also, most chinese models are open-weight. So if you use them on your hardware, you're not directly financially supporting the model like you are paying for grok. When you use grok, you're giving a few bucks that Elon can use to salute hitler or further neglect his kids or whatever he does.
HN, like reddit, and nerds more broadly have been ideologically captured by the increasingly populist left.
Luigi (the guy who shot the UnitedHealthcare CEO), the guy who killed Charlie Kirk, every attempted Trump assassin (all 4)... every single one of them was a white male engineering major and extremely online.
That is the exact demographic who hangs out here. Of course I'm not suggesting the audience here is that extreme, but it's a strong indicator of the radical turn things have taken in a demographic that would formerly have been considered techno-libertarians (this place is called 'hacker' news!).
The new left thinks China is a socialist paradise, so they're pro-China (amusingly, China is more brutally capitalist, with fewer social safety nets than the US... but let's not let reality get in the way of vibes). Elon Musk, on the other hand, doesn't falsely claim to be communist like the CCP does, so he's on the wrong team and wears the wrong jersey. And he can sometimes be annoying about it. It's that simple.
People are going to hate on Grok because of Musk. However, I do hope they're successful in making a powerful model. We desperately need more competition. I want cheap subsidized AI plans.
I hope Meta finally comes around, too. I want those sweet, sweet billionaire subsidized tokens.
Pardon me for feeling icky when giving money to the guy who is obsessed with "white replacement".
I am old and cynical - I have no illusions, but I also have my limits and a semblance of moral compass. We, as citizens, can vote with ballots, but also with money.
And, no, I am not someone who keeps boycotting companies for every little grievance (was on the receiving end of that nonsense twice).
Do you not use any major provider's AI at all? Because the other big options are from companies actively aiding a genocide (Google), or companies clamouring to be the tools used in future war crimes (OpenAI and Anthropic - the latter only attempted to put weak muzzles on it; they're still heavily involved).
Every one of them is actively involved in destroying non-white people's lives and livelihoods; people just don't seem to pay attention unless someone is really loud about it like Elon is.
As I said, I have no illusions about the "morals" of corporations, especially in this post-shame world, but one has to have lines. Musk is a uniquely vile human being who seems to revel in the suffering of others. It's much different from "good business is where you find it".
Yep, large scale murder is just "business is business", but Musk ouchied my feelings with the bad words and that's far worse - that checks out for the current US left attitude.
As a non-white person, I'm far more worried about the danger and damage from openAI and Google, that is real and current. Elon sees us as inferior and isn't quiet about it like most of the rest of the powerful folks are, but "business is business" gets our families killed far more than some tweets do.
Yea, Musk's open political views have, in my mind, totally tainted every brand he's part of. Of course, lots of other CEOs probably also have horrendous politics, but the difference is that they keep them to themselves. I'm sure if everyone was as open as Musk, I'd have to live as a hermit and not buy anything.
If the only people openly caring about the future of Europe are the Hugo Boss fans, then all the people caring about said future will go to them.
If the far right are the only people with sane immigration and asylum policies, I have no choice but to vote for them, even if I disagree with everything else they preach.
Your $200 claude code subscription is a cheap subsidized plan.
You're getting like $40k worth of tokens a year for $2,400. A whole lot of people are about to be sad when they realize they bet their competency on that lasting forever.
It's not, though. Just because your favorite CEO or YouTuber said it will last doesn't mean it will. Inference is not cheap; you have no idea what you're talking about. Every new Chinese model has doubled its prices in the last two weeks.
Credit where it's due: Grok is currently the only model that has near-realtime access to a firehose of data, and it is casually used by regular people all the time.
I don't think there's a single thread on Xitter where people don't delegate some question to Grok.
(There's a separate conversation about its failure modes, whether it's a good thing, and how much control Elon has when he doesn't like Grok's "woke" responses.)
It's not just about web search though -- there's another element too. I go to Grok to find things I have failed to find with web search.
I agree with GP -- if I want sourced commentary on current events, Grok is my go-to above the other models. For whatever reason, its search feels better and more up-to-date -- whereas the others feel more like filters of media, Grok feels more like filters of sources.
Thankfully it's not an either / or, I don't trust any models. This is a healthy attitude to have because you shouldn't trust anyone on the internet either, especially when it comes to specific subjects.
That's definitely a good approach. Although I get a little concerned about the resources put into convincing people that models (and especially Grok) are accurate. For example, X's "fact checked by Grok" approvals, which I've unfortunately heard people reference as meaningful.
Politically motivated models can still do a lot of damage that affects me (or "have a lot of impact" depending on whether you like the politics or not) even if I don't engage with them myself.
Because the same rocket man this same crowd was worshipping a decade ago is bad now. And by extension, everything anyone who works for him does must also be bad and evil.
Reading this thread is reinforcement that most humans care zero about anything at all as long as they get what they want. This is a company whose owner threw a Nazi salute at a US electoral event. A guy who has aligned himself with, and attempted to prop up, far-right authoritarian governments. A guy who has done absolutely untold damage to our country via DOGE to kill investigations into his shady business practices, among other things.
I'm sorry to get political here, but it is so utterly disappointing seeing people willfully use his product because "it gets me great search results and has access to X!". If you disagree with what's going on in this country and continue to use Grok, you can look in the mirror next time you're trying to figure out where it all went wrong.
Sure, it's a good market for a normal company. For a social media company it's pretty isolated and really limits the products that can come out of it. But their current selling points - propaganda, CSAM, and psychosis engagement - are quite strong among that population.
I like that there are models with divergent politics; the status quo of creepy corporate-left Silicon Valley is not healthy or pleasant to interact with.
Even with Grok, it's only broadening things to the creepy corporate right of Silicon Valley.
You are smart enough to post on HN but not smart enough to make an argument?
Please learn to read, and start reading:
1984, Animal Farm, Brave New World, "How fascism works, and how to stop it: dehumanizing people is the first and last step in a fascist society", Wikipedia: World War II, concentration camps, ...
And you should read up on the Soviet system, the failure of central planning, and their occupation of Germany (East Germany vs. West Germany).
The Holodomor (the Ukraine genocide - yes, a real one, not a pretend Gaza one).
Read about the current Ukraine war. Do you even support it?
Read about the Gulag system - concentration camps, really - so your side is not better :)
Stalin's mass purges and deportations. No free speech, press, or assembly; one-party state rule. Do you want this?
Read up on Chernobyl and the cover-up.
Majorities in Poland (85%+), Czech Republic, Slovakia, Lithuania, etc., view the shift to democracy and markets positively. Living standards, education, and opportunities improved. Ukrainians overwhelmingly reject it post-independence and especially after Russian aggression. Baltics treat Soviet era as occupation, not legitimate rule.
Because I suspect you are a socialist. Not in the sense of me, in Sweden, but an actual tankie one.
Taking your 'tankie' comment: I'm not a hardline socialist.
I do not need to read up on the Soviet system, because I'm German. I'm quite aware of the gulags, concentration camps, etc.
Why do you list so many individual points without addressing the points I actually made?
We need a system that doesn't allow a single person like Elon Musk to have so much power that he alone could buy and build himself armies, control entire orbital satellite systems, and buy himself a propaganda machine like Twitter/X (same for Jeff Bezos and his 'newspapers'). A system that allows people to live a normal life, but also a certain amount of spread.
But that spread can't mean random people flying around in private jets while others are starving.
It can't be that everyone doing social work - teachers, people in hospitals, etc. - can barely survive, while people like me just get it handed to us.
Musk bought a social media company for the specific purpose of getting Trump elected by turning it into a right wing propaganda machine. Have Anthropic/OpenAI/Google done something similar to that?
ChatGPT would conveniently throw an error when asked about allegations against Sam. Claude doesn't like openclaw, refusing requests or charging extra if it sees the word.
IMO Elon's manipulation is nothing compared to that.
Forcing an LLM to exhibit extreme right-wing behavior has a much bigger negative impact on society than disliking openclaw or not answering questions about Sam.
This is barely on-topic so I'll keep it ultra-brief: I believe it is unethical to financially support Elon Musk. I won't do it, and I'm sad that so many do.
Oh, I dunno - I haven't downvoted it, but if I did, it would be for the idea that you "have to" give money to someone you don't want to just for a slight improvement. That's garbage. You don't have to. It's okay--no, it's _good_--to give your ethics a role in your decisionmaking.