zschallz · 2026-06-13T21:24:17 1781385857

Curious what people's experience is with these models. Anecdotally I tried these out earlier in the year and found it struggled with pretty basic full-stack coding I was doing, when Sonnet 4.6 and Haiku 4.5 didn't break a sweat. Was hoping to use it while my Claude usage was resetting but was disappointed.

saratogacx · 2026-06-13T22:55:37 1781391337

I've been using GLM-5/5.1 for about 6 months and it has been a fairly capable model. I've seen a lot of mixed opinions that tend to align with harness usage so it is worth trying out a couple with a model before writing it off. For example, I'm using crush and have had a good experience while others using CC have had a much more mixed experience. For task complexity, I treat it as I would sonnet with the same care in building out plans/prompts before firing it off and letting it go.

I use IntelliJ for much of my development and also set the built-in AI tools to use my GLM sub (BYOK), and it has worked out well, albeit a bit slow.

Overall, it's my main model and has been getting better with each release.

andai · 2026-06-14T00:05:56 1781395556

Yeah, the harness makes a big difference in my experience. Some of the models don't even work with some harnesses, including some very big ones. And some are clearly distilled to work with specific harnesses.

I'd love to see some numbers though, on models/harness combinations.

TheServitor · 2026-06-14T01:30:06 1781400606

https://www.tbench.ai/leaderboard/terminal-bench/2.0

wgd · 2026-06-13T23:52:53 1781394773

I've got a GLM subscription (mostly because I like supporting open model makers, pretty sure my monthly usage is so low that pay-per-token would be more cost effective), so I generally use GLM-5.1 for any personal projects and I use Opus at work.

To be entirely honest I haven't noticed much of a capability gap between the two for the sorts of things I ask of an AI agent. Maybe Opus is _slightly_ smarter or slightly better at long-running tasks but the difference is slim enough it could just be a placebo from the Claude branding / hype.

I'm looking forward to giving GLM-5.2 a spin sometime soon and seeing how it stacks up. If nothing else, 1M context is a great improvement. It feels like, between DeepSeek v4, then MiniMax M3, and now GLM-5.2 adding it, 1M is rapidly becoming "table stakes" for agentic models.

wmedrano · 2026-06-14T03:41:41 1781408501

Which specific models were you using?

In March I switched to OpenCode + Kimi K2.5 and found it was a step behind. I switched to GLM 5.1 and it has felt like a step above. It's probably some combination of me forgetting the baseline, model improvements, and OpenCode improvements.

$20 a month has been good enough for my coding use cases. I wouldn't call myself a vibe coder. Stuff I do is create graphs/visualizations, review, polish code, generate toy examples for learning.

Havoc · 2026-06-13T21:43:24 1781387004

They're pretty good for casual use. I mostly use GLM and occasionally sprinkle some opus via api in when I think it'll help

sumedh · 2026-06-14T00:41:22 1781397682

In my experience these models (glm 5.1) struggle after 100K tokens.

bigyabai · 2026-06-14T00:46:02 1781397962

GLM-5.1 had a coherency bug at launch, so it might be worth retrying if you haven't in a while. It can now use the full 256k context as intended.

sumedh · 2026-06-14T00:50:45 1781398245

Interesting, will give it a try again, thanks.

zschallz · on May 3, 2021

The Economist | Several engineers and engineering managers | Birmingham, UK (On site) | £30-63k

The Economist is scaling its engineering team to the next level, growing from around 50 engineers in Birmingham, UK to around 75 engineers over the next few quarters.

We're hiring:

Engineers of all levels for our product engineering teams

Senior and mid-level engineers for our mobile product engineering teams

Engineering Managers for our DevSecOps / Security enablement teams

When the pandemic ends and it's safe to do so, we're planning to return to our central Birmingham UK office (walking distance to New Street and Snow Hill stations) around 3 days a week.

Just ping me if you have any questions or would like to apply.

zschallz · on Oct 12, 2019

Even more troublesome, IMO. This as a gate to interviews leads to a really uninclusive hiring process.

zelly · on Oct 13, 2019

> uninclusive

Isn't that the point of a hiring process?

maxerickson · on Oct 13, 2019

It depends on how you want to understand language.

The point of a hiring process is to exclude candidates that are legitimately not suitable for the role.

A reasonable contextual interpretation of the grandparent comment is that the hiring process will end up excluding suitable candidates for illegitimate reasons.

Whether they should have to slavishly spell this out in a message board comment is a hot topic.

TeMPOraL · on Oct 13, 2019

They should, because otherwise they're just virtue signalling, not making an argument.

kabdib · on Oct 13, 2019

How is this uninclusive? I'm not arguing, just want to understand your analysis.

jpdaigle · on Oct 13, 2019

All of these seem to require a significant investment of time before the interview process gets kicked off, which is uninclusive because it biases the hiring process against candidates whose time outside of the office is consumed by taking care of their children. (Stereotypically, it’s borderline age-ism too because candidates in their 30s and 40s are much more likely to have children compared to people in their mid-twenties.)

throwawaycanada · on Oct 13, 2019

>All of these seem to require a significant investment of time before the interview process gets kicked off, which is uninclusive because it biases the hiring process against candidates whose time outside of the office is consumed by taking care of their children.

You have to be kidding me. It also is biased against people whose time is spent playing video games or who do nothing at all.

Some jobs are meant for people at a certain point in their career. Sometimes you don't get to have everything at the same time.

People like you would argue it would be better to shut down hospitals than let doctors work long hours because those long hours might make women not want to do the job.

jodrellblank · on Oct 15, 2019

people who cannot spare the time are not the same group as people who choose recreation instead.

> Some jobs are meant for people at a certain point in their career.

That should be determined by their employability, not the state of their personal life.

> People like you would argue

No they wouldn’t; strawpeople don’t argue anything, they just fall over.

djinnandtonic · on Oct 13, 2019

It filters out anyone without a few hours/days/a week to spend interviewing at your company (those without access to good hardware at home, those with familial or other after hours commitments, etc)

tyri_kai_psomi · on Oct 13, 2019

The act of dissecting a candidate's qualifications and mapping them to what a company is searching for is always going to come off as uninclusive to one group or another, unless you are naive enough to believe every single human being has the same skill set, current ability, and potential.

LanceH · on Oct 12, 2019

Please state the alternative.

zschallz · on June 14, 2018

I'm with ya, but that's because I'm an American who finds himself in the UK! Every UK keyboard I use is ever so slightly different, which drives me a bit crazy.

zschallz · on June 12, 2018

You may be able to host your development environment cheaper. For me, Hetzner works out to almost the same price as the electricity for keeping my desktop up all the time.

zschallz · on April 27, 2017

Interesting. Thanks for the heads up. I've had the 8.99 offering for a while now and haven't had any issues... yet.

zschallz · on Oct 28, 2016

I've been loving Scaleway. It's a great service and I'm glad to see they're expanding. Hopefully they'll come to the UK and North America next.

zschallz · on Sept 17, 2016

We use DbFit every day at our shop. It's really changed how we work (as DB developers). Was funny that I thought of this project right after reading this article, and how I should probably try and contribute, then I see this in the comments. :)

Thanks!

adzicg · on Sept 18, 2016

nice :) I built DbFit at a time when I was getting a lot of work helping companies with a huge investment in Oracle PL/SQL, but my interests moved on, and for a while it just felt like I was holding the project back.

People wanted to implement new things, support new databases, I felt overprotective of the design, but didn't have enough time to bring new contributors onboard. So I just kind of gave up, and that allowed me to think about it from a completely different perspective. If I just deleted it, then I wouldn't care too much about it any more. Giving away the keys was kind of the same, but without preventing others from contributing. And it worked out great.

zschallz · on June 19, 2016

Ransomware has spread via exploitable, unpatched versions of RDP, which is worrying. Better to listen on another port if RDP must be exposed to the Internet.

zschallz · on April 5, 2016

There's a large income exclusion that means unless you earn a really kick ass salary, you're unlikely to be double taxed.

sremani · on April 5, 2016

Tax is one part, but paperwork and penalties are the other scary part. I have only heard about how minute mistakes can end up with 50K fines etc. Probably someone who has gone through filing taxes from overseas can give better insight.

coredog64 · on April 5, 2016

US citizen who worked in New Caledonia for 2+ years. The scariest part is that I don't have any official documentation for what I earned. I had access to our payroll system that records the amount I earned in the local payroll system -- I didn't generally have a need to convert that to USD every pay period and record it. I was paid "in kind" for housing. And at the end of it all I got an unofficial letter stating what that amount was.

55555 · on April 5, 2016

It's true. There are two specific forms which have 10,000 USD late fees which can compound at least three times per year each. So doing your taxes late could cost dozens of thousands of dollars. These are specific to controlling CFCs, not applying for the FEIE though.