Curious what people's experience is with these models. Anecdotally, I tried them out earlier in the year and found they struggled with pretty basic full-stack coding I was doing, when Sonnet 4.6 and Haiku 4.5 didn't break a sweat. I was hoping to use them while my Claude usage was resetting but was disappointed.
I've been using GLM-5/5.1 for about 6 months and it has been a fairly capable model. I've seen a lot of mixed opinions that tend to align with harness usage, so it is worth trying a couple of harnesses with a model before writing it off. For example, I'm using crush and have had a good experience, while others using CC have had a much more mixed experience. For task complexity, I treat it as I would Sonnet, with the same care in building out plans/prompts before firing it off and letting it go.
I use IntelliJ for much of my development and also set the built-in AI tools to use my GLM sub (BYOK), and it has worked out well, albeit a bit slow.
Overall, it's my main model and has been getting better with each release.
Yeah, the harness makes a big difference in my experience. Some of the models don't even work with some harnesses, including some very big ones. And some are clearly distilled to work with specific harnesses.
I'd love to see some numbers though, on models/harness combinations.
I've got a GLM subscription (mostly because I like supporting open model makers, pretty sure my monthly usage is so low that pay-per-token would be more cost effective), so I generally use GLM-5.1 for any personal projects and I use Opus at work.
To be entirely honest I haven't noticed much of a capability gap between the two for the sorts of things I ask of an AI agent. Maybe Opus is _slightly_ smarter or slightly better at long-running tasks but the difference is slim enough it could just be a placebo from the Claude branding / hype.
I'm looking forward to giving GLM-5.2 a spin sometime soon and seeing how it stacks up. If nothing else, 1M context is a great improvement. With DeepSeek v4, then MiniMax M3, and now GLM-5.2 adding it, 1M feels like it is rapidly becoming "table stakes" for agentic models.
In March I switched to Opencode + Kimi K2.5 and found it was a step behind. I then switched to GLM 5.1 and it has felt like a step above. It's probably some combination of me forgetting the baseline, model improvements, and OpenCode improvements.
$20 a month has been good enough for my coding use cases. I wouldn't call myself a vibe coder. Stuff I do is create graphs/visualizations, review, polish code, generate toy examples for learning.
The Economist | Several engineers and engineering managers | Birmingham, UK (On site) | £30-63k
The Economist is scaling its engineering team to the next level, growing from around 50 engineers in Birmingham, UK to around 75 engineers over the next few quarters.
We're hiring:
Engineers of all levels for our product engineering teams
Senior and mid-level engineers for our mobile product engineering teams
Engineering Managers for our DevSecOps / Security enablement teams
When the pandemic ends and it's safe to do so, we're planning to return to our central Birmingham UK office (walking distance to New Street and Snow Hill stations) around 3 days a week.
Just ping me if you have any questions or would like to apply.
It depends on how you want to understand language.
The point of a hiring process is to exclude candidates that are legitimately not suitable for the role.
A reasonable contextual interpretation of the grandparent comment is that the hiring process will end up excluding suitable candidates for illegitimate reasons.
Whether they should have to slavishly spell this out in a message board comment is a hot topic.
All of these seem to require a significant investment of time before the interview process gets kicked off, which is uninclusive because it biases the hiring process against candidates whose time outside of the office is consumed by taking care of their children. (Stereotypically, it’s borderline age-ism too because candidates in their 30s and 40s are much more likely to have children compared to people in their mid-twenties.)
>All of these seem to require a significant investment of time before the interview process gets kicked off, which is uninclusive because it biases the hiring process against candidates whose time outside of the office is consumed by taking care of their children.
You have to be kidding me. It also is biased against people whose time is spent playing video games or who do nothing at all.
Some jobs are meant for people at a certain point in their career. Sometimes you don't get to have everything at the same time.
People like you would argue it would be better to shut down hospitals than let doctors work long hours because those long hours might make women not want to do the job.
It filters out anyone without a few hours, days, or a week to spend interviewing at your company (those without access to good hardware at home, those with familial or other after-hours commitments, etc.).
The act of dissecting a candidate's qualifications and mapping them to what a company is searching for is always going to come off as uninclusive to one group or another, unless you are naive enough to believe every single human being has the same skill set, current ability, and potential.
I'm with ya, but that's because I'm an American who finds himself in the UK! Every UK keyboard I use is ever so slightly different, which drives me a bit crazy.
You may be able to host your development environment cheaper. For me, Hetzner works out to almost the same price as the electricity for keeping my desktop up all the time.
We use DbFit every day at our shop. It's really changed how we work (as DB developers). Was funny that I thought of this project right after reading this article, and how I should probably try and contribute, then I see this in the comments. :)
nice :) I built DbFit at a time when I was getting a lot of work helping companies with a huge investment in Oracle PL/SQL, but my interests moved on, and for a while it just felt like I was holding the project back.
People wanted to implement new things, support new databases, I felt overprotective of the design, but didn't have enough time to bring new contributors onboard. So I just kind of gave up, and that allowed me to think about it from a completely different perspective. If I just deleted it, then I wouldn't care too much about it any more. Giving away the keys was kind of the same, but without preventing others from contributing. And it worked out great.
Ransomware has spread via exploitable, unpatched versions of RDP, which is worrying. Better to listen on another port if RDP must be exposed to the Internet.
Tax is one part, but paperwork and penalties are the other scary part. I have only heard about how minute mistakes can end up with 50K fines etc. Probably someone who has gone through filing taxes from overseas can give better insight.
US citizen who worked in New Caledonia for 2+ years. The scariest part is that I don't have any official documentation for what I earned. I had access to our payroll system that records the amount I earned in the local payroll system -- I didn't generally have a need to convert that to USD every pay period and record it. I was paid "in kind" for housing. And at the end of it all I got an unofficial letter stating what that amount was.
It's true. There are two specific forms which have 10,000 USD late fees, which can compound at least three times per year each. So doing your taxes late could cost tens of thousands of dollars. These are specific to controlling CFCs, not applying for the FEIE though.
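To put a rough number on the figures above, here is a back-of-the-envelope sketch. It assumes "compound" means the flat 10,000 USD fee is simply charged again each time, and it takes the minimum of three times per year; the function name and that interpretation are my own illustration, not tax guidance.

```python
# Figures taken from the comment above; the "charged again each time"
# reading of "compound" is an assumption made for illustration.
FORMS = 2                  # two specific forms
LATE_FEE_USD = 10_000      # late fee per form, per assessment
ASSESSMENTS_PER_YEAR = 3   # "at least three times per year each"


def yearly_late_fees(forms=FORMS, fee=LATE_FEE_USD, assessments=ASSESSMENTS_PER_YEAR):
    """Lower-bound estimate of late fees for one year, in USD."""
    return forms * fee * assessments


print(yearly_late_fees())  # 60000
```

So under those assumptions a single late year already lands around 60,000 USD, which matches the "tens of thousands" range.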