More

aftbit · 2026-05-07T19:53:26 1778183606

All shell, no ghost.

aftbit · 2026-05-06T16:30:41 1778085041

>It's a dubious notion though, the demand is here to stay and new supply needs to come online to meet demand.

This is a big bet. Look at what happened in 2001 with the dot-com boom. We're still trading on their dark fiber over-build today. Meanwhile any overcapacity built in fabs will quickly be made obsolete by newer and better fab technology (or at least, that's been the pattern for the past 30 years).

I think you're missing the fact that building new supply takes time and sustained commitment, and there's simply nobody in a good position to make that commitment without losing big if your thesis turns out to be wrong.

If the demand for new AI builds is eventually satisfied, or worse, craters overnight, then who will be left holding the bag? It sure won't be Google or Apple, or even NVIDIA - it will be TSMC and Samsung.

But as you mention, this is all temporary. Either you're right, and demand will remain sustained long enough for some of the providers to decide to take that risk, or demand will crater and prices will fall.

The fact that the S&P 500 is near record highs at the same time as consumer confidence is at a 70 year low is not encouraging for continued all steam ahead in my mind... but then it's easy to predict a general future recession, and much harder to predict it to the day.

state_less · 2026-05-06T17:28:01 1778088481

The idea that we're still trading dark fiber from 2001 is an old narrative right? I guess it is still floating around. But, we're in a second big fiber build out for not only residential, but also to connect all these new data centers. Could there be some demand oscillation for silicon? Sure, but overall deep learning is not some passing fad even if someone gets too far out over their skis and over buys. So far Big Tech has increased spend 3 years in a row and next year they'll spend more than this year. Demand has evidently not let up for AI services, we still need more silicon.

I doubt consumers will regret these compute purchases in 3 years, my 3 year old GPU is still holding value, actually increased in value, and I use it more now than ever.

I did predict or expect that if oil went to $150 we'd be in recession territory, currently hovering below that level and folks are feeling the squeeze and aren't happy. Things could get worse or better, tough to tell, but I think silicon demand is more or less secular and will do relatively well in a variety of macro conditions.

aftbit · 2026-05-07T13:55:59 1778162159

This seems like it would be easy to structurally solve - big tech could make strategic partnerships with 5 to 10 year horizons with the very fabs in question. If they promised to spend a flat or growing $X on silicon (without easy contract cancellation) over the next decade, then the risk would be entirely on the tech companies and not on the fab companies. Of course, there's always bankruptcy to worry about, but that's less of a threat for Google and Apple than it is for OpenAI or NVIDIA.

The fact that we _haven't_ seen such deals be made, and that ~50% of new datacenter builds have been quietly cancelled[1], suggests to me that we're dealing with paper demand more than true sustained demand.

1: https://www.tomshardware.com/tech-industry/artificial-intell...

jandrewrogers · 2026-05-06T17:39:54 1778089194

The dark fiber from the dotcom era is approaching the end of its life expectancy. Contracts for leasing dark strands reflected that. Most of it likely still has useful life left that the owners will want to monetize but it might change how it is used.

robocat · 2026-05-06T22:32:03 1778106723

> too far out over their skis

Bad metaphor. Not being forward enough is a common mistake. When learning, you may want to make yourself _feel_ like you're too far forward. We can commonly think we're too far forward when actually overall we're still too far back. Teaching it is very hard.

Although it is dependent on your style of skis (e.g. carving) and style of skiing (powder, racing, cruising).

For me, it is very rare that I go too far forward (I began with a leant back style and I haven't rectified that after many years of skiing). I try to prioritise fun over ability.

aftbit · 2026-05-06T15:52:18 1778082738

This is par for the course with this justice department. After ICE agents killed Renee Good, rather than investigating her death, they chose to investigate her partner for anti-government sentiments instead.

If someone points out their wrongdoing, rather than take it seriously and work to purge their organization of corruption, they maliciously attack the people who pointed it out.

aftbit · 2026-05-06T15:19:34 1778080774

The only use case that pops into my mind is to build a product like Shopify that sets up a store, email, landing page, etc all from one chat-bot interface.

aftbit · 2026-05-05T13:12:59 1777986779

You either die a hero or live long enough to see yourself become a villain.

What did we expect when they dropped "don't be evil" from their company values?

reactordev · 2026-05-05T13:29:53 1777987793

A claim about as useful then as it is now. They never wanted to be anything but, once Sergei left. The Schmidt era had them publicly declare one thing while doing something else entirely behind the curtain.

coldtea · 2026-05-05T13:44:44 1777988684

They were corporate evil from day 1. The rest was just PR slogans, and playing the good guy as long as you don't need to squeeze profits.

aftbit · 2026-05-03T23:20:10 1777850410

What hardware are you using to power this?

girvo · 2026-05-04T00:17:27 1777853847

> DGX Spark-alike

Probably wasn't clear enough if you don't know what that is already, apologies

It's an Asus Ascent GX10, which is a little mini PC with 128GB of LPDDR5X as shared memory for an Nvidia GB10 "Blackwell" (kind of, it's a long story) GPU and a MediaTek ARM CPU

aftbit · 2026-05-04T00:29:55 1777854595

Ah yeah I saw that, I was just curious which particular mini-PC you were using. I was considering picking up one of the various AI Max 395 boxes before the RAMpocalypse but didn't take the plunge. Thanks for the response!

girvo · 2026-05-04T01:14:44 1777857284

I heavily considered one of the AMD Strix Halo boxes, but part of the reason I wanted this was to learn CUDA :)

sterlind · 2026-05-04T02:45:08 1777862708

pulls up chair

could you tell me the long story?

edit: or wait, is it quasi-Blackwell the way all DGX Sparks are quasi-Blackwell? like the actual silicon is different but it's sorta Blackwell-shaped?

girvo · 2026-05-04T03:01:35 1777863695

Yeah exactly. Shader model 121 is different to SM 120 (consumer Blackwell) and is different again to data centre Blackwell SM100.

The promise of this chip was “write your code locally, then deploy to the same architecture in the data centre!”

Which is nonsense, because the GB10 is better described as “Hopper with Blackwell characteristics” IMO.

Still great hardware, especially for the price and learning. But we are only just starting to get the kernels written to take advantage of it, and mma.sync is sad compared to tcgen05

aftbit · 2026-05-03T23:16:30 1777850190

    #!/bin/sh
    export ANTHROPIC_BASE_URL=https://api.deepseek.com/anthropic
    export ANTHROPIC_AUTH_TOKEN=sk-secret
    export ANTHROPIC_MODEL=deepseek-v4-flash
    export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
    exec claude $@

rapind · 2026-05-04T02:51:12 1777863072

ANTHROPIC_MODEL=deepseek-v4-pro[1m] ANTHROPIC_SUBAGENT_MODEL=deepseek-v4-flash

This is what I’ve been using for non-confidential projects for about a week now (soon after v4 came out). I honestly can’t tell the difference, but I’m not doing anything crazy with it either.

Worth noting that I don’t think DeepSeek‘s API lets you opt out of training. Once this is up on other providers though… (OpenRouter is just proxying to DeepSeek atm)

lhl · 2026-05-04T10:04:37 1777889077

For those that don't want their data trained on, OpenRouter allows you to have account-wide or per-request routing with either provider.data_collection: "deny" or zdr: true (zero data retention).

Also, you can use HuggingFace Inference for DeepSeek V4 or Kimi K2.6, both of which work quite well and route through providers that you can enable/disable (like Together AI, DeepInfra, etc) - you'll have to check their policies but I think most of those commercial inference providers claim to not train on your data either.

miroljub · 2026-05-04T10:23:52 1777890232

I wonder why the question about data security and training comes often with DeepSeek, Kimi, Glm and never with Anthropic, OpenAI, and Google models.

Why is that?

IIRC, USA data protection protects data of US citizens only, foreigners data is not protected, and the companies are not even allowed to disclose when they collect those data.

zeendo · 2026-05-04T13:27:07 1777901227

Because Anthropic, at least, gives you the option to opt out of training? I think Google and OpenAI do, too.

Matl · 2026-05-04T11:35:37 1777894537

> USA data protection protects data of US citizens only, foreigners data is not protected

HN is an American site. If you look at the US government, it is going to fearmonger about anything China related, because they haven't had a genuine competitor for decades and they're scared and lashing out. Most US news just parrot the government line, sometimes more so than state TV would, and so it reflects here.

I also feel comfortable saying that many Americans don't care one bit what happens to foreigners, be it by action of their government or companies.

giwook · 2026-05-04T12:21:29 1777897289

> I also feel comfortable saying that many Americans don't care one bit what happens to foreigners, be it by action of their government or companies.

This is true. There are also many of us who do care.

This brings to mind something I heard recently about the so-called "Rule of 10". There will always be 3 people who support you, 3 people who are against you, and 4 people who have no idea what's going on and don't care.

Don't just focus on the 3 people who are being negative.

Matl · 2026-05-04T12:28:32 1777897712

Oh absolutely.

boondongle · 2026-05-04T13:17:49 1777900669

Wolf Warrior diplomacy isn't even 10 years dead. The HK treaty was violated and continues to be. Taiwan gets threatened every other week.

People can have problems with America and I'm fine with that. But pretending China isn't subsidizing industry (land, education, transportation) in a predatory fashion is silly. Too many companies have gone out of business because of it. We can all have our friends in China without pretending the CCP is playing the ballgame fairly. The government doesn't need to point it out. That doesn't even get into influence operations (which are especially easy on platforms like this.)

Seriously - there may be a day in the future where Western nations and China get along but it really can't/won't happen while it's holding all the industry and trying to take the Services income as well.

Matl · 2026-05-04T15:17:54 1777907874

The US assisted a genocide, literally kidnapped the president of a sovereign country so it could take its oil, threatened its own allies with invasion and started a war of aggression against another so that it can take their oil, all in a span of a few months.

But tell me more.

lostdog · 2026-05-04T15:35:40 1777908940

Yes, if you just list 3 more problems about the US then it means that China has no problems at all.

Matl · 2026-05-04T16:21:30 1777911690

No it means that perhaps the US should finally start looking at itself instead of just asserting that it doesn't need to because China.

That doesn't mean China should not be criticized. But to me it's clear that the China blame game is not about a genuine concern for Chinese people or its neighbors, it's about trying to keep it down because China should never dared to rise in the first place.

Anglo Saxons and maybe the French should be in charge and the rest should be resource colonies. It very much feels like that Western mentality is still there.

beedeebeedee · 2026-05-04T17:05:36 1777914336

> No it means that perhaps the US should finally start looking at itself instead of just asserting that it doesn't need to because China.

Agreed, the US definitely needs to do some introspection to sort out its own shit (and stop spraying it on everyone else).

However, that does not mean that China gets a pass. Fundamentally, the Chinese model of governance does not protect the individual. For all its faults, the US model is based upon the idea of individual liberty, which acts as a touchstone and allows it to self-correct whenever it goes to far in the wrong direction. That's something the Chinese model does not do, and means that, short of a revolution, it will continue to be an authoritarian state with all of the malignant features that entails.

Matl · 2026-05-04T17:26:47 1777915607

> Fundamentally, the Chinese model of governance does not protect the individual. For all its faults, the US model is based upon the idea of individual liberty

Look, am not here to defend the Chinese model but I find it interesting how convinced you seem that individualism is the right model for everyone.

While I would generally agree with you, I have spoken to many from poorer countries who say that they prefer to trade some individualism for a steady hand of economic development and lifting the population from poverty. That is the Chinese model.

These people would argue that they can reclaim more and more individual freedom as the country gets richer and more self confident.

I am not saying they are right, but looking at a nominal democracy like India and a nominal autocracy like China, I know which government works better as far as raising the living standards of its population and it's not the Indian one.

My hope is that China will continue to liberalize on its own. Forcing it will likely only reverse the gains.

Individualism also leads to the sort of healthcare system the US had or Skid Row. So it's not all roses.

SJMG · 2026-05-04T13:40:45 1777902045

> also feel comfortable saying that many Americans don't care one bit what happens to foreigners, be it by action of their government or companies

What's the point of this kind of statement for you? Does this help you understand others or just continue to drive the wedge in? Where are you from? Ask yourself can the statement,

"many {of my country} don't care one bit what happens to foreigners, be it by action of the government or companies" not be read as true?

There are self-absorbed, disinterested, uncompassionate people in every country which will satisfy your "many" qualifier.

Matl · 2026-05-04T16:45:03 1777913103

I am from Europe. I feel comfortable saying that many in Europe do not care about what their governments or companies do to foreigners, (at least not enough to inform themselves about it).

However looking at the polls in the US gives you a fairly decent idea that there's a decent chunk of people that seem to get off on violence towards non-Americans. Why do you think ICE went with the violent tactics it did?

As to

> What's the point of this kind of statement for you? Does this help you understand others or just continue to drive the wedge in?

The point is to maybe make some Americans ask what it is that they can do to reform the government they have the most direct influence over (their own) instead of trying to reassure themselves that theirs is still better than country's X.

jorvi · 2026-05-04T12:09:04 1777896544

That doesn't work, if you do that it will mark DeepSeek's models with a warning symbol along with the error "paid model training violation".

striking · 2026-05-04T15:52:48 1777909968

In those cases, OpenRouter just chooses providers that agree not to train / offer ZDR. Which sometimes means you start off without access to the model until some other providers start offering it.

BeetleB · 2026-05-04T13:41:37 1777902097

In a sense, it's working as intended. If you set zdr to true, you currently can't use DeepSeek v4. However, once other providers offer it (it is an open model, after all), some may allow zdr.

specproc · 2026-05-04T14:16:43 1777904203

Yeah, OR gives a bunch of providers, including Deepseek, which does train.

I set ZDR to true, and it only calls from the third party ZDR Deepseek APIs. Bit more expensive, but my client wants it.

ricardobeat · 2026-05-04T18:46:37 1777920397

ANTHROPIC_SUBAGENT_MODEL is not a valid setting, should be CLAUDE_CODE_SUBAGENT_MODEL.

rapind · 2026-05-04T19:30:33 1777923033

This is correct. Sorry I was using my phone to post. Here's what my bash alias verbatim looks like (.bashrc / .zshrc). The DEEPSEEK_API_KEY var is setup separately (so claude doesn't see it):

----

alias clauded='ANTHROPIC_BASE_URL=https://api.deepseek.com/anthropic ANTHROPIC_AUTH_TOKEN=$DEEPSEEK_API_KEY ANTHROPIC_MODEL=deepseek-v4-pro[1m] ANTHROPIC_DEFAULT_OPUS_MODEL=deepseek-v4-pro[1m] ANTHROPIC_DEFAULT_SONNET_MODEL=deepseek-v4-pro[1m] ANTHROPIC_DEFAULT_HAIKU_MODEL=deepseek-v4-flash CLAUDE_CODE_SUBAGENT_MODEL=deepseek-v4-flash CLAUDE_CODE_EFFORT_LEVEL=max claude'

----

I doubt that the opus, sonnet, and haiku model args actually matter if you want to omit them.

I run this on a VPS that has no other credentials or project access so I can give it the skip permissions arg.

maxgashkov · 2026-05-04T09:53:46 1777888426

As of now, OpenRouter offers multiple providers for DeepSeek with ZDR (not sure if they respect it but still).

vidarh · 2026-05-04T10:12:22 1777889542

At several times the price of DeepSeek, though, so it's a tradeoff... Even then Pro is still cheaper than Haiku.

tariky · 2026-05-04T06:26:44 1777876004

I wanted to try this. To bring back opus and sonnet do I just reset those env's?

snqb · 2026-05-04T12:14:08 1777896848

yes, this is pretty much just rerouting Claude to call Deepseek's Anthropic-style-compatible endpoints instead of its own defaults Once removed, it'll work just like before

ianmurrays · 2026-05-04T07:04:17 1777878257

Correct.

aaurelions · 2026-05-03T23:42:04 1777851724

It seems like any project that makes fun of Claude is bound to reach the top spot on Hacker News. Even if it’s just a project consisting of four lines of code.

oblio · 2026-05-04T09:34:19 1777887259

You're just mean. I count 6 lines of code!

varenc · 2026-05-04T04:12:33 1777867953

The more interesting part of deepclaude is the local proxy it runs to switch models mid-session and do combined cost tracking. Though these features seem quite buried in the LLM-generated readme. Looking at the history, it appears they were added later, and the readme wasn't restructured to highlight this.

Also, the author checked in their apparently effective social media advertising plan: https://github.com/aattaran/deepclaude/commit/a90a399682defc... (which seems to be working)

yard2010 · 2026-05-04T04:40:40 1777869640

How come such slop is allowed here, what value do these vibe coded zero shot "projects" add? Why not just post the prompt?

throwatdem12311 · 2026-05-04T11:55:27 1777895727

Seriously. When I first looked this project had been pushed the first commit two hours prior. Projects should be at least 3 months old or automatically removed.

ulimn · 2026-05-04T12:02:43 1777896163

But then that would have the downside of falsely blocking projects that were developed in private and then just pushed to Github (or any public repo). Like I always use my own, self-hosted Forgejo for everything by default.

throwatdem12311 · 2026-05-04T12:12:40 1777896760

If you develop on your own private instance and then mirror to GitHub to release it then there will be 3 months of git history in the logs.

sumeno · 2026-05-04T12:33:28 1777898008

If it's a project you actually care about and are actively working on it'll be just as good 3 months from now.

If it's something that'll be irrelevant in 3 months why should anyone care about it?

ulimn · 2026-05-04T14:42:57 1777905777

That is true in most cases I guess but just look at the current product in OP. In 3 months, at the pace AI products evolve, we might "all" be using the next AI coding harness and Claude Code could be a thing of the past. So it's not a long lasting tool like curl for example.

All I'm trying to say that generalizing like suggested might exclude some useful things.

KallDrexx · 2026-05-04T14:43:07 1777905787

Fwiw git history can be forged pretty easily. You can re-timestamp commits

fragmede · 2026-05-04T05:15:27 1777871727

Convenience? Am I supposed to take the prompt and use my own tokens on it? Why should I have to do that?

woctordho · 2026-05-04T08:08:18 1777882098

For the same reason that GitHub has a releases page for uploading binaries.

jpadkins · 2026-05-04T15:06:27 1777907187

is the value the working outputs or the inputs? A prompt alone would not let you recreate this project.

otabdeveloper4 · 2026-05-04T04:48:20 1777870100

Recruiters used to use the candidate's Github "sources" page for evaluating candidates as a kind of proof-of-work.

groestl · 2026-05-04T05:14:09 1777871649

And recruiter agents still do.

spirit23 · 2026-05-04T03:20:10 1777864810

So I created https://getaivo.dev, one can use model in the coding agent directly. Just `aivo claude -m deepseek-v4-pro`

Tanxsinxlnx · 2026-05-04T07:58:59 1777881539

does it support aws bedrock provider support,does i can use any model in this

spirit23 · 2026-05-04T12:15:32 1777896932

Ah, for aws bedrock, just use `aivo keys add` add baseurl and apikey, everything is ready, `aivo models` to see models

Tanxsinxlnx · 2026-05-05T10:05:33 1777975533

will check

spirit23 · 2026-05-04T08:07:35 1777882055

Currently no, but it can be added

btbuildem · 2026-05-04T01:04:11 1777856651

This in essence is what allows one to use any model with CC -- including local.

neutrinobro · 2026-05-04T12:13:14 1777896794

I know. I'm struggling to understand how this is a github repo/HN article. I've been using claude-code with a llama.cpp server and a dummy API key, and all that is required is to define 2 environmental variables to point claude at the local endpoint. Am I missing something?

guluarte · 2026-05-04T18:31:14 1777919474

I'm using this

    deepseek() {
      unset ANTHROPIC_AUTH_TOKEN
      local -x ANTHROPIC_BASE_URL="https://api.deepseek.com/anthropic"
      local -x ANTHROPIC_AUTH_TOKEN="${DEEPSEEK_API_KEY}"
      local -x ANTHROPIC_MODEL="deepseek-v4-pro"
      local -x ANTHROPIC_SMALL_FAST_MODEL="deepseek-v4-flash"
      local -x API_TIMEOUT_MS=600000
      local -x CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
      TMUX= command claude "$@"
    }

KronisLV · 2026-05-04T17:26:33 1777915593

Wonder if there's a way to launch the desktop Claude app like that, especially on Windows, not just the Claude Code TUI/CLI. Might not be possible and you'd just have to use --remote as a workaround.

nadermx · 2026-05-03T23:36:06 1777851366

The AI wars have begun

heisenbit · 2026-05-04T06:27:26 1777876046

And they are enticing human agents to further their agendas using techniques learned from the white mice.

stingraycharles · 2026-05-04T01:54:38 1777859678

This has been possible since the beginning.

port11 · 2026-05-04T14:46:47 1777906007

DeepClaude doesn't support MCP tool use; does your solution work with MCP tools such as Serena?

faangguyindia · 2026-05-04T02:59:58 1777863598

those who use deepseek v4, what level of output you get? Codex 5.3 or GPT 5.4?

is flash version on level of gpt 5.4 mini

adonese · 2026-05-04T05:33:21 1777872801

I tried it on a non trivial, but also well documented and self contained task. It did amazingly well. I used deepseek v4 pro via deepseek platform. The model is very fast and also it is super cheap. I burned only 0.06 USD (I reckon how the same task would have cost me had I used e.g., amp).

PS. mentioning amp because i used to use it and I pay directly for token. I topped up 5 usd so I will be going to use it and see how far can it take me. But my impression so far is even when model subsidization is done, those open source models are quite viable alternatives.

zozbot234 · 2026-05-04T06:00:38 1777874438

> But my impression so far is even when model subsidization is done, those open source models are quite viable alternatives.

My understanding is that DeepSeek V4 Pro is going to be uniquely good at working on consumer platforms with SSD offload, due to its extremely lean KV cache. Even if you only have a slow consumer platform, you should be able to just let it grind on a huge batch of tasks in parallel entirely unattended, and wake up later to a finished job.

AIUI, people are even experimenting with offloading the KV cache itself to storage, which may unlock this batching capability even beyond physical RAM limits as contexts grow. (This used to be considered a bad idea with bulky KV caches, due to concerns about wearout and performance, but the much leaner KV cache of DeepSeek V4 changes the picture quite radically.)

torginus · 2026-05-04T08:37:17 1777883837

Good. It's hard to overstate how nervous most executives are about relying on cloud-based providers.

AI currently works basically by sending your entire codebase and workflow, and internal communication over the internet to some third party provider, and your only protection is some legal document say they pinky promise they won't train on your data.

And said promise is made by people whose entire business model relies on being able to slurp up all the licensed content on the internet and ignore said licensing, on the defense of being too big to fail.

zozbot234 · 2026-05-04T08:46:28 1777884388

Yes, this is the most straightforward argument for local AI inference. "Why buy cloud-based SOTA AI? We have SOTA AI at home." It's great that DeepSeek may now be about to make this possible, once the support in local inference frameworks is up to the task.

adonese · 2026-05-04T06:23:54 1777875834

Is there any place I can read about KV? Excuse my ignorance as I'm not familiar with this topic and I read scattered notes that deepseek's cost are well optimized due to how their kv cache work. But I want to read more how kv cache relates to the inference stack and where does it actually sit.

> AIUI, people are even experimenting with offloading the KV cache itself to storage, which may unlock this batching capability even beyond physical RAM limits as contexts grow.

Especially this point. Any reason that this idea was considered bad? Is it due to the speed difference between the GPU VRAM to the RAM?

zozbot234 · 2026-05-04T06:37:13 1777876633

KV cache generally grows linearly with your current context; it gets filled-in with your prompts during prompt processing, and newly created context gets tacked on during token generation. LLM inference uses it to semantically relate the currently-processed token to its pre-existing context.

> Any reason that this idea was considered bad?

Because the KV cache was too big, even for a small context. This is still an issue with open models other than DeepSeek V4, though to a somewhat smaller extent than used to be the case. But the tiny KV of DeepSeek V4 is genuinely new.

miroljub · 2026-05-04T10:29:39 1777890579

> even when model subsidization is done, those open source models are quite viable alternatives.

Model inference was never subsidized. Inference is highly profitable with today's prices. That's why you have many inference providers. My guess, the prices for inference will go down, as more competition starts cutting the margin.

It's model training, development and R&D that cost a lot, and companies creating closed models don't have any business model except astroturfing and trying to recover training costs through overpriced inference.

spaceman_2020 · 2026-05-04T08:05:08 1777881908

have you used it for non coding tasks via MCP, like Figma/Paper for design or Ableton MVP for sound design?

The token cost makes it tempting to use for token-heavy tasks like this

63stack · 2026-05-04T11:14:37 1777893277

It's close to Opus 4.5 for me

niobe · 2026-05-04T06:55:01 1777877701

thanks, that was super easy.

I have been wanting to try CC with different models since Opus went downhill last month..

What limitations or issues have you noticed when using DeepSeek with Claude Code if any?

klaussilveira · 2026-05-04T22:19:41 1777933181

Damn, we need something like this for Ollama.

bhelkey · 2026-05-04T22:42:06 1777934526

Ollama works with Claude Code:

$ ollama launch claude --model qwen3.5 [1]

[1] https://docs.ollama.com/integrations/claude-code

Daunk · 2026-05-05T02:13:40 1777947220

You mean `exec claude "$@"`

thatxliner · 2026-05-05T00:03:21 1777939401

how do subagents work?

aftbit · 2026-04-28T00:04:11 1777334651

Darn I've only got ~20 GB of VRAM. I really need to get a stronger machine for this sort of stuff.

MerrimanInd · 2026-04-28T00:31:07 1777336267

20GB isn't enough for a 13B parameter model? I thought the 29-31B models could run on a 24GB GTX x090 card?

I'm currently shopping for a local LLM setup and between something like the Framework Desktop with 64-128GB of shared RAM or just adding a 3090 or 4090 to my homelab so I'm very curious what hardware is working well for others.

zamadatix · 2026-04-28T01:01:58 1777338118

> 20GB isn't enough for a 13B parameter model? I thought the 29-31B models could run on a 24GB GTX x090 card?

Parameters are like Hertz - they don't really tell you much until you know the rest anyways. In this case, a parameter is a bfloat16 (2 bytes). I'm sure someone will bother to makes quants at some point.

> I'm currently shopping for a local LLM setup and between something like the Framework Desktop with 64-128GB of shared RAM or just adding a 3090 or 4090 to my homelab so I'm very curious what hardware is working well for others.

I grabbed a 395 laptop w/ 128 GB to be a personal travel workstation. Great for that purpose. Not exactly a speed demon with LLMs but it can load large ones (which run even slower as a result) and that wasn't really my intent. I've found GPUs make more usable local LLMs, particularly in the speed department, but I suppose that depends more on how you really use them and how much you're willing to pay to have enough total VRAM.

It's next to impossible to make your money back on local (regardless what you buy) so I'd just say "go for whatever amount of best you're willing to put money down for" and enjoy it.

mghackerlady · 2026-04-28T13:44:32 1777383872

>2 bytes

So a wyde

Wowfunhappy · 2026-04-28T00:29:37 1777336177

How much system memory do you have? Llama.cpp can split layers across cpu and gpu. Speeds will be slower of course but it's not unusable at all.

aftbit · 2026-04-23T20:48:36 1776977316

How did the FBI break into the phone in the first place? Shouldn't they be fixing that bug too?

DennisP · 2026-04-24T12:52:27 1777035147

It could have been by just pointing the phone at the suspect's face.

47282847 · 2026-04-24T14:05:33 1777039533

Side note: FaceID only unlocks if you actually look at the screen. If you’re careful to avoid that, one would have to physically force your eyes to do that without also covering other necessary areas of your face.

A kid and I sometimes engage in a game where they try to get me to look where necessary, so far without success.

godelski · 2026-04-25T03:28:33 1777087713

  > FaceID only unlocks if you actually look at the screen.

You need "Require Attention for Face ID" turned on for this

aftbit · 2026-04-23T20:39:52 1776976792

Do either of those work on browser extensions that I install as a user? I don't see anything relating to extensions in there.

eranation · 2026-04-24T14:53:54 1777042434

Nope but that’s a good idea