I love this and wanted to build it myself, but https://www.alphaxiv.org/ already exists and sees almost no social activity (hardly any papers have comments), which makes me doubtful about the idea.
I'd be interested to hear if anyone knows why the format may not resonate with researchers, or with people reading papers in general.
My own reason is that to get value from a "social" site, interactions have to be frequent and fast-paced for people to stay engaged, which is maybe not possible to hit with research papers.
People will not flock somewhere unless they sense some potential return on investment. If a website looks like it will disappear in a few months, it does not make sense for a user to invest time and effort into it.
You have to either invest a lot to get a critical mass to join your site, or make it extremely entertaining to be there from the start. Apart from all the criticism, this is what Facebook, Instagram, Twitter, and LinkedIn got right from the start. For their intended audiences, it is either useful or fun to be on their platforms.
I don't see much added value in most arXiv extensions, except for Semantic Scholar [1], which might have been lucky to be one of the first.
> My own reason is ... maybe not possible to hit on research papers.
I think fancy people with appropriate credentials and .edu emails are all using OpenReview? So the audience is what, the unwashed masses who also happen to be doing some light reading at the bleeding edge of knowledge? Surely there are dozens of us, I tell you, dozens! =P But yeah, maybe not enough to sustain a social network.
Never heard of alphaxiv, will try it. I would also love for this to work, but I'm probably not willing to risk slogging through science twitter/bluesky/mastodon. Honestly, HN would be the obvious place if it added a pretty simple tagging system, as most of the people interested are probably already here. I don't think we'll see that, though, because if we had filters no one would go to the front page, and that would be a bad thing for certain interests.
Personally, I think the timescales, rewards, and social conventions of social media and academic publishing don't match very well. Social networking feels transient and impersonal. I would like to take some time to form my opinion about a paper, not jump in with a post. Maybe a comment box would be fine for writing a couple of nice things about a paper, but it doesn't feel like the place for harsh criticism or complex discussion where things could be misunderstood. Rather than write in the public record, if I think a paper has a deep flaw I would prefer to contact the authors first; this can be followed up by discussion in your own papers. Others may have different opinions, of course.
I could see the author using GenAI video creation to summarize each paper in a short video. I believe this format could do wonders for paper discovery: say you choose "Computer science" and flip through 20 papers in a few minutes, getting an idea of what research has recently been published.
Other formats are dense and require reading and internalizing the content.
This is a fair question, but not one I feel we can let people self-answer.
I doubt many people will honestly admit that they did no design or testing, and that they believe the code is subpar.
It does give me the idea that maybe we need a third-party system that can try to answer some of the questions you are asking… of course, it too would be LLM-driven and quite subjective.
> I doubt many people will honestly admit that they did no design or testing, and that they believe the code is subpar
I'd doubt any engineer who doesn't call most of their own code subpar when looking back at it a week or two later. "Hacking" also famously involves little design or (automated) testing, so sharing something like that doesn't mean much unless you're trying to launch a business, and I see no evidence of that for this project.
> I doubt many people will honestly admit that they did no design or testing, and that they believe the code is subpar.
Well, no. But if people want to see a statement like this, and given that most people will want to be at least halfway honest and not admit to slop, maybe it will help nudge things in the right direction.
Fair play for launching this, it looks like a neat project.
However, I feel it will be an uphill battle competing with OpenAI and Anthropic; I doubt your harness can be better, since they see so much traffic through theirs.
So is this for those who care about the harness running on their own infra? I'm not sure why anyone would, since the LLM call means you are sending your code to the lab anyway.
Sorry, I don't want to sound negative; I am just trying to understand the market for this.
We are not trying to compete with OpenAI and Anthropic! We open-sourced it because there's interest from other startups.
Teams would use Anthropic and OpenAI, but they shouldn't just use Anthropic or OpenAI. We see much better results from calling the models independently and doing adversarial review and response.
This doesn't replace your need for the models, but you certainly don't need to rely on any of the cloud agent solutions out there that call these models under the hood.
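To make "calling the models independently and doing adversarial review" concrete, here is a minimal sketch of such a loop. Everything here is hypothetical (the function names, the round structure, and the "LGTM" convention are mine, not the project's actual implementation), and the model calls are stubbed as plain callables so the orchestration itself runs without any API keys:

```python
# Hypothetical adversarial-review loop: one model drafts, an independent
# model critiques, and the critique is fed back for a revision.
# Not the project's real API; model endpoints are stubbed as callables.

def adversarial_review(task, author_model, reviewer_model, max_rounds=2):
    """Draft with one model, critique with a second, revise until the
    reviewer is satisfied or max_rounds is reached."""
    draft = author_model(task)
    for _ in range(max_rounds):
        critique = reviewer_model(f"Review this solution critically:\n{draft}")
        if "LGTM" in critique:  # assumed convention: reviewer signals approval
            break
        draft = author_model(f"Task: {task}\nCritique: {critique}\nRevise.")
    return draft

# Toy stand-ins for two independently called model endpoints:
author = lambda prompt: "def add(a, b): return a + b"
reviewer = lambda prompt: "LGTM"

result = adversarial_review("write an add function", author, reviewer)
print(result)
```

In a real setup the two callables would wrap separate provider SDKs, which is the point of calling the models independently rather than through one vendor's agent product.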
I think this used to be the trick, but now that AI is such a general-purpose technology, I am not sure it makes sense anymore. Users can "niche down" within a generic app using prompts and configs.
I think if you believe that, you're either lying or experiencing psychosis. LLMs are the greatest innovation in information retrieval since PageRank, but they are not capable of thought any more than PageRank is.
I actually noticed the same. Having it work on Mithril.js instead of React seems (I know it's all just hearsay) to generate much cleaner code. Maybe it's just because I know and like Mithril better, but it's also likely because of the project's ethos and its being used in the wild by people who really want to use Mithril. I've seen the same with other slightly more exotic stacks, like Bottle vs. Flask, or telling it to generate Scala or Erlang.
That makes sense. There's less training data, but it is better training data. LLMs were trained on a lot of really bad pandas code, so they're really good at generating bad pandas. With Elixir, there's less of it, but what exists is higher quality, so what the model outputs is of higher quality too.