Hacker News
mahmud on Oct 6, 2010 | on: HackerFollow: Follow HN Users
Instead of crawling by #id, crawl by new posts from the /newest page. For each post, split it into multiple pages, setting the parent id/title that way. Not that you have to, but a future suggestion.
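
A rough sketch of the "/newest first" idea, for illustration only. It assumes story links on the page look like `item?id=NNN`; the function name `new_item_ids` and the `seen` set are made up here and are not part of HackerFollow. Fetching the page is left out so the snippet stays self-contained.

```python
import re

# Assumption: each story on /newest is linked as "item?id=<digits>".
ITEM_LINK = re.compile(r"item\?id=(\d+)")

def new_item_ids(newest_html, seen):
    """Return item ids found in the page that are not in `seen`, in page order."""
    ids = []
    for match in ITEM_LINK.finditer(newest_html):
        item_id = int(match.group(1))
        # Skip ids crawled earlier and repeats within the same page.
        if item_id not in seen and item_id not in ids:
            ids.append(item_id)
    return ids
```

Each returned id could then be crawled as its own story, so every comment found under it inherits that story's id and title as its parent.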
jeromec on Oct 6, 2010
I'm guessing the info is currently scraped from the 'threads?id=username' page, but the title to each story is already there after the word 'on'.
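
Something like the following could pull that title out, assuming the comment header has already been reduced to plain text of the form `user on date | ... | on: Story Title`. The helper name `story_title_from_header` is hypothetical, and the exact header layout is a guess based on how the page reads.

```python
import re
from typing import Optional

# Assumption: the story title is whatever follows "| on:" up to the end of the header.
ON_MARKER = re.compile(r"\|\s*on:\s*(?P<title>.+?)\s*$")

def story_title_from_header(header: str) -> Optional[str]:
    """Return the story title from a comment header, or None if there is no 'on:' part."""
    match = ON_MARKER.search(header)
    return match.group("title") if match else None
```

For example, `story_title_from_header("mahmud on Oct 6, 2010 | parent | on: HackerFollow: Follow HN Users")` gives back the story title, while a header without an `on:` part gives `None`.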