hnsync: A Hacker News Sync Tool
November 2024
hnsync is a Go application that syncs Hacker News items to a local SQLite database. It fetches HN items (stories and comments) up to the maxitem reported by the HN API and then exits. Recent items are also periodically refreshed during the run.
You can find the source code and more details on GitHub.
Getting Started
Running hnsync is simple. Just execute:
go run github.com/larose/hnsync@latest
By default, the synced data is stored in a file named hn.db, in a table called hn_items. This table contains two key columns:
- id: The unique identifier for each Hacker News item (INTEGER).
- data: The raw JSON response from the Hacker News API (TEXT).
Example Query
You can easily inspect the data using SQLite:
sqlite> SELECT id, data FROM hn_items LIMIT 1;
id data
-- ------------------------------------------------------------
1 {"by":"pg","descendants":15,"id":1,"kids":[15,234509,487171,
82729],"score":57,"time":1160418111,"title":"Y Combinator","
type":"story","url":"http://ycombinator.com"}
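Because the data column holds the raw API response, it can be decoded in Go with encoding/json. The following is a minimal sketch, not code from hnsync: the Item struct and the decodeItem helper are hypothetical names, and the struct covers only the fields visible in the example row above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Item mirrors only the fields visible in the example row.
// Hypothetical type for illustration; it is not part of hnsync.
type Item struct {
	ID          int64   `json:"id"`
	Type        string  `json:"type"`
	By          string  `json:"by"`
	Time        int64   `json:"time"` // Unix timestamp, in seconds
	Title       string  `json:"title"`
	URL         string  `json:"url"`
	Score       int     `json:"score"`
	Descendants int     `json:"descendants"`
	Kids        []int64 `json:"kids"`
}

// decodeItem parses one value of the data column into an Item.
// Fields missing from the JSON keep their zero values.
func decodeItem(raw string) (Item, error) {
	var it Item
	err := json.Unmarshal([]byte(raw), &it)
	return it, err
}

func main() {
	// The row shown in the example query, joined back into one line.
	raw := `{"by":"pg","descendants":15,"id":1,"kids":[15,234509,487171,82729],"score":57,"time":1160418111,"title":"Y Combinator","type":"story","url":"http://ycombinator.com"}`

	it, err := decodeItem(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(it.ID, it.By, it.Title, len(it.Kids)) // prints "1 pg Y Combinator 4"
}
```

Reading the rows themselves from hn.db would additionally need a SQLite driver for database/sql, which is left out here to keep the sketch standard-library only.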
How It Works
The design of hnsync is very simple. It uses three types of workers:
- Discoverer: Finds item IDs up to the maxitem provided by the HN API and adds them to the database.
- Refresher: Scans the database for items that need updating and queues them for processing.
- Syncer: A group of workers that consume the queued items, fetch their data from the HN API, and save the updated records to the database.
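The division of labor between the three workers can be sketched with a channel and a small worker pool. This is an illustration under stated assumptions, not the actual hnsync code: an in-memory map stands in for the SQLite table, fakeFetch stands in for the HN API call, "needs updating" is taken to mean "has no data yet", and every name (store, syncOnce, fakeFetch) is hypothetical. The three roles also run as a single pass here rather than as long-lived workers.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// store is an in-memory stand-in for the hn_items table (id -> raw JSON).
type store struct {
	mu   sync.Mutex
	rows map[int64]string
}

func newStore() *store { return &store{rows: make(map[int64]string)} }

// discover plays the Discoverer: it registers every ID from 1 to maxItem
// that is not yet known, with empty data.
func (s *store) discover(maxItem int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for id := int64(1); id <= maxItem; id++ {
		if _, ok := s.rows[id]; !ok {
			s.rows[id] = ""
		}
	}
}

// stale returns, in ascending order, the IDs the Refresher would queue.
// Assumption: an item needs updating when its data is empty.
func (s *store) stale() []int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	var ids []int64
	for id, data := range s.rows {
		if data == "" {
			ids = append(ids, id)
		}
	}
	sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })
	return ids
}

// save stores the fetched data for one item.
func (s *store) save(id int64, data string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.rows[id] = data
}

// syncOnce runs one discover -> refresh -> sync pass.
func syncOnce(s *store, maxItem int64, workers int, fetch func(int64) (string, error)) {
	s.discover(maxItem)

	queue := make(chan int64)
	var wg sync.WaitGroup

	// Syncers: consume queued IDs, fetch their data, save the result.
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range queue {
				data, err := fetch(id)
				if err != nil {
					continue // the item stays stale; a later pass would retry it
				}
				s.save(id, data)
			}
		}()
	}

	// Refresher: queue every item that needs updating, then signal completion.
	for _, id := range s.stale() {
		queue <- id
	}
	close(queue)
	wg.Wait()
}

// fakeFetch stands in for an HN API request and returns a minimal JSON body.
func fakeFetch(id int64) (string, error) {
	return fmt.Sprintf(`{"id":%d}`, id), nil
}

func main() {
	s := newStore()
	syncOnce(s, 5, 3, fakeFetch)
	fmt.Println(len(s.rows), len(s.stale())) // prints "5 0"
}
```

An unbuffered channel keeps the sketch simple: the refresher step blocks until a syncer is free, so the number of in-flight fetches never exceeds the number of workers.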
Feedback
If you've used hnsync for a project, I’d love to hear about it. Feel free to share what you’ve built!