pubsub
pubsub is a generic name for a protocol, system, or service that routes messages between producers and consumers. It usually refers to one of three things:
- a system that notifies readers that a publisher has published new content, e.g. PubSubHubbub.
- a system that sends push notifications to users, e.g. iOS's APNs and Android's GCM.
- a category of plumbing, e.g. Google Cloud Pub/Sub, Amazon SQS and other queue systems.
Subscribing to updates to a URL (i.e. the first definition above) is an extremely useful building block for the IndieWeb. The rest of this page discusses that context.
Generalised Flow
- Subscriber -> Publisher: Please let me know when you update http://example.org/
- Publisher -> Subscriber: Okay, now that I've verified you're not spam, I'll let you know
Later, after updating http://example.org/
- Publisher -> Subscriber: I updated http://example.org/, and here's its new content
Problems with this flow
- Lots of load on the publisher: it has to manage subscriptions and send out every update itself. That is too much complexity, and too low a value/pain ratio, for most creators to implement
- Solution: hubs act as the middleman between subscribers and publishers
Previous/Current Work
- PubSubHubbub is a decentralised, hub-based approach
  - Widely implemented and suitably easy for publishers to use, but difficult to test
  - Lots of complexity for subscribers
  - Reliance on the Atom format and its semantics, and DRY-violating duplication
- RSS Cloud
Brainstorming
Ideal solution: a content-agnostic, pure-HTTP hub-based approach.
Publishers create content on the web and tell the hub(s) that it exists, or that it has been updated, via a POST request. Content is published with a Last-Modified header.
Hubs periodically poll the content for changes (a HEAD request and inspection of the Last-Modified header) in case they missed any updates, and to enable pubsub for content published by people who, for whatever reason, can't notify the hub.
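The polling step could be sketched roughly as follows. This is only an illustration under assumptions: the function names `fetch_last_modified` and `has_changed` are hypothetical, and treating a missing header as "no change" is one possible policy, not something this page specifies.

```python
from email.utils import parsedate_to_datetime
from typing import Optional
from urllib.request import Request, urlopen


def fetch_last_modified(url: str, timeout: float = 10.0) -> Optional[str]:
    """Send a HEAD request and return the raw Last-Modified header, if present."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return response.headers.get("Last-Modified")


def has_changed(previous: Optional[str], current: Optional[str]) -> bool:
    """Compare the stored Last-Modified value with a freshly polled one.

    Both arguments are raw HTTP date strings, or None when no header was seen.
    """
    if current is None:
        # Assumption: without a header there is nothing to compare,
        # so this sketch reports no change.
        return False
    if previous is None:
        # First time the hub has seen this URL with a Last-Modified header.
        return True
    return parsedate_to_datetime(current) > parsedate_to_datetime(previous)
```

A hub would store the last value it saw for each URL, call `fetch_last_modified` on a schedule, and notify subscribers whenever `has_changed` returns True.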
Publishers publicly declare which hub manages notifications for their content via a Link header or HTML/XML element with rel=hub.
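As an illustration of discovery via the Link header, here is a minimal sketch. The function name `find_hubs` is hypothetical, and the regular expressions handle only simple, well-formed header values rather than the full Link header grammar.

```python
import re
from typing import List

# One link-value: <target> followed by zero or more ;name=value parameters.
_LINK = re.compile(r'<([^>]+)>((?:\s*;\s*[\w*-]+\s*=\s*(?:"[^"]*"|[^\s;,"]+))*)')
# The rel parameter, either quoted or as a bare token.
_REL = re.compile(r';\s*rel\s*=\s*(?:"([^"]*)"|([^\s;,"]+))', re.IGNORECASE)


def find_hubs(link_header: str) -> List[str]:
    """Return the targets of every link whose rel value includes "hub"."""
    hubs = []
    for target, params in _LINK.findall(link_header):
        match = _REL.search(params)
        if match is None:
            continue
        # A rel value may list several space-separated relation types.
        rels = (match.group(1) or match.group(2) or "").lower().split()
        if "hub" in rels:
            hubs.append(target)
    return hubs
```

For example, given the header value `<https://hub.example.com/>; rel="hub", <http://example.org/>; rel="self"`, `find_hubs` returns only the first target. Discovery from an HTML or XML element with rel=hub would need a markup parser and is not covered here.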
When a subscriber wants to subscribe to changes to a URL, they discover the hub as described above and send a POST request to some endpoint (/subscribe?) with the URL they're subscribing to and the URL they want notifications sent to.
- We need some sort of auth at this point to weed out spammers and to let subscribers verify the authenticity of future notifications.
- TODO: spec this stage out in more detail, take inspiration from current PuSH work
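A subscription request along these lines might look like the sketch below. Everything beyond "a POST with two URLs" is an assumption: the `/subscribe` path is the tentative one floated above, the form field names `topic` and `callback` are hypothetical, and the auth step is left out because it has not been specified yet.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import Request


def build_subscribe_request(hub_url: str, topic_url: str, callback_url: str) -> Request:
    """Build (but do not send) a subscription POST for the given hub.

    topic_url is the URL being subscribed to; callback_url is where the
    subscriber wants notifications delivered.
    """
    # Hypothetical field names; the page does not define them.
    body = urlencode({"topic": topic_url, "callback": callback_url}).encode("ascii")
    return Request(
        urljoin(hub_url, "/subscribe"),  # assumed endpoint path
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Building the request separately from sending it keeps the sketch easy to inspect and test.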
Then, when a publisher updates their content and either a) sends a notification to the hub or b) the hub polls the URL and sees the Last-Modified header change, the hub sends POST requests out to all the subscribers for that URL.
- Possibly with the URL content as the POST body, so that subscribers don't all re-fetch the URL at once (the Thundering Herd problem)
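The fan-out step can be sketched as a small in-memory hub. The class and method names are hypothetical, subscriptions are not persisted, and sending a Link header to tell the subscriber which URL changed is an assumption rather than part of this proposal.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Set
from urllib.request import Request, urlopen

# Delivery function signature: (callback_url, topic_url, content).
Sender = Callable[[str, str, bytes], None]


def post_notification(callback_url: str, topic_url: str, content: bytes) -> None:
    """POST the updated content to one subscriber."""
    request = Request(
        callback_url,
        data=content,
        # Assumption: identify the updated URL with a Link header.
        headers={"Link": f'<{topic_url}>; rel="self"'},
        method="POST",
    )
    with urlopen(request, timeout=10):
        pass


class Hub:
    """Tracks subscribers per URL and fans out updates to them."""

    def __init__(self, send: Sender = post_notification) -> None:
        self._send = send
        self._subscribers: DefaultDict[str, Set[str]] = defaultdict(set)

    def subscribe(self, topic_url: str, callback_url: str) -> None:
        # A set makes repeated subscriptions from the same callback harmless.
        self._subscribers[topic_url].add(callback_url)

    def publish(self, topic_url: str, content: bytes) -> int:
        """Send content to every subscriber of topic_url; return how many were sent."""
        callbacks = sorted(self._subscribers.get(topic_url, set()))
        for callback_url in callbacks:
            self._send(callback_url, topic_url, content)
        return len(callbacks)
```

Because the content travels in the POST body, subscribers do not need to fetch the URL again. A real hub would also need retries, error handling for unreachable callbacks, and the auth step discussed above.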
Benefits over current solutions
- Eliminates DRY-violating duplication
- Content-agnostic: not tied to Atom feeds, but usable with *any* content served over HTTP
- Extremely low barrier to entry for publishers: occasional hub polling of content means a publisher doesn't even need to send notifications to the hub to begin with