Thursday, June 06, 2013

Does polling scale better than push?

For the sake of simplicity, assume 10000 subscribers, 1 publisher, and that the resources required to serve 1 pull request are roughly equal to the resources required to send 1 push:
  • A hub that is pulled from every minute has to serve (number of subscribers x 1440) requests per day, i.e., 10000 subscribers x 1440 = 14,400,000 requests per day, irrespective of the number of updates.
  • A hub that pushes has to send (number of subscribers x number of updates) pushes per day for each publisher, i.e., 10000 subscribers x number of updates x 1 publisher = 10000 x number of updates pushes per day.
  • So that's 10000 x 1440 for pull and 10000 x number of updates for push.
  • Therefore, if the number of updates per day is greater than 1440, a hub that pushes will require more resources than one that is pulled from.
  • More importantly, a hub that is pulled from will not require additional resources if the number of updates per day increases.
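
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the simplified model in this post; the function names are made up for the example, and it keeps the assumption that one pull costs the hub about as much as one push.

```python
# Simplified model from the post: one poll per subscriber per minute,
# and one push per subscriber per update per publisher.
MINUTES_PER_DAY = 24 * 60  # 1440


def pull_requests_per_day(subscribers, polls_per_day=MINUTES_PER_DAY):
    """Requests a polled hub serves per day; independent of update count."""
    return subscribers * polls_per_day


def push_sends_per_day(subscribers, updates_per_day, publishers=1):
    """Pushes a pushing hub sends per day; grows with the update count."""
    return subscribers * updates_per_day * publishers


if __name__ == "__main__":
    subscribers = 10000
    for updates in (10, 1440, 5000):
        pull = pull_requests_per_day(subscribers)
        push = push_sends_per_day(subscribers, updates)
        print(f"updates/day={updates}: pull={pull}, push={push}")
```

With 10000 subscribers the two numbers meet at 1440 updates per day; below that push sends fewer messages, above it pull serves fewer requests.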

Would love to hear what you think (in the comments) especially if you think this might not be the case.

Notes

  • This assumes that >= 1 min latency is ok for your specific use-case.
  • The resources required to serve 1 pull request might not be equal to the resources required to send 1 push. Here are my notes on why; I would love to hear yours:
    • Given a constant number of subscribers and publishers, a pull-based system will experience a uniform load throughout, while a push-based system will experience load in bursts.
    • Push potentially uses less bandwidth, though pull can take advantage of caching.
    • Push has the overhead of dealing with subscribers that are not available: keeping track of such subscribers and retrying several times.
  • Proof by induction doesn't work because with push not every subscriber is subscribed to every publisher.
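
To see how the break-even point of 1440 updates per day moves when a pull and a push do not cost the same, here is a rough sketch. The parameters `pull_cost`, `push_cost` and `retry_rate` are hypothetical knobs I am introducing for illustration (relative cost units and the average number of extra sends per push caused by retries); they are not measurements of any real hub.

```python
def break_even_updates(polls_per_day=1440, pull_cost=1.0, push_cost=1.0,
                       retry_rate=0.0):
    """Updates per day at which push and pull cost the hub the same.

    Pull cost per day = subscribers * polls_per_day * pull_cost
    Push cost per day = subscribers * updates * push_cost * (1 + retry_rate)
    The subscriber count cancels out when the two are set equal.
    All cost parameters are hypothetical, in arbitrary relative units.
    """
    effective_push_cost = push_cost * (1 + retry_rate)
    return polls_per_day * pull_cost / effective_push_cost


if __name__ == "__main__":
    # Equal costs, no retries: the break-even from the main argument.
    print(break_even_updates())
    # Assumed: caching makes a pull half as expensive as a push.
    print(break_even_updates(pull_cost=0.5))
    # Assumed: retries add 25% extra sends to every push.
    print(break_even_updates(retry_rate=0.25))
```

Under these assumptions, cheaper pulls (for example thanks to caching) lower the number of updates per day at which pull starts to win, and so does retry overhead on the push side.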


See PushHubPullSub

This was inspired by my notes on Push vs Pull on the IndieWebCamp wiki.

1 comment:

Vishnu S Iyengar said...

I say no, but my comment is stuck in the previous location of your post =p