There has been a lot of Twitter-chat today about Twitter's change that stopped people seeing ‘@’ replies from people they weren’t following.
There was a lot of moaning and groaning throughout the day and later news suggested that this was perhaps a scalability/performance issue.
Now, I spend a lot of my time dealing with exactly these issues, so I thought I would burble a bit about it, because I think I can understand the issue.
I can’t be sure of Twitter’s storage and indexing structure, so this is largely conjecture, but hopefully it gets close to explaining the issue.
For you to ‘see’ a tweet it has to be retrieved into your stream (by whatever means: web, API, etc.). So there is a big lookup: tweets by the people you follow, tweets that mention you, and so on.
To find ALL the tweets that mention you, the search has to look in ALL streams. It would appear that there isn’t any kind of indexing that can cope with this. The signs have been there for a while, particularly in the lag people were seeing before their tweets appeared in searches.
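To make the difference concrete, here is a toy sketch of the two approaches: scanning every stream on each request versus keeping an index keyed by the mentioned user. Everything in it (the `streams` dictionary, the function names, the regex for spotting mentions) is invented for illustration; I have no idea what Twitter actually does internally.

```python
import re
from collections import defaultdict

# Assumption: a mention is "@" followed by word characters.
MENTION = re.compile(r"@(\w+)")

def find_mentions_by_scan(streams, user):
    """Naive lookup: read every tweet in every stream on each request."""
    target = user.lower()
    return [
        (author, text)
        for author, tweets in streams.items()
        for text in tweets
        if target in (name.lower() for name in MENTION.findall(text))
    ]

def build_mention_index(streams):
    """Indexed lookup: record each mention once, keyed by the mentioned user."""
    index = defaultdict(list)
    for author, tweets in streams.items():
        for text in tweets:
            for name in {n.lower() for n in MENTION.findall(text)}:
                index[name].append((author, text))
    return index

# Hypothetical sample data.
streams = {
    "alice": ["@bob lunch?", "nothing to see here"],
    "carol": ["agree with @bob and @alice", "@dave hello"],
}
index = build_mention_index(streams)
```

With the index in place, `index["bob"]` returns the same tweets as `find_mentions_by_scan(streams, "bob")` without touching anyone else's stream at read time. The cost moves to write time, where every new tweet has to update the index.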
I have a very strong feeling that the weight of searching for @replies across all streams for everyone was just getting a bit too much. And the comments about “serious technical reasons why that setting had to go or be entirely rebuilt” are entirely expected, because the work needed to support this grows much faster than user numbers do: every new user adds more tweets to search and is one more person doing the searching.
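A back-of-envelope sketch of that growth, under loudly simplified assumptions: every user triggers one full scan, and every user has written the same number of tweets. The function name and the figures are made up purely to show the shape of the curve.

```python
def naive_scan_work(users, tweets_per_user):
    """Tweets read if each of `users` people triggers one full scan of all streams."""
    total_tweets = users * tweets_per_user
    return users * total_tweets

# Assumed figures, chosen only to illustrate the trend.
for users in (1_000, 10_000, 100_000):
    print(users, naive_scan_work(users, tweets_per_user=100))
```

Under these assumptions, ten times the users means a hundred times the work, so the load runs away from you long before the user count looks alarming.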
It will need an entirely new approach to indexing (and perhaps storage) to pull this off. Something much more akin to the Google model and map/reduce is likely, but who knows.
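For anyone who hasn't met map/reduce, here is a minimal single-machine sketch of the idea applied to mentions. The map step emits a (mentioned user, tweet id) pair for each mention, and the reduce step groups those pairs by user. The names `map_tweet` and `reduce_mentions` and the sample tweets are my own invention; a real system would spread both steps across many machines.

```python
import re
from collections import defaultdict
from itertools import chain

# Assumption: a mention is "@" followed by word characters.
MENTION = re.compile(r"@(\w+)")

def map_tweet(tweet_id, text):
    """Map step: emit one (mentioned_user, tweet_id) pair per mention."""
    return [(name.lower(), tweet_id) for name in MENTION.findall(text)]

def reduce_mentions(pairs):
    """Reduce step: group tweet ids by the user they mention."""
    grouped = defaultdict(list)
    for name, tweet_id in pairs:
        grouped[name].append(tweet_id)
    return dict(grouped)

# Hypothetical sample data: tweet id -> text.
tweets = {1: "@bob lunch?", 2: "agree with @bob and @alice", 3: "no mentions"}
pairs = chain.from_iterable(map_tweet(i, t) for i, t in tweets.items())
mention_index = reduce_mentions(pairs)
```

The appeal is that each tweet can be mapped independently of every other tweet, so the expensive part can be split across as many machines as you can afford.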
Anyway, I feel their pain. Tuning is such painful progress because you are running very hard just to stay still, when all you want to do is add new features. Maybe they’ll just turn it back on and get more cooling fans…