The Technorati prism vs reality

Technorati is a great source of information, trends and numbers about the blogosphere. But one has to take the figures coming out of it with a grain of salt and make up one's own mind about the Technorati prism, which is not reality.

Take for example the Oct 06 state of the blogosphere from David Sifry. Here in France, some are "worried" that French is underrepresented. Without entering into any (useless) debate about the importance (or even merits) of French vs other languages, I strongly suspect that Technorati is actually incapable of accurately tracking languages, at least not automatically. The proof? The latest "top 100 French blogs" co-branded by Technorati and Edelman leaves out a lot of prominent French blogs that should have been listed. Why? Because each blogger has to 1) get a profile on Technorati, 2) claim their blog, 3) manually set their blog's "primary language" in their Technorati profile. You can bet that most of them didn't go that far. I didn't even notice I had to do that until I looked into the inaccuracies in the aforementioned listing. And if they don't, well, my bet is that they're "assimilated" in some way, most probably into the monocultural prism that predominates in a certain part of North America (1) ;-). Provided that the process works in the first place, which is far from certain (2)!

Come on guys, a $1,995 Google Mini is able to autodetect languages in any document. All prominent search engines do. How does Technorati deal with languages today? Manually? My bad, they're using languid to automate that, but David says it needs to be improved.
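To give an idea of how little it takes to get a rough automatic guess, here is a minimal sketch in Python. It assumes two tiny hand-picked stopword lists and a hypothetical `guess_language` helper that I made up for illustration; it is not how Technorati, languid or the Google Mini actually work, and real detectors rely on much richer statistics.

```python
import re

# Tiny, hand-picked stopword lists: an assumption for illustration only,
# not taken from Technorati, languid or any search engine.
STOPWORDS = {
    "en": {"the", "and", "of", "to", "is", "in", "that", "it", "for", "with"},
    "fr": {"le", "la", "les", "et", "de", "des", "est", "un", "une", "que", "pour", "dans"},
}


def guess_language(text, min_hits=2):
    """Return the language code whose stopwords appear most often in text.

    Falls back to "unknown" when fewer than min_hits stopwords are found,
    so very short or non-textual input is not forced into a language.
    """
    # Lowercase, then split into runs of (possibly accented) Latin letters.
    words = re.findall(r"[a-zà-ÿ]+", text.lower())
    scores = {
        lang: sum(1 for word in words if word in stops)
        for lang, stops in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_hits else "unknown"
```

Run over the text of a blog's latest posts, something like `guess_language(post_text)` would at least provide a default, so that the manual "primary language" setting becomes a correction rather than a prerequisite.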

A second hint about the level of accuracy of the Technorati figures is this phrase from David (emphasis mine):

My gut feeling is that since we're better at dealing with Spam now, even some of the blue areas in last quarter's graph were probably accountable to spam, which would mean that rather than the bumpy ride shown above, we're actually seeing a steady increased (but slower) growth of the blogosphere.

Also, there are lots of sites in their index that aren't blogs. Case in point: how come this corporate site has a rank (and totally false "updated" information), and how is it counted or separated from blogs? On top of that, they exclude a very large chunk of French blogs by not indexing Skyblog (5.9M blogs as of today, hardly insignificant compared to the size of the French blogosphere).

Another dirty little secret I've long suspected is that Technorati doesn't go further than a blog's home page when counting links, at least for the ranking that serves as the "authority" level. I'd like to be wrong on that one.

Don't get me wrong: I positively applaud the work David is doing with his regular states of the blogosphere, and I have a lot of respect for the folks at Technorati. But I would really welcome a bit more clarity about the methods (and assumptions) they're using to get those numbers out.

So far, you really have to read between the lines and make up your own mind about the Technorati prism vs reality.

(1) Technorati isn't localized, so only those who read English can go through the registration process. I find it weird that they can claim any accuracy in following foreign blogs when they start by excluding those bloggers who don't speak English.
(2) I set the primary language to French for my French blog a few weeks ago. Today, checking the process while writing this post, I discovered that my "primary language" preference had been reset to "all languages". So I set it to French again, but the preference doesn't stick: it keeps falling back to "all languages". Funnily enough, the same setting for my blog in English is correctly labeled as English. Something's really wrong here!

4 Comments

I'll look into the blog language setting errors, there may be some issues with the account management or language setting heuristics. Please email me if you have information on how Skyblog's pings work or if you know of other ways that their updates are published. I can't guarantee we can get full inclusion in Technorati's index but it is something I can investigate.
thanks,
-Ian Kallen
Technorati

Hi Ian, kudos for noticing this! I heard that Skyblog is working on providing feeds, so there's hope. I'm sure they're fully aware of the situation with respect to Technorati (I'm sure the reverse holds true, as this has been pointed out for a long time to Loic le Meur and Dave).

When we last checked, Skyblog did not have per-post permalinks, making it hard to index usefully as we can't point back. If they have fixed this, that would be good to know. We'd certainly love to hear from Skyblog.

@Kevin: re. Skyblog and permalinks, it's still the case.

Now re. the bug in the language setting: it's still not fixed. No matter how hard I try to set French for padawan.info/fr/, that setting is NEVER saved :-(.
