Some blogosphere stats, June 2005

I'm updating some figures on the number of blogs out there, and stumbled on a few interesting trends. Here is a raw report; don't count on me for an analysis right now!

In January 2005, LiveJournal claimed 5.7 million users with 2.5 million "active in some way". Right now, on June 1, they claim 7.3 million accounts, a gain of 28%, but only 2.6 million active in some way, a gain of just 4%. I have no numbers for TypePad (maybe I should just bug Loic or get him drunk to figure them out).

In France, over the same period, Skyblog moved from 1.4M to 2.1M accounts (+50%). Unfortunately they don't provide any numbers to estimate active vs. inactive blogs.
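For anyone who wants to re-check the percentages above, here is a minimal sketch in Python. It only uses the figures quoted in this post; the helper name `pct_change` and the "active share" calculation are my own additions for illustration, not something LiveJournal or Skyblog publish.

```python
def pct_change(old, new):
    """Relative growth from old to new, in percent."""
    return (new - old) / old * 100

# Figures in millions, as quoted above (January 2005 vs. June 1, 2005).
lj_accounts_growth = pct_change(5.7, 7.3)
lj_active_growth = pct_change(2.5, 2.6)
skyblog_growth = pct_change(1.4, 2.1)

# Share of LiveJournal accounts "active in some way" at each date.
lj_active_share_jan = 2.5 / 5.7 * 100
lj_active_share_jun = 2.6 / 7.3 * 100

print(f"LiveJournal accounts: +{lj_accounts_growth:.0f}%")
print(f"LiveJournal active:   +{lj_active_growth:.0f}%")
print(f"Skyblog accounts:     +{skyblog_growth:.0f}%")
print(f"LiveJournal active share: {lj_active_share_jan:.0f}% -> {lj_active_share_jun:.0f}%")
```

The last line is the interesting one: accounts grow much faster than active accounts, so the active share shrinks over the period.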

Netcraft saw 63.5M web sites in May and reckons that 27.7M are active, while Perseus counted 32M blogs in April on only 20 blog services (top North American services only, excluding others and self-hosted blogs). And today, Technorati claims an index of 10.7M blogs (a third of what Perseus sees and 17% of what Netcraft sees) and 1.173 billion links, while Google claims to index 8 billion web pages. So the links in Technorati's 10.7M blogs would represent 15% of the Google index, though Google's figure doesn't include images.
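The ratios in that paragraph can be reproduced with a few divisions. This is only a sketch of the arithmetic: it assumes, as the paragraph does, that the counts from different sources are comparable at all. The variable names are mine.

```python
# All counts in millions, as quoted above.
netcraft_sites = 63.5
perseus_blogs = 32.0
technorati_blogs = 10.7
technorati_links = 1173.0   # 1.173 billion links
google_pages = 8000.0       # 8 billion web pages

blogs_vs_perseus = technorati_blogs / perseus_blogs * 100
blogs_vs_netcraft = technorati_blogs / netcraft_sites * 100
links_vs_google = technorati_links / google_pages * 100

# Average number of links per blog in the Technorati index.
links_per_blog = technorati_links / technorati_blogs

print(f"Technorati blogs vs. Perseus:  {blogs_vs_perseus:.0f}%")
print(f"Technorati blogs vs. Netcraft: {blogs_vs_netcraft:.0f}%")
print(f"Technorati links vs. Google:   {links_vs_google:.0f}%")
print(f"Links per blog:                {links_per_blog:.0f}")
```

Note that the last ratio compares links to pages, which is only meaningful if each link is a unique URL. Nothing in Technorati's figure confirms that.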

Perseus saw a growth of 10M blogs in the first quarter alone. Netcraft saw the number of sites grow by 1.2M per month (and 580,000 active sites per month) between January and May 2005. David Sifry reckons that the number of blogs doubles every five months.
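To put these growth figures on a common scale, here is a rough sketch that converts a doubling time into a monthly growth rate and back. It assumes steady compound growth, which none of these sources claim. For Netcraft I divide the monthly gain by the May total, which slightly understates the rate since the base was smaller in January. The function names are hypothetical.

```python
import math

def monthly_rate_from_doubling(months):
    """Monthly compound growth rate implied by a given doubling time."""
    return 2 ** (1 / months) - 1

def doubling_months_from_rate(rate):
    """Months needed to double at a given monthly compound growth rate."""
    return math.log(2) / math.log(1 + rate)

# Sifry: the number of blogs doubles every five months.
blog_monthly_rate = monthly_rate_from_doubling(5)

# Netcraft: +1.2M sites per month on a base of 63.5M sites (May figure).
web_monthly_rate = 1.2 / 63.5
web_doubling_months = doubling_months_from_rate(web_monthly_rate)

print(f"Blogs: about {blog_monthly_rate * 100:.1f}% per month")
print(f"Web:   about {web_monthly_rate * 100:.1f}% per month, "
      f"doubling in roughly {web_doubling_months:.0f} months")
```

Under these assumptions, blogs grow several times faster per month than web sites as a whole.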

Clearly, the blogosphere is the most active part of the web.

7 Comments

Huh... sorry to pop your bubble here (but bubbles are meant to be popped, aren't they? ;-), but unless I'm badly missing something, "1.173 billion links" doesn't make "15% of the Google index", not even by the wildest stretch of the imagination...

More likely (but still without an ounce of reliability), I'd venture something along the lines of 1,173 / 10.7 = more or less 100M links... times perhaps 10-20, in order to account for some linking diversity (which is, imho, way underestimating the morbidly incestuous linking habits of the blogosphere)... and you end up with a very tentative 200M links, i.e. 3% of the Google index.

Then, you realize that quite possibly, Technorati includes at least some amount of internal links in this count, and you end up with a number quite insignificant in the Google index...

Don't get me wrong: I am convinced that blogs will certainly get there some day (although by then, they'll be as meaningless a description as "personal homepage"), but they aren't yet...

""1.173 billion links" doesn't make "15% of the Google index", not even by the wildest stretch of the imagination..."

Oh well, maybe at 2am my imagination was too wild ;-). I assumed that Technorati boasts about unique links, not the total number of links they can see on blogs (which would indeed be quite smaller). But you're right, I have no evidence to say that those links directly map to the web pages in Google's index.

This said, 100M times 10-20 makes 1 to 2 *billion*, not 200M. And where does this 10-20 range come from anyway, dear doctor? Blogosphere math is hard, isn't it? ;-)

Indeed, 2am mathematical contortions are never a good idea... and I am no statistician to boot...
In fact, I made at least two stupid evaluation mistakes in the random "formulae" thrown above... the idea behind them is much easier to explain and should do as well: if you have 10M blogs generating 1B links, you are entitled to think that the actual number of sites linked may be anything from 1000/10=100 (obviously a ludicrously small number) to a real billion. The truth is, imho, much closer to the lower bound (i.e. 100 times a "diversity" factor that reflects the fact that all blogs do not link to the same 100 sites, even though they aren't that far from it) than the higher bound. And all this is without accounting for internal linking. I'll ask next time I have a chance, but I'm pretty sure Technorati's total link count doesn't take redundancy into account...

Cheers

Mmh, another confusion is that you mention sites, while I only compared links (thinking of them as unique URLs, though I agree there might be redundancy in those figures) to the number of web pages in Google. Technorati seems to index links to anything, not just blogs, as Google does (they just say "8 billion web pages", so it seems to exclude images, videos, files, etc.).

OK, the figures look silly, but still, I think the comparison is not completely meaningless. I'm just trying to grasp the amount of links that the blogosphere generates with respect to the amount of addressable (URL-ized) documents published on the web.

Hmmn, yea, by "sites", just above, I meant pretty much any external link, with or without redundancy...

And I agree the comparison isn't meaningless at all, but my contention that it is still well below significant isn't gratuitous either ;-) To be honest, and if I went with a less "scientific" gut feeling, I'd tell you that outward linking from the blogosphere is so ridiculously small (numerically speaking) that not only would it fail to register in the sea of internet URLs gathered by Google, but it would even be dwarfed by its own internal linking.

Otherwise put: bloggers all link about the same thing, and much less about other stuff than about themselves. Traffic and PR may be big, actual web estate is microscopic, imho.

Drunk and under torture I will not say it. And don't try again to send me your Tarquine Vampire, it will not help ;-)
