Google purchases Pyra (blogger.com)

OK, everyone in BlogSpace (hey, I think I just coined a new term!) will be discussing today’s revelation: Google has bought Pyra. (For the clueless — the publishers of blogger.com [from whence this entry comes] and blogspot.com.)

Disclaimer: Before I begin, understand that I hold the following two fundamental beliefs (yeah, and others…) about blogs, which I need to blog about:

  • Blogs are highly overrated by blog authors/readers
  • Blogs are highly underrated (not on radar for most) by non-blog authors/readers

Blogs today have the same hype as, say, Linux: No, Linux will not prevent your lettuce from wilting, but it is a stable platform. It will not (at least short-term) replace Windows/Mac (your desktop), but it may replace your Sun/Win2000/AIX/HPUX server….

In the same vein, Blogs are “the” answer to everything; they will change everything…..

….NOT!

The Web was supposed to change EVERYTHING.

It didn’t.

But it — slowly (in Internet years, not geological years) — changed a lot.

Blogs are similar.

They will NOT democratize information (for the AVERAGE user); they will not re-invent journalism (though they will impact it); they will not make your lettuce crisper, etc…

But weblogs are — in the correct environment — good and Web-shattering.

’Nuff said.

My two cents:

I don’t know what the financial benefit to Google will be (frankly, I don’t know, or really care, whether Pyra is profitable, short- or long-term), but I do think that the acquisition is both interesting and a good fit. Here’s why:

  • Good Fit: Google is, at its base level of operation, in the business of collecting/indexing/pointing toward content. Pyra, while not the only publisher of blogs (Movable Type and Radio/Manila are the other big players), is easily the biggest. According to Ev at Pyra, the company has approximately 1 million blogs, of which about 20 percent are active. While Google does a good job of crawling blogs, it takes time. How much easier would it be to just trawl the actual database that hosts the blogs? It would be way faster, and Google would get an idea of who is currently updating and so on. I see a separate section of Google News with recent headlines from blogs, as well as a search tool that will enable users to search blogs virtually as they are published. No wait for the nightly crawler. It becomes a “live” search. This is HUGE. I can’t overestimate how intriguing this is.
  • Interesting: Yes, the sites on blogger.com and blogspot.com can be trawled via database access, not a spider. Yes, faster results. But what does this mean for the Movable Type | Radio/Manila | homegrown-blog sites? WILL Google index the Pyra sites via the database (and so index/post them more quickly), or will it continue to do the traditional spider crawl of these sites, even though it will be hosting them? Two views:
    • Google is Evil: Sure, they can directly hit the database for the Pyra sites, so those sites’ content can be posted more quickly. But Google is a search engine, and this gives an unfair advantage (time) to the sites that Google stands to make money off of. As blogging becomes even bigger (I think it will) and Google maintains its role as search engine of choice (I think it will), there is a real incentive for people to host at the former Pyra properties instead of, say, using Movable Type. Somewhat akin to M$ leveraging their installed OS (Windoze) base to, uh, “promote” Internet Explorer.
    • Google is Good: Sure, it does give an advantage to the Pyra sites. But Google’s goal is to get the stuff out there as quickly and cleanly as possible. This helps, and it does not affect the crawling (it had better not!) of those non-Pyra sites. Status Quo there.

  • Interesting: If Google does go ahead and index off the database for Pyra sites, as opposed to using spiders (except for fear of catching some flak, I can’t understand why it wouldn’t), this is a new paradigm: Much like a single site with search, Google will be searching off the database. But it’s not a single site; it’s many thousands of them. A little scary, but a new concept. Will other sites/users grant Google access to their databases (on a strictly limited basis, obviously) so there can be a faster dissemination of blog information? If so, will this database access go beyond Web sites? I see an optional XML feed (maybe just the RSS file) that is either fed to the site of your choice (doubtful?) or sits out there and is available to all spiders, which then report it back, so it becomes an hourly (for example) hit instead of one every day/week.
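
To make the “RSS file that sits out there for spiders” idea a bit more concrete, here is a rough sketch of what an hourly check might look like. This is only my guess at the mechanics, not anything Google or Pyra has described: the sample feed, the example.com links and the new_items() helper are all made up for illustration, and a real spider would fetch the file over HTTP rather than read it from a string.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Hypothetical sample feed (illustration only); a real spider would
# download the blog's RSS file instead of using an inline string.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>Google buys Pyra</title>
      <link>http://example.com/2003/02/google-pyra</link>
      <pubDate>Sun, 16 Feb 2003 09:30:00 GMT</pubDate>
    </item>
    <item>
      <title>Older post</title>
      <link>http://example.com/2003/02/older</link>
      <pubDate>Fri, 14 Feb 2003 18:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def new_items(rss_text, since):
    """Return (title, link) pairs for items published after `since`.

    `since` must be a timezone-aware datetime, e.g. the time of the
    spider's previous visit.
    """
    root = ET.fromstring(rss_text)
    fresh = []
    for item in root.iter("item"):
        pub = item.findtext("pubDate")
        if pub is None:
            continue  # no date, so we cannot tell whether it is new
        published = parsedate_to_datetime(pub)  # RFC 822 date -> datetime
        if published > since:
            fresh.append((item.findtext("title"), item.findtext("link")))
    return fresh


# Pretend the last visit was at midnight UTC on 16 Feb 2003.
last_visit = datetime(2003, 2, 16, 0, 0, tzinfo=timezone.utc)
for title, link in new_items(SAMPLE_RSS, last_visit):
    print(title, link)
```

The point is that the spider only has to read one small file per blog and compare dates, which is cheap enough to do every hour instead of re-crawling every page once a day or once a week.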

Sorry, but this is interesting stuff. It’s not a simple buyout.

It may change things.

Actually, what I expect to happen is the following:

  • Google will begin by keeping a spider crawl of Pyra sites, but this will change in short order as people get comfortable with it “controlling” all this content (they don’t control it; they just index it, folks…)
  • Either a “Web Log” tab on the search will appear, or it will be part of — but separate from — the News section (much like Sports is separate from Health).
  • Google will do some interesting things with XML to make all bloggers more accessible via Google and other search engines that care to do the crawl/feed processing. They will keep the protocol/process open, but — since they will invent it — they will have the first-mover advantage.
  • Google will 1) make the Pyra sites more stable and scalable (after this purchase, people will begin to understand blogs because they are now associated with Google, which they know). Move from Win2000 to *nix? (Actually, Win2000 is a nice, stable platform. No, not as good with memory management as *nix, but a good platform. Go ahead; flame me. This is not a troll.) And 2) introduce new tools to make the process easier, much along the lines of Movable Type. I think XML will play a big role, but what the hell do I know?

What I don’t expect to happen with Google buying out Pyra:

  • Google will not — at least for a year — begin any sort of additional charges. Example: Blogger.com is free (Pro is a charge); this will be unaffected.
  • Any other disturbance in The Force