InsideGoogle

part of the Blog News Channel

Why Doesn’t Google Sitemaps Use RSS?

Jeremy Zawodny points out that there is already a system in place, in which blogs ping services like Feedster and Technorati, that Google Sitemaps could use instead of its own similar XML system. It makes sense. Why build a new system instead of using what is already in place?
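For readers who haven't seen the weblog ping mentioned above, here is a rough sketch of what such a ping looks like on the wire. It assumes the common `weblogUpdates.ping` XML-RPC call, which takes a blog name and a blog URL. The blog name and URL are made up, and nothing is actually sent anywhere; the code only builds the request body.

```python
import xmlrpc.client

# Hypothetical blog details, used only for illustration.
BLOG_NAME = "Example Blog"
BLOG_URL = "http://blog.example.com/"


def build_ping_request(name, url):
    """Build the XML-RPC request body for a weblogUpdates.ping call.

    The ping carries only the blog's name and URL. The receiving service
    is expected to fetch the blog or its feed to find out what changed.
    """
    return xmlrpc.client.dumps((name, url), methodname="weblogUpdates.ping")


payload = build_ping_request(BLOG_NAME, BLOG_URL)
print(payload)
```

Note how little the ping itself says: it announces that something changed, and the service on the other end still has to fetch the feed to learn which pages are new.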

While Jeremy’s reasoning is hard to argue with, I will offer one counterargument: over time, webmasters will realize Sitemaps is an essential part of SEO, which means millions of sites will implement it. Tying Google Sitemaps to the weblog ping protocol ties it to RSS. Perhaps Google did not want to force RSS on everyone, and instead gave them a similar XML-based solution.

Many publishers, despite the benefits, would say no to RSS pings because they don’t want RSS (not a position I agree with, of course). They are against RSS because of the control it makes them give up. Google wants everyone to use Sitemaps, and it figures it can avoid any anti-RSS feelings some publishers have by using something else. If Google can avoid using RSS, it can get those anti-RSS publishers to sign up anyway.

Oh, and of course there’s the obvious. An RSS feed doesn’t contain a sitemap, just recent posts. It contains more than page URLs, often including page content that Google doesn’t want through Sitemaps. And it contains many URLs, a lot of which point to external sites.
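To make the mismatch concrete, here is a minimal sketch that turns an RSS 2.0 feed into a sitemap-style URL list. Everything in it is an assumption for illustration: the sample feed and hostnames are invented, and the `urlset`/`url`/`loc`/`lastmod` layout and namespace follow the later sitemaps.org protocol rather than whatever Google accepted in 2005.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime
from urllib.parse import urlparse

# Assumed namespace (sitemaps.org protocol), not taken from the post.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Invented feed: one post on the blog itself, one pointing off-site.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://blog.example.com/</link>
    <item>
      <title>Local post</title>
      <link>http://blog.example.com/2005/06/local-post</link>
      <pubDate>Wed, 08 Jun 2005 12:00:00 GMT</pubDate>
      <description>Full post body that a sitemap has no use for.</description>
    </item>
    <item>
      <title>Link to another site</title>
      <link>http://other.example.org/story</link>
      <pubDate>Tue, 07 Jun 2005 09:30:00 GMT</pubDate>
      <description>Commentary on an external story.</description>
    </item>
  </channel>
</rss>"""


def rss_to_sitemap(rss_text, site_host):
    """Keep only same-site item links and their dates; drop everything else."""
    channel = ET.fromstring(rss_text).find("channel")
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for item in channel.findall("item"):
        link = item.findtext("link")
        # Skip items without a link and items that point to other sites.
        if link is None or urlparse(link).netloc != site_host:
            continue
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = link
        pub = item.findtext("pubDate")
        if pub:
            # RSS dates are RFC 822 style; sitemaps use ISO dates.
            day = parsedate_to_datetime(pub).date().isoformat()
            ET.SubElement(url, "lastmod").text = day
    return ET.tostring(urlset, encoding="unicode")


sitemap_xml = rss_to_sitemap(SAMPLE_RSS, "blog.example.com")
print(sitemap_xml)
```

The titles, descriptions, and the external link all get thrown away, and the result still lists only the posts that happened to be in the feed, not the archives.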

While Google should be using RSS in its web crawl (for PageRank data, at the least), RSS is useless for Sitemaps. I would love it if Google used RSS for relevancy, or if it implemented a blog search engine. How amazing would it be if Google showed links to each page, including the linking RSS feeds? While most feeds are just blog posts, some have original content Google can use.

I can’t disagree with the Sitemaps system, especially since most blog publishers can easily implement it, as we’ve seen with WordPress and Movable Type solutions that are simple and took ten minutes to figure out. Since RSS couldn’t handle Sitemaps, Google had to build something new anyway, so tying it to the weblog ping protocol is not an option.
(via Search Engine Watch)

June 8th, 2005 Posted by Nathan Weinberg | Search, General | 4 comments




4 Comments »

  1. “I can’t disagree with the Sitemaps system, especially since most blog publishers can easily implement it, as we’ve seen with WordPress and Movable Type solutions that are simple and took ten minutes to figure out.”

    Funny, those are exactly the folks who need sitemaps least.

    Comment by Jeremy Zawodny | June 8, 2005

  2. Jeremy, RSS is inefficient for search engine sitemap purposes. It contains too much information, and no archives.

    Comment by Nathan Weinberg | June 8, 2005

  3. Apologies for the delay; I had to double-check to be certain. According to the FAQ, SiteMaps -does- support RSS…

    https://www.google.com/webmasters/sitemaps/docs/en/about.html

    “8. What other formats can I use for my sitemaps?

    We also support the Open Archives Initiative (OAI) protocol for metadata harvesting, a popular protocol in the library world. If your sitemaps are already available in OAI-PMH version 2.0 format, you are welcome to submit these. We also accept RSS 2.0 and Atom 0.3 syndication feeds, using the link/lastMod fields.

    Finally, if you simply want to give us a list of your URLs, read ‘What is the simplest sitemap I can submit?’ below.”

    Comment by Michael Wells | June 8, 2005

  4. Michael: You win :-)

    Still, Jeremy was talking more about weblog ping services than about actual RSS delivery. And my argument that RSS is woefully inefficient for sitemap purposes is one I still make, which leaves me puzzled that Google supports it. I guess they are going for widespread usage over efficiency.

    Comment by Nathan Weinberg | June 8, 2005
