InsideGoogle

part of the Blog News Channel

Internet Archive And Yahoo Present Alternative To Google Book Project

The Open Content Alliance, a project started by the Internet Archive and Yahoo, intends to start scanning books in a way that should satisfy both publishers and end users. Unlike Google’s program, it is opt-in, not opt-out. Unlike Google’s program, it will have the full text available for readers. Unlike Google’s program, the full text will be searchable by all search engines, not just Google.

This could present a serious roadblock to Google’s ambitions. While no program is perfect, this seems far more likely than Google’s to get publishers on board. Considering that book publishing is a massive industry, not just some small niche, perhaps Google should have realized that it needed to work with book publishers, not against them. By the time Google back-pedals and gives in, it may be too late and the OCA may have taken off.

Contributors to the OCA also include Adobe, HP, O’Reilly Media (a commercial publisher that will be making some of its books available), and various international archives.

From the OCA FAQ:

The OCA will seed the archive with collections from the following organizations:

  • European Archive
  • Internet Archive
  • National Archives (UK)
  • O’Reilly Media
  • Prelinger Archives
  • University of California
  • University of Toronto

The OCA will encourage the greatest possible degree of access to and reuse of collections in the archive, while respecting the rights of content owners and contributors. Generally, textual material will be free to read, and in most cases, available for saving or printing using formats such as PDF. Contributors to the OCA will determine the appropriate level of access to their content.

Metadata for all content in the OCA will be freely exposed to the public through formats such as the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) and RSS.

OCA contributors must secure the permission of all concerned copyright holders prior to submitting materials to the OCA for digitization or inclusion in the archive.

Washington Post:

The alliance won’t include any copyrighted material unless it receives the explicit permission of a publisher or author. That restriction means the alliance is bound to be missing much of the material available in brick-and-mortar libraries.

In an effort to be as comprehensive as possible, Google plans to index millions of copyrighted books from three major university libraries (Harvard, Stanford and Michigan) unless the copyright holder notifies the company by Nov. 1 about which volumes should be excluded from the search engine index.

Google’s so-called “opt out” provision has outraged many publishers, who contend the company is flouting long-established copyright laws. The Authors Guild Inc., which represents about 8,000 writers, sued Google for copyright infringement last month. Google maintains its scanning represents “fair use” allowed under the law because it only allows Web surfers to view excerpts from copyrighted books.

Gary Price:

The OCA project differs from other digitization projects in that the database of scanned material will be available for anyone to use on any site. Yes, it’s an open access database! You could even create a focused database (let’s say one on American literature) and use it on your own web site.

Without getting into legal “what if’s,” most of the material in the OCA will be available as full text. There are no limits on how much you can view or download for offline viewing or printing. Kahle said that in some cases you can find content via the Open Content Alliance, print it, and slap a cover on it. Sort of a, “make your own book” type of thing.

Also check out the Associated Press article titled “Publishers say yahoo to online book plan,” and the post at the Yahoo Search Blog.
(via Paid Content)

October 3rd, 2005 Posted by Nathan Weinberg | Controversy, Yahoo, General | 3 comments




3 Comments

  1. I say the OCA is even more of a boon for Google.

    1) The OCA will have very limited content
    2) Google will inevitably win the suit brought by the Authors Guild
    3) Google can include results from the OCA in its searches. It is, after all, indexable by all search engines.

    How many works will end up in the OCA’s archives anyway? I can’t imagine there are that many publishers out there ready to contribute. What about works that have returned to the public domain?

    Comment by Nicholas | October 3, 2005

  2. Here comes the Open Alliance, backed by Yahoo, HP, Adobe and others

    Brewster Kahle, the guy behind the Internet Archive (listed in his byline on Yahoo’s search blog as “Founder of the Digital Librarian Internet Archive”), has announced the Open Content Alliance, backed by Yahoo:
    To be clear, the public domain works …

    Trackback by Things That ... Make You Go Hmm | October 3, 2005

  3. “Unlike Google’s program, it will have the full text available for readers.”

    I’ve read this in a few places now. It was my understanding that Google will provide full text of public domain works.
    http://print.google.com/googleprint/common.html#6

    Comment by joel | October 3, 2005
