Iconic books are texts revered as objects of power rather than just as words of instruction, information, or insight. In religious and secular rituals around the globe, people carry, show, wave, touch and kiss books and other texts, as well as read them. This blog chronicles such events and activities.

Sunday, August 31, 2008

Rosetta Disk for long-term language backup

The folks at the Long Now Foundation have unveiled this 3-inch Rosetta Disk. Shown is the back with a "teaser" in eight different languages announcing: “Languages of the World: This is an archive of over 1,500 human languages assembled in the year 02008 C.E. Magnify 1,000 times to find over 13,000 pages of language documentation.”

The other side, under a glass sphere, is pure nickel etched with 13,500 pages of linguistic data:
"The Rosetta disk is not digital. The pages are analog ‘human-readable’ scans of scripts, text, and diagrams. Among the 13,500 scanned pages are 1,500 different language versions of Genesis 1-3, a universal list of the words common for each language, pronunciation guides and so on. Some of the key indexing meta-data for each language section (such as the standard linguistic code number for that language) are displayed in a machine-readable font (OCRb) so that a smart microscope could guide you through this analog trove.

"Our hope is that at least one of the eight headline languages can be recovered in 1,000 years. But even without reading, a person might guess there are small things to see in this disk."

The motivation for the project: long-term backup!
"Following the archiving principle of LOCKS (Lots of Copies Keep ‘em Safe) we would replicate the disk promiscuously and distribute them around the world with built in magnifiers. This project in long term thinking would do two things: it would showcase this new long-term storage technology, and it would give the world a minimal backup of human languages."

I applaud the effort to think long-term about the problem of information (and language) retrieval. I've already had reason to comment here about the rapid decay of electronic data. I'm amused, though, that the text chosen to render in 1,500 languages turned out to be the beginning of the Bible, Genesis 1-3. The Rosetta Disk's creators explain that they chose Gen 1-3 "since it was most likely already translated into all languages already." (Actually, they would have found that the New Testament has been translated into even more languages than the Hebrew Bible, since Christian translators tend to make the NT the highest priority in their work.)

Despite its creators' entirely secular, linguistic goals, the Rosetta project thus reproduces part of the most iconic text in Western culture in yet another iconic form, and does so in the cause of durability, which the Bible itself symbolizes by its ancient origins.
