A presentation from the Update conference held in Brighton in September 2011.
Tagged with “digital preservation” (22)
Universal access to all knowledge, Kahle declared, will be one of humanity’s greatest achievements. We are already well on the way. "We’re building the Library of Alexandria, version 2. We can one-up the Greeks!"
Start with what the ancient library had: books. The Internet Archive already has 3 million books digitized. With its Scribe Book Scanner robots (29 of them around the world) they’re churning out a thousand books a day digitized into every handy ebook format, including robot-audio for the blind and dyslexic. Even modern heavily copyrighted books are being made available for free as lending-library ebooks you can borrow from physical libraries, with 100,000 such books so far. (Kahle announced that every citizen of California is now eligible to borrow online from the Oakland Library’s "ePort.")
As for music, Kahle noted that the 2-3 million records ever made are intensely litigated, so the Internet Archive offered music makers free unlimited storage of their works forever, and the music poured in. The Archive audio collection has 100,000 concerts so far (including all the Grateful Dead) and a million recordings, with three new bands every day uploading.
Moving images. The 150,000 commercial movies ever made are tightly controlled, but 2 million other films are readily available and fascinating; 600,000 of them are accessible in the Archive already. In the year 2000, without asking anyone’s permission, the Internet Archive started recording 20 channels of TV all day, every day. When 9/11 happened, they were able to assemble an online archive of TV news coverage all that week from around the world ("TV comes with a point of view!") and make it available just a month after the event on Oct. 11, 2001.
The Web itself. When the Internet Archive began in 1996, there were just 30 million web pages. Now the Wayback Machine copies every page of every website every two months and makes them time-searchable from its 6-petabyte database of 150 billion pages. It has 500,000 users a day making 6,000 queries a second.
"What is the Library of Alexandria most famous for?" Kahle asked. "For burning! It’s all gone!" To maintain digital archives, they have to be used and loved, with every byte migrated forward into new media every five years. For backup, the whole Internet Archive is mirrored at the new Bibliotheca Alexandrina in Egypt and in Amsterdam. ("So our earthquake zone archive is backed up in the turbulent Mideast and a flood zone. I won’t sleep well until there are five or six backup sites.")
Speaking of institutional longevity, Kahle noted during the Q & A that nonprofits demonstrably live much longer than businesses. It might be because they have softer edges, he surmised, or that they’re free of the grow-or-die demands of commercial competition. Whatever the cause, they are proliferating.
Michael Nelson, Associate Professor at Old Dominion University, developed, along with colleagues at the Los Alamos National Laboratory, “Memento,” a technical framework aimed at better integrating the current and the past web. In the past, archiving history involved collecting tangible things such as letters and newspapers. Now, Nelson points out, the web has become a primary medium with no serious preservation system in place. He discusses how the web is stuck in the perpetual now, making it difficult to view past information. The goal behind Memento, according to Nelson, is to create an all-inclusive Internet archive system, which will allow users to engage in a form of Internet time travel, surpassing the current archive systems such as the Wayback Machine.
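The "time travel" idea can be sketched in a few lines. In the Memento protocol, a client states the moment it wants to see in an `Accept-Datetime` request header, and the server answers with the archived copy closest to that moment. The following is a minimal offline sketch of those two steps, not Nelson's implementation: the function names and the list of snapshot dates are hypothetical, and no real archive is contacted.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def accept_datetime_header(when):
    """Format an aware datetime as the HTTP date used in Accept-Datetime."""
    return format_datetime(when.astimezone(timezone.utc), usegmt=True)


def closest_snapshot(snapshots, wanted):
    """Return the snapshot datetime nearest to the wanted moment."""
    return min(snapshots, key=lambda taken: abs(taken - wanted))


# Hypothetical capture dates for one page (illustrative data only).
snapshots = [
    datetime(2001, 9, 1, tzinfo=timezone.utc),
    datetime(2001, 10, 20, tzinfo=timezone.utc),
    datetime(2002, 1, 5, tzinfo=timezone.utc),
]

wanted = datetime(2001, 10, 11, 12, 0, tzinfo=timezone.utc)
header_value = accept_datetime_header(wanted)
best = closest_snapshot(snapshots, wanted)

print("Accept-Datetime:", header_value)
print("Closest snapshot:", best.isoformat())
```

A real client would send the header to an archive that supports Memento and follow the redirect it returns; the nearest-date selection shown here is the part normally done on the server side.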
Jemima Kiss examines plans for a digital public space with the British Library, the Royal Opera House and the BBC.
How can we preserve analogue culture in a digital world? Could something allow us to view, research & remix cultural items? Jemima Kiss examines plans for a digital public space – a part of the internet that could grant worldwide access and create links between museums, archives and libraries.
Jemima talks to Richard Ranft of the British Library and Francesca Franchi of the Royal Opera House about the items and artefacts from their archives that a digital public space could open up to the public, and how the reach of both organisations can be dramatically extended to a worldwide audience.
Bill Thompson, head of partnerships at the BBC’s archive (but also of the Digital Planet and Click programmes), explains how the corporation could help build what is needed, and how it could work.
And Jill Cousins of europeana.eu discusses how a similar project, funded by the European Commission, works and how it has now developed into a full service.
Jeremy Keith joins Jen to talk about Mobilewood, future-friendlying websites, responsive design techniques, digital preservation, html5 semantics, Firefox 7, and much more.
A weekly podcast about changing technologies and the future of the web, discussing HTML5, mobile, responsive design, iOS, Android, and more. Hosted by Jen Simmons.
Our evening with Jason Scott last Wednesday was possibly the most entertaining archives talk ever in the world. No really. It really was. Jason is passionate and committed to his work and deadly serious about its importance but he is also seriously funny. Enjoy.
This is a collection of Geocities data downloaded by a bunch of people who call themselves ARCHIVE TEAM, who began scraping the Yahoo! Geocities site during a six month period in 2009, before Yahoo! shut down geocities.com on October 26th, 2009.
At the time of the purchase, Geocities was the THIRD most popular website on the Internet. Even by the time of its shutdown, it was in the top 250. We don’t have complete rock-solid knowledge of why it was shut down, but all signs point to Yahoo! trying to get back to basics (like, uh, having a huge audience?) and Geocities magically didn’t fall into this new "focus", and lacked any internal cheerleader to make it last through meetings.
Yahoo! succeeded in destroying the most amount of history in the shortest amount of time, certainly on purpose, in known memory. Millions of files, user accounts, all gone.
Archivist, technology historian, and filmmaker Jason Scott talks to Nora Young about online video, digital heritage, and how the internet isn’t as permanent as we might think.
About two weeks ago, I got an email from Google:
On April 29, 2011, videos that have been uploaded to Google Video will no longer be available for playback. We’ve added a Download button to the video status page, so you can download any video content you want to save. If you don’t want to download your content, you don’t need to do anything. (The Download feature will be disabled after May 13, 2011.)
So, basically… “unless you take action, all your videos will be deleted.” But then, a week later, Google changed its tune. In my inbox:
Google Video users can rest assured that they won’t be losing any of their content and we are eliminating the April 29 deadline. We will be working to automatically migrate your Google Videos to YouTube. In the meantime, your videos hosted on Google Video will remain accessible on the web and existing links to Google Videos will remain accessible.
This Google Video example is just one of many recent stories that suggest the web isn’t as permanent as we’re often led to believe. This past March, Yahoo Video removed all user-generated uploads from its site. When Cisco announced its plans to shut down its Flip Video business, it also announced that its companion FlipShare video sharing service “will no longer be supported past 12/31/2013.”
For his perspective on online video and digital heritage, Nora interviewed Jason Scott. Jason’s an archivist, technology historian, and filmmaker.
Jason Scott’s talk at the Personal Digital Conference, 2011.
Between the Alexandrian War of 48 BCE and the Muslim conquest of 642 CE, the Library of Alexandria, containing a million scrolls and tens of thousands of individual works, was completely destroyed, its contents scattered and lost. An appreciable percentage of all human knowledge to that point in history was erased. Yet in his novella “The Congress”, Jorge Luis Borges wrote that “every few centuries, it’s necessary to burn the Library of Alexandria”.
In his session James will ask whether, as we build ourselves new structures of knowledge and certainty, as we design our future, we should be concerned with the value of our ruins.
With a background in both computing and traditional publishing, James Bridle attempts to bridge the gaps between technology and literature. He runs Bookkake, a small independent publisher, and writes about books and the publishing industry at booktwo.org. In 2009 he helped launch Enhanced Editions, the first e-reading application with integrated audiobooks.