
The Internet Archive archive
Perhaps what is most remarkable about this collection is that these images come not from some newly unearthed archive being seen for the first time, but rather from books that we have been digitizing for years and that have been resting in our digital libraries. A search for “bird” offers a vividly colorful showcase of the world’s bird species, while searching for “telephone” traces the invention’s history from its introduction as an electric novelty to its widespread adoption.

The Internet Archive is best known for its historical library of the web, preserving more than 400 billion web pages dating back to 1996. Yet its 19 petabytes include more than 600 million pages of digitized texts dating back more than 500 years. What would it look like if those 600 million pages could be “read” completely differently? What if every illustration, drawing, chart, map, or photograph became an entry point, allowing one to navigate the world’s books not as paragraphs of text, but as a visual tapestry of our lives? How would we learn and explore knowledge differently? Those were the questions that launched a project to catalog the imagery of half a millennium of books.

Kalev Leetaru, a Yahoo research fellow at Georgetown University, extracted over 14 million high-resolution images from 2 million Internet Archive public domain eBooks spanning more than five centuries and nearly every topic imaginable. Each image includes a detailed description, including the subject tags of the book it came from and the text immediately surrounding it on the page. The latter is especially powerful, as it allows one to keyword search 500 years of images, instantly accessing particular topics or themes. Searching for “love” yields myriad images of cherubs and courtship, while “mortis” (death) offers a glimpse into the early modern period’s fascination with the subject.


Over the past couple of weeks, the Internet Archive has been uploading content behind the scenes, and today we are very excited to officially launch it into The Commons.
