Enabling Access to Old Wu-Tang Clan Fan Sites: Facilitating Interdisciplinary Web Archive Collaboration

Date

2016-03-08

Authors

Ruest, Nick
Milligan, Ian

Abstract

The growth of digital sources since the advent of the World Wide Web in 1991, and the commencement of widespread web archiving in 1996, presents profound new opportunities for social and cultural historians. In simple terms, we cannot study the 1990s without web archives: they are both primary sources that reflect how people consume and understand media and repositories that document the thoughts, opinions, and activities of millions of everyday people. These are a dream for social historians. For example, consider GeoCities, which grew to some thirty-eight million pages created by as many as seven million users during the fifteen years between 1994 and 2009. There are untold opportunities to understand the recent past, based on the voices of people who never before would have been included in a traditional historical record. With all this opportunity come challenges: large data, the need for interdisciplinary collaboration with historians who might have the questions but not the technical resources or knowledge to work with these sources, and basic questions around what web archives are and how to access them. Libraries and archives are perfectly positioned to work in this emerging field, which brings together historians, computer scientists, and information specialists. In our talk, we discuss the fruits of one collaboration that has emerged at York University and the University of Waterloo. Bringing together a librarian, a historian, a computer scientist, and an interdisciplinary team of undergraduate and graduate students, York has become a collaborative hub. A research team has taken shape around a combination of centralized and decentralized infrastructure used to run data analytics, store web archives, and provide a publicly facing portal (http://webarchives.ca/), with collaboration carried out over Slack.
We’ll discuss the challenges of working in an interdisciplinary environment and give insights into how the team has been working, through detailed case studies of our work with http://webarchives.ca and the warcbase web analytics platform. The combination of computer scientists and humanists is not always a simple one, and York University Libraries provided the infrastructure, help, and leadership to make the team a success.

Keywords

web archives, digital preservation, access, text analysis, network analysis
