An exploratory look at 3,039,804 #elxn42 tweets

dc.contributor.author: Ruest, Nick
dc.contributor.author: Milligan, Ian
dc.date.accessioned: 2016-04-14T11:56:05Z
dc.date.available: 2016-04-14T11:56:05Z
dc.date.issued: 2016-04-14
dc.description.abstract: This presentation examines the tools, approaches, collaboration, and findings of the Web Archives for Historical Research Group around the capture and analysis of Twitter for the 2015 Canadian Federal Election. Twitter is not a representative sample of broader society: Pew Research notes that it skews young, college-educated, and affluent (above $50,000 household income). Even so, Twitter represents an exponential increase in the amount of information generated, retained, and preserved from non-elite people. Therefore, when historians study the 2015 federal election, Twitter will be a prime source. On August 3, 2015, the team initiated both a search API and stream API collection with twarc using the hashtag #elxn42. Data collection ceased on November 5, 2015, the day after Justin Trudeau was sworn in as the 23rd Prime Minister of Canada. We collected for a total of 102 days, 13 hours, and 50 minutes. To analyze the data set, we used a number of utilities available within twarc and twarc-report, as well as jq, Mathematica, and Apache Spark Notebook. In accordance with the Twitter ToS, we also hosted the tweet IDs in an institutional repository. Our analytics included: breaking tweet text down by day to track change over time; client analysis, allowing us to see how the scale of mobile devices affected medium interactions; URL analysis, comparing the tweeted URLs against both Archive-It collections and the Wayback Availability API to add to our understanding of crawl completeness; and image analysis, using an archive of extracted images. Our presentation introduces our collecting work and the analysis we have done, and provides a framework for other collecting institutions to do similar work with our off-the-shelf open-source tools. We hope that national libraries and other institutions will find our model useful as they consider how to archive ongoing events using Twitter. [en_US]
dc.description.sponsorship: International Internet Preservation Consortium (IIPC)
dc.identifier.uri: http://hdl.handle.net/10315/31087
dc.language.iso: en
dc.rights: Attribution-ShareAlike 2.5 Canada [*]
dc.rights.uri: http://creativecommons.org/licenses/by-sa/2.5/ca/ [*]
dc.subject: social media [en]
dc.subject: web archives [en]
dc.subject: text analysis [en]
dc.subject: iipc [en]
dc.subject: #elxn42 [en]
dc.subject: twitter [en]
dc.subject: json [en]
dc.title: An exploratory look at 3,039,804 #elxn42 tweets [en]
dc.type: Presentation
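
The per-day and per-client breakdowns mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code (the abstract names twarc, twarc-report, jq, Mathematica, and Apache Spark Notebook as the tools actually used). It assumes the collection is stored as line-oriented JSON, one tweet per line, with the v1.1 Twitter API fields `created_at` and `source`. The function names and the sample records are hypothetical and exist only for illustration.

```python
import json
import re
from collections import Counter
from datetime import datetime

# Timestamp layout of the v1.1 Twitter API, e.g. "Mon Aug 03 14:05:11 +0000 2015".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"
# Matches HTML tags so the client name can be pulled out of the "source" anchor.
TAG_PATTERN = re.compile(r"<[^>]+>")


def parse_tweets(lines):
    """Yield one dict per non-empty line of line-oriented tweet JSON."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def tweets_per_day(tweets):
    """Count tweets per calendar day, using the date as written in the timestamp."""
    counts = Counter()
    for tweet in tweets:
        created = datetime.strptime(tweet["created_at"], CREATED_AT_FORMAT)
        counts[created.date().isoformat()] += 1
    return counts


def client_counts(tweets):
    """Count tweets per posting client, with the surrounding HTML anchor removed."""
    counts = Counter()
    for tweet in tweets:
        client = TAG_PATTERN.sub("", tweet.get("source", "")).strip()
        counts[client] += 1
    return counts


# Made-up sample records (assumption): not taken from the #elxn42 data set.
SAMPLE_TWEETS = [
    {"created_at": "Mon Aug 03 14:05:11 +0000 2015",
     "source": '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>'},
    {"created_at": "Mon Aug 03 18:30:00 +0000 2015",
     "source": '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>'},
    {"created_at": "Mon Oct 19 23:59:59 +0000 2015",
     "source": '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>'},
]
SAMPLE_LINES = [json.dumps(tweet) for tweet in SAMPLE_TWEETS]

if __name__ == "__main__":
    tweets = list(parse_tweets(SAMPLE_LINES))
    print(tweets_per_day(tweets).most_common())
    print(client_counts(tweets).most_common())
```

For a real collection, `SAMPLE_LINES` would be replaced by an open file handle over the collected JSON, since `parse_tweets` accepts any iterable of lines. Days are bucketed by the UTC timestamps Twitter returns, so a study of Canadian election-night activity would likely want to convert to a local time zone first.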

Files

Original bundle
Name: An exploratory look at 3,039,804 #elxn42 tweets.pdf
Size: 3.69 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.83 KB
Format: Item-specific license agreed upon to submission