When talking about how they came up with the name for Twitter, Jack Dorsey is quoted as saying:
'The definition was "a short burst of inconsequential information", and "chirps from birds". And that's exactly what the product was.'
Four years on, Twitter is a national phenomenon. Any celebrity with a mild interest in technology has an account and it has over 100 million users worldwide. Individuals use it primarily as a productivity and social tool, while many organisations have embraced the service as a key way to engage with their audience. The chirps have evolved into an almighty chorus and many have found themselves asking: exactly how 'inconsequential' is the information we tweet?
Tweets, text-based posts of up to 140 characters, were initially seen as ephemera. Nobody needs to keep a record of what people have been having for dinner or what book they are reading. However, when those tweets become useful in a way that we didn't initially anticipate, then maybe there is a case for their preservation.
Won't tweets be there forever?
Currently each tweet has a permalink. It can be indexed by Google, the Internet Archive and any other harvesting service. However, Twitter's search is limited by its need to keep resource use to a minimum. As a result, a search will only find tweets from the last 7-10 days and will only return a certain number of them. The Twitter API restricts you to 1500 tweets for a given #hashtag or keyword and 3200 for a person's timeline of tweets. Without knowing a permalink, individual tweets tend to disappear into a Web black hole.
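These limits imply a maximum safe gap between harvests. As a rough sketch, assuming the figures quoted here (a search cap of 1500 tweets and a window of about 7 days) and a steady tweet rate, the function below estimates how often a hashtag would need to be harvested before tweets fall out of reach. The function name and its parameters are illustrative only and are not part of any Twitter API.

```python
def max_harvest_interval_hours(tweets_per_hour, search_cap=1500, window_days=7):
    """Estimate the longest gap (in hours) between harvests that loses no tweets.

    Assumes a steady tweet rate. The defaults are the limits quoted in the
    article: 1500 tweets per search and a window of roughly 7 days.
    """
    if tweets_per_hour <= 0:
        raise ValueError("tweets_per_hour must be positive")
    # Time until the search cap is reached at this rate
    hours_until_cap = search_cap / tweets_per_hour
    # Time until the oldest tweet leaves the search window
    hours_in_window = window_days * 24
    # Whichever limit bites first sets the harvesting schedule
    return min(hours_until_cap, hours_in_window)


# A busy event hashtag at 300 tweets an hour
print(max_harvest_interval_hours(300))  # 5.0
# A quiet hashtag at 2 tweets an hour
print(max_harvest_interval_hours(2))    # 168
```

For a busy event the tweet cap is the binding limit, so harvesting needs to happen every few hours; for a quiet hashtag the time window matters more and a weekly harvest would be enough.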
Why Preserve Tweets?
There are many reasons why you might want to preserve tweets. They include:
As a cultural snapshot - Twitter offers great insight into how a section of the population feels at any given time. The election was an interesting example of Twitter's ability to have an effect on real-life events. Although the percentage of the population using Twitter is still relatively low, trending topics (Twitter's most popular keywords at any given time) were often reported on the mainstream news.
To support research - Twitter may form the basis for discussions that then move into more traditional communication fora. A record of how important connections were made may be of future use.
As a measure of impact - Retweets (Twitter's mechanism for forwarding on information) and comments on particular subjects are a good indicator of the impact of posts. The number of followers an account has is also of interest. Twitter can allow information to reach a large number of people in a very short amount of time. This is of particular interest to those involved in marketing and areas like scholarly communication.
Preservation of an organisation's corporate memory - Twitter is a useful way to obtain feedback about your organisation. It may also be used by staff to discuss the working environment etc. Data mining of preserved tweets could potentially inform corporate decisions.
As a record of an event - Most events now have a hashtag. Event organisers may find preservation of their event data very useful for future event planning and delegates may find having a record of the back channel helpful. Recent innovations such as Twitter captioning services (like iTitle - http://www.rsc-ne-scotland.org.uk/mashe/ititle/) allow 'mashing up' of a Twitter stream and video footage of speakers.
As an individual - For one's own personal records.
Back in late 2008 to early 2009 the JISC Preservation of Web Resources project looked at digital preservation issues of relevance to the UK HE/FE web management community. One particular area of interest was the preservation of Web 2.0 services. In June 2009 the project blog published a post entitled 'Some Use Cases for Preserving Twitter Posts'.
What do you preserve and how?
In April 2010 Twitter announced that it would be donating the entire archive of public tweets to the Library of Congress (LOC) for preservation and research. The LOC sees Twitter as 'a historical record of communication, news reporting, and social trends'. It states that its key aims are preserving access to the archive for the long term and making data available to researchers, although how it will do this is as yet unclear. Although this is an interesting move that endorses the value of tweets, it does not transfer responsibility away from individuals and organisations; they still need to preserve the tweets that are important to them.
Any form of digital preservation, no matter how minor, requires a basic strategy. The sooner the strategy is documented, the sooner it can be embedded in workflow processes and implemented. When detailing a Twitter preservation strategy, some questions you will need to ask are: why do you wish to preserve this data and what do you intend to do with it, who will be responsible for carrying out the work, and how can the preservation actions become part of a workflow? Once a strategy is in place you can move on to selection (considering which tweets you will actually preserve), choosing the tools you will use, and then the actual archiving.
The WordPress Lifestream plugin allows you to integrate Twitter with your blog and so archive using blog capabilities.
What the Hashtag allows you to create an HTML archive and RSS feed based on a hashtag.
The Tweetdoc service allows you to create a PDF file that brings together all the tweets from a particular event or search term.
Twapper Keeper allows users to create an archive of tweets for a hashtag.
The Archivist Desktop is a desktop application that runs on your local system and allows you to archive tweets for later data mining and analysis for any given search.
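For those who would rather keep their own copy, the archiving step itself can be quite simple. The following is a minimal sketch, not a description of how any of the tools above work: it appends newly fetched tweets to a local file and skips any already stored, so that repeated harvests do not create duplicates. The function name, the one-tweet-per-line JSON file layout and the shape of each tweet (a dictionary with an "id" key) are all assumptions made for illustration; fetching the tweets is left to whichever tool or service you have chosen.

```python
import json
from pathlib import Path


def archive_tweets(tweets, archive_path):
    """Append tweets not already in the archive; return how many were added.

    Assumption: each tweet is a dict with at least an "id" key.
    The archive is a plain text file holding one JSON object per line.
    """
    path = Path(archive_path)

    # Collect the ids already stored so repeated harvests do not duplicate tweets
    seen = set()
    if path.exists():
        with path.open(encoding="utf-8") as existing:
            for line in existing:
                if line.strip():
                    seen.add(json.loads(line)["id"])

    # Append only tweets that have not been seen before
    added = 0
    with path.open("a", encoding="utf-8") as archive:
        for tweet in tweets:
            if tweet["id"] in seen:
                continue
            archive.write(json.dumps(tweet, ensure_ascii=False) + "\n")
            seen.add(tweet["id"])
            added += 1
    return added
```

Calling the function after every harvest with whatever the search returned keeps a growing, duplicate-free record. A plain text format of this kind is a deliberate choice: it can be read without special software, which matters when the aim is long-term access.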
Digital preservation has been likened to chasing a moving train. Digital objects are generated everywhere and the technologies they rely on constantly change. Twitter might be seen as just another stream adding to the information deluge, and its preservation as a relatively low priority. However, Twitter has arrived at a time when we are becoming increasingly aware of what can be lost. The tools to preserve the stream are already available. Perhaps this puts us in the fortunate position of being able to build the preservation processes into the way we work. We needn't save the stream but we know how to catch the drops that matter.
JISC Beginner's Guide to Digital Preservation ...creating a pragmatic guide to digital preservation for those working on JISC projects: http://jiscbgdp.wordpress.com/
Marieke Guy works for UKOLN (http://www.ukoln.ac.uk/), a centre of excellence in digital information management providing advice and services to the library, information and cultural heritage communities, based at the University of Bath. She is interested in digital preservation and in particular preservation of web resources. Back in 2008 she worked on the JISC Preservation of Web Resources (PoWR) Project, which organised workshops and produced a handbook that specifically addressed issues of relevance to the UK HE/FE web management community. She is currently writing a Beginner's Guide to Digital Preservation (http://blogs.ukoln.ac.uk/jisc-bgdp/) for JISC and will be presenting a paper on blog preservation and a poster on Twapper Keeper at iPres 2010 (19 - 22 September).