In 2010, the Library of Congress and Twitter announced a historic and incongruous partnership: Together, they would archive and preserve every tweet ever posted, creating a massive store of short-form thoughts. It was odd: a 210-year-old institution partnering with a four-year-old startup, cataloging the internet’s ephemeral #brunchtweets. It was also fascinating: equal parts futuristic and anachronistic. I imagined library scribes copying tweets by hand onto vellum or cranking feeds through a printing press. The news actually frightened some folks: Does this mean my future grandkids will read my live-tweets of Parks and Recreation?

Yet, however dubious the task seemed back then, no one doubted the Library of Congress would get the work done. If Twitter could handle a few million tweets a day, surely the largest library in the world could, too. But as it turns out, it couldn’t.

Six years after the announcement, the Library of Congress still hasn’t launched the heralded tweet archive, and it doesn’t know when it will. No engineers are permanently assigned to the project. So, for now, staff regularly dump unprocessed tweets into a server, the digital equivalent of throwing a bunch of paperclipped manuscripts into a chest and giving it a good shake. There’s certainly no way to search through all that they’ve collected.

And, in the meantime, the value of a vast tweet cache has soared. This frustrates researchers, who had hoped to mine the archive for insights about language and society, and who currently have to pay heavy licensing fees to Twitter for its data.

The library has been handed a Gordian knot, an engineering, cyber, and policy challenge that grows bigger and more complicated every day: about 500 million tweets a day more complicated. Will the library finally untie it, or give in and cut the thing off?

“This is a warning as we start dealing with big data: we have to be careful what we sign up for,” said Michael Zimmer, a professor at the University of Wisconsin-Milwaukee who has written on the library’s efforts. “When libraries didn’t have the resources to digitize books, only a company the size of Google was able to put the money and the bodies into it. And that might be where the Library of Congress is stuck.”

Things looked easier in 2010, when the library launched the Twitter partnership with a jaunty press release, “How Tweet It Is”:

Private account information and deleted tweets will not be part of the archive. Linked information such as pictures and websites is not part of the archive, and the Library has no plans to collect the linked sites. There will be at least a six-month window between the original date of a tweet and its date of availability for research use.