The Library of Congress announced today that it has completed the process of collecting a full, continuous stream of tweets, and that it has begun working to archive and organize more than 170 billion tweets.
Under a 2010 agreement between the federal institution and Twitter, the microblogging company provides the Library of Congress with a full feed of all public tweets, starting with the 21 billion generated between 2006 and April 2010 and now supplemented by about 150 billion more.
In a statement on the status of the project today, the library wrote that Twitter is a new kind of collection for the Library of Congress, but an important one for its mission. As society turns to social media as a primary means of communication and creative expression, social media is supplementing, and in some cases replacing, letters, journals, serials, and other sources routinely collected by research libraries.
Although the library is still building and stabilizing the archive, and it is not yet available to researchers, it has already received about 400 inquiries from researchers around the world. The topics of interest they have expressed range from the rise of citizen journalism and the communications of elected officials to tracking vaccination rates and predicting market activity.
It is not yet entirely clear how the Library of Congress archive will be used, but the library has released a white paper (PDF) describing the project.
This project is, of course, separate from Twitter's recently announced initiative to put each user's full tweet history at their disposal. That effort is ongoing, although only certain users have access so far.
Interestingly, the Library of Congress reported in the white paper that two copies of the entire archive of 170 billion tweets amount to about 133 terabytes of data. Each tweet, the library writes, is accompanied by about 50 fields of metadata.