Acquiring the Cablegate documents that have been released so far was a nightmare. It is amazing that these files are so difficult to find and sort, given how frequently the cables have been mentioned on the Internet and in the media. My first attempt at downloading the cables started from the link posted on one of my class's webpages, which led me to a page titled “The Secret US Embassy Cables”.
After browsing the subsequent pages, I noticed that the website allowed you to download all of its content in a single archive, which I assumed would be a .zip or .rar file. Clicking the link brought me to a 404 File Not Found page. What I did notice, however, was that the URL ended in .torrent, meaning the single archive is actually distributed as a .torrent file.
Since .torrent files are commonly used to share files, often large ones and sometimes anonymously, I simply typed the file name (cablegate-201101132206.7z.torrent) into Google. The search brought back nothing promising, only sketchy websites that require you to pay to access the file. No thanks.
I then searched for the .torrent file directly on The Pirate Bay, and again the search brought back no results. Searching for “Cablegate” instead brought back 148 files. It seemed that files of roughly 5 MB were being uploaded daily by anonymous users. I then downloaded every available torrent (2010/11/28 – 2011/01/08) using Transmission.
The files downloaded incredibly quickly, but were compressed in .7z format. I then downloaded The Unarchiver, an archive utility for Mac OS X. The files uncompressed easily, and the entire process of downloading the torrents and unarchiving them took approximately fifteen minutes. I then dumped all the files into a folder. The folder totalled 6.45 GB, which seems huge for a bunch of .html and .txt files. It contains a whopping 167,214 files, which is odd seeing as only 2,428 cables have been released as of today, January 16th, 2011.
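Counts like these can be double-checked with a short script rather than by trusting the Finder. What follows is only a minimal sketch in Python: the function name summarize_folder is my own invention, and the folder path in the usage note is a placeholder, not the actual location of the files.

```python
from collections import Counter
from pathlib import Path


def summarize_folder(root):
    """Return (file_count, total_bytes, files_per_extension) for everything under root."""
    count = 0
    total_bytes = 0
    by_extension = Counter()
    # rglob("*") walks the folder recursively, including subfolders.
    for path in Path(root).rglob("*"):
        if path.is_file():
            count += 1
            total_bytes += path.stat().st_size
            by_extension[path.suffix.lower()] += 1
    return count, total_bytes, by_extension
```

Calling summarize_folder("cablegate") on a hypothetical folder named cablegate would return the number of files, their combined size in bytes, and a tally per file extension. That tally would show how many of the files are actually .html or .txt, and might help explain why the file count is so much larger than the number of released cables.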
The files are in .html and occasionally .txt format, and are labelled with numbers (cablegate-201012041409). This is extremely frustrating because the file names tell you nothing about the contents, so you cannot find a cable by keyword. When I searched for the title TERRORISM FINANCE: REQUEST FOR POSTS ASSISTANCE, which is the title of a cable on the Cable Viewer, Mac OS X returned 132 results.
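One way to narrow a search like this is to scan the contents of the files for the exact phrase instead of relying on the system search. The following is a rough sketch, assuming the files are readable as text; find_files_containing is a hypothetical helper of my own, and the UTF-8 encoding is a guess about the files rather than something I have confirmed.

```python
from pathlib import Path


def find_files_containing(root, phrase, extensions=(".html", ".txt")):
    """Return the files under root whose text contains phrase (case-insensitive)."""
    needle = phrase.lower()
    matches = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in extensions:
            # errors="replace" keeps the scan going if a file is not valid UTF-8.
            text = path.read_text(encoding="utf-8", errors="replace")
            if needle in text.lower():
                matches.append(path)
    return matches
```

For example, find_files_containing("cablegate", "TERRORISM FINANCE: REQUEST FOR POSTS ASSISTANCE") would list every file in a folder named cablegate that contains that title. It is an exact substring match, so it will miss files where the title is broken across lines or interrupted by HTML tags.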
Clearly, these files have no organizational or hierarchical structure and are not properly searchable. To be usable, they need to be sorted, renamed, and formatted in a way that allows for easy viewing. The Guardian has sorted the cables using Google Fusion Tables, but does not allow users to actually read the cables.
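A first step toward sorting the files could be an index that maps each file to a human-readable title. The sketch below is only an illustration under a big assumption: that each .html file carries a meaningful title in its title tag, which I have not verified for these files. The names TitleParser, extract_title, and build_index are my own.

```python
import csv
from html.parser import HTMLParser
from pathlib import Path


class TitleParser(HTMLParser):
    """Collects the text found inside <title>...</title>."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title_parts.append(data)


def extract_title(html_text):
    """Return the page title with whitespace collapsed, or '' if there is none."""
    parser = TitleParser()
    parser.feed(html_text)
    return " ".join("".join(parser.title_parts).split())


def build_index(root, index_path):
    """Write a CSV of (title, file) rows for every .html file under root."""
    rows = []
    for path in sorted(Path(root).rglob("*.html")):
        text = path.read_text(encoding="utf-8", errors="replace")
        rows.append((extract_title(text), str(path.relative_to(root))))
    with open(index_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["title", "file"])
        writer.writerows(rows)
    return rows
```

Running build_index("cablegate", "index.csv") on a hypothetical cablegate folder would produce a spreadsheet-friendly file that can be sorted and searched by title. The original files are left untouched, which seems safer than renaming 167,214 files in place.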
To sum up, accessing these files was difficult because the link to the WikiLeaks full site archive is broken. Once downloaded through torrents, the files are difficult to read and use, as there is no suitable way to search through the documents.