The data is aggregated as a list of events, each with a time, latitude, longitude, and magnitude. Once scraping is complete, the events are plotted on a world map. The miner is more likely to pick up frequent burst reports than lump-sum reports. For example, the data mined from CNN indicates that CNN reports on Sudan only rarely, while it reports on Iraq quite frequently. This is a clear projection of national interest.
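As a rough illustration of that event list, here is a minimal sketch of how the records could be held and summed per map cell before plotting. The `Event` class, the `bin_events` helper, and the one-degree cell size are hypothetical names and choices for this example, and it assumes the magnitude is a casualty count; the code behind the maps may have worked differently.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    """One mined report (hypothetical structure for illustration)."""
    time: datetime
    latitude: float
    longitude: float
    magnitude: int  # assumed here to be a reported casualty count


def bin_events(events, cell_degrees=1.0):
    """Sum magnitudes per grid cell so each cell can be drawn as one marker."""
    totals = defaultdict(int)
    for event in events:
        # floor() keeps negative coordinates in the correct cell
        cell = (
            math.floor(event.latitude / cell_degrees),
            math.floor(event.longitude / cell_degrees),
        )
        totals[cell] += event.magnitude
    return dict(totals)
```

Binning like this means many small reports from one place add up to a single large marker, which is one way frequent reporting ends up dominating the picture.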
CNN's reported casualties. Iraq, Afghanistan and Pakistan show clear signs of unrest. Note that some locations are surprisingly underreported, e.g. Sudan. The counts seem well correlated with U.S. popular interest and with the lack of investment in most of Africa. There is a distinct lack of coverage in the former Soviet Union (F.S.U.).
Reuters' reported casualties. Note that Reuters appears to have much better coverage of Africa and former British colonies in general. Still a distinct lack of reporting in the F.S.U.
Xinhua's reported casualties. Note much better coverage of the world (overall) than CNN. East Africa is particularly well covered, aligning well with the idea that China is extremely interested in its growing African investments.
RIA Novosti's reported casualties. The F.S.U. is much more thoroughly covered, aligning well with the idea that Russia desires a sphere of influence on its borders. The lack of coverage in developing Africa would seem to suggest that Russia is not a net exporter of investment. Also surprisingly poor coverage of Iraq and Afghanistan.
Notes:
- The web searches on Reuters, CNN, Xinhua and RIA Novosti were wildly abused in the production of these maps.
- OpenGeocoding.org was used to geocode the reported locations into latitude and longitude.
- ws.geonames.org was used to reverse geocode the coordinates into country codes.
- gunn.co.nz was used to turn data into maps.
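The notes above describe a small pipeline: place name to coordinates, then coordinates to country code. A minimal sketch of that flow is below. The two lookup steps are passed in as plain functions because the actual request formats of the services are not shown here; `casualties_by_country` and the stub tables are hypothetical, and the casualty figures are made-up placeholders, not mined data.

```python
from collections import Counter


def casualties_by_country(reports, geocode, reverse_geocode):
    """Total reported casualties per country code.

    reports: iterable of (place_name, casualties) pairs.
    geocode: place_name -> (latitude, longitude), or None if unknown.
    reverse_geocode: (latitude, longitude) -> country code, or None.
    """
    totals = Counter()
    for place, casualties in reports:
        coords = geocode(place)
        if coords is None:
            continue  # place name could not be resolved, so skip it
        country = reverse_geocode(*coords)
        if country is None:
            continue  # coordinates did not map to a country
        totals[country] += casualties
    return dict(totals)


# Stub lookups standing in for the real geocoding services.
stub_coords = {"Baghdad": (33.3, 44.4), "Kabul": (34.5, 69.2)}
stub_countries = {(33.3, 44.4): "IQ", (34.5, 69.2): "AF"}

sample_reports = [("Baghdad", 10), ("Kabul", 3), ("Baghdad", 2), ("Nowhere", 9)]
sample_totals = casualties_by_country(
    sample_reports,
    stub_coords.get,
    lambda lat, lon: stub_countries.get((lat, lon)),
)
```

Keeping the lookups as injected functions makes it easy to swap in a cache or a different service without touching the counting logic.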
I remember these maps, they really show the startling difference in coverage. I wish there was something more you could do with the data... but this is pretty cool.