Data-driven Journalism: Making News out of Numbers
In 1858, Florence Nightingale took important raw facts and figures and transformed them into graphics, highlighting the health issues faced by British soldiers in military hospitals. Back then, no one would have called this data-driven journalism. But today, in the age of Big Data and Wikileaks, we’re confronted with this phenomenon on a daily basis, which makes Nightingale’s 19th-century work more relevant than ever.
Behind those facts and figures, there are some truly exciting stories just waiting to be uncovered. Renowned media houses, including The Guardian and The New York Times, have long since recognized this fact and have well-established teams designated especially for data-driven journalism. But what is it exactly that defines this form of journalism? And what does it mean for authors today?
Florence Nightingale’s diagram of mortality rates in the British Army. Source: www.theguardian.com
Today, we’ve certainly moved on from the time when dense data sets could only be found in expensive books. The World Wide Web gives us constant access to an overwhelming amount of complex data. It’s not just Wikileaks and Offshore Leaks; the Internet is awash with statistics, all democratized and made accessible to the public.
That said, even the most explosive data remains relatively worthless until someone takes the information and sheds light on the facts hidden behind the numbers. This is where data-driven journalism really comes into its own. It means rummaging through seemingly incomprehensible mountains of facts and figures, searching through Excel tables and statistics, continually on the lookout for the newest and most exciting story.
Starting Point – Data
It goes without saying that statistics have always been used for journalistic research. In classic journalism, however, statistics are used to support points the journalist has already made. In data-driven journalism, these same statistics are the starting point for the journalistic investigation. In a nutshell, it’s about using data to investigate and develop stories, and then presenting the resulting insights to the public in the best possible way.
Simon Rogers, Founder of the Guardian’s Data Blog, describes the role of data journalism in this way: “Mostly, we act as the bridge between the data (and those who are pretty much hopeless at explaining it) and the people out there in the real world who want to understand what that story is really about.”
Real Detective Investigations
Anyone who has ever seen raw data sets may well wonder how they could inspire exciting journalism. According to Rogers, the key is to think like a journalist and not like a programmer. Ask yourself what’s interesting about the statistics. Is there anything new or special about the information? What would happen if you combined it with another data set?
Discovering the news behind the data sets and presenting these stories in a clear, understandable way can often prove to be real detective work. For this reason, professional data-driven journalists use techniques such as data mining or web scraping. Even without these tools, though, tables and data sets can still be used to source exciting news. You’ve just got to keep a few things in mind.
How to Interpret Data Correctly?
Regardless of whether it’s truancy rates or the yearly number of break-ins, try to visualize your data sets as you work. Programs such as Excel or OpenOffice Calc can be used to create clear and simple diagrams out of even the most incomprehensible data. Even at a glance, everything will seem much clearer. If you’re struggling to find the news, keep these points in mind:
- Check out any anomalies or irregularities the data throws out as these could well be important starting points for further investigation.
- The same is true for those stats on the edge of the rankings: Just looking at the extremes often leads to interesting conclusions.
- To understand data sets correctly, it often helps to think of them in relative terms. What does the average amount of debt per person in one city actually mean? What happened during the period in question? How far does the figure diverge from the national average? Critically reviewing data in this way can quickly indicate whether you’re on to the latest hot story.
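The three checks above (anomalies, extremes and relative comparison) can be sketched in a few lines of Python. This is only a minimal illustration, not a prescribed method: the city names, the debt figures and the national average are invented for the example, the function names are hypothetical, and the rule of flagging values more than two standard deviations from the mean is just one common rule of thumb.

```python
from statistics import mean, pstdev

# Hypothetical figures, invented purely for illustration:
# average debt per person (in euros) for a handful of made-up cities.
debt_per_person = {
    "City A": 2100,
    "City B": 1950,
    "City C": 2300,
    "City D": 2050,
    "City E": 6400,
    "City F": 1800,
}

# Assumed national average, also invented for this sketch.
NATIONAL_AVERAGE = 2200


def rank_extremes(values, n=1):
    """Return the n lowest and n highest entries as (name, value) pairs."""
    ranked = sorted(values.items(), key=lambda item: item[1])
    return ranked[:n], ranked[-n:]


def relative_to_average(values, average):
    """Express each value as a percentage deviation from a reference average."""
    return {
        name: round((value - average) / average * 100, 1)
        for name, value in values.items()
    }


def find_anomalies(values, threshold=2.0):
    """Flag entries more than `threshold` standard deviations from the mean."""
    mu = mean(values.values())
    sigma = pstdev(values.values())
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [
        name for name, value in values.items()
        if abs(value - mu) / sigma > threshold
    ]


if __name__ == "__main__":
    lowest, highest = rank_extremes(debt_per_person)
    print("Lowest:", lowest)
    print("Highest:", highest)
    print("Deviation from national average (%):",
          relative_to_average(debt_per_person, NATIONAL_AVERAGE))
    print("Possible anomalies:", find_anomalies(debt_per_person))
```

A flagged value is not a story in itself. It is a prompt to go back to the source and ask why that figure stands out.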
Orientation in the Data Jungle
Data-driven journalism can only succeed if the right sources are used: It’s vitally important to make sure the data is verified and comes from trusted, reliable sources. Raw data can be found in abundance online. Journalists will often use the websites of various government institutions, research organizations, societies and non-governmental organizations to get the verified stats they need. Germany’s Federal Statistical Office (Statistisches Bundesamt), for example, has a great deal of information on very specific topics.
Even off these well-trodden paths, it’s easy to come across freely accessible sources, many of which can help when developing exciting news. A few include the following websites:
- Nationmaster: Whether you’re looking up Internet usage in Iceland or obesity rates in Asia, Nationmaster has the numbers you need. The website provides countless possibilities for comparing countries and their populations.
- Eurostat: If you’re looking up the living conditions of Europeans or maybe even their shopping behaviors, then the European Commission has got information on all EU countries here.
- The CIA’s World Factbook: The CIA knows a lot more about us than we’d sometimes like. Some of this information, however, it shares with the public in the World Factbook. The information is impressive. If you’re looking for specific facts on a particular country, then this resource just can’t be beaten.
- Google Public Data Explorer: Public data from many organizations, such as the World Bank and the OECD, can be accessed here.
- Google Zeitgeist: What was the world searching for in 2013? Here, you can find the answer to more than a billion questions, all of which were searched for using Google in 2013.
Effective Visualizations: Clear, Interactive, Multimedia
Large volumes of statistics often only become clear when they’re presented visually, for example, through Gantt charts, index cards and organization charts. Digital media keeps opening up exciting new ways to present data sets. Graphics no longer have to be static and two-dimensional; they can be interactive. Data journalist Gregor Aisch translated the Japanese reactor disaster into a graphic for a German audience by showing how many people would need to be brought to safety should a similar event occur in Germany. The New York Times really impressed with its combination of video, text, photos and graphics in its feature ‘Snow Fall’. Its graphic ‘Climbing the Income Ladder’ highlighted the connection between where people come from and their income in the USA.
Graphic: “In Climbing Income Ladder, Location Matters”. Source: www.nytimes.com
The Guardian Data Blog presented one further inspiring example of successfully transformed data sets at the website informationisbeautiful.net. Tip: Anyone can quickly achieve a great visualization of their data using the free software available at Datawrapper.
Think of Numbers Like You Do Words
Even in data-driven journalism (DDJ), it all comes down to one thing: telling great stories and doing so in the best possible way. This can be done through words or numbers, and as always, it still requires the specialist touch of a good journalist. The ability to transform data sets and make them clear and understandable is becoming increasingly important for journalists today. And just as it has always been important to read the news critically, viewing statistics critically matters now as much as ever.