by Dr. Jose Ramon G. Albert, Secretary General, Philippine National Statistical Coordination Board
Official Statistics and Public Policy
Governments across the world, including the government of the Philippines, recognize the need for information to manage their economies more effectively. There is a particular need to examine national development plans and global commitments to reduce poverty and achieve related goals, as exemplified by the Millennium Development Goals (MDGs) and the emerging Post-2015 MDG Agenda. There is also pressure to accelerate progress in meeting these goals.
Big Data is Here!
The world’s capacity to collect data has reportedly doubled every 40 months since the 1980s, with about 2.5 quintillion (2.5 × 10¹⁸) bytes of data created per day in 2012. With such a tsunami of data being shared and transmitted at an exponential rate, on the web and by various other electronic means, the public’s hunger for information is accelerating.
What is now referred to as Big Data, typically characterized by the three Vs (volume, velocity and variety), is undoubtedly creating business opportunities. Google, for instance, has illustrated with Google Flu Trends the potential of using web searches for terms relating to illness: the frequency of such searches correlates strongly with the actual incidence of the flu reported by the US Centers for Disease Control and Prevention. Big Data applications have also shown promise for monitoring inflation, predicting sales (through the frequency of tweets), and tracking people’s movements, including traffic (through the frequency of SMS messages). While these case studies show the vast potential of Big Data, there are also possible pitfalls.
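As a rough illustration of what "correlating strongly" means in the flu example, the sketch below computes a Pearson correlation coefficient between a search-frequency series and a reported-cases series. The weekly figures are hypothetical and invented purely for illustration; they are not Google Flu Trends or CDC data, and the function name `pearson_r` is likewise an assumption of this sketch rather than anything taken from those systems.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical weekly values (illustrative only, not real data):
# share of searches mentioning flu-related terms, in percent,
# and officially reported flu cases for the same weeks.
search_share = [1.2, 1.5, 2.1, 3.0, 4.2, 3.8, 2.6, 1.9]
reported_cases = [110, 130, 190, 280, 400, 370, 240, 170]

r = pearson_r(search_share, reported_cases)
print(f"Pearson r = {r:.3f}")
```

A coefficient close to 1 only says that the two series moved together over the period examined. It does not by itself establish that the search-based signal will remain a reliable proxy, which is one reason such indicators still need to be checked against official figures.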
Big Data: Big News or Big Mess?
There is growing enthusiasm about the data revolution and the possibilities of Big Data, especially for measuring and monitoring progress in societies. Even chief statisticians, when they met at the UN Statistical Commission in February 2013, recognized that Big Data is here to stay as an alternative data source. However, many have also expressed some degree of caution, as big data need not always mean better data. Statistician General Pali Lehohla summarizes the differences between official statistics and big data (see Table 1). Big data is largely unstructured, unfiltered data exhaust from digital products, such as electronic and online transactions, social media, and sensors (GPS, climate sensors); consequently, analytics can be poor. Traditional data sources used for official statistics, by contrast, are well structured, but they come at a high cost and are typically infrequent, with time lags.
Table 1. Comparison of Production of Official Statistics and Big Data
| Official Statistics | Big Data |
| 1. Structured and planned product | 1. Largely unstructured, unfiltered “data exhaust”, i.e., by-product of digital products (transactions, web, social media, sensors) |
| 2. Methodological and clear concepts | 2. Poor analytics |
| 3. Regulated | 3. Unregulated |
| 4. Macro-level but typically based on high volume primary data | 4. Micro-level huge volume with high velocity (or frequency) and variety |
| 5. High cost | 5. Generally little or no cost |
| 6. Centralized; point in time | 6. Distributed; real-time |
A serious issue raised against Big Data is the personal information attached to data exhaust. Precise, geo-location-based information could potentially allow “Big Brother” to watch over us. It is clear that these days, Amazon, Visa and Mastercard are closely watching our shopping preferences; Google is watching our browsing habits; Twitter is watching what’s on our minds; Facebook is watching various pieces of information about us, including our social relationships; and mobile providers are probably watching whom we talk to, what we say to them, and even who is nearby.
Any National Statistical System (NSS) is challenged to come up with better statistics for a better society. The challenge is to be more forward looking and open to making use of non-traditional data sources, such as Big Data. Clearly, there will also be a need to identify legal protocols and institutional arrangements so that an NSS can get access to Big Data. Exploratory talks have been held among representatives of the Philippine Statistical System (PSS), academe, and a telecommunications company on using telecom data to analyze migration patterns during and after natural disasters. It would be important for the PSS to have access to Big Data holdings, and it is critical to partner with organizations such as PARIS21 in pursuing such an initiative. Beyond conducting pilot case studies, there will also be a need to address privacy issues through possible statistical policies, in order to prevent misuse of Big Data. An extensive capacity-building program will also be required in the PSS to harness Big Data, so that the Official Statistics community can help identify “signals” within “noise”, certify quality, and ultimately decipher truth from falsehood in the use of Big Data. We in the PSS, after all, are committed to making statistics truly matter to every Filipino.
To read Dr. Jose Ramon G. Albert’s full article, visit the Republic of the Philippines NSCB website.