Data Donation. Big Data at the service of the public good

7 September 2018

Through success stories, surprising applications and scandals, industry literature and the popular press have documented, especially over the last decade, every advance and variation of Big Data, the gold of the 21st century. Much less attention has been paid to a small group of companies and organizations that, while recognizing the unquestionable value of data for business, have taken a parallel but opposite path, focusing on initiatives that use data for the social good.

However, today there is a renewed and growing interest in the public good that data have the potential to generate. Some companies are effectively leveraging their data assets to improve public services and decision-making through the emerging field of data donation, sharing them responsibly with researchers, non-profit organizations, government agencies, and the public sector. After water, waste, electricity, gas, schools, hospitals and roads, data are the new common good that, if exploited for the interests of the community, can fill the gaps in knowledge and translate into a wide range of targeted policies and timely interventions.



Many important initiatives aimed at solving humanity's major problems die at birth because the necessary data cannot be found, are damaged, or lack the context in which they were collected. Data donation does not yet have clearly established boundaries and practices, but most definitions share common themes. Mallory Soldner, Advanced Analytics Manager at UPS, explains in her TED Talk the three main categories of data philanthropy: data donation, donation of skills and technology donation.

Data donation by companies and organizations is further divided into two types. The donation of private data, which grants access to specific data sets, within closed environments, for specific purposes, is the most widespread. The donation of open data instead provides access to specific data sets, under open licenses, within public archives or through open APIs, for general or specific purposes. Anonymization, aggregation, and other forms of data protection require time, experience, computing cycles, and other costs that make open data donation much rarer and more complex, even though it can produce broader and more impactful results thanks to the large volume of data involved.

The examples of open data sharing by commercial organizations, excellent though few, confirm the added value of this practice. Syngenta, a Swiss group that today operates globally in the agribusiness sector, has collaborated with the Open Data Institute in London to release searchable, usable and shareable data sets containing the basic information for agricultural efficiency indicators gathered on 3,600 farms in 41 countries across Europe, Africa, the Americas and Asia, representing around 200 combinations useful for understanding the interaction between climate and crops.



The potential of Big Data for the public interest was recognized and seized upon by the UN as early as 2009. At the height of the global financial crisis, the Global Pulse initiative was born as a research and development laboratory to find out whether and how Big Data and real-time analysis could contribute to more agile and effective policies. The results were more than encouraging, so much so that chatter on blogs and forums has become useful for anticipating imminent spikes in unemployment, while the volume of tweets mentioning food prices now serves as an indicator of inflation rates. Twitter has also become a valuable source of information for public health, tracking conversations that range from earthquakes to epidemics and the misuse of prescription drugs. For this reason, the main institutions in the US and Australia have long since released their own real-time monitoring tools for online conversations.

Interest in data philanthropy is not limited to institutions; it also extends to companies, some of which have included data donation among their CSR activities. An excellent example is Mastercard, which inaugurated the Mastercard Center for Inclusive Growth in 2013. The Center operates as an independent subsidiary and focuses on applying Big Data to a range of social issues, while also connecting companies, governments, universities and NGOs with an international community of thinkers, leaders and innovators. One of the Center’s most important collaborations is with the Data-Driven Justice Initiative of the Obama presidency. Through the insights provided by Mastercard, it was possible to demonstrate the impact of crime on shopkeepers and job opportunities in Baltimore, leveraging data to advance criminal justice reform. Similarly, the Center provided the NGO DataKind with 100 data scientists to work on social impact projects around the world, including one with the Red Cross to reduce fire deaths in the United States. According to Shamina Singh, President of the Center, the mission of these activities is precisely to redistribute the resources generated by Mastercard data in order to promote inclusion and a positive social impact.

Most data are collected, and therefore donated, by companies and organizations, but there are specific areas and themes that require direct action by individuals. For example, one such organization brings together data scientists and experts in suicide prevention with the aim of identifying and tracking linguistic and behavioral patterns in the daily web interactions of people at risk of suicide. To achieve this, the organization encourages the relatives of the victims to donate their public data and access to their social media and apps. Profiles and interactions on the web are then analyzed, along with the data collected by wearables and fitness trackers. Thanks to their high predictive capacity, Big Data are thus proving to be good allies in the fight against mental illness and depression.



After the earthquake in Haiti, Columbia University and Sweden's Karolinska Institute worked together with the mobile operator Digicel to understand how people affected by cholera were moving, thus facilitating humanitarian aid. Indeed, to use the full potential of data, it is necessary to be able to cross-reference different sources of information, from social media to telephone companies, including banking services and e-commerce. It is therefore clear that, at this point, the public sector depends entirely on the benevolence of the private sector to fully exploit the possibilities offered by Big Data. To overcome this limitation, it would be necessary to recognize Big Data as a public good.

The idea, however interesting, clashes with the interests of the main generators of Big Data, the Silicon Valley giants who built their empires on this very resource. The myth that the Internet is a neutral, non-hierarchical and decentralized space, where the major social media platforms have the merit of giving everyone a democratic voice and a space, has irreparably collapsed, even in the eyes of public opinion, with the Cambridge Analytica scandal.

In its natural monopoly, Facebook is driven not by the public interest but by a business model focused on advertising, in which a single user is worth about $97 in the United States and Canada and $23 in Europe. For this reason, a few weeks ago UK Labour Party leader Jeremy Corbyn proposed the creation of the British Digital Corporation (BDC), which would act both as a think tank to drive digital technology policy and as a non-profit service provider that can compete with for-profit providers like Facebook and use the collected data not for advertising but for the public good. Corbyn’s proposal is undoubtedly courageous, but the main challenge facing a public platform would be to find a worthy substitute for the capitalist profit incentives that drive Facebook’s algorithms and determine its success.

History reminds us with a certain regularity that the common good and social justice, unfortunately, do not seem to be sufficiently convincing arguments on their own. However, there is still time to change course and use data not only to make better decisions about the kind of film we want to see, but about the kind of world we want to see (cit. DataKind).


Although the fields of application of Big Data are still partly unexplored, their role in shaping the society of the future is more than clear. To fully exploit the potential of data, it is necessary to train professionals able to combine the analytical work of the data scientist with the effective implementation of the results. The BBS Master in Data Science is conceived for those who are interested in learning the most innovative methodologies of analysis and their application to the tactical and strategic choices of companies and organizations. Those who attend the Master in Data Science want to learn how to manage a big data business and are aware of the opportunity it presents for generating value.

