The high cost of free data

Bob McQueen on the potential pitfalls of a “free data for all” approach

There is a legitimate urge for the public sector to make smart city data freely available to public and private sector organizations alike. While this is laudable stewardship of publicly funded data collection and management investments, there may be other aspects of data management that should be considered when developing the optimum approach.

Free data for all may not be the wisest use of public sector investments in data collection, analysis, equipment and tools to manage data. One of the most important reasons is that collecting, managing and storing large volumes of data has a cost. Hence the title of this article, with due acknowledgment to Donald Shoup, Distinguished Research Professor of Urban Planning at UCLA, who inspired it with his book The High Cost of Free Parking.

The public sector also needs to ensure that data is turned into information, from which insight and understanding are gained to support the development of the best response plans and strategies for managing smart city transportation. In some cases, this process is starved of budget and the full value of the data is not unlocked because there are insufficient resources to get to the end of the process. Budget starvation is often a result of misconceptions about cost and a lack of understanding of the value of data. There is often a “vicious cycle” effect at work, as depicted in the illustration (left).

Increasingly, city and regional transportation authorities face a new reality in which a significant proportion of a region’s transportation services may be provided by the private sector. This would be in the form of transportation network companies and smart mobility services such as Uber, Chariot (pictured right, now owned by Ford Motor Company, although Ford announced it was closing the service down just before we went to press) and Lyft. Understanding the respective roles and responsibilities of the public and private sectors is becoming increasingly important.

A public agency cannot avoid the need to collect data. It is necessary fuel for the decision-making, investments and actions required to keep our cities moving. So, should we just throw the data over the wall and hope, or should we be more thoughtful in our approach to data stewardship? Giving people data does not guarantee that they will use it. Is it more effective to distribute the analytics alongside value-added data and selected raw data? I think so. This does not mean that other people should not have the opportunity to apply analytics to the data, just that the public sector should evaluate the opportunity.


Adding value on the public side of the equation can generate additional revenue that will support better and more extensive data management and analysis. My belief is that a balanced approach can exist that will satisfy the objectives of both the public and private sectors. On the one hand, public agencies want the output of publicly funded data investments to be freely available to the public.

On the other hand, doing some of the value-added work in the public sector gives agencies an opportunity to have more influence over how the data is used.

This also enables any revenue that comes from value-added information to be fed back into the costs of data collection, data management and data analysis. For example, if an oil company is investigating the best location for a new filling station, or perhaps even the best location for a new electric vehicle charging point, it may be interested in a package of value-added data that goes beyond the basics.

By combining basic data sets, a value-added data package can be created that shows traffic volumes relative to the cost of the land, shows the foot traffic that would generate revenue for the associated store and provides insight into the demographics of the area. This type of data packaging adds value by making the data easier for the consumer to use, and so has an intrinsic value for which a fee can be charged. It also enables the public sector to have much greater influence over the outcomes of data use.
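To make the packaging idea concrete, here is a minimal Python sketch of how basic data sets might be combined into a ranked site-selection table. The `Site` fields, the function names and every figure are hypothetical and made up for illustration; demographics are left out to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    daily_vehicles: int      # from a basic traffic-count data set
    daily_pedestrians: int   # from a basic foot-traffic data set
    land_cost_usd: float     # from a basic land-valuation data set


def vehicles_per_thousand_dollars(site: Site) -> float:
    """Traffic volume relative to land cost (vehicles per $1,000 of land)."""
    return site.daily_vehicles / (site.land_cost_usd / 1000)


def build_package(sites):
    """Combine the basic data sets into one table, best traffic-to-cost ratio first."""
    rows = [
        {
            "site": s.name,
            "vehicles_per_k_usd": round(vehicles_per_thousand_dollars(s), 2),
            "daily_pedestrians": s.daily_pedestrians,
        }
        for s in sites
    ]
    return sorted(rows, key=lambda r: r["vehicles_per_k_usd"], reverse=True)


# Made-up figures for illustration only, not real data.
sites = [
    Site("Corner A", daily_vehicles=12000, daily_pedestrians=800, land_cost_usd=600000),
    Site("Corner B", daily_vehicles=9000, daily_pedestrians=1500, land_cost_usd=300000),
]
package = build_package(sites)
```

The raw counts and land values could still be published free of charge; what a buyer would pay for is the combined, ranked view.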


It is often said that data is the new oil. So, when was the last time that an oil company invested money to explore and find oil, extracted it from the ground and then gave it away? They do not do that and when challenged will explain that there is an investment in finding and extracting the oil that needs to be repaid. Therefore, they keep the raw oil and subject it to a refining process that results in a wide range of much sought-after and valuable end products.

Now I know that public sector agencies are not like oil companies and have different objectives related to social equity and the wise stewardship of public investment. However, there is a business model that a public agency could consider that manages smart city data in a balanced manner. First, the data should be maintained, identified and stored in a manner that gives the right people consistent and coherent access to all the data at the right time.

This can be either a central repository or a distributed, governed model such as a service-oriented architecture. Second, the data in the repository will be used to develop advanced analytics to generate the information needed for world-class management of public sector transportation services and other smart city services. Third, while processing the data for the public sector analytics, additional data manipulation and value-added analytics can be conducted as required by the private sector. Dual streams of available data could then be offered to both the public and private sectors. A basic set of data can be made available free of charge to all comers.

Value-added data and analytics would then be subjected to a valuation process and offered to the private sector for a fee. The valuation process should have twin objectives. First, the public sector should determine the value of the data from an internal perspective. What improvements in safety, efficiency, user experience and environmental protection are derived from having the data and making effective use of it? This is the public value of the data. The second valuation is the commercial, open market value of the data. This applies to value-added data or analytics that have been subjected to additional processing, such as combining data sets or extracting trends and patterns to create information.

This approach has the added benefit of addressing a specific challenge. Making data available does not ensure that people will use it, or use it wisely. By providing value-added data and analytics as part of the distribution process along with the raw data, it is possible to provide examples and stimulate a wide range of people to make use of the data to achieve outcomes. These outcomes can be in terms of the safety, efficiency and user experience goals described earlier, but could also be private sector goals related to establishing new businesses and operating successful services.

There is a marvelous opportunity for public-private partnership associated with smart city data. The contention is that smart cities need smart data management and that smart data management involves the following elements:

  1. Efficient data collection and ingestion supported by artificial intelligence tools
  2. A data management system that is intelligent enough to allocate data to different storage areas depending on the frequency of use, and that works with either centralized or distributed architecture models
  3. World-class ease of use for extracting the data and using it to answer any question, not just predefined questions
  4. The ability to apply advanced analytics to the central data repository to get the information required to make life better and achieve the outcomes of the public sector
  5. An advanced distribution mechanism that will provide data to the appropriate place and to the right person at the right time
  6. Data creation capabilities that enable the management of the level of extraction, the analytics used and fluctuations in the value of data
  7. A mechanism to determine the public value of the data related to safety, efficiency, user experience and environmental preservation gains, and a private sector market valuation of the data

These elements can be used by internal public sector staff, or by private sector support staff.
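As an illustration of the second element, allocating data to storage areas by frequency of use, the following is a minimal Python sketch. The tier names, the 30-day window and the thresholds are assumptions chosen for the example, not values taken from any particular product or agency.

```python
# Hypothetical thresholds: minimum reads in the last 30 days for each tier,
# checked from the most to the least frequently used.
TIER_THRESHOLDS = [("hot", 100), ("warm", 5), ("cold", 0)]


def choose_storage_tier(reads_last_30_days: int) -> str:
    """Return the storage tier for a data set based on how often it is read."""
    if reads_last_30_days < 0:
        raise ValueError("read count cannot be negative")
    for tier, minimum_reads in TIER_THRESHOLDS:
        if reads_last_30_days >= minimum_reads:
            return tier
    return "cold"  # defensive default; the zero threshold already covers this


# Hypothetical data sets and read counts.
usage = {"signal_timing_logs": 250, "monthly_ridership": 12, "2009_survey_archive": 0}
placement = {name: choose_storage_tier(reads) for name, reads in usage.items()}
```

In practice the thresholds would be tuned against actual storage costs and access patterns; the point is only that the rule can be simple and explicit.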


In addition to raw data, the distribution method will also include the capability to distribute analytics by supporting an application center-type dashboard, where previously developed analytics can be accessed and used in their entirety or as models for further adaptation. The distribution mechanism will also include a suitable pricing structure that enables both free and premium data to be made available. Distribution of data can be guided by a marketing and sales plan that takes account of the evolving value of data.
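A pricing structure that serves both free and premium data could be sketched along the following lines. This is a minimal Python illustration under stated assumptions: the `DataProduct` type, the catalog entries and the price are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    name: str
    premium: bool
    price_usd: float = 0.0  # free products keep the default of 0.0


# Hypothetical catalog with one free and one premium product.
CATALOG = {
    "raw_traffic_counts": DataProduct("raw_traffic_counts", premium=False),
    "site_selection_package": DataProduct(
        "site_selection_package", premium=True, price_usd=500.0
    ),
}


def can_access(product_name: str, has_paid: bool) -> bool:
    """Free products are open to all comers; premium products require payment."""
    product = CATALOG[product_name]
    return (not product.premium) or has_paid


def amount_due(product_names) -> float:
    """Total fee for a basket of products; free products contribute nothing."""
    return sum(CATALOG[name].price_usd for name in product_names)
```

Prices could be revised over time as the valuation of each product changes, without touching the access rule.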

All these features should be contained within a cohesive framework that can take an ecosystem approach and incorporate multiple best-of-breed tools from different suppliers.

So how do we move toward a smart data management approach for a smart city? First, there is no need to develop the entire framework described above in one single initiative. In fact, given the need for organizational capability to keep up with technology capability, it is better to take an incremental approach to implementation. An incremental approach would begin with an early winner or proof-of-concept project focused on a small number of use cases that are of immediate concern.

Two criteria identify a good early winner: it will deliver clear and immediate value to the agency, and suitable data is available to support the use cases and analytics. This early project would deliver standalone value and benefits but would also allow the organization to adapt to the new capabilities. The experience and results would also be documented to form the basis of a business justification for further investment in the larger scale system described above.

This is a balanced and pragmatic approach that allows free data to be available in recognition of the publicly funded nature of data collection among public agencies. It also recognizes that significant value can be generated by bringing the data together and subjecting it to advanced analytics. There are two choices with respect to this value-adding process: it can be done on the public side of the fence or on the private side. My argument is that it is best done on the public side, so that the public sector can influence and advocate for a successful outcome for both the public and private sectors, not regulating or mandating, but acting as an informed guide.

This may not be the best answer, but let’s talk about it and come to a decision that will drive the best business model, one that both satisfies public sector policy objectives and provides the best support for private sector entrepreneurial efforts. The decision should consider short-, medium- and long-term objectives and the reasonably anticipated evolution of transportation and related information technology services.


There is no need to sell the family silver or for the public sector to act as a benevolent supporter of the private sector. A true win-win public-private partnership that is sustainable over the long term is an achievable goal. My contention is that the operation of a smart data management system that recognizes the public value and private value of data, distributing it accordingly, is the wisest use of public investment in smart city data collection and analysis. Let’s aim for a win-win that emphasizes the high value of data that is appropriately managed and used, and that recognizes the real value of data from both public and private sector perspectives.

Let’s take a balanced approach to the provision of free and premium data using a “freemium” model and, most importantly, let’s take a smart data management approach to data for a smart city.


Bob McQueen is the founder of Bob McQueen and Associates and is also North American Bureau Chief of H3BM.
