The first of a two-part literature review on open data portals, and introducing the concept of a 'portal hourglass'
This is the first of two posts in my Data Portals series drawing on a rapid literature review, covering around 80 of the hundreds of published papers on the topic of (open) data portals over the last decade. You can find the full set of papers covered in the review (most with abstracts) in this Zotero collection.
In this post, I’m exploring the open data expectations gap that portals have faced: introducing an hourglass model to think about the different areas of focus for portal-centric research, and future intervention.
Over the last decade, open government data portals have attracted considerable cross-discipline academic interest, with steady growth in the number of publications mentioning data portals or open data portals in their title or abstract. Papers have come from technology and information management scholars, political scientists, public administration and management researchers, human-computer interaction researchers, and, as a major contributor to the publication volume, computer scientists. The graph below (showing papers with “open data portal” in their title or abstract as tracked by dimensions.ai) includes papers from Lecture Notes in Computer Science (20), the arXiv pre-print server (20), Government Information Quarterly (8), PeerJ Preprints (7) and Communications in Computer and Information Science (7) amongst a long-tail of many other publication venues.
Digging into this literature, I was left with the distinct impression that it is almost universally critical, albeit often also constructive. That is to say, papers are generally focussed on the limitations, faults and unfulfilled promise of portals, although these diagnoses generally provide the basis for proposed interventions intended to improve portal performance against the authors’ preferred portal objectives, rather than being used to argue against the concept of portals altogether.
Computer scientists are frustrated with the lack of machine-readable data available; political scientists with the lack of tangible impacts on transparency; and human-computer interaction researchers, with perceived usability challenges of portals failing to make data more accessible to the non-technical user.
However, to pick through the critiques, it’s useful to have more of a map of what researchers mean by ‘portal’. Thorsby et al. (2017), for example, introduce a useful analytical separation between ‘portal features’ and ‘content’ (the datasets). For other authors, the focus is how data from a portal is used; whilst others see the portal more as a data publishing initiative, involving a portal management team, and potentially requiring collaboration and new practices from across an organisation in order to make data available.
Drawing on the Internet hourglass model [Beck 2019], where TCP/IP acts as the common connection point allowing many different network and physical layers (the bottom of the hourglass) to support many protocols and applications (the top of the hourglass), I’ve found it can be useful to think of open data research through the concept of a ‘portal hourglass’. That is, the (open government data) portal is seen as the common interface point between many disparate datasets, agencies, agendas and processes inside an organisation, and many different (potential) applications of, uses for, and engagements with, that data outside the organisation.
Individual papers may focus on different layers of the hourglass: from the organisational data practices that generate and share data and metadata, or the work of creating and launching a portal, through to the quality of metadata and datasets accessible through the portal, or the kinds of data use taking place. However, the portal, either as a technical artefact, or a policy initiative, is generally seen as a part of this focus, and the primary mechanism of data availability.
This places an interesting pressure on the idea of the portal. Many papers implicitly start from the engagement layer, and ask whether portals are delivering on a vision of improved government transparency [Lnenicka and Nikiforova 2021; Lourenço 2015], democratic governance [Ruijer et al. 2017], innovation [Ojo et al. 2016; Zuiderwijk et al. 2014], or ‘Government as a platform’ [OECD 2018]. Frequently, the empirical work of these papers then only looks as far as the portal, exploring whether portals have appropriate metadata, datasets, or social features to support desired impacts. Wider organisational practices relating to transparency or participation are more or less hidden from view, and portals are judged to be more-or-less failing to deliver. This has arguably driven a focus on designing better social or co-production features into portals, and seeing the portal as a potential location for participatory engagement between citizen and state.
Note: following our design sprint, we have moved away from an hourglass framing of the portal. See ‘From pinch-point to pyramid’ for more.
Many authors appear to have explicitly, or implicitly, adopted the maturity model presented by Alexopoulos et al. [2017], which presents a path from 1st Generation portals that focus on ‘Data & transparency’ to 2nd Generation portals incorporating interactive communications and crowdsourcing, and 3rd Generation portals supporting “Collaboration: interactivity and with the public, Co-creating value-added services”. Yet, the 2nd and 3rd generations appear to have been persistently difficult to deliver. Lupi et al. [2020] “outline a structural misalignment between … expected uses of data in local actions and forms of support to the users provided by current city Open Data portals”. These technologies alone, it appears, can’t deliver what’s being asked of them, and yet it has often been the technology, rather than the wider initiatives and community dynamics that successful portals may be embedded in, that has been replicated.
Chatfield and Reddick [2017] point towards the importance of considering the wider organisational arrangements around portals in their study of twenty Australian local government data portals, which finds that the presence of a clear open data policy correlates with better functioning portals. Gasco-Hernandez and Gil-Garcia [2018] explore the role of management in translating policies or laws into functioning portals (or not). And Abella et al. [2019] also argue for the need for research to address the whole ‘open data impact process’, by asking about each stage from ‘qualifying data for publication’, through to understanding a ‘re-users ecosystem’. This raises important questions of how far data can be made more re-usable by action solely at the layer of portal management, or whether portals as both technology and initiatives need to be designed to reach further inside public organisations. Such action may need to support substantially changed data management practices that better take account of external dataset re-users, and even external contributors, to draw on Anastasiu et al.’s [2020] vision of the portal as a switchboard that surfaces public data demand as well as supply.
It is also worth noting here a limitation of the hourglass model. The idea that all engagement and use facilitated through the portal is external is an oversimplification, and potentially an increasingly problematic one. Anecdotally, when the World Bank launched their open data portal, a substantial proportion of portal use was internal, rather than external, highlighting the role of the portal in breaking down internal data silos. Park and Gil-Garcia [2021] use a case study from a US health programme to highlight how increased data accessibility supports internal stakeholders to rethink data practices. They look at the important role of middle-managers and non-managers inside agencies in changing the way data is managed and presented, and highlight the “gap between ‘data’ quality and data ‘system’ quality”, pointing to the value of flexible visualisation tools in allowing internal actors to take better ownership of datasets, and increase data quality.
The hourglass model can also help in thinking about the relative reach of different interventions in the portal landscape (and their relative complexity). For example, ‘portals as code’ could be modified to introduce better metrics on dataset re-use, as Degbelo et al. [2020] suggest. Such modifications to the portal might be relatively low-investment changes, which, if they incentivise behaviour change in organisational data management, could have a large effect. By contrast, delivering effective training (engagement layer) to support re-users may, as Gascó-Hernández et al. [2018] found in their review of OGD training interventions in Spain, Italy and the United States, require significant work to capture and communicate knowledge about the context of data, and may demand the provision of new opportunities to interact with (relevant parts of) government. Such approaches, tailored to each dataset, may, however, prove essential to secure high quality engagement. Given the high costs of applying improvements to all data at once, it might be appropriate, as Hsu et al. [2021] suggest, to focus on improving the quality and standardisation of a particular set of priority datasets, focussing less on the long-tail of data publishing, and more on those with high (social) value. Decisions about the level of the portal hourglass to act on, and realistic research-based assessments of their likely impact, may help shape future pathways for portals-as-a-whole to move forward.
Abella, A., Ortiz-de-Urbina-Criado, M., and De-Pablos-Heredero, C. 2019. The process of open data publication and reuse. Journal of the Association for Information Science and Technology 70, 3, 296–300.
Alexopoulos, C., Diamantopoulou, V., and Charalabidis, Y. 2017. Tracking the Evolution of OGD Portals: A Maturity Model. Electronic Government, Springer International Publishing, 287–300.
Anastasiu, I., Foth, M., Schroeter, R., and Rittenbruch, M. 2020. From Repositories to Switchboards: Local Governments as Open Data Facilitators. In: S. Hawken, H. Han and C. Pettit, eds., Open Cities | Open Data: Collaborative Cities in the Information Era. Springer, Singapore, 331–358.
Beck, M. 2019. On the hourglass model. Communications of the ACM 62, 7, 48–57.
Chatfield, A.T. and Reddick, C.G. 2017. A longitudinal cross-sector analysis of open data portal service capability: The case of Australian local governments. Government Information Quarterly 34, 2, 231–243.
Degbelo, A., Granell, C., Trilles, S., Bhattacharya, D., and Wissing, J. 2020. Tell me how my open data is re-used: Increasing transparency through the Open City Toolkit. In: Open Cities | Open Data. Springer, 311–330.
Gasco-Hernandez, M. and Gil-Garcia, J.R. 2018. The Role of Management in Open Data Initiatives in Local Governments: Opening the Organizational Black Box. JeDEM - eJournal of eDemocracy and Open Government 10, 1, 1–22.
Gascó-Hernández, M., Martin, E.G., Reggi, L., Pyo, S., and Luna-Reyes, L.F. 2018. Promoting the use of open government data: Cases of training and engagement. Government Information Quarterly 35, 2, 233–242.
Hsu, J., Ravichandran, R., Zhang, E., and Keung, C. 2021. Open Data Standard and Analysis Framework: Towards Response Equity in Local Governments. Equity and Access in Algorithms, Mechanisms, and Optimization, Association for Computing Machinery, 1–8.
Lnenicka, M. and Nikiforova, A. 2021. Transparency-by-design: What is the role of open data portals? Telematics and Informatics 61, 101605.
Lourenço, R.P. 2015. An analysis of open government portals: A perspective of transparency for accountability. Government Information Quarterly 32, 3, 323–332.
Lupi, L., Antonini, A., Liddo, A.D., and Motta, E. 2020. Actionable Open Data: Connecting City Data to Local Actions. The Journal of Community Informatics 16, 3–25.
OECD. 2018. Open data portals: Enabling government as a platform. OECD, Paris.
Ojo, A., Porwol, L., Waqar, M., et al. 2016. Realizing the Innovation Potentials from Open Data: Stakeholders’ Perspectives on the Desired Affordances of Open Data Environment. Collaboration in a Hyperconnected World, Springer International Publishing, 48–59.
Park, S. and Gil-Garcia, J.R. 2021. Open data innovation: Visualizations and process redesign as a way to bridge the transparency-accountability gap. Government Information Quarterly, 101456.
Ruijer, E., Grimmelikhuijsen, S., and Meijer, A. 2017. Open data for democracy: Developing a theoretical framework for open data use. Government Information Quarterly 34, 1, 45–52.
Thorsby, J., Stowers, G.N.L., Wolslegel, K., and Tumbuan, E. 2017. Understanding the content and features of open data portals in American cities. Government Information Quarterly 34, 1, 53–61.
Zuiderwijk, A., Janssen, M., and Davis, C. 2014. Innovation with open data: Essential elements of open data ecosystems. Information Polity 19, 1-2, 17–33.