Analytical Report 18: Characterising Dataset Search on the European Data Portal
The European Data Portal’s 18th analytical report presents a quantitative study of data search based on more than two years of EDP search and interaction logs. Understanding data search behaviour is key to developing better search algorithms and improving the search experience. The study also presents current findings from key literature on data search.
Characterising Dataset Search on the European Data Portal: An Analysis of Search Logs
In September 2020, the European Data Portal (EDP) published the analytical report “Characterising Dataset Search on The European Data Portal: An Analysis of Search Logs”. This report is a quantitative study conducted by the University of Southampton as part of the EDP. The study analyses more than two years of EDP search and interaction logs and, based on the observed user search behaviour, provides directions for further development.
As a critical mass of datasets has been published openly across Europe, the aim has gradually shifted towards ensuring that the available data is of value to users and that it has broad impact. According to prior EDP work, impact remains the least mature dimension of open data maturity. Sustained efforts to monitor and measure it in various ways are therefore crucial. The provision of datasets, together with dataset search functionality, is a key part of the EDP. The portal aims to support users in both the discovery and re-use of datasets. A better understanding of the data search behaviour of EDP users makes it easier to develop capabilities and experiences that support them. This report is a first step in this direction.
Approach
To begin addressing the issue, the EDP conducted a quantitative analysis of 844,343 anonymised EDP user session logs from between April 2018 and June 2020. Before starting the analysis, the team gathered background information. It investigated dataset search as an emerging area of research, detailing different subtopics that feed into the development of the dataset research agenda. These findings were then contrasted with analyses performed on the search logs of four open data portals. Finally, a quantitative analysis was performed using the Matomo Web Analytics suite, a tool that logs the actions of users each time they visit the portal.
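To give a rough idea of what a session-level log analysis of this kind can look like, here is a minimal Python sketch. It is not the method used in the report: the records are hand-made for illustration, and the field names (`referrer_type`, `queries`) are assumptions that do not reflect the actual Matomo export format.

```python
from collections import Counter
from statistics import mean

# Hypothetical, hand-made session records. The field names are assumptions
# for illustration only and do not reflect the real Matomo export schema.
sessions = [
    {"referrer_type": "search_engine", "queries": ["air quality berlin"]},
    {"referrer_type": "direct", "queries": ["population", "population by region 2019"]},
    {"referrer_type": "search_engine", "queries": []},
    {"referrer_type": "website", "queries": ["transport"]},
]


def summarise(sessions):
    """Return simple descriptive statistics over a list of session records."""
    n = len(sessions)
    # Count how sessions arrived at the portal (e.g. via a web search engine).
    referrers = Counter(s["referrer_type"] for s in sessions)
    # Flatten all search queries issued across all sessions.
    queries = [q for s in sessions for q in s["queries"]]
    return {
        "sessions": n,
        "share_from_search_engines": referrers["search_engine"] / n if n else 0.0,
        "sessions_with_search": sum(1 for s in sessions if s["queries"]),
        "mean_query_words": mean(len(q.split()) for q in queries) if queries else 0.0,
    }


summary = summarise(sessions)
print(summary)
```

Measures such as the share of sessions arriving from web search engines or the average query length are typical starting points for a study like this; the report itself should be consulted for the measures actually used.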
Results
The report continues with a detailed description of the results. They are organised around four topics derived from the background information gathered:
- Dataset search in the context of the EDP
- Dataset search strategies and search query characteristics
- EDP versus web search engines in dataset search
- Success in dataset search
The results section of the report is extensive, and this article offers only a preview. For the full results, please read the report.
Improving data search and user experience
The report concludes with a discussion of the main findings of the search and interaction log analysis. This includes recommendations to emphasise and expand the tracking of user interactions in dataset search, which would allow more detailed follow-up studies to inform search and user experience design.
For open data portals to remain relevant and continue to develop, it is vital that they understand the needs of their users, especially when deciding which functionalities to prioritise in order to increase user uptake. The analysis suggests that while many EDP users land on the dataset section from web search engines, there are alternative ways to add significant value to a user’s dataset search journey. User experience can be improved by supporting users in finding value in the content published on the EDP. In time, such developments could bootstrap a real open government data community that uses the EDP as a learning hub supporting open data re-use.