High-value datasets – demography in the EU
How to use high-value datasets to study population
This is part of a series of articles showcasing examples of high-value datasets from their different thematic categories. High-value datasets are defined by EU law based on their potential to provide essential benefits to society, the environment and the economy. This series aims to help readers find reliable and accurate information from official sources relating to the availability of various high-value datasets and to present this information through data visualisation. You can see the article providing an overview of high-value datasets here.
Only datasets specifically defined by law can be considered high-value datasets, and as such the data presented in the articles does not necessarily fall under that definition. Instead, the data has been chosen to be thematically adjacent to high-value datasets and to showcase what can be done with information made available by official EU bodies and EU Member States. The official list of high-value datasets adopted on 12 December 2022 can be found in the legal documents that define them and their characteristics.
Population as a high-value dataset
Data on demography plays a crucial role in understanding the world around us. By analysing the population dynamics and demographic trends, we can gain a better understanding of the social, economic and other factors that affect us all.
This is why population data was included in the ‘statistics’ category of the high-value datasets. This dataset is called ‘yearly population’ and its key variables are defined by Regulation (EU) No 1260/2013 of the European Parliament and of the Council. Other regulations involved are Implementing Regulation (EU) No 205/2014, Regulation (EC) No 862/2007 and Regulation (EU) No 351/2010.
As specified in the annex to the implementing regulation mentioned above, the yearly population dataset includes three key variables: population on 1 January, median age and old-age dependency ratio.
Median age refers to the age that separates the population into two equal halves, with half of the population being younger and the other half being older than the median age.
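As a minimal sketch of this definition, the snippet below finds the median age from a table of population counts by age. The function name `median_age` and the sample figures are hypothetical and made up for illustration; statistical offices may use a more refined method (for example, interpolation within age groups), so this is not a reproduction of the official calculation.

```python
def median_age(counts_by_age):
    """Return the lowest age at which the cumulative population
    reaches at least half of the total.

    counts_by_age maps an age in years (int) to the number of people
    of that age. Simplified illustration, not an official method.
    """
    total = sum(counts_by_age.values())
    if total <= 0:
        raise ValueError("total population must be positive")

    cumulative = 0
    for age in sorted(counts_by_age):
        cumulative += counts_by_age[age]
        # Stop as soon as at least half of the population is this age or younger.
        if 2 * cumulative >= total:
            return age


# Made-up figures for illustration only, not real statistics.
sample_population = {0: 10, 20: 30, 40: 25, 60: 20, 80: 15}
print(median_age(sample_population))
```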
The old-age dependency ratio is the ratio of people aged 65 or older to people aged 20 to 64 years old. A higher old-age dependency ratio indicates that, relative to the working-age population, there are more older individuals who are more likely to require healthcare and other social services. This indicator is used by policymakers and economists to anticipate the impact of population ageing on social security programmes, healthcare systems and the labour market.
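The ratio described above can be sketched in a few lines. The function name, its parameters and the sample figures are hypothetical; the age bounds follow the definition given in this article, and expressing the result per 100 people of working age is an assumption about presentation.

```python
def old_age_dependency_ratio(counts_by_age, working_min=20, working_max=64, old_min=65):
    """Return the number of people aged old_min or over
    per 100 people aged working_min to working_max.

    counts_by_age maps an age in years (int) to a number of people.
    """
    older = sum(n for age, n in counts_by_age.items() if age >= old_min)
    working = sum(
        n for age, n in counts_by_age.items() if working_min <= age <= working_max
    )
    if working == 0:
        raise ValueError("no people in the working-age group")
    return 100 * older / working


# Made-up figures for illustration only, not real statistics.
sample_counts = {10: 200, 30: 300, 50: 200, 70: 100, 85: 50}
print(old_age_dependency_ratio(sample_counts))
```

Children (age 10 in the sample) appear in neither the numerator nor the denominator, which is why the ratio can rise even when the total population is stable.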
Several breakdowns are available for these key variables. The population data can be disaggregated by sex and age, educational attainment, citizenship, country of birth and human development index – which according to the annex is ‘a regrouping of the country of birth and country of citizenship’. For some of the breakdowns, data is also offered up to the province (NUTS 3) level, allowing for a more granular analysis. This is also true for the median age and old-age dependency ratio indicators, while median age data can also be disaggregated by sex.
The law requires specific breakdowns to be made available. For example, it is mandatory for sex and age data to be available up to the province level. Sex and age data must also have a ‘Human Development Index’ disaggregation. As specified in the annex, Member States that meet certain conditions set out in the regulation must also offer other breakdowns such as citizenship or country of birth. Other key variables have a simpler structure, and for median age and the old-age dependency ratio, only data up to the province level or disaggregated by sex must be offered.
Demographic data on Eurostat
The population in the EU-27 aggregate has grown steadily since at least 1990. In 1990, 418 million people lived in those countries, a figure that rose to 446 million according to the latest estimates for 2022.
However, the rate of population growth gradually slowed down over time, and as Eurostat notes, the EU population increased on average by about 0.7 million people per year during the 2005–2022 period, compared with an average increase of around 3.0 million people per year during the 1960s. This slowdown, combined with the severe toll of the COVID-19 pandemic, led to an actual population decrease that started in 2021 and was also observed in 2022.
Positive population changes are driven by births and immigration, while negative changes are driven by deaths and emigration. Births minus deaths make up an indicator called ‘natural change’, which is the change in population excluding immigration and emigration.
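The relationship between these components can be written out as a short sketch. The class name `YearlyFlows`, its fields and the sample figures are hypothetical and used only to illustrate the arithmetic; published statistics may include further adjustments that are not modelled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class YearlyFlows:
    """Population flows for one area and one year (illustrative structure)."""
    births: int
    deaths: int
    immigration: int
    emigration: int

    @property
    def natural_change(self):
        # Births minus deaths: change excluding migration.
        return self.births - self.deaths

    @property
    def net_migration(self):
        # Immigration minus emigration.
        return self.immigration - self.emigration

    @property
    def total_change(self):
        # Overall population change is the sum of the two components.
        return self.natural_change + self.net_migration


# Made-up figures for illustration only, not real statistics.
flows = YearlyFlows(births=400, deaths=500, immigration=300, emigration=150)
print(flows.natural_change, flows.net_migration, flows.total_change)
```

In this made-up example the natural change is negative, yet the population still grows because net migration more than offsets it.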
The number of live births decreased progressively between 1960 and 1995, while the number of deaths slowly increased. The gap between live births and deaths in the EU narrowed considerably from 1961 onwards, and the natural change of the population became negative in 2012, when the number of deaths surpassed the number of births. Eurostat also highlights that ‘net migration in the EU increased considerably from the mid-1980s and was the main determinant of population growth since the 1990s’.
Eurostat data can be used to analyse the population in each of the roughly 1 500 provinces that make up the EU. This allows us to monitor regional and local trends and determine which provinces are experiencing growth in their population and which ones are shrinking.
Among the provinces in which population grew the most is Ilfov, an area surrounding Bucharest (Romania), where the number of inhabitants doubled over the span of 19 years. In the same period, two Spanish provinces also saw a large growth in their population, namely the islands of Fuerteventura and Formentera. Several other provinces in Spain experienced a similar trend, along with Luxembourg and Malta. On the other hand, in the province of Vidin (north-western Bulgaria), the population dropped by a little more than one third compared with 2003, with a similar reduction (in percentage terms) in the Latgale province (eastern Latvia) and in two provinces in Lithuania: Tauragė and Utena counties.
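A comparison of this kind rests on the percentage change between two years. The sketch below shows the calculation and a simple ranking; the function name, the placeholder region names and all figures are hypothetical and do not correspond to any real province.

```python
def percent_change(start, end):
    """Return the change from start to end as a percentage of start."""
    if start <= 0:
        raise ValueError("starting population must be positive")
    return 100 * (end - start) / start


# Made-up (start year, end year) populations for illustration only.
populations = {
    "Region A": (200_000, 400_000),  # population doubles
    "Region B": (90_000, 60_000),    # population falls by one third
    "Region C": (150_000, 156_000),  # modest growth
}

changes = {
    name: percent_change(start, end)
    for name, (start, end) in populations.items()
}

# Rank regions from fastest-growing to fastest-shrinking.
ranked = sorted(changes.items(), key=lambda item: item[1], reverse=True)
for name, change in ranked:
    print(f"{name}: {change:+.1f} %")
```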
The following visualisation shows this change starting from 2003, a year for which a large amount of province-level data is available.
Population data on data.europa.eu
Data concerning breakdowns of the yearly population high-value dataset can also be found on data.europa.eu. National authorities can upload their own data on the portal, which at times can reach an even higher level of detail compared to the data available on Eurostat. For example, a search on population age structure data, using the appropriate search keyword, leads to a high number of results.
This way, it is possible to learn about people living in the municipalities of Salzburg (Austria), in Portugal or in Spain. A dataset uploaded by the French Ithéa Conseil makes it possible to study the age structure of the population in French regions and cities. The dataset was used in the following visualisation to show the distribution of young and old people in the four largest French cities.
The marital status of the population is another field of particular interest, and one where we observe significant changes over the years. Data on this topic on the data.europa.eu portal can help us understand exactly how this indicator has been changing.
Interesting datasets on this topic cover the population of Helsinki since 2004 by district and subregion, the population of Czechia based on 2021 census data, and the population of Sweden since 2006. One particularly insightful dataset was uploaded by the data portal of the Dutch government and shows the marital status of people in the Netherlands since 1950.
The following visualisation shows how the marital status of people (married, single or divorced) changed over the last 70 years.
Another useful breakdown is about education. The educational attainment of the EU population markedly increased over time, and datasets on the data.europa.eu portal show how much.
One such example is a dataset about the inhabitants of Bilbao (Spain), disaggregated by neighbourhood and level of education. Another interesting dataset was uploaded by the Czech National Open Data portal and includes educational attainment data from the 2021 census.
Other demographic data providers
The two main sources of EU demographic data are Eurostat and national authorities. National authorities make data available on their website and some of it on the data.europa.eu portal as well, where its description is translated into English and other languages.
Other EU bodies that focus on demography include the Commission’s Directorate-General for Employment, Social Affairs and Inclusion. Using the appropriate search keyword on their website, for example, leads to several documents and publications about this topic, such as an analysis of fertility in Finland and many others.
Download the data visualisations presented in this story and the data behind them.
Article by Davide Mancino
Data visualisations by Federica Fragapane