EU Science Hub
General publications, Study, Report

Economies of scope in the aggregation of health-related data

Details

Identification
JRC nr: JRC125767
Publication date
14 September 2021

Description

Economies of scale in data aggregation are a widely accepted concept: prediction accuracy improves as the number of observations on the variables in a dataset increases. Economies of scope in data are more ambiguous. The classic economic interpretation refers to cost savings from re-using data for other purposes. Here, we introduce another interpretation, economies of scope in data aggregation: prediction accuracy improves as the number of complementary variables in a dataset increases, rather than the number of observations on those variables. If economies of scope in data aggregation exist, an aggregated pool of complementary variables is worth more than the sum of the values of the disaggregated datasets, because more and better insights can be extracted from the aggregated dataset.
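
One way to make the distinction concrete is the following notational sketch. The symbols are illustrative assumptions and are not taken from the report: a(n, k) stands for prediction accuracy with n observations and k complementary variables, V(D) for the value of a dataset D, and D_A, D_B for two disaggregated datasets holding complementary variables on the same units.

```latex
% Economies of scale: accuracy rises with the number of observations n,
% holding the number of variables k fixed.
\[
  a(n + 1,\, k) > a(n,\, k)
\]
% Economies of scope in data aggregation: accuracy rises with the number
% of complementary variables k, holding the number of observations n fixed.
\[
  a(n,\, k + 1) > a(n,\, k)
\]
% Consequence stated in the description: the aggregated pool is worth more
% than the sum of the values of the disaggregated datasets.
\[
  V(D_A \cup D_B) > V(D_A) + V(D_B)
\]
```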

Economies of scope in data aggregation are controversial in the economic research literature, partly because there is so far little empirical evidence for their existence. The objective of this project is to fill that gap. For this purpose, we create an aggregated data pool of health and health-related variables and run machine learning models on it to predict health outcomes. We gradually increase the number of independent variables in the model to estimate the magnitude of economies of scope in the aggregation of variables. Our findings confirm the existence of economies of scope in the aggregation of health and health-related variables for improving the prediction accuracy of health outcomes. The evidence is based on a nationwide household survey and medical consumption data from the Netherlands.
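
The procedure described above (adding independent variables step by step and tracking prediction accuracy) can be sketched as follows. This is a minimal illustration only, not the authors' method or data: it assumes synthetic data, a plain least-squares model in place of the report's machine learning models, and out-of-sample R^2 as the accuracy measure. The function name `r2_by_num_variables` and all parameters are hypothetical.

```python
import numpy as np


def r2_by_num_variables(X_train, y_train, X_test, y_test):
    """Out-of-sample R^2 of a least-squares fit on the first k columns, k = 1..p."""
    scores = []
    for k in range(1, X_train.shape[1] + 1):
        # Use only the first k variables, plus an intercept column.
        A_train = np.column_stack([np.ones(len(X_train)), X_train[:, :k]])
        A_test = np.column_stack([np.ones(len(X_test)), X_test[:, :k]])
        coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
        residual = y_test - A_test @ coef
        ss_res = float(residual @ residual)
        ss_tot = float(((y_test - y_test.mean()) ** 2).sum())
        scores.append(1.0 - ss_res / ss_tot)
    return scores


# Synthetic stand-in for an aggregated data pool (assumption, not survey data):
# five independent variables that each contribute equally to the outcome.
rng = np.random.default_rng(0)
n_obs, n_vars = 3000, 5
X = rng.normal(size=(n_obs, n_vars))
y = X.sum(axis=1) + rng.normal(size=n_obs)

# The number of observations stays fixed; only the number of variables grows.
split = 2000
scores = r2_by_num_variables(X[:split], y[:split], X[split:], y[split:])
for k, score in enumerate(scores, start=1):
    print(f"{k} variable(s): out-of-sample R^2 = {score:.3f}")
```

Because the sample size is held constant across steps, any gain in accuracy in this sketch comes from adding variables rather than observations, which is the distinction the description draws between scope and scale.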

Authors:

HOCUK Seyit, KUMAR Pradeep, MULDER Joris, PRUFER Patricia

Files

jrc125767.pdf
English
(4.32 MB - PDF)