The EU's digital finance agenda, in line with broader Commission policies on digital transition, aims to foster a level playing field in the financial sector and address challenges posed by digital transformation. Central to this vision is the EU Digital Finance Platform, an initiative designed to catalyse innovation and forge a unified market for digital financial services.
At the heart of this initiative is the establishment of a Data Hub, a secure environment for smooth data exchange among national supervisors and financial firms, aiming to advance data-driven innovation and establish trustworthy data-sharing systems.
To ensure compliance with confidentiality requirements, the European Commission has decided to build the Data Hub using synthetic data.
The JRC has evaluated the data synthesis software to ensure that the new dataset will be a valuable resource for firms while meeting confidentiality requirements.
The Directorate‑General for Financial Stability, Financial Services and Capital Markets Union (DG FISMA) asked the Joint Research Centre (JRC) to conduct tests and checks to evaluate the accuracy, anonymisation, and confidentiality of the synthetic data generated by a software package provided by Synthesized, a UK-based software company. The results indicate that the synthesised data successfully replicate the main patterns of the original data, with univariate distributions overlapping well.
The synthesis process makes any potential attack challenging, ensuring the confidentiality level necessary for sharing confidential information in the Data Hub. Overall, synthetic data offer a way for national supervisors to participate in the project without having to make the real data they hold accessible to any third party.
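To make the notion of univariate distributions "overlapping well" more concrete, the following is a minimal sketch of one common way to quantify it: the shared area of two normalised histograms. It is not the JRC's actual test procedure, which the article does not describe. The function name `histogram_overlap`, the number of bins and the simulated values are all illustrative assumptions.

```python
import random


def histogram_overlap(original, synthetic, bins=20):
    """Shared area of two normalised histograms (0 = disjoint, 1 = identical).

    Illustrative metric only; the article does not state which measure was used.
    """
    lo = min(min(original), min(synthetic))
    hi = max(max(original), max(synthetic))
    if hi == lo:
        # All values are identical, so the distributions coincide.
        return 1.0
    width = (hi - lo) / bins

    def normalised_counts(values):
        counts = [0] * bins
        for v in values:
            # Clamp the maximum value into the last bin.
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
        return [c / len(values) for c in counts]

    p = normalised_counts(original)
    q = normalised_counts(synthetic)
    # For each bin, keep the smaller share; the sum is the overlapping area.
    return sum(min(a, b) for a, b in zip(p, q))


# Simulated stand-ins for one numeric column (hypothetical data).
rng = random.Random(0)
original = [rng.gauss(100, 15) for _ in range(5000)]
good_synth = [rng.gauss(100, 15) for _ in range(5000)]  # same distribution
bad_synth = [rng.gauss(160, 15) for _ in range(5000)]   # shifted distribution

print("faithful synthetic column:", round(histogram_overlap(original, good_synth), 3))
print("distorted synthetic column:", round(histogram_overlap(original, bad_synth), 3))
```

A value close to 1 suggests that the synthetic column reproduces the marginal distribution of the original, and a value close to 0 suggests that it does not. A check of this kind says nothing about relationships between variables or about re-identification risk, which would need separate tests.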
The Data Hub will be launched at a dedicated event, "Digital Finance Platform - Launch of phase II & data hub", on 21 March 2024.
Background
Synthetic data are artificial data, generated to reproduce the characteristics and structure of the original data. At the JRC, various teams are actively investigating the potential applications of synthetic data across multiple sectors, extending beyond finance to include administrative data and other domains.
The exploration of synthetic data has indicated their potential to facilitate the use of specific public sector datasets that are not suitable as open data. By leveraging synthetic datasets, researchers and policymakers can access and analyse sensitive or restricted information without compromising individual privacy or data confidentiality.
For instance, recent activities of the JRC have shown promising results in fields such as common European data spaces, privacy preserving technologies and the generation of synthetic data through the use of generative AI.
These initiatives focus in particular on the scientific exploration of synthetic data applications to bolster policy development, the assessment of synthetic data representativeness, and the generation of population replicas, diseases and socio-economic variables. Such evidence is important for informing policymaking and ensuring robust, data-driven decision processes within the European Union while protecting sensitive data.
Related links
Digital Finance Platform - Launch of phase II & data hub
European Data Spaces - Scientific Insights into Data Sharing and Utilisation at Scale
Technological Enablers for Privacy Preserving Data Sharing and Analysis
Details
- Publication date: 20 March 2024
- Author: Joint Research Centre