TRUSTWORTHY DATA SPACES: WHO ARE THEY FOR?
The primary objective of the European data strategy has been to establish a unified data market that guarantees Europe's competitiveness and autonomy in the global landscape. Through the creation of shared European data spaces, the EU has aimed to make a larger pool of data accessible for use across economies and societies, while ensuring that the entities and individuals who generate the data maintain control over it.
What does this mean for people - individuals and communities - as well as for those making the EU data strategy happen?
Etoile Partners, EDDS's parent company, is part of TRUSTEE, an EU-funded project that proposes a secure-by-design Federated Platform for data exchange and computation. Data is everywhere, across all domains - from health to transportation, from energy to education - and it promises to generate knowledge that can steer better decisions and better outcomes for people. That last part is especially important in education.
At this year's Computers, Privacy and Data Protection Conference in Brussels, the TRUSTEE team presented the project's core values and principles, while Etoile Partners contributed by discussing the collective research on the legal and socio-ethical requirements that are preconditions for any data space development.
Watch the video or read some of the key points from the panel presentations below.
Who are the users of data?
Research entities, scientists, media, policymakers, the public, and industry can all be end-users within the EU strategy for open data markets. However, unlike the data capture and extraction typical of the Silicon Valley attitude, EU projects such as TRUSTEE seek to set up strict rules and conditions, first and foremost with end-users' rights and needs in mind.
Through TRUSTEE, we have sought to identify the complexities and tensions that emerge as more data is made available. These complexities and tensions arise across many levels. Some are known, some less so, and some we cannot even foresee.
Cross-sector or within-domain data sharing and use is of great interest to many stakeholders. However, several conditions should be addressed before any open data spaces or markets are created:
accountability and transparency - who would use what kind of data and data models, and how will these then be utilised?
external validation that legal, socio-ethical, security and other requirements, rules and conditions are met by the data providers, data space developers, and end-users of open data spaces
The scenarios for data use through open data spaces are endless and so are the risks and tensions.
From an end-user perspective, challenges can arise around data quality, security, and identity management, while end-users' interests may lie in multidisciplinary data exchange and secondary use.
There are also risks of data loss and of losing control over data.
Challenges can also arise across borders, domains, and jurisdictions, as well as at the local cultural level.
Cross-border and cross-domain data opportunities
Certainly, open data spaces and data use promise better research, timely interventions, advances in science, efficacy, impact, optimisation, discovery, and so on. While sensitivities related to privacy and security risks remain the biggest hurdles to overcome before data is used, it takes creativity and imagining all possible scenarios to understand exactly what needs to be done to protect individuals and then use data for good.
Here are two scenarios we have investigated at TRUSTEE:
In health and education, data exchange or processing is particularly sensitive. Yet education data, in combination with energy data, can reveal energy consumption and energy waste in a school or a school district. No specific sensitivity exists around such data, and the opportunities from processing it can be immense. Schools can identify potential energy waste based on their own school data. Such insight can help them improve how they use energy, and even educate staff and students about it.
Across other domains, data may be seemingly innocuous. In transportation, for example, data from sensor systems doesn’t necessarily include personal or sensitive data, and it doesn’t immediately raise legal and socio-ethical challenges. Yet combining it with health data makes it particularly challenging. Say there is a car accident: sensor data from the car may provide valuable information on the car's performance, but when combined with health data collected for the driver or any other person involved in the accident, it can potentially identify individuals and make all the available data very sensitive.
With health-related data, it is promising to know that we can combine it with environmental data to enhance research on lung cancer, for instance in relation to where individuals live and what kind of air they breathe. But again, the challenges and requirements involved in processing such data are multi-faceted, and no list of them can ever be exhaustive.
This requires not only technical innovation and viewpoints but also non-technical input. Non-technical expertise seeks to ensure accountability and traceability, to build trust in those who access, use, and manipulate the data, and to optimise future open data markets, platforms or spaces. Human-led and human-centred efforts can certainly be fortified through technologies (which is where EDDS drives auditing as a must before any digital systems settle as primary mediators of human processes). Furthermore, grounding the development of such spaces in fundamental human rights and ethical principles (when using data or developing and deploying AI models) is crucial. In practice, this means that 1) such projects should take time (unlike what has been happening with Silicon Valley tech attitudes) and 2) they are complex, and fixing them later would be harder than building them properly from the start.
All of this is to say that a new culture of understanding and action around data is necessary. There are layers of controls and requirements that are already identified and aligned at legal and socio-ethical levels. But legal and socio-ethical requirements must be given equal weight, since specific parts of data processing may be within the legal remit yet not necessarily ethical or beneficial for specific segments of society.
TRUSTEE has been fundamental in enhancing our understanding and in updating EDDS's framework for evaluating educational technologies, specifically around socio-ethical requirements, enforceability, and external verification of controls through state-of-the-art technologies.
The tensions that further emerge as AI functionalities are deployed have been fundamental to EDDS's vertical criteria for assessing algorithmic fairness (whereby the very definition of 'fairness' is contentious), both at the pre-design stage of developing educational technologies and throughout their use.
For more about TRUSTEE, read here.