Service Provider(s) Well-being district Varha, University of Turku, Turku University of Applied Sciences, Business Turku

SYNDATE – Test platform for synthetic health data

arho.virkki@varha.fi

Client Private or academic operators that produce synthetic and anonymous data
Duration 1-3 months
Price Ask for an offer
Start The service can begin within about a week of agreeing on the project, depending on the delivery schedule of the raw data.
Description Anonymisation is the process of transforming identifiable personal data into a form in which individuals can no longer be re-identified. Synthetic anonymous data, by contrast, resembles the original personal data but describes synthesized, artificial subjects rather than real individuals. Synthesis aims at the same result as conventional anonymisation: a representative anonymous sample of the original data. It has the additional benefit that arbitrary amounts of data can be generated. It also allows simulated variables and features to be added to support development activities that the original data does not adequately capture.
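As a rough illustration of how a fitted model can produce arbitrary amounts of artificial records, the sketch below fits a two-variable Gaussian model to a toy table and samples from it. It is a minimal example under stated assumptions, not the method used in the SYNDATE service: the function names, the Gaussian model, and the invented age and blood pressure columns are all hypothetical.

```python
import math
import random
import statistics


def fit_bivariate_gaussian(rows):
    """Estimate means, standard deviations and correlation of two columns."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mx) * (y - my) for x, y in rows) / (len(rows) - 1)
    rho = max(-1.0, min(1.0, cov / (sx * sy)))  # clamp against rounding error
    return {"mx": mx, "my": my, "sx": sx, "sy": sy, "rho": rho}


def sample_synthetic(model, n, rng):
    """Draw n artificial records from the fitted model (no real subjects)."""
    rho = model["rho"]
    records = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = model["mx"] + model["sx"] * z1
        # Mix z1 and z2 so that x and y keep the fitted correlation.
        y = model["my"] + model["sy"] * (rho * z1 + math.sqrt(1 - rho**2) * z2)
        records.append((x, y))
    return records


# Invented toy "original" data: (age, systolic blood pressure).
rng = random.Random(42)
original = []
for _ in range(500):
    age = rng.gauss(60, 10)
    sbp = 100 + 0.5 * age + rng.gauss(0, 8)
    original.append((age, sbp))

model = fit_bivariate_gaussian(original)
# The sample size is independent of the original: here ten times larger.
synthetic = sample_synthetic(model, 5000, rng)
```

Real health data would call for a far richer generative model; the point here is only that the number of sampled records is a free parameter once a model has been fitted.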

The quality of the synthesis is established by running the intended analyses on both the synthetic and the original data and checking that the results are consistent. In addition, it must be tested or logically proven that the original subjects can no longer be identified.
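The two checks can be sketched as follows. This is an illustrative example only, with hypothetical function names and made-up numbers: the "intended analysis" is assumed to be a Pearson correlation, and the distance to the closest original record is used as a simple heuristic signal. A large distance does not by itself prove anonymity.

```python
import math
import statistics


def pearson(rows):
    """Pearson correlation between the two columns of a list of pairs."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    num = sum((x - mx) * (y - my) for x, y in rows)
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den


def utility_gap(original, synthetic):
    """Absolute difference in the analysis result between the two data sets."""
    return abs(pearson(original) - pearson(synthetic))


def min_distance_to_original(original, synthetic):
    """Smallest Euclidean distance from any synthetic to any original record."""
    return min(math.dist(s, o) for s in synthetic for o in original)


# Made-up toy records: (age, systolic blood pressure).
original = [(50, 120), (60, 130), (70, 135), (80, 150)]
synthetic = [(52, 123), (61, 128), (69, 138), (78, 147)]

gap = utility_gap(original, synthetic)                 # small gap: analysis agrees
closest = min_distance_to_original(original, synthetic)  # 0.0 would mean a copied record
```

In practice the columns would be scaled before computing distances, and the acceptable thresholds for both checks would depend on the intended use of the data.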

The data is created in Varha’s Atolli service, a secure processing environment in which clients can store and process data in compliance with Finnish legislation on the secondary use of health and social data and the corresponding Findata regulations.

Clients are typically research groups or companies with one of the following needs:
1. The client has a data permit from the register keeper (e.g., Varha) to receive original register data into the audited and sealed Atolli secure processing environment. Synthetic data can then be generated in Atolli and exported to the client.
2. The client has a data permit approved by Findata, and pseudonymised patient data has been brought to Atolli. Synthetic data can be generated in Atolli and exported to the client.
3. The client brings their own data (not regulated by Findata or Varha), which is then synthesized and handed back to the client. This data must be anonymous or subject to consent.

Beyond these common needs, we can also accommodate more specific requests, such as post-processing of anonymous data, modification of data modalities using synthetic data methods, simulation of various hypothetical scenarios based on the data, and the creation of synthetic patients based on descriptions of clinical drug trials.

The price covers expert services and the costs of data acquisition, storage, and processing. Producing this service requires highly specialized expertise and a good understanding of the intended data use.

The data owner may require an agreement outlining the terms and conditions under which synthetic data may be used and shared. If the original material contains gene sequences, their treatment must be agreed upon separately.

The service typically includes:
– Kick-off meeting
– Obtaining the data and storing it in the Atolli analytical environment
– Synthesizing the data
– Ensuring the quality and anonymity of the data
– Comprehensive report of analyses and result files
– Results meeting