Service Provider(s) Well-being district Varha, University of Turku, Turku University of Applied Sciences, Business Turku

SYNDATE – Test platform for synthetic health data

arho.virkki@varha.fi

Client Private or academic operators that produce synthetic and anonymous data
Duration 1-3 months
Price Ask for an offer
Start The service can begin within about a week of agreeing on the project, depending on the delivery schedule of the raw data.
Description Anonymisation refers to the process of transforming identifiable personal data into a form in which individuals can no longer be identified. Synthetic, anonymous data, on the other hand, is material that resembles the original personal data, but its records do not refer to actual persons; they are generated by synthesis. Synthesizing aims to achieve the same result as row-level anonymization: the goal is to produce a representative, anonymous sample of the original material. An additional benefit of the technology is that an arbitrary amount of data can be produced and, if necessary, augmented by simulating variables and features that are needed in research and development but are not adequately described by the original data.

The quality of the synthesis is established in two ways: by testing that the intended analyses give results on the synthetic data that are consistent with those on the original material, and by testing, or by using methods that prove, that the original patients cannot be identified.
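As a rough illustration of these two kinds of checks, the sketch below compares one summary statistic between original and synthetic records (a utility check) and computes each synthetic record's distance to its closest original record (a simple privacy indicator). It is not the method used in the service: the function names, column names and toy records are hypothetical, and a non-zero distance alone does not prove anonymity.

```python
import math
import statistics


def mean_gap(original, synthetic, column):
    """Utility check: absolute difference in the mean of one numeric column."""
    orig_mean = statistics.mean(row[column] for row in original)
    synth_mean = statistics.mean(row[column] for row in synthetic)
    return abs(orig_mean - synth_mean)


def distance_to_closest_record(original, synthetic, columns):
    """Privacy indicator: for each synthetic row, the Euclidean distance
    to the nearest original row over the given numeric columns."""
    def dist(a, b):
        return math.sqrt(sum((a[c] - b[c]) ** 2 for c in columns))

    return [min(dist(s, o) for o in original) for s in synthetic]


# Toy, made-up records (hypothetical columns, not real patient data).
original = [
    {"age": 34, "systolic_bp": 120},
    {"age": 51, "systolic_bp": 135},
    {"age": 67, "systolic_bp": 150},
]
synthetic = [
    {"age": 36, "systolic_bp": 124},
    {"age": 49, "systolic_bp": 131},
    {"age": 70, "systolic_bp": 147},
]

age_gap = mean_gap(original, synthetic, "age")
dcr = distance_to_closest_record(original, synthetic, ["age", "systolic_bp"])
exact_copies = sum(1 for d in dcr if d == 0)  # synthetic rows identical to an original row

print(f"Mean age gap: {age_gap:.2f}")
print(f"Smallest distance to an original record: {min(dcr):.2f}")
print(f"Exact copies of original records: {exact_copies}")
```

In practice the utility side would rerun the analyses actually planned for the data, and the privacy side would rely on stronger tests or formal guarantees than a single distance measure.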

The material is created in Varha’s Atolli service, where you can store and process materials that are under Findata’s supervision.

Customers are assumed to be either research groups or companies. Patient data can be accessed for synthesis with a data permit for a single well-being area, with a data request approved by Findata, or by importing one’s own data. A customer’s need is typically one of the following:

  1. The client has a data permit, on the basis of which actual patient data (pseudonymized or with the correct identifiers) is delivered to the audited, sealed side of Atolli. The synthetic data generated from it is handed over to the client.
  2. The client has a data permit approved by Findata, and the pseudonymised patient data has been brought to Atolli’s audited side. The synthetic data generated from it is handed over to the client.
  3. The client brings their own data (not licensed by Findata or Varha), which is synthesized and handed over to the client. This data must be anonymous, be used on the basis of consent, or contain no personal data.

In addition to these, there may be more specific customer needs, such as post-processing of anonymous data, modification of modalities using synthetic data methods, simulation of various hypothetical situations based on the data, and creation of synthetic patients based on descriptions of clinical drug trials.

The service price consists of expert services and any costs of acquiring, storing and processing the material. Producing the service is demanding expert work, and becoming familiar with the research question at hand takes time.

Owners of original material may have requirements for data access rights. Restrictions on the use of original material may be required, e.g. as to who is allowed to process it or how long it is in use, and it may be required to be destroyed after a certain period of time. In addition, obtaining access to the original material may require collaboration with the experts of the owner of the material for the duration of the project. The ways in which potential publications resulting from the analysis refer to the Syndate service remain to be determined.

The data owner may require an agreement on the terms and conditions under which synthetic data may be used and shared. The original material may also contain gene sequences. The law does not regulate these precisely, and methods for synthesizing them are less advanced than those for tabular and image data, so their treatment must be agreed upon separately.

The service typically includes:

– Kick-off meeting
– Obtaining the material or bringing it into an analytical environment
– Synthesizing the material
– Ensuring the quality and anonymity of the material
– Comprehensive report of the analyses performed, with result files
– Results meeting