


Service Provider(s) | Well-being district Varha, University of Turku, Turku University of Applied Sciences, Business Turku |
Client | Private or academic operators that produce synthetic and anonymous data |
Duration | 1-3 months |
Price | Ask for an offer |
Start | The service can start within about a week of agreeing on the project, depending on the delivery schedule of the raw data. |
Description | Anonymisation is the process of transforming identifiable personal data into a form in which individuals can no longer be re-identified. Synthetic anonymous data, by contrast, resembles the original personal data but describes synthesized, artificial subjects rather than real individuals. Synthesis aims for the same result as conventional anonymisation: a representative, anonymous sample of the original data. It has the additional benefit that arbitrary amounts of data can be generated. It also allows simulated variables and features to be added to support development activities that the original data does not adequately capture.
The quality of the synthesis is established by checking that the intended analyses give results on the synthetic data that are consistent with those obtained from the original data. In addition, it is necessary to test or logically prove that the original subjects can no longer be identified. The data is created in Varha’s Atolli service, a secure processing environment in which clients can store and process data in compliance with the Finnish Act on the Secondary Use of Health and Social Data and the corresponding Findata regulations.
Customers are typically research groups or companies with one of the following needs:
Beyond these common needs, we can also accommodate more specific requests, such as post-processing of anonymous data, modification of data modalities using synthetic data methods, simulation of various hypothetical scenarios based on the data, and the creation of synthetic patients based on descriptions of clinical drug trials.
The price covers expert services and the costs of data acquisition, storage, and processing. Delivering the service requires highly specialized expertise and a good understanding of the intended use of the data. The data owner may require an agreement setting out the terms and conditions under which the synthetic data may be used and shared. If the original material contains gene sequences, their treatment must be agreed upon separately.
The service typically includes: |
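
The two quality requirements in the description (consistency with analyses on the original data, and no re-identification of original subjects) can be illustrated with a minimal Python sketch. It is not the method used in the service: the function names, the toy records, and the choice of checks (per-column mean gap, exact-copy count, nearest-record distance) are assumptions made for illustration only, and passing such checks does not by itself prove anonymity.

```python
import math
import statistics


def column_mean_gap(original, synthetic):
    """Utility check: largest absolute gap between per-column means."""
    return max(
        abs(statistics.fmean(col_o) - statistics.fmean(col_s))
        for col_o, col_s in zip(zip(*original), zip(*synthetic))
    )


def exact_copies(original, synthetic):
    """Privacy check: number of synthetic rows identical to an original row."""
    originals = {tuple(row) for row in original}
    return sum(tuple(row) in originals for row in synthetic)


def min_distance_to_original(original, synthetic):
    """Privacy check: smallest Euclidean distance from a synthetic row to an original row."""
    return min(math.dist(s, o) for s in synthetic for o in original)


# Toy numeric records (hypothetical values, two measurements per subject).
original = [(1.0, 2.0), (3.0, 4.0)]
synthetic = [(2.0, 2.0), (2.0, 5.0)]

print(column_mean_gap(original, synthetic))           # 0.5
print(exact_copies(original, synthetic))              # 0
print(min_distance_to_original(original, synthetic))  # 1.0
```

In practice the utility check would be replaced by the analysis the client actually intends to run, and the acceptable thresholds for both kinds of check would be agreed per project.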