Synthetic data puts privacy at the heart of AI projects
Synthetic data has become an invaluable tool, supplying machine learning and predictive analytics models with tabular or visual information without compromising privacy. It is used across many industries, for purposes such as sensor training for self-driving cars and simulating market trends in the financial sector.
Synthetic data replicates the mathematical and statistical properties of its source while preserving correlations between variables. As a result, real-world trends are accurately reflected in the generated data sets, and no personal information can be extracted from the results. This is a marked improvement on merely de-identified data, which can still be linked back to the source.
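To make the idea of replicating statistical properties and correlations concrete, here is a minimal sketch in Python. It is not the method used by any tool mentioned in the article. It assumes, purely for illustration, that two numeric columns can be modelled as a bivariate normal distribution; the column names, the sample values, and the helper functions `pearson` and `synthesize` are all hypothetical. It illustrates only the statistical side, not the privacy protections described above.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def synthesize(xs, ys, n, seed=0):
    """Draw n synthetic (x, y) pairs from a bivariate normal fitted to the source.

    Assumption: the source columns are roughly jointly Gaussian. Real
    generators typically use richer models than this.
    """
    rng = random.Random(seed)
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    rho = pearson(xs, ys)
    # Guard against a tiny negative value caused by floating-point rounding.
    residual = math.sqrt(max(0.0, 1.0 - rho ** 2))
    pairs = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mx + sx * z1
        # Mixing z1 into y is what carries the source correlation over.
        y = my + sy * (rho * z1 + residual * z2)
        pairs.append((x, y))
    return pairs


# Hypothetical source records (made-up values, not real patient data).
ages = [23, 35, 41, 52, 60, 68, 29, 47, 55, 73]
blood_pressure = [112, 118, 121, 130, 135, 142, 115, 126, 133, 147]

synthetic = synthesize(ages, blood_pressure, n=5000)
syn_ages = [p[0] for p in synthetic]
syn_bp = [p[1] for p in synthetic]

print("source correlation:   ", round(pearson(ages, blood_pressure), 3))
print("synthetic correlation:", round(pearson(syn_ages, syn_bp), 3))
```

None of the generated rows is copied from the source list, yet the means, spreads, and correlation of the two columns stay close to the originals, which is the property the paragraph above describes.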
Synthetic data has the potential to revolutionize medical research: not only can it reduce bias by simulating patients from underrepresented groups, but it also offers an answer to the inconsistent results often seen in pediatric and rare disease studies due to small patient numbers. Together, these benefits make a persuasive case for synthetic data, showing how far the field has come and how much farther it can go.
Source: Synthetic data puts privacy at the heart of AI projects – The Globe and Mail