Synthetic data – the use of AI to create datasets that mimic real-world data – is rapidly becoming a much bigger part of our daily lives. But this form of data raises critical philosophical and ethical questions that will shape the future for all of us, write Mikkel Krenchel and Maria Cury.
There’s a data revolution happening and nobody is talking about it. It revolves around synthetic data. Unless you work in the field of artificial intelligence (AI), you may never have heard of it. But this rapidly growing form of data raises critical philosophical and ethical questions that will shape the future for all of us.
First, what is synthetic data? There are many types, but the basic premise is the use of AI to create datasets that mimic real-world data. These datasets can then feed the insatiable appetite for data needed to train machine learning algorithms to make better predictions. Instead of training algorithms on messy, expensive real-world data riddled with privacy issues and bias, one can now supplement or supplant it with “better,” “cheaper,” or “bigger” datasets constructed using AI. Put simply, synthetic data is artificial data feeding artificial intelligence. It’s similar to deepfakes, yet used for less nefarious purposes, and it applies not only to videos and images but to any type of data under the sun, from insurance data and army intelligence to data for self-driving vehicles and even patient health care records. It is as awe-inspiring as it is terrifying.