The term ‘big data’ is increasingly used as a buzzword in the medical literature. However, expectations of what counts as ‘big’ diverge greatly between clinicians and AI engineers. Large data sets are hard to come by, and sharing them often raises concerns about data security and the protection of patient identity.

Generative adversarial networks (GANs) can create synthetic samples, provided a sufficiently large training cohort of ‘real’ samples is available. In our lab, we use GANs to generate synthetic bone marrow images. These images can augment the training cohorts of deep learning classifiers independently of any specific data source while protecting patient identity.

So far, our GAN-based bone marrow smear images (depicted above) have been evaluated by more than 10 experts in cytomorphology. The experts could not reliably distinguish real from synthetic images (AUROC 0.62, where 0.5 corresponds to chance), which provides evidence for the quality of the synthetic images.
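For readers less familiar with the metric, the AUROC can be read as the probability that a randomly chosen real image receives a higher ‘real’ rating than a randomly chosen synthetic one. The short Python sketch below illustrates that definition. It is not necessarily the evaluation code behind the number above: the function name `auroc`, the 1-to-5 rating scale, and the toy ratings are all hypothetical and chosen only for illustration.

```python
def auroc(labels, scores):
    """Return the AUROC via pairwise comparison (Mann-Whitney U statistic).

    labels: 1 for a real image, 0 for a synthetic image.
    scores: a rater's confidence that the image is real (higher = more real).
    Ties between a real and a synthetic image count as half a win.
    """
    real = [s for y, s in zip(labels, scores) if y == 1]
    synthetic = [s for y, s in zip(labels, scores) if y == 0]
    if not real or not synthetic:
        raise ValueError("need at least one real and one synthetic image")

    wins = 0.0
    for r in real:
        for s in synthetic:
            if r > s:
                wins += 1.0
            elif r == s:
                wins += 0.5
    return wins / (len(real) * len(synthetic))


# Hypothetical toy ratings on an assumed 1-5 scale; NOT data from our study.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [5, 3, 4, 2, 3, 4, 1, 2]
print(f"AUROC = {auroc(labels, scores):.2f}")
```

A value near 0.5 means the ratings carry almost no information about whether an image is real, while a value near 1.0 means raters separate the two groups almost perfectly.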

We hope to share our results with you in a publication shortly. Stay tuned!