Structured generative models for vision

Dr. Paul Henderson

16:40 27 July 2022

University of Glasgow | School of Computing Science

Most deep learning techniques for imaging and vision tasks require large amounts of labelled data. In this talk, I shall discuss an alternative: unsupervised learning using structured generative models. Like GANs and VAEs, these structured models learn the distribution of a dataset and can generate new samples. Unlike GANs and VAEs, they learn an interpretable latent space and a decoder that explicitly assigns semantics to the latent variables, for example the positions and 3D shapes of the objects visible in the image. I shall discuss how this makes it possible to predict 3D shapes, object segmentations, and depth maps from a video, or even a single image, without receiving supervision for any of those tasks.