Once the proof of concept had been established, I started implementing the architecture in the image domain. The basic objective is to understand which images the user likes and to display only those pictures to the user. For this we used a Siamese network that learned from liked pairs and disliked pairs of images. This information helped us generate an embedding space in which each user's preferences appear as a cluster. When the architecture starts learning about a new user, the feedback from previous users helps it learn with minimal interaction. To model the clusters, a Gaussian mixture model is used to generate multiple soft clusters. Soft clustering gives us the flexibility to let a new user's preferences span multiple clusters. A conditional generative adversarial network takes a vector drawn from the Gaussian mixture, in the embedding space produced by the Siamese network, as its conditional input and outputs an image for the user to inspect. The predictive model then interacts with the new user and, based on their feedback, adjusts the Gaussian mixture to find the distribution with the highest probability of generating the user’s preferred data. The goal is to learn to generate images that match the user's preferences using only a minimal number of direct user interactions. Testing in this domain has shown promising results that demonstrate the approach's ability to capture the user’s preferences while presenting only a small number of image examples.
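To make the feedback loop more concrete, here is a minimal sketch in plain Python of how a Gaussian mixture could be re-weighted from like/dislike feedback. It is an illustration under stated assumptions, not the implementation from the paper: the 2-D embedding, the isotropic variance, the multiplicative `update_weights` rule, the `rate` parameter, and the simulated user in `run_session` are all hypothetical, and the Siamese network and conditional GAN are left out entirely (the sampled vector stands in for the generated image).

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of an isotropic Gaussian with variance `var` at point x."""
    d = len(x)
    sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return math.exp(-sq_dist / (2 * var)) / ((2 * math.pi * var) ** (d / 2))


def responsibilities(x, weights, means, var):
    """Soft cluster assignment: posterior probability of each component given x."""
    joint = [w * gaussian_pdf(x, m, var) for w, m in zip(weights, means)]
    total = sum(joint)
    if total == 0.0:  # x is far from every cluster; fall back on the prior weights
        return list(weights)
    return [j / total for j in joint]


def update_weights(weights, resp, liked, rate=0.5):
    """Assumed update rule (not from the paper): move probability mass toward
    the components responsible for a liked sample, away from them otherwise."""
    sign = 1.0 if liked else -1.0
    raw = [w * (1.0 + sign * rate * r) for w, r in zip(weights, resp)]
    total = sum(raw)
    return [r / total for r in raw]


def sample(weights, means, var, rng):
    """Draw one embedding vector from the mixture."""
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return [rng.gauss(m, math.sqrt(var)) for m in means[k]]


def run_session(seed=0, rounds=30):
    """Simulate a short interaction with a hypothetical user who likes
    anything whose embedding lies close to the second cluster."""
    rng = random.Random(seed)
    means = [[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]]  # clusters from earlier users
    weights = [1 / 3, 1 / 3, 1 / 3]                # no prior knowledge of the new user
    var = 1.0
    for _ in range(rounds):
        z = sample(weights, means, var, rng)       # vector that would condition the GAN
        liked = math.dist(z, means[1]) < 3.0       # simulated feedback
        resp = responsibilities(z, weights, means, var)
        weights = update_weights(weights, resp, liked)
    return weights


if __name__ == "__main__":
    print(run_session())
```

Because the responsibilities are soft, a sample that falls between two clusters shifts mass on both, which is what lets a new user's preferences span more than one cluster.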
To learn more, check out the paper.