Art & AI
Generating imagination: AI-powered sketches for art, animation, and innovation
Project Type
Research
Role
Research Assistant
Timeframe
Aug 2019 - May 2021
Tools
Procreate, RunwayML, Snapchat Lens Studio, GitHub

Overview
The project explored machine learning models for two purposes: generating datasets for image-generation tasks, and animating still images with audio to create moving portraits. The results can be used in creative work (e.g., films and concept art) or in machine learning dataset creation.
1. Empathize
- Datasets are an important part of machine learning but take time to build and curate.
- Artists look for references or stylistic choices when creating new artwork.
- Generative Adversarial Networks (GANs) can generate novel images for both dataset creation and artwork creation.
- GANs can help artists create new experiences in less time.
The goal of our output is to enhance an artist’s work, not replace it.
2. Define
The semester project explored StyleGAN2-ADA (https://github.com/NVlabs/stylegan2-ada) by NVIDIA and Data-Efficient GANs (https://github.com/mit-han-lab/data-efficient-gans) by the MIT HAN Lab on a custom dataset of 100 hand-drawn sketches based on Disney character faces.
The output is intended either as a dataset for training machine learning models or as sketch design iterations that artists can use as inspiration, reference, or a final product.
The work falls under the 2D art category, with an added exploration of animating the trained model's output sketches from an audio input using MakeItTalk: Speaker-Aware Talking-Head Animation by Adobe Research (https://github.com/adobe-research/MakeItTalk).
3. Iterate
The first step was to train a StyleGAN2 architecture on a small number of images hand-drawn from Disney character faces to see how it would perform.
Procreate was used to draw the Disney face sketches with a pen brush for better line quality, as the team ran into issues with light lines and pencil strokes not being picked up by the GAN. After experiencing mode collapse, where the model generates only a limited variety of samples, the team increased the dataset size and quality with data-cleaning practices such as cropping to faces.
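Part of that cleanup was standardizing the sketches to square, uniformly sized images, since StyleGAN2-ADA expects square inputs at a power-of-two resolution. A minimal NumPy sketch of a center crop followed by a nearest-neighbour resize (the function names are illustrative, not from the project's actual pipeline):

```python
import numpy as np

def center_crop_square(img: np.ndarray) -> np.ndarray:
    """Crop an (H, W, C) image to its largest centered square."""
    h, w = img.shape[:2]
    side = min(h, w)
    top = (h - side) // 2
    left = (w - side) // 2
    return img[top:top + side, left:left + side]

def resize_nearest(img: np.ndarray, size: int) -> np.ndarray:
    """Nearest-neighbour resize of a square image to (size, size)."""
    side = img.shape[0]
    idx = np.arange(size) * side // size  # source index for each output pixel
    return img[idx][:, idx]

# Example: a 300x200 grayscale sketch becomes a 256x256 training image.
sketch = np.random.randint(0, 256, (300, 200, 1), dtype=np.uint8)
prepared = resize_nearest(center_crop_square(sketch), 256)
```

In practice a library resize (PIL, OpenCV) with anti-aliasing is preferable for line art; the sketch above only shows the shape logic.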
Access the PDF to view a sampling of detailed notes on the process.

4. Prototype
StyleGAN2-ADA is written in Python, and the Runway SDK module, which is used to import models into RunwayML, supports Python 3.6. The RunwayML model is available here: https://app.runwayml.com/models/akhaliq/stylegan2art.
The model was trained on the sketch dataset in a Google Colab notebook using a V100 GPU. The StyleGAN output was then used to train a lens in the Snapchat Lens Studio platform.
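When sampling images from a trained StyleGAN-family model, generators typically apply the truncation trick: intermediate latents are pulled toward the average latent, trading diversity for sample quality. A toy NumPy sketch of just that step (the mapping network below is a random stand-in, not the project's trained weights):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the trained mapping network f: z -> w
# (in StyleGAN2 this is an 8-layer MLP).
W_map = rng.standard_normal((512, 512)) * 0.02

def mapping(z: np.ndarray) -> np.ndarray:
    return np.tanh(z @ W_map)

# Average w, estimated from many random z samples
# (StyleGAN tracks this during training as w_avg).
w_avg = mapping(rng.standard_normal((10_000, 512))).mean(axis=0)

def truncate(w: np.ndarray, psi: float) -> np.ndarray:
    """Truncation trick: psi=1 keeps w unchanged, psi=0 collapses to w_avg."""
    return w_avg + psi * (w - w_avg)

z = rng.standard_normal((1, 512))
w = mapping(z)
w_trunc = truncate(w, psi=0.7)  # a commonly used psi for cleaner samples
```

Lower psi values would have been one way to get cleaner face sketches out of a model trained on only 100 images, at the cost of variety.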
The resulting lens is published here: https://www.snapchat.com/unlock/?type=SNAPCODE&uuid=3f299e1ddbf142c0858bebaffbf0e62a&metadata=01

5. Test
The goal of this project was to make the distribution of generated images as realistic as possible given the small dataset, so that it would be useful to artists, animators, and machine learning practitioners.
The metric used was the Fréchet Inception Distance (FID), which is commonly used to evaluate GANs. FID measures the distance between two sets of images; the lower the score, the more similar the generated images are to the real ones. The team also released the trained model on RunwayML for users around the world to access and combine with other RunwayML projects.
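Concretely, FID fits a Gaussian to the Inception features of each image set and computes the Fréchet distance between the two Gaussians. A sketch of just that distance computation with NumPy/SciPy, assuming the feature means and covariances have already been extracted:

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between Gaussians N(mu1, sigma1) and N(mu2, sigma2)."""
    diff = mu1 - mu2
    covmean = linalg.sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):  # discard tiny imaginary parts from sqrtm
        covmean = covmean.real
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))

# Identical distributions score 0; shifting the mean by 1 in one dimension
# with identity covariances yields a distance of exactly 1.
mu, sigma = np.zeros(4), np.eye(4)
```

Real FID implementations obtain mu and sigma from Inception-v3 activations over thousands of images; this fragment only shows the closed-form distance.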
The lowest FID score achieved was around 70 on a training run, which is a reasonable result for training on only 100 images. The dataset of hand-drawn sketches is also available on Playform for training models.
