Is Stable Diffusion Inpaint Obsolete Versus InstructPix2Pix? (Add Glasses to Face)

New tech and methods are always exciting, but I have started to think old methods that endure are still the best.

I recently received a request to find out how to put sunglasses on an image.

I tried two methods: Inpaint and InstructPix2Pix.

The goal is to put sunglasses on the portrait. Black sunglasses with minimal reflections. Like these images of Jason Statham:

Base Image

When doing something experimental, I like using the least complicated images.

In the case of sunglasses, we don’t want long hair making it difficult for the AI.

Having hair would mean the AI has to find a way to hide the frame behind the hair. That makes it a bit more difficult.

And when you make things difficult for AI, you’ll need to do more Photoshopping.

Anyway, here’s the base image: a bald guy.

bald guy cropped for stable diffusion

Inpainting In Stable Diffusion

Inpainting in Stable Diffusion is an image-editing technique for restoring missing parts of a picture or creating entirely new elements within an existing image.

It’s the basis of how you can do a face swap in Stable Diffusion.

In this case, the guy isn’t wearing sunglasses, but we want him to.

putting sunglasses on a person in stable diffusion

It’s pretty simple. Just mask the area where the sunglasses will go.

I am using Realistic Vision 2.0 as a model and a simple prompt “sunglasses on face”. I tried adding “black” but I still got multi-coloured sunglasses.

One thing I like to do is to create a CFG-Denoising plot. I’ve got a guide on how to set it up here.

You’ll see why:

sunglasses on face cfg denoising plot stable diffusion

In fact, setting an X/Y plot up is so important that I’ve already set it up so it appears by default.

As you can see, the middle CFG and middle denoising values produce the best images.

Too little denoising and nothing happens. Too much and it becomes a face in a face.
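The grid behind a plot like this is simply every pairing of a CFG value with a denoising value. Here is a minimal sketch in Python, assuming illustrative ranges. The numbers below are placeholders of mine, not the ones used for the plot above, and axis_values is a hypothetical helper name.

```python
from itertools import product

# Hypothetical sweep values; the article does not state the exact ranges used.
cfg_scales = [4, 7, 10, 13]
denoising_strengths = [0.2, 0.4, 0.6, 0.8]

# One (cfg, denoising) pair per cell of the plot.
grid = list(product(cfg_scales, denoising_strengths))


def axis_values(values):
    """Join values into a comma-separated string for pasting into an axis field."""
    return ", ".join(str(v) for v in values)


print("X values (CFG Scale):", axis_values(cfg_scales))
print("Y values (Denoising):", axis_values(denoising_strengths))
print("Images to render:", len(grid))
```

Four values on each axis already means sixteen renders, so it pays to keep the ranges tight around the middle values once you know roughly where the good results are.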

These are some hopefuls from this Inpainting method.

InstructPix2Pix and Adding Sunglasses

Next, I’ll try the InstructPix2Pix method.

Today, you don’t even need to install an extension for Automatic1111. You just need to download the checkpoint and put it into your Stable Diffusion models folder. I used the .ckpt file rather than the .safetensors file.

Steps:

  1. Download the checkpoint and put it into your Stable Diffusion models folder
  2. Go into Stable Diffusion (make sure to refresh UI)
  3. Select the instructpix2pix checkpoint
  4. Wait for it to load
  5. Go to img2img and type in a prompt in conversational English.

I prompted “put sunglasses on him”
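If you would rather script this than click through the UI, the same request can be described as a JSON body. The sketch below only builds that body and does not send it. It assumes the web UI was started with its API enabled and that the field names match the /sdapi/v1/img2img schema of your install, so check both before relying on it. build_ip2p_payload is a hypothetical helper, and the default guidance values are placeholders.

```python
import base64
import json


def build_ip2p_payload(image_bytes, prompt, seed=3,
                       image_cfg_scale=1.5, cfg_scale=7.5):
    """Assemble a JSON-serialisable img2img request body.

    Assumption: field names follow the web UI's img2img API schema;
    verify them against the API docs page of your own install.
    """
    return {
        # The source image travels as a base64-encoded string.
        "init_images": [base64.b64encode(image_bytes).decode("ascii")],
        "prompt": prompt,
        "seed": seed,
        "cfg_scale": cfg_scale,              # text guidance (placeholder value)
        "image_cfg_scale": image_cfg_scale,  # InstructPix2Pix image guidance (placeholder value)
    }


# Placeholder bytes stand in for the real portrait file.
payload = build_ip2p_payload(b"not-a-real-image", "put sunglasses on him")
print(json.dumps(payload, indent=2))
```

In real use you would read the portrait from disk in binary mode and pass those bytes in place of the placeholder.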

instructpix2pix stable diffusion put sunglasses

Run an Image CFG Plot in InstructPix2Pix

I chose to run a graph plot for Image CFG, a variable that’s specific to InstructPix2Pix.

I fixed the seed to “3”, and then ran an Image CFG plot from 0.1 to 2.8 in steps of 0.3 (ten values, spanning a range of 2.7).
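For reference, the sweep values can be generated rather than typed by hand. This is a small sketch using only the start, step and end values given above. The step count is worked out first and each value is rounded to one decimal place, so floating-point drift does not creep into the list.

```python
# Sweep described in the text: start at 0.1, step by 0.3, stop at 2.8.
start, step, stop = 0.1, 0.3, 2.8

# Derive the number of values, then round each one to avoid float drift.
count = round((stop - start) / step) + 1
image_cfg_values = [round(start + i * step, 1) for i in range(count)]

print(", ".join(str(v) for v in image_cfg_values))
# prints: 0.1, 0.4, 0.7, 1.0, 1.3, 1.6, 1.9, 2.2, 2.5, 2.8
```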

Here’s the output:

Interesting. It seems that at lower Image CFGs, the whole image gets changed.

low image cfg instructpix2pix

Who dat?!

At higher Image CFGs, the original image is preserved, as below.

high image cfg instructpix2pix

But Wait, There Are No Temples on the Glasses

Unlike the inpainting method, InstructPix2Pix doesn’t seem to connect the glasses to the ears.

This is a bit of a problem because I’d have to do further adjustments to the image in order to get to the same outcome as the inpainting method.

Let’s Refine the Inpainting Method

My goal is to have black sunglasses that look elegant, befitting the guy in the image.

Let’s try drawing a more precise inpainting mask.

a more precise sunglasses img2img stable diffusion

Out of a batch of four, I found this to be the best.

yellow sunglasses inpainting

Making the Sunglasses Black in img2img

I just can’t seem to get the sunglasses black. Perhaps I need to emphasize specific parts of my prompt.

The old prompt was “black sunglasses”.

Let’s try three brackets surrounding “black sunglasses”.

The new prompt is (((black sunglasses))).
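In the Automatic1111 web UI, each pair of round brackets multiplies the attention on the enclosed text by 1.1, so three pairs work out to roughly 1.33. Here is a small sketch of that arithmetic. The function names are mine, for illustration only.

```python
def paren_weight(depth, factor=1.1):
    """Attention multiplier for text wrapped in `depth` pairs of round brackets."""
    return factor ** depth


def explicit_form(text, depth):
    """Rewrite nested brackets as the equivalent (text:weight) syntax."""
    return f"({text}:{paren_weight(depth):.2f})"


print(explicit_form("black sunglasses", 3))
# prints: (black sunglasses:1.33)
```

Writing the weight out explicitly makes it easier to nudge the emphasis up or down in small steps than adding or removing whole pairs of brackets.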

It also turns out that having “B&W” in your negative prompt will affect the black colour of the glasses. Once I removed “B&W”, I got much better chances of getting black sunglasses.

Photoshop as Final Step to the img2img Process

The final step is to use Photoshop. The image above seems OK, but the lenses of the sunglasses look a bit too pale.

Let’s put some darkening curves on the lenses.

And here’s the result:

photoshop sunglasses onto image portrait

A Workflow for Adding Sunglasses Onto Image in Stable Diffusion

My thoughts so far: you’d do well to try both methods and see what you get.

Remember, this image is just one of the many images that you might need to put sunglasses onto.

You might get a person with hair. You might get a person looking to the side.

Basically, both methods above can be applied in any of these scenarios, but how well each one works will depend on the image you start with, such as the hair and the pose.

Here’s a general overview of the steps I’d take:

  1. Study the input image
  2. Try InstructPix2Pix FIRST. Satisfied? Job done.
    • Why IP2P first? Because it’s the lowest-effort method. You just need to type a simple prompt.
  3. If the results aren’t very good, run an Image CFG plot, and if still unsatisfied, run a Denoising and/or CFG Scale plot
  4. Try inpainting next. Draw several masks — try one shaped like glasses, and another covering the eyes-to-ear portion.
  5. Run CFG/Denoising plots if unsatisfied.
  6. Photoshop your best image

And that’s how I’d do it.
