Stable Diffusion is amazing for marketing and other digital professions.
Here’s another tutorial on how I used Stable Diffusion to generate a cartoon character for a voice artist, Su-Lin Jones.
I made the logo above with Stable Diffusion, combining a cartoon representation of the voice artist with two other AI-generated elements — the mic and the headphones.
The placard’s handwriting is mine.
Generating a character
I used the TFM American Cartoons model for all of this.
Get a real image. Run it through an X/Y plot. Use the appropriate model.
I used the image in the red rectangle:
Make sure to adjust your pixel dimensions to suit whatever your input image is.
Then, prompt and pray. I detailed the procedure here.
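For the curious, the “prompt and pray” img2img step looks roughly like this in code. A minimal sketch using the Hugging Face diffusers library, assuming you have torch, a CUDA GPU, and a local copy of the TFM American Cartoons checkpoint (the file path and prompts below are illustrative, not the exact ones I used):

```python
# Illustrative prompts -- swap in your own.
PROMPT = "cartoon portrait of a woman, smiling"
NEGATIVE = "photo, realistic, blurry"

def cartoonify(photo_path, model_path="tfm_american_cartoons.safetensors",
               strength=0.6, seed=0):
    """Run img2img over a real photo, keeping the input's pixel dimensions."""
    import torch
    from PIL import Image
    from diffusers import StableDiffusionImg2ImgPipeline

    pipe = StableDiffusionImg2ImgPipeline.from_single_file(
        model_path, torch_dtype=torch.float16
    ).to("cuda")
    init = Image.open(photo_path).convert("RGB")
    # Round dimensions down to multiples of 8, which Stable Diffusion requires.
    w, h = (d - d % 8 for d in init.size)
    init = init.resize((w, h))
    gen = torch.Generator("cuda").manual_seed(seed)
    return pipe(PROMPT, negative_prompt=NEGATIVE, image=init,
                strength=strength, generator=gen).images[0]
```

On a GPU machine you’d call `cartoonify("photo.jpg").save("cartoon.png")`. A lower `strength` keeps more of the original photo; a higher one gives the model more freedom.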
Su-Lin chose this.
But her sleeves were cut off.
So take it into Photoshop and extend her sleeves.
I basically used the Pen tool to extend the sleeves’ strokes. Then I cloned the purple colour into the empty areas.
I also removed the stray strands of hair.
Upscale the AI-generated image
At 578x477px, it’s a bit small.
I sent it to the upscaler and got a 4x image. It came out at about 2300px lengthwise.
The results from upscaling were pretty good.
If you can’t tell the difference between the image above and below, then point proven.
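For reference, a 4x upscale is easy to express in code. A minimal sketch using Pillow’s Lanczos resampling as a non-AI baseline — note that an ESRGAN-style upscaler invents plausible new detail, whereas this only interpolates:

```python
from PIL import Image

def upscale_4x(img):
    """Plain 4x Lanczos resize -- a non-AI baseline for comparison only.
    An AI upscaler hallucinates sharp new detail; this just interpolates."""
    w, h = img.size
    return img.resize((w * 4, h * 4), Image.LANCZOS)

# A 578x477px input becomes 2312x1908px -- "about 2300px lengthwise".
```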
Generating a microphone
Always start with “prompt and pray”.
I instructed Stable Diffusion with this prompt:
a microphone on a table
And negative prompt:
human, person, woman, man, character
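The prompt and negative prompt above plug straight into diffusers. A minimal sketch, again assuming a local TFM American Cartoons checkpoint (the path is a placeholder):

```python
# The exact prompt and negative prompt from the text above.
PROMPT = "a microphone on a table"
NEGATIVE = "human, person, woman, man, character"

def generate_microphone(model_path="tfm_american_cartoons.safetensors", seed=0):
    """Plain txt2img; the negative prompt pushes people out of the frame."""
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_single_file(
        model_path, torch_dtype=torch.float16
    ).to("cuda")
    gen = torch.Generator("cuda").manual_seed(seed)
    return pipe(PROMPT, negative_prompt=NEGATIVE, generator=gen).images[0]
```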
I got this keeper.
But Stable Diffusion wasn’t exactly producing a lone microphone. There always seemed to be a person involved.
Well, well… time for img2img and ControlNet.
I cut the microphone out in Photoshop and erased the parts that don’t close a loop. For example, the wires stemming from the mic.
I exported it at 600px height-wise. You don’t need sharpness when doing img2img, so we’re good.
More like it… innit?
Am I… doing img2img wrong?
Hmm, why bother generating an image, cropping it, and cleaning it up, when you can just go to one of those free stock image sites and grab an image for img2img’s sampling?
So, I got this
Threw it into img2img, and ran it with a Canny ControlNet.
This came out among many others:
BTW, if you don’t ControlNet it, you’ll get lots of heads again.
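The img2img-plus-ControlNet combination looks roughly like this in diffusers. A minimal sketch, assuming the public `lllyasviel/sd-controlnet-canny` ControlNet, OpenCV for the edge map, and a local checkpoint; the prompt is illustrative:

```python
def headphones_with_controlnet(stock_path,
                               model_path="tfm_american_cartoons.safetensors",
                               prompt="headphones, cartoon style",  # illustrative
                               seed=0):
    """img2img constrained by a Canny edge map -- this keeps the headphone
    shape and stops Stable Diffusion from sneaking heads back into frame."""
    import cv2
    import numpy as np
    import torch
    from PIL import Image
    from diffusers import (ControlNetModel,
                           StableDiffusionControlNetImg2ImgPipeline)

    init = Image.open(stock_path).convert("RGB")
    # Canny edge map of the stock photo, replicated to 3 channels.
    edges = cv2.Canny(np.array(init), 100, 200)
    control = Image.fromarray(np.stack([edges] * 3, axis=-1))

    controlnet = ControlNetModel.from_pretrained(
        "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
    )
    pipe = StableDiffusionControlNetImg2ImgPipeline.from_single_file(
        model_path, controlnet=controlnet, torch_dtype=torch.float16
    ).to("cuda")
    gen = torch.Generator("cuda").manual_seed(seed)
    return pipe(prompt, image=init, control_image=control,
                strength=0.75, generator=gen).images[0]
```

Without the `controlnet` and `control_image` arguments, this degrades to plain img2img — and, as noted above, the heads come back.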
Blending AI-generated images with Photoshop
Su-Lin wanted me to put the headphones on her head.
But yeah, I couldn’t just slide the headphones right onto her avatar. This is how it would look.
First up, Free Transform the headphones to make them proportional to the head.
Then, just as you’d bend headphones to fit your own head, you’ll need to bend them in Photoshop using the Puppet Warp tool.
We want to ensure the headphones are parallel to her head.
Most fundamentally, you’ll need to pin the headphones’ speakers and the middle of the headband. I’ve also found that putting points where the speakers meet the headband helps a lot.
Here’s how it’ll look after you straighten it out.
Notice that I also copied the right speaker to the left to create some symmetry.
OK, so now it’s time to blend the headphones onto her hair.
I found that “catching” onto a black outline that already exists in the image is key. Mask everything beyond this line.
As you can see, everything to the right of the speaker is masked off. I drew the black line to create further separation.
This is important because it gives the viewer the impression that the headphones are behind her hair.
You’ll notice that there’s light on the left that tapers off to the right.
Paint some shadows to match this pattern… and done!
Wait, how do I do all of this?
If you want to do something like this, follow this guide, which walks through the process in more detail.