Learning GANs the right way?

From Noise to Creation: GANs

Exploring the Power of Generative Adversarial Networks

Chitransh
9 min read · Apr 14, 2024

Hey people! This happens to be my first post on Medium and I have no idea how to write this, so bear with me for a while. I promise it will be worth your time. 😄

I am currently in my sixth semester of engineering, pursuing a Bachelor’s degree in Artificial Intelligence and Machine Learning.

I am really intrigued by how far we as a race have come, from the early days of arithmetic and basic computing to creating and deploying full-scale AI agents that at times outperform humans.

In this article my aim is to introduce you to Generative Adversarial Networks in a simple and engaging way, even if you’re new to artificial intelligence.

Introduction

Hmmmm…… So you read the title huh!
What is the first thing that comes to your mind when I say noise?
NOOO, don’t tell me about the teenager on the second floor who blasts loud music at night!!

This image is AI generated.

Neat trick there huh?! (Using AI images to explain AI images 😏)
Read the caption above.

So, back to the question: what exactly is noise when I talk in terms of a GAN?

To answer that, we must first understand what this very scary, long term GENERATIVE ADVERSARIAL NETWORK means.

What is a GAN?

Analogy One:

Let me help you out with the core concept first, without diving into the specifics. Once you are familiar with it, we can dig into the details.

Now, picture two artists sharing a room i.e. Rick and Morty.

Source: “Rick & Morty”, Adult Swim

AHHHHHHH!!!! Not them😭
LET ME TRY AGAIN!!!
Now, picture two artists sharing a room i.e. Rick and Morty. 😄

This image is AI generated.

Rick is talented as a painter, and Morty has sharp eyes as an art critic. They begin an interesting game in which Morty’s (The critic’s) task is to determine which paintings are FAKE and which are REAL.

Let’s say that Rick (The painter) starts by creating a FAKE of Mona Lisa.
Once he is done, Morty (The critic) has to guess which painting is REAL and which is FAKE. Morty has access to the original Mona Lisa at all times to make the comparison.

But the catch is Rick (The painter) has a very bad memory and he keeps forgetting how to draw.
Even though Rick is really talented at painting, he still has to practice from the beginning every time. This means he has to draw random things/patterns/shapes over and over again to get better and make his strokes/paintings perfect.

Now Rick (The painter) has started painting…

This image is AI generated.

Once Rick has finished painting something that he thinks looks like the Mona Lisa, we’ll send the painting to Morty (The critic). Since Rick (The painter) has a bad memory, his strokes and color scheme are not very accurate in the first trial. In fact they are terrible, which makes it all the easier for Morty (The critic) to distinguish between his painted/generated art and the original Mona Lisa.

After 30 trials of painting relentlessly, Rick (The painter) finally understood what he needed to draw, and how, to fool Morty (The critic).

And now he thinks he has done it!!! He finally created the perfect FAKE.

But even this time his painting was caught by Morty (The critic) because he got almost everything right but the face.

Source: @ninhaeliane (Instagram)

After absolutely cracking up 😆, Morty (The critic) told him that he was caught because the face was a lot different from the original one. Keeping that in mind, Rick (The painter) did some more trials, and after each trial he showed his painting to the critic for better feedback.

Eventually after 60 trials he was able to get really close to the REAL one.
Now the critic was shocked too, and was forced to sharpen his own skills to spot the FAKE one.

So you see, after looking at so many random variations of the art generated/painted by the painter, Morty (The critic) was also getting trained and improving his skills in telling the FAKES from the REAL ONE.

Both the painter and the critic improve at their respective skills as the game progresses. The critic gets better at identifying fakes, and the painter learns how to produce more realistic paintings. In this way, they continue to go back and forth, each attempting to outwit the other.

This goes on until Rick (The painter) is able to draw a very realistic version of the Mona Lisa and Morty (The critic) fails to distinguish between the REAL and the FAKE.

Then we would say that Rick has won the game.

By the way…..
This is what Rick (The painter) was able to draw on his 60th attempt. Pretty close to the original one, huh? (I mean, they seem pretty close to me without my glasses 😅)

This image is AI generated.

Now that you understand the game, let us move on to what actually happens while training a GAN. I hope the analogy made complete sense.

Now let’s imagine that Rick (The painter) is a computer program called the “generator” and Morty (The critic) is a program called the “discriminator”.

Don’t worry if you find the names a little jargon-y. That’s how they have been since the beginning, so no need to worry, you’ll get familiar with them.

From now on I will refer to the painter (Rick) as the Generator and the critic (Morty) as the Discriminator.

Moving Ahead…..

A GAN can be thought of as a creative competition/game between two computer programs, one of which (the generator/painter) aims to produce realistic-looking images and the other (the discriminator/critic) determines whether those images are real or fake.

Source : ITRelease.com

This configuration is known as a Generative Adversarial Network, or GAN.

Training then goes like this: at first the generator (The painter) produces output that looks like meaningless noise, just as Rick (The painter) used to draw random objects to perfect his strokes. When Morty (The critic) sees this noise for the first time, he can easily tell which is FAKE and which is REAL.

So now Rick (The generator/painter) takes the feedback from Morty (The discriminator/critic) and works on his skill set (this feedback is what we call the generator/discriminator loss) until Morty can’t tell which one is REAL and which is FAKE.
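If you are curious what that feedback can look like as actual numbers, here is a minimal sketch. It assumes the discriminator outputs a probability between 0 and 1 that a painting is REAL, and it uses log-based (binary cross-entropy style) losses, which are a common choice for GANs. The function names are my own and not from any library.

```python
import math


def discriminator_loss(d_real: float, d_fake: float) -> float:
    """Low when the critic scores real art near 1 and fakes near 0."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake: float) -> float:
    """Low when the critic is fooled into scoring a fake near 1."""
    return -math.log(d_fake)


# Early in training: the critic is confident and correct.
early = (discriminator_loss(0.9, 0.1), generator_loss(0.1))  # roughly (0.21, 2.30)

# Later: the critic can only guess, so it answers 0.5 for everything.
late = (discriminator_loss(0.5, 0.5), generator_loss(0.5))   # roughly (1.39, 0.69)

print(early, late)
```

The painter's loss drops as the critic gets fooled, while the critic's loss rises. That tug of war is the "adversarial" part.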

Once this happens we could say the training for the GAN has been completed.
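To make the back-and-forth concrete, here is a toy sketch of the game in plain Python. Everything in it is an assumption made for illustration: the "paintings" are single numbers, the real ones are drawn around 4, the generator is just `a * z + b`, and the discriminator is a single sigmoid unit. Real GANs use neural networks, so treat this as a cartoon of the idea rather than how any library does it.

```python
import math
import random


def sigmoid(x: float) -> float:
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def train_toy_gan(steps: int = 500, batch: int = 32, lr: float = 0.05, seed: int = 0):
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator (painter): fake = a * z + b
    w, c = 0.1, 0.0   # discriminator (critic): D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        real = [rng.gauss(4.0, 0.5) for _ in range(batch)]
        fake = [a * rng.gauss(0.0, 1.0) + b for _ in range(batch)]

        # Critic's turn: gradient descent on -[log D(real) + log(1 - D(fake))].
        gw = gc = 0.0
        for x in real:
            d = sigmoid(w * x + c)
            gw += -(1.0 - d) * x
            gc += -(1.0 - d)
        for x in fake:
            d = sigmoid(w * x + c)
            gw += d * x
            gc += d
        w -= lr * gw / batch
        c -= lr * gc / batch

        # Painter's turn: gradient descent on -log D(fake), using the updated critic.
        ga = gb = 0.0
        for _ in range(batch):
            z = rng.gauss(0.0, 1.0)
            d = sigmoid(w * (a * z + b) + c)
            gx = -(1.0 - d) * w   # derivative of -log D(x) with respect to x
            ga += gx * z
            gb += gx
        a -= lr * ga / batch
        b -= lr * gb / batch
    return a, b, w, c


a, b, w, c = train_toy_gan()
print("generator offset b:", round(b, 2), "| critic's score for a typical fake:", round(sigmoid(w * b + c), 2))
```

The hope is that the offset `b` drifts from 0 toward the real paintings around 4, but tiny GANs like this one often oscillate instead of settling, so take the printed numbers as illustrative only.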

Analogy Two:

Let me help you out with another example below:
Let’s suppose we want a GAN to generate handwritten digits. We can train it using a dataset such as MNIST.

(Don’t worry, MNIST is just a very big collection of handwritten digits written by many different people in different handwriting styles.)

Source: Wikipedia (MNIST dataset)

To put the idea into a very simple language, we can say that the generator learns to draw the right pixels in order to match the desired result. After that, it passes to the discriminator, which judges the pixels.

Source: IBM

Over time and after numerous iterations of the same work and feedback from the discriminator, the generator determines where to change the pixel values to prevent the discriminator from determining its authenticity.
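Here is a small sketch of what "drawing pixels" and "judging pixels" mean in terms of shapes. MNIST images are 28x28 grayscale, so the generator has to output 784 pixel values. The noise size of 100, the single-layer "networks", and the random untrained weights are all assumptions for illustration, so the image it draws is still pure garbage, like Rick's first attempt.

```python
import math
import random

NOISE_DIM = 100          # assumed size of the noise vector
IMAGE_PIXELS = 28 * 28   # one MNIST digit, flattened to 784 pixels


def sigmoid(x: float) -> float:
    # Numerically stable logistic function; squashes any number into (0, 1).
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def generator(z, weights):
    """Turn a noise vector into 784 pixel intensities between 0 and 1."""
    return [sigmoid(sum(wi * zi for wi, zi in zip(row, z))) for row in weights]


def discriminator(image, weights):
    """Return the critic's probability that the image is REAL."""
    return sigmoid(sum(wi * p for wi, p in zip(weights, image)))


rng = random.Random(0)
# Untrained, randomly initialised weights (hypothetical stand-ins for real networks).
g_weights = [[rng.gauss(0.0, 0.05) for _ in range(NOISE_DIM)] for _ in range(IMAGE_PIXELS)]
d_weights = [rng.gauss(0.0, 0.05) for _ in range(IMAGE_PIXELS)]

z = [rng.gauss(0.0, 1.0) for _ in range(NOISE_DIM)]
fake_image = generator(z, g_weights)
score = discriminator(fake_image, d_weights)
print(len(fake_image), "pixels, critic says REAL with probability", round(score, 3))
```

Training is what would adjust `g_weights` and `d_weights` over many rounds; this snippet only shows one pass of noise in, pixels out, verdict back.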

Now as you can see below, from a complete random pattern of dots (Noise) the GAN was able to generate digits which are very similar to the actual MNIST dataset.

Source: TensorFlow (MNIST)

The Magic of Noise

Sooooo does that mean that I can use noise and create anything in the world?

I mean to some extent yeahhhhh…
(I might get a bit formal in the next section so don’t feel bored 😛)

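But you have to understand where the structure comes from. The noise itself really is random, usually a list of numbers drawn from a simple distribution such as a Gaussian. What gets shaped during training is the generator, which learns to turn different noise vectors into the nuanced structures and patterns we see in its outputs.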

In reality this is what noise would look like:

Source: Wikipedia

Any noise vector has the capacity to develop into something remarkable, much like discovering hidden jewels in a vast sea of randomness. Taking this raw noise and shaping it into something meaningful is what the GAN does as it works.

The process is similar to an artist molding clay into a recognisable shape. Think of the noise vector as the “clay” and the generator network of the GAN as the “artist.”
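In code, the clay is nothing fancy. The sketch below assumes the noise is drawn from a standard Gaussian (a common choice) and uses a made-up `toy_generator` as a stand-in for a trained generator, just to show that the same noise always gives the same creation while different noise gives a different one.

```python
import math
import random


def sample_noise(dim: int, rng: random.Random) -> list:
    """The 'clay': dim independent draws from a standard Gaussian."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def toy_generator(z: list) -> list:
    """Hypothetical stand-in for a trained generator: a fixed function of z."""
    return [math.tanh(v) for v in z]


z1 = sample_noise(8, random.Random(1))
z2 = sample_noise(8, random.Random(2))

# Same clay in, same sculpture out; different clay, different sculpture.
print(toy_generator(z1) == toy_generator(z1))  # True
print(toy_generator(z1) == toy_generator(z2))  # False
```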

With each training iteration the GAN refines its creations, adding more detail and complexity until it produces something truly remarkable. No two creations are ever the same, just as no two snowflakes are alike, which is the beauty of it all.

Photo by Earl Wilcox on Unsplash

I hope that you now have a basic understanding of how GANs work and the approach behind them.

In conclusion, Generative Adversarial Networks can generate highly realistic and diverse outputs that closely resemble the training data by starting with random noise and iteratively improving the output through this adversarial process.

The secret to GANs’ success is using the strengths of both networks: the discriminator offers crucial feedback, and the generator uses that feedback to learn to produce realistic outputs.

What lies ahead?

GANs have many uses and the list is still growing: image and video generation, anomaly detection, text-to-image synthesis, data augmentation, style transfer, and even language modeling.

With the advancement of deep learning, GANs will become more and more important across a range of domains, expanding the limits of artificial intelligence.

This image is AI generated.

I hope that this introduction to the exciting field of generative adversarial networks has been a good and entertaining reading experience for you.

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

This can be considered a simplified version of how GANs actually work, because I promised not to get into the technicalities just yet!! For example, the two computer programs are not really “programs” but neural networks, usually trained in an unsupervised setting. I also skipped the details about the nature of the “feedback” that the discriminator shares back, and the fact that this “feedback/loss” is computed for both networks.

My goal here was to take a beginner-friendly approach that would help you understand GANs even if you are not familiar with Artificial Intelligence.

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

A deeper look at the theoretical foundations, architectural styles, and real-world applications of GANs will be covered in more articles that I plan to publish on a regular basis.

Here are some other articles that I really liked about them and can help you get more knowledge about GANs:

  1. https://www.analyticsvidhya.com/blog/2021/04/lets-talk-about-gans/
  2. https://developer.ibm.com/articles/generative-adversarial-networks-explained/

For more in-depth knowledge about GANs, along with their implementation and architectures:

https://neptune.ai/blog/generative-adversarial-networks-gan-applications

So that’ll be all for today!
Hope to see you around for the next one!
Namaste 🙏

In the event that you enjoyed reading this article, make sure to follow for more.

https://www.linkedin.com/in/chitransh-srivastava-37b0a0225/
