Backpropagation in Convolutional Neural Networks (CNNs)

May 22, 2024
31,044 views

In this video we look at backpropagation in a convolutional neural network (CNN). We use a simple CNN with zero padding (padding = 0) and a stride of two (stride = 2).
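For readers who want to follow along in code, below is a minimal NumPy sketch of this setup. The 4x4 input and 2x2 filter are assumptions consistent with the 2x2 output discussed in the comments, and all numeric values are made up for illustration, not taken from the video:

    import numpy as np

    def conv_forward(x, w, stride=2):
        # "Valid" cross-correlation: no padding, so the kernel never
        # leaves the image. With a 4x4 input, a 2x2 kernel and stride 2,
        # the output z is 2x2.
        kh, kw = w.shape
        out_h = (x.shape[0] - kh) // stride + 1
        out_w = (x.shape[1] - kw) // stride + 1
        z = np.zeros((out_h, out_w))
        for i in range(out_h):
            for j in range(out_w):
                patch = x[i*stride:i*stride+kh, j*stride:j*stride+kw]
                z[i, j] = np.sum(patch * w)
        return z

    def conv_weight_grad(x, dL_dz, w_shape, stride=2):
        # Backprop for the filter: every output z[i, j] saw one patch of x,
        # so each weight's gradient sums, over all output positions,
        # dL/dz[i, j] times the input pixel that weight multiplied.
        dW = np.zeros(w_shape)
        kh, kw = w_shape
        for i in range(dL_dz.shape[0]):
            for j in range(dL_dz.shape[1]):
                dW += dL_dz[i, j] * x[i*stride:i*stride+kh, j*stride:j*stride+kw]
        return dW

    x = np.arange(16, dtype=float).reshape(4, 4)   # made-up 4x4 "image"
    w = np.array([[1.0, -1.0], [0.5, 2.0]])        # made-up 2x2 filter
    z = conv_forward(x, w)                          # 2x2 feature map
    dL_dz = np.ones_like(z)                         # placeholder upstream gradient
    print(conv_weight_grad(x, dL_dz, w.shape))

The point the video stresses is visible in conv_weight_grad: the backward pass must slide over the input with the same stride as the forward pass.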
X: / far1din_
Github: github.com/far1din
Manim code: github.com/far1din/manim#back...
---------- Content ----------
00:00 - Introduction
00:51 - The Forward Propagation
02:23 - The Backpropagation
03:31 - (Intuition) Setting up Formula for Partial Derivatives
06:07 - Simplifying Formula for Partial Derivatives
07:05 - Finding Similarities
08:55 - Putting it All together
---------- Contributions ----------
Background music: pixabay.com/users/balancebay-...
#computervision #convolutionalneuralnetwork #ai #neuralnetwork #deeplearning #neuralnetworksformachinelearning #neuralnetworksexplained #neuralnetworkstutorial #neuralnetworksdemystified #computervisionandai #backpropagation

Comments
  • Great video, but I don't understand how we can find the value of the dL/dzi terms. At 7:20 you make it seem like dL/dzi = zi, is that correct?

    @louissimon2463 · 1 year ago
    • No, they come from the loss function. I explain this at 4:17. It might be a bit unclear, so I'd highly recommend you watch the video from 3blue1brown: kzhead.info/sun/p62eeLCmoaVriHA/bejne.htmlsi=Z6asTm87XWcW1bVn 😃

      @far1din619 · 6 months ago
    • I'm with @louissimon: you show how dL/dw1 is related to dz1/dw1 + ... (etc.), but you never show/explain where dL/dz1 (etc.) comes from. Poof - miracle occurs here. Having a numerical example would help a lot. This "theory/symbology"-only post is therefore incomplete/useless from a learning/understanding standpoint.

      @rtpubtube · 4 months ago
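  Since several replies ask for numbers, here is a minimal sketch of where the dL/dz terms come from. The squared-error loss and the single linear output unit are assumptions chosen for illustration (the video does not spell out the loss), and all numbers are made up:

      import numpy as np

      # Assumed setup: the 2x2 feature map is flattened to z (4 values),
      # a dense layer with weights w_out produces yhat, and the loss is
      # L = (yhat - y)**2.
      z = np.array([0.5, -1.0, 2.0, 0.3])
      w_out = np.array([0.2, -0.4, 0.1, 0.7])
      y = 1.0

      yhat = w_out @ z               # forward pass through the dense layer
      dL_dyhat = 2 * (yhat - y)      # derivative of (yhat - y)**2
      dL_dz = dL_dyhat * w_out       # chain rule: dL/dz_i = dL/dyhat * dyhat/dz_i

      # Finite-difference check of dL/dz_0: nudge z[0] and watch L move.
      eps = 1e-6
      z2 = z.copy(); z2[0] += eps
      L_before = (w_out @ z - y) ** 2
      L_after = (w_out @ z2 - y) ** 2
      print(dL_dz[0], (L_after - L_before) / eps)   # the two numbers agree

  So dL/dz_i is not z_i; it is determined by whatever sits between z and the loss.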
  • Really intuitive and great animations.

    @sourabhverma9034 · 3 days ago
  • Fantastic explanation!! Very clear and detailed, thumbs up!

    @zemariamm · 7 months ago
  • Great job. This explanation is really intuitive.

    @bambusleitung1947 · 1 month ago
  • Great explanation, helped me understand the workings behind it.

    @saikoushik4064 · 2 months ago
  • Great stuff man, crystal clear!

    @guoguowg1443 · 1 month ago
  • What I was looking for. Well explained.

    @Peterpeter-hr8gg · 7 months ago
  • This channel is a hidden gem. Thank you for your content

    @JessieJussMessy · 1 year ago
  • Excellent. The exact video I was looking for.

    @ramazanyel5979 · 15 days ago
  • Couldn’t explain it better myself … absolutely amazing and comprehensible presentation!

    @markuskofler2553 · 11 months ago
  • This is a topic which is rarely explained online, but it was very clearly explained here. Well done.

    @farrugiamarc0 · 2 months ago
  • This was really helpful... Thank you so much for the visualization... Keep up the good work... Looking forward to your future uploads.

    @RAHUL1181995 · 1 year ago
  • Great explanation, clear, direct and understandable. Sub!

    @DSLDataScienceLearn · 3 months ago
  • Really clear explanation and good pacing. I felt I understood the math behind backpropagation for the first time after watching this video!

    @DVSS77 · 1 year ago
  • Nicely put, thank you so much.

    @user-gg2ov3up5k · 9 months ago
  • Great video, underrated channel, please keep it up with the CNN videos!

    @giacomorotta6356 · 1 year ago
  • Great explanation.

    @harshitbhandi5005 · 5 months ago
  • The animations were super useful, thanks!

    @MarcosDanteGellar · 1 year ago
  • Thanks for sharing!

    @gregorioosorio16687 · 7 months ago
  • Really beautiful, thanks.

    @aliewayz · 2 days ago
  • What a masterpiece.

    @heyman620 · 8 months ago
  • Very well explained. I searched many videos, but nobody explained the change in the filter's weights. Thank you so much for this simple animated explanation.

    @nizamuddinkhan9443 · 10 months ago
  • Best explanation

    @LeoMarchyok-od5by · 1 month ago
  • Great example, thanks a lot.

    @objectobjectobject4707 · 11 months ago
  • I have seen a few videos before; this one is by far the best. It breaks down each concept and answers all the questions that come to mind. The progression and the explanation are the best.

    @haideralix · 6 months ago
    • Thank you! 🔥

      @far1din619 · 6 months ago
  • Well done.

    @samiswilf · 1 year ago
  • Why is this channel so underrated? You deserve more subscribers and views.

    @khayyamnaeem5601 · 1 year ago
    • Perhaps developers use ad blockers, and as a result, KZhead needs to ensure revenue by not promoting these types of videos (that's my opinion)

      @eneadriancatalin · 1 year ago
  • Please continue your videos!!

    @PlabonTheSadEngineer · 4 months ago
  • Your channel is a hidden gem. My suggestion is to start a Discord and get some crowdfunding and one-on-ones for people who want to learn from you. You are gifted in teaching.

    @paedrufernando2351 · 1 year ago
  • Amazing video, thanks!

    @akshchaudhary5444 · 3 months ago
  • Great explanation with cool visual. Thanks a lot.

    @shazzadhasan4067 · 5 months ago
    • Thank you my friend 😃

      @far1din619 · 5 months ago
  • Amazing! I was looking for some material like this a long time ago and only found it here, beautiful :D

    @pedroviniciuspereirajunho7244 · 5 months ago
    • Thank you my brother 🔥

      @far1din619 · 5 months ago
  • Masterpiece 💕💕

    @osamamohamedos2033 · 1 month ago
  • Thank you so much!!! This video is so so so well done!

    @elgs1980 · 1 year ago
    • Thank you. Hope you got some value out of this! 💯

      @far1din619 · 1 year ago
  • Great explanation and visualization

    @aikenkazin4096 · 6 months ago
    • Thank you my friend 🔥🚀

      @far1din619 · 6 months ago
  • Thank you so much for this.

    @ManishKumar-pb9gu · 4 months ago
  • Please do not stop making these videos!!!

    @Joker-ez2fm · 5 months ago
    • I won’t let you down Joker 🔥🤝

      @far1din619 · 5 months ago
  • The equation at 6:00 would end up as ∂L/∂w(i) = 4 * ∂L/∂w(i) if we cancelled out the ∂z terms. It is the multivariable chain rule, so the correct form is: ∂L/∂w(i) = ∂z1/∂w(i) * ∂L/∂z1 + ∂z2/∂w(i) * ∂L/∂z2 + ∂z3/∂w(i) * ∂L/∂z3 + ∂z4/∂w(i) * ∂L/∂z4. So we can't do the cancellation.

    @anardashdamirli · 1 month ago
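  This commenter is right that the four terms are summed, not cancelled. A tiny numeric check makes the point; the toy model below (one shared weight feeding four pre-activations, a squared loss, made-up numbers) is only an illustration:

      import numpy as np

      # Toy model: one weight w appears in four pre-activations z_j = a_j * w,
      # and the loss is L = sum(z_j**2). All numbers are made up.
      a = np.array([1.0, -2.0, 0.5, 3.0])
      w = 0.7

      z = a * w
      dL_dz = 2 * z                  # from L = sum(z_j**2)
      dz_dw = a                      # z_j = a_j * w, so dz_j/dw = a_j
      dL_dw = np.sum(dL_dz * dz_dw)  # multivariable chain rule: sum over all paths

      # Finite-difference check: the summed form matches, which shows the
      # ∂z terms were never fractions that could be cancelled.
      eps = 1e-6
      L_before = np.sum((a * w) ** 2)
      L_after = np.sum((a * (w + eps)) ** 2)
      print(dL_dw, (L_after - L_before) / eps)   # the two numbers agree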
  • You are a great example of fluidity of thought and words. Great explanation.

    @jayeshkurdekar126 · 1 year ago
    • Thank you my friend. Hope you got some value! :)

      @far1din619 · 1 year ago
    • @@far1din619 sure did

      @jayeshkurdekar126 · 1 year ago
  • Amazing

    @ziligao7594 · 5 days ago
  • Fab video! Helped me a lot.

    @yuqianglin4514 · 6 months ago
    • Glad to hear that you got some value out of this video! :D

      @far1din619 · 6 months ago
  • Well explained, now I need to code it myself.

    @SolathPrime · 1 year ago
    • Haha, that’s the hard part

      @far1din619 · 1 year ago
    • @@far1din619 I think I came up with a solution. Here:

          import numpy as np
          from scipy.signal import correlate2d, convolve2d

          def backward(self, output_gradient, learning_rate):
              # Assumes the forward pass was a stride-1 "valid" cross-correlation.
              kernels_gradient = np.zeros(self.kernels_shape)
              input_gradient = np.zeros(self.input_shape)
              for i in range(self.depth):
                  for j in range(self.input_depth):
                      # dL/dK: cross-correlate each input channel with the
                      # matching output gradient.
                      kernels_gradient[i, j] = correlate2d(self.input[j], output_gradient[i], "valid")
                      # dL/dX: "full" convolution of the output gradient with
                      # the kernel ("same" would give the wrong shape here).
                      input_gradient[j] += convolve2d(output_gradient[i], self.kernels[i, j], "full")
              self.kernels -= learning_rate * kernels_gradient
              self.biases -= learning_rate * output_gradient
              return input_gradient

      First I initialized the kernel gradient as an array of zeros with the kernel shape, then I iterated through the depth of the kernels and then the depth of the input, computing the gradient with respect to each kernel; I did the same to compute the input gradients. Your vid helped me understand the backward method better, so I have to say thank you sooo much for it.

      @SolathPrime · 1 year ago
    • @@far1din619 I'll document the solution and put it here when I do. Please pin the comment.

      @SolathPrime · 1 year ago
    • @@SolathPrime That’s great my friend. Will pin 💯

      @far1din619 · 1 year ago
  • Thanks.

    @OmidDavoudnia · 28 days ago
  • Great video!! Your explanation is the best I have found. Could you please tell me what software you use for the animations?

    @rodrigoroman4886 · 6 months ago
    • I use manim 😃 www.manim.community

      @far1din619 · 6 months ago
  • thx

    @PeakyBlinder-lz2gh · 3 months ago
  • I've had no trouble learning about the 'vanilla' neural networks. Although your videos are great, I can't seem to find resources that delve a little deeper into the explanations of how CNNs work. Are there any resources you would recommend?

    @user-ki3jf6gu6l · 2 months ago
  • What about the weights of the fully connected layer?

    @bnnbrabnn9142 · 2 months ago
  • +1 sub, excellent video

    @simbol5638 · 5 months ago
    • Thank you! 😃

      @far1din619 · 5 months ago
  • Perfect. One suggestion: make the videos a little longer, 20-30 minutes is a good number.

    @im-Anarchy · 6 months ago
    • Haha, most people don't like these kinds of videos to be too long. The average watch time for this video is about 3 minutes :P

      @far1din619 · 6 months ago
    • @@far1din619 oh shii! 3 minutes, that was very unexpected. Maybe it's because people revisit the video to revise a specific topic.

      @im-Anarchy · 6 months ago
    • Must be 💯

      @far1din619 · 6 months ago
  • Great explanation. Can you please tell me which tool you use for making these videos?

    @piyushkumar-wg8cv · 7 months ago
    • Thank you my friend! I use manim 😃 www.manim.community

      @far1din619 · 6 months ago
  • What is the loss function here, and how are the values in the flattened z matrix used to compute yhat?

    @govindnair5407 · 2 months ago
  • 1:15 Why do you iterate in steps of 2? If you iterated by 1 then you could generate a 3x3 layer image. Is that just to save on computation time/complexity, or is there some other reason for it?

    @arektllama3767 · 1 year ago
    • The reason why I used a stride of two (iterations in steps of two) in this video is partially random and partially because I wanted to highlight that the stride when performing backpropagation should be the same as when performing the forward propagation. In most learning materials I have seen, they usually use a stride of one, hence a stride of one for the backpropagation. This could lead to confusion when operating with larger strides. The stride could technically be whatever you like (as long as you keep it within the dimensions of the image/matrix). I could have chosen another number for the stride as you suggested. In that case, with a stride of one, the output would be a 3 x 3 matrix/image. Some will say that a shorter stride will encapsulate more information than a larger one, but this becomes “less true” as the size of the kernel increases. As far as I know there are no “rules” for when to use larger strides and not. Please let me know if this notion has changed as everything changes so quickly in this field! 🙂

      @far1din619 · 1 year ago
    • @@far1din619 I never considered how stride length could change depending on kernel size. I guess that makes sense: the larger kernel could cover the same data as a small kernel, just in fewer steps/iterations. I also figured you intentionally generated a 2x2 image since that's a lot simpler than a 3x3 and this is an educational video. Thanks for the feedback, that was really insightful!

      @arektllama3767 · 1 year ago
  • 5:24 Does this just mean we divide z1 by w1 and multiply by L divided by z1, and do that for all z's, to get the partial derivative of L with respect to w1?

    @ItIsJan · 8 months ago
    • It's not that simple. Doing the actual calculations is a bit more tricky. Given no activation function, Z1 = w1*pixel1 + w2*pixel2 + w3*pixel3… you now have to take the derivative of this with respect to w1, then y = z1*w21 + z2*w22… take the derivative of y with respect to z1, etc. The calculus can be a bit too heavy for a comment like this. I'd highly recommend you watch the video by 3blue1brown: kzhead.info/sun/p62eeLCmoaVriHA/bejne.htmlsi=Z6asTm87XWcW1bVn 😃

      @far1din619 · 6 months ago
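  To make the reply above concrete (using p_i for the pixel values and the same no-activation assumption), the factors in the chain are separate derivatives, not a division:

      \frac{\partial z_1}{\partial w_1} = p_1, \qquad
      \frac{\partial \hat{y}}{\partial z_1} = w_{21}, \qquad
      \frac{\partial L}{\partial w_1}
        = \sum_j \frac{\partial L}{\partial \hat{y}}
          \cdot \frac{\partial \hat{y}}{\partial z_j}
          \cdot \frac{\partial z_j}{\partial w_1}

  Nothing is literally divided; each factor is read off the forward-pass expressions.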
  • Hello, well explained. I need your presentation.

    @burerabiya7866 · 1 year ago
    • Just download it 😂

      @far1din619 · 9 months ago
  • dL/dzi = ??

    @MoeQ_ · 1 year ago
    • I explain the term at 4:17. It might be a bit unclear, so I'd highly recommend you watch the video from 3blue1brown: kzhead.info/sun/p62eeLCmoaVriHA/bejne.htmlsi=Z6asTm87XWcW1bVn 😃

      @far1din619 · 6 months ago
  • You have nice videos that helped me better understand the concept of CNNs. But from this video it is not really obvious that the matrix dL/dw is a convolution of the image matrix and the dL/dz matrix, as shown here: kzhead.info/sun/g9Jwgq9vq6Gcg58/bejne.html. The stride of two is also a little bit confusing.

    @user-oq7ju6vp7j · 6 months ago
    • Thank you for the comment! I believe he is doing the exact same thing (?) I chose to have a stride of two in order to highlight that the stride should be the same as the stride used during the forward propagation. Most examples stick with a stride of one. I now realize it might have caused some confusion :p

      @far1din619 · 6 months ago
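  For the stride-one case discussed in this thread, the loop-over-outputs picture used in the video and the "dL/dw is a convolution of the image with dL/dz" picture give the same numbers. Here is a small sketch with made-up values, using SciPy's correlate2d (deep-learning "convolutions" are cross-correlations, hence correlate2d rather than convolve2d):

      import numpy as np
      from scipy.signal import correlate2d

      # Made-up 4x4 input and a 3x3 upstream gradient (the stride-1 output
      # of a 2x2 kernel over a 4x4 image). For stride 2 the same idea holds,
      # but the gradient entries must be spaced out ("dilated") first.
      x = np.arange(16, dtype=float).reshape(4, 4)
      dL_dz = np.arange(9, dtype=float).reshape(3, 3)

      # Loop form, as in the video: each output position contributes
      # its gradient times the input patch it saw.
      dW_loop = np.zeros((2, 2))
      for i in range(3):
          for j in range(3):
              dW_loop += dL_dz[i, j] * x[i:i+2, j:j+2]

      # Same computation expressed as a single "valid" cross-correlation.
      dW_corr = correlate2d(x, dL_dz, mode="valid")
      print(np.allclose(dW_loop, dW_corr))   # True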
  • w^* is an abuse of math notation, but it's convenient.

    @int16_t · 8 months ago
  • I think it's spelled "Convolution"

    @CorruptMem · 1 year ago
    • Haha thank you! 🚀

      @far1din619 · 1 year ago