Mean Field Approach for Variational Inference | Intuition & General Derivation

22 May 2024
8,846 views

Variational Inference tries to fit a surrogate posterior to mimic the true posterior. But how should we choose the surrogate posterior? Here are the notes: raw.githubusercontent.com/Cey...
In an earlier video, we saw that the optimization problem of Variational Inference, minimizing the KL divergence between the surrogate and the true posterior, can be solved by maximizing the ELBO instead. That allows us to avoid evaluating the true posterior, which we do not know (otherwise we would not be doing Variational Inference in the first place). But one question remained: What is q? Or, in other terms: What kind of surrogate posterior should I choose?
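For reference, the identity behind this equivalence (written in LaTeX, in the notation used in the video and its notes, where Z are the latent variables and X = D the observed data):

\log p(X = D) = \underbrace{\mathbb{E}_{q}\big[\log p(Z, X = D) - \log q(Z)\big]}_{\mathrm{ELBO}(q)} + \mathrm{KL}\big(q(Z) \,\|\, p(Z \mid X = D)\big)

The left-hand side, the log evidence, does not depend on q, and the KL divergence is non-negative; hence maximizing the ELBO over q is the same as minimizing the KL divergence to the true posterior.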
Spoiler: This video will not answer this question ;) Here, we will just derive a general result for the idea of subdividing the surrogate posterior into smaller independent distributions. But that is an important finding, which then lets us find functional forms of the surrogate naturally, based on the problem at hand.
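For readers who want the punchline up front (reconstructed here in LaTeX from the standard mean-field derivation, which the video follows using a three-dimensional latent vector as its running example): the factorization ansatz and the resulting optimal factor read

q(z_0, z_1, z_2) = q_0(z_0)\, q_1(z_1)\, q_2(z_2)

\log q_0^{*}(z_0) = \mathbb{E}_{1,2}\big[\log p(Z, X = D)\big] + \text{const} \quad\Longleftrightarrow\quad q_0^{*}(z_0) \propto \exp\Big(\mathbb{E}_{1,2}\big[\log p(Z, X = D)\big]\Big)

where \mathbb{E}_{1,2}[\cdot] denotes the expectation over the remaining factors q_1 q_2 (the "Special Expectation Notation" of the video), and analogously for q_1^{*} and q_2^{*}; in general, each factor q_j^{*} is proportional to \exp(\mathbb{E}_{i \neq j}[\log p(Z, X = D)]).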
-------
📝 : Check out the GitHub Repository of the channel, where I upload all the handwritten notes and source-code files (contributions are very welcome): github.com/Ceyron/machine-lea...
📢 : Follow me on LinkedIn or Twitter for updates on the channel and other cool Machine Learning & Simulation stuff: / felix-koehler and / felix_m_koehler
💸 : If you want to support my work on the channel, you can become a patron here: / mlsim
-------
⚙️ My Gear:
(Below are affiliate links to Amazon. If you decide to purchase the product or something else on Amazon through this link, I earn a small commission.)
- 🎙️ Microphone: Blue Yeti: amzn.to/3NU7OAs
- ⌨️ Logitech TKL Mechanical Keyboard: amzn.to/3JhEtwp
- 🎨 Gaomon Drawing Tablet (similar to a WACOM Tablet, but cheaper, works flawlessly under Linux): amzn.to/37katmf
- 🔌 Laptop Charger: amzn.to/3ja0imP
- 💻 My Laptop (generally I like the Dell XPS series): amzn.to/38xrABL
- 📱 My Phone: Fairphone 4 (I love the sustainability and repairability aspect of it): amzn.to/3Jr4ZmV
If I had to purchase these items again, I would probably change the following:
- 🎙️ Rode NT: amzn.to/3NUIGtw
- 💻 Framework Laptop (I do not get a commission here, but I love the vision of Framework. It will definitely be my next Ultrabook): frame.work
As an Amazon Associate I earn from qualifying purchases.
-------
Timestamps:
00:00 Introduction
00:45 Recap: Variational Inference
01:07 Definition: Mean Field Approach
02:19 But, what is Q?
02:55 Example for 3d latent vector
03:20 ELBO Maximization for the example
03:51 Recap: Evidence Lower Bound
05:12 Factorization plugged into ELBO
06:14 Simplifying the ELBO for q_0
14:24 Special Expectation Notation
15:37 Simplifying the ELBO for q_0 (cont.)
17:33 Simplified ELBO in optimization
18:58 Maximizing the Functional
23:07 Generalization for arbitrary subdivisions
24:31 Summary
25:02 Outro
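Several comments below ask what it means to "update a function until convergence" (Summary chapter). Since each optimal factor depends on the other factors through the expectation \mathbb{E}_{i \neq j}[\cdot], one cycles through the factors and re-computes each one in turn, a scheme known as coordinate-ascent variational inference (CAVI). As a minimal, concrete illustration, here is a NumPy sketch (not code from the video; the bivariate Gaussian target and all numbers are hypothetical, following the classic example in Bishop's PRML, Section 10.1.2):

```python
import numpy as np

# Toy "true posterior": a bivariate Gaussian with known precision matrix.
# All numbers are hypothetical, purely for illustration.
mu = np.array([1.0, -1.0])            # true posterior mean
Lam = np.array([[2.0, 0.8],
                [0.8, 1.5]])          # true posterior precision (inverse covariance)

# Mean-field surrogate q(z) = q_0(z_0) * q_1(z_1). For a Gaussian target, each
# optimal factor is itself a Gaussian with fixed precision Lam[j, j]; only the
# factor means m[j] remain unknown, and each depends on the other factor.
m = np.zeros(2)                       # initial guess for the factor means

for sweep in range(100):
    m_old = m.copy()
    # Coordinate ascent: each line maximizes the ELBO w.r.t. one factor
    # while the other factor (i.e. the other mean) is held fixed.
    m[0] = mu[0] - Lam[0, 1] / Lam[0, 0] * (m[1] - mu[1])
    m[1] = mu[1] - Lam[1, 0] / Lam[1, 1] * (m[0] - mu[0])
    if np.max(np.abs(m - m_old)) < 1e-12:   # "until convergence"
        break

print(f"converged after {sweep + 1} sweeps: m = {m}")   # m approaches mu
```

For this toy problem, the factor means converge to the true posterior mean, while the mean-field marginal variances 1/Lam[j, j] understate the true marginal variances, a well-known limitation of the factorized approximation.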

Comments
  • Very helpful, I'm really happy I discovered your channel.

    @sinaaghaee • 4 months ago
    • Thanks a lot :). Welcome to the channel! Feel free to share it with your friends and colleagues.

      @MachineLearningSimulation • 4 months ago
  • You're such a talented teacher. Thank you 🙏

    @olivrobinson • 2 months ago
    • Thanks for this very kind comment :). I'm glad the video was helpful to you.

      @MachineLearningSimulation • 2 months ago
  • Your channel is everything I need. You're such a good teacher. Looking forward to the future episodes. Thank you :)

    @yeahzisue • 3 years ago
    • Thanks a lot for your feedback :) I love teaching, and it's amazing to hear that the videos are of great value. There is a lot more to come on probabilistic machine learning.

      @MachineLearningSimulation • 3 years ago
  • I spent 50 mins on VI in your videos and understood more than in 10h of reading books/lectures.

    @bolmanjr906 • 1 year ago
    • I'm honored, thanks ♥️

      @MachineLearningSimulation • 1 year ago
  • The fact that you can find great lectures on basically everything right now is incredible. It amazes me more and more: the feeling that I will never be stuck alone with a problem, because not only has someone done this before, but somebody also put great effort into making a lecture about it. Amazing channel, thank you so much!

    @McSwey • 2 years ago
    • That's beautiful to hear! ❤️ Actually, this was also one of my major motivations to start the channel. The internet is an amazing place to have a free and diverse set of educational resources. You find really good ones on more basic topics (like Linear Algebra, 1D Calculus etc.), but they become rarer the closer you come to the frontier of research. This is of course reasonable, since the demand is just lower. Yet, I think that it is absolutely crucial to also have high-quality videos that have been produced specifically for an online audience (not just recorded lectures - those are of course also great). I believe this has many implications, one of them being that knowledge becomes accessible to literally everyone (who has Internet access and knows English). This is just mind-blowing to me. Personally, I benefited a lot from these available resources. The channel is my way to give back on the topics I acquired expertise in and that have not yet been covered to the extent necessary. In a nutshell, I am super happy to read your comment, since that was exactly what I was aiming for with the channel. 😊

      @MachineLearningSimulation • 2 years ago
    • @@MachineLearningSimulation Your channel is a gold mine. I'm writing my master's thesis, and this video was everything I needed to understand the paper that is closest to the method I'm proposing. So to me, being able to watch it is a big deal :) I'll binge-watch all your videos when I have more time. Bayesian inference especially is something I always wanted to get into.

      @McSwey • 2 years ago
    • @@McSwey That's wonderful. I am super happy I could help! :) Good luck with your thesis. And enjoy the videos :). Please always leave feedback and ask questions.

      @MachineLearningSimulation • 2 years ago
  • Thanks for the video. Very clear and informative. The content and flow are well thought through. Danke.

    @ythaaa • 2 years ago
    • Gerne ;) This video covers one of the topics I struggled with for a long time. I found the derivation in Bishop's book quite tough, so it's amazing to hear that the video is of great help.

      @MachineLearningSimulation • 2 years ago
  • Thank you so much! It is much clearer than Bishop's book!

    @zhenwang5872 • 2 years ago
    • You're very welcome! :) I liked building the derivation around a simple example. Amazing that you appreciate it ♥️

      @MachineLearningSimulation • 2 years ago
  • Thanks for your video! It's nice and clear.

    @duoyizhang4665 • 2 years ago
    • You're welcome 😊 Glad you enjoyed it.

      @MachineLearningSimulation • 2 years ago
  • This was very well explained. I was struggling to understand the mean-field algorithm from the "Variational Inference review" by David Blei.

    @VarunTulsian • 1 year ago
    • You're very welcome. I haven't read that paper yet; when I first learned about the topic, I struggled with it in Bishop's "Pattern Recognition and Machine Learning". I'm glad the different perspective I present here is helpful :)

      @MachineLearningSimulation • 1 year ago
  • Cool video! It's what I need! Thank you ❤️ from CN

    @xinking2644 • 2 years ago
    • You're very welcome 😊

      @MachineLearningSimulation • 2 years ago
  • Thank you very much for your videos on variational inference. You explained it even better than Bishop's book.

    @dailyshorts4356 • 2 years ago
    • You're very welcome :) And great that you mention Bishop's amazing book. I love it, but at some points I thought that the example-based derivations I do here in the video could be a bit more insightful. Thanks for appreciating this :)

      @MachineLearningSimulation • 2 years ago
  • The best video one can find on the Mean-Field Approximation. Thank you!! By the way, what do you mean by normalizing the q0 at the end of the video?

    @saichaitanya4735 • 1 year ago
    • Thanks a lot.

      @MachineLearningSimulation • 1 year ago
  • Great derivation!

    @todianmishtaku6249 • 1 year ago
    • Thanks again 😊 I see you're taking a tour through the probabilistic series, appreciate it 😊

      @MachineLearningSimulation • 1 year ago
    • @@MachineLearningSimulation Yes. Thank you. I plan to go through all your series. They are appealing. I like the fact that they are self-contained, compared to many other sources, both books and videos.

      @todianmishtaku6249 • 1 year ago
  • Thanks! So clear.

    @dataencode57 • 2 years ago
    • Thanks. You're welcome 😊

      @MachineLearningSimulation • 2 years ago
  • Thank you very much.

    @medomed1105 • 2 years ago
    • You are welcome :)

      @MachineLearningSimulation • 2 years ago
  • Just one observation: the unconstrained functional optimization gives you an unnormalized function, and you said it is necessary to normalize it afterwards. So what guarantees that this density, once normalized, will be the optimal solution? If you normalize it, it will no longer be a solution to the functional-derivative equation. Also, the Euler-Lagrange equation gives you a critical point; how does one know whether it is a local minimum/maximum or a saddle point?

    @piero8284 • 5 days ago
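A standard way to resolve both concerns above (following e.g. Bishop's PRML, Section 10.1.1; this is a reconstruction, not a quote from the video): with the other factors held fixed, the ELBO as a functional of q_0 can be rewritten, up to an additive constant, as a negative KL divergence,

\mathrm{ELBO}(q_0) = -\,\mathrm{KL}\big(q_0 \,\|\, \tilde{p}\big) + \text{const}, \qquad \tilde{p}(z_0) = \frac{\exp\big(\mathbb{E}_{1,2}[\log p(Z, X = D)]\big)}{\int \exp\big(\mathbb{E}_{1,2}[\log p(Z, X = D)]\big)\, \mathrm{d}z_0}

Since the KL divergence is non-negative and vanishes exactly when q_0 = \tilde{p}, the normalized density \tilde{p} is the unique global maximizer among all normalized q_0, not merely a stationary point, which also rules out a saddle point or a minimum.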
  • Many thanks. How can you find the normalizer at the end? And afterwards, what does it mean to UPDATE a FUNCTION until convergence? Thanks

    @MLDawn • 1 year ago
    • Hi, you're welcome. Thanks for the comment ☺️ Can you please give timestamps for the points in the video you are referring to? That helps me recall what exactly I said in the video, thanks.

      @MachineLearningSimulation • 1 year ago
    • @@MachineLearningSimulation Hi, first of all, incredible video! You explained this topic so much better than the lecturers at my university! I think the "update a function until convergence" part refers to 25:00. I'd also be interested in knowing what you meant there; is the Mean Field approach an iterative approach? Also, this is probably a stupid question, but at 24:04, for example, you show us the final formula where we take the expectation over all i != j. I assume we would do this via integration? But if we are able to integrate the joint probability p(Z, X=D) with respect to all z_i with i != j, why can't we just marginalize the joint probability p(Z, X=D) to get p(X=D) and then use Bayes' rule to directly compute p(Z|X)? I know you probably explained that somewhere in there, but I don't quite get why that isn't an option.

      @user-vo5dw7nc8x • 9 months ago
  • While maximizing the functional, why did you say that d[q_0 E_{1,2}[log p]] / d q_0 = E_{1,2}[log p]? (Assume that d is a partial derivative here.) Shouldn't E_{1,2}[log p] also depend on q_0(z_0)? Since p depends on z_0, z_1, and z_2, it cannot be considered a constant w.r.t. q_0(z_0)?

    @agrawal.akash9702 • 2 months ago