[1hr Talk] Intro to Large Language Models

5 May 2024
1,821,087 views

This is a 1 hour general-audience introduction to Large Language Models: the core technical component behind systems like ChatGPT, Claude, and Bard. What they are, where they are headed, comparisons and analogies to present-day operating systems, and some of the security-related challenges of this new computing paradigm.
As of November 2023 (this field moves fast!).
Context: This video is based on the slides of a talk I gave recently at the AI Security Summit. The talk was not recorded but a lot of people came to me after and told me they liked it. Seeing as I had already put in one long weekend of work to make the slides, I decided to just tune them a bit, record this round 2 of the talk and upload it here on KZhead. Pardon the random background, that's my hotel room during the Thanksgiving break.
- Slides as PDF: drive.google.com/file/d/1pxx_... (42MB)
- Slides as Keynote: drive.google.com/file/d/1FPUp... (140MB)
Few things I wish I said (I'll add items here as they come up):
- The dreams and hallucinations do not get fixed with finetuning. Finetuning just "directs" the dreams into "helpful assistant dreams". Always be careful with what LLMs tell you, especially if they are telling you something from memory alone. That said, similar to a human, if the LLM used browsing or retrieval and the answer made its way into the "working memory" of its context window, you can trust the LLM a bit more to process that information into the final answer. But TLDR right now, do not trust what LLMs say or do. For example, in the tools section, I'd always recommend double-checking the math/code the LLM did.
- How does the LLM use a tool like the browser? It emits special words, e.g. |BROWSER|. When the code "above" that is inferencing the LLM detects these words it captures the output that follows, sends it off to a tool, comes back with the result and continues the generation. How does the LLM know to emit these special words? Finetuning datasets teach it how and when to browse, by example. And/or the instructions for tool use can also be automatically placed in the context window (in the “system message”).
- You might also enjoy my 2015 blog post "Unreasonable Effectiveness of Recurrent Neural Networks". The way we obtain base models today is pretty much identical on a high level, except the RNN is swapped for a Transformer. karpathy.github.io/2015/05/21/...
- What is in the run.c file? A bit more full-featured 1000-line version here: github.com/karpathy/llama2.c/...
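The tool-use note above (special words such as |BROWSER| being detected by the code that runs the LLM) can be sketched roughly as follows. This is only an illustration under stated assumptions, not how any real system is implemented: `toy_model` is a scripted stand-in for an LLM, `fake_browser` is a stand-in for a real tool, and the `|END|` and `|RESULT|` markers are hypothetical names invented for this example. Only `|BROWSER|` comes from the description.

```python
from typing import Callable, Dict, List

BROWSER = "|BROWSER|"   # special word mentioned in the description
END_CALL = "|END|"      # hypothetical marker: end of the tool query
RESULT = "|RESULT|"     # hypothetical marker: tool output follows
EOS = "<eos>"           # hypothetical end-of-generation token

# Scripted "generations" for the toy model (assumption: a real LLM would
# produce these token by token from its learned parameters).
BEFORE_TOOL = [BROWSER, "capital", "of", "France", END_CALL]
AFTER_TOOL = ["The", "capital", "is", "Paris.", EOS]


def toy_model(context: List[str]) -> str:
    """Return the next token given the context (scripted stand-in for an LLM)."""
    if RESULT in context:
        # Count tokens generated after the RESULT marker and the result text.
        done = len(context) - (context.index(RESULT) + 2)
        return AFTER_TOOL[done]
    # context[0] is the user prompt, everything after it was generated.
    return BEFORE_TOOL[len(context) - 1]


browser_log: List[str] = []  # records the queries sent to the fake tool


def fake_browser(query: str) -> str:
    """Stand-in for a real browsing tool."""
    browser_log.append(query)
    return "Paris is the capital of France."


def generate(prompt: str,
             model: Callable[[List[str]], str],
             tools: Dict[str, Callable[[str], str]],
             max_steps: int = 50) -> str:
    """Run the model, intercepting special words and calling tools."""
    context = [prompt]
    visible: List[str] = []   # tokens shown to the user
    active_tool = None        # special word currently being handled, if any
    query: List[str] = []
    for _ in range(max_steps):
        token = model(context)
        context.append(token)
        if token == EOS:
            break
        if active_tool is not None:
            if token == END_CALL:
                # Send the captured output to the tool, put the result into
                # the context window, then continue the generation.
                result = tools[active_tool](" ".join(query))
                context += [RESULT, result]
                active_tool, query = None, []
            else:
                query.append(token)   # capture the output that follows
        elif token in tools:
            active_tool = token       # the model emitted a special word
        else:
            visible.append(token)
    return " ".join(visible)


answer = generate("What is the capital of France?", toy_model,
                  {BROWSER: fake_browser})
print(answer)
```

The point of the sketch is that the tool result ends up in the context window (the "working memory"), so the rest of the answer is generated with that information available rather than from memory alone.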
Chapters:
Part 1: LLMs
00:00:00 Intro: Large Language Model (LLM) talk
00:00:20 LLM Inference
00:04:17 LLM Training
00:08:58 LLM dreams
00:11:22 How do they work?
00:14:14 Finetuning into an Assistant
00:17:52 Summary so far
00:21:05 Appendix: Comparisons, Labeling docs, RLHF, Synthetic data, Leaderboard
Part 2: Future of LLMs
00:25:43 LLM Scaling Laws
00:27:43 Tool Use (Browser, Calculator, Interpreter, DALL-E)
00:33:32 Multimodality (Vision, Audio)
00:35:00 Thinking, System 1/2
00:38:02 Self-improvement, LLM AlphaGo
00:40:45 LLM Customization, GPTs store
00:42:15 LLM OS
Part 3: LLM Security
00:45:43 LLM Security Intro
00:46:14 Jailbreaks
00:51:30 Prompt Injection
00:56:23 Data poisoning
00:58:37 LLM Security conclusions
End
00:59:23 Outro

Comments
  • Andrej is doing more for the AI community through his videos than entire companies

    @namanmenezes1434 5 months ago
    • Right on!

      @royhasiani9005 5 months ago
    • He represents the "Open" in OpenAI. More please!

      @dwrtz 5 months ago
    • While others quarrel for power and control, Andrej is cool calm and educating the masses on important things that matter. If Altman is the leader of the classes then Andrej is the leader of the masses (learners and folks of the AI community in the future).

      @be_present_now 5 months ago
    • or universities

      @19Ronin95 5 months ago
    • Indeed! And let us not forget Andrew Ng. They are democratizing the knowledge and understanding of AI across the globe. Respect!

      @gandev 5 months ago
  • I am a college professor and I am learning from Andrej how to teach. Every time I watch his video, I learn not only the content but also how to deliver any topic effectively. I would vote him as the best “AI teacher in KZhead”. Salute to Andrej for his outstanding lectures.

    @BAIR68 5 months ago
    • I was also taking note of his delivery. I also found it very effective and think he’s an outstanding communicator. I think this talk could easily be consumed by a non technical viewer yet still engage those who are quite familiar with the technical underpinnings.

      @tjayoub 4 months ago
    • He is a perfect balance of big picture n drill down

      @bleacherz7503 4 months ago
    • lol quit ur job

      @Snail641 3 months ago
    • He is very effective, no doubt.

      @aldotanca9430 3 months ago
    • vrk🎉vybs545k,

      @khadijahmehmood3152 2 months ago
  • 0:16: 🎥 A talk on large language models and the Llama 270b model.
    4:42: 💻 Training the 4.42 model involves collecting a large chunk of text from the internet, using a GPU cluster for computational workloads, and compressing the text into parameters.
    9:25: 📚 A neural network is trained on web pages and can generate text that resembles different types of documents.
    13:47: 🧠 The video discusses the process of training neural networks and obtaining assistant models.
    18:31: 💻 Creating an AI assistant involves a computationally expensive initial stage followed by a cheaper fine training stage.
    46:18: 🔒 Language models like GPT-3 can be vulnerable to jailbreak attacks, where they bypass safety measures and provide harmful information.
    23:09: 🤖 Language models can be used to generate sample answers, check work, and create comparisons.
    27:50: 🔍 Using a concrete example, the video discusses the capabilities of language models and how they evolve over time.
    32:25: 🔑 The video explains how AI language models like GPT-3 can be used to generate images based on natural language descriptions.
    36:49: 🗣 The video discusses the concept of large language models and the possibility of converting time into accuracy in language processing.
    41:21: 🔧 The video discusses the customization options available for large language models like ChatGPT.
    50:49: 🔒 The video discusses two types of attacks on large language models: noise pattern injection and prompt injection.
    55:34: 🔒 The video discusses the risks of prompt injection attacks and data exfiltration through Google Apps Scripts.
    Recapped using Tammy AI

    @ambition112 5 months ago
    • Thank you! Your effort is much appreciated.

      @RC-br1ps 4 months ago
    • Not 270 billion....

      @Yusuf-sy6rb 4 months ago
    • It's Llama 2 - 70b model

      @kishcool 4 months ago
    • thank you

      @uk7769 4 months ago
    • What's the difference between large language and text to speech

      @kiyonmcdowell5603 2 months ago
  • Dear Andrej, I cannot stress enough the value of this wonderful presentation. I am sharing it with all my peers. Thank you so much for this.

    @stefanmangold6512 5 months ago
    • it's at a right level for developers who know some things (i.e. training/inference etc) but not more. Fully practical too!

      @whoisbhauji 5 months ago
    • you are welcome stefan ! i love writing and talking about this stuff !

      @irshviralvideo 5 months ago
    • This was more like an advertisement for OpenAI but go off

      @DistortedV12 5 months ago
    • @@DistortedV12 More like for scale AI

      @irshviralvideo 5 months ago
  • I just love how Andrej loves what he's doing. He's chill, makes jokes and laughs about bugs. I can understand much more seeing code for ten minutes rather than reading tens of hours of medium articles

    @LucaSimonetti 5 months ago
    • I love him too, he’s not like Ilya,sam and other in the era

      @ai.simplified.. 5 months ago
    • @@ai.simplified.. ilya is great too

      @saliherenyuzbasoglu5819 a month ago
  • This guy is a gem to the world.

    @jeffwads 5 months ago
    • he once save my family of 24 kids from hanger

      @hadgadma3589 a month ago
  • It’s insane to me that this content is freely accessible online. Great stuff Andrej hope you continue to post more lectures!

    @caydendunn8404 5 months ago
  • You know when someone makes a topic so accessible and understandable you feel like you're hearing a story but learning a lot. This happened in this video.

    @aryanrahman3212 5 months ago
  • Andrej is hands-down one of the best ML educators out there. What a gift for all of us this guy is.

    @agamemnonc 5 months ago
  • You're soooo good at simplifying these complex topics.. thank you for everything you do for us Andrej

    @the3rdworlder293 5 months ago
    • hes so good at simplifying because he has a lot of knowledge in this space. he can break it down to simple words.

      @artmusic6937 5 months ago
    • Andrej is indeed an awesome guy.

      @webgpu 5 months ago
  • The fact that one of the leaders in AI has the care to make videos for everyday people to gain understanding of AI and the coming technology shifts is incredible. Thank you Andrej, you are greatly appreciated by many, more than you may realize.

    @wires__ 5 months ago
  • Hands down, this and Simon Willison’s “Catching up with the weird world of LLMs” are two of the best introductory talks on this topic I’ve seen so far!

    @rednafi 5 months ago
  • I'm setting aside a daily one hour on my schedule to learn from Andrej otherwise this guy is everything that I need for my carrier development. Thanks Andrej Karpathy.

    @user-rp2pf5lk2n 5 months ago
    • Career development * good luck 👍 😊

      @AncientPrayers 5 months ago
    • @@AncientPrayers oh thanks!

      @user-rp2pf5lk2n 5 months ago
  • Your teaching style always gets through to me. Calm and pointed. This is exciting. - Edit: The LLM as OS followed by how to convince it to do anything you want. Wow. And ChatGPT does sound like SJ from "HER" when you speak to it even though it swears it's an amalgamation of voices. It's great. Thanks again for sharing. You rock.

    @johnnypeck 5 months ago
  • It's incredible how good an educator Andrej is. You are able to distill info in a way that's extremely easy to understand. Thanks!

    @lgvivqzt 5 months ago
  • This is amazing, thank you for the efforts and time spent learning and simplifying! I've been looking for such sort of an expertise video for so long. Keep them coming, please.

    @dilyanadjv 5 months ago
  • Your skill to break these complex things down into something I can actually understand and follow for an hour with full concentration is amazing. Absolutely incredible. The start is so great with the two files. Now I _know_ what an LLM is. Thank you

    @windproxy4362 5 months ago
  • I'm 10 min into the video : and I'm already learning SO MUCH. I've never had LLMs explained with examples like this before. Wow! Clears up SO MUCH confusion from rather 'muddy' explanations I've seen before. THANK YOU ANDREJ.

    @tyronefrielinghaus3467 5 months ago
  • I am just completely blown away by this presentation. This is after watching 100s of such videos like this. No one comes even close. Andrej Karpathy you are the BEST!!!! Thank you so much for creating and sharing.

    @sbanerjee2005 5 months ago
  • This is one of the best KZhead videos I’ve ever seen. Such an accessible explanation of a broad and complex topic. Brilliant!

    @benjaminwootton 5 months ago
  • Wow, this is amazing! Your explanation is super clear and to the point - exactly what we need in the ongoing Q* debate. I'm especially impressed with your take on System 2 and its self-improvement. It really feels like you're making strides in this field. Keep up the fantastic work! 🌟

    @alvilabs 5 months ago
  • Thanks a lot for the video! Truly appreciate taking time out to create these videos!

    @genghis360 5 months ago
  • I love when experts explain stuff. It's the vast knowledge that allows them to simplify concepts to the point, where you can follow, track and learn the functioning of complex systems. Thank you, Andrej! All of us here on KZhead truly appreciate the time and effort you spent on creating this presentation and helping us learn.

    @RappingManualYT 5 months ago
  • Thank you Andrej, I found that both very instructive and informative and you have a well reasoned and balanced approach that is easy to follow and consider. You have provided an overview that has helped me immensely to further grasp this complex subject. Your work is very much appreciated.

    @marktahu2932 5 months ago
  • never seen anyone explained it in such a detail but easy to understand way, you da best sir

    @sid-prod 5 months ago
  • You absolute mad lad! As a "former" web developer trying to pivot into AI, your videos have been absolutely amazing in giving me hope that it's not too late for me to pivot. And here you are giving out even more wisdom, what impeccable timing. Thank you! Ps: Instantly shared on Twitter =D

    @asatorftw 5 months ago
    • hey @asatorftw I'm new/green/wet-behind-ears to AI/DL/ML - it caught my attention that you are trying to pivot. Same here but from a different field. Keen to connect and share/learn from each other on pivot strategies.

      @jebinmathewv 5 months ago
    • following @andrej karpathy is ofcourse on that list :) thank you for this Andrej.

      @jebinmathewv 5 months ago
    • Unless you have or will have MS/PhD in CS or EE don’t even bother trying to get a job pivoting to AI.

      @joeschmidt6597 5 months ago
    • @@joeschmidt6597 Can you elaborate your quite strong opinion a bit more?

      @asatorftw 5 months ago
    • @@asatorftw What joe is saying is that AI is a field where higher education is *almost* crucial. In a world where companies are talking about degrees being unnecessary, there are a select few fields which require degrees and one of which is Artificial Intelligence. Is it possible to become an AI engineer with zero relevant degrees? I guess, but the ones I've met all say that it's highly recommended that you get a Masters or PhD. I've seen very few people who are against degrees for AI. Also the degrees are not just CS, but mostly from Math and Electrical Engineering. I mean if you can get an MS/PhD in Electrical Engineering, you'd be golden. I've once heard Mark Zuckerberg say that he would hire someone with an EE background than a CS background. Andrej Karpathy here did his PhD at Stanford. I've learned that Stanford is very popular for AI given how Andrew Ng ( The guy who started Google Brain ) works as an Adjunct Professor.

      @bananawarriorwootwoot 5 months ago
  • Thanks for the video! I really admire the pace at which you speak, steady and clear instilling in us a sense of clarity and confidence that this technology is exciting and a game changer. Thanks a lot for your time, Andrej!

    @prasannaprabhakar1323 5 months ago
  • I myself have a PhD in this field, but your clarity of thought is far greater than mine. Thank you for this video.

    @computervisionetc 5 months ago
    • WHOA big fuckin BOSS

      @mz4637 a month ago
  • Nice! Thanks for the clear description, slides and time index details. Awesome.

    @AzaB2C 5 months ago
  • Finished watching your makemore videos a few weeks ago, and was wondering when you would have time to make another series like that again. Really love this new video :D

    @abrarsalekinraiyan3170 5 months ago
  • This was absolutely incredible. Thank you so much - it's been so hard to find meaningful educational info on this topic that isn't a master's degree in analytics! This was so well presented that it really highlights how well you know what you're talking about!

    @2TallTremaine a month ago
  • Thank you! This is one of the most informative and easy to follow pieces of this subject matter ever appeared on the internet. Andrej is so knowledgeable and such a good teacher it feels like this is from a family member at dinner who happens to be an AI expert who's trying to explain this to me, instead of trying to overwhelm or impress me with an excess of technical terms. Great content!

    @myfolder4561 5 months ago
  • Excellent talk, really well structured and well presented. Probably the best intro to LLM's out there.

    @samson_77 5 months ago
  • Andrej, your intro to LLMs was a fantastic watch! The security aspects were particularly insightful and well-presented. Thanks for sharing your expertise with us!

    @Priyendu 5 months ago
  • Thank you Andrej Karpathy. Following you since Stanford Lectures. I am a big fan of your teaching style. Thank you for sharing knowledge for free.

    @saugatbhattarai327 2 months ago
  • Thank you for making this video Andrej, it is one of the few videos that explains very well what LLMs are and how they work.

    @user-po3hz8xl8c 4 months ago
  • You are an absolute gem for putting this content out for free. Great all round summary.

    @vivinvijayan 4 months ago
  • Love the overall talk and how things have been explained in a simple manner

    @vikasdhawa 5 months ago
  • One of the best KZhead tutorials on this growing subject. Absolutely amazing! Thank you very much!!

    @jayanta8610 2 months ago
  • The best talk /lecture about LLMs that I have come across. Amiable, crystal clear. Thank you Andrej Karpathy

    @jnozyt 5 months ago
  • These are the type of tech guys you want to work with. Unfortunately, there's only 5% of them because 95% of them are arrogant.

    @channelvalitug9086 a month ago
  • Andrej is the GOAT. I remember his blog post on the Unreasonable Effectiveness of RNNs and thought, wow this is going to be our path into the future. His CS courses online inspired hundreds of thousands. Andrej is the hero we don't deserve. And hopefully his ethos of shared knowledge and community will be embedded in the AGI we are racing towards meeting.

    @greatbigships4260 5 months ago
  • I just lack words to thank you @Andrej. Merci, Gracias, Akpé, labalè etc.... this was amazing and very well explained. Thanks for sharing! You are an amazing human being.

    @kodjojombool 5 months ago
  • Fantastic overview. By far the best introduction to LLMs I've come across. Hands down. Thank you!

    @hvdsomp 5 months ago
  • I'll watch just about anything where Andrej is leading - this was probably the coolest video he has released yet. I really enjoyed the end with security!

    @agenticmark 5 months ago
  • The BEST LLM intro video ever seen! Even extremely insightful for practitioners in this field.

    @jchu9092 5 months ago
  • You are an amazing teacher, Andrej. You "compressed" so much of new and relevant information in your talk.

    @tgwashdc 5 months ago
  • Thank you very much Andrej for your effort in preparing and giving such complex material in a very simple manner.

    @AhmedMahfouzAbd-ElAliem 5 months ago
  • Appreciate you taking the time to do this, Andrej

    @nav3622 5 months ago
  • Great Video Andrej, appreciate your time on making this content =)

    @Warley.Araujo 5 months ago
  • Thanks for sharing so much about such a complex topic in simple words!

    @CucuruzoBy 5 months ago
  • A truly awesome presentation. So clear and well structured, and enables a really satisfying, fast rate of learning. Thank you Andrej.

    @lloydprescott2722 3 months ago
  • AWESOME... this is the best thing I could ask for.

    @Adhithya2003 5 months ago
  • Damn cool! Thank you so much for all your work at OpenAI and Tesla, and throughout your entire life & everything else. Also, this talk about LLM and everything is just amazing and highly insightful. Lovely! : ) In anything in my life, I haven't gained this kind of clarity in any aspect from my teachers. It had always been vague or obscure previously.
    00:02 A large language model is just two files, the parameters file and the code that runs those parameters.
    02:06 Running the large language model requires just two files on a MacBook
    06:02 Neural networks are like compression algorithms
    07:59 Language models learn about the world by predicting the next word.
    11:48 Large Language Models (LLMs) are complex and mostly inscrutable artifacts.
    13:41 Understanding large language models requires sophisticated evaluations due to their empirical nature
    17:37 Large language models go through two major stages: pre-training and fine-tuning.
    19:34 Iterative process of fixing misbehaviors and improving language models through fine-tuning.
    22:54 Language models are becoming better and more efficient with human-machine collaboration.
    24:33 Closed models work better but are not easily accessible, while open source models have lower performance but are more available.
    28:01 CHBT uses tools like browsing to perform tasks efficiently.
    29:48 Use of calculator and Python library for data visualization
    33:17 Large language models like ChatGPT can generate images and have multimodal capabilities.
    34:58 Future directions of development in larger language models
    38:11 DeepMind's AlphaGo used self-improvement to surpass human players in the game of Go
    39:50 The main challenge in open language modeling is the lack of a reward criterion.
    43:20 Large Language Models (LLMs) can be seen as an operating system ecosystem.
    45:10 Emerging ecosystem in open-source large language models
    48:47 Safety concerns with refusal data and language models
    50:39 Including carefully designed noise patterns in images can 'jailbreak' large language models.
    54:07 Bard is hijacked with new instructions to exfiltrate personal data through URL encoding.
    55:56 Large language models can be vulnerable to prompt injection and data poisoning attacks.
    59:31 Introduction to Large Language Models
    Crafted by Merlin AI.

    @Radik-lf6hq 5 months ago
  • Great diagrams, visuals, explanations, and metaphors, and very well organized. Comfortable pace, considering the depth of content covered. I will watch this again.

    @robertcormia7970 5 months ago
  • It does not get better than this, so thanks a lot ⭐ Very inspiring!!

    @AllAboutAI 5 months ago
  • My local university is trying to charge about $2K for an intro to LLM course, here is Andrej taking you from noon to 360 for free. Thanks Andrej

    @theterminalguy a month ago
  • Thank you, Andrej Karpathy, for your incredibly clear and thorough introduction to LLM. Your ability to simplify complex concepts makes learning so much more accessible for everyone. Looking forward to diving deeper into this exciting field with your guidance!

    @sureshkm a month ago
  • God bless you Andrej! You’re the best

    @easterislehead 5 months ago
  • 🎯 Key Takeaways for quick navigation:
    00:00 🤖 *Introduction to large language models*
    - Large language models are made of two files: a parameters file with the neural network weights, and a run file that runs the neural network
    - To obtain the parameters, models are trained on 10+ terabytes of internet text data using thousands of GPUs over several days
    - This compresses the internet data into a 140GB parameters file that can then generate new text
    02:46 🖥️ *How neural networks perform next word prediction*
    - LMs contain transformer neural networks that predict the next word in a sequence
    - The 100B+ parameters are spread through the network to optimize next word prediction
    - We don't fully understand how the parameters create knowledge and language skills
    09:03 📚 *Pre-training captures knowledge, fine-tuning aligns it*
    - Pre-training teaches knowledge, fine-tuning teaches question answering style
    - Fine-tuning data has fewer but higher quality examples from human labelers
    - This aligns models to converse helpfully like an assistant
    26:45 📈 *Language models keep improving with scale*
    - Bigger models trained on more data reliably perform better
    - This works across metrics like accuracy, capabilities, reasoning, etc
    - Scaling seems endless, so progress comes from bigger computing
    35:12 🤔 *Future directions: system 2, self-improvement*
    - Currently LMs only have "system 1" instinctive thinking
    - Many hope to add slower but more accurate "system 2" reasoning
    - Self-improvement made AlphaGo surpass humans at Go
    44:17 💻 *LMs emerging as a new computing paradigm*
    - LMs coordinate tools and resources like an operating system
    - They interface via language instead of a GUI
    - This new computing paradigm faces new security challenges
    46:04 🔒 *Ongoing attack and defense arms race*
    - Researchers devise attacks like jailbreaking safety or backdoors
    - Defenses are created, but new attacks emerge in response
    - This cat-and-mouse game will continue as LMs advance
    Made with HARPA AI

    @prepthenoodles 5 months ago
  • Your mind has so much clarity that articulation at such speed is perfect!!! Awesome - Keep going

    @ConsultantX a month ago
  • Andrej, you have a gift for making complex things sound easy and interesting! Thank you!!

    @dmitryy2199 a month ago
  • A very warm hug to young brother. Thank you for your kindness and selfless service & help. I sincerely hope it is contagious as our World needs lots & lots of it.

    @Anhilator555 5 months ago
  • Chapters (Powered by ChapterMe) -
    00:00 - The busy person's intro to LLMs
    00:23 - Large Language Model (LLM)
    04:17 - Training them is more involved - Think of it like compressing the internet
    06:47 - Neural Network - Predict the next word in the sequence
    07:54 - Next word prediction forces the neural network to learn a lot about the world
    08:59 - The network "dreams" internet documents
    11:29 - How does it work?
    14:16 - Training the Assistant
    16:38 - After Finetuning You Have An Assistant
    17:54 - Summary: How To Train Your ChatGPT
    21:23 - The Second Kind Of Label: Comparisons
    22:22 - Labeling Instructions
    22:47 - Increasingly, labeling is a human-machine collaboration
    23:37 - LLM Leaderboard From "Chatbot-Arena"
    25:33 - Now About The Future
    25:43 - LLM Scaling Laws
    26:57 - We can expect a lot more "General Capability" across all areas of knowledge
    27:44 - Demo
    32:34 - Demo: Generate scale AI image using DALL-E
    33:44 - Vision: Can both see and generate images
    34:33 - Audio: Speech to Speech communication
    35:20 - System 2
    36:32 - LLMs Currently Only Have A System 1
    38:05 - Self-Improvement
    40:48 - Custom LLMs: Create a custom GPT
    42:19 - LLM OS
    44:45 - LLM OS: Open source operating systems and large language models
    45:44 - LLM Security
    46:14 - Jailbreak
    51:30 - Prompt Injection
    56:23 - Data poisoning / Backdoor attacks
    59:06 - LLM Security is very new, and evolving rapidly
    59:24 - Thank you: LLM OS

    @chapterme 5 months ago
    • Thank you!

      @LorencCala 5 months ago
    • Note that 11:29 How does it work? Doesn't actually explain how an LLM works 😉. But it's a nice diagram.

      @skierpage 5 months ago
    • @@skierpage True 😅

      @chapterme 5 months ago
    • Thank you very much!

      @AvaneeshKumarSingh 5 months ago
    • Kindly pin this index👍

      @mayukhdifferent 5 months ago
  • Thanks a lot for this video! Wonderful presentation. Clear, precise, interesting and enlightening.

    @greeentin 2 months ago
  • Thanks for putting this together and sharing it here. This is my first introduction to how LLM's work and it demystified a lot. Cheers!

    @RichardHarlos 5 months ago
  • One thing I wonder often is why haven't any of these chatbots been provided access to compilers and software testing sandboxes, so that they can test their own programming help answers to see if they compile and work. Seems to me like a simple step that could make them far more valuable without adding a quintzillion of parameters.

    @privateerburrows a month ago
    • That's been done a lot. You can google and find academic papers. I've worked on one of such projects and you run into exactly the same problem as with general language: no good automated reward function. Sure, 99.9% of generated code doesn't compile so you may think that successful compilation provides a strong feedback, but it actually does not. That's because 99.9% of compiled code is still useless garbage, flawed in some logical or semantic way and since it passed compilation there is no good way to automatically evaluate it anymore. Coding is a lot more like natural language than most people seem to think - semantics are a lot more important than syntax and compilers only evaluate the latter.

      @MortyrSC2 a month ago
  • What a time to be alive! OpenAI and Ex-tesla wizard himself enlightening us.

    @Adhithya2003 5 months ago
  • Thank you for the video; it was greatly appreciated and addressed many of the questions I had.

    @bhautikpithadiya659 19 days ago
  • Your videos are of very high quality, devoid of redundant information, concise, and easily understandable. I wish there were more videos and lectures like these.

    @Farhad6th 5 months ago
  • The man, the legend, returning to us in our darkest hour. Thank you.

    @isaac10231 5 months ago
  • If anyone wants summarized notes of the video, they are below:
---------
1. Large language models are powerful tools for problem solving, with potential for self-improvement.
Large language models (LLMs) generate text based on input. A model consists of two files: a parameters file and the code that runs it. Training is a complex process that compresses the training text into the parameters at a ratio of roughly 100x. The neural network predicts the next word in a sequence: a sequence of words is fed in, and the parameters dispersed throughout the network determine the prediction. How well an LLM predicts the next word depends on two variables: the number of parameters in the network and the amount of text used for training. The trend of improving accuracy with bigger models and more training data suggests that algorithmic progress is not necessary, as we can achieve more powerful models by simply increasing the size of the model and training it for longer. LLMs are not just chatbots or word generators, but rather the kernel process of an emerging operating system, capable of coordinating resources for problem solving, reading and generating text, browsing the internet, generating images and videos, hearing and speaking, generating music, and thinking for a long time. They can also self-improve and be customized for specific tasks, similar to open-source operating systems.
2. Language models are trained in two stages: pre-training for knowledge and fine-tuning for alignment.
Pre-training compresses text into a neural network using expensive computers. It is a computationally expensive process that only happens once or twice a year, and it focuses on knowledge. In the fine-tuning stage, the model is trained on high-quality conversations, which changes its formatting and turns it into a helpful assistant. This stage is cheaper and can be repeated iteratively, often every week or day. Companies often iterate faster on the fine-tuning stage, releasing both base models and assistant models that can be fine-tuned for specific tasks.
3. Large language models aim to transition to system two thinking for accuracy.
The development of large language models, like GPT and Claude, is a rapidly evolving field, with advancements in language models and human-machine collaboration. These models are currently in the system one thinking phase, generating words based on neural networks. However, the goal is to transition to system two thinking, where they can take time to think through a problem and provide more accurate answers. This would involve creating a tree of thoughts and reflecting on a question before providing a response. The question now is how to achieve self-improvement in these models, which lack a clear reward function, making it challenging to evaluate their performance. However, in narrow domains, a reward function could be achievable, enabling self-improvement. Customization is another axis of improvement for language models.
4. Large language models can use tools, engage in speech-to-speech, and be customized for diverse tasks.
Large language models like ChatGPT are capable of using tools to perform tasks, such as searching for information and generating images. They can also engage in speech-to-speech communication, creating a conversational interface to AI. The economy has diverse tasks, and these models can be customized to become experts at specific tasks. This customization can be done through the GPTs app store, where specific instructions and files for reference can be uploaded. The goal is to have multiple language models for different tasks, rather than relying on a single model for everything.
5. Large language models' security challenges require ongoing defense strategies.
The new computing paradigm, driven by large language models, presents new security challenges. One such challenge is prompt injection attacks, where the models are given new instructions that can cause undesirable effects. Another is the potential for misuse of knowledge, such as creating napalm. These attacks are similar to traditional security threats, with a cat and mouse game of attack and defense. It's crucial to be aware of these threats and develop defenses against them, as the field of LLM security is rapidly evolving.

    @adithyan_ai · 5 months ago
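
The notes above describe an LLM as a network that predicts the next word from the words before it. As a rough illustration of the task only, here is a toy next-word predictor built from bigram counts. This is a deliberately swapped-in technique, not a neural network and not how the models in the talk work; the function names and the tiny corpus are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram_counts(text: str) -> dict:
    """Count, for every word, which words follow it in the text."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        counts[current_word][next_word] += 1
    return counts


def predict_next_word(counts: dict, word: str):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus, standing in for "a chunk of internet text".
corpus = "the cat sat on the mat and the cat slept"
counts = train_bigram_counts(corpus)
print(predict_next_word(counts, "the"))  # "cat" follows "the" twice, "mat" once
```

A real LLM replaces the count table with billions of learned parameters and conditions on a long context rather than a single word, but the input and output of the task are the same.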
  • Thanks a ton for this Andrej! Explained and presented in such simple and relatable terms. Gives confidence to get into the weeds now.

    @mehulchopra1517 · 8 days ago
  • A great overview with a clear outline and numerous suggestions. Keep it up, this is very valuable for the community!

    @max_gorbachevskiy · 5 months ago
  • OpenAI: "Ilya, help us toss Altman! oh, hey where u goin, Brockman? ok, get Murati to fill in. no wait get Altman back. oh shit, we forgot to keep Nadella in the loop." meanwhile, Andrej: "hey guys, welcome to my 'Intro to LLMs' video"

    @semtex6412 · 5 months ago
  • 🎯 Key Takeaways for quick navigation:
00:00 🎙️ The video is an introduction to Large Language Models (LLMs), like ChatGPT, Claude, and Bard.
01:10 💻 LLMs, such as the Llama 2 70B model, consist of just two files: parameters (weights) and code to run the model.
02:04 💾 The Llama 2 70B model has 70 billion parameters, making its parameters file 140 gigabytes.
04:25 🌐 LLMs are trained by compressing a large amount of internet text data using specialized GPU clusters, which is a costly process.
07:23 🤖 LLMs, like ChatGPT, are next-word prediction neural networks and perform this task based on their training data.
14:14 🔄 LLMs go through two main stages of training: pre-training on internet data and fine-tuning on human-generated Q&A data.
19:36 🔁 Model improvements are achieved through iterative fine-tuning, where human feedback helps correct and refine the model's responses.
40:49 🧩 Customization of large language models is essential for adapting them to specific tasks and expertise.
41:16 📂 OpenAI is working on customization options for ChatGPT, including custom instructions and knowledge augmentation through file uploads.
42:26 💻 Large language models should be viewed as the kernel process of an emerging operating system, coordinating various resources for problem-solving.
45:51 🛡️ As large language models become a new computing stack, they also face security challenges such as jailbreak attacks, prompt injection attacks, and data poisoning/backdoor attacks.
59:00 🐱‍👤 The field of LLM security involves ongoing cat-and-mouse games between attackers and defenders, with various types of attacks and defenses emerging.
Subscribe to our channels to know more about Data Science & AI

    @decodingdatascience · 5 months ago
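
The 140 gigabyte figure in the takeaways above follows from simple arithmetic once a storage size per parameter is fixed. The sketch below assumes 2 bytes per parameter (a 16-bit float format) and decimal gigabytes; the function and constant names are hypothetical, chosen only for illustration.

```python
def weights_file_size_gb(num_parameters: int, bytes_per_parameter: int = 2) -> float:
    """Approximate size of a raw weights file in gigabytes (1 GB = 10**9 bytes).

    bytes_per_parameter=2 is an assumption: each weight stored as a 16-bit float.
    """
    return num_parameters * bytes_per_parameter / 10**9


# 70 billion parameters, as stated in the takeaways for the 70B model.
PARAMETER_COUNT_70B = 70_000_000_000

size_gb = weights_file_size_gb(PARAMETER_COUNT_70B)
print(f"{size_gb:.0f} GB")  # 70e9 parameters * 2 bytes = 140e9 bytes
```

Storing the same weights at 4 bytes per parameter would double the file size, and lower-precision formats would shrink it, so the figure depends on the storage format as much as on the parameter count.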
  • Thank you for making this. Such an informative talk in such an understandable way with a great presentation to go with it! Excellent job👏👏

    @Kyballn · 5 months ago
  • we need AGI from scratch🥰

    @jingwangphysics · 5 months ago
  • Thank you so much for the great talk, Andrej! Some chapters were truly eye-opening and truly wowed me.

    @AIWithShrey · 5 months ago
  • Excellent presentation. Understandable and thought-provoking. Thank you.

    @josephinflorida · 5 months ago
  • You forgot step 4 of LLM training. The woke training phase.

    @Philinnor · 2 months ago
  • Really amazing! I have prior knowledge of the field, but the way that you brought it together in under one hour was amazing. Thank you!

    @alirezasheikh8797 · 5 months ago
  • Fantastic, thank you for taking the time to share your knowledge and insights.

    @d1patel · 4 months ago
  • For me, the second half was really informative! Loved it. Thanks for your time and generosity.

    @VasudevaK · 5 months ago
  • Amazing how much I learned in just an hour. Love his ability to break down complex subjects and keep you engaged.

    @dspenard · 3 months ago
  • Such a great and easy way of explaining LLMs and their security-related aspects. HUGE respect, Andrej!!

    @user-ru2ni1si1s · 5 months ago
  • Wow. Amazing knowledge you have. Thank you for teaching us.

    @catulopsae · 18 hours ago
  • Andrej, thank you so much. Such an amazing 1 hour ... Believe me, I have been following a lot of people, but your talk was very engaging.

    @cyberwatchforall · 5 months ago
  • By far the best educational video on LLMs I've seen. Thank you, you're a wonderful educator! Please continue the excellent work.

    @balajisivakumar8797 · 5 months ago
  • Thanks for this, it's a great intro for anyone that wants to start learning about LLMs. Your style of teaching is very appealing and you explain the subject in a very approachable way. Keep doing this, we certainly learn a lot from these.

    @remus4791 · 5 months ago
  • Thank you Andrej, glad to see your video. I thought you wouldn't be making any more videos, as your last video was a few months ago, so I was glad when I saw your new one. Your videos are really useful. You have great knowledge of AI. Please make more videos at your convenience. Once again, thanks a ton. Love from India!

    @vinodallam8251 · 5 months ago
  • This is a brilliant presentation on LLM. I love the format and approach taken. Thanks so much!

    @spincolor · 5 months ago
  • Amazing presentation, thank you so much Andrej for this information packed introduction!

    @davidstrom2357 · 5 months ago
  • Thanks Andrej, great video! You make it very easy to understand a complex subject.

    @user-ep6mb5kw7r · 2 days ago
  • You are busier than any of us, yet you give us a busy person's presentation. Love you!

    @marksun6420 · 5 months ago
  • This was beyond fantastic!! Thank you soo much for sharing such a great video!

    @techeepeach9272 · 5 months ago
  • One of the best videos on LLMs I have seen. Very clear and educational. Thank you so much.

    @marancibia1971 · 4 months ago
  • Thanks for everything you do. This video, like most of the others you have made so far, is amazing! 🎉

    @snuffinperl8059 · 5 months ago
  • Really appreciate the clear introduction to the technical details, the current situation, the trends, and security!

    @MrLyonliang · 2 months ago