"How to give GPT my business knowledge?" - Knowledge embedding 101

May 20, 2024
161,512 views

A step-by-step guide to creating your own knowledge base embedding, from preparing knowledge data to retrieval-augmented generation
🔗 Links
- Follow me on twitter: / jasonzhou1993
- Join my AI email list: www.ai-jason.com/
- My discord: / discord
- Finetune LLM video: • "okay, but I want GPT ...
- No code alternative: relevanceai.com/
- Github repo: github.com/JayZeeDesign/Knowl...
⏱️ Timestamps
0:00 What is Knowledge embedding?
4:21 Core business use cases
5:52 Step1 Prep knowledge data
6:25 Step2 Create embedding
8:34 Step3 Similarity search
9:55 Step4 Retrieval augmented generation (RAG)
12:23 Step5 Deploy
14:49 No code alternatives
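Steps 2 and 3 in the timestamps above (create embedding, similarity search) can be sketched roughly as follows. This is a minimal sketch and not the code from the video or the repo: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model, `top_k` is an illustrative helper name, and the sample texts are made up.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a word-count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = toy_embed(query)
    ranked = sorted(
        documents,
        key=lambda d: cosine_similarity(q, toy_embed(d)),
        reverse=True,
    )
    return ranked[:k]


# Made-up knowledge data standing in for past customer emails.
docs = [
    "what is your pricing for the team plan",
    "please stop sending me emails",
    "can we book a demo next week",
]
print(top_k("how much does the team plan cost", docs, k=1))
```

A real setup would swap `toy_embed` for an actual embedding model and keep the vectors in a vector store, but the ranking idea stays the same.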
👋🏻 About Me
My name is Jason Zhou, a product designer who shares interesting AI experiments & products. Email me if you need help building AI apps! ask@ai-jason.com
#gpt #autogpt #ai #artificialintelligence #tutorial #stepbystep #openai #llm #langchain #largelanguagemodels #largelanguagemodel #bestaiagent #chatgpt #embedding #openaiembeddings #wordembeddings

Comments
  • A few people asked, "Why only vectorise one column instead of the whole CSV?" Adding a bit more explanation here: vectorising is mainly for search, so the column you vectorise can be thought of as the "index" or "id" of the dataset, while the data it returns will still be the full question/answer pair. The reasons I vectorise only one column are: 1. It saves cost: vectorising uses an embedding model, so every token we vectorise costs money. 2. It increases accuracy: in this case I only want to search past customer emails, not sales responses. Searching both columns might return the wrong answer, e.g. a search for "interested in learning more" could return the pair "client: stop sending me emails; sales: understood, let us know if you are interested in learning more in future!" Hope this helps!

    @AIJasonZ · 9 months ago
    • It seems Embedding enriches your search query. how about answers? In your example, do you 'train' llm with Q&A pair?

      @ozfish17 · 9 months ago
    • @ozfish17 yep, it returns both Q&A pairs!

      @AIJasonZ · 9 months ago
    • Jason, brilliant step-by-step guide on knowledge embedding! Your breakdown of the process was super insightful. I'm curious about how AI Agents in Langchain perform, especially in long-running scenarios. Hope you'll consider diving into that topic in the future. Keep up the stellar content!

      @Taskade · 8 months ago
    • So if you want the output response email to be generated by the LLM based on a specific tone, why wouldn't the 2nd column be a part of vectorizing the dataset?

      @sandeepbansal1195 · 7 months ago
    • Hey Jason! What would be the best way to do this with financial PDFs? I want to ask questions and get accurate insights from the large documents. Would using embeddings be best or the fine tuning from your other video? Thanks! @AIJasonZ

      @csss142 · 6 months ago
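The point made at the top of this thread (index only the customer column, but return the whole question/answer pair) can be illustrated with a small sketch. It is a toy under stated assumptions, not the author's code: `tokens` and `score` are hypothetical stand-ins that use word overlap instead of a real embedding model, and the two rows are invented around the example in the comment.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased word set; a crude, hypothetical stand-in for an embedding."""
    return set(re.findall(r"[a-z']+", text.lower()))


def score(query: str, text: str) -> float:
    """Jaccard overlap between query and text tokens (toy similarity)."""
    q, t = tokens(query), tokens(text)
    return len(q & t) / len(q | t) if q | t else 0.0


# Invented rows; the first one mirrors the example from the comment.
pairs = [
    {
        "customer": "stop sending me emails",
        "sales": "understood, let us know if you are interested in learning more in future!",
    },
    {
        "customer": "we are interested in learning more about pricing",
        "sales": "happy to help, here is our pricing sheet",
    },
]


def search(query: str, index_columns: list[str]) -> dict:
    """Rank rows by the indexed columns only, but return the whole row."""
    def indexed_text(pair: dict) -> str:
        return " ".join(pair[c] for c in index_columns)

    return max(pairs, key=lambda p: score(query, indexed_text(p)))


query = "interested in learning more"
best = search(query, ["customer"])
print(best["customer"], "->", best["sales"])

# The unsubscribe row scores zero on the customer column alone, but picks up
# a non-zero score once the sales reply is included in the indexed text.
unsubscribe = pairs[0]
print(score(query, unsubscribe["customer"]))
print(score(query, unsubscribe["customer"] + " " + unsubscribe["sales"]))
```

With only the customer column indexed, the unrelated "stop sending me emails" row cannot match on words that appear only in the sales reply, which is the accuracy argument made in the comment.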
  • Small channels like this are the ones that hold the most values.

    @psychxx7146 · 9 months ago
  • In 2 minutes and 54 seconds you explained what is vectoring better than any other video online. You made it easy. Thank you!

    @Helpsmallbusinesses · 9 months ago
  • I have the same idea in mind. I have tons of product documents that I wish I could just ask an agent something about it instead of scrolling hundreds of word pages. I really appreciate your video man.

    @nguyenvanduc2000 · 3 days ago
  • man you have a really rare ability to explain super complicated things in a very simple way and organize the information so it's even more clear. Bravo and thank you

    @funkyboodah · 2 months ago
  • I really love your style, first explaining the theory and then demonstrating it by an example

    @fuxxs5994 · 9 months ago
  • Thank you very much! Nobody explained Embedding and Vectorization like this! Thank you again!

    @muhammadanasazambhatti2772 · 9 months ago
  • Thanks for sharing your knowledge with us, your channel is literally a gold mine of information. Keep doing what you're doing, Jason!

    @sidavidsin · 10 months ago
  • Absolutely great video, I loved that you took the time to explain everything in theory and then went on to give a detailed walkthrough of the code. Please keep posting such videos !

    @shivamroy1775 · 9 months ago
  • Keep it up man probably one of the only channels with incredible value

    @SaminYasar_ · 9 months ago
  • Man... you have a serious gift for teaching! This is super helpful. Thanks.

    @_arman_ · 7 months ago
  • one of the best channels out there, really appreciate your content!

    @davidkwon1233 · 9 months ago
  • I've watched many videos on this topic and I can say that your simple examples have covered most of what I need to know. Thanks Jason.

    @humadi2001 · 2 months ago
  • You have helped the community so much with this valuable content. Keep it up my friend, i'll be watching!

    @Optable · 9 months ago
  • This is super duper helpful man ! Great work and thanks !

    @TheDestint · 8 months ago
  • This is pure gold. Thank you so much!

    @gautamdawar5067 · 8 months ago
  • Great job, and I really appreciate you sharing your knowledge. Looking forward to open LLM content.

    @JJ-vq8mu · 8 months ago
  • Thanks for your work Jason. You're one of the best, and I follow tons.

    @koen.mortier_fitchen · 9 months ago
  • yo bro.. i really like when you explain all the step-by-step and all relevant tools out there! thank you!

    @stepkurniawan · 9 months ago
  • Bro... you are incredibly smart and are a great teacher. This is going to provide 10x value to my users

    @devinoutfleet1980 · 7 months ago
  • the best video about embedding ive seen; thank you!

    @jasonfinance · 10 months ago
  • you're the man Jason, great content!

    @aliyousefi9735 · 9 months ago
  • this is virtual gold, mad props to jason for clearly describing complex topics and even showing practical application, saved me hours of research lol, it'd be great if you can touch up on the various services out there that offer AI services that embed, and how they compare in performance, pros / cons etc.

    @normanluismadrid422 · 8 months ago
  • I'm blown away. Thank you!!

    @pietdebeer7972 · 8 months ago
  • have been waiting for this video, Thank you!

    @kylelau1329 · 9 months ago
  • Love your content good sir, tuned for all next videos you are the leader

    @michalf16 · 9 months ago
  • This was super helpful. Thank you, Jason!

    @aliq6709 · 29 days ago
  • This is GOLD !! Thank You !

    @MrDe0 · 8 months ago
  • this is just awesome, now people who had no idea not only have an idea but also a reference

    @VaibhavShewale · 9 months ago
  • Really high quality content, thank you Jason!

    @ridg2806 · 5 months ago
  • Outstanding. Your ability to explain complicated topics is incredible. Thank you.

    @christhornham · a month ago
  • I really like your video. You know how to get people's attention. Please make more videos like this 😊

    @stevi32800 · 8 months ago
  • Another great video! Thanks Jason, keep up the excellent work

    @half_way_expert · 10 months ago
  • Absolutely outstanding. I liked, subscribed and shared. Best explanation of knowledge embedding I have come across!!!!

    @wojpaw5362 · 6 months ago
  • The video is very inspiring and straightforward, a valuable lesson

    @king94596511 · 9 months ago
  • Cannot be more valuable than this. Loved it 🎉

    @manojnaidu619 · 17 days ago
  • thank you for this As a dev with no AI experience, you really make it easy to understand

    @kurtcampher4716 · 8 months ago
  • Great video! Very simple to understand.

    @manideepatalukdar9201 · 9 months ago
  • More excellent content, thanks mate

    @CyberSQUID9000 · 9 months ago
  • Great tutorial! You covered a LOT of ground quickly, but thoroughly. Haha. Nice work.

    @PlectrumShorts · 8 months ago
  • Exactly what I want, thanks Jason.

    @YangYang-rh8uy · 2 months ago
  • Anyone looking to make a great startup in AI,you have to jump on this!

    @photon2724 · 10 months ago
    • Working on it!

      @i_forget · 9 months ago
    • Working on it now

      @dragoon347 · 9 months ago
  • Very well done! Straightforward to follow!

    @robertcormia7970 · 4 months ago
  • Great content and love the intros

    @chrisvienneau3366 · 9 months ago
  • Fantastic content, thank you.

    @adi2hot · 9 months ago
  • Amazing explanations, thank you!

    @NickWatching · 3 months ago
  • Thank you! This was incredibly helpful

    @kylearnold9647 · 7 months ago
  • Thank you for sharing!

    @alvropena · 9 months ago
  • These are gems

    @AI_Ron · 9 months ago
  • This is hilariously good. Thanks for this wonderful ressource!

    @AlessaOxygen-ot4rl · 4 months ago
  • Great job. You deserve more subscribers.

    @karankanchetty8320 · 2 months ago
  • Thanks Jason this was a great tutorial! :)

    @scratch123 · 7 months ago
  • Love your content, very easy to digest and understand. The only recommendation I would give is to use other embeddings and LLM models besides OpenAI. Mid/Large sized companies cannot use OpenAI in their environment because of legal issues around OpenAI's data retention policy. A lot of companies want to develop their own implementations, so including other models like Llama 2, Vicuna, etc. would allow you to reach a bigger audience.

    @verasalem5071 · 9 months ago
    • yea great points, thanks for the recommendation! totally get that company dont want to send any data to OpenAI LOL

      @AIJasonZ · 9 months ago
    • +1 for using more open models. I love your content and the approach you take to your videos. But even though I'm not a big company I just value using systems that are open instead of closed.

      @Ascended23 · 9 months ago
  • You are the man Jason!

    @xulipaTV · 10 months ago
  • I am really surprised that these tools can help so many businesses doing the low-cost and autonomous response specifically for customer service! Great video!

    @farid3101 · 2 months ago
  • Big thank you ❤

    @andrzejpec4886 · 9 months ago
  • Would love to see a similar demo of knowledge search with open-source models rather than OpenAI models

    @shethromesh · 9 months ago
  • Came here after the fine tune model video - looking for exactly this. Thanks!

    @rahuliyer6007 · 5 months ago
  • Very nice video.

    @fenderbender2096 · 8 months ago
  • Great explanation

    @arunkabilan · 9 months ago
  • Great video as always, Jason. Thank you for making one of the few channels with genuine AI tools video that actually demonstrate implementation and applications rather than hyping up the content through sweet talk then simply dropping an affiliate link.

    @averagegamer9513 · 10 months ago
    • This! I feel so grateful that the KZhead algorithm blessed me with Jason's channel. Beautiful explanations and clear steps.

      @senxo.visuals · 9 months ago
    • Yeah, he's one of the real ones. I've asked him if he could add a github for the code. It's the only thing this channel lacks imo.

      @koen.mortier_fitchen · 9 months ago
    • @senxo.visuals same feelings here

      @frankchangshow · 7 months ago
  • Great video, subscribed.

    @takeshikriang · 9 months ago
  • Amazing video Jason! Pretty useful information. I would love to see a video about GPT4All as a personal assistance for everyday life.

    @naimneman · 9 months ago
  • Awesome explanation, thanks.

    @markieuanroberts · 6 months ago
  • So helpful! I started using relevance ai because of your videos & just as a no-code developer been able to build some sick ass LLM chains with Zapier Custom HTTP Requests. I have my development team even using it & it’s definitely speeding up our velocity to iterate🙌🔥

    @growthub8541 · 9 months ago
    • thats great to hear! 🤘

      @AIJasonZ · 9 months ago
  • hey, can you share the salse_response.csv file also, it's not in the git repo

    @himanshumishra6253 · 9 months ago
  • Excellent.

    @slimyelow · 8 months ago
  • this dude is on FIRE 🔥

    @shrvn110 · 9 months ago
  • this is the best video on your channel.

    @user-nt2fs7qp6c · 7 months ago
  • Amazing!!

    @ayusharora2019 · 9 months ago
  • Excellent vid thank you !

    @oscarcharliezulu · 9 months ago
  • This was 🔥🔥🔥. If I hadn't already subscribed, I would have. Excellent use case! Looking to implement this using Flowise.

    @AssassinUK · 9 months ago
  • Great video

    @sameergamer4567 · 8 months ago
  • Thanks a lot for the info!! Greetings from Mexico 🤙

    @patriciodiaz2377 · 8 months ago
  • Explained very clearly.

    @user-ps3jj1ey5k · 9 months ago
  • This is exactly what I was looking for, I have a question Jason: How can we secure our company personal data?

    @kiraakamaru · 9 months ago
  • you make really useful videos man

    @ivant_true · a month ago
  • Jason you are awesome!

    @davide.2349 · 9 months ago
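  • Once the intermediate vector search results appeared, the whole flow clicked for me right away. Really great. So it takes the vector search results, passes them to the LLM as prompt instructions, and then has the LLM produce the answer.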

    @coldestlin · 3 months ago
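The flow described in the comment above (take the vector search results, put them into the prompt, let the LLM answer) could look something like this. It is only a sketch of the prompt-assembly step: `build_prompt` and the template wording are hypothetical, the retrieved pair is invented, and the actual LLM call is left out on purpose.

```python
def build_prompt(customer_message: str, retrieved: list[tuple[str, str]]) -> str:
    """Combine retrieved (past message, past reply) pairs with the new message."""
    examples = "\n\n".join(
        f"Past customer message: {question}\nPast reply: {answer}"
        for question, answer in retrieved
    )
    return (
        "You are a sales representative. Reply to the new message below, "
        "following the style of the past replies.\n\n"
        f"{examples}\n\n"
        f"New customer message: {customer_message}\n"
        "Reply:"
    )


# Invented example of what a similarity search might have returned.
retrieved_pairs = [
    ("Is there a trial period?", "Yes, we offer a 14-day free trial."),
]
prompt = build_prompt("Do you offer a free trial?", retrieved_pairs)
print(prompt)
```

The resulting string is what would be sent to the model; which model and client library to use is a separate choice.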
  • great video! is that enough info to go out and start building a customer response ai for other people or businesses?

    @rverm1000 · 8 months ago
  • Thank you for the super video. I'm learning LLM and am quite confused between knowledge base embedding, that was mentioned, vs prompt tuning. Could you tell me the difference?

    @DeLeizard · 8 months ago
  • hey, thanks for the very detailed tutorial. Just a question how do you manage to set higher weights for the most recent messages?

    @blitttzzz · 9 months ago
  • thank you so much Jason, is there a way to tell the model not to answer anything not from the csv file?

    @joannezhu101 · 8 months ago
  • Dude. You. Are. Awesome!

    @user-gv6ek5tg2f · 4 months ago
  • Thanks!

    @desiderata2745 · 9 months ago
  • Amazing! Thanks for sharing

    @davidwylie8491 · 9 months ago
  • Thanks for the no-code alternatives

    @maciejbalasinski2419 · 9 months ago
  • Amazing video, thanks so much for sharing! I haven't really understood LangChain until now. Now, let's assume we want to update the vector database because we have additional rows in our CSV or data file. How can we do this (or do you have a video explaining this?)?

    @MarkShust · 9 months ago
  • Bro you are awesome.

    @tahunal · 9 months ago
  • I wonder why those AI channels, like yours, are not exploding. This is so important for the future what you all are doing. Only a few people get this!

    @ludwigvanbeethoven61 · 9 months ago
  • Do you only need to have one use case for the data or can you just upload a lot of data that could be used in different ways? For example, your use case was for responding to customer emails from what I understand, but what if you wanted to upload all of your organization's data and then ask it various questions or use it in various ways?

    @andypejman · 9 months ago
  • It would be amazing if you could make a video creating a knowledge base using long pdfs as source,, and use gpt as well to make an expert assistant in a topic.

    @camach28 · 9 months ago
    • Yes like if the data source is like a book and we want to search the contents in it giving relative data like “I remember this part of the book saying something like this… where was it?” … or “the book had this story … where was it and the main ideas”

      @frankchangshow · 7 months ago
  • top tier content!!!!

    @___Madara__ · 4 months ago
  • Amazing content my guy Amazing

    @tauraik · 9 months ago
  • Thanks ! when will this be on Github ?

    @mike8677 · 9 months ago
  • Is there a way to carry context of previous messages to the next one? so a follow up message can be answered.

    @shahbaazsingh6605 · 7 months ago
  • Question: Can this be done with other LLMs, like Falcon for example, instead of using an OpenAI API key? [kinda new to AI development and wanna try things out before paying for stuff]

    @Productiveshiz-cy3qd · 8 months ago
  • Great video, thank you

    @SS-rt8oo · 2 months ago
  • Hey @AIJasonZ, great video! I'm curious, is there a method to retrieve the confidence level from the embeddings? Since it's possible that not all the information will be present in the embeddings, it would be helpful to have a way to handle such scenarios. For instance, if certain information is missing, perhaps the system could respond with "response not found" or trigger another action like calling an API.

    @ZorinsFactFrenzy · a month ago
  • Thank you so much for your video. It's very helpful. At the same time, is there a way to run this with Llama-2 or other open-source LLMs? Edit: If security is my main concern, how do I go about embedding?

    @satyamgupta2182 · 9 months ago