Building a RAG application from scratch using Python, LangChain, and the OpenAI API
Mar 4, 2024
45,474 views
GitHub Repository: github.com/svpino/youtube-rag
I teach a live, interactive program that'll help you build production-ready machine learning systems from the ground up. Check it out at www.ml.school.
Twitter/X: / svpino
"All you need is Attention" to the details this GREAT teacher is providing , he could have picked up a simple context within the limits and would have been done BUT no his zeal to share and show every single road block that we need to know is OUTSTANDING. God Bless You. Could you please do a Speech To Speech Application as well?
DUDE!! This is the best resource I have found so far. The part I truly appreciate is that you focused on the conceptual parts and NOT just "here is the code."
This will be a great resource! I'd love to see a dedicated video diving deeper into LangGraph and its applications. Keep up the great work!
Excellent walk through capturing all the required details
Love it how you explain everything step by step!
Looking forward to upcoming videos! Also, it would be interesting if you could cover the following topics: 1. LangChain: saving the context of a conversation and using it in subsequent questions. 2. Dialogflow intents, or whether they can be built in LangChain. 3. Adding real-time data based on the conversation. 4. Streaming audio to the LLM, and implementing low-latency, hold-to-talk conversations like ChatGPT's.
Nice, looking forward to diving into this. It would be nice to see a setup for an application with a local LLM for private docs. I think a lot of people are looking for such a video 🤞🏻💪🏻 TY for your nice work!
Santiago, thank you for the valuable RAG videos. Your insights are well-explained and informative. I would be interested in seeing more of your work on this topic.
Awesome video with great explanation, step by step with examples, images and code!! Love it.
Super helpful walkthrough. Thank you !
Thank you for this, your explanation on embeddings was superb!!
Wow. Great teaching! Explaining the "why" in your lessons, in this case RAG, is what you don't find in other teaching material. Thank you, this has been very insightful! 👍
This is the best crystal-clear RAG walkthrough I have seen so far! Congratulations, and thank you for putting effort into educational videos on GenAI.
Thanks!
Super insightful! Very nice explanation and walkthrough! Thank you!
Such an amazing Resource !! Thanks Santiago ❤
Genuinely found this to be hyper-valuable, after watching this, tons of concepts now actually make sense to me, Thank you!
I already have experience with Langchain, but wanted to learn more about Pinecone. You are just exceptional at teaching. Thank you so much for this. :)
This tutorial was the perfect level of theory + code for a beginner like me to get enough knowledge to start tinkering myself. Subscribed and looking forward to more content!
Love your method of teaching!!!
You're the Legend, Santiago! Love the way you explain :)
Thank you. It is the perfect video for when you want to start doing RAG right after learning what RAG is for. Now it is easy to build a chatbot for technical documentation queries.
Thanks for the video. Very clear explanation of all concepts involved. Looking forward to your next videos. If possible, please can you cover the topic of production deployments and operations for a LLM project.
Thanks for the detailed, clearly explained tutorial. Nice
This video got to me on the perfect day, you just solved a whole project for me with this. Thanks!
For me, a good mentor is someone who can explain difficult topics with elegant simplicity, and you do it superbly! Looking forward to Cohort 12 :)
I’ll see you in class!
Keep uploading this kind of content
Thank you a lot Santiago, best tutorial I have watched so far
WOOWWW!! An amazing effort, thank you so much. Looking forward to more.
I have implemented your example in Node.js and it's working like a charm. Thanks!
But I really hate the LangChain API. I'm trying to implement it as plain old functions.
Great tutorial! Thank you Santiago! I actually tried it myself (first time using the OpenAI API). I was able to build a RAG application to get answers based on a PDF document. And it works! However, I see there is more to think about; for instance, storing the embedded pages in the vector store might not be the best approach. I am looking forward to new videos!
Thanks for your valuable insights! It has been immensely helpful.
Awesome content and explained in details, thank you !!
Great information and delivery as always, Santiago! Greetings from Chile
Thanks!
This is really good. Thanks for posting this tutorial
Very helpful and practical tutorial. Thank you Santiago.
Amazing explanation. Thank you for making this video.
This video is very informative ,thank you
Fantastic tutorial! Thanks a lot! I already saw other tutorials on LangChain and also I purchased a Udemy course. This tutorial here is the best of all. Everything is well explained and can easily be understood.
Nicely done!!
Excellent video and excellent explanation. Thanks a lot.
Great video and explanation, thanks m8!
Wonderful video! I made my post-graduate final project exactly like this.
Man, this is great, keep it up. Also, it would be appreciated if you made a video on a full RAG system, including: 1. the most suitable way of using memory; 2. how to embed and deploy these applications to production (web, mobile, SaaS, etc.).
Great tutorial - thanks
You gain a new follower. Great job!
Damn... I learned a lot from this video! Thank you for this. I have subscribed to your channel and am looking forward to watching your future videos.
Thank you!
Really informative video. Thanks Santiago :)
Best class, period. You need to launch an LLM course!
Great explanation. Waiting for new videos
Wow this was great!
Awesome video! Thank you for it!
Glad you liked it!
Thank you! This is really great work and very helpful to me. It would be greatly appreciated if you could make a video about prompt engineering.
thank you for this video, mate :)
super. Well explained. thanks
Great video. Best video on YouTube. Thank you so much, Santiago.
Dude this is the perfect video
Thanks for your video, it is easy to understand.
This is a really great video, thanks a lot @svpino for sharing it! Liked and subscribed!
Thanks man!
This was great and I learned a lot; I have now subscribed, thank you so much!
Great video. The jupyter notebook was very helpful! Interesting that, by default and only given the context of "Patricia likes white cars," the parser came to the conclusion that Patricia's car was white even though she might not actually own a white car. I added instructions to tell me when it was inferring an answer but makes me wonder what other things it might be inferring without telling me why.
I don't know why but I get the response "I don't know" for that question even though my code and the prompts are identical and I'm also using the OpenAI API.
@@freerider6300 Given that the template for the prompt says to respond with "I don't know," that seems appropriate. Are you by chance using gpt-4 vs gpt-3.5-turbo? I changed my model to gpt-4 and I get "I don't know," whereas with gpt-3.5-turbo I get "White," just like it shows in the video.
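The behavior in this thread hinges entirely on the wording of the prompt template. A minimal sketch of such a prompt, paraphrased from memory (this is NOT the exact template from the repo), including an anti-inference instruction like the one described above:

```python
# A paraphrased RAG prompt sketch (not the exact template from the repo).
# The "I don't know" escape hatch and the inference-disclosure instruction
# are the two ideas discussed in the comments above.
TEMPLATE = """Answer the question based only on the context below. If you
can't answer the question from the context alone, reply "I don't know".
If any part of your answer is inferred rather than stated in the context,
say so explicitly.

Context: {context}

Question: {question}
"""

def build_prompt(context: str, question: str) -> str:
    # This string is what would be sent to the chat model.
    return TEMPLATE.format(context=context, question=question)

prompt = build_prompt(
    context="Patricia likes white cars.",
    question="What color is Patricia's car?",
)
print(prompt)
```

Whether the model honors the escape hatch still depends on the model: as noted above, gpt-3.5-turbo tends to infer "White," while gpt-4 follows the instruction more literally.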
Great video, sir, thank you very much. Please make a video on fine-tuning using PEFT.
Great video. Subscribed.
very clear, i like it, keep going bro
Thanks, will do!
Thank you so much. You're the best.
Glad it helped!
amazing video, thanks
Excellent introduction to using Pinecone for RAG with LangChain. Looking forward to more... I hit an error two-thirds of the way through: AttributeError: 'builtin_function_or_method' object has no attribute '__func__'. Eventually, I installed Python 3.11.8, re-cloned the repo, installed the same dependencies, and it worked again, no problem!
Thanks for another great video Santiago. Any insight into when it might be a good idea to incorporate LlamaIndex into an application? Just learning about it but trying to understand how it fits in with something like Pinecone.
Perfect!
Great video
Best beginner level explanation of RAG with Langchain on Internet. Great one Santiago!
Thanks!
Love your video! If you allow me a small suggestion, I would have put your face in the top-left corner to avoid overlapping with the code. Great work!
True
Thanks Santiago! As always, high-value content! Just one quick question: why not match the LLM max context size with the document chunk size?
Great content. I have a question: how can we add OpenAI's function calling to this for calling external APIs?
Great ❤
Hello, thank you so much for this video; it helped me clarify a lot of concepts. I have a question about aggregation queries over LLM documents. For example, the vector database has thousands of documents with a date property, and I want to ask the model how many documents I received in the last week. What is the best practice for handling this kind of use case?
Thanks Santiago, this helped me understand RAG as a beginner more than any other video out there, your simple example alongside the more complex video vector store in particular. Question: are there local open-source vector store alternatives to Pinecone that are robust and straightforward? I saw your follow-up video, which used Ollama... Going to try writing some example code to replicate this with Ollama vs OpenAI. Thank you!
Check this link: www.reddit.com/r/ChatGPTCoding/s/OIBZJnNYuI
Thanks!
Very well structured tutorial. If there are transcripts of hundreds of videos, how can we also send the video in the answer, so the user can open and watch it after reading? Has anyone tried that?
In case anybody had the same problem activating the venv as I did (the source command didn't work): it's just different based on the OS. The source command works for Linux/macOS, and .venv\Scripts\activate for Windows.
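Spelled out as commands (the ".venv" folder name is just an assumption here; use whatever name the environment was created with):

```shell
# Create the virtual environment (identical on every OS).
python3 -m venv .venv

# Activation differs per OS and shell:
source .venv/bin/activate          # Linux / macOS (bash, zsh)
# .venv\Scripts\activate.bat       # Windows cmd.exe
# .venv\Scripts\Activate.ps1       # Windows PowerShell
```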
Great one; even people with little knowledge of LLMs can understand this very well.
In the retrieval part, what difference should we make if we plan to retrieve from a knowledge graph/base instead of performing a vector search?
Great, thanks, Santiago. What about using GPT-4?
So, LangChain chains are pipes for LLMs? Do they provide any of the output splitting/redirection that you get from classic stdio?
Nice video! Very helpful for beginners! Just one question: in terms of performance, isn't it better to create one prompt and one call to the model with the context, answer, and translation? It would be just one call to the OpenAI server instead of two.
For that particular example, yes. One call would be better. But think beyond that. You may have two separate chains using different models and processes. One call might not be possible, and that’s where chaining different chains might be helpful.
Oh, I see... valid argument! Thanks for the explanation; it's a really great explanation of the concept itself!
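The trade-off in this thread can be sketched without any API at all. This toy code (purely illustrative, not real LangChain classes) mimics the pipe-style composition: two separate chains, each of which could wrap a different model in a real system, composed into one:

```python
# A toy, LCEL-flavored "chain" abstraction: NOT LangChain, just an
# illustration of why chains compose with the `|` operator.

class Runnable:
    """Minimal pipe-style wrapper: `a | b` runs a, then feeds b."""
    def __init__(self, fn):
        self.fn = fn

    def invoke(self, x):
        return self.fn(x)

    def __or__(self, other):
        # Composition: the output of self becomes the input of other.
        return Runnable(lambda x: other.invoke(self.invoke(x)))

# Stand-ins for prompt + model + parser. In a real setup, each of these
# could call a *different* model, which is why a single combined call
# is not always possible.
answer_chain = Runnable(lambda q: f"Answer to '{q}'")
translate_chain = Runnable(lambda text: f"[translated] {text}")

full_chain = answer_chain | translate_chain
print(full_chain.invoke("What is RAG?"))
# -> [translated] Answer to 'What is RAG?'
```

For the simple answer-then-translate case, one combined prompt is indeed cheaper; the composition pays off once the two stages need different models or processing, as the reply above points out.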
What route would you advise to go from an unstructured individual pdf (just used once) to a structured json output with the help of an api?
Hello again! I am trying to build on what I've learned from your video, extending it to sequential chains with multiple input variables. Everything seems fine until I invoke it alongside Pinecone. I already have my index ready, and I get the correct data when doing the similarity_search. I tried this:

    setup = RunnableParallel(
        context=pinecone.as_retriever(),
        input=RunnablePassthrough(),
        topic=RunnablePassthrough(),
    )
    chain = setup | prompt | model | StrOutputParser()
    res = chain.invoke({"input": "What are LLMs?", "topic": "Artificial Intelligence"})

and also this setup for the chain:

    chain = (
        {
            "context": pinecone.as_retriever(),
            "topic": itemgetter("topic"),
            "input": itemgetter("input"),
        }
        | prompt
        | model
        | StrOutputParser()
    )
    res = chain.invoke({"input": "What are LLMs?", "topic": "Artificial Intelligence"})

but I keep getting a `TypeError: expected string or buffer` error. Does this mean only the input should be included in the invoke call? I was able to call chain.invoke() with multiple input variables in your earlier examples without the vector stores, so I'm not sure why it now only accepts a string. Would love your insights on this. Thank you so much!
Very interesting. Can embeddings be used to detect similar strings (like part numbers with and without dashes, or with different variants), or do they only work with concepts like cars, parent relations, colors...?
Good question. I’m not sure. It should be simple to try.
Is this the same type of thing you do with ChatGPT assistants or CoPilot chatbots that point at docs/sites?
Great
Hi there, first time on this channel. I was wondering if this covers cases where my data changes. For example, if I want to edit the price of an item, do I need to manually manage my chunks: delete them, re-embed, and push the updates? Let me know if you solve this in the video so I'd watch the whole thing. Either way, all the best! Subscribed ❤
It doesn’t. I’ll record a video about that.
@@underfitted thanks, I think that's one of the differences between just-for-fun projects and actual production systems, would love to see how you reach a solution. Thanks for replying! All the best!
Thanks for the video! Perfectly explained. However, I did get a URLError when I tried to transcribe the YT video. Why is that? Thanks!
This is a verrrrrrry amazing video teaching us how to build a RAG app. Thanks soooooooo much~ But I ran into a problem following the "Setting up a Vector Store" step; it shows this error: AttributeError: 'builtin_function_or_method' object has no attribute '__func__'. Does anyone know how to deal with it? Thanks so much~~~
Same for me! Did you find a way around it?
Hi Santiago, at 31:33 you mentioned that 1000 words is 750 Tokens. Isn't it the other way round? 1 word around 3/4 Tokens?
Ha! Yeah, more tokens than words. I’m always getting this wrong (notice you also made the same mistake in your comment.) 1 word is about 1.3 tokens
@@underfitted😅
You have not used an open-source vector database or an open-source language model in this video. Please try to make a video on that.
Really great content. I'm happy to announce that you will be my virtual teacher for my generative AI journey. Thanks for such videos!
How would Elasticsearch fit into this architecture?
Hi Santiago, thank you for this video. What is PINECONE_API_ENV? I could not find that info anywhere.
Hi, he answered that question in another comment : "We don't need that one any more, actually. I apologize. When I wrote the code, that variable was needed, but not anymore."
awesome, now subscribed to your yt
Hello, thank you for the GREAT tutorial! I am stuck at step 3. I get AuthenticationError: Error code: 401, saying the API key is incorrect, but I am providing the correct key I got from OpenAI. Do you have any info about this issue? Thank you.
I tried it with an API key from my work account and it worked!