RAG for LLMs explained in 3 minutes

May 15, 2024
10,573 views

How I Explain Retrieval Augmented Generation (RAG) to Business Managers
(in 3 Minutes)
Large language models have been a huge hit for personal and consumer use cases. But what happens when you bring them into your business or use them for enterprise purposes? Well, you encounter a few challenges. The most significant one is the lack of domain expertise.
Remember, these large language models are trained on publicly available datasets. This means they might not possess the detailed knowledge specific to your domain or niche. Moreover, the training data won't include your Standard Operating Procedures (SOPs), records, intellectual property (IP), guidelines, or other relevant content. So, if you're considering using AI assistants "out of the box," they're going to lack much of that context, rendering them nearly useless for your specific business needs.
However, there's a solution that's becoming quite popular and has proven to be robust: RAG, or Retrieval Augmented Generation. In this approach, we add an extra step before a prompt is sent to an AI assistant. This step involves searching through a corpus of your own data (documents, PDFs, transactions, and so on) to find information relevant to the user's prompt.
The information found is then added to the prompt that goes into the AI assistant, which subsequently returns the answer to the user. It turns out this is an incredibly effective way to add context for an AI assistant. Doing so also helps reduce hallucinations, which is another major concern.
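
To make that flow concrete, here is a minimal Python sketch of the retrieve-then-augment-then-generate loop described above. The sample corpus, the keyword-overlap retriever, and the call_llm() placeholder are illustrative assumptions, not a real implementation; a production RAG system would typically use embedding search over a vector store and an actual LLM API.

```python
# Minimal RAG sketch. The corpus, the keyword-overlap retriever, and
# call_llm() are invented placeholders for illustration; a real system would
# use embedding search over a vector store and a real LLM API.

CORPUS = [
    "SOP-12: Refund requests over $500 require director approval.",
    "Policy: Customer data must be stored in the EU region only.",
    "FAQ: Standard shipping takes 3-5 business days.",
]

def retrieve(query: str, corpus: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query."""
    q_words = set(query.lower().split())
    scored = [(len(q_words & set(doc.lower().split())), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Keep only documents that share at least one word with the query.
    return [doc for score, doc in scored[:top_k] if score > 0]

def call_llm(prompt: str) -> str:
    """Stand-in for an actual LLM API call."""
    return f"[LLM answer based on a prompt of {len(prompt)} characters]"

def answer(user_prompt: str) -> str:
    context = retrieve(user_prompt, CORPUS)
    if context:
        # Augment the prompt with the retrieved company data before generation.
        prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + user_prompt
    else:
        # Nothing relevant found: fall back to stock LLM behavior.
        prompt = user_prompt
    return call_llm(prompt)

print(answer("Do refund requests over $500 need approval?"))
```

The important part is the shape of answer(): retrieval happens before generation, and the retrieved text simply becomes part of the prompt the assistant sees.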
Hope you find this overview helpful. Have any questions or comments? Please drop them below.
If you're an AI practitioner and believe I've overlooked something, or you wish to contribute to the discussion, feel free to share your insights. Many people will be watching this, and your input could greatly benefit others.

Comments
  • Clear, thanks

    @antoineroyer3841 2 days ago
    • Great to hear!

      @MannyBernabe 2 days ago
  • thank you for the video George Santos :)

    @adipai 1 month ago
    • 🤣

      @MannyBernabe 1 month ago
  • Very nice. However, an example would've helped augment the answer, like asking it the GDP of Chad in 2023 when using ChatGPT.

    @farexBaby-ur8ns 21 days ago
    • Agree. Thanks for the feedback. 😊

      @MannyBernabe 17 days ago
  • Just wanted to clear up my confusion: would I yield better results by applying RAG to a fine-tuned model (i.e., one fine-tuned on my field of work), or is RAG on a stock LLM good enough?

    @jasondsouza3555 2 months ago
    • Hey Jason, the current best practice is to first try RAG with a stock LLM and see if that works. If not, then consider fine-tuning, because it requires more effort than RAG. Hope that helps.

      @MannyBernabe 2 months ago
  • Does the LLM first default to checking the additional datastore we gave it for data relevant to the user's prompt? If it finds relevant data, does it respond to the user without checking the original data it was trained on? And if it doesn't find anything relevant in the datastore, does it then act as if RAG wasn't even implemented and respond based on its original training data, or am I getting it wrong?

    @DanielBoueiz 25 days ago
    • You got it. It first pings the corpus for relevant data, then retrieves it and inserts it into the prompt. If nothing relevant is found, you just get the standard LLM output. Hope that helps.

      @MannyBernabe 25 days ago
  • So does that mean the data needs to fit in the LLM's context window? Or does the data go through some sort of compression?

    @victormustin2547 2 months ago
    • Correct. The retrieved context still needs to fit into the context window along with the original prompt. As for compression, we can summarize the retrieved context to save space as well. Hope that helps. (See the rough sketch of packing context into a token budget below.)

      @MannyBernabe 2 months ago
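
To illustrate that point about the context window, here is a rough sketch of packing retrieved chunks into a token budget before they are added to the prompt. The ~4-characters-per-token estimate and the default sizes are assumptions made for the example; a real system would count tokens with the model's own tokenizer and might summarize overflow chunks instead of dropping them.

```python
# Rough sketch of fitting retrieved context into the context window.
# The ~4-characters-per-token estimate and the default sizes are assumptions;
# real systems count tokens with the model's own tokenizer.

def approx_tokens(text: str) -> int:
    """Very rough token estimate: about 4 characters per token."""
    return max(1, len(text) // 4)

def pack_context(chunks: list[str], prompt: str,
                 window: int = 8000, reserve_for_answer: int = 1000) -> list[str]:
    """Keep the highest-ranked chunks that fit alongside the prompt."""
    budget = window - approx_tokens(prompt) - reserve_for_answer
    packed, used = [], 0
    for chunk in chunks:  # assumed already sorted by relevance
        cost = approx_tokens(chunk)
        if used + cost > budget:
            break  # alternatively, summarize the remaining chunks instead of dropping them
        packed.append(chunk)
        used += cost
    return packed
```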