CODE-LLAMA For Talking to Code Base and Documentation

May 5, 2024
19,059 views

Learn how to chat with your code base using the power of Large Language Models and LangChain. In this video we use Code Llama to talk to the GitHub repo of LangChain itself.
🚀 Dive into advanced source-code analysis with LLMs! How GitHub Copilot and others transform coding 🧠 A step-by-step guide to the QA pipeline and splitting strategies for source code! 🔍 A minimal sketch of the full pipeline follows the timestamps below. #CodeAnalysis #LLMmagic
CONNECT:
☕ Buy me a Coffee: ko-fi.com/promptengineering
|🔴 Support my work on Patreon: Patreon.com/PromptEngineering
🦾 Discord: / discord
▶️️ Subscribe: www.youtube.com/@engineerprom...
📧 Business Contact: engineerprompt@gmail.com
💼Consulting: calendly.com/engineerprompt/c...
LINKS:
Code link: python.langchain.com/docs/use...
Timestamps:
00:00 Intro
01:22 Setup - LangChain
01:51 Document Loader and Text Splitter
03:03 Retriever and Vector Store
03:57 Code Llama with LlamaCpp
04:40 Where to Download Code Llama From?
05:55 Setting up LlamaCpp
06:55 Running Code Llama
08:00 RAG with Code Llama
09:17 Wrong Way of Using Code Llama
10:40 Correct Template for Code Llama
12:43 GPU Speed to Expect with LlamaCpp
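
The steps above follow LangChain's code-understanding recipe (see the code link). As a rough sketch of the whole pipeline - the repo path, chunk sizes, and GGUF file name are placeholders, not necessarily the video's exact values:

    from langchain.document_loaders.generic import GenericLoader
    from langchain.document_loaders.parsers import LanguageParser
    from langchain.text_splitter import Language, RecursiveCharacterTextSplitter
    from langchain.embeddings.openai import OpenAIEmbeddings
    from langchain.vectorstores import Chroma
    from langchain.llms import LlamaCpp

    # 1. Load the repo's Python sources, parsing each file by syntax
    loader = GenericLoader.from_filesystem(
        "langchain/libs/langchain",  # placeholder: path to your clone
        glob="**/*",
        suffixes=[".py"],
        parser=LanguageParser(language=Language.PYTHON, parser_threshold=500),
    )
    documents = loader.load()

    # 2. Split along Python syntax boundaries rather than raw characters
    splitter = RecursiveCharacterTextSplitter.from_language(
        language=Language.PYTHON, chunk_size=2000, chunk_overlap=200
    )
    texts = splitter.split_documents(documents)

    # 3. Embed and index the chunks, then expose a retriever
    db = Chroma.from_documents(texts, OpenAIEmbeddings(disallowed_special=()))
    retriever = db.as_retriever(search_type="mmr", search_kwargs={"k": 8})

    # 4. Load a quantized Code Llama through llama.cpp
    llm = LlamaCpp(
        model_path="codellama-13b-instruct.Q4_K_M.gguf",  # placeholder file name
        n_ctx=4096,
        n_gpu_layers=1,  # offloads to Metal/CUDA if llama-cpp-python was built with it
        n_batch=512,
    )
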

Comments
  • It's one of the rare channels where, for a long time now, I've been happy to see a new-video notification. I'm already waiting for the next one. I feel so lucky that you post videos so often and cover the latest information perfectly. We don't want to lose you; keep it up, bro.

    @rehberim360 • 8 months ago
  • This is really great!

    @CognitiveComputations • 8 months ago
    • Thank you 🙏

      @engineerprompt • 8 months ago
  • Amazing mate!

    @TheCopernicus1 • 8 months ago
    • Thank you! Cheers!

      @engineerprompt • 8 months ago
  • Nice video! I had a question: what would be the best and fastest small model to use on an old CPU, like a 6th-generation i5, for example? I've heard of orca-mini-3b and falcon-1b; what do you think?

    @finnsteur5639 • 8 months ago
  • The first second I hear an Indian accent = I'm about to get purely useful information from this video, that's for sure. Thanks for the video. One thing: when you mention the "previous video" at the beginning, I'd consider providing the card link that shows in the corner, because I'm not familiar with your channel and had to dig through your feed to find it. Not a big issue, but still, just a tip to consider.

    @andrzejpec4886 • 7 months ago
  • Thank you for your videos. I would appreciate it if you published the notebook.

    @CesarVegaL • 8 months ago
  • Thanks for the amazing tutorial. I'm not able to get BLAS=1; I created a ticket for it. Looking forward to hearing from you.

    @andy111007 • 7 months ago
  • Thank you!

    @kevinyuan2735 • 7 months ago
    • Thank you 🙏

      @engineerprompt • 7 months ago
  • What's the difference between the two prompts you defined: the system-prompt way and the simple one?

    @bakistas20 • 8 months ago
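
For readers wondering the same thing: the "simple" prompt is a bare completion string, while the template the video calls correct wraps the question in the [INST]/<<SYS>> chat format that Code Llama Instruct was fine-tuned on. A sketch, with the question text as an arbitrary example:

    # Bare completion prompt - the "wrong way" shown around 09:17
    simple_prompt = "How does the RecursiveCharacterTextSplitter work?"

    # Code Llama Instruct chat template - the "correct" format around 10:40
    system_prompt = "You are a helpful assistant who answers questions about the LangChain codebase."
    question = "How does the RecursiveCharacterTextSplitter work?"
    instruct_prompt = f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{question} [/INST]"
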
  • Thanks brother

    @mohammedsuhail1500 • a month ago
  • Can you share the notebook link?

    @vswraith • 8 months ago
  • Hi Prompt Engineer, amazing stuff as always. I just have a question about embedding into Chroma DB in general: where can I find the different embedding models, and how can I speed up the embedding process? For, let's say, 100 documents I think the app would crash. It doesn't seem scalable. Or am I getting something wrong here?

    @japorto100 • 7 months ago
    • You can use something like this:

          from langchain.embeddings import SentenceTransformerEmbeddings

          embeddings = SentenceTransformerEmbeddings(
              model_name="sentence-transformers/all-MiniLM-L6-v2",
              model_kwargs={"device": "cuda"},
              encode_kwargs={"normalize_embeddings": False},
          )
          Chroma.from_documents(chunks, embeddings)

      @kamadelva1235 • a month ago
  • Are the models from Hugging Face the same as those released by Meta?

    @THEVIERAOS • 8 months ago
  • Can we run a Llama 2 model on a Windows machine with a GPU, over our own documents? Please point me to the video that covers this. I mostly get an error while downloading the model from Meta via the Hugging Face API.

    @malleswararaomaguluri6344 • 7 months ago
  • Please do a video on setting up a GPU on a Windows machine. I've been trying for days to use my GPU and can't get it to work. I've installed CMake, the CUDA developer tools, etc. Nothing works. I have two GPUs on my laptop.

    @sunilanthony17 • 7 months ago
  • When I try the template you describe in the video, I get a strange answer. Printing the retrieved documents with:

        question = "{Current Question}"
        docs = retriever.get_relevant_documents(question)
        print(docs)

    returns [Document(page_content='ans = qa.run(\'Based on the stg_jaffle_shop.yml file generate sample csv data for each table with minimum 100 rows. The sample data should be with foreign key constraint.\') # \'Write a test case for the database connection using unittest.\' # \'Write a test case for the code in connection.py using unittest.\' # \'Based on the .yml files generate sample csv data for jaffle_shop_customers table with 100 rows\'. I can see those lines are in comments, but the answer from the model is: "This request is not clear to me. Please provide more details so we can help you better. Do you want us to generate sample CSV files, or do you want to create a test case using unittest?" It looks like the model doesn't recognize the right question.

    @MrBorkori • 7 months ago
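
For context, the `qa` object in the snippet above is typically assembled along these lines in the LangChain code-understanding recipe the video follows; a sketch, assuming `llm` and `retriever` from the earlier steps:

    from langchain.chains import ConversationalRetrievalChain
    from langchain.memory import ConversationSummaryMemory

    # Chat memory plus retrieval over the embedded codebase
    memory = ConversationSummaryMemory(llm=llm, memory_key="chat_history", return_messages=True)
    qa = ConversationalRetrievalChain.from_llm(llm, retriever=retriever, memory=memory)
    result = qa("Write a test case for the code in connection.py using unittest.")
    print(result["answer"])
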
  • Thank you for the video. I get 0 documents loaded when integrating the code into my Python script. Does this work on Windows?

    @danaelifaz4835 • 15 days ago
  • How do I ask questions that hit the same file? For example, if I want to know how many different constructors a certain class has. I tried everything, and it always came back with a number lower than the actual one.

    @RobertoFabrizi • 4 months ago
  • Can I use a GGUF model? My GPU doesn't have enough VRAM.

    @alefdoreu • 7 months ago
  • Thanks! If we don't want to use OpenAI embeddings, which open-source one do you recommend for Python code in this case? Will FAISS be fine?

    @haroonmansi • 7 months ago
    • For embeddings, I like to use the Instructor embeddings. FAISS will be fine as well.

      @engineerprompt • 7 months ago
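
For anyone hunting for the "Instructor embeddings" mentioned here, a minimal sketch using the LangChain wrappers (the model choice follows the comment; it is not prescriptive):

    from langchain.embeddings import HuggingFaceInstructEmbeddings
    from langchain.vectorstores import FAISS

    # Runs locally; no OpenAI key required
    embeddings = HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-large")
    db = FAISS.from_documents(texts, embeddings)
    retriever = db.as_retriever()
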
    • Thanks! @@engineerprompt

      @haroonmansi • 6 months ago
  • How can we restrict the answers to the given context? No matter which codebase I create embeddings from, the LLM gives responses based on its pretrained data.

    @adnanrizve5551 • 7 months ago
    • You will need to provide a system prompt. Check out my latest video on localGPT; there is an example prompt.

      @engineerprompt • 7 months ago
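
A sketch of what such a context-only system prompt can look like when wired into a RetrievalQA chain - the prompt wording is an example, not the one from the localGPT video:

    from langchain.chains import RetrievalQA
    from langchain.prompts import PromptTemplate

    template = """[INST] <<SYS>>
    Answer using only the context below. If the answer is not in the context,
    say you don't know; do not answer from your training data.
    <</SYS>>
    Context: {context}
    Question: {question} [/INST]"""

    qa = RetrievalQA.from_chain_type(
        llm,
        chain_type="stuff",
        retriever=retriever,
        chain_type_kwargs={"prompt": PromptTemplate.from_template(template)},
    )
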
  • How do I generate code documentation with Code Llama? The input will be a class or method in Java.

    @mehulparmar9976 • a month ago
  • An OpenAI API key is required here as well. Is there any way to access Hugging Face models without an API key, or with some free key?

    @Shahidma58 • 8 months ago
    • Yes, I should have highlighted this more: the OpenAI API key is required for the embedding model. You can replace it with any other open-source embedding model and this should work.

      @engineerprompt • 8 months ago
    • Hi, thanks for your response. I have just started learning LangChain, and for the time being I'm not able to purchase a ChatGPT API key, but I want to develop some apps using Hugging Face models, like an "Ask the Doc" app that reads a PDF and answers questions. I will buy ChatGPT access if my app passes testing. Best regards.

      @Shahidma58 • 8 months ago
  • Hi, thanks for the tutorial, but I'm facing a RateLimitError while executing this particular line of code: db = Chroma.from_documents(texts, OpenAIEmbeddings(disallowed_special=())), for both the Code Llama and GPT-4 LLMs. How can I solve this error?

    @user-vg6qi7pd6o • 4 months ago
    • Use different embeddings, for example:

          embeddings = SentenceTransformerEmbeddings(
              model_name="sentence-transformers/all-MiniLM-L6-v2",
              model_kwargs={"device": "cuda"},
              encode_kwargs={"normalize_embeddings": False},
          )

      @kamadelva1235 • a month ago
  • Can you please give an example that loads the 34B instead of the 13B? For some reason I can't get it to work with the 34B.

    @CognitiveComputations • 7 months ago
    • I will have a look at it today. Are you trying the GGUF format?

      @engineerprompt • 7 months ago
  • I did not see the notebook attachment; will you provide it? And can this be hooked up to a UI?

    @yusufkemaldemir9393 • 7 months ago
    • The link to the documentation, where the code is, is in the description. It should work with a UI as well.

      @engineerprompt • 7 months ago
  • Is it possible to use your localGPT project with Code Llama (and ingest code into ChromaDB)?

    @korchi • 8 months ago
    • Yes, but you will need to change the document-loader part of the code and also make changes to the splitter part; the rest will work just fine.

      @engineerprompt • 8 months ago
  • Could you do a video about Cursor? So far it seems very useful to me, and better than Copilot.

    @janalgos • 8 months ago
    • What is the difference between this and Copilot?

      @parasetamol6261 • 8 months ago
    • @@parasetamol6261 Easier to use and more integrated compared to Tabnine/Copilot.

      @janalgos • 8 months ago
  • Do we have to use LlamaCpp for RAG?

    @Finwe2 • 8 months ago
    • Not really, it's just for using the GGML/GGUF models.

      @engineerprompt • 8 months ago
    • @@engineerprompt Hm, yeah, I couldn't get RAG to work properly with my org's source code. I even tried it with LlamaCpp and the llm = pipeline() -> llm("my prompt") method.

      @Finwe2 • 8 months ago
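
For reference, LlamaCpp is only needed for GGML/GGUF checkpoints. With enough VRAM, the original Hugging Face weights also work through the transformers wrapper; a sketch:

    from langchain.llms import HuggingFacePipeline

    # Full (non-quantized) Code Llama checkpoint from the Hugging Face hub
    llm = HuggingFacePipeline.from_model_id(
        model_id="codellama/CodeLlama-7b-Instruct-hf",
        task="text-generation",
        pipeline_kwargs={"max_new_tokens": 512},
    )
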
  • Can I use the codellama-7b-hf model with LangChain?

    @7stuti • a month ago
  • Is it possible to run this on Google Colab?

    @echofloripa • 8 months ago
    • Yes, you probably want to use the 7B model

      @engineerprompt • 8 months ago
  • All the examples are on Python code bases. I'm trying to make it work for a C# code base, but it doesn't work.

    @shalabhgarg8225 • a month ago
  • Can you list your system specs, rather than just stating that you used a GPU?

    @CrispinCourtenay • 8 months ago
    • I have an M2 Max with 96GB.

      @engineerprompt • 8 months ago
  • Nice vid, but you can't just use any generic embedding model; you need something that works well for code, and that's tough. OpenAI's is probably still one of the best. Also, anyone who uses GPT-4 for code and then tries 13B CodeLlama, or even 34B, is going to be sad: it's as bad as GPT-3.5, mostly worse, and kind of useless for any higher-level reasoning. And if you're using it for Python, why not use the Python-specific CodeLlama model that was optimized exactly for Python use?

    @3lbios • 8 months ago
  • Please add the Colab file.

    @VOKorporation • 7 months ago
  • Can you share the Colab code?

    @jerryyuan3958 • 8 months ago
    • Yes, I will put it together in a Colab and share it.

      @engineerprompt • 7 months ago
    • @@engineerprompt Where can I find your GitHub repo, sir?

      @FranciscoMonteiro25 • 7 months ago
  • Why are you saying "GPU" in relation to the M2? It's an ARM CPU, and the llama.cpp project just uses very bespoke optimizations for that CPU, not the GPU.

    @alx8439 • 8 months ago
    • The M2 has an integrated GPU.

      @erikjohnson9112 • 8 months ago
    • @@erikjohnson9112 And the whole thing has a power supply and an LCD screen. But what does that have to do with inference?

      @alx8439 • 8 months ago
  • Again a very useful video, thank you very much! I'm new to this area, and one thing isn't clear: if I use OpenAIEmbeddings as the embedding function/model, should I worry about privacy in cases involving sensitive data? I have already tried an open-source embedding model, HuggingFaceInstructEmbeddings with model_name = "hkunlp/instructor-large", but when I try to load it into Chroma it gives me the following error: chromadb.errors.InvalidDimensionException: Embedding dimension 384 does not match collection dimensionality 1536.

    @MrBorkori • 7 months ago
    • Yes, you are sharing your data with OpenAI. Regarding that error: simply delete your vector store and then rerun the embedding computation; this should work. Basically, you have an existing vector store whose embeddings have a different dimensionality, so you want to recompute everything from scratch.

      @engineerprompt • 7 months ago
    • It works - thank you a lot again! You saved me from a headache! @@engineerprompt

      @MrBorkori • 7 months ago
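
The fix described above can be scripted; a sketch, assuming the store was persisted to a local directory (the path is an assumption):

    import shutil
    from langchain.vectorstores import Chroma

    # Drop the collection built with the old 1536-dim OpenAI embeddings,
    # then re-embed everything from scratch with the new model
    shutil.rmtree("db", ignore_errors=True)
    db = Chroma.from_documents(texts, embeddings, persist_directory="db")
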
  • My comments keep getting deleted; not sure what is happening.

    @cudaking777 • 8 months ago
  • hello

    @RickySupriyadi • 8 months ago
  • I actually wanted to talk to my code to ask why it's so bad.

    @pabloartero1155 • 3 months ago
  • docs = retriever.get_relevant_documents(question) - I'm not sure what this "retriever" is; I'm getting NameError: name 'retriever' is not defined.

    @csowm5je • 8 months ago
    • I had to include the code from the previous video to make it work. Thanks.

      @csowm5je • 8 months ago
  • This is a very insightful tutorial and I have applied it, but I have a doubt: how did you create the retriever object for Llama in retriever.get_relevant_documents(question)?

    @DurgeshParekh • 8 months ago
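
For both retriever questions above: `retriever` comes from the vector store built in the setup portion (and the previous video), as in the pipeline sketch near the top. A short sketch, assuming `texts` and `embeddings` already exist:

    db = Chroma.from_documents(texts, embeddings)
    retriever = db.as_retriever(search_type="mmr", search_kwargs={"k": 8})
    docs = retriever.get_relevant_documents("How many constructors does class X have?")
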