In this video, we will look at all the exciting updates to the LocalGPT project, which lets you chat with your documents. The new updates include support for GGUF-format models via llama.cpp, a better prompt template that uses the Llama-2 chat format to restrict answers to the given context, and a lot more!
If you like the repo, don't forget to give it a ⭐
💻 Pre-configured localGPT VM: bit.ly/localGPT (use Code: PromptEngineering for 50% off).
▬▬▬▬▬▬▬▬▬▬▬▬▬▬ CONNECT ▬▬▬▬▬▬▬▬▬▬▬
☕ Buy me a Coffee: ko-fi.com/promptengineering
🔴 Support my work on Patreon: Patreon.com/PromptEngineering
🦾 Discord: / discord
▶️️ Subscribe: www.youtube.com/@engineerprom...
📧 Business Contact: engineerprompt@gmail.com
💼Consulting: calendly.com/engineerprompt/c...
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
LINKS:
Github Link: github.com/PromtEngineer/loca...
LocalGPT- Detailed walkthrough: • LocalGPT: OFFLINE CHAT...
LocalGPT with Llama2: • Llama-2 with LocalGPT:...
LocalGPT with Memory: • LocalGPT API: Build Po...
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Timestamps:
[00:00] Intro
[00:54] LlamaCpp with GPU
[04:54] GPU VRAM Required for LLMs
[06:33] Which LLMs are supported
[07:00] Adding Documents to Vector Store
[08:08] Chatting with Documents
[11:25] Limit Answers to the given context
[14:18] Where are the models downloaded?
[15:23] Define Context Window
[16:00] Change N_GPU_layers
[16:22] Change the Embedding model
[16:50] Change the LLM
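The last few chapters cover editing settings in localGPT's `constants.py`. As a rough sketch of those knobs (the model names below are examples, not necessarily the repo's current defaults):

```python
# Sketch of the settings discussed in the chapters above, as they appear in
# localGPT's constants.py (values here are illustrative examples).
CONTEXT_WINDOW_SIZE = 4096  # how many tokens the model can attend to
N_GPU_LAYERS = 100          # how many layers llama.cpp offloads to the GPU

EMBEDDING_MODEL_NAME = "hkunlp/instructor-large"

# Swap the LLM by pointing at a different Hugging Face repo and file.
MODEL_ID = "TheBloke/Llama-2-7b-Chat-GGUF"
MODEL_BASENAME = "llama-2-7b-chat.Q4_K_M.gguf"
```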
All Interesting Videos:
Everything LangChain: • LangChain
Everything LLM: • Large Language Models
Everything Midjourney: • MidJourney Tutorials
AI Image Generation: • AI Image Generation Tu...
Want to connect?
💼 Consulting: calendly.com/engineerprompt/consulting-call
🦾 Discord: discord.com/invite/t4eYQRUcXB
☕ Buy me a Coffee: ko-fi.com/promptengineering
🔴 Join Patreon: Patreon.com/PromptEngineering
▶ Subscribe: www.youtube.com/@engineerprompt?sub_confirmation=1
It'd be great if you could create a step-by-step series for all of this aimed at complete novices such as myself, starting from the very beginning and assuming no prior knowledge or other supporting software already installed (e.g. git or conda).
I support this request. Thank you
There are many tutorials on how to get git, conda, and Python set up.
Excellent work. Thank you for the walkthrough, especially on the Mac/Linux side. The modular approach will make it much easier to maintain moving forward.
Glad you found it helpful. I agree, that’s the plan
Great update. I appreciate the video format for the updates like this. Thank you.
Thank you 🙏
thank you for sharing, have a great day 🙂
Thank you very much for this project, it's such a pleasure to use it !
Glad you like it!
Congrats on the great content, it is flawless! 🎉 Are we able to connect the project with Slack and build a chatbot?
Great new changes. I have customized a lot of the core scripts myself, but maybe you could add an example that just shows how to access the persisted database standalone? I was trying to run some data visualization on my Chroma DB, but I'm running into issues understanding how to access it on its own.
Hello, thank you for this video and the brilliant work. How is it possible to force the model to respond in a language other than English, in my case French? The model I use, Mistral 7B, already knows how to respond in French, and yet most answers are in English.
Please share: what is the device_type for an Intel Iris GPU?
Hi, I like this project and want to try it on my notebook. However, it has only 6 GB of VRAM. Should I use the OpenAI API key instead? Being a layman in machine learning and Python, I would like to know the VRAM requirements for embeddings. If there is no need for VRAM, I may try it on an old notebook without a dedicated GPU. Thanks a lot. :)
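On the VRAM question (also covered at [04:54] in the video), a rough weight-only estimate is parameters × bits ÷ 8; real usage adds KV cache and runtime overhead on top. A back-of-envelope sketch:

```python
def approx_vram_gb(params_billion: float, bits: int) -> float:
    """Weight-only memory estimate in GB; actual usage adds KV cache and overhead."""
    # params * (bits / 8) bytes; billions of parameters map directly to GB.
    return params_billion * bits / 8

# A 4-bit quantized 7B model needs roughly 3.5 GB for weights alone,
# so a 6 GB card can be enough for small quantized models.
print(approx_vram_gb(7, 4))
```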
Great! Thanks for sharing! Does LocalGPT support code-tuned LLMs such as Codellama?
No reason it shouldn't, as long as you're using the right format (GGUF, GGML, GPTQ); no clue about others like gpt4.
Yes
This is a very good project. How do we do fine-tuning using quantization?
Great, great, great!
Thanks for sharing this. When is support for Falcon expected?
You should be able to run GGUF and GGML models, including Falcon, even now.
For installing llama.cpp on Windows, this worked for me:
setx CMAKE_ARGS "-DLLAMA_CUBLAS=on"
setx FORCE_CMAKE 1
pip install llama-cpp-python==0.1.83 --no-cache-dir
Also, if your computer defaults to using the CPU, use --device_type cuda on Windows. Even with all that, it kicks me out with BLAS=0.
Same here... I haven't investigated why yet.
@@kevinfutero7166 Any idea why it's not using the GPU?
How can I clear the cache from the last time I ran the model? I swapped all the docs with a new set of documents, and my localGPT model keeps giving me answers from the last set of docs, which are no longer relevant for this version.
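localGPT persists the embeddings in a Chroma index on disk (a `DB/` folder by default, via `PERSIST_DIRECTORY` in `constants.py`; the path below is an assumption, check your repo). One way to drop the stale index before re-ingesting, as a minimal sketch:

```python
import shutil
from pathlib import Path

def reset_vector_store(persist_dir: str = "DB") -> None:
    """Delete the old Chroma index so stale documents stop showing up.

    After calling this, re-run `python ingest.py` to index the new documents.
    """
    path = Path(persist_dir)
    if path.exists():
        # Remove the whole persisted index directory.
        shutil.rmtree(path)
```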
What are the factors to consider when choosing a cloud server for this project? Say I am using a Llama-2-7B-Chat-GGUF model, which instance is best? How much GPU memory is required?
Do we need a ton of video RAM for localGPT? Doesn't look like a Lenovo P51 can cope...
Can I use this with PDFs and docs containing complex tables and images?
This is exactly what I have been looking for for quite some time. I am just wondering if I can use it for generating code in a specific structure: ingest documents as .py or .java files, use one of the code-generation models, and have it generate code in that structure as well as spot a snippet of code that implements a particular functionality?
This is more of a search feature. Basically, it will be looking for specific information in the documents. You could use it to retrieve a certain function or code snippet, but then you would need a subsequent LLM call to use it.
Can this be used with dual K80 GPUs?
Thank you!
Thank you 🙏
What if I use Ubuntu with a CPU? Is there any change? I am struggling with llama-cpp.
Would love the code walk through!
Can it produce structured articles from prompts?
Thanks, this video was extremely helpful! Could you make a video on how to use a language other than English with localGPT? For example, if I feed it documents in Swedish, can I ask questions in Swedish and also get answers in Swedish? Is that possible?
Should be the same, just use the multilingual embedding models.
Yes, as mentioned above, you need an embedding model that supports the language you are working with, as well as an LLM that does.
@@engineerprompt Do you know which LLM from TheBloke (or on Hugging Face) provides answers in Spanish? My docs are in Spanish. I tried some models called Falcon, RoBERTa, BERT, but they are not compatible with localGPT. Thanks in advance. Amazing project!
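For non-English documents, one common pairing is a multilingual embedding model (e.g. `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2`, which covers Spanish, French, and Swedish) plus an explicit language instruction in the prompt. The helper below is a made-up illustration of the prompt-side nudge, not part of localGPT:

```python
# Hypothetical helper: prepend an explicit language instruction so the LLM
# answers in the document's language instead of defaulting to English.
def localize_query(question: str, language: str = "Spanish") -> str:
    return f"Answer only in {language}.\n\nQuestion: {question}"

# Example: ask about a Spanish document and request a Spanish answer.
prompt = localize_query("¿De qué trata el documento?")
print(prompt)
```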
Hi, great video! You said that if the "BLAS" variable is set to 1, llama.cpp uses my GPU, and if it is set to 0, it does not. I have an M1 Mac and I want to run it on the CPU, which I specify with --device_type cpu. However, BLAS is still set to 1. Can someone explain? The Llama-2 7B Chat model also takes forever to answer, and if I load other models there is no answer at all.
BLAS=1 means that llama.cpp is able to see your GPU. If you explicitly set device_type to cpu, then the code will use the CPU. That might explain why it's running so slow. How much RAM do you have on your system, and what quantization level are you using?
How well does it work on Excel or CSV files? Overall great info, and thanks for sharing an update.
This setup will work with CSV and Excel files, but you will need to experiment with embeddings and models for better performance.
With regard to the splitting operation, I want to split on paragraphs, not random chunks. Can localGPT accommodate that out of the box, or would I need to hack the source code? Could I achieve my result with a decorator? Thx
It’s using the recursive character text splitter which uses paragraphs for splitting. So I think it will work for your use case
Amazing work! It would be really awesome if you could give it a GUI and a 1-click installer, like other tools already have, such as GPT4All or Subtitle Edit, for Mac, Windows, and Linux. This would extend the user base dramatically and get you more coffees ;)
Thanks for the idea! Will see what I can put together.
@@engineerprompt Seconding this request. I've spent the last 18 hours installing (and uninstalling and reinstalling and then uninstalling and reinstalling again) literally hundreds of gigabytes of CUDA and Visual Studio nonsense trying to get this thing to work.
@@engineerprompt I agree, it needs a GUI. No one prefers a command prompt over a GUI.
@@BabylonBaller There are two UI options now. One is via the API, and the other is a dedicated UI via Streamlit. I am working on a Gradio one that will make things much easier.
@@engineerprompt Much appreciated, will look into Streamlit.
Excellent updates. Now can you write it in Node.js?
I don't have experience with Node.js, but hopefully someone can implement it.
Hey, I have an M1 with 8 GB of RAM. Can I run a 4-bit quantized model?
If the PDF has tables in it, they are not extracted in the same format as the original. What is the best way to ingest a PDF with tables that have a lot of missing values?
Look at the unstructured loader for working with PDFs.
@@engineerprompt Tried it. Row/column alignments are mismatched after loading. Because of that, the LLM gives incorrect responses when asked n×m table-lookup questions.
Can localGPT be installed on Windows 11? Very exciting video.
How is localGPT different from the Quivr project?
I'm still getting answers from the LLM's training corpus when I ask about things outside the source documents.
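The video's "Limit Answers to the given context" section addresses exactly this via a stricter Llama-2-style prompt template. A rough sketch of the idea (the exact strings in the repo differ; these are illustrative):

```python
# A system prompt that restricts the model to the retrieved context,
# wrapped in the Llama-2 chat template ([INST] ... <<SYS>> ... [/INST]).
SYSTEM_PROMPT = (
    "Use ONLY the following context to answer the question. "
    "If the answer is not in the context, say you don't know; "
    "do not use your own knowledge."
)

def build_prompt(context: str, question: str) -> str:
    return (
        f"[INST] <<SYS>>\n{SYSTEM_PROMPT}\n<</SYS>>\n\n"
        f"Context: {context}\n\nQuestion: {question} [/INST]"
    )
```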
I'd like to have a cup of coffee with you. Great.
Can you release the code you are showing? Every example I am finding uses OpenAI. I made several changes locally to localGPT, but I would like to see this spin on it.
Code is in the localgpt repo
Oh ok, my bad, sorry. Busy day writing code; when I saw this video notification I thought it was new code. Sorry @@engineerprompt
Is anyone else struggling to run llama.cpp on Windows using cuBLAS? No matter what, BLAS is always 0 😐
If you have an NVIDIA GPU, my recommendation is to use GPTQ models.
@@engineerprompt GPTQ is giving nice inference speed on my NVIDIA 3070 Ti, but I am struggling to use ConversationBufferMemory with it. What options do we have for memory with GPTQ models?
I'm using the CPU and getting an error like "'int' object is not callable".
setx in the Visual Studio prompt gives a syntax error!
Just use "set VARIABLE_NAME=value" instead.
Dude, you gotta start putting relevant links in your videos.
Are you Indian, bro?
I think YES!
How can I use an Arabic LLM with this? How would I set that up, and what steps would I need to take? This is awesome!!