Vector Databases simply explained! (Embeddings & Indexes)

28 Apr 2024
261,041 views

Vector Databases simply explained. Learn what vector databases and vector embeddings are and how they work. Then I'll go over some use cases and briefly show you the different options you can use.
Resources:
- Gentle introduction: frankzliu.com/blog/a-gentle-i...
- What is a vector database: www.pinecone.io/learn/vector-...
Get your Free Token for AssemblyAI👇
www.assemblyai.com/?...
00:00 - Intro
00:44 - Why do we need vector databases
01:29 - Vector embeddings and indexes
02:58 - Use cases
03:45 - Different vector databases
Vector Database Options:
- Pinecone
- Weaviate
- Chroma
- Redis
- Qdrant
- Milvus
- Vespa
▬▬▬▬▬▬▬▬▬▬▬▬ CONNECT ▬▬▬▬▬▬▬▬▬▬▬▬
🖥️ Website: www.assemblyai.com
🐦 Twitter: / assemblyai
🦾 Discord: / discord
▶️ Subscribe: kzhead.info?...
🔥 We're hiring! Check our open roles: www.assemblyai.com/careers
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
#MachineLearning #DeepLearning

Comments
  • I just watched an IBM explanation of vector databases and came away lost. Then I watched yours, and got it right away. Point goes to you. ;)

    @ChrisBrogan 6 days ago
  • Yes, a video describing available VDBs in terms of, e.g. Open/Closed, simplicity of operation, and user interaction patterns (quality/expressiveness of API) would be great!

    @bobsavage3317 11 months ago
    • Seconded

      @ProSaladToss 11 months ago
    • kzhead.info/sun/jNNmcd6Op4mohZ8/bejne.html You may find it helpful to start with the time frame of the video above!!

      @assethotorch2395 11 months ago
    • Also important: how to extend the VDB with custom distance functions.

      @RomeoKienzler 9 months ago
  • Yes, looking forward to a more in-depth video.

    @leftright1606 9 months ago
  • The concise, high-level explainer that I needed. Thanks.

    @nickstaresinic9933 11 months ago
  • Yes please, more on this topic, I would appreciate it.

    @jolly2002me 11 months ago
    • 👆

      @HeyImAK 11 months ago
  • Definitely! I'd love to see comparable benchmarks for common LLM and other tasks (e.g. transfer learning, use cases in the context of fine-tuning, etc.)

    @matt_88 11 months ago
  • Let's see the more in-depth comparison! Also, I would love to know your take on where this will go. Are they able to automatically generate vectors for your multimodal data already? Are there known companies using vector databases currently? Are there lightweight alternatives to the services you mentioned (i.e. a NumPy version of a vector database)?

    @DanielTorres-gd2uf 11 months ago
  • Very useful. Now I can imagine what a vector database is. Thanks

    @brunomesquitazamberlan8876 21 days ago
  • Love your work, Patrick. I would definitely like to see more on vector databases, especially when you would use one over an array or other options, and the pros and cons of some of the types you mentioned (e.g. Pinecone, Milvus, etc.)

    @vincerocchi9083 11 months ago
  • Straightforward and simple. Thanks! 😊

    @jonmichaelgalindo 2 months ago
  • Thank you so much, Patrick. Would love to watch a video detailing and comparing all VDBs.

    @karthickmj6312 7 months ago
  • Brief and to the point. Great video.

    @SoharabHossain 12 days ago
  • Thanks for putting this together! :)

    @TeamUpWithAI 8 months ago
  • Great video, thanks! Short and exactly on point -- much appreciated. Yeah, it'd be cool to see more in-depth comparison of the dbs.

    @alexfowler1683 9 months ago
  • Nice summary on Vector databases. A comparison of Graph and Vector databases with specific use cases would also help. Thank you

    @maneeshs3876 10 months ago
  • This was a very clear explanation. Thank you!

    @ThomasLapperre 5 months ago
  • yup!! looking forward to a detailed analysis and comparison

    @saimanikanta7360 8 months ago
  • Thank you for this video - just what I needed! If you haven't done one already, please do an explainer comparing. 🙏

    @ksnydertube 9 months ago
  • Thank you, nice and short overview to get an idea of what a vector db is.

    @user-hf3rq7qe9v 4 months ago
  • I would love to see a comparison of the different Vector Databases!

    @asifmian43 8 months ago
  • It would be great if you explained how to use vector databases to give LLMs long-term memory! 🙏

    @otto.bjorkland 11 months ago
  • Excellent overview. Many thanks!

    @4XLibelle 1 month ago
  • Thanks, this describes very simply what a vector database is and what it is used for. 🥀

    @muhammadmursalin8915 9 days ago
  • Great intro to VD! Would love to see a more in-depth video on some real-world use cases :)

    @anirudhgangadhar6158 9 months ago
  • Thanks, this is what I needed to understand the overall idea of vector db.

    @rezaru2000 2 months ago
  • Thanks for such a detailed and easily understandable explanation.

    @sanjeevKumar-eg6hp 4 months ago
  • Good informative video. Thanks!

    @khari_baat 8 months ago
  • Would definitely be interested in more details, especially on self hosted VDBs

    @VastCNC 11 months ago
  • Great video thank you!

    @hughesadam87 8 months ago
  • Simply explained. Thanks!

    @realjackofall 11 months ago
  • Perfectly clear. Thanks!

    @olivierrochon3858 5 months ago
  • Thanks for a nice video! Would be great to learn more on how one could use Redis and PostgreSQL as vector databases. Additionally, more examples and use cases for vector databases would be cool.

    @slawikus1982 7 months ago
  • to the point and concise explanation !!

    @nsitkarana 11 months ago
  • Thank you... It's a great explanation of vector databases. Please make in-depth videos on the Pinecone and Redis vector databases.

    @08ae6013 11 months ago
  • This is a really good explanation and visualization

    @kevinli3767 2 months ago
  • That was very helpful! Thank you!

    @camilaferraz8153 11 months ago
  • Thank you Patrick.

    @caiyu538 11 months ago
  • Incredible video

    @ednavas8093 7 months ago
  • Awesome explanation! Thank you

    @divakaratanjore1059 9 months ago
  • Definitely need a comparison video and small example code for the top 3 vector DBs used! By the way, fantastic walkthrough of the concept!

    @bjugdbjk 9 months ago
  • Thanks for the video. You are awesome and make it very easy to understand what they are. I think Pinecone is quite popular, so a video about it would be great. Cheers

    @jaylee7864 11 months ago
  • A breakdown of differences between vector databases would be nice. But also a comparison to graph databases like neo4j and TitanDB et al would help this n00b

    @nickmhc 8 months ago
  • I would love to see a comparison of the different VDBs and perhaps your thoughts on which one or two are the best. Thanks for a great video.

    @SubirSengupta1 11 months ago
  • It could be interesting to see a case of adding a vector database to an existing SQL database: whether it can replace it, or whether a parallel approach might be interesting, using them side by side, each taking advantage of its strengths, etc.

    @davidlepold 11 months ago
  • comparison video for the mentioned VDBs at the end would indeed be awesome!

    @mbrochh82 11 months ago
  • Could you provide an overview on the comparison of different Vector Database providers and how to decide which is better?

    @user-zn1kl4hq4j 7 months ago
  • Yes please, a VDB comparison would be great, and please include FAISS and other self-hosted options.

    @Anonymous-lw1zy 11 months ago
  • Thank you, more please :)

    @ser1ification 11 months ago
  • Super helpful!

    @bindass1000 3 months ago
  • This was a really good video! Thanks so much :)

    @p.j.816 4 months ago
  • Thanks for the video 👍

    @ryansteiger6960 10 months ago
  • Would love to see a detailed comparison of the vector databases.

    @WmSadler 11 months ago
  • Yes please a video about that. Liked and subscribed

    @Maisonier 10 months ago
  • You would need to upload your own embeddings to these DBs though? Or do they calculate them for you in a multimodal way? Pinecone seems like the former? If so, why not just host locally in your Postgres?

    @angeloinvestor 9 months ago
  • Great video… please go on with the next one

    @dabravo100 11 months ago
  • thank you

    @ducbuivan9378 6 days ago
  • Very helpful animations :) How did you make them with Excalidraw, if I may ask?

    @PaperTigerLive 11 months ago
  • An in-depth comparison would be great!

    @user-gh4id3gg4q 3 months ago
  • Good topic 🎉

    @RanitDA 11 months ago
  • Good explanation. Thumbs up 👍

    @beemerrox 11 months ago
  • Great one

    @hammanadamafarukabubakar8365 11 months ago
  • I love the video. One critique would be to set up further away from the background to possibly reduce the reverb you're getting

    @ayoubthegreat 10 months ago
  • nice video - thanks!

    @katsunoi 5 months ago
  • Thanks! Do you have a video comparing the differences in quality between them?

    @SonGoku-pc7jl 6 months ago
  • Great video, thank you! A big YES for a dedicated vector DB video. By the way, I am happy I have found this channel, let's subscribe!

    @soubinan 11 months ago
  • Go in depth, please. We would like to see a video about all these technologies.

    @moeal5110 7 months ago
  • Supabase also joined the vector DB club a while ago.

    @MartinQLynx 8 months ago
  • helpful >>

    @adityadubey7509 7 months ago
  • It would be great to see a comparison of the vector database companies

    @user-ng3to6lh7z 11 months ago
  • Please continue..)

    @fnmby 29 days ago
  • Why isn't KX mentioned in this overview? They have a very strong vector database and support time-series data as well. Formula 1, manufacturing, utilities, and all the banks use them.

    @hughster657 11 months ago
  • This is a great explanation. But the indexing part is what I was looking for. Nearest neighbor search is already a hard problem in Computer Graphics and gaming (to detect collisions. E.g. if you ever play Madden and do a slow-mo replay, you'll see that the receiver never actually touches the ball. or E.g. cloth simulations for a cape often "clip" into the 3d model of the person wearing the cape).

    @nikilragav 2 months ago
  • Cool, please explain more details about each vector db thanks

    @VaibhavPatil-rx7pc 10 months ago
  • Can you make a video around pinecone?

    @vladsinjavin9189 8 months ago
  • Great content. I noticed that Elastic is missing from the list of vector databases. Could you please include it in the list?

    @UDAY-pv5il 6 months ago
  • 🎯 Key Takeaways for quick navigation:
    00:41 📊 Vector databases store vector embeddings for fast retrieval and similarity search.
    01:07 📝 Unstructured data like images, text, and audio can be challenging to store in relational databases, making vector databases valuable.
    02:02 🔍 Vector embeddings allow for finding similar items by calculating distances and performing nearest neighbor searches.
    03:10 🗂️ Vector databases have various use cases, including equipping language models with long-term memory, semantic search, similarity search, and recommendation engines.
    03:50 💽 Examples of vector database options include Pinecone, Chroma, Redis, Qdrant, Milvus, and Vespa AI, each with its strengths and capabilities.

    @decodingdatascience 6 months ago
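
The nearest-neighbor search mentioned in the takeaways above can be sketched in a few lines of Python. This is only a minimal brute-force sketch, not how any of the listed databases is implemented. The toy 3-dimensional vectors and the names `cosine_similarity`, `nearest_neighbors`, and `store` are made up for illustration; real embeddings come from a model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_neighbors(query, store, k=2):
    """Brute force: score every stored vector against the query, keep the top k."""
    scored = sorted(
        ((cosine_similarity(query, vec), name) for name, vec in store.items()),
        reverse=True,
    )
    return [name for _, name in scored[:k]]

# Hypothetical 3-dimensional "embeddings"; real ones come from a model
# and typically have hundreds or thousands of dimensions.
store = {
    "banana": [0.9, 0.1, 0.0],
    "apple": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}
query = [0.85, 0.15, 0.05]  # made-up embedding of some fruit-like item

top = nearest_neighbors(query, store, k=2)
print(top)  # the two fruit vectors rank above "car"
```

Scanning every vector like this costs time proportional to the size of the store, which is why the video talks about indexes for larger collections.
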
  • I would love to see.. what is the Best Vector database... ease of use vs performance. and why. This way we can stop guessing which one to try to use and just know this one is by Standard the best.

    @jessem2176 9 months ago
  • Yes please, I have to decide soon which database to use. RediSearch is cloud-only, and Pinecone too, I think.

    @smartfusion8799 11 months ago
  • Would love an explanation of indexing and how to use this with an LLM

    @pavlotriantafyllides5687 9 months ago
  • I would enjoy seeing a comparison among these different vector databases. Today I just picked the one that’s most convenient, but there’s probably a better rationale for choosing among them. The other topic I’d like to see is sustainability. For example, if I’m adding a new vector to the database once a week, what will happen after 10 years? Is that sustainable growth when I add a 1016-element vector every week of the year, or do I need to do something to re-index the database so that my performance doesn’t drop after a number of years? The data I’m creating now would be relevant for many decades.

    @lancerkind 2 months ago
  • Please explain further, any one of the vector databases with an example for each Weaviate, Pinecone..

    @vamc256 9 months ago
  • Yes please!

    @AndrewPrice2704 11 months ago
  • This is like that scene from the Matrix where Neo stops the bullets and he sees the Matrix(humans, objects alike) as lines of code. We are now converting objects like banana and apples into a bunch of numbers which even we can no longer understand looking at them via the vector embedding.

    @harshjain3122 11 months ago
  • I remember working on a vector database in the mid 1980s. That was a Pick system, mostly used for accounting, warehouse management and the like. Re-innovation. 😁

    @pascalmartin1891 9 months ago
  • A comparison of their underlying architecture would be useful.

    @mechcooper8341 9 months ago
  • yes please!

    @liperuf 11 months ago
  • Could you make a more detailed explanation of the differences and optimization cases? :) Thanks!

    @SonGoku-pc7jl 6 months ago
  • Yes, please.

    @DanielNiklaus 11 months ago
  • i would love to know more

    @thantzinoo938 11 months ago
  • I want to know how indexes work. How does the vector of the search prompt get mapped via index?

    @DanielWeikert 11 months ago
  • I say what Bob says. Thanks Bob.

    @darrylcatay2295 11 months ago
  • In LLM, I'm facing a token limit issue. With the vector database, will I be able to overcome token issues in llm?

    @akshaysena6598 3 months ago
  • I would like to see a practical application example. Adding vector database info into a group of images and how it's searched for.

    @mr9373 10 months ago
  • Can we use a FAISS vector store in production?

    @tabrezshaikh758 11 months ago
  • Which vectors, you are explaining my vectors of my matrix?

    @urimtefiki226 2 months ago
  • Tx

    @darksilentcore0 11 months ago
  • Vector DBs do not get around LLM context size limitations, but it seems like that’s the hot use case for them. Embeddings are not useful until they’ve been transformed through a neural network. I keep looking at these weird use cases like LangChain and I’m baffled that people accept their wide margin of failure.

    @johnshaff 11 months ago
  • What about kdb+ ?

    @senthil_the_analyst 11 months ago
  • Someone needs to make a 3D model of an LLM engine. In the paper “Attention Is All You Need” the number 512 is given. Relating the number 512 to your X/Y coordinates with the four quadrants: would it be accurate for me to assume that each of your four quadrants has a total size of 512 (22.6 x 22.6)? Furthermore, are 512 (22.6 x 22.6) allocated to each word of the input prompt, and 512 (22.6 x 22.6) to each of the 6 LLM layers (processed in series)? Am I correct in understanding this?

    @TimKaseyMythHealer 11 months ago
  • Can you point to where I can learn about how the indexing is done at a mechanical/gears-level ? Not like, the state of the art version, so much as like, “here’s the naive approach, and here’s the simplest improvement on it” ?

    @drdca8263 11 months ago
    • The simplest way to understand this is from pure higher mathematics. Look up vector spaces, inner product spaces, and “metrics” (e.g. metric spaces). The “vector embedding” is an algorithm (function) that assigns your actual data a vector (an N-element array of numbers) in a mathematical “vector space”. Vector spaces have nice mathematical properties; these ones are usually hyperspaces with hundreds or thousands of dimensions. You can then go further and define all sorts of add-ons: a function that defines the distance between two vectors is a “metric”, one that maps your vectors to lower-dimensional vectors is a “projection”, and so on. All these functions have to satisfy some abstract mathematical rules to be proper metrics, projections, etc., but once they do you pick up all sorts of additional nice properties for free. The “index” is generally the number(s) generated by applying one or more of these functions to your vector. For example, the index could be the number of nonzero entries the vector has, or its length, as defined by some metric. It’s some value(s) that allow searches to quickly prune away or skip most vectors, so that full checks and calculations only need to run on a much smaller subspace.

      @zackyezek3760 8 months ago
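
The "naive approach, then simplest improvement" asked about above, combined with the reply's idea of using a vector's length as the index value, can be sketched as follows. This is a toy illustration under stated assumptions, not the index of any real vector database: the names `naive_nearest` and `NormIndex` are hypothetical, distance is plain Euclidean, and the pruning relies on the reverse triangle inequality, | ‖q‖ − ‖x‖ | ≤ ‖q − x‖. Production systems generally use approximate nearest-neighbor indexes (for example HNSW or IVF) instead.

```python
import bisect
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def naive_nearest(query, vectors):
    """Naive approach: compute the distance to every stored vector."""
    return min(vectors, key=lambda v: dist(query, v))

class NormIndex:
    """Toy index: vectors sorted by length, so a search can stop early."""

    def __init__(self, vectors):
        self.entries = sorted((norm(v), v) for v in vectors)
        self.norms = [n for n, _ in self.entries]

    def nearest(self, query):
        qn = norm(query)
        hi = bisect.bisect_left(self.norms, qn)  # first entry with norm >= qn
        lo = hi - 1                              # last entry with norm < qn
        best, best_d, checked = None, math.inf, 0
        while lo >= 0 or hi < len(self.entries):
            gap_lo = qn - self.norms[lo] if lo >= 0 else math.inf
            gap_hi = self.norms[hi] - qn if hi < len(self.entries) else math.inf
            # Every unvisited vector is at least min(gap_lo, gap_hi) away,
            # so once that gap reaches the best distance we can stop.
            if min(gap_lo, gap_hi) >= best_d:
                break
            if gap_lo <= gap_hi:
                cand = self.entries[lo][1]
                lo -= 1
            else:
                cand = self.entries[hi][1]
                hi += 1
            checked += 1
            d = dist(query, cand)
            if d < best_d:
                best, best_d = cand, d
        return best, checked

# Made-up 2-dimensional data for illustration.
vectors = [[0.1, 0.1], [1.0, 1.0], [3.0, 3.0], [5.0, 5.0], [10.0, 10.0]]
query = [2.9, 3.2]

index = NormIndex(vectors)
best, checked = index.nearest(query)
print(best, "found after", checked, "distance computations out of", len(vectors))
```

The index returns the same answer as the full scan while computing fewer distances on this data. How much it prunes depends heavily on the data: if most vectors have similar lengths (as normalized embeddings do), length alone prunes almost nothing, which is one reason real indexes use richer structures.
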