[ML News] Llama 3 changes the game

May 17, 2024
45,075 views

Meta's Llama 3 is out. New model, new license, new opportunities.
References:
llama.meta.com/llama3/
ai.meta.com/blog/meta-llama-3/
github.com/meta-llama/llama3/...
llama.meta.com/trust-and-safety/
ai.meta.com/research/publicat...
github.com/meta-llama/llama-r...
llama.meta.com/llama3/license/
about.fb.com/news/2024/04/met...
twitter.com/minchoi/status/17...
twitter.com/_akhaliq/status/1...
twitter.com/_philschmid/statu...
twitter.com/lmsysorg/status/1...
twitter.com/SebastienBubeck/s...
twitter.com/_Mira___Mira_/sta...
twitter.com/_philschmid/statu...
twitter.com/cHHillee/status/1...
www.meta.ai/?icebreaker=imagine
twitter.com/OpenAI/status/177...
twitter.com/OpenAIDevs/status...
twitter.com/OpenAIDevs/status...
twitter.com/CodeByPoonam/stat...
twitter.com/hey_madni/status/...
cloud.google.com/blog/product...
twitter.com/altryne/status/17...
twitter.com/xenovacom/status/...
twitter.com/minchoi/status/17...
www.udio.com/
www.udio.com/pricing
Links:
Homepage: ykilcher.com
Merch: ykilcher.com/merch
KZhead: / yannickilcher
Twitter: / ykilcher
Discord: ykilcher.com/discord
LinkedIn: / ykilcher
If you want to support me, the best thing to do is to share out the content :)
If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
SubscribeStar: www.subscribestar.com/yannick...
Patreon: / yannickilcher
Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n

Comments
  • “If you don’t know what I’m talking about - and I don’t know why you wouldn’t…” I don’t know it because you’re my main source for important developments in machine learning.

    @tantzer6113 • 24 days ago
    • PS I don’t mind getting news with delay. I like it that you get into algorithms, capabilities, and technical overviews.

      @tantzer6113 • 24 days ago
    • If you're on Twitter and follow a few folks in the LLM community, it's almost impossible to escape this hype and news on your timeline

      @mriz • 23 days ago
  • I had my doubts about Zuck, but check him out now, championing open source AI like a boss! Maybe he should just grab the name 'Open AI', that is, if nobody's snagged it yet

    @YuraCCC • 24 days ago
    • The training data will be a legal nightmare on these proprietary things. Making it open source is the only way in this case.

      @demetriusmichael • 24 days ago
    • I know, this latest version of Zuck is amazing! I watched an interview of him talking about Llama 3 and he was so human-like

      @antonystringfellow5152 • 23 days ago
    • Opener AI?

      @peterfireflylund • 23 days ago
    • @@antonystringfellow5152 yeah, his avatar got a massive upgrade, he's almost human now

      @monad_tcp • 23 days ago
    • He was like this in VR too. Throughout VR development Meta/Facebook published many things open source, including computer vision models.

      @Wobbothe3rd • 23 days ago
  • Llama 3: they had no moat

    @mikethedriver5673 • 24 days ago
    • This wouldn’t be such big news if they didn’t have a moat that is just now being bridged.

      @float32 • 23 days ago
  • The next revolution imo definitely needs to be getting things to run locally with any sort of fidelity.

    @GuagoFruit • 24 days ago
    • If Stable Diffusion is a good historical example, then we should see some pretty significant perf improvements as soon as people (nerds) decide to stubbornly mess with it until it works.

      @Aphixx • 23 days ago
  • I am especially happy that Llama 3 supports multiple languages :-) Most open access or open source models are English-only and are no real alternative to OpenAI's GPT.

    @olcaybuyan • 24 days ago
  • Those t-shirt stripes are an example of reverse CAPTCHA - it spins humans right into dizziness and blackout, but AIs? They just keep watching and learning.

    @YuraCCC • 23 days ago
  • Mixture of Depth is a promising direction in modularizing LLMs, you could basically use only part of the model for specific applications

    @vladimirtchuiev2218 • 23 days ago
  • The anti-open source AI safety person impression at 13:48 is too accurate🤣

    @mikayahlevi • 23 days ago
  • Udio is great in my opinion. You don't let the AI create whole songs, but segments (of around 33s). It usually creates 2 variants at the same time. You can then extend those segments (before or after), either with a mid-segment, an intro, or an outro. You can even insert your own lyrics and it works like a charm. If you are happy with the song, you can then "publish" it and even pick a text-to-image cover art. I love that stuff.

    @thirdeye4654 • 23 days ago
  • I think a valid reason not to build models on 95% English data is that it could significantly influence the world view and "zeitgeist" of the model in all languages. It makes sense to have fully local models so as not to homogenize the world even further with US thought.

    @Embassy_of_Jupiter • 20 days ago
  • There are numerous papers about data quality and data selection going back to 2000. Good to see people realize quantity is not the end-all of training LLMs. Creating a good dataset has always been an art. Will the filters and pipeline for processing the data get open sourced?

    @woolfel • 23 days ago
  • As always the best curated ML news. Love your expertise and humor :D oh, and... more fish for Yann LeCat!

    @propeacemindfortress • 23 days ago
  • If the training text were plain ASCII and the average token length 4 characters, the training dataset would have been ~55 terabytes of plain ASCII. Wow!

    @pietrorse • 22 days ago
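
[Editor's note] The back-of-envelope estimate above checks out, assuming Meta's reported 15T-token pretraining corpus and ~4 one-byte ASCII characters per token:

```python
# Quick size check: 15T tokens at ~4 one-byte characters per token.
tokens = 15e12           # reported Llama 3 pretraining token count
chars_per_token = 4      # rough average for English text
total_bytes = tokens * chars_per_token

print(f"{total_bytes / 1e12:.0f} TB (decimal)")   # 60 TB
print(f"{total_bytes / 2**40:.0f} TiB (binary)")  # ~55 TiB
```

So the ~55 TB figure lines up if read as binary tebibytes; in decimal terabytes it is closer to 60.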
  • Thank you very much for your videos! Could you point me to some of the techniques that you find most promising for context-length extension?

    @Hacking-Kitten • 23 days ago
  • I really like the way you frame Meta/Zuck making Llama 3 open source. They choose the option that is best for the company, but what's best changes. For research and optimization an open source model is better; for profit a closed source one is better. What they do just depends on what is best at the moment, but I like that it's open source for Llama 3 right now and hope it will stay that way!

    @Voljinable • 18 days ago
  • Really enjoy these ML News vids. Great for keeping AI normies like me up to speed.

    @pablowentscobar • 24 days ago
  • Cool to have great open LLMs. Unfortunately, this is not the case for image generation models: all the recent advanced models like SDXL or Photoshop's are not free for commercial use.

    @Nico79489 • 23 days ago
  • Right at the moment, Phi-3 changed the game again!...

    @yaxiongzhao6640 • 24 days ago
    • Huh yeah, the 7B standard is no longer the standard. It's a pretty good model really, and it can also be run on GPUs with 4 GB of VRAM.

      @quickpert1382 • 23 days ago
  • 20:40 An alternative to this is using a documented SDXL Turbo workflow with ComfyUI locally, which can produce images of decent fidelity at even faster speeds than this demo, at least on my 3090.

    @marsandbars • 24 days ago
  • we are getting to model sizes where they might as well just be compressed lookup tables

    @sebastianp4023 • 23 days ago
    • That's essentially what they are regardless; that's how attention works. Tokens are used as queries against keys to compute similarity scores, and then the values are summed up based on those scores. It's essentially keying into a learned "dictionary" index

      @andreicozma6026 • 21 days ago
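
[Editor's note] The "learned dictionary" intuition in the reply above can be sketched in a few lines of NumPy. This is a toy single-head scaled dot-product attention for illustration, not Llama's actual implementation:

```python
import numpy as np

def attention(Q, K, V):
    """Toy single-head scaled dot-product attention."""
    # Similarity of each query against every key.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # Softmax turns similarities into soft lookup weights summing to 1.
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    # Weighted sum over values: a differentiable "dictionary read".
    return w @ V

# With near-orthogonal, high-magnitude keys the soft lookup becomes
# almost hard: each query retrieves "its" value row.
V = np.arange(9.0).reshape(3, 3)
out = attention(np.eye(3) * 100, np.eye(3) * 100, V)
print(np.round(out, 3))  # each row recovers the matching row of V
```

With low-magnitude or overlapping keys the weights spread out and the output becomes a blend of values, which is where attention stops behaving like a plain lookup table.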
  • thank you

    @yurykorolev • 23 days ago
  • This channel is one of the rare ones that I genuinely watch, amidst hours and hours of clickbait recycled AI hype videos :)

    @Iaotle • 23 days ago
  • Interesting times in many ways.

    @OperationDarkside • 23 days ago
  • You should recapitulate the math-from-code synthesis project from MIT using Llama 3, because that would be lit

    @brandonheaton6197 • 24 days ago
  • I find the license really fair. Their models will be obsolete by next year anyway. I think it is only appropriate that they should profit off of it until then, for having developed such a great step forward in local LLMs.

    @Embassy_of_Jupiter • 20 days ago
  • I'm not sure why people have reservations about Phi specifically. We don't know what data were used to train the other models and to what extent their performance relies on "fitting to the test dataset". Did OpenAI ever admit what role the human-curated part of their training dataset plays in the model's performance?

    @pawelkubik • 19 days ago
  • If we had access to the GPT-4 weights and biases, how different would they be from the Llama 3 weights? I use all the LLMs and find them pretty much even. Claude is fast but limited. I find Gemini Pro a little dumber.

    @FilmFactry • 24 days ago
  • I don't even know what benchmarks to believe

    @JorgetePanete • 23 days ago
  • The scores are high somehow and it makes me wonder whether they specially aligned the curated and the validation data when doing instruction finetuning!!

    @auresdz701 • 23 days ago
  • 🎉

    @snarkyboojum • 24 days ago
  • "we have used 15T tokens from publicly available sources . . . pls don't look too close . . ." 😂

    @sebastianp4023 • 23 days ago
    • that's a big "trust me bro"

      @sebastianp4023 • 23 days ago
    • The open-source dataset "The Pile" contained the 108 GB Books3 shadow library of approximately 196,640 pirated books, most of which are still under copyright. It is a "publicly available source," so the lying executives can shrug and screw over authors. They have $billions to spend on Nvidia chips but won't even buy e-books of the creative works they train on. (It's hard to tell what the current status is of The Pile and Bibliotik... intentionally so. The first Llama model trained on Books3. Rumor is all the AI companies saved a copy of Books3 before scripts pointing to the dataset were deleted, and Nvidia is being sued over training its NeMo LLM family.)

      @skierpage • 23 days ago
  • and we are witnessing a t-shirt revolution .. I'm still dizzy from seeing those stripes ..

    @huveja9799 • 24 days ago
    • it's to confuse visual learning algorithms

      @monad_tcp • 23 days ago
    • @@monad_tcp Well, I hadn't thought of that, but now that you mention it, it may well be ..

      @huveja9799 • 23 days ago
  • Very nice

    @cherubin7th • 23 days ago
  • Though we appreciate this greatly, go away and get back on vacation until you're rested!

    @theosalmon • 24 days ago
  • Do I really need a better LLM than Llama3 70B? If I have a good agent with search, RAG, and memory, isn't that good enough?

    @dr.mikeybee • 23 days ago
  • Llama 3 BS generator v5... don't worry, I'll include the "Llama 3" at the start

    @alan2here • 23 days ago
  • Does Llama 3 have any vision capabilities like GPT4?

    @DanielWolf555 • 23 days ago
  • 6:23 "There is enough research to show that once you are capable at one language, you only need quite little data on another language to transfer that knowledge back and forth" Can anyone point me to papers related to this argument? I am interested in cross-lingual transfer in language models.

    @214F7Iic0ybZraC • 15 days ago
  • helli hello!

    @thomasmuller7001 • 23 days ago
  • I tried to use existing LLMs to prepare a fine-tuning dataset on Theravada thought, philosophy, and practice. It turned out that all the models I tried were incapable of capturing any nuances in the meaning of words and concepts and stuck diligently to "their own philosophical framework of interpretation", regardless of the system prompt and regardless of feeding in scriptures, papers, or video transcripts; they couldn't even identify the proper questions. So please don't mind me disagreeing: language alone, maybe even regardless of percentage distribution, doesn't cut it on any task that requires cultural, philosophical, or religious understanding... not even talking about the human component in it. Translation, of course, is a totally different thing; common phrases and such can be captured quite well... the underlying unspoken human component, not so much.

    @propeacemindfortress • 23 days ago
    • Yeah, that seems like a very hard task for such a model. Sometimes you have to properly manage your expectations with these things

      @klausschmidt982 • 22 days ago
    • @@klausschmidt982 I've given up on it. Current models have neither the capability nor the training data that would allow for finer nuances on rare topics... future models might be capable, but with the move to synthetic data... 🤷‍♂ it's very doubtful that future architectures can do it after the synthetic data has been flattened into a unified interpretation... then we will have American and Chinese Buddhist interpretations 😂 So I join in on that "Yeah," — it was a nice idea, but specialized things might need a lot more human work and training investment than I can afford. Have a good one, thanks for the reply.

      @propeacemindfortress • 22 days ago
  • Old news, there is Phi-3 now.

    @pandalayreal • 23 days ago
  • The 8B param number strikes me as a bit weird. Why not 7B, to make a fair comparison between the models? Did they not achieve good results with 7B, or did they just not test it and decide in advance to compare against weaker models?

    @semaraugusto • 23 days ago
    • They have a larger vocabulary and, through that, more parameters in the embedding layers. The rest of the architecture (number of layers and heads) should still be the same.

      @BrandnyNikes • 23 days ago
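
[Editor's note] The reply's point about the embedding layers can be made concrete; assuming Llama 3 8B's published config (128,256-token vocabulary, 4,096 hidden size, untied input/output embeddings) versus Llama 2's 32,000-token vocabulary:

```python
hidden = 4096            # Llama 3 8B hidden size (per published config)
v3, v2 = 128256, 32000   # Llama 3 vs Llama 2 vocabulary sizes

# Input embedding matrix plus the output (LM head) projection:
extra = 2 * hidden * (v3 - v2)
print(f"~{extra / 1e9:.2f}B extra parameters from the larger vocabulary")
```

Roughly 0.8B extra parameters, which goes a long way toward explaining the jump from a 7B-class body to the 8B headline number.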
    • Fat-fingered typing error? 🙂

      @skierpage • 23 days ago
  • ScreenAI seems to be everything you need for an agent capable of performing UI interaction. That's exciting and disappointing at the same time (I agree that the accessibility via Google Vertex AI is very limited). Why can't Google provide a SIMPLE pay-as-you-go API solution like OpenAI and Anthropic??

    @OmicronChannel • 23 days ago
  • It's very easy to get models to hallucinate when asking for music recommendations. Llama is no different.

    @mimszanadunstedt441 • 21 days ago
  • Be sure to DL the Purple and Purple Free Versions by getting two emails for each model set requested. But be prepared for TB instead of GB worth of dls.

    @timothywcrane • 23 days ago
  • I would be careful not to just laugh safety guys off as these silly modern-day luddites. Anyway, can't wait for llama3-uncensored:400B. But then again, I just want to do cool stuff and see the world burn, so don't mind me! 😊

    @unvergebeneid • 23 days ago
  • Remember, math results not utilizing Wolfram are meaningless, since they will be child's play compared to results using Wolfram as a tool.

    @Dogo.R • 23 days ago
    • Cannot tell what you are trying to say.

      @eadweard. • 23 days ago
  • @13:16 "and with the past with Llama 2 we've already seen that all these people who have announced how terrible the world is going to be if we open-source these models have been wrong -- have been plainly wrong. The improvement in the field, the good things that have happened undoubtedly, massively, outweigh any sort of bad things that happen, and I don't think there's a big question about that. It's just that the same people now say 'Well okay not this model, but the next model... is really dangerous to release openly.' So this is the next model, and my prediction today is it's going to be just fine, in fact it's going to be amazing releasing this." @Yannic, That's quite a set of claims. What are all "the good things that have happened" beyond technical advances like more efficient models? I'm sure millions of people are more productive and writing better (or at least spewing grammatically correct verbiage), but are there actual studies of the good things, both with AI in general and open-source models? Meanwhile it's unclear how long it will be before we discover the awful uses of AI in the 2024 election cycles in major countries and other disinformation campaigns. I'm willing to believe your take, but some evidence for your optimism would be nice.

    @skierpage • 23 days ago
  • I can't care about LLMs until we get personal assistants that are completely customizable and fully transparent with no censorship.

    @hypno5690 • 24 days ago
    • But you can? It just requires a beefy PC

      @lonelybookworm • 23 days ago
    • Well you should care, because large language models and their evolutions are about to take over your life.

      @Rhannmah • 23 days ago
  • Pls add timestamps to the video

    @logangarcia • 22 days ago
  • If it wasn't for open weights, crazies banging away on 1050tis and pis like myself would have never been "allowed".

    @timothywcrane • 23 days ago
  • As a language learner, it feels not so great for other languages, and I question whether 5% non-English is enough. Discuss German and it'll often make mistakes explaining the grammar. Try to talk in Chinese and it'll switch back to English at every opportunity. Hopefully these are just issues with the prompt or instruction tuning that will be fixed by other fine-tunes, but for now I'm going back to Mixtral and ChatGPT...

    @BinarySplit • 23 days ago
  • Llama can pretend to run code, I got it to simulate a dos prompt and play text adventure games.

    @zyxwvutsrqponmlkh • 24 days ago
  • More paper reviews please

    @kbizzy111 • 23 days ago
  • I really hoped that the small model would be better at German, but unfortunately it's not good enough that I would prefer to talk to it only in German, and I don't think that my German-only-speaking friends would like to talk to it. The bigger model is probably much better at foreign languages, but unfortunately that one is again too big for a 4090. It's a pity. So having our own app at home, available via VPN from a mobile phone, that our other non-English-speaking friends could also use is still not really an option. Normal people are ignorant and would laugh at me. Maybe not if I gave a female assistant an erotic French voice? ;-) But I must say that despite that, I really like the instruct model; the chat model gave me a lot! of BS. Maybe some parameter tweaking may change that. I haven't had the time to play around with it more right now. But we'll see...

    @henrischomacker6097 • 24 days ago
  • I think there is no need to make it private. The moment the model requires more than ~24 GB of RAM to run, it is out of the hands of most businesses to use directly. You can release the weights, and people can privately run the poor models quickly or the medium models slowly, or you can laugh as their hardware runs out of RAM trying to run the full Facebook model...

    @andytroo • 24 days ago
  • If you are the first to produce the MMLU, is that an achievement or shameful? Luv that OpenAI just added reverse "gas fees".

    @timothywcrane • 23 days ago
  • Immediately passed by Phi 3

    @JacobAsmuth-jw8uc • 23 days ago
  • 0:05 The usual little cynical chuckle

    @mauricioalfaro9406 • 24 days ago
  • Wow it's only been two days since llama 3 release!? I swear it felt like a month ago..

    @diga4696 • 24 days ago
  • 13:23 To be fair, you can only end the world once, and after that happen you (luckily) won't be around to witness the outcome. Black swan with a touch of the anthropic (no pun intended) principle; you can only be alive to witness the state of the world if the world you live in has not been ended yet; once that happens, you likely will not be in conditions of acknowledging that it has happened; it is not something you can look back and see it after the fact, you can only experience it the first time it happens, and that is if you will be able to have any experience at all while it is happening.

    @TiagoTiagoT • 24 days ago
    • Yannic does not believe that AI can cause existential risk. With this generation of models, he is probably right, but the trend is not promising...

      @Hexanitrobenzene • 23 days ago
    • @@Hexanitrobenzene Humanity is blindly approaching the "tickling the dragon's tail" territory; but unlike with the Demon Core, once it goes critical, it won't be just a matter of a few lab workers suffering radiation exposure. Who knows, maybe we'll luck out and go the comic book route and gain godly super-powers; but in the real world, the odds aren't looking good. Don't get me wrong, I'm not saying we would be safer with just the big corps handling the development of the future, or the end of the future, of humanity; we're fucked either way. Moloch, you know?

      @TiagoTiagoT • 23 days ago
    • @@TiagoTiagoT relax. A language model doesn't have the agency nor the tools to make actions in the real world, and even if it did, it wouldn't be able to react and incorporate the results. We're quite far from the situation you're thinking of. Doesn't mean you don't have to think about it because it's pretty much undoubtedly coming in the future, but there is nothing to freak out about. The only actual worry currently to be had in the immediate future is the amount of people who become unemployable because of the performance of generative models.

      @Rhannmah • 23 days ago
    • @@Rhannmah You must not be following the news closely in recent years; people have been giving them all those abilities bit by bit at a faster and faster rate. Unemployment is a concern; but that's just the bathtub starting to overflow; meanwhile the air faintly smells like gas and there are lit candles all over the place....

      @TiagoTiagoT • 23 days ago
    • @@Rhannmah even without agency for the unknowable goals of an ASI, current AIs allow bad actors to do bad things with minimal effort. And what's really concerning is Google/Meta/Microsoft/OpenAI are run and owned by billionaire sociopaths whose goals include: getting you hooked on a stream of content so they can know all about you to monetize your profile; avoiding any meaningful government regulation; and stopping the redistribution of their wealth. Now imagine even worse actors and political campaigns having similar capabilities.

      @skierpage • 23 days ago
  • Yikes! Missed the phi-3 announcement..

    @christopherknight5526 • 23 days ago
    • The second half of the video is about phi-3

      @tomaszkarwik6357 • 23 days ago
  • Yeah you can get it in Africa, US, Asia, but not in the UK

    @alan2here • 23 days ago
  • What if they keep overtraining the smaller models until they plateau?

    @TiagoTiagoT • 24 days ago
    • that's literally what they did with Llama 3

      @JurekOK • 24 days ago
    • @@JurekOK I thought they said they hadn't plateaued yet by the time they stopped training?

      @TiagoTiagoT • 23 days ago
  • WinAmp ....

    @SimonJackson13 • 23 days ago
  • I tried a 2 bit quantized 70B model and it blew my mind how good it still was

    @Embassy_of_Jupiter • 20 days ago
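
[Editor's note] A rough memory budget (weights only, ignoring the KV cache and quantization-scale overhead) shows why 2-bit quantization makes a 70B model reachable for consumer hardware:

```python
params = 70e9  # Llama 3 70B parameter count
for bits in (16, 8, 4, 2):
    gb = params * bits / 8 / 1e9  # bits -> bytes -> gigabytes
    print(f"{bits:2d}-bit weights: ~{gb:5.1f} GB")
# prints roughly 140.0, 70.0, 35.0, 17.5 GB
```

At 2 bits the weights fit on a single 24 GB GPU, whereas FP16 needs multiple data-center cards.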
  • 8:25

    @user-rk4ux3cj8q • 19 days ago
  • > The good that's come from these models far outweighs the bad Really? Don't get me wrong, I think language models are great but I know people have lost their jobs over this, we've seen data breaches, people are falling in love with AI personas, one guy was driven to suicide, scams are on the rise... I have no shortage of bad things to mention that have come out of AI, but I can't think of anything truly good. I mean I'm sure a good number of people are a bit more productive in their work, but that doesn't seem like a worthy tradeoff to me. I also disagree with your cavalier attitude towards safety based on past experience. It seems possible to me that as these models become more powerful, we may attain the AI singularity (ability for self-improvement). Once that happens, past experience will have very little wisdom to impart on us regarding what will happen next. It's very possible that we're worried for nothing, but given the scale of what's at stake, it only makes sense to be cautious.

    @syncrossus • 23 days ago
  • People who thought that nuclear proliferation would cause nuclear war were wrong, they were wrong all along.

    @XOPOIIIO • 23 days ago
    • Well, they weren't wrong that it was extremely risky. It just didn't happen to happen.

      @eadweard. • 23 days ago
    • @@eadweard. Exactly

      @XOPOIIIO • 23 days ago
    • @@XOPOIIIO Cannot tell what you are trying to say.

      @eadweard. • 23 days ago
    • @@eadweard. How old you are? What is your IQ? Did you watch the video?

      @XOPOIIIO • 23 days ago
    • Is nuclear war possible without the proliferation of nuclear weapons? If you're going to talk about logical causes, be specific about the claim.

      @oncedidactic • 23 days ago
  • Correction: this is not open source. Open weights without the release of processes or code is akin to a binary library: you can build with it, but you depend on it without knowledge. Open source should at minimum be as open as Grok 1.0; otherwise it is quite an evil way of sourcing, getting regulatory ease from the government and dev ideas from the community, but keeping them dependent on the "binary". Same goes for Mistral.

    @seventyfive7597 • 24 days ago
    • Not sure what you mean. If they release no architectural code/information, how can you use it at all, even for inference?

      @eadweard. • 23 days ago
    • It's "open weights". Classical code does not have an analogy. With open weights you can do quite a lot of customizations, in contrast to binary library, which would require an extremely difficult task of reverse engineering to do any modification. True open source would be revealing the training data, the model code and the details of training processes.

      @Hexanitrobenzene • 23 days ago
    • ​​​@@Hexanitrobenzene I have to disagree on your first paragraph, what you're referring to is akin to the include file that goes along the binary. It's closed code. For comparison look at the amount of information X-AI released along Grok 1.0 . Mistral and Meta are local closed code, while OpenAI are SaaS closed code, but both are closed.

      @seventyfive7597 • 23 days ago
    • It is much worse with Llama because they reserve a right to terminate your license by accusing you of violating the Acceptable Use Policy, which they can change at basically any time. They also force you to defend them in court (indemnification) if your users sue them, which could be a big deal for a small company.

      @clray123 • 23 days ago
    • @@seventyfive7597 By "modifications" I meant fine tuning, which can be done way cheaper (

      @Hexanitrobenzene • 21 days ago
  • Just a reminder that openai constantly nerfs their production models. Beating today's cgpt3.5 doesn't mean it beats the launch cgpt3.5 which formed our first impressions.

    @ulamss5 • 23 days ago
    • Launch version doesn't exist anymore and will never exist again, so not really sure there's a point to compare to a ghost. As long as open source models keep getting better, it's progress :D

      @Phobos11 • 22 days ago
  • Regarding Llama3, Sam looked scared out of his mind in a recent video. ClosedAI sucks.

    @dr.mikeybee • 23 days ago
  • Obsessively blaming the safety crowd is IMO kinda cringe and lame. It's obvious why OpenAI and Anthropic don't open source their models: it's for profitability reasons. They don't even pretend it isn't, and they don't use safety as an excuse. Constantly blaming people who care about safety is gonna lead to a rude awakening when Facebook realizes it's tanked enough competitor market share and announces its own fully closed-off monetized models.

    @halocemagnum8351 • 22 days ago
  • Spun the 8B parameter Llama 3 model up locally with Ollama and asked it to summarise some text, and it just spat out garbage. Tried it on Q4, Q8 and FP16 quantizations, and apart from "Why is the sky blue?" everything else I tried got a totally rubbish response. Also found that it often went into long, seemingly endless cycles of outputting the same paragraph of nonsense over and over again. Can't speak for the 70B parameter model, but the results with the 8B show that this smaller version is definitely not ready for prime time.

    @TheEarlVix • 24 days ago
    • ++++++ same result

      @whoareyouqqq • 23 days ago
    • Phi3 significantly better

      @whoareyouqqq • 23 days ago
    • @@whoareyouqqq Yes, I tried Phi-3 for a sanity check because it all seemed a bit odd, especially after all the Llama 3 release hype, and Phi worked fine; not perfect, but definitely without the complete-garbage issues.

      @TheEarlVix • 23 days ago
  • Phi-3 is out and already beats the Llama 3 8B model, just a week after the Llama 3 release.

    @definty • 23 days ago
  • Onest

    @TylerMatthewHarris • 24 days ago
  • 🦙

    @jermunitz3020 • 24 days ago
  • This has really changed my perspective (from pessimistic to a little more optimistic); both the dunking on the doomers but also, by releasing these models and being unapologetic about it, we can start to get rid of the mystique that has been given to them because of this Wizard of Oz game OpenAI was playing. Letting people learn to deal with these systems by themselves and see what's under the hood. I'm confident that's going to lead to the more efficient use of these systems, something that's not achievable when the name of the game is just "MAKE MODEL BIGGA! MOAR DATA! MOAR COMPUTE!!!" The power of having generalized approximators is wasted if all you use them for is effectively brute force on a graph. The thing about data quality cannot be overstated. If we can be rational adults for a second and drop the hype, the fact of the matter is that calling these systems "artificial intelligence" and acting as if they're machine god doesn't change the fact that they're not intelligent, aren't doing anything close to it, nor have any of the cognitive properties the hypers and the doomers keep attributing to them. They are just functions; literal f(x)'s (granted, big spicy ones). You're fitting a function to data under some optimization procedure. The relevance of the data is that in neural networks (and siblings) we have mathematical guarantees about them being able to fit anything (within reason); they're general purpose approximators. That's a super useful thing to have! Quite powerful. You know what the weakness is though? **You can fit anything**. Anything includes things you as a human don't want! But if the thing you don't want generated a signal that can be used to minimize loss, then the system doing things you don't want is actually working as intended. Being able to fit anything means that the function you're using to fit ceases to be of central importance, completely shifting the burden onto the data itself.
Fitting these models (assuming you pulled it off) is just moving the data distribution from an explicit data format to a functional representation. Hopefully this leads to a sobering of the field, and maybe an attempt is made to return to symbolics with the gains of these models and maybe, just maybe, an artificial system could not just sound like a human, but reason like one.

    @haldanesghost • 23 days ago
    • The only way these LLMs can successfully predict the next word in an answer or conversation on almost any subject expressible in a sequence of characters (!!) is by being intelligent and having cognitive properties like understanding. We are all SO BLOODY TIRED of people claiming otherwise; if you deny AI "intelligence" and "understanding," you are making up your own definitions so you can move the goalposts to another sports stadium altogether. Just say that AI intelligence, understanding, and cognitive properties are not the same as human intelligence and human understanding (yes, we know), and give us your take on how they fall short. (I tried to use Bing Copilot Chat to find the pithy tweet where an AI expert trashed your tired wisdom that these things aren't intelligent... and it couldn't find it.)

      @skierpage • 23 days ago
  • I don’t think you can say there is no harm from open source AI. It is too early. It was inevitable anyway that there would be leaks. But people will try to cause harm. Kids in the US machine gun schools in order to be famous, so of course someone will try to create ‘terminator’. You laugh about EU legislation but at least certain activities have to be illegal otherwise they cannot be stopped. The effort on safety has to be stepped up.

    @markburton5318 • 23 days ago
  • No, it is not. Just another player and not the best of all.

    @Effectivebasketball • 22 days ago
  • Can you please put a summary at the beginning or end of your videos? It is so boring to listen to "wow, it is so good!" or "best model", etc.

    @ivanstepanovftw • 24 days ago
    • Adding this may possibly improve your videos, but I disagree that it is boring. I very much enjoy your videos 😊

      @mikethedriver5673 • 24 days ago
    • @@mikethedriver5673 OK! Here is spoiler for LLaMA 4/Mistral 2 7B/Phi/etc: OH MY GOD, IT IS SO MUCH BETTER, IT BEATS GPT-3.5.

      @ivanstepanovftw • 23 days ago
  • Llama 3 is a real downgrade

    @RozenKrieg • 23 days ago
  • Kinda late video.

    @tunestar • 24 days ago
  • Well, Llama 3 compared to Mistral does not really perform much better, the 7B and 8B that is.

    @buttpub • 24 days ago