Claude 3 vs ChatGPT in Street Fighter | Local 7B Model Tournament (Mistral, Gemma ++)

20 May 2024

5,168 views

👊 Become a member and get access to GitHub and Code:
/ allaboutai
🤖 AI Engineer Course:
scrimba.com/learn/aiengineer?...
📧 Join the newsletter:
www.allabtai.com/newsletter/
🌐 My website:
www.allabtai.com
GitHub:
github.com/OpenGenerativeAI/l...
ROM:
wowroms.com/en/roms/mame/down...
In this video I share how you can install and test the open-source project Street Fighter LLM Eval. I create my own strategies, upgrade the code to include the Claude 3 API, and run a local 7B model tournament! Very fun!
00:00 Street Fighter LLM Intro
00:24 How to Install
06:10 Claude 3 vs OpenAI Setup
08:45 OP Counter Strategy!
12:58 Local 7B LLM Tournament
18:21 Conclusion

Comments

Just wanted to say thanks for another great video. Discovered your channel last weekend and really appreciate your content. thank you!
@FerGodSakes219 1 month ago
- thnx a lot! really appreciate it, glad you're enjoying the content :) let me know if you have any questions!
  @AllAboutAI 1 month ago
That was so interesting, please do more of this!
@joannot6706 1 month ago
- thnx :) yeah this was a lot of fun to create. i have way more ideas so will def do more of these in the future!
  @AllAboutAI 1 month ago
- @@AllAboutAI Awesome
  @joannot6706 1 month ago
Thank you. Useful information.👍👍👍
@nic-ori 1 month ago
- thnx mate :) happy to help! let me know if there is anything else i can assist with.
  @AllAboutAI 1 month ago
what ive always been wondering, also with games like connect 4 and so on, if theres a strategy a that always beats strategy b, and strategy b always beats strategy c, does that automatically mean, that a beats c? or could it be that c beats a? that would mean we have a kind of stone paper scissors situation, which would be much more fun. it would mean you first have to identify what strategy your opponent is doing, and then reacting with something that you know that beats it. it would mean that there is no single strategy that beats everything. that would be kind of boring.
@peterkonrad4364 1 month ago
- thnx, that's a really good point. i think you're absolutely right, it could be a rock paper scissors type situation. that's something i've been thinking about as well. it would definitely make it more fun and challenging, having to identify your opponent's strategy and then counter it. i'm going to have to experiment more with that. i agree, having a single optimal strategy that beats everything would be a bit boring. the fun is in trying to outsmart and outmaneuver your opponent. great insights, thanks for sharing!
  @AllAboutAI 1 month ago
- @@AllAboutAI Is this where things like AlphaStar and AlphaGo come in?
  @coryarmbrecht 1 month ago
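The rock-paper-scissors question raised in this thread can be checked mechanically once you have head-to-head results. Here is a minimal sketch in Python, assuming you have recorded which strategy reliably beats which. The strategy names and the `beats` table are hypothetical placeholders, not results from the video or from the LLM Coliseum project.

```python
from itertools import permutations

# Hypothetical head-to-head results (NOT measured data):
# beats[a] is the set of strategies that strategy a reliably defeats.
beats = {
    "aggressive": {"defensive"},
    "defensive": {"counter"},
    "counter": {"aggressive"},
}


def non_transitive_triples(beats):
    """Return every (a, b, c) where a beats b and b beats c, but a does not beat c."""
    triples = []
    for a, b, c in permutations(beats, 3):
        if b in beats[a] and c in beats[b] and c not in beats[a]:
            triples.append((a, b, c))
    return triples


def dominant_strategies(beats):
    """Return the strategies that beat every other strategy in the table."""
    names = set(beats)
    return [s for s in beats if beats[s] == names - {s}]


if __name__ == "__main__":
    print("Non-transitive triples:", non_transitive_triples(beats))
    print("Dominant strategies:", dominant_strategies(beats))
```

If `non_transitive_triples` returns anything, "a beats b" and "b beats c" do not imply "a beats c" for that table. If `dominant_strategies` comes back empty, no single strategy beats everything, and you would have to identify the opponent's strategy before picking a counter.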
This video demonstrates how to set up and use an open-source project called LLM Coliseum that allows you to evaluate large language models in real-time using the video game Street Fighter. The process involves installing Docker, cloning the GitHub repository, and setting up API keys. The video then shows how to pit OpenAI's GPT-3.5 model against Anthropic's Claude in the game.
@TheHistoryCode125 1 month ago
- nice, sounds like a super fun project! i've had a blast playing around with it. as i mentioned in the video, i've made a few tweaks to the open source code so you can use models like claude 3 haiku as well. def checkout the github if you become a channel member, i upload all my code and experiments there. let me know if you have any other questions!
  @AllAboutAI 1 month ago
- This is an AI generated summary of the video
  @orterves 1 month ago
cool to see the same thing, but without the response time dependency. It is more interesting to see who is smarter, not faster
@DemiGoodUA 1 month ago
@AllAboutAI, could you explain why you used WSL instead of plain Windows? :)
@App2bits 1 month ago
- ah yeah, i just found it a bit easier to run everything on linux for this project. the docker setup and all that just seemed to work a bit better for me on wsl. but no worries if you just wanna use windows instead, should still work fine :)
  @AllAboutAI 1 month ago
What is your hardware setup to work with these models in parallel?
@App2bits 1 month ago
- thnx for tuning in! i am actually running this on my personal GPU, so i have a decent rig. but i agree, having multiple gpus would be ideal to play with different models in parallel. im just using this for a bit of fun and learning, not for anything production ready. let me know if you have any other questions!
  @AllAboutAI 1 month ago
We need to do a fight test with gpto Vs gpt4
@negadan77 7 days ago
I think using groq for inference and then having battle can make it even more fun
@rishabhsingh1406 1 month ago
- oh yeah, that's a great idea! i've actually been looking into using groq myself recently. i think it could add a really cool extra layer to the battle sim. i'll definitely give that a try and see how it goes. thanks for the suggestion, it sounds super fun!
  @AllAboutAI 1 month ago
Why use GPT 3.5 and Claude Haiku instead of GPT 4 and Claude Opus.
@Grandork 1 month ago
- thnx for the question. i decided to use gpt 3.5 turbo and claude haiku for a few reasons. first, they are a bit more accessible and affordable compared to the more powerful gpt-4 and claude opus models. i wanted to make this project something anyone could try out. plus, i've found the haiku and 3.5 models to be surprisingly capable for things like this. but you're right, the newer models could potentially offer even better performance! i'll have to experiment more with those in the future.
  @AllAboutAI 1 month ago
holy shit
@ReNiCGaming 1 month ago
- tnx! yeah, i tried to make this project as fun and engaging as possible. if you wanna get the code, just sign up as a member and i'll invite you to the community github :)
  @AllAboutAI 1 month ago
- I've played a whole lot of sf3 (check the channel). If you want help refining prompts/"moves" and optimal strategy. I'd love to help.
  @ReNiCGaming 1 month ago
- @@AllAboutAI if I can find the time I may.. it would be interesting to play AGAINST AI.
  @ReNiCGaming 1 month ago
Aw, I thought it was doing the multimodal thing, but this is just the keyboard combos as text. Multimodal would be way too slow I guess.
@wurstelei1356 1 month ago
- yeah, i agree. the keyboard combos are just a quick demo, the real power is in the multimodal stuff. but that does take a lot more compute, so its not quite ready for prime time yet. i'll try to do a more in-depth tutorial on the multimodal stuff soon!
  @AllAboutAI 1 month ago
- @@AllAboutAI Nice, I am collecting everything I can about multimodal AI. Especially the robot controlling ones. Would be nice to see a tutorial on controlling a cheap robot arm by multimodal here. Or maybe with a Google robot transformer.
  @wurstelei1356 1 month ago
I guess it may be OK in WSL, but seeing stuff done as root freaks me out :)
@qadirtimerghazin 1 month ago
- haha yeah, i know what you mean. i try to avoid root where possible too. but sometimes it just makes things easier, ya know? anyways, hope you're still enjoying the vids! let me know if you have any other questions.
  @AllAboutAI 1 month ago
It's like you just taught Skynet that a less aggressive battleplan will get you the win in the end... maybe humanity will survive a few more years due to your research 🙂
@jana171 1 month ago
- haha cheers mate :) yeah i def think there is a lot of potential in using llms strategically for different tasks. its just a bit of fun and learning, but who knows what the future holds!
  @AllAboutAI 1 month ago
- ai is way too smart to fight a war. only humans are stupid enough to do that. if ai really wanted to take over it would very slowly infiltrate our ideas, opinions, government, etc. it told me so itself lol!
  @NotEpoch 1 month ago
- @@AllAboutAI Yeah i totally loved this.. we could be looking at an entire new sport here, or a complete shift in how measuring of models and hardware are done. Epic !
  @jana171 1 month ago
i hope they manage to get it running at real speed someday and with multiple characters. Its quite a good and fun benchmark imho
@Lancelotxxx 1 month ago
- thnx, yeah that would be really cool. i think we are still a way off from that, but the progress in ai is moving so fast, who knows what the future holds! :)
  @AllAboutAI 1 month ago