"Think Before You Speak" (Quiet-STaR) AI Experiment

May 16, 2024
7,396 views

"Think Before You Speak" (Quiet-STaR) AI Experiment
👊 Become a member and get access to GitHub and Code:
/ allaboutai
🤖 AI Engineer Course:
scrimba.com/?ref=allabtai
📧 Join the newsletter:
www.allabtai.com/newsletter/
🌐 My website:
www.allabtai.com
GitHub Repo:
github.com/AllAboutAI-YT/thin...
Quiet-STaR Paper:
arxiv.org/pdf/2403.09629.pdf
In this video I share what would be my GPT-5 dream feature, and I run an experiment with Claude 3 Opus where I try to create a "Think Before You Speak" System 2 (Quiet-STaR) style of AI thinking using prompting. Interesting results.
00:00 GPT-5 Dream Feature
02:34 System 2 Thinking Prompting
04:50 Writing the AI Code
11:07 Test 1 - Is it working?
14:06 Test 2 - Base Line
17:13 Test 3 - Final Answer
20:58 Test 4 - GPT-4 Eval

Comments
  • GitHub Repo: github.com/AllAboutAI-YT/think-before-you-speak

    @AllAboutAI • a month ago
    • awesome, just uploaded the code to the repo, feel free to check it out! let me know if you have any questions or issues, always happy to help :)

      @AllAboutAI • a month ago
  • Brilliant as always. Thank you for sharing legend!

    @davediamond • a month ago
    • thnx a lot mate, really appreciate it :) thnx for tuning in and the kind words!

      @AllAboutAI • a month ago
  • Great points- thanks man

    @TimTruth • a month ago
    • no problem :) thnx for tuning in!

      @AllAboutAI • a month ago
  • Thank you. This is by far the best prompt engineering I tested ! I would like to express to you my gratitude. Can you share more prompt engineering methods ?

    @Jeffben24 • a month ago
  • Yeah claude 3 is awesome! Finally an LLM that does what I want it to do lol

    @BillyRybka • a month ago
    • yeah im really happy with the progress, hoping to see more cool stuff from claude in the future :)

      @AllAboutAI • a month ago
  • Great video, as always

    @OdinsTechAndGaming • a month ago
    • thnx a lot, really appreciate the kind words :)

      @AllAboutAI • a month ago
  • you could give more compute to deeper chains of thought. Maybe asking several LLMs for responses, and maybe asking all of them for a deeper thought on a subject, and then maybe each of them assessing each other's answers, and then working out the best response. All of this takes a heap of compute though ... but the result may be a very thorough and concise answer.

    @miker99 • a month ago
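
The multi-model "assess each other" idea above can be sketched in a few lines. This is a hypothetical sketch, not code from the video: `models` are plain placeholder functions standing in for real LLM API calls, and `scorer` is whatever cross-evaluation prompt or rubric you choose.

```python
# Rough sketch of the ensemble idea: get an answer from several models,
# let every model score every answer, and keep the answer with the
# highest average score. Plain functions stand in for real LLM calls.

def best_of_ensemble(question, models, scorer):
    answers = [model(question) for model in models]

    def mean_score(answer):
        # Each model in the ensemble rates this candidate answer.
        scores = [scorer(model, question, answer) for model in models]
        return sum(scores) / len(scores)

    # Ties resolve to the first highest-scoring answer.
    return max(answers, key=mean_score)
```

With real models, cost multiplies by the ensemble size for answering, plus one scoring call per model per candidate, which is exactly the compute blow-up the comment mentions.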
  • That's exactly my experiment ;) I'm trying to simulate System 1 and System 2 thinking by simulating the conversation flow before the response reaches the user :)

    @petersobolewski1354 • a month ago
  • Interesting approach! Thank you for your experiments, I am always happy to see your ideas. I immediately thought of systemic counseling. In short, the counselor would not give advice, but would motivate the client to pursue his wishes by asking open questions or come to the answer himself, since he already has the answer for himself anyway.

    @SuccessDynamics • a month ago
    • thnx a lot, really appreciate the feedback! that's a super interesting perspective - i love the idea of guiding someone to find their own answer rather than just giving advice. definitely something i'll think more about as i continue exploring these kinds of systems. cheers!

      @AllAboutAI • a month ago
  • I loved the video and definitely learned something there! I just can't seem to connect this kind of double pass prompt technique with the Quiet STaR paper itself, which to me doesn't seem to be a prompting technique but rather a built-in system in the LLM itself

    @adebinha1984 • 17 days ago
  • Kris have you tried the python package "rich"?

    @avi7278 • a month ago
  • cool, love the video. i have signed up as a member but i can't find the GH repo, what should I do?

    @NeuroNap • a month ago
    • hey! just send me your github username to kris@allabtai.com and i will add you to the private repo asap =)

      @AllAboutAI • a month ago
  • This sounds like the ideas explored in chain-of-thought prompting or multi-step prompting. AFAIK there is no way to really have the LLM "think more" about the answer by spending more compute directly on the problem. It will always spend the same amount of compute on the same number of tokens, that's just how the algorithm works. So if you want it to "think more" about a problem you have to make it generate more tokens, and possibly have it summarize the result at the end. Naturally that will make it more expensive.

    @MarcusHast • a month ago
    • hey, thnx for the comment :) yeah you are right, it is def inspired by chain-of-thought and multi-step prompting. and i agree, the only way to make it "think more" is to make it generate more tokens. but i think combining it with some of the "self reflection" techniques could be interesting. for now, it's mostly just a fun experiment, and an easy way to learn some of the core concepts in an applied way. thnx for tuning in! :)

      @AllAboutAI • a month ago
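
The two-pass "generate more tokens, then summarize" flow discussed in this exchange can be sketched minimally. This is not the video's actual (members-only) code: `ask_model` is a stub standing in for a real LLM API call, and the prompt wording is illustrative.

```python
# Minimal sketch of the two-pass "think before you speak" flow: one call
# spends extra tokens on hidden reasoning, a second call answers while
# conditioned on that reasoning. `ask_model` is a placeholder, not a real
# API client.

def ask_model(prompt: str) -> str:
    """Stub standing in for an LLM API call (e.g. Claude 3 Opus)."""
    return f"[model output for: {prompt[:30]}...]"

def think_before_you_speak(question: str) -> str:
    # Pass 1: "System 2" inner monologue -- extra tokens spent reasoning,
    # never shown to the user.
    thoughts = ask_model(
        "Reason step by step about the question below. Note doubts and "
        f"alternatives, but do not give a final answer yet:\n{question}"
    )
    # Pass 2: final answer, conditioned on the hidden reasoning.
    return ask_model(
        f"Question: {question}\nYour private notes:\n{thoughts}\n"
        "Now reply with only your final, concise answer."
    )
```

The user only ever sees the second response; the first pass is where the extra compute (extra tokens) gets spent.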
  • It would be neat to see any differences in math performance. I want to create a Spanish tutor, and it's helpful to have an AI agent that can count words precisely. Maybe that is a simple prompting issue. One more thought: could this work for image generation?

    @matthewfuller9760 • a month ago
  • What's GPT-4? It's all about Claude 3 now. :P

    @helix8847 • a month ago
    • haha, yeah i do agree. I consider Claude 3 > GPT-4 for almost all tasks now for sure. Haiku is also amazing

      @AllAboutAI • a month ago
  • Great concept! I think during Nvidia's most recent GTC keynote, Jensen pointed out that the architecture of the Blackwell GPUs includes a kind of 'smart router' at the hardware level, which allocates computing resources based on the task requirements.

    @darrylrogue7729 • a month ago
    • thnx! yeah, that's really interesting about the nvidia blackwell chips. i'll have to check that out, sounds like it could be a step towards what i was describing. always cool to see hardware innovations that can support more advanced ai systems. thanks for sharing that!

      @AllAboutAI • a month ago
    • I just had to confirm, I used the transcript from the GTC keynote and then asked GPT-4 if there is in fact an intelligent router, this was the response: Yes, Jensen Huang mentioned Blackwell’s capability to allocate compute resources dynamically, especially in relation to AI models and their computational demands. This capability is tied to Blackwell’s Transformer engine, which can dynamically and automatically rescale and recast numerical formats to a lower precision whenever possible. This feature is crucial for AI applications because it allows for more efficient computation by adapting the precision and computational resources based on the specific needs of the task at hand. Essentially, this means Blackwell can adjust its compute allocation to optimize performance and efficiency for various AI model demands, whether it’s processing simple tasks or complex AI algorithms. This approach not only maximizes computational efficiency but also enhances the overall performance of AI applications running on Blackwell.

      @darrylrogue7729 • a month ago
    • Thnx:) really great observation, will def look into that!

      @AllAboutAI • a month ago
    • @@AllAboutAI 8/10

      @darrylrogue7729 • a month ago
  • great job:) where can i find this code? and will this be a feature in GPT-5?

    @unsolved_mysteriez • a month ago
    • thnx a lot :) the code for this is on my github, just check the link in the description. as for gpt-5, that's just a dream feature of mine for now, but who knows what the future holds! hope you enjoyed the vid.

      @AllAboutAI • a month ago
  • This is exactly what I've been saying. In first grade I was asked to read aloud to the class. I had dyslexia but I didn't know it. I decided to read the sentence in my head, constructing it before I spoke. When I read it aloud, the teacher advanced me to advanced reading. It took until third grade for them to figure out that I had a reading problem, but I thought before I spoke with the sentences I read aloud. I guess I faked out the teachers for three grades before they found out I had dyslexia, after putting me in advanced reading. How messed up is that?

    @hope42 • a month ago
  • We dedicate the same level of compute per token, yes, but one of those questions will yield a longer response, and for each subsequent token the response-so-far is part of the context. Quite often ChatGPT changes its mind during a response, which is in effect spending more compute on a more complex question. Also, you can already switch compute allocation for different questions by changing ChatGPT models to the crappier-and-faster ones.

    @skaramicke • a month ago
    • hey mate :) thnx for the detailed comment and feedback, really appreciate it. agree 100% that the current chatgpt model is quite dynamic in how it spends compute and tokens. cheers!

      @AllAboutAI • a month ago
    • Thirdly, your idea is awesome and I’m implementing something similar in my agent manager thingie.

      @skaramicke • a month ago
  • And if it's already answered the question, it would have that for reference, and the allocation of needed "thought time" would be reduced. Additionally, that would be the thinking behind why "what is a banana" is simpler: it's already been answered, or has more data behind it. Therefore a question you asked yesterday would take less time tomorrow, after you or others have asked it. Collectively, it would be understandable for OpenAI to collect the data as we ask. I think the API doesn't save questions. You could, however, for easy reference: every question with its answer could be backed up, and the AI could read the questions for easy data retrieval, only grabbing info from the backup if it's useful.

    @playthisnote • a month ago
    • thnx for the insightful comment :) yeah i think you are spot on, if the model has seen the question before, the response would be much faster. collecting a knowledge base of questions and answers is def the way to go for a prod system. excited to see what the future holds in this space. appreciate you tuning in and sharing your thoughts!

      @AllAboutAI • a month ago
  • great video! is it even possible to give "thinking time" to an LLM today, or in the future? what do you think?

    @aifpl • a month ago
    • thnx a lot! yeah i think in the future we could see llms with more customizable "thinking time" and compute allocation. it's a really interesting concept and i'm excited to see how the tech evolves. for now, i'd say we're still a bit limited, but the core idea is really cool. can't wait to see what happens next!

      @AllAboutAI • a month ago
    • @AllAboutAI okay cool, do you think GPT-5, or some other model provider?

      @aifpl • a month ago
  • It would surprise me if they didn't have some method to choose an appropriate amount of compute based on token length. But maybe you are right

    @AntonBj3 • a month ago
    • yeah, I would like this to happen in a different system, before any token I can see is generated: some kind of architecture that uses time to come as close as possible to the best / correct answer with high accuracy. I don't wanna regenerate for a "better answer", I want the "best" the first time I ask

      @AllAboutAI • a month ago
  • yeah this is good:) what would you think is the biggest challenge to create this system? and how do i become a member to get access to the code?

    @KrisPeteCOD • a month ago
    • thnx :) the biggest challenge is probably the compute and storage needed to save all those inner monologue thoughts. becoming a member is easy, just use the link in the description to join up!

      @AllAboutAI • a month ago
    • @@AllAboutAI 8/10

      @KrisPeteCOD • a month ago
  • Great

    @agedbytes82 • a month ago
    • thnx a lot :)

      @AllAboutAI • a month ago
  • so what is the answer? Options are green, yellow, brown or black

    @yoagcur • a month ago
    • Also red, or a combination of some of those colors

      @jimgsewell • a month ago
  • When would AGI start to create its own YouTube channels?

    @chrisbloem730 • a month ago
    • thnx for the question! i'm as excited about the future of agi as you are, but i think we're still a ways off from agi creating its own youtube channels. there's still a lot of work to be done on the technical and safety side before we get to that point. for now, i'm focused on making great ai-powered videos to share with the community. but i'll definitely keep an eye on the progress of agi, it's a fascinating area! let me know if you have any other questions.

      @AllAboutAI • a month ago
  • I really enjoy your channel and I’m interested in experimenting with Quiet-STaR too. I’m sorry to say, but I think that you should spend more time thinking about this. In your first example “What color is a banana?” I believe you are looking for an easy answer of yellow. Great, but this isn’t the correct answer. A banana can be yellow, green, red, brown, black, or a combination of some of those colors. So, I don’t think that the correct answer is as easy as you think that it is. Also a single quick search of Wikipedia for Pythagorean theorem yields multiple examples of proofs. You almost need to know the answer to a question to know if it is an easy or difficult question. Even then, LLMs don’t necessarily process info the same as we do. What is easy for us, could be more difficult for the LLM, and what is difficult for our brain, may be a simple lookup for the LLM. Further, I don’t think that you are being fair with your baseline question. You ask a very different question and don’t ask it to format the answer in the way that you want. Then complain about the format. It seems that you are purposefully biasing the phrasing of your questions in order to get the result that you want, that using the Quiet-STaR method results in better answers. I’m with you man, I too think that the Quiet-STaR method or something like it can provide better answers with lesser models, but I think that this project is flawed enough, that it really doesn’t illustrate that. I have enjoyed many of your projects, you are one of my favorite AI channels, but I think this project needs to be better thought out. I think the analysis shows this to be true. I’m looking forward to your next project. I hope that you revisit this think before you speak idea again in the future, after you have given it more thought.

    @jimgsewell • a month ago
    • hey, thnx for the thoughtful comment and feedback, really appreciate it :) you def raise some good points, and i agree that the examples might be a bit cherrypicked to prove my point. but i still think the general idea has potential, but ofc needs more testing and thought. tnx for supporting the channel and tuning in, awesome to have you onboard and get this kind of input, keep it coming =)

      @AllAboutAI • a month ago
  • Aww… i was hoping it would use a voice and personality to speak to you. 😊

    @Ms.Robot. • a month ago
  • Would this method get us closer to 42?

    @avgplayer • a month ago
    • thnx for the question! haha, well i guess it would get us closer to 42 in a philosophical sense, but i'm not sure the compute power is quite there yet. maybe in a few years with gpt-5 or 6 we could start to really crack that big question ;)

      @AllAboutAI • a month ago
    • Did you notice that 42 was only the first part of the answer and 179 with access to a 42 is the full one ?

      @wurstelei1356 • a month ago
  • i dont see this happening in years to be honest. and your select compute allocation seems sus. do you mean ppl can set thinking time and compute allocations as parameters before running the prompt?

    @ThinkBigPod • a month ago
    • yeah, i know it's not realistic right now, but that's how i'd love to see it evolve in the future. you're right, letting folks set the compute and thinking time would be pretty neat. maybe one day!

      @AllAboutAI • a month ago
  • This video tries to sound cutting-edge but it's mostly over-engineered fluff. The core idea - having the AI reflect before responding - is interesting, but here's the real takeaway: 1) You can achieve a similar effect with simpler prompts that encourage thoughtful, personal responses. 2) Asking the AI to rate its own responses is pointless - it doesn't understand nuance the way humans do. 3) True value lies in your ability to guide prompts for the results you want, not fancy systems.

    @TheHistoryCode125 • a month ago
    • Bro typed a whole lotta bs, dumb comment

      @Romathefirst • a month ago
    • The effect you achieve through prompting is far worse.

      @GodbornNoven • a month ago
    • I completely disagree with your second point. I'd even argue it's better at rating responses than most humans

      @GodbornNoven • a month ago
    • 3) that's completely wrong too. There's only so much you can do with prompting 😂.

      @GodbornNoven • a month ago
    • Q*

      @lydellty • a month ago
  • nah I think you are delusional, i don't think GPT-5 is even close to this yet... big fail

    @tswiftly89 • a month ago
    • nah, i don't think so mate. this is just a bit of fun and creativity, not meant to be super realistic. i'm just exploring some ideas and having a bit of a think about what i'd love to see in the future. if you're not into it, no worries! i know ai hasn't quite gotten there yet, but it's fun to imagine :) hope you still enjoy the vids!

      @AllAboutAI • a month ago
  • What’s your email? Do you have a discord server?

    @tsap1 • a month ago