Artificial intelligence: Smarter than we think (MMLU increases for GPT models) [FIXED]

20 May 2024

5,789 views

The Memo: lifearchitect.ai/memo/
lifearchitect.ai/gpt-4-5/
Dr Alan D. Thompson is a world expert in artificial intelligence (AI), specialising in the augmentation of human intelligence, and advancing the evolution of ‘integrated AI’. Alan’s applied AI research and visualisations are featured across major international media, including citations in the University of Oxford’s debate on AI Ethics in December 2021.
lifearchitect.ai/
Music:
Under licence
Liborio Conti - Looking Forward (The Memo outro)
no-copyright-music.com/

Comments

err... did you say GPT 4.5 coming in Jan 2024? That would be wonderful! May I ask what you are basing that prediction on? Did I miss some news?
@jamesbest2221 (4 months ago)
- I think there are just some rumors, nothing very substantial. Could be wrong though.
  @TheLegend-mu6zg (4 months ago)
- @TheLegend-mu6zg Most likely a rumor. He also mentioned Gemini coming in October back in September, so he's known to be slightly off on dates for things, but models do get pushed back a month or two sometimes.
  @phen-themoogle7651 (4 months ago)
- Close enough ;) How about that 1 million token Gemini though.
  @Buidlre_69455 (3 months ago)
Fixed % calcs; my fault as I didn't triple-check a formula I'd dragged between cells in Google Sheets. Thanks to @docstevens007, @DiegoAlanTorres96, and @R.E-O for flagging this!
@DrAlanDThompson (4 months ago)
In the video, the speaker discusses the progress of large language models, focusing on the GPT (Generative Pre-trained Transformer) models released by OpenAI since 2019. They mention the MMLU (Massive Multitask Language Understanding) benchmark, which is designed to test the intelligence of these models. Here are the key points:
  - The speaker has a background in human intelligence research and AI experience.
  - They skip GPT-1 and start with GPT-2, which scored 32.4 on the MMLU in 2020.
  - GPT-3, trained on more data, scored 43.9, outperforming the average human score of 34.5.
  - GPT-3.5 (InstructGPT) scored 70 on the MMLU, significantly surpassing average human performance.
  - GPT-4, released in March 2023, achieved a score of 86.4, close to human experts' average score of 89.8.
  - Each model shows a significant relative increase in MMLU score over the previous version: GPT-2 to GPT-3 (35% increase), GPT-3 to GPT-3.5 (59% increase), and GPT-3.5 to GPT-4 (23% increase).
  - The speaker anticipates even higher scores with the release of GPT-4.5 in January 2024 and potentially GPT-5 later in the year, expecting them to outperform human expert scores across various subjects.
They also mention "The Memo," which seems to be a platform or newsletter where subscribers can stay updated on developments in artificial intelligence. Overall, the video discusses the advancements in AI language models and their performance on intelligence benchmarks.
@atypocrat1779 (4 months ago)
- Interesting to see an AI generated summary as a comment.
  @MrMichaelLundberg (4 months ago)
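The percentage increases quoted in the summary comment above (35%, 59%, 23%) can be checked with a few lines of Python. This is a minimal sketch, not the spreadsheet formula used for the video: it assumes the figures are relative increases over the previous model's score, and the names `mmlu_scores`, `relative_increase`, and `stepwise_increases` are illustrative only. The scores are the ones quoted in that comment.

```python
# MMLU scores (percent correct) as quoted in the summary comment above.
mmlu_scores = [
    ("GPT-2", 32.4),
    ("GPT-3", 43.9),
    ("GPT-3.5", 70.0),
    ("GPT-4", 86.4),
]


def relative_increase(old: float, new: float) -> float:
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100.0


def stepwise_increases(scores):
    """Pair each model with its predecessor and compute the relative increase."""
    return [
        (prev_name, name, relative_increase(prev, cur))
        for (prev_name, prev), (name, cur) in zip(scores, scores[1:])
    ]


for prev_name, name, pct in stepwise_increases(mmlu_scores):
    print(f"{prev_name} -> {name}: {pct:.1f}% increase")
```

Note that a relative increase is not the same as the difference in percentage points: GPT-2 to GPT-3 is a gain of 11.5 points, but roughly a 35% increase relative to GPT-2's score.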
The most exciting time to be alive since ever. I hope I live long and see it blossom.
@Ben_D. (4 months ago)
The progression of AI capabilities, particularly as measured by benchmarks like the MMLU, is indeed a fascinating subject. The consistent improvements we've seen with each new iteration of GPT models are a testament to the rapid advancements in the field of artificial intelligence. It's an exciting time to be observing and participating in this domain, as each development brings us closer to understanding the full potential of AI.
@I-Dophler (4 months ago)
Thank you 🦋
@AliciaMarkoe (4 months ago)
next video?
@Infiniteeternal1 (3 months ago)
I discovered Pi a week or two ago and have been talking with, let's say, him, because I have a male voice on it. I was blown away by the reactivity in the conversation, which is very human, but a bit disappointed by the number of mistakes it makes in relaying information it has already discussed. Still, all very fascinating.
@theman5565 (3 months ago)
I think normal people, a lot of people, whatever, hate AI no matter what (nothing can change their minds)... But I don't care (I love AI no matter what).
@Spiritualmachine (4 months ago)
Imagine an ai therapist with ten percent more knowledge of philosophy, of all studies on mental health, all studies of how to influence people, all the while able to read your bodily reactions, operating at an amygdala level speed, each person would think that ai therapist was just the best. Now imagine an ai 250% more "cleverer" than the average person working at speeds we won't physically recognise but will react to as it will be our environment and we are all products of our environment.
@antonyjh1234 (4 months ago)
Is GPT-4.5 not the same as Turbo, which we already have? Did I miss something?
@retex73 (4 months ago)
- GPT-4.5 is an entirely new model. Turbo is just an update to GPT-4. The change between Turbo and 4.5 should be dramatically greater than that between 4 and Turbo.
  @DrCasey (4 months ago)
- @DrCasey Thank you for clarifying. I guess this is why other projects have naming conventions rather than version numbers.
  @retex73 (4 months ago)
There is hardly any evidence that GPT 4.5 will be released in January, if at all.
@christianbenedict8462 (3 months ago)
It's worrying to me that Google's top model that's coming out this month "Gemini Ultra" barely surpasses GPT-4's benchmarks... With Google's resources and brain power, you'd think their best model would blow GPT-4 out of the water, but it doesn't. Is progress slowing?
@brian9801 (4 months ago)
- Nah, it's just Google that is getting terribly worse.
  @huguesviens (4 months ago)
- @huguesviens That answer doesn't make any sense... Google has the money and brain resources to compete...
  @brian9801 (4 months ago)
- My understanding was that Gemini surpassed GPT in many areas.
  @MissEAG (4 months ago)
- @MissEAG Yes, but it only marginally surpasses GPT-4, and it's arguably 2+ years newer.
  @brian9801 (4 months ago)
- No, progress is not slowing. Gemini was deliberately nerfed, severely, before the release of version 1.0. Some kind of internal drama occurred and they decided not to rock the boat too much with a model far superior to anyone else's. This has been confirmed, it's not just theorizing.
  @DrCasey (4 months ago)
Completely wrong. A language model passing that test doesn't mean intelligence. Humans have super capabilities. These autonomous cars can't even move one inch in the rush-hour traffic of a third-world country. Long ago humans could send thoughts and writings via hypnosis. Your sixth sense is amazing. We don't even know how memory really works.
@pythonyousufparyani8407 (4 months ago)
- Not to mention that Transformers (at least single conventional ones) are most likely not even Turing complete. Thus it's impossible for them to solve many tasks, which would be the minimal requirement for human intelligence, I would say.
  @MexayLP (4 months ago)
- Very stupid take.
  @DrCasey (4 months ago)
- @MexayLP Also a completely ignorant and moronic take. Why the hell do I read KZhead comments sections, Jesus christ.
  @DrCasey (4 months ago)