1 BILLION row challenge in Go - 2.5 Seconds!

May 14, 2024
8,136 views

In this video we look at how to aggregate 1 billion rows of weather station data in as little time as possible. We start with a naive approach and optimize it from 1 minute 30 seconds down to 2.5 seconds, using memory-mapped files, a custom hash map implementation, and multiple goroutines.
Implementation: github.com/duanebester/1brc-go
Thanks to Ben Hoyt
benhoyt.com/writings/go-1brc/
00:00 Intro
00:54 Simple Implementation
09:55 Advanced - Using mmap
15:01 Custom integer parsing
22:12 Parallel processing
34:16 Custom hashmap
42:48 Results
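
The custom integer parsing and custom hash map steps listed above can be illustrated with a minimal sketch. This is not the code from the linked repository: the names parseTenths, stats, table, and numBuckets, the FNV-1a hash, and the table size are all assumptions chosen for illustration. It assumes the challenge's input format (station;temperature with exactly one fractional digit) and leaves out mmap and goroutines entirely.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// parseTenths turns a temperature such as "-12.3" into integer tenths (-123).
// Assumption: optional '-', digits, a dot, and exactly one fractional digit.
func parseTenths(b []byte) int32 {
	neg := false
	if len(b) > 0 && b[0] == '-' {
		neg = true
		b = b[1:]
	}
	var n int32
	for _, c := range b {
		if c == '.' {
			continue // skip the decimal point; the value is kept in tenths
		}
		n = n*10 + int32(c-'0')
	}
	if neg {
		return -n
	}
	return n
}

// stats keeps running aggregates for one station, all in tenths of a degree.
type stats struct {
	min, max int32
	sum      int64
	count    int64
}

func (s *stats) add(v int32) {
	if s.count == 0 || v < s.min {
		s.min = v
	}
	if s.count == 0 || v > s.max {
		s.max = v
	}
	s.sum += int64(v)
	s.count++
}

// mean returns the arithmetic mean in degrees (sum / count).
func (s *stats) mean() float64 {
	return float64(s.sum) / float64(s.count) / 10
}

// numBuckets is a hypothetical size: a power of two that must stay larger
// than the number of distinct stations, because the table never grows.
const numBuckets = 1 << 12

type entry struct {
	key  []byte // nil marks an empty bucket; station names are assumed non-empty
	stat stats
}

// table is a fixed-size open-addressing hash map with linear probing.
type table struct {
	buckets [numBuckets]entry
}

// hashFNV1a is the 64-bit FNV-1a hash of key.
func hashFNV1a(key []byte) uint64 {
	h := uint64(14695981039346656037)
	for _, c := range key {
		h ^= uint64(c)
		h *= 1099511628211
	}
	return h
}

// get returns the stats for key, claiming an empty bucket on first sight.
func (t *table) get(key []byte) *stats {
	i := hashFNV1a(key) & (numBuckets - 1)
	for {
		e := &t.buckets[i]
		if e.key == nil {
			e.key = append([]byte(nil), key...) // copy: key may alias the input buffer
			return &e.stat
		}
		if bytes.Equal(e.key, key) {
			return &e.stat
		}
		i = (i + 1) & (numBuckets - 1) // linear probe to the next bucket
	}
}

func main() {
	lines := []string{"Hamburg;12.0", "Bulawayo;8.9", "Hamburg;-3.4"}
	t := &table{}
	for _, line := range lines {
		i := strings.IndexByte(line, ';')
		t.get([]byte(line[:i])).add(parseTenths([]byte(line[i+1:])))
	}
	h := t.get([]byte("Hamburg"))
	fmt.Printf("Hamburg min=%.1f mean=%.1f max=%.1f\n",
		float64(h.min)/10, h.mean(), float64(h.max)/10)
}
```

Keeping temperatures as integer tenths avoids float parsing in the hot loop, and storing only min, max, sum, and count per station is enough to report the mean without retaining every value.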

Comments
  • Hello Duane! This is some amazing stuff man! Keep making these and enlightening us! Thanks a lot!

    @ishaanrawal9327 · 29 days ago
    • Thanks, will do!

      @duanebester · 29 days ago
  • Thank you for blessing us🙏

    @kevinkim7068 · 1 month ago
    • Any time

      @duanebester · 1 month ago
  • As always, banger video! 🤤

    @jonathanchapa4513 · 29 days ago
    • You already know!

      @duanebester · 29 days ago
  • Thanks for sharing.

    @cariyaputta · 17 days ago
    • You bet

      @duanebester · 17 days ago
  • Great content, but the autopilot is taking the fun out of it

    @ashersamuel958 · 26 days ago
    • Great point. Will disable going forward!

      @duanebester · 26 days ago
  • Nice video. However, you are calculating the average while you are meant to keep track of the mean of all the values. That means having an array in the struct to keep track of all the values seen. Subscribed!

    @eZe00 · 29 days ago
    • I think “mean” in this case is the arithmetic mean, which is the same as the average: sum the numbers in the set and divide by the total count (per station). My output matches the baseline output, so I feel pretty confident in the implementation.

      @duanebester · 29 days ago
  • It takes me 1 minute just to cat the file to /dev/null

    @renkinjutsu01 · 25 days ago
    • Yep, the aggregation calculations are what increase the time drastically

      @duanebester · 25 days ago
  • @keemykim92 · 29 days ago