Stop using LINQ to order your primitive collections in C#

2022 ж. 7 Қыр.
96 006 Рет қаралды

Check out my courses: dometrain.com
Become a Patreon and get source code access: / nickchapsas
Hello everybody I'm Nick and in this video I will show you a couple of new LINQ methods introduced in .NET 7 that you probably shouldn't be using. There are some even faster and more memory efficient alternatives that you can use instead, depending on your usecase.
Don't forget to comment, like and subscribe :)
Social Media:
Follow me on GitHub: bit.ly/ChapsasGitHub
Follow me on Twitter: bit.ly/ChapsasTwitter
Connect on LinkedIn: bit.ly/ChapsasLinkedIn
Keep coding merch: keepcoding.shop
#csharp #dotnet

Пікірлер
  • Another important thing to mention is that Array.Sort() and List.Sort() perform unstable sort, so they not preserve the order of elements that have the same key, while LINQ .OrderBy performs stable sort, so never changes places of the equal keys. This could be useful (and even required) when you work with collections of non-primitive types (and this is probably the reason why OrderBy is using slower sorting algorithms).

    @maksymkyian4920@maksymkyian4920 Жыл бұрын
    • This is the first thing that came to mind when I watched this video and the reason why i always use LINQ to sort lists.

      @rafazieba9982@rafazieba9982 Жыл бұрын
    • Thanks for clarification! 😎

      @rusektor@rusektor Жыл бұрын
    • I actually didn't know this. What happens if you have elements with equal keys?

      @bilbobaggins8953@bilbobaggins8953 Жыл бұрын
    • @@bilbobaggins8953 With stable sort algorithms you have guarantee that their order will not change. With unstable you never know.

      @rafazieba9982@rafazieba9982 Жыл бұрын
    • They use different sort algorithms

      @IldarIsm@IldarIsm Жыл бұрын
  • Glad you sorted it out

    @KoScosss@KoScosss Жыл бұрын
  • Great video Nick! I was really hoping that you would run the benchmarks with the Params attribute to see what happens with 1_000, 10_000 and 1_000_000 sized collections. I would also really like to see the results from the sort using spans method. Really great video though as always! Keep them coming…

    @kazepis@kazepis Жыл бұрын
  • I have been testing around a lot with those OrderBy and Sort methods (both Array and List indeed) in the past, and I also discovered that smart self-choosing sorting algorithm stuff on especially those Sort methods indeed. And if I am right a native code version of those algorithms is chosen as well when possibile. I noticed in certain situations, you're indeed better off with the Sort methods than with linq OrderBy, but in real life these cases are very limited, because very often there is the more important decision to make is at which moment you actually want to store your data in a memory buffer (like array / list) anyway, or if just enumerating once is enough. Personally I remember that I noticed this with file / directory enumeration if I remember it right. Choosing your sorting method is one problem to solve, but an actual program has a lot more factors, with filtering and buffering as well.

    @jongeduard@jongeduard Жыл бұрын
  • Theos! You are my defacto no.1 resource for C# on KZhead. Enjoying the flow and straight to the point vids. I tip my hat for you sir :)

    @SubVengeance@SubVengeance Жыл бұрын
  • I think it's worth pointing out that `random.Next()` is a significant factor in the order(by) test cases and it *dominates* the sort test case. Without the hundred calls to `random.Next` the Sort() benchmark will be several times quicker for both the list and the array.

    @tymekmajewski4718@tymekmajewski4718 Жыл бұрын
  • If you wanted to do a descending order, I believe you can chain a Reverse() to the List as well. It may be slightly slower, but given the improvements overall, it may be worth it.

    @KnightSwordAG@KnightSwordAG9 ай бұрын
  • I believe that a static lambda like x => x with no closure will not allocate anything. It's compiled as a static method in the using class scope. It can be seen by decompiling the output managed DLL. Only delegates and lambdas that actually capture some state will be compiled to a separate class that then needs to be instantiated (which causes the allocation). No closure delegates will have their object set to null.

    @julkiewicz@julkiewicz Жыл бұрын
  • Way to go Nick, thanks for the useful video!

    @pavfrang@pavfrang Жыл бұрын
  • Hey Nick! I have suggestion for your next video! Since performance is your thing, why not talking about ObjectPool? Kind regards, Cristiano Dias

    @cristianodias3529@cristianodias3529 Жыл бұрын
  • Still working as of today, ty!

    @parkercrofts6210@parkercrofts6210 Жыл бұрын
  • What aout sorting objects using a property inside them, I think that's the most important use case. Can we use List.Sort with IComparison In that case?

    @tarekel-shahawy891@tarekel-shahawy891 Жыл бұрын
  • You probably shoulnt initialize the list in the benchmark method. Also, for the seed to have an effect, shoulnt you rather set the seed and init the array in an IterationSetup method? That way every benchmark iteration operates on the same array?

    @cn-ml@cn-ml8 ай бұрын
  • Shouldn't you re-seed the Random for each Enumerable for the same values, especially with int.ToString()?

    @flybyw@flybyw Жыл бұрын
  • I got sick of writing IComparers every time I wanted to Sort a List, so I created a generic IComparer class that replicated LINQ's OrderBy/ThenBy/Descending chaining ability. //List list; list.Sort(new OrderBy(x => x.IntField).ThenByDescending(x => x.Otherfield);

    @jdlessl@jdlessl Жыл бұрын
  • Behind the scenes, List just uses Array.Sort() on its internal array, so it's not surprising that they compare pretty much the same.

    @RobinHood70@RobinHood70 Жыл бұрын
  • wich is your Visual Studio Theme nick?

    @NakaT88@NakaT887 ай бұрын
  • In most cases, I'd say the readability of Linq sorts is more useful than the potential performance upsides of using in-place sorts. Would probably recommend starting with that then moving to in-place if performance becomes a bottleneck.

    @HAMYLABS@HAMYLABS Жыл бұрын
  • Order/OrderBy advantages: + purity (i.e. doesn't change underlying collection) + simplicity + readability + same for primitive/complex types + extension method of IEnumerable => you don't have to have a collection before and ToArray/ToList can be way down the chain of other methods + expression tree (allows for other optimizations etc.) Sort advantages: + faster (not always) + more memory efficient So it depends. I believe Order/OrderBy is better choice in most of the cases. Title of the video is a bit misleading.

    @Kundrtcz@Kundrtcz Жыл бұрын
    • For all cases the non-LINQ methods are objectively more perfromant. Even when they are slower for strings, the memory allocation offsets the speed in GC time so they are still faster. It's why LINQ is a huge no for any high performance software

      @nickchapsas@nickchapsas Жыл бұрын
    • @@nickchapsas Honest question, since I am not well versed in C#: Is high performance software usually written in C#?

      @ferranis104@ferranis104 Жыл бұрын
    • @@ferranis104 Yes

      @nickchapsas@nickchapsas Жыл бұрын
    • If software really need high performance, C# is not right language. Some system level language would be better choice e.g Rust, c++, C , GO for example

      @philosophy5561@philosophy5561 Жыл бұрын
  • It seems like it would be a simple optimisation to have order call the underlying sort functions. I guess the extension method is on the interface, but it seems like it would be worth it to check the type of the collection.

    @namewastaken360@namewastaken360 Жыл бұрын
    • This crossed my mind as well and you can write an extension method that does just that. A problem you run into, though, is that the extension method would then mutate the underlying array, which would break the expectation that LINQ methods are pure

      @doorhammer@doorhammer Жыл бұрын
  • thank you

    @rely1020@rely1020 Жыл бұрын
  • This is the best free software Ive seen. Respect.

    @davidmaldonado5098@davidmaldonado5098 Жыл бұрын
  • How are you on RC1? Site still says SDK 7.0.100-preview.7

    @jadenrogers3133@jadenrogers3133 Жыл бұрын
  • (still watching) at 7:45 should you the random in the method too? each method wont reset the seed which mean the benchmark are ran with different set of numbers

    @Spirch@Spirch Жыл бұрын
    • It doesn't really matter as long as they are not sequential

      @nickchapsas@nickchapsas Жыл бұрын
  • Do linq methods perform sort immediately, or can it be deferred or halted. Like say we order and take 100 out of a million, will the performance be the same as we array sort said million. I would benchmark it later, now I am from phone. But if someone beats me to it, you are welcome.

    @kyryllvlasiuk@kyryllvlasiuk Жыл бұрын
  • Great job pointing out that Array.Sort mutates the existing array (as opposed to returning a new array, which might point item at.

    @timothywestern6488@timothywestern6488 Жыл бұрын
  • 2:45 Wait, why does the lambda need to allocate? It doesn't capture any variables, so it seems like the compiler should be able to allocate it statically, right?

    @welltypedwitch@welltypedwitch Жыл бұрын
    • Nop. Delegates are referece types so they will always be allocated. To prevent that you can mark them as static or extract them into a static field/property and use that

      @nickchapsas@nickchapsas Жыл бұрын
    • @@nickchapsas I think you're wrong about this. - Non capturing lambdas are automatically cached in compiler generated static fields even without the static keyword (sharplab link at the end of this comment as proof). - The difference between allocations of OrderBy(x => x) and Order() in your tests at 5:20 is about 0.42 kB, that capture-less lambda can't possibly take up that much memory. - If you Benchmark OrderBy(x => x) against OrderBy(...) with a static cached delegate the memory allocation is equivalent. - My educated guess as to what is actually happening is that the IOrderedEnumerable, which is actually OrderedEnumerable (which implements IIListProvider.ToArray(), which Enumerable.ToArray() calls) is being smart and checking if it has been given an Identity Func (probably by comparing the reference to EnumerableSorter.IdentityFunc) and thus is able to optimize the sorting, as opposed to getting a Func that could be anything. Sharplab link: sharplab.io/#v2:EYLgxg9gTgpgtADwGwBYA+ABATARgLABQGADAAQY4oDchJ5OAdADICWAdgI40G0DM5WUgGFShAN6FSU8v3YAXANoBdUgGVocmABMAFBV4AeeQD5SMNgFcAtjCgBDYABsYASlEFppCR8/SMAdjNLG3snGAYAeSgtWwAhAE8dBFIAXlMEFwYAFQgAQSh7RJduTwBfQlKgA

      @petrusion2827@petrusion2827 Жыл бұрын
  • BTW. By *not* using a new Random(420) the arrays in are different in each benchmark.

    @tymekmajewski4718@tymekmajewski4718 Жыл бұрын
  • This comment is a friendly reminder for Nick to update Rider.

    @Gallardo994@Gallardo994 Жыл бұрын
  • In most cases I'd rather pay the cost in time for the side-effect free operations.

    @7th_CAV_Trooper@7th_CAV_Trooper5 ай бұрын
  • If you don't want to mutate your initial collection, how about cloning and then mutating the clone? Would that change the results in any meaningfull way?

    @guiorgy@guiorgy Жыл бұрын
    • It doesn't scale. You will always need to re-allocate the full collection which will get very costly

      @nickchapsas@nickchapsas Жыл бұрын
  • This might sound like a stupid question, but can any body tell me what IDE Nick is using? I'm currently using Visual Studio 2019. I like using it but Nick's seems more responsive than mine.

    @horwoodg@horwoodg Жыл бұрын
    • @@ilya-zhidkov Thank you. I'll give it a try.

      @horwoodg@horwoodg Жыл бұрын
  • For descending, how about following the list.Sort with a list.Reverse?

    @SteveGietz@SteveGietz Жыл бұрын
    • Bad idea. Just use a static comparer

      @nickchapsas@nickchapsas Жыл бұрын
  • Maybe worth pointing out by having three different arrays of different random numbers it can make for inconsistent results

    @Bliss467@Bliss467 Жыл бұрын
    • I was coming to mention the same thing. He made a point of that initially, then made changes that created three unique lists. However, I don't think that it invalidates his conclusions since the results are so stark.

      @sgartner@sgartner Жыл бұрын
    • It doesn’t make any difference for these scenarios. The only thing that really makes a difference is whether the arrays contain sequential numbers or not

      @nickchapsas@nickchapsas Жыл бұрын
  • to be clear - when you compare sorting methods you should compare this same data sets for each algorithm so instead of creating new collections of random items each time you should create this collection once and work on a copy of this collection

    @michaksiazek5905@michaksiazek5905 Жыл бұрын
    • This wouldn’t work for the sort method because they mutate the collection. That’s why I’m using a random with a seed. To deterministicly create the same random data

      @nickchapsas@nickchapsas Жыл бұрын
    • @@nickchapsas That's why OP said "work on a copy of this collection". You can create one randomized collection and then just create copies using the List constructor or the Linq .ToList(). Then you have different heap lists but the same content.

      @DJDoena@DJDoena Жыл бұрын
  • Present!

    @xRalphy@xRalphy Жыл бұрын
  • nice shirt in the thumbnail

    @pedrojose9135@pedrojose9135 Жыл бұрын
  • I don't think there is such big difference between qsort and introsort. Most likely such speed difference happens because OrderBy creates inner array of keys to sort, and then maps it into resulting IEnumerable.

    @PsiHamster@PsiHamster Жыл бұрын
    • There is a very noticable difference as you go on different sizes of collections, because the average of the algorithm changes

      @nickchapsas@nickchapsas Жыл бұрын
    • ​@@nickchapsas ​ I think my comment were deleted because of link of github. So i will wrote without code link I made a benchmark that tests Array.Sort, OrderBy and simple QSort implementation And it has no such a big difference between Array.Sort and QSort. Array sort slightly faster than QSort on small array size, and almost same for biggest. But Linq sorting with OrderBy always more than twice slow and getting worse with bigger sizes. | Method | Size | Mean | Error | StdDev | Allocated | |---------- |------- |--------------:|------------:|------------:|----------:| | ArraySort | 100 | 2.943 us | 0.0292 us | 0.0273 us | 576 B | | LinqSort | 100 | 6.707 us | 0.0479 us | 0.0400 us | 2024 B | | QSorting | 100 | 4.114 us | 0.0640 us | 0.0568 us | 576 B | | ArraySort | 10000 | 488.011 us | 4.6788 us | 4.1477 us | 40176 B | | LinqSort | 10000 | 1,215.639 us | 10.1845 us | 9.5266 us | 160425 B | | QSorting | 10000 | 598.974 us | 2.3151 us | 1.9332 us | 40177 B | | ArraySort | 100000 | 5,830.275 us | 94.3979 us | 88.2999 us | 400220 B | | LinqSort | 100000 | 15,815.721 us | 307.9211 us | 354.6024 us | 1600596 B | | QSorting | 100000 | 6,993.457 us | 24.4451 us | 21.6700 us | 400220 B |

      @PsiHamster@PsiHamster Жыл бұрын
    • QSort benchmark code [Benchmark()] public int[] QSorting() { var items = Enumerable.Range(0, Size).Select(_ => random.Next()).ToArray(); SortArray(items, 0, items.Length - 1); return items; } public int[] SortArray(int[] array, int leftIndex, int rightIndex) { var i = leftIndex; var j = rightIndex; var pivot = array[leftIndex]; while (i pivot) { j--; } if (i

      @PsiHamster@PsiHamster Жыл бұрын
  • The difference is not so significant when you're just benchmarking with 100 items. I've benchmarked it with 500k items and the difference between Array.Sort and OrderBy is a lot more significant. It's about 5ms vs. 60ms. So actually, Array.Sort is really nice, 12x faster for large loads 😄 And I'm not sure why they even use comparison-based algorithms. Integers and strings can be sorted in linear time with radix / bucket sort. Can someone explain why they don't use those algorithms? Is there an implementation detail that I'm missing out on? 🤔

    @marcotroster8247@marcotroster8247 Жыл бұрын
    • If I remember correctly, bucketsort uses 3 arrays of the same size: the original, another one for index-swapping, and a third one for the ordered result - which is a LOT of memory waste. But correct me if I'm wrong.

      @mgerry1992@mgerry1992 Жыл бұрын
    • @@mgerry1992 So, you're arguing that the memory allocation of 2 additional arrays with n elements each is more costly than a factor of O(log n) more computation time? 🤔 I mean that's log2(500k) = 19 as an additional factor. What's the factor of bucket sort or radix sort? Is it really worse than quick sort? 🤔

      @marcotroster8247@marcotroster8247 Жыл бұрын
    • @@marcotroster8247 "Is it really worse than quick sort" - Yes. I guess there is almost no usecase to sort 500k elements so therefore they dont use it. But if u have it then radix is the way to go.

      @patziwyruch@patziwyruch Жыл бұрын
    • ​@@marcotroster8247 kzhead.info/sun/ebJ6hdmRpJ2Dg30/bejne.html here u have a funny video where sorting Algorithm are compared. (14:52 results)

      @patziwyruch@patziwyruch Жыл бұрын
    • @@patziwyruch I hope you're not being seriously. An API designer should never restrict the user in such a way. People are relying on an API to sort their data with optimal time complexity; that's the reason for using an API in the first place. And people do have legit reasons to do so. You're not allowed to judge as an API designer. Just do your damn job and put the radix sort in when somebody wants to sort lots of strings/ints. It's really not that hard. My CS theory prof once told me 25% of computation time is spend on sorting. So I'd say this really matters and people are killing our climate with such an ignorant attitude of intentionally making things worse than they needed to be.

      @marcotroster8247@marcotroster8247 Жыл бұрын
  • You didn't run Benchmark with span example 😉 I was waiting for Benchmark Result of Span, but you end the video 😋

    @slimaneikourfaln3496@slimaneikourfaln3496 Жыл бұрын
    • It is equally as fast as the array part and less memory efficient because it allocates the memory

      @nickchapsas@nickchapsas Жыл бұрын
    • @@nickchapsas Okey I see. Thank you Nick for all your hardwork

      @slimaneikourfaln3496@slimaneikourfaln3496 Жыл бұрын
    • @@nickchapsas well, you could create a reference to the array, then create a span from it, sort the span, and the original reference to the array would be sorted and just return it without any additional allocations

      @autoplanet4833@autoplanet4833 Жыл бұрын
    • ​@@nickchapsas You are just declaring "items" as a Span but it has an implicit operator to convert the array returning from ToArray(), you could save the reference as array and then use AsSpan().Sort() to sort the array in place and then simply return "items" avoiding the ToArray().

      @carmineos@carmineos Жыл бұрын
  • This comment is unrelated to this video. Hey Nick, I really hate ENUMs. Specifically when newer devs use them in service contracts and change somewhat regularly. Would you mind putting together a video on a more elegant way to handle this topic that doesn't break contacts? I see this specifically in an EntityType or EventType and the moment a new one is added everything breaks.

    @dayv2005@dayv2005 Жыл бұрын
    • Seems like you could solve that by versioning your interfaces and map newer enum entries to best fit from the older enum definitions or a default value for code not ready to use the latest enum definitions on the implementation of the interface.

      @JetsetDruid@JetsetDruid Жыл бұрын
    • @@JetsetDruid sure but I would argue that adding a new type to an entity could be done without it increasing the API version. A good example would be to build something like a value object that represents that type and exposing the primitive of it such as a string so you don't break a contract by introducing a new type.

      @dayv2005@dayv2005 Жыл бұрын
    • @@dayv2005 I'd rather do my logic based on an enum that a string, but that's just me I guess. I'd say you are on to something if the property you are representing can be changed at run time (say if we were modelling a car and you stored model as an enum) but if the change would require compile time changes anyway (say something like the fuel type) then I see no problems with enums

      @JetsetDruid@JetsetDruid Жыл бұрын
  • What screen recorder do you use?

    @ronnyek4242@ronnyek4242 Жыл бұрын
    • OBS

      @nickchapsas@nickchapsas Жыл бұрын
  • Irrelevant to the video, how do you make that popup at 8:43 appear ?

    @kyriakosch@kyriakosch Жыл бұрын
    • Just hover over the method

      @nickchapsas@nickchapsas Жыл бұрын
  • .NET 7 is available and people are discovering that the methods available from .NET 1.1 can perform better :-D

    @mabakay@mabakay Жыл бұрын
  • Of course this would not apply to Linq to entities.

    @stephenyork7318@stephenyork7318 Жыл бұрын
  • Premature optimization the bane of every unproductive developer.

    @endofunk2174@endofunk2174 Жыл бұрын
  • If I found any of my devs spending this much time to save nanoseconds and Kb of memory we would have a discussion about "premature optimization." These videos are very fun to watch, but this isn't useful for production code. By the time you scale up to make a difference here, you've probably made some mistakes that lead you there. That's the place to focus your refactor.

    @bob-xm7ny@bob-xm7ny9 күн бұрын
  • My brain hurts

    @chezchezchezchez@chezchezchezchez Жыл бұрын
  • but I guess I just have to deal with bluetooth, tNice tutorials is a big con.

    @ali.mazeed@ali.mazeed Жыл бұрын
  • I wonder what your first language is. I am actually confused, because ur last name sounds of Greek origin, but your accent does not match that.

    @arjix8738@arjix8738 Жыл бұрын
    • I'm Greek

      @nickchapsas@nickchapsas Жыл бұрын
    • @@nickchapsas woah, your English accent is so nice! I hope my accent one day gets that good

      @arjix8738@arjix8738 Жыл бұрын
  • "Nick Chapsas - Champion of The Span Struct" No judgement, just poking fun at your regular recommendations in the use of Span. LOL ... and stop worrying about the pronunciation so much. English isn't your first language and the words with almost equal parts vowels as syllables are difficult for everyone. I'll endlessly make fun of Americans, Brits, etc., that were born into the language butchering it, but it's understandable for ESL folks to get tripped up sometimes. :D

    @asteinerd@asteinerd Жыл бұрын
  • This video feels like 12 minutes just to get talked down to and find out something I already knew and would have no problem finding on google if I didn’t…

    @officebatman9411@officebatman9411 Жыл бұрын
    • Were you on gunpoint when you clicked the video?

      @nickchapsas@nickchapsas Жыл бұрын
    • You're just saying this video wasn't right for you. Based on the responses, people found it interesting. You do know that youtube allows you to skip or close the video all together right?

      @marklord7614@marklord7614 Жыл бұрын
  • Lots of issues with this. When you re-write the code to generate the item list in every iteration without re-seeding the PRNG you're potentially operating on different lists of integers every time, which tends to invalidate the benchmark. Just make a copy of the array and pass it to the benchmark method using "[ArgumentSource(...)]" and a parameter. That or clone the array when needed in the "Array.Sort" benchmark and leave the original "_items" alone. "Array.Sort()" on strings is slow *if you don't provide a comparer.* If you pass in "StringComparer.Ordinal" (or whichever comparison you prefer) then it will run significantly faster. Without the comparer I found it performed a bit worse as indicated in your benchmarks, with the comparer it was much faster than OrderBy() or Order(). "Array.Sort()" with and without a supplied comparison on my machine was 13us (with) vs 83.17us (without). (Interestingly the StdDev was much lower for the "with" version.) Order() and OrderBy() also accept comparers, and perform *much* better when you use them. They still allocate more, but in terms of speed they're much faster when you explicitly provide the comparison. Moral of the story is to always specify the comparer when using any of the sorts. Not doing so will slow you down. Sorting spans is a tiny bit faster, and you can use "AsSpan()" to wrap an existing array without double-allocating. Something like this works just fine and is very slightly quicker (~1 SD worth in my tests) than "Array.Sort()": var data = _items.ToArray(); data.AsSpan().Sort(StringComparer.Ordinal); return data; And finally, "Array.Reverse()" is blazing fast and never allocates, so "Array.Sort(...); Array.Reverse(...);" is still faster, even when you're making a copy of the array to leave the original intact.

    @TheMonk72@TheMonk7211 ай бұрын
  • Shouldn't it be ToList().OrderBy() to actually sort a list?

    @realsk1992@realsk1992 Жыл бұрын
    • Oh no, because you won't be getting a list back. You'll be getting an enumerable and you'd have to re-enumerated

      @nickchapsas@nickchapsas Жыл бұрын
KZhead