Don’t Use the Wrong LINQ Methods

December 17, 2023
45,369 views

Use code MICRO20 and get 20% off the brand new "Getting Started with Microservices Architecture" course on Dometrain: dometrain.com/course/getting-...
Get the source code: mailchi.mp/dometrain/cpl-fuiefwu
Become a Patreon and get special perks: / nickchapsas
Hello, everybody, I'm Nick, and in this video, I will show you the difference between two LINQ-associated methods that look exactly the same but perform very differently.
Workshops: bit.ly/nickworkshops
Don't forget to comment, like and subscribe :)
Social Media:
Follow me on GitHub: github.com/Elfocrash
Follow me on Twitter: / nickchapsas
Connect on LinkedIn: / nick-chapsas
Keep coding merch: keepcoding.shop
#csharp #dotnet

Comments
  • Small correction at 8:03. The 40 bytes are not because the enumerator was allocated; in this case the enumerator a List<T> gives back is a struct. The 40 bytes are there because the struct needs to be boxed into an IEnumerator<T> interface, since the foreach operated on an IEnumerable<T>. A foreach on a List<T> doesn't allocate and is faster than a foreach on a list cast to IEnumerable<T>, which allocates: List<T> implements the interface's GetEnumerator explicitly (to hide it, because it returns an interface) and adds another GetEnumerator method that returns the struct enumerator directly, which foreach will use and which avoids boxing to an interface. It's also faster because the calls are direct (static binding, because they are struct methods) instead of virtual (dynamic binding through a vtable, because of the interface).

    @mrahhal (5 months ago)
    • And if I would go to check list, I would use more traditional check with param list. There is no need to push LINQ everywhere. If there is created list that means that should be exist real need for early list allocation. It is strange, what bad is engineered C#. And there is no big sense to optimize code unless programmer is writing high speed code. But in this case it would be assumed that he/she is high skilling programmer.

      @infeltk (5 months ago)
    • @@infeltk your comment does not compile.

      @Lammot (5 months ago)
    • Thank you for this explanation. I love seeing comments like this that help explain the behind scenes allocation process of certain features in C#.

      @8BitsPerPlay (5 months ago)
    • To add to this. For those confused how foreach can skip the boxing when enumerating over a list, the reason it works is due to foreach using "duck typing". That is, foreach does not look if the type is IEnumerable, but instead it looks if there is a GetEnumerator method. But that brings an interesting point, why is .All() not optimized for it already? .Count() checks if the target type is a collection (and if it is, avoids enumeration). In theory, can't .All() do the same? Check if the target has an indexer (I think IReadOnlyList interface would be fine) and use the for loop in that case.

      @wokarol (5 months ago)
    • @@wokarol Yes, in theory it can, but linq methods weren't really designed for performance. The fact that all non-materializing linq methods allocate new objects already trumps any attempt at micro optimization. Linq is for readability and convenience, not for performance (doesn't hurt to optimize though since it's part of a standard lib, but gains will always be minimal after a certain point).

      @mrahhal (5 months ago)
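
The boxing behaviour described in this thread can be observed with a small measurement. The following is a minimal sketch, not the video's benchmark: the class and method names are made up for illustration, and the byte counts it prints depend on the runtime version (newer JITs may remove the allocation entirely), so no particular number should be expected.

```csharp
using System;
using System.Collections.Generic;

static class EnumeratorBoxingDemo
{
    // Static type is List<int>: foreach binds to the public
    // List<int>.GetEnumerator(), which returns a struct enumerator.
    static long SumList(List<int> list)
    {
        long sum = 0;
        foreach (int x in list) sum += x;
        return sum;
    }

    // Static type is IEnumerable<int>: foreach goes through the interface,
    // which returns IEnumerator<int>, so the struct enumerator gets boxed.
    static long SumEnumerable(IEnumerable<int> source)
    {
        long sum = 0;
        foreach (int x in source) sum += x;
        return sum;
    }

    // Hypothetical helper: bytes allocated on this thread by one call.
    static long AllocatedBy(Func<long> action)
    {
        action(); // warm up so one-time JIT work is not counted
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    static void Main()
    {
        var numbers = new List<int> { 1, 2, 3, 4, 5 };
        long viaList = AllocatedBy(() => SumList(numbers));
        long viaInterface = AllocatedBy(() => SumEnumerable(numbers));
        Console.WriteLine($"List<int> parameter:        {viaList} bytes");
        Console.WriteLine($"IEnumerable<int> parameter: {viaInterface} bytes");
    }
}
```

Both methods contain the same foreach; only the static type of the parameter differs, which is the point the commenters make about Enumerable.All().
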
  • You can try this List extension method approach as a simple fallback too, since it's easier to incorporate into your code than changing the underlying List implementation: `namespace System.Collections.Generic { public static class ListExtensions { public static bool All<T>(this List<T> list, Predicate<T> predicate) { return list.TrueForAll(predicate); } } }` This can be extended to handle other functions, such as Any => Exists, FirstOrDefault => Find and so on.

    @RadusGoticus (5 months ago)
  • I would defer to the principle of least surprise: if the performance is not a (measured!) issue, use the generic .All(), otherwise do whatever weird stuff you gotta do for performance. I wish collections had specialized implementations of certain LINQ methods. E.g. having to use the trifecta of .Length, .Count and .Count() depending on what kind of collection you're working with is annoying, and it feels like specialized implementations of .Count() could exist for Lists/Arrays/etc. I bet someone is going to do some nasty things about this with interceptors at some point...

    @onetoomany671 (5 months ago)
  • The enumerator for the List will check the _version field of the list on the MoveNext calls. That field is updated every time the list is modified, so if the list is modified while you are iterating over it, you will get an InvalidOperationException informing you that the collection was modified. TrueForAll does not check the _version field and just directly indexes into the list. If All used TrueForAll under the covers, then that would be a behavior change. Also, is Func<T, bool> equivalent to Predicate<T>? I wasn't aware that the latter existed until watching this video.

    @ecpcorran (5 months ago)
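
The behaviour difference described above can be tried out directly. This is a rough sketch with made-up names; it uses a plain foreach to stand in for the enumerator-based path, and it relies on TrueForAll not checking the list's version, which is an implementation detail reported by the commenter rather than a documented guarantee.

```csharp
using System;
using System.Collections.Generic;

static class VersionCheckDemo
{
    static void Main()
    {
        // Enumerator path: mutating the list while enumerating it.
        var viaForeach = new List<int> { 1, 2, 3 };
        try
        {
            foreach (int x in viaForeach)
            {
                if (x == 1) viaForeach.Add(99);
            }
            Console.WriteLine("foreach: completed");
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine($"foreach: {ex.GetType().Name}");
        }

        // Index-based path: the same mutation, done once from the predicate.
        var viaTrueForAll = new List<int> { 1, 2, 3 };
        bool added = false;
        bool result = viaTrueForAll.TrueForAll(x =>
        {
            if (!added)
            {
                viaTrueForAll.Add(99);
                added = true;
            }
            return x > 0;
        });
        Console.WriteLine($"TrueForAll: returned {result}, count is now {viaTrueForAll.Count}");
    }
}
```

The mutation is guarded by a flag so that the index-based loop cannot keep growing the list it is walking.
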
  • TrueForAll is not a LINQ method. It is a List method, just like Add or Remove. Don't mix them up.

    @mightybobka (5 months ago)
    • Nick never stated that TrueForAll is a LINQ method. On the contrary, he mentioned explicitly that TrueForAll is a method on List at 5:30.

      @chr_kress (5 months ago)
    • @@chr_kress right, but the video is called "Don’t Use the Wrong LINQ Methods" , so it kinda gives wrong context

      @JSWarcrimes (5 months ago)
    • @@JSWarcrimes, @mightybobka I agree, this is a misleading title. I missed that. Sorry!

      @chr_kress (5 months ago)
    • unfortunately, feels quite clickbaity

      @kocot. (2 months ago)
  • So basically TrueForAll() >= All()? I don't understand why MS did it this way. Let's say All() existed before TrueForAll(): why didn't MS just replace the implementation of All() with the more performant one? And if TrueForAll() existed before All(), why did they even add it in the first place?

    @neralem (4 months ago)
  • I'm curious if this is generally true for other LINQ methods with native counterparts. For example Exists vs Contains or Any / Where vs FindAll. Would the only difference be the generation of the Enumerator? I know some LINQ methods do some magic underneath the hood, but I always assume the native implementation to be better.

    @lordshoe (5 months ago)
    • I have the same questions. And it's about both the methods of the List class and the static methods of the Array class. What I do remember, although this was not with the latest .NET version but a couple of years ago, is that I tested things like Sort to be a lot faster than OrderBy. The code of Sort tries to use all kinds of optimized sorting algorithms when it can, depending on collection size and data type. It even has an intrinsic native version that it imports from the .NET runtime, which is written in C++. However, LINQ methods are still really great when you have any kind of method that generically accepts any IEnumerable, which is actually a really good coding practice too. I still prefer that in most non-performance-critical code. This is still a lot more efficient than only accepting stored buffers like List and Array when they don't need to be created in the first place. But if you are writing a library, it can be a good practice as well to offer different overloads, also accepting Spans and things like that.

      @jongeduard (5 months ago)
  • There are already many LINQ methods that have different behaviors depending on the real type of the source enumerable. Why don't they change the All method so that it uses TrueForAll when source is a List or a T[] ?

    @Krimog (5 months ago)
    • I have the same question. Why use All when TrueForAll exists ?

      @micoberss5579 (5 months ago)
    • @@micoberss5579 Because when you have an IEnumerable, you don't know if TrueForAll exists or not. If they included the TrueForAll method in the All implementation (for types that have a TrueForAll method), basically, that video wouldn't have been needed. You would have the most performant algorithm automatically.

      @Krimog (5 months ago)
    • @@micoberss5579 Checking each IEnumerable whether it is a list will of course make all calls to "All" slower, but they do it in other cases...

      @TheTim466 (5 months ago)
    • My guess is that Any is simple and fast enough, that checking the underlying type would be unnecessarily expensive.

      @chris-pee (5 months ago)
    • @@chris-pee The Any without predicate checks the underlying type. As for the Any with a predicate and All, I'm pretty sure checking for the underlying type would still be quicker than memory allocation.

      @Krimog (5 months ago)
  • Nick, you provided the wrong explanation: allocations are due to interface as a parameter type. Just make two functions with foreach inside: one with list parameter, and the other with interface like IReadOnlyCollection; pass list into both and behold the allocations in the second method

    @andreypiskov_legacy (5 months ago)
  • Thinking of IEnumerable as a linked list - counting the elements is in itself an enumeration of the list. The implementation of All() therefore could not depend on the count of elements like TrueForAll does and can't avoid the enumerator. Using TrueForAll on an IEnumerable would require a ToList() or ToArray() call first, which also uses the enumerator unless the thing can be pattern matched to ICollection, in which case it does a memcpy. Even if the thing is ICollection and then TrueForAll() could be used, this is actually not the same behavior as All() because an Enumerator does more on each MoveNext than what TrueForAll is doing in its loop body, which another commenter has already pointed out.

    @Nate77HK (4 months ago)
  • So the collections need an optimal fold implementation with early exit, right? Having that, one can express First(), Single(), Any(), Any(Func), All(Func), Select(Func), Aggregate(...), Skip(int), Where(Func), Take(int) and some others without need for IEnumerable. Something like `IFoldable { TAgg Fold(TAgg seed, Func f) }`. Now add a lazy `IFoldable Reverse()` that would exploit indexing and you get Last(), SkipLast(int), TakeLast(int)

    @user-tk2jy8xr8b (5 months ago)
  • Do you really need your own `MyList` class? Wouldn't the `foreach` on the standard `List` have worked the same? The problem with the `foreach` in `Enumerable.All()` is that the static type of the sequence is `IEnumerable`. That causes boxing of the enumerator and virtual calls for `MoveNext()` and `Current`. If the static type were `List`, the `foreach` would have been as performant as the `for`.

    @vyrp (5 months ago)
    • Was gonna comment the same thing. It's not necessarily the foreach or 40 bytes that's the issue here, but the virtual calls through an interface (which brings with it the overhead of allocating and disposing). Using IEnumerator requires method calls to MoveNext() and get_Current() every iteration. So three calls per iteration (MoveNext(), get_Current(), and predicate()) instead of two (this[int] and predicate()). I'm curious how the JIT assembly compares between the two.

      @colejohnson66 (5 months ago)
  • The foreach issue strikes again. I wonder if C# could just have two paths when running any foreach: if it's a simple List, run the basic for loop with no enumerator. Then LINQ doesn't have to be responsible for switching between the List and non-List cases.

    @VeNoM0619 (4 months ago)
  • Someone made LinqAF. It's Linq, but implemented entirely using structs and is nearly allocation-free (hence the AF). Apparently it's a little slower than Linq, but its performance is more consistent for cases such as game development because of the immensely reduced allocations.

    @RealCheesyBread (3 months ago)
  • Maybe it would be possible to write a source generator that intercepts the LINQ method call and forces the use of the most optimized implementation according to the type.

    @kikinobi (5 months ago)
    • at that point, since linq is built-in, why not just do compiler magic instead?

      @saeedbarari2207 (5 months ago)
    • static analysis catches a lot of those, so I really dont see why bother

      @kocot. (2 months ago)
  • As always, you never disappoint, always on point. Thanks Nick!

    @akeemaweda1716 (5 months ago)
  • Very very very useful.

    @KodingKuma (1 month ago)
  • As a Unity developer, I always create extensions that copy LINQ methods, optimised for arrays and lists. That's my workaround: I'm either piggybacking on that heap of extensions or making them anew. I use Select and ToArray a lot, so I make a SelectArray method for status, lists and enumerables and roll with it.

    @zORg_alex (4 months ago)
    • Also, Unity lacks a lot of simple things, like Array.IndexOf and others; for some reason they are under ArrayUtility. Heaps of extensions. If only I had known a decade ago that I'd write so much code just for ease of use 😂.

      @zORg_alex (4 months ago)
  • Thanks for a very useful video. I appreciate the level of detail that you provide.

    @HeathInHeath (5 months ago)
  • They need to have some improvements left for net9 😅

    @ryan-heath (5 months ago)
    • There might be some drastic improvements in LINQ for the readonly collections coming in .Net 9 if an experiment comes through. Specifically, they're trying to finally make `ICollection` implement `IReadOnlyCollection` (same for `IList` & `IReadOnlyList`, `IDictionary` & `IReadOnlyDictionary` , and `ISet` & `IReadOnlySet`)

      @modernkennnern (5 months ago)
    • They are very busy fixing all the serious bugs in other areas

      @nothingisreal6345 (5 months ago)
    • @@nothingisreal6345 Where can we find those bugs, any examples? (Real question)

      @WDGKuurama (5 months ago)
  • I wonder how ConvertAll vs Select, Find vs FirstOrDefault, FindAll vs Where, Exists vs Any perform, in both List and Array types. I expected them to be optimised like it's done for Count(), but in this video we see it's not a rule.

    @kyjiv (5 months ago)
    • I guess they are making a trade-off between slowing down all calls, even those where the object really only is an IEnumerable, and gaining performance in other cases. With Count(), checking can dramatically improve performance (iterating through the whole thing vs. just reading the length), so it probably is worth it. On the other hand, in many cases the IEnumerable will be a list or array...

      @TheTim466 (5 months ago)
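
The type check that Count() performs, as discussed in this thread, can be made visible with a collection whose enumerator refuses to run. This is only an illustrative sketch: `NoEnumerationCollection` is a hypothetical type invented for the demonstration, and it shows the mechanism on a custom collection rather than measuring anything about List or arrays.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical collection: reports a Count but throws if it is enumerated.
sealed class NoEnumerationCollection : ICollection<int>
{
    public int Count => 3;
    public bool IsReadOnly => true;
    public void Add(int item) => throw new NotSupportedException();
    public void Clear() => throw new NotSupportedException();
    public bool Contains(int item) => false;
    public void CopyTo(int[] array, int arrayIndex) { }
    public bool Remove(int item) => throw new NotSupportedException();
    public IEnumerator<int> GetEnumerator() =>
        throw new InvalidOperationException("the source was enumerated");
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

static class CountFastPathDemo
{
    static void Main()
    {
        IEnumerable<int> source = new NoEnumerationCollection();

        // Count() is documented to use ICollection<T>.Count when the source
        // implements it, so the enumerator is never requested here.
        Console.WriteLine($"Count(): {source.Count()}");

        // All() has to look at the elements, so it asks for the enumerator.
        try
        {
            Console.WriteLine($"All(): {source.All(x => x > 0)}");
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine($"All(): {ex.Message}");
        }
    }
}
```

Count() can answer from a property, while a predicate-based method still has to visit elements, which is part of why the trade-off differs between the two.
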
  • I tend to not use higher order functions that often in programming besides maybe sorting, but wanted to mention that the SonarLint extension for Visual Studio will tell you in these cases when a better alternative (type specific) that should be used is available.

    @reikooters (5 months ago)
  • Hurts, that C#'s extension methods are not really methods, in a sense that they are not dynamically dispatchable. If that was the case, List could have had its own overload of All(p) and other LINQ methods with the applicable optimizations. I guess if I'm speculating, I'd also add, that they of course should be optimized to static dispatch by the compiler in all the applicable places. So in this example you would get exactly the same performance as TrueForAll, and in a case, where you treat the list as IEnumerable you would only get the overhead of dynamic dispatch, not the allocation.

    @zwatotem (5 months ago)
  • Does it make sense to use arrays instead of List in such cases?

    @MirrorBoySkr (5 months ago)
  • Why didn't the "x => x > 0" lambda contribute to allocations? Is it some kind of C# compiler optimization?

    @1nterstellar_yt (5 months ago)
    • I guess the C# compiler actually takes the managed function pointer and emit calli instead of creating the delegate object.

      @samsonho5537 (5 months ago)
    • Because it does not capture anything, so it compiles to a regular function.

      @mk72v2oq (5 months ago)
    • No variable captures

      @MRender32 (5 months ago)
    • It's not a closure so the lambda is just a delegate to a BTS generated method. Otherwise the compiler would generate a class for the closure and the instance would be allocated to the heap.

      @lordmetzgermeister (5 months ago)
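
The capture explanation given in the replies above can be checked with a small measurement. This is a sketch under assumptions: the method names are invented, the helper mirrors a common measuring pattern rather than the video's BenchmarkDotNet setup, and the printed byte counts vary by runtime, so none are quoted here.

```csharp
using System;
using System.Collections.Generic;

static class LambdaAllocationDemo
{
    // The lambda captures nothing, so the compiler can create the delegate
    // once and reuse it on later calls.
    static bool NonCapturing(List<int> list) => list.TrueForAll(x => x > 0);

    // The lambda captures 'threshold', so each call needs a closure object
    // and a delegate that points at it.
    static bool Capturing(List<int> list, int threshold) =>
        list.TrueForAll(x => x > threshold);

    // Hypothetical helper: bytes allocated on this thread by one call.
    static long AllocatedBy(Func<bool> action)
    {
        action(); // warm up: JIT work and one-time delegate caching
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    static void Main()
    {
        var numbers = new List<int> { 1, 2, 3 };
        long nonCapturing = AllocatedBy(() => NonCapturing(numbers));
        long capturing = AllocatedBy(() => Capturing(numbers, 0));
        Console.WriteLine($"non-capturing lambda: {nonCapturing} bytes");
        Console.WriteLine($"capturing lambda:     {capturing} bytes");
    }
}
```

The warm-up call matters for the first case, because the cached delegate is still allocated once, the first time the method runs.
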
  • IDE eating 1MB of RAM during idle times - great way to use memory for blinking the cursor :D

    @michalidzikowski5239 (4 months ago)
  • Yes! Very good analysis.

    @Psykorr (5 months ago)
  • In other words, if you are dealing with a known collection type directly, use the specialized methods if you care about performance and memory allocation. That way you escape the enumerable, especially when not needed for a query.

    @marna_li (5 months ago)
  • "...by using the right one, you're going to reduce the performance of said operation in half" No, thank you! 😂

    @zwatotem (5 months ago)
    • I stopped the video right there looking for a comment like this and ready to write it myself if none was to be found!

      @johnnyblue4799 (5 months ago)
  • I guess this is what an extension could fix with interceptors: simply go in and fix all the bad usages of LINQ by swapping the methods around?

    @Subjective0 (5 months ago)
  • I think most business developers will not care about this that much, as the GC is ready, there is 64GB+ of RAM on the machine and the CPU has a lot of cores. This is probably one of many hidden gems that you can't easily find on MSDN. If I wanted to follow the optimization path, I would rather craft my own function, maybe based on PLINQ, or I could introduce SIMD, so for me this is only a halfway solution. But I like this kind of useful gem. I only wonder why M$ has not marked the slower version as obsolete.

    @dev-on-bike (2 months ago)
  • and how does this method compare to the .Any method?

    @Sander-Brilman (5 months ago)
    • .Any(predicate) == .All(not(predicate)). I would be surprised if their performance differs in any way, they should both terminate at the exact same enumeration.

      @onetoomany671 (5 months ago)
  • Are there more methods like this?

    @dimitar.bogdanov (5 months ago)
    • Any() when you have an IEnumerable and Exists() when you have a List, Array

      @lingfar4134 (5 months ago)
  • TLDR; list implementations are better than generic IEnumerable methods, and btw one of the 2 methods discussed is not even LINQ o_O, so the title isn't very honest. Now, if you're interested why (foreach vs for), feel free to skip to ~ 5:00 (although reading comments is probably a better idea as the explanation from the video is arguable)

    @kocot. (2 months ago)
  • Personally I'd use .Any(x => x < 0) for that kind of evaluations, not .All() and I've never used TrueForAll(), it's like that one is not in my orbit at any time. Do you have (or can you make a new video about it) Any() benchkmark (pun intended)?

    @BrunoBsso (5 months ago)
    • This

      @clantz (5 months ago)
    • .Any(predicate) vs .All(not(predicate)) is a question of code clarity, IMO. Use whichever best expresses the business rule or whatever. Hard enforcing the use of one over the other is a code smell IMO.

      @onetoomany671 (5 months ago)
    • Except that any can stop iterating at first false, which can make a real difference

      @daniellundqvist5012 (5 months ago)
    • @@daniellundqvist5012 so does All, it makes no difference whatsoever

      @chylex (5 months ago)
    • Any() can stop at first true and return true, All() can stop at first false and return false. What is your point,@@daniellundqvist5012?

      @palecskoda (5 months ago)
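
The short-circuiting point made in the replies above is easy to confirm by counting predicate calls. A minimal sketch with sample data chosen purely for illustration:

```csharp
using System;
using System.Linq;

static class ShortCircuitDemo
{
    static void Main()
    {
        int[] numbers = { 3, 5, -1, 7, 9 };

        // All stops at the first element for which the predicate is false.
        int allCalls = 0;
        bool allPositive = numbers.All(x => { allCalls++; return x > 0; });

        // Any with the inverted predicate stops at the first element for
        // which that predicate is true, which is the same element.
        int anyCalls = 0;
        bool anyNonPositive = numbers.Any(x => { anyCalls++; return x <= 0; });

        Console.WriteLine($"All(x > 0)  = {allPositive}, predicate calls: {allCalls}");
        Console.WriteLine($"Any(x <= 0) = {anyNonPositive}, predicate calls: {anyCalls}");
        Console.WriteLine($"Same answer: {allPositive == !anyNonPositive}");
    }
}
```

Both calls evaluate the predicate three times here, stopping at -1, so choosing between them is a matter of readability rather than speed.
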
  • Your content is generally great, but you’re getting into micro-optimisation territory, Nick. 40 bytes of allocation, or 6us extra, is typically *not going to matter*. I’ve worked with plenty of devs who worry about all of these micro-optimisations to the detriment of clean code.

    @optiksau1987 (5 months ago)
    • If 1 billion C# programmers watch this video and take the advice, collectively 6 seconds of computing time will have been saved! I do enjoy the videos and find these things interesting to know, however I’ve also worked with developers who had to constantly change code to use the latest in thing, and it meant their code was never finished as something else would come along they wanted to use.

      @ghaf222 (5 months ago)
  • You definitely have to record a video about Aspirate and k8s deployment via Aspire

    @MatinDevs (5 months ago)
  • Not using the new UI in the latest version of Rider?? I like it, much cleaner hiding most of the menus I never use and focusing on one I use all the time

    @chrisusher5362 (5 months ago)
  • Is "All" also twice as slow on lists larger than 3 items?

    @lordicemaniac (5 months ago)
    • The benchmark had 10000 items, not 3.

      @phizc (5 months ago)
  • What is with .Any ? Why do we need all or trueforall, if we have any 😁

    @kosteash (5 months ago)
    • If you have Any, you need to invert the condition for it to work and short-circuit. But that doesn't solve the performance problem it can bring. If you use .All in frequently executed code, it will put huge stress on the GC.

      @NickMaovich (5 months ago)
    • Exists is the List-specific equivalent of Any

      @zabustifu (5 months ago)
  • would be nice to get a `TrueForAny` method to get the performance gains and not have to invert the condition for existing code

    @serbanmarin6373 (5 months ago)
    • list.Exists() exists 👀

      @RadusGoticus (5 months ago)
  • I don't remember I ever used this method. The faster way is to use reverse Any method. Instead of numbers.All(x => x > 0) use numbers.Any(x => x

    @isnotnull (5 months ago)
    • Would it be way faster? They should both exit on the same enumeration, which is the first negative or 0.

      @okmarshall (5 months ago)
    • @@okmarshall My bad

      @isnotnull (5 months ago)
  • What a perfect example of over-optimization and "clever" code for highly limited improvements. MSFT constantly chases perf improvements... why bother learning this?

    @berzurkfury (5 months ago)
  • If you're concerned about the performance of foreach vs a for loop, you probably shouldn't be using C# in the first place, if performance is that important to your application... And it's probably worth pointing out that even if it's "twice" as fast with 10k values, doesn't mean that it'd still be twice as fast with 100k values

    @Dimencia (5 months ago)
    • Performance aware code is important regardless of language. Why make something slower if it costs you nothing. I agree with the idea that you shouldn't spend an exorbitant amount of time optimizing, but if simply choosing the correct data structure can have drastic effect on performance then you should be aware of it. Like, if you know a collection is 100 elements, specify that in the list constructor. Better performance and makes the code clearer

      @modernkennnern (5 months ago)
    • Simply not true anymore. C# is very fast compared to a lot of languages, depending on your application. We can always think about optimisation, and yes, some do use C# for high-performance applications.

      @okmarshall (5 months ago)
    • @modernkennnern it doesn't cost nothing, it costs readability. There's a reason people prefer to use and read foreach loops, and a reason people use c# instead of c++

      @Dimencia (5 months ago)
    • @@Dimencia I definitely agree with that specific scenario. I never use for loops either, and despite what this video told you there's actually no perf difference between for and foreach loops anymore. Nick even has a video on this topic, I'm surprised he didn't think this through. All I'm saying is that I do not subscribe to the notion that "performance does not matter" that so many developers nowadays tout. It's not critical, but it _does_ matter. However, I'm also not advocating for using `ReadOnlySpan`s and `InlineArray`s everywhere in order to eke out that tiny bit of performance, while making the code unreadable in the process. Hence the phrase "performance aware". You should be aware of what you're doing, and realize that a `List` is not the only collection type available (contrary to practically all code I've ever seen at the company I work for).

      @modernkennnern (5 months ago)
  • allPositive must be refactored into anyNegative and use .Any() and filter if any < 0

    @andreyz3133 (5 months ago)
  • Zoomies

    @ryanzwe (5 months ago)
  • hahahaha, 420 . I know what you mean

    @zoiobnu (5 months ago)
  • It is not a Linq method man

    @unskeptable (4 months ago)
  • This level of optimization is not meant for C# code. Use C or C++ if you are counting bytes.

    @mustafasabur (5 months ago)
  • I would have done list.Min() >= 0 xD

    @BlyZeHD (5 months ago)
    • Use list.OrderBy(x => x).First() >= 0 to get on the sigma grindset, my dude.

      @lordmetzgermeister (5 months ago)
    • That is expensive

      @nothingisreal6345 (5 months ago)
    • All and TrueForAll stop as soon as the predicate is false, which could be at the first item. Min has to find the smallest value, so it has to evaluate every item. E.g. if the first value is -1 and there are 1 million items, All/TrueForAll evaluate 1 item before returning false, while Min has to evaluate all 1 million items.

      @phizc (5 months ago)
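
The scenario from the last reply (a first value of -1 followed by a million items in total) can be sketched as follows. The `Counted` wrapper is a hypothetical helper added only to count how many elements each approach actually consumes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class MinVersusAllDemo
{
    static int visited;

    // Hypothetical wrapper: yields the source lazily and counts each
    // element that a consumer pulls from it.
    static IEnumerable<int> Counted(IEnumerable<int> source)
    {
        foreach (int x in source)
        {
            visited++;
            yield return x;
        }
    }

    static void Main()
    {
        var numbers = new List<int> { -1 };
        numbers.AddRange(Enumerable.Repeat(1, 999_999));

        visited = 0;
        bool viaAll = Counted(numbers).All(x => x >= 0);
        Console.WriteLine($"All(x >= 0): {viaAll}, elements visited: {visited}");

        visited = 0;
        bool viaMin = Counted(numbers).Min() >= 0;
        Console.WriteLine($"Min() >= 0:  {viaMin}, elements visited: {visited}");
    }
}
```

Both expressions give the same answer, but All returns after the first element while Min consumes the whole sequence.
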