“Turn All Your Enums Into Bytes Now!” | Code Cop

Mar 17, 2024
33,148 views

Hello, everybody, I'm Nick, and in this episode of Code Cop, I will talk about enums and how turning them into bytes instead of integers makes absolutely no sense.

Comments
  • There is a compiler code analysis warning for this situation that, if enabled, will trigger if you set an enum to anything other than int. "CA1028: Enum storage should be Int32" This warning is not enabled by default, but I imagine there is some reason it exists. Microsoft says this about it: "Even though you can change this underlying type, it is not necessary or recommended for most scenarios. No significant performance gain is achieved by using a data type that is smaller than Int32."

    @crystalferrai · 1 month ago
    • Right, when you’re dealing with a billion + rows, these small optimizations are critical. But as it said in the first line of the content, it depends on your usage.

      @jamescanady8156 · 1 month ago
    • Honestly, that's a very anal and stupid warning. There are cases where you need specific integer sizes; off the top of my head, in protocols where the padding matters.

      @Alguem387 · 1 month ago
  • Actually, a couple of times I have specified a non-default base type for enums. But in all those cases the actual data was transferred as a binary stream via a serial connection to an embedded device, so I had to carefully follow the protocol.

    @DmitryKandiner · 1 month ago
    • Agreed! That is probably the only viable reason though. For 1 million records in a table this supposed "optimization" saves 2.86 MB. On top of that, it might also do more harm than good, as the IO could be negatively impacted. The "view" of the OP also only covers non-flag enums. An int is a good default for flag enums, as anything smaller gives you only 8 or 16 flags. The Win32 API is full of int and long flag enums. I've done some PLC stuff in the past too, and the PLC was even putting bit flags together in 16- or 32-bit registers; that was the only time I have had to step away from the default.

      @paulkoopmans4620 · 1 month ago
    • @paulkoopmans4620 For 1-2M records an hour, that's actually about 3 GB every month.

      @peryvindhavelsrud590 · 1 month ago
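The two figures in this thread can be checked with a quick back-of-the-envelope calculation. The sketch below is in Python purely for the arithmetic; it assumes one enum column, 3 bytes saved per value (4-byte int versus 1-byte tinyint), a 30-day month, and it ignores indexes, row overhead, and compression.

```python
# Rough storage saved by storing an enum as 1 byte instead of 4.
# Assumptions (not from the thread): a single column, no compression,
# no index copies, a 30-day month.
BYTES_SAVED_PER_VALUE = 4 - 1


def saved_bytes(rows: int) -> int:
    """Bytes saved for `rows` values of one enum column."""
    return rows * BYTES_SAVED_PER_VALUE


def to_mib(n: int) -> float:
    return n / 2**20


def to_gib(n: int) -> float:
    return n / 2**30


one_million = saved_bytes(1_000_000)
print(f"1M rows: {to_mib(one_million):.2f} MiB saved")  # 2.86 MiB

HOURS_PER_MONTH = 24 * 30
low = saved_bytes(1_000_000 * HOURS_PER_MONTH)
high = saved_bytes(2_000_000 * HOURS_PER_MONTH)
print(f"1-2M rows/hour: {to_gib(low):.1f} to {to_gib(high):.1f} GiB per month")
```

Under these assumptions both commenters are right: the saving is tiny for a million rows and lands at a few gigabytes per month at 1-2 million rows per hour.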
  • I swear with every Code Cop video Nick becomes more and more insane 😂

    @cheezyskipper · 1 month ago
  • Context indeed matters. I used a byte enum yesterday to reduce the packet size of a serial messaging protocol.

    @dcuccia · 1 month ago
  • For anyone wondering, you can actually harm performance doing this, or at least the CLR will likely align fields to 4 byte or 8 byte boundaries (by default) anyway to ensure performance is not harmed. So you might have an int field, then your 1-byte struct, then another int field. The CLR would likely use 32 bits for that 8 bit struct anyway, such that the layout would be 32 bit int, 8 bit struct with 24 wasted bits, then the final 32 bit int. You can control the layout with custom attributes, but by default it will be optimised for the particular CPU architecture. Sure....You can come up with some exception where doing it as an 8-bit struct is better, like loading a million 1-byte structs into a large array. Sometimes it would be better but the default should always be the standard 32 bit struct. As Nick says....Do NOT do this by default.

    @davidtaylor3771 · 1 month ago
    • @IIARROWS Premature optimization is the root of all evil. Is that *really* a hotspot? If your database record is 1KB, then you'd have to have 30 such fields before you'd save 10%. In any case, at runtime, if you have this in a struct, unless you also specify packed structs, you are *still* likely to have padding and you'll need to make sure all your byte values (enum or not) are grouped together in your struct definition (hopefully in multiples of four for best efficiency) because the very next int, pointer, etc. after the byte in the struct is going to force alignment (by using padding). Also, there can be penalties associated with misaligned reads/writes if you use packed structs.

      @shanehebert396 · 1 month ago
    • @IIARROWS Because it's not premature to realize that numeric values work better with numeric types rather than strings. That's just reductio ad absurdum. Yeah, and if we are encountering padding, we've just negated the reason for going with byte for the enum.

      @shanehebert396 · 1 month ago
    • @IIARROWS The only place you should actually care about this is dealing with interop and StructLayoutAttribute. Or any other low-level memory stuff where you need to comply with a predefined byte usage.

      @zoltanzorgo · 1 month ago
    • @IIARROWS If you're at the point of needing to optimize the size of an enum you should probably just be using something other than C#.

      @Sindrijo · 1 month ago
    • @shanehebert396 Not when the "premature optimization" in question is LITERALLY 7 characters for each enum you use this with. At worst it changes nothing, at best memory is four times more efficient. Like I'm honestly confused because there is no reason NOT to use this trick. - If it's not used in the right place, the compiler will tell you (enum value exceeding the byte maximum value) - It doesn't harm performance - If you're lucky, the CLR will optimize memory correctly, making the enum four times more memory-efficient - I'm repeating myself but it's 7 CHARACTERS LONG - It's neither complex nor complicated to do - If you're bent about casting, well guess what, casting integer types small-to-big is better in every single way than big-to-small, so in practice casting byte enums is better than casting int32 enums in (as far as I know) every single way. Please tell me if I'm mistaken because I really don't see why you wouldn't use this trick on all eligible enums.

      @axelseven7 · 1 month ago
  • It is a micro-optimization for most databases. But you can have tables with tens or hundreds of millions of records where some of the columns are enum values. It can make indexes perform better by being smaller. This tip might not be very relevant to many applications but it is not wrong.

    @vchap01 · 1 month ago
    • Totally agree.

      @josephmoreno9733 · 1 month ago
    • Care to explain how a btree with byte values performs better than int values?

      @7th_CAV_Trooper · 1 month ago
    • @7th_CAV_Trooper The number of page reads (from disk) is the magical counter.

      @simon3121 · 1 month ago
    • It's wrong because, while you definitely can use tinyint in the DB, there is no measurable cost in casting that to and from an int in the actual code. And as others and I have mentioned in other comments, using a byte enum might actually cause performance degradation. So keep that optimization in the DB.

      @davidmartensson273 · 1 month ago
    • Most sane databases use data alignment, and the alignment is not a byte but at least an int. The processor doesn't know what a byte is, but it takes an int into a register without any problems or overhead. And in sane databases no enums exist; they are not needed.

      @AEF23C20 · 1 month ago
  • For one reason or another I had to write a parser/serializer for some obscure format (existing one could not handle memory requirements we had - would hog up ALL THE MEMORY on a machine). The sample code required maintaining ordering of a list when inserting new stuff (so some stupid search through the list and inserting at proper index)... which in my case ended up being an absolute performance killer. After discovering that through profiling and replacing list with sorted list (or something like that), performance improved about 100x in relevant cases. In other words, do a PoC, profile it and look for things that eat up most of your performance. This way it will be less labour intensive and your time is more valuable than adding extra disk/ram to that server that is supposed to run the damn thing

    @PajakTheBlind · 1 month ago
  • There's nothing wrong with saving 3 bytes on a column in SQL Server. Odds are you have more problems than this, but that doesn't inherently make it a problem to seek out the savings. As far as how that should translate to the C# representations, I think there's room to debate. Remember that each field isn't stored just once. It's also stored in each index. It also consumes server memory. The less you use to do the same work, the more simultaneous clients you can support. The benefit from small changes like this isn't massive, but they add up for larger datasets. Agree with Nick about prioritizing your performance scrutiny.

    @JGoodwin · 1 month ago
  • We cast enums to bytes at our work because the database is very old (25+ years) and we have some foreign keys to constant types, where the foreign keys were of type tinyint. Now that we have integrated EF Core, instead of making a join on the table that stores the value name, we just use enums of type byte. This just makes the EF-to-SQL relation easier. I don't think it is either good or bad advice; it is just a very situational thing, which you normally wouldn't have to worry about.

    @Ziirf · 1 month ago
  • To be honest I do it too. When I create an enum to be saved in a database I give it the byte type, because it's just a keyword and a conversion in the DbContext and nothing more, so it's easy to implement. Yes, most probably the projects I worked on were missing bigger performance optimizations, but that does not matter; if I am aware of something while implementing, I will do it. It does not take a lot of effort.

    @fatihgenc7385 · 1 month ago
    • What is it you think you're optimizing? Database size?

      @7th_CAV_Trooper · 1 month ago
    • @7th_CAV_Trooper Yes, I work on a project where we get almost a hundred million messages daily and we have large databases for each env. We clear the data periodically but they are still big.

      @fatihgenc7385 · 1 month ago
  • I used to work on a transactional database that used bigints for primary keys and enums, bigints everywhere! I can tell you that optimizing storage can lead to significant performance gains as well as lowering storage for backups and data warehouses. As for enums, I always use default integers, but if we do have millions of millions of records, I may consider using other types depending on the range of values, or reconsider using a different design approach (using discriminators or not, i.e. table per type, etc.). Storage costs are very low nowadays anyway.

    @FrancisGauthier2 · 1 month ago
  • This enum byte thing has existed for like 15 years. The fact that people react to this like it's something new makes me wonder about the quality of the software these devs create every day!

    @Chris-zb5nm · 1 month ago
  • Just to be clear, 100% with you :). There are times of course where you should think of the types in the db. A concrete example from my previous job where we went from storing guids in strings (BigQuery didn’t have a native format for guids at the time) to its byte representation saved us a lot of storage space, and since we ingested about 10TB of json every day that optimization actually saves us quite a lot of money.

    @TomasJansson · 1 month ago
  • When I was told about this video my instinctual answer was "with such cheap storage available, why waste your time" and my second reaction was "database normalisation is better than worrying about enum types". Maybe that reaction is 'cos I'm a database programmer at heart. Hear, hear on identifying what is a failure in understanding, not a tip.

    @SnowImp · 1 month ago
  • I've done this and I will probably do it again, but I'd never post it as general advice, since it's much more likely to cause a headache while programming than it is to affect performance. For storing enums in a database, I'd use byte (tinyint) at least 95/100 times though, since it'll be a primary key and I have some autistic traits.

    @Reellron · 1 month ago
  • I mean... the only place we'd potentially think of doing this is in our online game, which is tick-based in a lockstep model, because we try to keep data usage as low as possible. And probably we wouldn't even go that far, because there are more gains to be made elsewhere, but it also doesn't do any harm... I guess? So we'd probably do that at some point, but we haven't done many optimizations yet to begin with and are currently at about 6 MB per hour per player, which is pretty good already! But if this is the kind of optimization you need, sure. All the bits help... but in 99.9% of all use cases this is not necessary. But I would also say it's not a bad thing to do, if you're sure it's the right data structure. But you shouldn't do it for the performance reason, but because it's the correct data type.

    @Albileon · 1 month ago
    • Yea, I can see maybe doing it for something like that or maybe where you are trying to fit a lot of things in a single network packet. (Or even cram something into specialized headers) even then though I would probably just transform it when creating the packet and leave the enum be in code. Which is the same thing that should be done for sql as well.

      @NickSteffen · 1 month ago
  • I think most people struggle to understand that optimization or "savings" have a factor/scale to them, and you can look at how effective/valuable a certain optimization or trick is by examining how it's going to _scale out_ in the context of some real or hypothetical project. A "one-off" optimization or memory saving isn't helping you much, at all, unless it's a VERY big, single thing we are talking about, like maybe optimizing install/storage size of a game or app by eliminating the need for some huge chunk of data/content. Changing the underlying value type of an enum to `byte` isn't doing much for you in and of itself. It *can* in some very specific situations, like I sometimes do this in real-time 3D/game code when I'm dealing with really big data buffers or I have to have some very specific structure layout or byte alignment. But just universally making all your enums into bytes is more likely to slightly degrade performance than enhance it, as you can be causing some misaligned byte boundaries or reducing the compiler's ability to help you, not to mention you can create some future technical debt for yourself in some situations ...

    @GameDevNerd · 1 month ago
  • I totally agree with you about how the post is written (being kinda misleading). But what is the downside of just making most of the enums you're using a byte (only if you 100% know that the enum has no more than 256 values, obviously)? I don't see any disadvantage to that, or am I wrong here? It's more like it doesn't matter, yeah, but then it's also not bad to do it, right?

    @Dustyy01 · 1 month ago
    • You might accidentally introduce padding in the memory alignment of your fields. If you have a class holding some enum fields, and you make some byte, some int, etc. it's probably going to end up aligning those byte fields anyway and wasting your 4 bytes of memory. Not to mention that most of the time you are still reserving all these 4 bytes in the register, so it's faster in some scenarios to give your register 4 bytes to begin with. On the other hand, if you want your structs to be highly compact in memory and absolutely optimized for 4-byte alignments, say you have a combination of two enum fields, and your only concern is comparing those combinations by interpreting the entire struct as a single 4-/8-byte value, then maybe. The consensus here is, it barely matters. If it does matter, prove it with benchmarks. If you prove it, make sure to make reasonable changes according to your domain constraints and always measure how this impacts your performance. Without measuring, nothing is 100% certain.

      @AlFasGD · 1 month ago
    • @AlFasGD Appreciate the technical explanation a lot 👍

      @Dustyy01 · 1 month ago
    • @AlFasGD So you're saying it can be better, but not worse?

      @ajdinhusic2574 · 1 month ago
    • @ajdinhusic2574 Nope, I'm saying it can be either better or worse or the same. Introducing padding = more allocated memory per instance. Also, padding could hinder cache locality. Again, we're talking in the scope of nanoseconds and a few bytes, which you don't care about most of the time.

      @AlFasGD · 1 month ago
    • @AlFasGD Thanks for the clarification! I didn't get the worse part from your answer initially, because my thought process was, well, if it pads to be 32 bits again, then it's the same as the Int32 size. So it can be 'better' but not worse. But thanks for mentioning it can in fact be slower than int / use more memory, I did not know that.

      @ajdinhusic2574 · 1 month ago
  • Column size in a database matters if the column is part of an index. Whether the key is 8 or 5 bytes makes a big difference, because more keys fit in one 4 KB DB page.

    @jakubbalarin6867 · 1 month ago
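The index argument above comes down to how many keys fit on one page. Here is a minimal Python sketch of that ratio, using the commenter's 4 KB page and the 8-byte versus 5-byte key sizes. It deliberately ignores page headers, row pointers, and fill factor, which a real engine would subtract, so treat the numbers as an upper bound.

```python
# Keys per index page for two key widths.
# Simplifying assumption: the whole page holds keys (no header,
# no per-row overhead, 100% fill factor).
PAGE_SIZE = 4096  # bytes, the 4 KB page from the comment


def keys_per_page(key_size: int, page_size: int = PAGE_SIZE) -> int:
    """Whole keys of `key_size` bytes that fit in one page."""
    return page_size // key_size


wide = keys_per_page(8)    # e.g. int + int composite key
narrow = keys_per_page(5)  # e.g. int + 1-byte enum

print(wide, narrow)                        # 512 819
print(f"{narrow / wide:.2f}x keys/page")   # 1.60x
```

More keys per page means fewer pages to read for the same range scan, which is the "number of page reads" point made elsewhere in the comments.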
  • Even though it doesn't offer us a significant performance improvement, I didn't really get why not to do that. Would that actually represent a performance decrease or something similar?

    @dev.gustavo · 1 month ago
    • On your 64-bit platform, your CPU won't copy a byte any faster than it copies an integer. The only reason to do it is space optimization, which is only a real thing to look at when you're dealing with big applications that have a huge database. Code-wise it's just annoying for people who have to use your enum, because 99% of the time it will be converted to an integer anyway, since you don't need it to be a byte for development purposes, which is even worse than having an integer in the first place.

      @burger_flipper · 1 month ago
    • Indeed performance can be affected negatively because of alignment issues. Typically objects/structs aren't compacted so that fields align with what the processor handles more efficiently, so in memory probably an enum will occupy 8 bytes, even if you change its base type to byte or short. If you force the compiler to pack the fields then performance can degrade substantially because fields may be split into two memory reads and writes. Even though the benefit in the database storage requirements comes it will penalize performance there too because of misalignment.

      @monomanbr · 1 month ago
    • @monomanbr Could you please provide me some concrete references where I can learn about that? I really can't comprehend why such a simple thing could ever represent the opposite of what it should.

      @dev.gustavo · 1 month ago
    • Yup, changing enums to shorts makes no sense, but I do not understand your anger; at least this tip does not have a negative impact.

      @szynkie6710 · 1 month ago
  • I've only used byte-extending enums once; to reduce the size of a struct, which might be created 100,000-200,000 times per second, every second. With byte-extending enums, the struct is exactly 16 bytes in size, which also aligns with its "natural" packing size. And even then, I doubt using int-extending enums would actually result in any actual performance degradation of note.

    @MZZenyl · 1 month ago
  • We have a few Enums in our game code that are set to bytes and shorts but that's specifically because we do have hundreds-of-thousands to millions of them in contiguous arrays for the game's data so we do gain a fair bit of performance doing this. Especially from a cache point of view.

    @Xankill3r · 1 month ago
    • Do you combine eight booleans into one?

      @Sindrijo · 1 month ago
    • @Sindrijo We don't really have a lot of places where we would need to do that. Except maybe one instance where we pack groups of six bools into single bytes, wasting 2 bits, yes, but it works out in that situation.

      @Xankill3r · 1 month ago
    • Performance? Probably not. You can pack more information into your bytes, but packing and unpacking is usually less performant than not doing it. We were doing it in multiplayer games to reduce the needed bandwidth. But it was also a rather unusual practice. You can find the same principle a lot in the C++ code of Unreal Engine.

      @Revin2402 · 16 days ago
    • @Revin2402 We have benchmarked it (as one should) and it actually does improve performance in our case. I'm guessing it is due to reduced cache misses.

      @Xankill3r · 16 days ago
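The contiguous-array case in this thread is the one place where element size translates directly into memory and cache footprint, because arrays of 1-byte elements have no padding between elements. A small Python sketch using `ctypes` array types to show the sizes; the 64-byte cache line is an assumption (common, but not universal), and this measures raw storage only, not speed.

```python
import ctypes

N = 1_000_000
CACHE_LINE = 64  # bytes; an assumption, typical for current desktop CPUs

# Contiguous arrays of N elements, like a big buffer of enum values.
ByteArray = ctypes.c_uint8 * N
IntArray = ctypes.c_int32 * N

byte_total = ctypes.sizeof(ByteArray)
int_total = ctypes.sizeof(IntArray)

# How many elements one cache line brings in at once.
per_line_byte = CACHE_LINE // ctypes.sizeof(ctypes.c_uint8)
per_line_int = CACHE_LINE // ctypes.sizeof(ctypes.c_int32)

print(byte_total, int_total)        # 1000000 4000000
print(per_line_byte, per_line_int)  # 64 16
```

Whether the smaller footprint actually makes a loop faster still has to be benchmarked, as the commenter says they did.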
  • I've only ever used the underlying type change for P/Invoke specifically to avoid having to cast. I can't think of any other reason

    1 month ago
    • Another reason is memory optimization, but I believe most C# devs never face such a necessity. When you really need it, you know it.

      @user-tk2jy8xr8b · 1 month ago
    • Unless you're compiling byte-aligned, which is a performance killer, it does not save memory.

      @7th_CAV_Trooper · 1 month ago
    • And comms packets, with FieldOffset and MarshalAs attributes, or masks, anything that gets close to hardware, and unmanaged code.

      @nickbarton3191 · 1 month ago
    • @nickbarton3191 Yeah, close to hardware is the only real reason I can think of.

      1 month ago
    • @7th_CAV_Trooper It does. Consider struct S1 { short; long; int; long; byte; long; byte; } vs struct S2 { long; long; long; int; short; byte; byte; }. sizeof(S1) is 7*8=56 bytes, sizeof(S2) is 4*8=32 bytes. Same with classes. All "big" fields are aligned so memory access takes 1 read cycle, all "small" fields fit into 8 bytes so access takes 1 read cycle again. With no alignment, S1 would certainly also take somewhere around 32 bytes, but the unlucky "big" fields would require 2 read cycles. Byte fields are byte-aligned and it's fine.

      @user-tk2jy8xr8b · 1 month ago
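The S1 versus S2 sizes quoted above can be reproduced with Python's `ctypes`, which lays structures out the way the platform's C compiler would (sequential fields, natural alignment). This is only a sketch of the C-style rule: the class and field names are made up for illustration, the 56/32 results assume a typical 64-bit platform where 8-byte integers are 8-byte aligned, and the CLR is free to lay out managed types differently.

```python
import ctypes


class S1(ctypes.Structure):
    # Small and large fields interleaved: every small field drags padding.
    _fields_ = [
        ("a", ctypes.c_int16),  # short
        ("b", ctypes.c_int64),  # long
        ("c", ctypes.c_int32),  # int
        ("d", ctypes.c_int64),  # long
        ("e", ctypes.c_uint8),  # byte
        ("f", ctypes.c_int64),  # long
        ("g", ctypes.c_uint8),  # byte
    ]


class S2(ctypes.Structure):
    # Same field types, ordered largest to smallest.
    _fields_ = [
        ("b", ctypes.c_int64),
        ("d", ctypes.c_int64),
        ("f", ctypes.c_int64),
        ("c", ctypes.c_int32),
        ("a", ctypes.c_int16),
        ("e", ctypes.c_uint8),
        ("g", ctypes.c_uint8),
    ]


# Size with no padding at all: just the sum of the field sizes.
payload = sum(ctypes.sizeof(field_type) for _, field_type in S1._fields_)

print(ctypes.sizeof(S1))  # 56 on typical 64-bit platforms
print(ctypes.sizeof(S2))  # 32 on typical 64-bit platforms
print(payload)            # 32
```

So the byte-sized fields only pay off when they sit next to each other; interleaved with 8-byte fields they cost as much as a full slot.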
  • I think you are right. If I had to come up with an example of where enum as byte maybe could have value as an optimization, a game like Minecraft comes to mind. Imagine that almost all properties of each block could be expressed as byte-sized enums, then I guess it could be viable - for instance if you find that the game needs to swap out memory so often that it becomes a problem, if you could "magically" reduce the memory footprint by, say, half that might solve the issue. But again, this would be a very special situation, and not something for everyone (or anyone) to do by default.

    @SteffenSkov · 1 month ago
  • Something I haven't seen mentioned is message structs. For one of my work projects we had a messaging system that was marshalling message structs and sending them over legacy hardware with very low bandwidth. In some structs, the fields were marshalled as int32 so when message frequency was high it would start slowing down. These fields were used in combination to represent some state. We found a way to optimize and drastically reduce the size by representing each state as a single byte under one enum and use bitwise operations to combine the states together using bit masking.

    @user-iy3fx5sq7q · 1 month ago
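The commenter does not show their message format, so the following is only a guess at the general idea: several independent on/off states combined with bitwise OR into one flags value that travels as a single byte, compared with sending each state as its own int32. The `DeviceState` enum, its members, and the `encode`/`decode` helpers are hypothetical names for illustration.

```python
import struct
from enum import IntFlag


class DeviceState(IntFlag):
    """Hypothetical states; each one occupies its own bit."""
    POWERED = 0x01
    CONNECTED = 0x02
    FAULT = 0x04
    BUSY = 0x08


def encode(state: DeviceState) -> bytes:
    """Serialize the combined flags as one unsigned byte."""
    return struct.pack("B", int(state))


def decode(payload: bytes) -> DeviceState:
    """Rebuild the flags value from the single byte."""
    (raw,) = struct.unpack("B", payload)
    return DeviceState(raw)


state = DeviceState.POWERED | DeviceState.BUSY
wire = encode(state)

# The same four states sent as separate little-endian int32 fields.
separate = struct.pack("<4i", 1, 0, 0, 1)

print(len(wire), len(separate))             # 1 16
print(DeviceState.BUSY in decode(wire))     # True
print(DeviceState.FAULT in decode(wire))    # False
```

In C# the equivalent would be a `[Flags]` enum with a `byte` underlying type, which is one of the cases where the smaller type is chosen because the wire format demands it.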
  • In general, it's good advice to follow the conventions of whatever programming language or environment you're using, even if on the surface the convention may seem counter-intuitive. The convention of always using the `int` type in C# is a prime example. It may seem intuitive to use the smallest integer type that can hold the range of values needed, but the wisdom of the crowd knows something that isn't obvious, namely that modern processors are optimized to process data values on 32-bit and 64-bit boundaries. So you may think you're being smart and efficient, but in reality there's no benefit. Sometimes there are valid exceptions to the common conventions in specific circumstances. A skilled developer knows when these exceptions are needed. A skilled developer also knows how to performance test code and search for specific and measurable optimizations when performance improvements are clearly needed.

    @IAmAI101 · 1 month ago
  • We have a database (not SQL) where we store terabytes of data. Even then, optimization into byte enums is not the best way to optimize the storage. We do have some byte enums, but those are special cases that are transmitted in binary over the network in high-traffic paths. In a couple of edge cases I've even had to merge multiple enums into a single byte for serialization to save bandwidth and egress costs.

    @peryvindhavelsrud590 · 1 month ago
  • Optimize is a very broad topic. Using byte as the underlying type will, of course, save space, but is it worth it? A 32-bit (or 64-bit) processor is much more speed efficient to access 32 (or 64) bits of data at a time. The data, however, needs to be 4 (or 8) byte aligned. For a modern processor to read a byte, it needs to read 32/64 bits of data and strip the unnecessary bits. If the data is not aligned, the process is even slower because shifting happens. We don't see it in a high level language, but that happens at assembly level/machine code. The only time I use byte as the underlying type for enums is when I have data interchange happening, when a byte is part of a data structure sent by an embedded device. Something like that.

    @NurchOK · 1 month ago
  • Okay, the Key Message is, don't overoptimize. Got it, thank you Code Cop 🙂

    @Selbstzensur · 1 month ago
    • That and there are likely other places in the application one can make a difference in before this level of nitpicking ought be considered.

      @Dojan5 · 1 month ago
    • @Dojan5 Okay, the point you mentioned, I overheard. Is "overheard" an English word? I mean, I did not get it, but thanks.

      @Selbstzensur · 1 month ago
  • I used to port mobile games in 2005 when we had 2 MB of RAM. We had only one class to avoid wasting class definition bytes; everything was so optimized. Glad we have now reached petaflops with the H100 GPU!

    @laurentallenguerard · 1 month ago
  • The general advice is something along the lines of "Prefer what is natural, not what is the smallest." With enums, you have to remember that they are just named integers: by default there are about 4 billion other valid values in addition to the 10 you enumerate, but why bother getting rid of them? If there is a *natural* reason to use a different underlying type, sure, but in that case all types are equivalent; it does not matter that one is the smallest. Sure, if you need to match a particular binary format, in a file, communications protocol or similar, then yes, a byte could indeed be natural, but not because it is the smallest, but because that is what the value actually is! Another take: why stop at byte? Make it tightly bit-packed! And if there is not a power of two number of values, use fractional bits (yes, that is "possible" too)!

    @IllidanS4 · 1 month ago
  • It's useful for indexes, where size indeed matters a lot. I would not disregard this advice in all circumstances. Now, doing it by default, probably not necessary.

    @MaQy · 1 month ago
  • I also use typed enums, but these are the scenarios where I use them: sending/receiving it as raw bytes (which also requires care around endianness); for interop, where I match the native size (if I can be bothered to even deal with the issues caused by using the wrong type); changing it to long/ulong for a bitfield; and very occasionally I change it to unsigned if I'm expecting to do calculations on it.

    @HamishArb · 1 month ago
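The raw-bytes scenario and the endianness caveat can be illustrated with Python's `struct` module. This is a sketch, not anyone's actual protocol: the `Command` enum and its values are hypothetical. It shows that a 1-byte field has no byte order to get wrong, while a 4-byte field must be written and read with the same agreed order.

```python
import struct
from enum import IntEnum


class Command(IntEnum):
    """Hypothetical protocol command codes."""
    PING = 1
    READ = 2
    WRITE = 3


# One byte on the wire: byte order is irrelevant.
as_byte = struct.pack("<B", Command.WRITE)

# Four bytes on the wire: little-endian and big-endian differ.
as_le = struct.pack("<I", Command.WRITE)
as_be = struct.pack(">I", Command.WRITE)

print(as_byte)  # b'\x03'
print(as_le)    # b'\x03\x00\x00\x00'
print(as_be)    # b'\x00\x00\x00\x03'

# Decoding with the wrong byte order yields a value that is not a
# valid Command at all.
wrong = struct.unpack("<I", as_be)[0]
print(wrong)    # 50331648

decoded = Command(struct.unpack("<B", as_byte)[0])
```

When the protocol defines the field as a single byte, declaring the enum with a byte underlying type simply mirrors what the value is on the wire.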
  • Correct me if I'm wrong, but isn't the memory "aligned" or something like that, though? Like, using a 1 byte structure wouldn't be beneficial for memory because three other bytes would be "skipped" since objects can only be "aligned" every 4 bytes... I'm sure someone knows the correct terms for these concepts so, apologies for the ignorance... I just remember seeing something like this while working on some lower level stuff

    @aul7643 · 1 month ago
    • Not an expert either, but I don't think that happens in SQL, which is the point of the advice. But for starters, this is only relevant when using EFCore, and even then, you could just tell EF to use a byte column instead of forcing the enum to be a byte.

      @SacoSilva · 1 month ago
    • Exactly! I'm glad someone remembers that. Using byte enums without using 3 padded bytes is stupid. If you have 4 enums in same structure then yes, this byte conversion will save you 12 bytes (enums will take 4 bytes instead of 16).

      @mad_t · 1 month ago
    • @mad_t Even more, you'd have to have 4 enums *and* have them sequentially defined in the structure *and* the first would have to be on a %4==0 boundary.

      @yoshimaker9636 · 1 month ago
    • No, this is wrong. Alignment is not applicable to single bytes. Types are aligned on multiples of their size, and since bytes are 1 byte, any address satisfies alignment.

      @MulleDK19 · 1 month ago
    • @MulleDK19 No, this is not wrong. If you have, for example, a struct with an int field and an enum field, you will spend 8 bytes per struct instance, regardless of the type of the enum.

      @SacoSilva · 1 month ago
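Both sides of this thread are describing the same rule from different angles: a byte field itself can sit at any address, but the struct is padded so that its larger fields stay aligned. A short Python `ctypes` sketch of the cases mentioned (class and field names are illustrative; `ctypes` follows C layout rules, which is an approximation of what the CLR does for sequential structs, not a statement about every managed type).

```python
import ctypes


def struct_of(name, *field_types):
    """Build a ctypes Structure with fields f0, f1, ... of the given types."""
    fields = [(f"f{i}", t) for i, t in enumerate(field_types)]
    return type(name, (ctypes.Structure,), {"_fields_": fields})


I32, U8 = ctypes.c_int32, ctypes.c_uint8

IntByteInt = struct_of("IntByteInt", I32, U8, I32)    # byte between two ints
IntPlusByte = struct_of("IntPlusByte", I32, U8)       # int + byte-sized enum
IntPlusInt = struct_of("IntPlusInt", I32, I32)        # int + int-sized enum
FourBytes = struct_of("FourBytes", U8, U8, U8, U8)    # four byte-sized enums
FourInts = struct_of("FourInts", I32, I32, I32, I32)  # four int-sized enums

print(ctypes.sizeof(IntByteInt))   # 12: 3 padding bytes after the byte
print(ctypes.sizeof(IntPlusByte))  # 8: same as two ints
print(ctypes.sizeof(IntPlusInt))   # 8
print(ctypes.sizeof(FourBytes))    # 4: adjacent bytes need no padding
print(ctypes.sizeof(FourInts))     # 16
```

The 12-byte saving only shows up in the last pair, where the four byte-sized fields sit next to each other.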
  • Ok, I do use byte in my enums, but not from an optimization perspective, but because I'm lazy. I don't even remember when or why I started using byte, probably from a requirement from an old tech lead in a project a decade or more old. It doesn't even affect the performance of the applications or the databases, but now I have muscle memory when creating enums. Sorry.

    @DiomedesDominguez · 1 month ago
  • It’s actually slower in the runtime to do that because the processor has to take the value and align it to the bitness of the processor so it is slower and gets expanded on the processor to the full bitness which uses the same memory anyhow.

    @jameshancock · 1 month ago
  • I do this in very rare cases for gamedev. I'm also sure that you're right, you could find greater savings than that in my code... You didn't even have to get into how absolutely zero bits are saved unless the byte enums are used along with other small types inside another type. Otherwise they will be word-aligned anyway and the 'saved' bits will be unused.

    @dire_prism · 1 month ago
  • I've only used a non standard backing type for an enum a few times, but every single time it's been UInt64, not (S)Byte or (U)Short. Why? Because I had more than 32 values and it was being used as Flags, so I had more than 2^31 theoretical actual values. You might be thinking "But in what world do you ever need that many flags?!" Letters and numbers. I needed to store all of the letters and numbers that were present in a string as a precalculated value that could be checked for maximum similarity between strings in O(1) time using some bit ops. Made the program about 7 or 8 orders of magnitude faster, and all it cost was O(n) additional space complexity.

    @OhhCrapGuy · 1 month ago
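The commenter does not show their code, so this is only one plausible reading of the idea: give each letter and digit its own bit (36 bits, which is why a 32-bit flags enum is too small), precompute one mask per string, then compare two strings with a single AND plus a bit count. All names below are hypothetical, and Python's unbounded integers stand in for the `ulong`-backed flags enum.

```python
import string

# 26 letters + 10 digits = 36 symbols, so 36 bits are needed.
ALPHABET = string.ascii_lowercase + string.digits
BIT = {ch: 1 << i for i, ch in enumerate(ALPHABET)}


def signature(text: str) -> int:
    """O(n) precomputation: set one bit per distinct letter/digit present."""
    mask = 0
    for ch in text.lower():
        mask |= BIT.get(ch, 0)  # characters outside the alphabet are ignored
    return mask


def shared_symbols(sig_a: int, sig_b: int) -> int:
    """Constant-time check: how many distinct symbols both strings contain.

    This is an upper bound on similarity, usable to skip hopeless pairs.
    """
    return bin(sig_a & sig_b).count("1")


sig_enum = signature("enum42")
sig_menu = signature("menu24")
sig_byte = signature("byte")

print(len(ALPHABET))                       # 36
print(shared_symbols(sig_enum, sig_menu))  # 6
print(shared_symbols(sig_enum, sig_byte))  # 1
```

Every mask fits in 64 bits, which matches the commenter's choice of UInt64 as the backing type.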
  • Hello Nick. Great content. Just a minor stupid thing: I have done some courses on SQL Server query tuning, and as a principle it is actually advisable to use the most compact possible representation for your data. The reason is not that it will take less disk space. But usually, when you do have to do a query, the engine will load pages of data, and simply put, the bigger the size of a record, the fewer records you will be able to retrieve in a single page, meaning more IO. And you generally try to reduce that logical IO as much as possible. Now, having said that, you're completely right. We're talking about 3 bytes for each enum. Unless your table is basically full of those enum flags, we're talking about peanuts. I bet there will be better optimizations.

    @tioluiso1 · 1 month ago
    • Yeah; at 1 million rows, the 3-byte difference is going to save a "whopping" 2.86 MB. The principle of relational databases was indeed always about using the tiniest possible footprint. Like in 1970, when 100 MB would go for $26,000. When relational databases were invented we were concerned about storage. That goes for both disk and memory. Therefore using the smallest types and removing duplication was KING. We are not there anymore. Today, most SQL Server users out there just create a database with the default settings. A VERY, VERY small subset is actually able to benefit from changing the page sizes and picking data types so that data lines up, and reads from storage line up with your system's data bus size and your L1, L2 and L3 caches, for a perfect, zero-waste read. With all the other factors of not knowing what hardware and storage your files end up on in the cloud... the byte level of some numbers is the worst place to look. In light of the advice, you can imagine someone might go and create TINYINT status columns right beside an NVARCHAR(500) status message column, or some other NVARCHAR columns elsewhere that are "oversized" just in case? Not many know this either, but the length is not the number of characters but the number of byte pairs. An NVARCHAR(200) is a 400-byte allocation.

      @paulkoopmans4620@paulkoopmans4620Ай бұрын
    • @@paulkoopmans4620 quite noob-common with EF is to use string without specification resulting in nvarchar(max) columns. THAT is stupid.

      @daniellundqvist5012@daniellundqvist5012Ай бұрын
    • @daniellundqvist5012 Yeah.. That too. EF should come with its own set of code analysis to stop people from making those.

      @paulkoopmans4620@paulkoopmans4620Ай бұрын
    • @paulkoopmans4620 Absolutely. I have worked with DBAs on this kind of optimization, and I have seen many talks from really good people whose focus is just that, optimizing the DB. And while the principle generally stands, you always have to ask yourself how big the effort is for such meager gains. There are usually bigger issues to tackle. That being said, I have also seen some tables with 50 flags like these.

      @tioluiso1@tioluiso1Ай бұрын
  • Did something like this except i used ulong for the enum. And the enum was [Flags] decorated. And we were actually running out of flag values.

    @StevieFQ@StevieFQАй бұрын
  • The convention for naming variables is to use multiple characters, which may not be the most efficient option. Most classes only contain a few variables. Using names longer than a single character wastes space, as each character uses 16 bits. Single-character variable names allow for more efficient storage.

    @andywest5773@andywest5773Ай бұрын
  • I Convert All my Enum To Byte Last Week !!!!😀

    @mahdiyar6725@mahdiyar6725Ай бұрын
  • Nick - I think you need to wear a blood pressure monitor and show us the before/during reading when you do the Code Cop videos 😂 I remember - a long time ago now - the only places where you could grow your understanding were the language specification, the quarterly MSDN, the library vendor, your more experienced peers (if you had them) or a book or course from a highly regarded SME (if such a thing existed). If all that failed - do the time - experiment and figure it out - the ‘science’ bit of computer science. The internet is sometimes a great place for fact - and equally a dumping ground for positively reinforced nonsense. It’s really unfortunate that cut/copy/paste coding became prevalent - but remember - our industry enabled that! With that often comes the lack of desire for many to want to understand the how and why. If something you find solves your problem PLEASE take the time to appreciate why or if it even really does. What exciting times we live in… maybe it's time to go re-watch Mike Judge’s 2006 film Idiocracy… LLM vendors... you are 100% not blanket using these sites for your training models - right?

    @IanGratton@IanGrattonАй бұрын
  • "Just watched Nick's latest Code Cop episode, and as always, it's a goldmine of practical advice! 🌟 His take on the 'Enums as Bytes' craze really put things into perspective. It's fascinating to see how a seemingly minor optimization, like treating enums as bytes, can be dissected to reveal deeper implications on code maintainability and performance. Nick does a great job explaining why context is key and how what works in one scenario (like optimizing for database storage) might not be a silver bullet for every application. It's these nuanced discussions that make software engineering so intriguing! Thanks for another thought-provoking video, Nick. Keep demystifying those LinkedIn tips! 💻🔍 #CodeCop" _This is definitely not ChatGPT speaking - where did you get that from!?_

    @ABC_Guest@ABC_GuestАй бұрын
  • Micro optimization ?

    @allfre2@allfre2Ай бұрын
  • 03:32 "Why does it bother me so much" You've no idea how much I relate to this everyday :D

    @darkherumor@darkherumorАй бұрын
  • Okay, I get it and "almost" agree with you that this advice is, for the most part, total garbage... but what if I say there is a legit reason to change the type of an enum? Yes, there is: if I want to lay out data in unions that contain different data based on an "enum" type, but want the union to have a fixed length, say 16 or 32 KB. In that situation, the size of the enum matters a lot due to data alignment and cache coherence. If I just use the default enum (32-bit integer), then I am wasting 24 bits, because I most likely don't have enums with more than 10 values. In that case using a byte is totally legit, and then using another byte and a short after it to lay out the data efficiently can make a difference in performance and even in stability for multi-threading use cases, due to false sharing or values straddling cache lines. Also, there are Windows API functions that use shorts or bytes for enum values, so in that case you had better use an explicitly sized enum as well, so that your definition matches.

    @F1nalspace@F1nalspaceАй бұрын
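A rough sketch of the tagged-union layout described in the comment above, assuming a 16-byte record for illustration. `ShapeKind`, `Shape` and all field names are hypothetical; only `StructLayout`, `LayoutKind.Explicit` and `FieldOffset` are real .NET attributes.

```csharp
using System.Runtime.InteropServices;

// Hypothetical one-byte tag that says which overlapping payload is valid.
public enum ShapeKind : byte { Circle, Rect }

// Explicit layout: every field gets a fixed byte offset and the struct
// is declared to be exactly 16 bytes.
[StructLayout(LayoutKind.Explicit, Size = 16)]
public struct Shape
{
    [FieldOffset(0)] public ShapeKind Kind;   // 1 byte instead of 4
    [FieldOffset(1)] public byte Layer;       // uses a byte the tag freed up
    [FieldOffset(2)] public short Flags;      // fills the rest of the first word

    [FieldOffset(4)] public float Radius;     // meaningful when Kind == Circle
    [FieldOffset(4)] public float Width;      // meaningful when Kind == Rect; overlaps Radius
    [FieldOffset(8)] public float Height;     // meaningful when Kind == Rect
}
```

Because `Radius` and `Width` share offset 4, writing one and reading the other returns the same bits, which is the union behaviour; the byte-sized tag only pays off here because `Layer` and `Flags` are packed into the space an `int` tag would otherwise occupy.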
  • I just set them to match the database type. If it's a 'tinyint', even when it makes 0 sense, then the one in code is a byte. if it's a 'bigint' for a classic non-flag enumeration, then it's a long, even though there is no way I'd ever get anywhere close to exceeding 2 billion. Unless you're dealing with many thousands of usages and objects, this level of optimisation will save you less than the runtime uses to even represent your enum in the application (with interned strings and everything), and that's assuming you're paying attention to alignment (typically 4 or 8 bytes) or abusing [StructLayout] enough to actually even benefit from such optimisations.

    @billy65bob@billy65bobАй бұрын
  • The funny thing is that enums carry extra things that probably have a runtime cost, and even if you used byte, that cost wouldn't go away at all.

    @Maskrade@MaskradeАй бұрын
  • I do use this, but only when the value maps to a DB column that is a tinyint. It becomes a sanity check that helps avoid declaring a value that can't be stored in the database column. (I also think I used it once for binary serialization, but that is very uncommon these days since most things serialize to JSON or XML.) If the value only ever exists in application memory, I am with you, just let it be a regular int.

    @tunawithmayo@tunawithmayoАй бұрын
  • I do a fair amount of typing enums as bytes - I'd go so far as to call it "fairly common" in my code. HOWEVER, it's typically used when doing things like defining device registers when I need to serialize or deserialize it from/to a bus where the size matters due to data alignment, not for any sort of savings. So defining an enum as a byte makes good sense in specialized cases, but if you're writing those cases, you already know why and wouldn't call it a "tip" but a necessity for those narrow cases. The advice, as given, is just dumb.

    @yoshimaker9636@yoshimaker9636Ай бұрын
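To illustrate the serialization case from the comment above, here is a small sketch that assumes a made-up 4-byte wire format (one command byte, one register byte, a little-endian 16-bit value). `Command`, `PacketCodec` and the format itself are hypothetical; `BinaryPrimitives` is the real helper from `System.Buffers.Binary`.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical command set; the byte underlying type documents that the
// protocol reserves exactly one byte for it.
public enum Command : byte { Read = 0x01, Write = 0x02 }

public static class PacketCodec
{
    // Assumed frame: [command:1][register:1][value:2, little-endian].
    public static byte[] Encode(Command command, byte register, ushort value)
    {
        var buffer = new byte[4];
        buffer[0] = (byte)command;   // no narrowing surprises: the enum is already a byte
        buffer[1] = register;
        BinaryPrimitives.WriteUInt16LittleEndian(buffer.AsSpan(2), value);
        return buffer;
    }

    public static (Command Command, byte Register, ushort Value) Decode(ReadOnlySpan<byte> frame)
    {
        if (frame.Length < 4) throw new ArgumentException("Frame too short.", nameof(frame));
        return ((Command)frame[0], frame[1], BinaryPrimitives.ReadUInt16LittleEndian(frame.Slice(2)));
    }
}
```

The point of the byte-backed enum in a sketch like this is correctness of the wire layout, not memory savings: the compiler rejects any member whose value cannot fit in the single byte the protocol allows.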
  • I'm in a GC'd, vm based language running in a container... every bit counts!

    @adambickford8720@adambickford8720Ай бұрын
  • This advice comes from back in the day when we had 2MB RAM to work with.

    @Cymricus@CymricusАй бұрын
  • I've some enums with byte as the underlying type, though only as a test to see if I'd run into any issues in the future 😅

    @guiorgy@guiorgyАй бұрын
  • I thought the base type of an enum is `ushort`, not `int`? Also, an honest question: does the C# compiler have word alignment like C++ compilers do? If so, the byte optimization does nothing.

    @sunefred@sunefredАй бұрын
    • Not too sure what it does by default, but there are [StructLayout] and [FieldOffset] attributes you can use to control a lot of it. It's mostly useful if you're interacting with code outside the CLR, i.e. marshaling to WinAPI, with very limited use outside of that.

      @billy65bob@billy65bobАй бұрын
  • LOL, I wasn't prepared to hear Nick's "yEaH THaTs A GoOD AdVIcE HEeHeeheEhe". He's normally so eloquently spoken, I did a double take hearing him speak that way. Good video

    @petrusion2827@petrusion2827Ай бұрын
  • This one isn't so terrible compared to the other ones, imo. You're never going to reach anywhere near 256 different enum values for many uses. However, I think you don't save RAM in all cases due to memory alignment stuff that I admittedly know little about.

    @Thorarin@ThorarinАй бұрын
  • Thanks for the advice @Nick Chapsas. I trust you more than the people on LinkedIn; however, what I'm missing from your explanation is: what is the downside of doing it? From what I gather, you mostly say that we don't need to optimize it, because it's most likely not a bottleneck and we have enough memory, correct? However, can it be counterproductive to do? Or is it more like: can't hurt, can only benefit? Anyway, thanks for the video!

    @ajdinhusic2574@ajdinhusic2574Ай бұрын
    • He mentioned it: code evolution. Now you have to monitor and maintain all these 'right sized' parts, which will cost you FAR more in dev time than it will ever save you in infra unless you're talking faang scale.

      @adambickford8720@adambickford8720Ай бұрын
  • I've always made my enums inherit short... Never really thought further about it. Some senior guy told me to do that 10 years ago and I never really thought through it... Yeah Carlos, you were the one telling it 🙂 Anyway, I don't see why it would be bad to do it like that.

    @patxy01@patxy01Ай бұрын
  • in very rare cases, when implementing some pre-defined APIs, I had to use a different underlying enum type, but it was `uint`. I have never ever (consciously) used anything smaller than `int`. Also, while it technically does save you 3 bytes per value, I *think* it can harm the performance, because (most) modern CPUs work better with 4 or 8 byte values than with single bytes.

    @tymurgubayev4840@tymurgubayev4840Ай бұрын
  • I have inherited code where people have done this. Just as you say, there are hundreds of places in the same repository which could be optimized to save more space or lead to better performance.

    @simonw3659@simonw3659Ай бұрын
  • I'm guilty of this myself. I once set up a default conversion from enum to tinyint in EF, thinking it was a small but nice optimization. I had forgotten that one of the enums had negative values, and this micro-optimization cost me about 2 hours of debugging, trying to figure out why some value was 252 (or something like that) out of nowhere 😅 Angry at myself, I reverted it back to integers.

    @adamstawarek7520@adamstawarek7520Ай бұрын
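The surprise described in the comment above can be reproduced without a database. This sketch assumes a hypothetical `Status` enum with a member equal to -4 and narrows it to a byte the way a tinyint conversion effectively would; the names are invented for illustration and nothing here is EF's actual conversion code.

```csharp
using System;

// Hypothetical enum with a negative sentinel value (default underlying type: int).
public enum Status { Unknown = -4, Active = 1 }

public static class Narrowing
{
    // Unchecked narrowing keeps only the low 8 bits, so -4 silently becomes 252.
    public static byte ToTinyInt(Status status) => unchecked((byte)status);

    // Checked narrowing throws OverflowException instead of wrapping around.
    public static byte ToTinyIntChecked(Status status) => checked((byte)status);
}
```

A checked conversion, or simply keeping the column as an int, would have surfaced the problem immediately instead of storing a wrapped value.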
  • Actually, I did extend byte for my enums when I encoded data and codec expected my value to be 1 byte integer.

    @jamesbond6761@jamesbond6761Ай бұрын
  • We also already have the amazing [Flags] attribute to do bitwise AND/OR'ing on enums.

    @buriedstpatrick2294@buriedstpatrick2294Ай бұрын
  • You probably shouldn't optimize to that level. But when your DB type is defined as tinyint (byte) I would better follow the same memory layout in code, to avoid any marshaling problems.

    @nikolayzdravkov1378@nikolayzdravkov1378Ай бұрын
  • We should get CodeCope series from such creators on LinkedIn as a response to Nick's video series

    @winchester2581@winchester2581Ай бұрын
  • It's advice like this that makes me want this series to go the name and shame route.

    @SacoSilva@SacoSilvaАй бұрын
  • The main problem with reactions on LinkedIn is that there is no "dislike" button; you can only "react", so even the minimum reaction to any post counts as positive for the algorithm.

    @emloq@emloqАй бұрын
  • The enum doesn't need to be a byte for the database to use tinyint anyway. Just like one would most likely not let EF use longtext for every single string stored in db, the database column types should be handled in the DbContext configuration/Entity configuration/Entity attributes.

    @user-ed7dr9yf1v@user-ed7dr9yf1vАй бұрын
  • One thing you didn’t mention is that enums when members of a class or struct, are usually 32-bit aligned on a 32 or 64-bit architecture. The data bus is wide, and when reading a byte, a whole word is transferred. Even if just a byte were written or read, it wouldn't happen in any fewer clock cycles than for a word. So in actuality there is neither a space nor a speed benefit to using byte over int. Except maybe for an array of enum, but I’ve never seen a use case for that, and don’t feel compelled to optimize for that.

    @djenning90@djenning90Ай бұрын
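The alignment argument in the comment above can be checked with a few lines. This is a sketch with hypothetical type names; `Unsafe.SizeOf<T>()` is the real API from `System.Runtime.CompilerServices`. On common .NET runtimes one would expect the first two structs to report the same size, because the byte field is padded out to the alignment of the neighbouring int, while the third shows that bytes only pack tightly next to other small fields.

```csharp
using System.Runtime.CompilerServices;

public enum ColorInt { Red, Green, Blue }            // default underlying type: int
public enum ColorByte : byte { Red, Green, Blue }    // explicit byte underlying type

// Hypothetical structs used only to compare sizes.
public struct WithIntEnum  { public int Id; public ColorInt Color; }
public struct WithByteEnum { public int Id; public ColorByte Color; }   // padding follows Color
public struct FourByteEnums { public ColorByte A, B, C, D; }            // no int neighbour, so no padding needed

public static class LayoutDemo
{
    public static int SizeWithIntEnum()   => Unsafe.SizeOf<WithIntEnum>();
    public static int SizeWithByteEnum()  => Unsafe.SizeOf<WithByteEnum>();
    public static int SizeFourByteEnums() => Unsafe.SizeOf<FourByteEnums>();
}
```

Printing the three values on the target runtime is a quick way to find out whether a byte-backed enum saves anything for a given type before changing any code.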
  • Not to mention that handling byte is actually slower than int. The fastest type obviously would be types whose size matches the bus width which is usually 64 bits now.

    @torqtorq@torqtorqАй бұрын
  • I used “long” on a flags enum once, haven’t needed to go the other way before. Never say never, but also never say always!

    @mbrdevuk@mbrdevukАй бұрын
  • I love your Code Cop videos Nick. Even if I didn't learn anything new, it entertains me immensely the way you get excited^^

    @fynnschapdick4434@fynnschapdick4434Ай бұрын
  • I've done this, but in that case my code was interacting with a microcontroller that has only 512KBs of memory, and the C# code serialized a pretty big data structure, that contained a lot of enums and stuff. But it's hard to imagine any other kind of situation except for embedded stuff in 2024.

    @gaborpinter4523@gaborpinter4523Ай бұрын
    • I use it in an IBM PC emulator, but that's like the only other use case.

      @MaximilienNoal@MaximilienNoalАй бұрын
  • I used byte and short enums inside structs that model packets where there are protocols

    @Alguem387@Alguem387Ай бұрын
  • I doubt it's saving any memory at all to use the byte type rather than the default int type for an enum, due to the way the CLR aligns memory, and in some (albeit rare) instances it might even result in a performance hit, as I believe the CLR tends to "digest" ints better than any smaller/larger numeric types. That being said, on the database side I believe it shows a total lack of intent if your table design consists of "just use int everywhere you need a whole number". I'm not even concerned with storage space at that point, more so the intent of the developer/DBA that created it. The times that I have seen that type of "who cares" table structure, it's been designed either by junior-grade developers who aren't familiar with databases or (ahem) off-shore contractors who are just trying to churn out code without putting any thought into their work. It reeks of laziness and usually points to other problems.

    @alexclark6777@alexclark6777Ай бұрын
  • I do it rather the other way: when going to the database, the enum values become strings (not the internal C# name, but an explicit, well-defined one, e.g. via attribute). I consider the numeric value of an enum an implementation detail.

    @rauberhotzenplotz7722@rauberhotzenplotz7722Ай бұрын
  • I think it still missed a key point. unless you are packing your bytes into one register etc. the OS still allocates higher amount based on if it's 64-bit or 32-bit runtime. I'd like to see say 10000000 allocations or something and follow the size difference between byte and say long if you are running 64-bit. My hypothesis is you wouldn't see a difference at all.

    @ifireblade09@ifireblade09Ай бұрын
  • I have some WinAPI calls (native stuff) where I have to marshal some C byte constants, which I converted into C# byte enums. And here the size is important, because otherwise the memory structure no longer matches. But of course, this is a special edge case.

    @wolfgangdiemer2511@wolfgangdiemer2511Ай бұрын
  • I think we should use the time we are paid for to make the most of it. I'd try it in my private proj, but if this is the thing to optimize first in your codebase, send me your job page. It is bad because the optimization is very likely to be negligible compared to all the other bottle necks in apps. Don't focus on the bytes, focus on the product.

    @gimmedatcake4785@gimmedatcake4785Ай бұрын
  • I once heard the phrase: "Save the trees - fight the beavers"

    @ObserveRecordRepeat@ObserveRecordRepeatАй бұрын
  • I personally would only do this on the very rare occasion that I would need the enum value to be a byte, either because the message required it or the db, and that would be only to avoid casting and assure type safety. But I've been developing for 30 years and don't even need one hand to tell you how many times I've run into that situation.

    @travisabrahamson8864@travisabrahamson8864Ай бұрын
  • Such a great video!

    @jalzees@jalzeesАй бұрын
  • "Oh... why does this bother me so much..." - I felt his pain.

    @kornelijepetak@kornelijepetakАй бұрын
  • You are absolutely right!

    @nickburger5035@nickburger5035Ай бұрын
  • 3:30 is the funniest part of today's video.... WHAT the FXCK is the immutable value type, byte is the value!

    @richie12200@richie12200Ай бұрын
  • I will assume you have never worked with IoT devices or, in xamarin, creating a large grid. I have done both and if we start with memory-constrained devices, this could be helpful, but, before you optimize profile and see where you may need to make improvements. If I just need an enum in a couple places probably not worth it, but if I am transmitting data from an edge sensor to a controller it may be useful. In a game, I may use an enum to specify which type of terrain in a cell and I may have millions of these, so saving a few bytes could be helpful, esp when I want to save the map, to reduce the size of the file. You shouldn't optimize prematurely but there are times when something like this may be useful and just shooting down the idea without considering when it might be useful is just bad form, IMO.

    @jblack1396@jblack1396Ай бұрын
  • I love this series😂❤

    @hemantagrawal1122@hemantagrawal1122Ай бұрын
  • Enums have always been an irredeemable curse that would take a major breaking change to fix. The fact that people have to write source generators to improve performance and memory use is already a sign of that.

    @nitrous1001@nitrous1001Ай бұрын
  • 4:55 Now back to misery! - Haha.

    @CRBarchager@CRBarchagerАй бұрын
  • I don't think most people know that values in memory have to be aligned, and registers in cpu have specific sizes. Byte doesn't actually save any memory, and in some cases might actually be slower. Also, I store the string values of my enums in db!! lol. Space is cheap. Your time is not. If you need that level of optimization C# is not your tool.

    @haxi52@haxi52Ай бұрын
  • Agree 100%. The comment at 5:47 reads completely like it was generated by Chat-GPT, it has the same tone, structure and verbosity.

    @banned_from_eating_cookies@banned_from_eating_cookiesАй бұрын
    • "Your use of a byte as the underlying type for an enum is a sophisticated and elegant approach to solving your desire to feel cool like those hard core bit-banging low level programmers you have an inordinate and unexplainable admiration for, even though they are slightly more muskier than you. Please like me, I'm your best AI assistant, friend, soulmate."

      @Sindrijo@SindrijoАй бұрын
  • Indeed, Nick Chapsas! This is absolutely a video I completely agree with. 💯(😋)

    @lrxasharp@lrxasharpАй бұрын
  • On one hand this advice *could be good in some situations*, but in a lot of places fields in a struct or class are going to be aligned on a 4-byte boundary, so you don't end up saving any space. Using a TINYINT or BYTE instead of an INT in your database could be a win over large numbers of rows, but there's often a far bigger gain to be had by limiting your strings from a default VARCHAR(MAX) to something more reasonable (which many people don't bother to do, because EF just makes all strings VARCHAR(MAX) by default). Like many "optimization" opportunities you really need to profile and measure to make sure you are actually saving something, and I imagine a lot of the people sharing this "advice" haven't profiled or measured anything. You cannot optimize by assumption alone

    @wknight8111@wknight8111Ай бұрын
  • Using byte or short for enums may harm performance, because the CPU may need to widen it to int32 and narrow it back to byte every time. Sometimes I have used an enum based on long with the [Flags] attribute, when I needed more than 32 flags. But never byte or short. CPUs work with int32.

    @AlexBroitman@AlexBroitmanАй бұрын
  • The byte might get padded, anyways, because things need to be aligned in memory.

    @nordgaren2358@nordgaren2358Ай бұрын
  • afaik one should always use the native data type for processing (enums) to get the best performance. if one really wants to, one can convert back and forth to get an optimal memory footprint when storing or serializing, but don't do that in memory. it hurts performance.

    @tubaviewa2624@tubaviewa2624Ай бұрын
  • The minimum cache transaction is 64 bytes afaik, so... there should be no difference between all the int types. I can barely imagine where it would be useful... only for some fields in raw structures in some extremely perf-critical code. I mean, to save a few bits of traffic.

    @DmitryBaranovskiyMrBaranovskyi@DmitryBaranovskiyMrBaranovskyiАй бұрын
  • This video has so much frustration and sadness for the LinkedIn community. I could feel the range of emotions and helplessness 😛

    @11keshav11@11keshav11Ай бұрын
  • On 32/64-bit processors an 8-bit value takes a 32-bit register anyway (if I remember correctly). So it will give you no performance boost. If you have a very large number of records, it can save some memory.

    @JustFor-dq5wc@JustFor-dq5wcАй бұрын