• @[email protected]
    41 month ago

    Pl/1 did it right:

    Dcl 1 mybools, 3 bool1 bit(1) unaligned, 3 bool2 bit(1) unaligned, … 3 bool8 bit(1) unaligned;

    All eight bools are in the same byte.

    • @[email protected]
      1601 month ago

      And compiler. And hardware architecture. And optimization flags.

      As usual, it’s some developer that knows little enough to think the walls they see around enclose the entire world.

      • @[email protected]
        41 month ago

        I don’t think so. Apart from dynamically typed languages which need to store the type with the value, it’s always 1 byte, and that doesn’t depend on architecture (excluding ancient or exotic architectures) or optimisation flags.

        Which language/architecture/flags would not store a bool in 1 byte?

        • @[email protected]
          11 month ago

          Apart from dynamically typed languages which need to store the type with the value

          You know that depending on what your code does, the same C that people are talking about upthread doesn’t even need to allocate memory to store a variable, right?

            • @[email protected]
              229 days ago

              I think he’s talking about if a variable only exists in registers. In which case it is the size of a register. But that’s true of everything that gets put in registers. You wouldn’t say uint16_t is word-sized because at some point it gets put into a word-sized register. That’s dumb.

        • @[email protected]
          21 month ago

          things that store it as word size for alignment purposes (most common afaik), things that pack multiple bools into one byte (normally only things like bool sequences/structs), etc

          • @[email protected]
            129 days ago

            things that store it as word size for alignment purposes

            Nope. bools only need to be naturally aligned, so 1 byte.

            If you do

            struct SomeBools {
              bool a;
              bool b;
              bool c;
              bool d;
            };
            

            it’s 4 bytes.
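
            One way to check that claim is to ask the compiler directly. This is only a minimal sketch: it assumes a mainstream ABI where sizeof(bool) is 1 (the C++ standard leaves that implementation-defined), and the helper names some_bools_size and some_bools_alignment are made up for illustration.

            ```cpp
            #include <cassert>
            #include <cstddef>

            struct SomeBools {
              bool a;
              bool b;
              bool c;
              bool d;
            };

            // Holds when no padding is inserted between the four members, which is
            // what mainstream compilers do for members of the same 1-byte type.
            static_assert(sizeof(SomeBools) == 4 * sizeof(bool),
                          "expected no padding between bool members");

            // Hypothetical helpers so the layout can also be inspected at run time.
            std::size_t some_bools_size() { return sizeof(SomeBools); }
            std::size_t some_bools_alignment() { return alignof(SomeBools); }
            ```

            On x86-64, ARM and RISC-V toolchains this should report a size of 4 and an alignment of 1, so nothing forces the struct up to a word.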

            • @[email protected]
              229 days ago

              sure, but if you have a single bool in a stack frame it’s probably going to be more than a byte. on the heap definitely more than a byte

              • @[email protected]
                128 days ago

                but if you have a single bool in a stack frame it’s probably going to be more than a byte.

                Nope. If you can’t read RISC-V assembly, look at these lines:

                        sb      a5,-17(s0)
                ...
                        sb      a5,-18(s0)
                ...
                        sb      a5,-19(s0)
                ...
                

                That is it storing the bools in single bytes. Also, I only used RISC-V because I’m way more familiar with it than x86, but x86 will do the same thing.

                on the heap definitely more than a byte

                Nope, you can happily malloc(1) and store a bool in it, or malloc(4) and store 4 bools in it. A bool is 1 byte. Consider this a TIL moment.

                • @[email protected]
                  127 days ago

                  c++ guarantees that calls to malloc are aligned https://en.cppreference.com/w/cpp/memory/c/malloc .

                  you can call malloc(1) ofc, but calling malloc_usable_size(malloc(1)) is giving me 24, so it at least allocated 24 bytes for my 1, plus any tracking overhead

                  yeah, as I said, in a stack frame. not surprised a compiler packed them into single bytes in the same frame (but I wouldn’t be that surprised the other way either), but the system v abi guarantees at least 4 byte alignment of a stack frame on entering a fn, so if you stored a single bool it’ll get 3+ extra bytes added on the next fn call.

                  computers align things. you normally don’t have to think about it. Consider this a TIL moment.
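
                  Both points can hold at once: the bool object itself is one byte, while the allocator hands out blocks that are aligned and often larger than requested. The sketch below is an illustration under assumptions, not a statement about any particular allocator. The helper names are hypothetical, and it only checks the alignment of a block big enough for std::max_align_t, since the guarantee for tiny requests such as malloc(1) differs between standards and allocators.

                  ```cpp
                  #include <cassert>
                  #include <cstddef>
                  #include <cstdint>
                  #include <cstdlib>

                  // Stores a bool in a heap block of sizeof(bool) bytes and reads it back.
                  bool round_trip_bool_on_heap(bool value) {
                      bool* p = static_cast<bool*>(std::malloc(sizeof(bool)));
                      if (p == nullptr) return value;  // allocation failed; nothing to demonstrate
                      *p = value;
                      bool result = *p;
                      std::free(p);
                      return result;
                  }

                  // Reports whether a malloc'd block of `size` bytes starts on an address
                  // that is a multiple of alignof(std::max_align_t).
                  bool malloc_is_max_aligned(std::size_t size) {
                      void* p = std::malloc(size);
                      if (p == nullptr) return false;
                      bool aligned =
                          reinterpret_cast<std::uintptr_t>(p) % alignof(std::max_align_t) == 0;
                      std::free(p);
                      return aligned;
                  }
                  ```

                  How many bytes the allocator really reserves behind a one-byte request is an allocator detail (the malloc_usable_size figure above is one such measurement), and it is separate from the size of the bool stored there.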

      • Lucien [he/him]
        141 month ago

        Fucking lol at the downvoters haha that second sentence must have rubbed them the wrong way for being too accurate.

  • @[email protected]
    61 month ago

    I swore I read that mysql dbs will store multiple bools in a row as bit maps in one byte. I can’t prove it though

  • @[email protected]
    English
    281 month ago

    It’s far more often stored in a word, so 32-64 bits, depending on the target architecture. At least in most languages.

    • @[email protected]
      5
      edit-2
      1 month ago

      No it isn’t. All statically typed languages I know of use a byte. Which languages store it in an entire 32 bits? That would be unnecessarily wasteful.

      • @[email protected]
        English
        127 days ago

        C, C++, C#, to name the main ones. And quite a lot of languages are compiled similarly to these.

        To be clear, there’s a lot of caveats to the statement, and it depends on architecture as well, but at the end of the day, it’s rare for a byte or bool to be mapped directly to a single byte in memory.

        Say, for example, you have this function…

        public void Foo()
        {
            bool someFlag = false;
            int counter = 0;
        
            ...
        }
        

        The someFlag and counter variables are getting allocated on the stack, and (depending on architecture) that probably means each one is aligned to a 32-bit or 64-bit word boundary, since many CPUs require that for whole-word load and store instructions, or only support a stack pointer that increments in whole words. If the function were to have multiple byte or bool variables allocated, it might be able to pack them together, if the CPU supports single-byte load and store instructions, but the next int variable that follows might still need some padding space in front of it, so that it aligns on a word boundary.

        A very similar concept applies to most struct and object implementations. A single byte or bool field within a struct or object will likely result in a whole word being allocated, so that other variables can be word-aligned, or so that the whole object meets some optimal word-aligned size. But if you have multiple less-than-a-word fields, they can be packed together. C# does this, for sure, and has some mechanisms by which you can customize field packing.

        • @[email protected]
          127 days ago

          No, in C and C++ a bool is a byte.

          since many CPUs require that for whole-word load and store instructions

          All modern architectures (ARM, x86, RISC-V) support byte load/store instructions.

          or only support a stack pointer that increments in whole words

          IIRC the stack pointer is usually incremented in 16-byte units. That’s irrelevant though. If you store a single bool on the stack it would be 1 byte for the bool and 15 bytes of padding.

          A single byte or bool field within a struct or object will likely result in a whole word being allocated, so that other variables can be word-aligned

          Again, no. I think you’ve sort of heard about this subject but haven’t really understood it.

          The requirement is that fields are naturally aligned (up to the machine word size). So a byte needs to be byte-aligned, a 2-byte value needs to be 2-byte aligned, etc.

          Padding may be inserted to achieve that, but that is padding: it doesn’t change the size of the actual bool, and it isn’t part of the bool.

          But if you have multiple less-than-a-word fields, they can be packed together.

          They will be, if it fits the alignment requirements. Create a struct with 8 bools. It will take up 8 bytes no matter what your packing setting is. They even give an example:

          If you specify the default packing size, the size of the structure is 8 bytes. The two bytes occupy the first two bytes of memory, because bytes must align on one-byte boundaries.

          They used byte here but it’s the same for bool because a bool is one byte.

          I’m really surprised how common this misconception is.
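
          The difference between the size of a bool and the padding around it can be made visible with sizeof and offsetof. A rough sketch, assuming a typical ABI where bool is 1 byte and int is 4 bytes with 4-byte alignment; the struct and function names are invented for this example:

          ```cpp
          #include <cassert>
          #include <cstddef>

          // Eight bools in a row: each needs only 1-byte alignment, so no padding.
          struct EightBools {
              bool b0, b1, b2, b3, b4, b5, b6, b7;
          };

          // A bool followed by an int: the int must start on an int-aligned offset,
          // so the compiler inserts padding after the bool. The bool is still 1 byte.
          struct FlagAndCounter {
              bool someFlag;
              int counter;
          };

          std::size_t eight_bools_size() { return sizeof(EightBools); }
          std::size_t counter_offset() { return offsetof(FlagAndCounter, counter); }

          // Bytes of padding between the end of someFlag and the start of counter.
          std::size_t padding_after_flag() {
              return offsetof(FlagAndCounter, counter) - sizeof(bool);
          }
          ```

          Under those assumptions EightBools is 8 bytes, while FlagAndCounter is 8 bytes made up of 1 byte of bool, 3 bytes of padding, and 4 bytes of int.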

      • @[email protected]
        English
        11 month ago

        It’s not wasteful, it’s faster. You can’t read one byte, you can only read one word. Every decent compiler will turn booleans into words.

        • @[email protected]
          129 days ago

          You can’t read one byte

          lol what. You can absolutely read one byte: https://godbolt.org/z/TeTch8Yhd

          On ARM it’s ldrb (load register byte), and on RISC-V it’s lb (load byte).

          Every decent compiler will turn booleans into words.

          No compiler I know of does this. I think you might be getting confused because they’re loaded into registers which are machine-word sized. But in memory a bool is always one byte.

              • @[email protected]
                English
                127 days ago

                Internally it will still read a whole word. Because the CPU cannot read less than a word. And if you read the ARM article you linked, it literally says so.

                Thus any compiler worth their salt will align all byte variables to words for faster memory access. Unless you specifically disable such behaviour. So yeah, RTFM :)

                • @[email protected]
                  127 days ago

                  Wrong again. It depends on the CPU. They can absolutely read a single byte and they will do if you’re reading from non-idempotent memory.

                  If you’re reading from idempotent memory they won’t read a byte or a word. They’ll likely read a whole cache line (usually 64 bytes).

                  And if you read the ARM article you linked, it literally says so.

                  Where?

                  Thus any compiler worth their salt will align all byte variables to words for faster memory access.

                  No they won’t because it isn’t faster. The CPU will read the whole cache line that contains the byte.

                  RTFM

                  Well, I would but no manual says that because it’s wrong!

      • @[email protected]
        21 month ago

        C/C++ treats any nonzero number as true, and only zero as false. This would allow you to guard against going from true to false via a bit flip, but not from false to true.
        Other languages like Rust define 0 to be false and 1 to be true, and any other bit pattern to be invalid for bools.
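
        It may help to separate conversion from storage here. In C++, any nonzero integer converts to true, but converting a bool back to an integer always yields exactly 0 or 1. A small sketch (to_bool and to_int are hypothetical helper names):

        ```cpp
        #include <cassert>

        // Any nonzero integer converts to true; zero converts to false.
        bool to_bool(int value) { return static_cast<bool>(value); }

        // Converting a bool to int yields exactly 1 for true and 0 for false,
        // regardless of which nonzero value produced the bool.
        int to_int(bool flag) { return static_cast<int>(flag); }
        ```

        So 42 and -7 both become true, and both come back out as 1. The "any nonzero value" rule applies at the point of conversion, not to what a well-formed bool object is expected to hold.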

  • @[email protected]
    21 month ago

    Does anybody ever figure in parity when comparing bit sizes and all that jazz or are we only ever concerned with storage space?

  • @[email protected]
    English
    521 month ago

    Back in the day when it mattered, we did it like

    #define BV00		(1 <<  0)
    #define BV01		(1 <<  1)
    #define BV02		(1 <<  2)
    #define BV03		(1 <<  3)
    ...etc
    
    #define IS_SET(flag, bit)	((flag) & (bit))
    #define SET_BIT(var, bit)	((var) |= (bit))
    #define REMOVE_BIT(var, bit)	((var) &= ~(bit))
    #define TOGGLE_BIT(var, bit)	((var) ^= (bit))
    
    ....then...
    #define MY_FIRST_BOOLEAN BV00
    SET_BIT(myFlags, MY_FIRST_BOOLEAN)
    
    
    • @[email protected]
      101 month ago

      With embedded stuff it’s still done like that. And if you go from the Arduino functions to writing the registers directly, it’s a hell of a lot faster.

    • @[email protected]
      English
      51 month ago

      Okay. Gen z programmer here. Can you explain this black magic? I see it all the time in kernel code but I have no idea what it means.

      • @[email protected]
        English
        6
        edit-2
        1 month ago

        The code is a set of preprocessor macros to stuff loads of booleans into one int (or similar), in this case named ‘myFlags’. The preprocessor is a simple (some argue too simple) step at the start of compilation that modifies the source code on its way to the real compiler by substituting #defines, prepending #include’d files, etc.

        If myFlags is equal to, e.g. 67, that’s 01000011, meaning that BV00, BV01, and BV06 are all TRUE and the others are FALSE.

        The first part is just for convenience and readability. BV00 represents the 0th bit, BV01 is the first etc. (1 << 3) means 00000001, bit shifted left three times so it becomes 00001000 (aka 8).

        The middle chunk defines macros to make bit operations more human-readable.

        SET_BIT(myFlags, MY_FIRST_BOOLEAN) gets turned into ((myFlags) |= ((1 << 0))) , which could be simplified as myFlags = myFlags | 00000001 . (Ignore the flood of parentheses, they’re there for safety due to the loaded shotgun nature of the preprocessor.)
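
        To make the walkthrough concrete, here is a small worked example built on the same four macros. It is a sketch rather than the original poster's code: BV(n) is a hypothetical shorthand standing in for the BV00, BV01, ... defines, and the function names are invented.

        ```cpp
        #include <cassert>

        // Hypothetical shorthand for the BV00, BV01, ... defines in the parent comment.
        #define BV(n) (1u << (n))

        #define IS_SET(flag, bit)    ((flag) & (bit))
        #define SET_BIT(var, bit)    ((var) |= (bit))
        #define REMOVE_BIT(var, bit) ((var) &= ~(bit))
        #define TOGGLE_BIT(var, bit) ((var) ^= (bit))

        // Builds 67 (binary 01000011) by setting bits 0, 1, and 6.
        unsigned build_example_flags() {
            unsigned myFlags = 0;
            SET_BIT(myFlags, BV(0));
            SET_BIT(myFlags, BV(1));
            SET_BIT(myFlags, BV(6));
            return myFlags;
        }

        // Wraps IS_SET so the result is a plain true/false instead of a mask value.
        bool bit_is_set(unsigned flags, unsigned n) {
            return IS_SET(flags, BV(n)) != 0;
        }

        // Clears bit 1 and flips bit 7; starting from 67 this gives 193 (11000001).
        unsigned clear_and_toggle(unsigned flags) {
            REMOVE_BIT(flags, BV(1));
            TOGGLE_BIT(flags, BV(7));
            return flags;
        }
        ```

        Note that IS_SET returns the masked value rather than 0 or 1, which is why the wrapper compares against zero.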

      • NιƙƙιDιɱҽʂ
        7
        edit-2
        1 month ago

        It’s called bitshifting and is used to select which bits you want to modify so you can toggle them individually.

        1 << 0 is the flag for the first bit
        1 << 1 for the second
        1 << 2 for the third and so on

        I think that’s correct. It’s been years since I’ve used this technique tbh 😅

  • Tekhne
    141 month ago

    Are you telling me that no compiler optimizes this? Why?

    • @[email protected]
      231 month ago

      Well there are containers that store booleans in single bits (e.g. std::vector<bool> - which was famously a big mistake).

      But in the general case you don’t want that because it would be slower.

        • @[email protected]
          51 month ago

          The mistake was that they created a type that behaves like an array in every case except for bool, for which they created a special magical version that behaves just subtly different enough that it can break things in confusing ways.
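
          One well-known instance of that subtle difference: indexing a std::vector<bool> returns a proxy object rather than a bool&, so `auto` captures a handle into the vector instead of a copy. A minimal sketch with hypothetical function names:

          ```cpp
          #include <cassert>
          #include <vector>

          // With std::vector<bool>, `auto` deduces the proxy type
          // std::vector<bool>::reference, so writing through it changes the vector
          // even though the code looks like it modifies a local copy.
          bool proxy_write_reaches_vector() {
              std::vector<bool> bits{false};
              auto first = bits[0];
              first = true;
              return bits[0];
          }

          // With another element type, `auto` deduces a plain value, so the write
          // stays local and the vector is unchanged.
          bool copy_write_reaches_vector() {
              std::vector<char> bytes{0};
              auto first = bytes[0];
              first = 1;
              return bytes[0] != 0;
          }
          ```

          The first function returns true and the second returns false, which is the kind of inconsistency that breaks generic code written against std::vector<T>.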

    • @[email protected]
      6
      edit-2
      1 month ago

      Consider what the disassembly would look like. There’s no fast way to do it.

      It’s also unnecessary since 8 bytes is a negligible amount in most cases. Serialization is the only real scenario where it matters. (Edit: and embedded)

      • @[email protected]
        41 month ago

        In embedded, if you are to the point that you need to optimize the bools to reduce the footprint, you fucked up sizing your mcu.

    • @[email protected]
      341 month ago

      It would be slower to read the value if you had to also do bitwise operations to get the value.

      But you can also define your own bitfield types to store booleans packed together if you really need to. I would much rather that than have the compiler do it automatically for me.

  • @[email protected]
    21
    edit-2
    1 month ago

    I have a solution with bit fields. Now each bool takes 1 bit, so eight of them fit in 1 byte:

    struct Flags {
        bool flag0 : 1;
        bool flag1 : 1;
        bool flag2 : 1;
        bool flag3 : 1;
        bool flag4 : 1;
        bool flag5 : 1;
        bool flag6 : 1;
        bool flag7 : 1;
    };
    

    Or for example:

    struct Flags {
        bool flag0 : 1;
        bool flag1 : 1;
        int x_cord : 3;
        int y_cord : 3;
    };
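
    A quick check of the first layout, as a sketch only. Bit-field layout is implementation-defined, so the one-byte result is an assumption that holds on the mainstream compilers I know of rather than a guarantee, and PackedFlags and the helper names are invented for this example.

    ```cpp
    #include <cassert>
    #include <cstddef>

    struct PackedFlags {
        bool flag0 : 1;
        bool flag1 : 1;
        bool flag2 : 1;
        bool flag3 : 1;
        bool flag4 : 1;
        bool flag5 : 1;
        bool flag6 : 1;
        bool flag7 : 1;
    };

    std::size_t packed_flags_size() { return sizeof(PackedFlags); }

    // Sets two flags and confirms that neighbouring bits are left untouched.
    bool flags_round_trip() {
        PackedFlags f{};  // value-initialized: every flag starts as false
        f.flag0 = true;
        f.flag7 = true;
        return f.flag0 && !f.flag3 && f.flag7;
    }
    ```

    Two caveats for the second struct: a 3-bit signed int field can only hold -4 to 3, and mixing bool and int bit-fields may make the struct larger than one byte on some compilers, because the int members can raise the struct's alignment.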
    
    • @[email protected]
      31 month ago

      I watched a YouTube video where a dev was optimizing unity code to match the size of data that is sent to the cpu using structs just like this.

  • @[email protected]
    161 month ago

    just like electronic components, they sell the gates by the chip with multiple gates in them because it’s cheaper

    • @[email protected]
      121 month ago

      In terms of memory usage it’s a waste. But in terms of performance you’re absolutely correct. It’s generally far more efficient to check if a word is 0 than to check if a single bit is zero.

    • @[email protected]
      English
      11 month ago

      Usually the most effective way is to read and write the same amount of bits as the architecture of the CPU, so for 64 bit CPUs it’s 64 bits at once.