The usefulness of bit packing

Post by **albinopapa** » August 2nd, 2019, 10:47 am

The title is a bit misleading actually. What I'm referring to is the use of bit packing for flags and am questioning the usefulness. From a library writers perspective I'm sure they are invaluable as you have less to make especially if the flags help define behavior of the same type of object. On the other hand, it seems like it actually makes things more difficult because you have to unpack all the flags when processing a parameter anyway.

Let's do this, say you want to draw a box and of course you want to draw the box. You could store the full 32 bit color and have the user choose the color out of the 4 billion choices or you could set it up so they are limited to a simpler palette using bit flags.

Code: Select all

enum ColorFlags
{
   BLACK   = 0x0,
   RED     = 0x1,
   GREEN   = 0x2,
   BLUE    = 0x4,
   WHITE   = 0x8
};
struct Box
{
   int x, y, w, h;
   char color_flags;
};

With these 5 flags you can have 16 combinations. So instead of have a RED box and a YELLOW box that takes up 32 bits storing a 32 bit value for it's color, you just unpack the flags and combine a color on the fly before drawing the box.

Code: Select all

void Graphics::DrawBox( Box const& box )
{
   unsigned int add_white = ( box.color_flags & ColorFlags::WHITE ) != 0 ? 128 : 0;
   unsigned int red = add_white, green = add_white, blue = add_white;
   
   if( box.color_flags != 0 )
   {
      if( add_white != 0 )
      {
         red += ( box.color_flags & ColorFlags::RED ) != 0 ? 128 : 0;
         green += ( box.color_flags & ColorFlags::GREEN ) != 0 ? 128 : 0;
         blue += ( box.color_flags & ColorFlags::BLUE ) != 0 ? 128 : 0;
      }
   }
   
   unsigned int color = ( 255 << 24 ) | ( red << 16 ) | ( green << 8 ) | blue;
   for(int y = box.y; y < box.y + box.h; ++y )
   {
      for( int x = box.x; x < box.x + box.w; ++x )
      {
         PutPixel( x, y, color );
      }
   }
}

So how this code works is
if the WHITE flag is not set, then you could get black, dark red, dark yellow, dark green, dark cyan, dark blue and purple.
If WHITE is set you could get gray, red, yellow, green, cyan, blue, magenta or white.

You as the developer save some time by not having to type out or make enums for each of the different colors and you save 3 bytes per box object by not storing the 4 bytes of color per box object. By not creating the 15-16 enums, you also don't have to write a switch statement to convert each flag into a color though there would be a lot less thought needed. Sounds like a win on some level, but what about performance? What if these flags had a different meaning? What if some combination of flags didn't make sense?

Now, let's say you wanted to have the behavior of an object be decided on flags.

Code: Select all

enum ActionFlags
{
   IDLE = 0x1,
   WALK = 0x2, 
   RUN = 0x4
};
struct Character
{
   ActionFlags action = ActionFlags::IDLE;
};

Would it make sense to pack any of these flags together? Nope. Allowing the user of your code to bitwise OR these flags together is bound to end up causing issues. If you check if the character is idle and the flags are 0x1| 0x2 then the character will remain idle unless you check for both in which case the last thing check for will be it's behavior and the bug may not show up because you get what you expect for a while. Plus, when the state changes, you have to remember to clear the flag and set the new flag.

So why do I bring this up? Surely people don't do this do they?

Well, they do. Just take a look through the Win32 API. There are a metric shit ton of flags and in some cases ( probably most cases ) certain flags don't make sense to be combined and the function call will fail if you try. I know the Win32 API is outdated and can't be changed. I know from a certain point of view it's easier to just create a bunch of flags to change the behavior of an object instead of just creating different versions of the object. I'm sure in each one of those functions they have a chain of if statements checking to make sure the users ( us ) don't try anything stupid and even more if statements to direct the behavior in the way we intended it to behave.
IF THIS FLAG IS SET DO THIS
ELSE IF THIS FLAG IS SET DO THIS
ELSE IF THIS FLAG IS SET DO THIS UNLESS THIS OTHER FLAG IS SET THEN FAIL

I think it also pretty lazy to write a EnableThis() function passing in a bool ( or BOOL in C ) to determine if it should be enabled. Why not write EnableThis() and DisableThis(), nope they write a

Code: Select all

EnableThis( BOOL bIsEnabled ) 
{
   if( is_enabled != bIsEnabled )
      is_enabled = bIsEnabled;
}

I never realized before how frustrating it is to have to remember what combination of flags will and won't work together. Even the newer APIs like the D3D libraries still have flags and have to be used together in certain groupings or not grouped together. It forces you to refer back to the documentation, spend so much time researching and debugging and AHHH. Some if not most of this is probably because the APIs have to be compatible with the C language and that's unfortunate. C++ has much better type safety and can be more expressive. Not to mention, with RAII, it's resource safe as well.

THE C LANGUAGE NEEDS TO DIE AT THIS POINT!!!

Anyway, that's my rant for the month I hope.

Post by **albinopapa** » August 16th, 2019, 8:50 pm

I know flags can be useful. One approach that might be a better fit than bit twiddling would be to use bit fields inside of a struct.

Code: Select all

struct bitfield
{
     // 0 = Black,
     // 1 = Red, 
     // 2 = Green,
     // 3 = Yellow,
     // 4 = Blue,
     // 5 = Magenta
     // 6 = Cyan,
     // 7 = White
     std::uint8_t color_attribute    : 3; // Choose a color from the palette
     std::uint8_t action             : 2; // 0 = Idle, 1 = Walk, 2 = Run
     std::uint8_t is_diagonal_facing : 1; // 0 = false, 1 = true
     std::uint8_t facing_east_west   : 1 // 0 = west, 1 = east
     std::uint8_t facing_north_south : 1 // 0 = north, 1, = south
};

The size of the bitfield struct is the same size as an unsigned char ( 1 byte ) and yet you get to choose from eight different colors, 3 action states, and 8 orientations and the best part is they're all named and no bit shifting and/or masking.

Usage:

Code: Select all

constexpr auto idle_action = 0;
constexpr auto walk_action = 1;
constexpr auto run_action = 2;

auto orientation = Vec2f{ 0.f, 0.f };
if( hero.attributes.action != idle_action )
{
     if( hero.attributes.is_diagonal )
     {
          // ox and oy are orientation x and orientation y
          auto const ox = hero.attributes.facing_east_west == 0 ? -.707f : .707f;
          auto const oy = hero.attributes.facing_north_south == 0 ? -.707f : .707f;
          orientation = { ox, oy }
     }
     else   // if not diagonal, then facing N, E, S or W directly
     {
          auto const ox = hero.attributes.facing_east_west == 0 ? -1.ff : 1.ff;
          auto const oy = hero.attributes.facing_north_south == 0 ? -1.ff : 1.f;
          orientation = { ox, oy }
     }
}

// update position based on orientation
hero.position += orientation * hero.speed;

Post by **chili** » August 19th, 2019, 11:14 pm

I think bitfields are cleaner but ultimately do the same thing as flags. Still suffer from the same issues of invalid combinations. The ultimate problem is that you have options which are not all mutually exclusive, but not completely independent either. Whatever representation you choose you will not be able to get around that fact.

I guess I would prefer just bool function params to packing if the # of options is low enough, but when the number of options is in the tens those function signatures would be insane.

A sexier solution would be instead of a dumb bitfield, use a smart class with functions for setting all your options. Then you could protect your class invariants, such as certain flags not being compatible with each other, with assertions. Would give much better diagnostic feedback, but would also be a lot more work to make those for every option set.

As for your ActionFlags example, that is retarded ofc. But does anybody actually do that?

Post by **albinopapa** » August 20th, 2019, 3:55 am

Hey, it took some thinking to think of that example lol.

I can't say I've seen a project that actually uses bit fields for behavior like this per se, but when you think about creating a window, you have integers that are OR'd together to decide it's visual style ( the WS_* macros ) and behavior ( the CS_* macros for registering the WNDCLASS(EX) structures ).

For memory limited devices, bit packing and bit twiddling would be the only way to get this much flexibility out of a single byte I would imagine. When storing a bool, you'd normally use up a single byte on it's own.

Relating to bit packing
I've seen CppCon videos where they use the first 16 bytes of a memory address to store information about the pointer that the 16 byte align address points to like type and/or size information.

People do some crazy stuff to pack in the data. It would seem like a lot of book keeping would be required, which to me seems like it would hinder performance packing and unpacking. On the other hand, you'd get a bit of a performance boost because you wouldn't have to fetch the pointer and then a separate enum ( type information ) or integer ( size information ).

Windows BSTR does this if I recall correctly. It's first few bytes are the length of the string and the rest is the string.
end example

I agree with the "smart class" option. It would be better for end user, but a lot of boiler plate for developer. I would consider that what inheritance is good for. If you are going to have different behavior for an object, but a lot of the same functionality as well, then why not just use inheritance?

Slidy · Post by **Slidy** » August 21st, 2019, 1:47 pm

I'd much prefer named flags than bool params for functions, even if there's a low amount.

This:

Code: Select all

Foo( FLAG_DELETE_SYSTEM64 | FLAG_ENGAGE_NUCLEAR_REACTOR | FLAG_KILL_EVERYONE );

is sexier than this:

Code: Select all

Foo( true, false, true );

Post by **chili** » August 21st, 2019, 3:28 pm

Definitely disagree there. If I type Foo and hit my intellisense, it tells me i got 3 bools and tells me their names. EZPZ.
With your situation, all I get is <int flags>. Oh great, now I gotta go digging in the docs to try and find this shit. :kms:

Flags are easier for skimming code (esp. without an IDE) tho, I will admit that.

cyboryxmen · Post by **cyboryxmen** » August 21st, 2019, 3:58 pm

Code: Select all

enum class Option
{
	a,
	b,
	c,
	a_b,
	a_c,
	b_c,
	a_b_c
};

Get on my level

Post by **albinopapa** » August 22nd, 2019, 5:51 am

Slidy wrote: ↑
August 21st, 2019, 1:47 pm
I'd much prefer named flags than bool params for functions, even if there's a low amount.

This:
Code: Select all
Foo( FLAG_DELETE_SYSTEM64 | FLAG_ENGAGE_NUCLEAR_REACTOR | FLAG_KILL_EVERYONE );
is sexier than this:
Code: Select all
Foo( true, false, true );

If ( true, false, true ) is FLAG_DELETE_SYSTEM64 | FLAG_KILL_EVERYONE, then might as well be
Foo( FLAG_ENGAGE_NUCLEAR_REACTOR ) by itself. LOL

@cyberyxmen
The enum class thing is helpful too, but still the setup for developer can be a headache even if you aren just dealing with 8 bits. I wanted to setup enums for the SIMD shuffle functions, which use all 8 bits. So, I DID all 256 variations. Then a few months go by and realized I could just be using template parameters on a constexpr inline variable THANK YOU C++11/14/17 ( can't remember which version introduced inline variables '14 or '17 ).

I know there is an easier way also, by defining operator& and operator| overloads for enum classes that have similar meanings AND can provide asserts for flag combinations that shouldn't be combined. This is probably the better route than defining all the different variations of possible flags.

If you want to be really dedicated, you could even have some type trait structures that have a list of flags that are allowed with the current flag.

Slidy · Post by **Slidy** » August 25th, 2019, 3:30 am

chili wrote: ↑
August 21st, 2019, 3:28 pm
Definitely disagree there. If I type Foo and hit my intellisense, it tells me i got 3 bools and tells me their names. EZPZ.
With your situation, all I get is <int flags>. Oh great, now I gotta go digging in the docs to try and find this shit. :kms:

Flags are easier for skimming code (esp. without an IDE) tho, I will admit that.

You just type the FLAG_ prefix and all the options come up. Also typically the enum will be typed (in which case you could also do EnumType:: and get all the options) or the function will have a comment telling you which flags are relevant, no docs digging necessary.

Post by **chili** » August 25th, 2019, 5:36 am

The FLAG_ "namespace" will get clogged up pretty soon if flags are used widely, so you'd have to subdivide it by different flag pools and then find a way of conveying the pool namespace to the user of the API. I will admit second point seems valid tho. If enum is used instead of int it's not so bad. The problem is though, the function would have to take an int type, because enumval | enumval => int. So for the enum case, you'd need a way of conveying the enum type to the programmer outside of the function signature I suppose.

Planet Chili

The usefulness of bit packing

The usefulness of bit packing

Re: The usefulness of bit packing

Re: The usefulness of bit packing

Re: The usefulness of bit packing

Re: The usefulness of bit packing

Re: The usefulness of bit packing

Re: The usefulness of bit packing

Re: The usefulness of bit packing

Re: The usefulness of bit packing

Re: The usefulness of bit packing