I’m sure bitwise operations will be covered later in this course, but I couldn’t resist using them for a nice, clean version of the IsIsogram check.
Bitwise operations allow you to manipulate and test the individual bits (the 0s and 1s) of your data types. One common use of bitwise operations is the creation of bit flags, which can be used in effect like an array of booleans, but in a much more compact format.
Here’s my solution, with an explanation of what’s going on below!
Footprints
Let’s talk briefly about type sizes. The smallest native types are the char and the bool, each coming in at 1 byte, or 8 bits. A short is 2 bytes, and the standard 32-bit int (at 8 bits a byte) comes in at 4 bytes!
So how can we represent the 26 characters of the alphabet in the most compact format?
An array of bools sounds appealing, but will be 26 bytes.
In an absolutely minimized map, using a <char, bool> as type, if we have one character we’re storing 2 bytes, one for the key itself, and one for the bool. At 16 bits, this is smaller than one bit per character, but it doubles in size for each character present in our hidden word.
For the sake of the exercise below, let’s imagine we’re testing the non-isogram entry acai.
Introducing the Bit Flag
What if, instead, we could represent each character as a single bit? If a character was present, it would be a 1, otherwise it would be a zero! That’s as compact as we can get!
Here, I’m using an int32 to hold my character presence data, but instead of using it as a number, I’m using each bit individually, so:
'a' = 0
'b' = 10
'c' = 100
and so on! 32 bits is a bit larger than our alphabet, but by using that type, we guarantee that we’re byte aligned, which is a useful to know topic that you don’t need to know right now.
Meeting the Operators
For the sake of ease, let’s imagine I’m only using an 8 character alphabet here. We’ll have a single byte bit flag that we’ll set to 0 at the start:
00000000 flag
As we loop through our letters, we’re converting each one from a-z to 0-25. Let’s imagine an a comes down the pipeline. That gives us a 0, which means we want to look at the 0th bit in our bit flag. But how can we test to see of the 0th spot is turned on or off? This is where bitwise AND comes in
In a bitwise AND, represented by the operator &, we take each bit, and do a true/false test against it to build a resulting value. If we had a single bit, 0& 0 = 0, 0 & 1 = 0, and 1 & 1 = 1.
We can represent our ‘a’ as “00000001” in this instance, and compare each bit against the one in our flag.
00000000 flag
00000001 & mask for 'a'
----------------
00000000 result of bitwise AND
Ta da! Comparing our character against the flag returns 0!, which means charFlag & 1 << checkIndex will be false. Speaking of which, what exactly is that 1 << checkIndex doing?
You’ve seen << and >> overridden for cout and cin, but natively, these are the bitwise shift operators. << moves a bit one to the ‘left’, and >> moves it one to the ‘right’.
(Note: you may be familiar with little vs big endian, but for the sake of visualization let’s ignore that for the purpose of this excercise)
We already know that we can represent ‘1’ as “0000001”.
Shifting it to the left once gives us “0000010”, which equals 2.
Once more gives us “00000100”, or 4, then 8, 16, 32, and so on.
But for our purposes, we’re using bitwise shifting to check against a single bit in our flag, so the value of the bit flag isn’t nearly as important to know as the position of the bit we’re checking.
Setting the flag
Our ‘a’ looks like it’s unique so far, but we should store it in our bit flag. To do this, we’ll need to use the bitwise OR, represented by the | operator.
Just like the bitwise AND, the bitwise OR compares two types one bit at a time, but instead of ANDing them, it ORs them! Using the same format as above:
00000000 flag
00000001 | mask for 'a'
---------------
00000001 result of bitwise OR
By using |=, we can set our flag equal to the result of the bitwise or operation!
Continuing the loop
A new character is coming up! It’s a ‘c’, which can be represented by shifting 1 left twice, 00000100. Let’s test it against our flag, which is still storing the ‘a’ from the last step.
00000001 flag, with 'a' from last iteration
00000100 & mask for 'c'
----------------
00000000 result; note that even though two bits are set,
they evaluate to false against the other field
No ‘c’ present! Let’s run it through our the bitwise OR!
00000001 flag, with 'a' from last iteration
00000100 | mask for 'c'
---------------
00000101 result; the bits for 'a' and 'c' are now both 'on'
Great! We’ve now marked two characters as present in our flag.
Uh oh, the next character in our string is an ‘a’ again. Let’s see what happens:
00000101
00000001 &
----------------
00000001 Result! Both fields have the 0th bit turned on,
so the result does as well!
Our result is non-zero! Our if check passes, because the ‘a’ has recurred, and we return!
Wrap Up
Bitwise operations can be a fantastic way to store and access a lot of boolean values in a relatively small space. They can be combined in interesting ways, and are a powerful tool for your arsenal.
Keep an eye out for a better bitwise operation explanation from the instructors (I’m sure), but I’m happy to have had the chance to give you an introduction to a really useful way to shrink your data!