Hi @OtenMoten and @McZawa, I really appreciate the feedback.
I didn’t realise that the bool array would initialize to true. I did try a modified version of the function where I use bool LettersSeen[ALPHABET_LENGTH] = {0}; // Set all letters to 0;
to avoid the loop, but I don’t know how that will be implemented when it is compiled so it might not actually be more efficient.
Normally I wouldn’t worry about the lower order terms when analyzing the algorithm’s performance: usually O(n+26) can be considered O(n) since the length of the string will have a much bigger effect for large values of n. But since most of the guesses we’re dealing with will be relatively short because there aren’t many long isograms it probably does make sense to try to eliminate the loop here.
So now I have another solution that involves using bitmasks. Since an integer contains 32 bits, all of the 26 letters of the alphabet can be stored as bits in a single integer rather than having 26 separate 1 byte (8 bit) bools. So these would be represented as:
0000 0000 0000 0000 0000 0000 0000 0001 : A
0000 0000 0000 0000 0000 0000 0000 0010 : B
0000 0000 0000 0000 0000 0000 0000 0100 : C
...
0000 0001 0000 0000 0000 0000 0000 0000 : Y
0000 0010 0000 0000 0000 0000 0000 0000 : Z
Then, if we get C as a character in our word, that will be int(C-A) = 2. We can create the representation of C by shifting the integer 1 (which is A in this encoding) by 2, ie: BitMask = 1 << 2.
Then we can check if it is in our LettersSeen integer by bit masking this with bitwise AND, ie LettersSeen & BitMask. For any letter, if it has not yet been seen the result will be 0.
Then we can add it to the LettersSeen integer using the same bitmask and bitwise OR, ie LettersSeen = LettersSeen | BitMask.
0000 0000 0000 0000 0000 0000 0001 0001 : Value is 17: already seen A and D
0000 0000 0000 0000 0000 0000 0000 0100 : Add C to it
---------------------------------------
0000 0000 0000 0000 0000 0000 0001 0101 : Value is now 21
This is more efficient in terms of space, since we’re using a single integer instead of an array. It should also be faster because, in addition to removing the array setup, these bitwise operations tend to be really fast.
Here’s the code:
bool isIsogramBitshift(const string& Word) {
int LettersSeen = 0;
// Loop over the word
for (size_t i = 0; i < Word.length(); i++) {
int LetterIndex = static_cast<int>(Word[i] - 'a');
int BitMask = 1 << LetterIndex;
if ((LettersSeen & BitMask) > 0) {
return false;
} else {
LettersSeen = LettersSeen | BitMask;
}
}
return true;
}