Notes on Bitwise Operators, Linear Feedback Shift Registers (LFSRs), and ASCII
Chris Tralie
Table of Contents
- Bitwise Operators
- Binary Bitwise Operators Practice
- XOR Encryption
- Linear Feedback Shift Registers
- Unicode/ASCII Binary Encoding for Text
Bitwise Operators
In addition to understanding how to represent numbers in binary and hex, we also need to learn two types of binary operators:
- Unary bitwise operators: These are functions that take in a binary number and return another one
- Binary bitwise operators: These are functions that take in a pair of binary numbers and return another one
Unary Shift Operators
Let's start with the unary operators on binary strings. The ones we'll mainly use in the course are the shift operators.
Left Shift
The left shift, denoted by << in C++ adds a 0 to the end of a binary string on the right. Below is an example of repeatedly applying the left shift operator to the number 174
in C++
Number of applications of num = num << 1; | Binary Number | Decimal Number |
0 | 0000000010101110 | 174 |
1 | 0000000101011100 | 348 |
2 | 0000001010111000 | 696 |
3 | 0000010101110000 | 1392 |
4 | 0000101011100000 | 2784 |
As you can see, shifting left is like multiplying by 2, and in fact, this is the fastest way to multiply by 2 on your computer.
Right Shift
The right shift, denoted by >> in C++ deletes the rightmost bit and shifts every other bit over to the right. Below is an example of repeatedly applying the right shift operator to the number 174
in C++
Number of applications of num = num >> 1; | Binary Number | Decimal Number |
0 | 0000000010101110 | 174 |
1 | 0000000001010111 | 87 |
2 | 0000000000101011 | 43 |
3 | 0000000000010101 | 21 |
4 | 0000000000001010 | 10 |
As you can see, the right shift is equivalent to dividing by 2 and discarding the remainder (or, equivalently, integer dividing by 2). In fact, this is the fastest possible way to do this on your computer.
Binary Not
The binary not operator, denoted by ~
in C++, changes all of the 1's to 0's and all of the 0's to 1s
Binary Number | Decimal Number | |
x | 0000000010101110 | 174 |
~x | 1111111101010001 | 65361 |
Binary Operators
There are also three binary bitwise operators we'll need to worry about in this class. They are
- AND
&
- OR
|
- XOR
^
x | y | & | | | ^ |
0 | 0 | 0 | 0 | 0 |
0 | 1 | 0 | 1 | 1 |
1 | 0 | 0 | 1 | 1 |
1 | 1 | 1 | 1 | 0 |
In plain English, here's what this is saying:
- AND: Only return a 1 if both input bits are 1
- OR: Return a 1 if at least one input bit is a 1
- XOR: Return a 1 if exactly one input bit is a 1. Or, in other words, return the sum of the two bits mod 2
Binary Bitwise Operators Practice
Below is a little applet where you can interactively examine what the binary bitwise operators yield for different inputs
XOR Encryption
Notice that if you XOR a bit string with the same bit string twice, you get back to where you started. For example
Encrypt: 0x1234 ^ 0xFACE = 0xE8FA
and then
Decrypt: 0xE8FA ^ 0xFACE = 0x1234
Try it yourself in the app above! We will use this fact in assignment 2 to encrypt data.
If we go back to the fact that the XOR operation is like adding bits mod 2, then the above magic trick becomes very clear. When we encrypt, we add one copy of some bit string to our input. Then, we add another copy to the result. Adding two copies of 0 does nothing. Adding two copies of 1 is also 0 mod 2. So we get back to where we started after doing this twice.
Linear Feedback Shift Registers
Now that we understand shifting and bitwise operations, we can explore a neat application in encryption.
A linear feedback shift register (LFSR) is a data structure for generating pseudorandom binary bits. One can think of it as a scheme for generating coin flips which look totally random, but which are perfectly repeatable given an initial state and a set of "taps." Together, these bits of information can be thought of as the password that we use to hide data. In particular, we will use the sequence of pseudorandom bits generated by this password to hide a message using XOR encryption in assignment 2.
A linear feedback shift register is specified by a binary string and a set of taps, which are locations of particular bits of the register numbered at 1 starting on the right. To generate a new bit, the LFSR takes the XOR of the bits at all of the tap positions. Then, the bit string of the LFSR is updated by deleting the leftmost bit, shifting the whole bit string by one to the left, and putting the new bit in the rightmost position. The animation below shows this in more detail. Please play with this until the concept is clear
Register: 0x |
Taps: |
Bit Length: |
|
Unicode/ASCII Binary Encoding for Text
All data in a computer is represented in binary, including text. Most text is represented with a format called UTF-8, which is the most encoding for text on the world wide web. In UTF-8, each character can be from 1 to 4 bytes long. To keep things simple, though, we will focus on an "ASCII subset" of UTF-8, in which each character is only 7 bits packed into a single byte. For example, the capital letter A is 0x41
.
Below you can see illustrations of a few examples of the conversion, which is done for you. For those interested, you can refer to the ASCII table here to see the hex codes for all ASCII characters (you can also type man ascii
in the terminal). You will see, among other things, that numbers come before uppercase letters, which come before lowercase letters.
Example 1
String |
|
|
|
|
|
|
|
|
|
|
| |||
ASCII |
0x49 | 0x20 | 0x6c | 0x6f | 0x76 | 0x65 | 0x20 | 0x43 | 0x53 | 0x20 | 0x31 | 0x37 | 0x34 | 0x21 |
Example 2
String |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| ||
ASCII |
0x4c | 0x65 | 0x74 | 0x27 | 0x73 | 0x20 | 0x73 | 0x65 | 0x6e | 0x64 | 0x20 | 0x73 | 0x65 | 0x63 | 0x72 | 0x65 | 0x74 | 0x73 |