Today's lab focuses on simple string manipulation.

Decoding Text

Here is a program (originally from Chapter 5 of Python Programming: An Introduction to Computer Science by John Zelle) that converts text into its Unicode codes:

# text2numbers.py
#     A program to convert a textual message into a sequence of
#         numbers, utilizing the underlying Unicode encoding.

def main():
    print("This program converts a textual message into a sequence")
    print("of numbers representing the Unicode encoding of the message.\n")
    
    # Get the message to encode
    message = input("Please enter the message to encode: ")

    print("\nHere are the Unicode codes:")

    # Loop through the message and print out the Unicode values
    for ch in message:
        print(ord(ch), end=" ")
        
    print() # blank line before prompt

main()

We are going to modify this program to create a Caesar cipher (or shift cipher), a simple code in which each letter is replaced by the letter a fixed distance down the alphabet. The simplest Caesar cipher shifts every letter to the right by one. So, ABC would be encoded as BCD and I LOVE PYTHON would be J MPWF QZUIPO.

To do this, we will modify the above program. Let's start by printing out each entered letter's index in the alphabet. So, instead of printing out ord(ch), we'll print out ord(ch)-ord('A'), or how far past 'A' the letter is in the alphabet:

# encode.py
#     A program to convert a textual message into a Caesar cipher

def main():
    print("This program converts a textual message into a sequence")
    print("of numbers representing the Unicode encoding of the message.\n")
    
    # Get the message to encode
    message = input("Please enter the message to encode: ")

    print("\nHere are the Unicode codes:")

    # Loop through the message and print out the Unicode values
    for ch in message:
        print(ord(ch)-ord('A'), end=" ")     # CHANGED ONLY THIS LINE
        
    print() # blank line before prompt

main()
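
As a quick check of what the subtraction computes, the following sketch collects the same differences into a list instead of printing them one at a time. The helper name alphabet_indexes is made up for this illustration and is not part of the lab program; it assumes the message contains only upper case letters.

```python
def alphabet_indexes(message):
    """Return how far past 'A' each character of message is.

    Hypothetical helper for illustration; assumes upper case letters only.
    """
    indexes = []
    for ch in message:
        indexes.append(ord(ch) - ord('A'))
    return indexes


print(ord('A'))                  # 65, the Unicode code of 'A'
print(alphabet_indexes("CAB"))   # [2, 0, 1]
```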

Try running the new program with input ABCDE. What is printed out? Let's save each of those values in a variable called index:

# encode.py
#     A program to convert a textual message into a Caesar cipher

def main():
    print("This program converts a textual message into a sequence")
    print("of numbers representing the Unicode encoding of the message.\n")
    
    # Get the message to encode
    message = input("Please enter the message to encode: ")

    print("\nHere are the Unicode codes:")

    # Loop through the message and print out the Unicode values
    for ch in message:
        index = ord(ch) - ord('A')
        print(ord('A')+index, index, end=" ")     
        
    print() # blank line before prompt

main()

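This version prints two numbers for each character: ord('A')+index, which is just the original Unicode code ord(ch), and index, the same value as before. We are using a variable to make the program easier to read. What does the chr() function do? Modify your program as follows to find out: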

# encode.py
#     A program to convert a textual message into a Caesar cipher

def main():
    print("This program converts a textual message into a sequence")
    print("of numbers representing the Unicode encoding of the message.\n")
    
    # Get the message to encode
    message = input("Please enter the message to encode: ")

    print("\nHere are the Unicode codes:")

    # Loop through the message and print out the Unicode values
    for ch in message:
        index = ord(ch) - ord('A')
        print(chr(ord('A')+index), index, end=" ")     
        
    print() # blank line before prompt

main()
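
The relationship between ord() and chr() can also be checked directly, outside the loop. This is only a small sketch; the variable names are chosen for illustration.

```python
letter = 'C'
code = ord(letter)        # Unicode code of 'C'
print(code)               # 67
print(chr(code))          # C

# Converting a character to its code and back gives the original character.
round_trip = chr(ord(letter))
print(round_trip == letter)   # True
```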

Note that chr() "undoes" the ord() function by taking the Unicode number as input and returning the corresponding character. How can we use this to shift each character in our message? If we want to shift by 1, we can simply add 1 to the index before printing:

# encode.py
#     A program to convert a textual message into a Caesar cipher

def main():
    print("This program converts a textual message into a sequence")
    print("of numbers representing the Unicode encoding of the message.\n")
    
    # Get the message to encode
    message = input("Please enter the message to encode: ")

    print("\nHere are the Unicode codes:")

    # Loop through the message and print out the Unicode values
    for ch in message:
        index = ord(ch) - ord('A')+1
        print(chr(ord('A')+index), index, end=" ")    
        
    print() # blank line before prompt

main()

What happens if you enter a Z? For a true Caesar cipher, Z should go to A, an index of 0, but ours has an index of 26. To fix it, we will use modular arithmetic to make sure any index of 26 or above is "wrapped" around to the beginning of the alphabet:

# encode.py
#     A program to convert a textual message into a Caesar cipher

def main():
    print("This program converts a textual message into a sequence")
    print("of numbers representing the Unicode encoding of the message.\n")
    
    # Get the message to encode
    message = input("Please enter the message to encode: ")

    print("\nHere are the Unicode codes:")

    # Loop through the message and print out the Unicode values
    for ch in message:
        index = (ord(ch)-ord('A')+1) % 26
        print(chr(ord('A')+index), end=" ")    
        
    print() # blank line before prompt

main()
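
To see the wrap-around in isolation, this sketch applies the same formula to a few individual letters. The function name shift_by_one is hypothetical, and the sketch assumes upper case letters.

```python
def shift_by_one(ch):
    """Shift one upper case letter one place to the right, wrapping Z to A."""
    index = (ord(ch) - ord('A') + 1) % 26
    return chr(ord('A') + index)


for ch in "AYZ":
    print(ch, "->", shift_by_one(ch))
# A -> B
# Y -> Z
# Z -> A

print(26 % 26)   # 0: an index of 26 wraps back to the start
```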

And lastly, we will convert all messages to upper case letters to simplify the encoding:

# encode.py
#     A program to convert a textual message into a Caesar cipher

def main():
    print("This program converts a textual message into a sequence")
    print("of numbers representing the Unicode encoding of the message.\n")
    
    # Get the message to encode
    message = input("Please enter the message to encode: ")

    print("\nHere are the Unicode codes:")

    # Loop through the message and print out the Unicode values
    for ch in message.upper():
        index = (ord(ch) - ord('A') + 1) % 26
        print(chr(ord('A')+index), end=" ")    
        
    print() # blank line before prompt

main()

Try encoding several messages to make sure your program works. You can use the same program to decode messages by changing the offset, i.e. how much each letter of plain text is shifted. Ours is currently set to +1, but you can "undo" this by setting it to -1.
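
One way to experiment with different offsets is to keep the offset in a variable rather than writing +1 directly in the formula. The sketch below assumes a function named shift_message with an offset parameter; both names are made up for this illustration, and it builds the result as a string instead of printing letter by letter. It relies on the fact that Python's % operator returns a non-negative result when the right-hand side is positive, so negative offsets wrap around correctly.

```python
def shift_message(message, offset):
    """Shift every character of message by offset places.

    Hypothetical helper; assumes the message contains only letters.
    """
    result = ""
    for ch in message.upper():
        index = (ord(ch) - ord('A') + offset) % 26
        result = result + chr(ord('A') + index)
    return result


secret = shift_message("PYTHON", 1)
print(secret)                        # QZUIPO
print(shift_message(secret, -1))     # PYTHON
```

Calling the function with the opposite offset recovers the original message, which is the "undo" described above.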

Note that our program is not designed to handle punctuation or non-letters (for example, the final version replaces each space with U; before the % 26 step was added, a space became !).
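
If you would like non-letters to pass through unchanged, one possible approach (not part of the lab's program) is to test each character before shifting it. The sketch assumes a hypothetical function encode_letters_only and uses a comparison against 'A' and 'Z' to decide what counts as a letter.

```python
def encode_letters_only(message, offset):
    """Shift letters by offset; copy every other character unchanged.

    Hypothetical variant for illustration, not the lab's encode.py.
    """
    result = ""
    for ch in message.upper():
        if 'A' <= ch <= 'Z':
            index = (ord(ch) - ord('A') + offset) % 26
            result = result + chr(ord('A') + index)
        else:
            # Spaces, digits and punctuation are left as they are.
            result = result + ch
    return result


print(encode_letters_only("I love Python!", 1))   # J MPWF QZUIPO!
```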

Python Challenge

For today's lab, we are going to work through the first two problems of the Python Challenge, which is a series of programming exercises. The solution to each provides the key to start the next one.

Hint: For the first challenge, you do not need to define a function; just use the Python shell to evaluate the mathematical expression. Take the resulting number and use it in the URL to go to the next challenge.

Hint: For the second challenge, use the encode.py program from the first part of this lab. See the challenge for a clue on the appropriate value for the offset.

If you finish early, you may work on the programming problems.