20 64 0->19 1->9 2->5 3->0 4->12 5->10 6->18 7->8 8->7 9->17 10->16 11->11 12->4 13->6 14->2 15->15 16->14 17->13 18->1 19->3 Now there are 64 bytes per page and thus 64 characters on each l ine. Now perhaps you have heard of unicode and are wondering ho w that fits into encoding characters. Unicode gives every chara cter in every language in the world a number. UTF-8 is a way to encode that number as a series of bits. Now there are more char caters in the world that 2^8 = 128, which is the number of chara caters that can be represented with 8 bits (or 1 byte). So some characters are represented by multiple bytes in UTF-8. It is ef ficient for data files using English, but may not be as efficien t as UTF-16 for languages with use non-Latin alphabet characters . The first (western) encoding of characters to numbers was ASCI I, which you also may have heard of. The ASCII encoding is embed red into UTF-8. That is, the ASCII number corresponding to a let ted is the same as the UTF-8 number corresponding to that same l ether. So that is character encodings and unicode in a nutshell, although there is, as usual, much more information about it on t he internet. The rest of this data file will be filled up with t he alphabet and some random punctuation: a b c d e f g h i j k l m n o p q r s t u v w x y z . , ! $ # % & * ; ( ) A B C D E F G H I J K L M N O P Q R S T U V W X Y Z = + < > ? / 1 2 3 4 5 6 7