ASCII
The American Standard Code for Information Interchange, or ASCII (also US-ASCII), is a way for computers to encode symbols and characters as number codes. Standard ASCII defines 128 characters: the letters of the English alphabet in both upper and lower case, the digits 0-9, the space and common punctuation marks, and 33 'control codes' that are mostly obsolete holdovers from teletype machines. Each character is represented by a number from 0 to 127 (given here in decimal, although the codes are also commonly written in binary, octal, or hexadecimal).
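As a minimal sketch in Python, the built-in `ord` returns a character's code, which can then be written in any of the bases mentioned above. The function name `code_forms` is made up for this illustration.

```python
def code_forms(ch):
    """Return the ASCII code of a single character in four bases."""
    code = ord(ch)
    if code > 127:
        raise ValueError(f"{ch!r} is not an ASCII character")
    return {
        "dec": code,
        "bin": format(code, "07b"),  # 7 bits, zero-padded
        "oct": format(code, "03o"),  # 3 octal digits
        "hex": format(code, "02X"),  # 2 hexadecimal digits
    }


print(code_forms("A"))
# {'dec': 65, 'bin': '1000001', 'oct': '101', 'hex': '41'}
```

Going the other way, `chr(65)` returns `'A'`.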
Background
See also: ASCII
ASCII was originally developed as a series of 7-bit teleprinter codes for Bell data services, with development running from May 1961 until its publication in 1963. The standard was updated several times after that, with the most significant updates (at least in terms of the character list) occurring in 1965 and 1967.
In 1965, a revision was proposed and approved but never published, so it was only ever used by a select few IBM machines. The 1965 revision would have primarily affected the standard's control codes, changing the abbreviations of 22 of the 33 codes to more descriptive ones. It would also have introduced several printable characters, including all lowercase English letters and a few lesser-used punctuation marks, such as the tilde (~).
In 1967, a new revision was published that followed through on many of the changes proposed in 1965. Across all devices using ASCII, most control codes received new abbreviations. The new characters were not all assigned the same codes as in the 1965 proposal, but all of them except the not sign (¬) made it in.
In 1968, then-President Lyndon B. Johnson issued a mandate that all computers used by the federal government had to be ASCII-compatible. Because 7-bit ASCII leaves out many characters, a variety of 8-bit code pages were developed that keep ASCII as the lower 128 characters and define their own upper 128 characters; these are known collectively as Extended ASCII. The most popular family was ISO 8859, which defined different code pages such as Latin-1 (including diacritics for most Western European languages), Latin-2 (Central and Eastern European languages), Cyrillic, Greek, Arabic, and others.
Alternative encodings were required to handle East Asian scripts such as Chinese characters and Japanese kanji, as there are simply too many characters to fit in a code page with only 256 entries. This led to the development of Unicode, a unified standard that aims to encode characters from every language, and of UTF-8, an encoding flexible enough to represent each character in anywhere between 1 and 4 bytes. Unicode overtook Windows-1252 (an Extended ASCII code page based on Latin-1) as the dominant character encoding by 2007.
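To make the variable-length point concrete, here is a small Python sketch using the standard `str.encode` method. The helper name `utf8_length` and the sample characters are chosen only for illustration.

```python
def utf8_length(ch):
    """Number of bytes UTF-8 needs to store one character."""
    return len(ch.encode("utf-8"))


for ch in ["A", "é", "漢", "😀"]:
    print(repr(ch), utf8_length(ch))
# 'A' 1
# 'é' 2
# '漢' 3
# '😀' 4

# ASCII characters keep their single-byte ASCII codes in UTF-8.
print("A".encode("utf-8") == "A".encode("ascii"))  # True

# In Latin-1 the same 'é' fits in one byte, in the upper half of the code page.
print("é".encode("latin-1"))  # b'\xe9'
```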
ASCII's main competitor at the time was EBCDIC, an 8-bit character encoding developed by IBM in the 1960s and used on IBM machines.
Printable Character List
The following table contains all printable characters in the original 7-bit ASCII model. It also shows how the character assignments changed across the three major versions from the 1960s (1963, the unpublished 1965 revision, and 1967). An empty cell in the 1963 column marks a code that was unassigned in that version; ACK and ESC are control codes that originally occupied positions later given to printable characters.
Binary | Oct | Dec | Hex | 1963 | 1965 | 1967 |
---|---|---|---|---|---|---|
010 0000 | 040 | 32 | 20 | space | space | space |
010 0001 | 041 | 33 | 21 | ! | ! | ! |
010 0010 | 042 | 34 | 22 | " | " | " |
010 0011 | 043 | 35 | 23 | # | # | # |
010 0100 | 044 | 36 | 24 | $ | $ | $ |
010 0101 | 045 | 37 | 25 | % | % | % |
010 0110 | 046 | 38 | 26 | & | & | & |
010 0111 | 047 | 39 | 27 | ' | ' | ' |
010 1000 | 050 | 40 | 28 | ( | ( | ( |
010 1001 | 051 | 41 | 29 | ) | ) | ) |
010 1010 | 052 | 42 | 2A | * | * | * |
010 1011 | 053 | 43 | 2B | + | + | + |
010 1100 | 054 | 44 | 2C | , | , | , |
010 1101 | 055 | 45 | 2D | - | - | - |
010 1110 | 056 | 46 | 2E | . | . | . |
010 1111 | 057 | 47 | 2F | / | / | / |
011 0000 | 060 | 48 | 30 | 0 | 0 | 0 |
011 0001 | 061 | 49 | 31 | 1 | 1 | 1 |
011 0010 | 062 | 50 | 32 | 2 | 2 | 2 |
011 0011 | 063 | 51 | 33 | 3 | 3 | 3 |
011 0100 | 064 | 52 | 34 | 4 | 4 | 4 |
011 0101 | 065 | 53 | 35 | 5 | 5 | 5 |
011 0110 | 066 | 54 | 36 | 6 | 6 | 6 |
011 0111 | 067 | 55 | 37 | 7 | 7 | 7 |
011 1000 | 070 | 56 | 38 | 8 | 8 | 8 |
011 1001 | 071 | 57 | 39 | 9 | 9 | 9 |
011 1010 | 072 | 58 | 3A | : | : | : |
011 1011 | 073 | 59 | 3B | ; | ; | ; |
011 1100 | 074 | 60 | 3C | < | < | < |
011 1101 | 075 | 61 | 3D | = | = | = |
011 1110 | 076 | 62 | 3E | > | > | > |
011 1111 | 077 | 63 | 3F | ? | ? | ? |
100 0000 | 100 | 64 | 40 | @ | ` | @ |
100 0001 | 101 | 65 | 41 | A | A | A |
100 0010 | 102 | 66 | 42 | B | B | B |
100 0011 | 103 | 67 | 43 | C | C | C |
100 0100 | 104 | 68 | 44 | D | D | D |
100 0101 | 105 | 69 | 45 | E | E | E |
100 0110 | 106 | 70 | 46 | F | F | F |
100 0111 | 107 | 71 | 47 | G | G | G |
100 1000 | 110 | 72 | 48 | H | H | H |
100 1001 | 111 | 73 | 49 | I | I | I |
100 1010 | 112 | 74 | 4A | J | J | J |
100 1011 | 113 | 75 | 4B | K | K | K |
100 1100 | 114 | 76 | 4C | L | L | L |
100 1101 | 115 | 77 | 4D | M | M | M |
100 1110 | 116 | 78 | 4E | N | N | N |
100 1111 | 117 | 79 | 4F | O | O | O |
101 0000 | 120 | 80 | 50 | P | P | P |
101 0001 | 121 | 81 | 51 | Q | Q | Q |
101 0010 | 122 | 82 | 52 | R | R | R |
101 0011 | 123 | 83 | 53 | S | S | S |
101 0100 | 124 | 84 | 54 | T | T | T |
101 0101 | 125 | 85 | 55 | U | U | U |
101 0110 | 126 | 86 | 56 | V | V | V |
101 0111 | 127 | 87 | 57 | W | W | W |
101 1000 | 130 | 88 | 58 | X | X | X |
101 1001 | 131 | 89 | 59 | Y | Y | Y |
101 1010 | 132 | 90 | 5A | Z | Z | Z |
101 1011 | 133 | 91 | 5B | [ | [ | [ |
101 1100 | 134 | 92 | 5C | \ | ~ | \ |
101 1101 | 135 | 93 | 5D | ] | ] | ] |
101 1110 | 136 | 94 | 5E | ↑ | ^ | ^ |
101 1111 | 137 | 95 | 5F | ← | _ | _ |
110 0000 | 140 | 96 | 60 | | @ | ` |
110 0001 | 141 | 97 | 61 | | a | a |
110 0010 | 142 | 98 | 62 | | b | b |
110 0011 | 143 | 99 | 63 | | c | c |
110 0100 | 144 | 100 | 64 | | d | d |
110 0101 | 145 | 101 | 65 | | e | e |
110 0110 | 146 | 102 | 66 | | f | f |
110 0111 | 147 | 103 | 67 | | g | g |
110 1000 | 150 | 104 | 68 | | h | h |
110 1001 | 151 | 105 | 69 | | i | i |
110 1010 | 152 | 106 | 6A | | j | j |
110 1011 | 153 | 107 | 6B | | k | k |
110 1100 | 154 | 108 | 6C | | l | l |
110 1101 | 155 | 109 | 6D | | m | m |
110 1110 | 156 | 110 | 6E | | n | n |
110 1111 | 157 | 111 | 6F | | o | o |
111 0000 | 160 | 112 | 70 | | p | p |
111 0001 | 161 | 113 | 71 | | q | q |
111 0010 | 162 | 114 | 72 | | r | r |
111 0011 | 163 | 115 | 73 | | s | s |
111 0100 | 164 | 116 | 74 | | t | t |
111 0101 | 165 | 117 | 75 | | u | u |
111 0110 | 166 | 118 | 76 | | v | v |
111 0111 | 167 | 119 | 77 | | w | w |
111 1000 | 170 | 120 | 78 | | x | x |
111 1001 | 171 | 121 | 79 | | y | y |
111 1010 | 172 | 122 | 7A | | z | z |
111 1011 | 173 | 123 | 7B | | { | { |
111 1100 | 174 | 124 | 7C | ACK | ¬ | \| |
111 1101 | 175 | 125 | 7D | | } | } |
111 1110 | 176 | 126 | 7E | ESC | \| | ~ |
Puzzle Application
As a puzzle encoding, ASCII is most commonly used just for its letters. Unlike other alphanumeric codes, however, it covers a much wider range of numbers: the letters start at 65 rather than 1, and there are two sets of them, giving 52 letter values in total. This can be very helpful for puzzles that want to use an alphanumeric code but can't get their numbers down to the 1-26 range and don't want to force solvers to reduce their results modulo 26. Writers should still be careful: unlike in other alphanumeric codes, other characters surround both sets of letters, so an erroneous number outside 65-90 or 97-122 may still decode to a valid character, even though it isn't a letter.
A less common use for ASCII is in ASCII Art, in which characters are typed in a fixed-width font and arranged to form images on a larger scale. While this can be used as an actual solving element (see the Notable Examples below), it's more likely to appear as an artistic choice within a puzzle, or as a clue towards a general ASCII or computer-y theme/topic.
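The caution about neighbouring characters can be illustrated with a short Python sketch. The function `decode_codes` and its return format are hypothetical and written only for this example: it decodes a list of decimal codes and reports every code that produced something other than a letter.

```python
def decode_codes(codes):
    """Turn decimal ASCII codes into text and list any non-letter results."""
    text = []
    non_letters = []
    for n in codes:
        if not 0 <= n <= 127:
            raise ValueError(f"{n} is outside the ASCII range")
        ch = chr(n)
        text.append(ch)
        if not ch.isalpha():
            # Still a valid ASCII character, just not in 65-90 or 97-122.
            non_letters.append((n, ch))
    return "".join(text), non_letters


print(decode_codes([80, 85, 90, 90, 76, 69]))
# ('PUZZLE', [])
print(decode_codes([80, 85, 91, 90, 76, 69]))
# ('PU[ZLE', [(91, '[')])
```

In the second call an off-by-one error in the third number still yields a printable character, which is why a checker of this kind can be useful while writing or test-solving.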
Strategy
Identification
ASCII (as used to represent letters) is ultimately very similar to A1Z26, in that it uses a run of consecutive numbers to represent the alphabet. What makes it slightly more difficult is that there are two different sets of numbers that can stand for English letters, and therefore two ranges to remember. Most puzzles use the capital letters exclusively, meaning the numbers 65-90 are the most important to remember. Puzzles will rarely use just the lowercase letters, and are more likely to form full sentences using a mix of upper- and lower-case letters. In these cases, solvers will need to watch out for both 65-90 and 97-122, along with any punctuation that may be used, such as periods (46), commas (44), and spaces (32), and possibly digits (48-57).
Since ASCII is itself an acronym, puzzles may clue it that way as well. Keep an eye out for phrases whose initials spell ASCII, or for any set of five words (titles, categories, etc.) that start with those letters.
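The ranges above can be turned into a rough first-pass check. The following Python sketch is only a heuristic under stated assumptions: the names `classify` and `guess_encoding` are invented for this example, and the punctuation set is limited to the three codes mentioned in the text.

```python
def classify(n):
    """Label a number by the ASCII range it falls in."""
    if 65 <= n <= 90:
        return "upper"
    if 97 <= n <= 122:
        return "lower"
    if 48 <= n <= 57:
        return "digit"
    if n in (32, 44, 46):  # space, comma, period
        return "punctuation"
    return "other"


def guess_encoding(numbers):
    """Guess whether a list of numbers looks like A1Z26 or ASCII text."""
    if not numbers:
        return "unknown"
    if all(1 <= n <= 26 for n in numbers):
        return "A1Z26"
    if all(classify(n) != "other" for n in numbers):
        return "ASCII"
    return "unknown"


print(guess_encoding([8, 9]))         # A1Z26
print(guess_encoding([72, 105, 46]))  # ASCII
print(guess_encoding([72, 200]))      # unknown
```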
Decryption
ASCII is a relatively easy code to decipher, as many sites will translate all of the common code forms into the characters themselves. One of these sites should be a mainstay of any puzzle-solver's bookmarks. In case the internet is unavailable, the code can also be printed out in table format, or reduced to a simple pair of numbers marking where the upper-case and lower-case letters begin. Keeping 65 and 97 in mind as A and a respectively vastly reduces how much of ASCII has to be memorized to solve most puzzles. It's also possible that binary or hexadecimal may be used to represent ASCII. In the common 8-bit representation (or 2 hexadecimal digits), it suffices to look at the last five bits and then use A1Z26 to convert, as 65 and 97 are 01000001 and 01100001, respectively, with the other letters counting up from A/a.
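The last-five-bits shortcut can be sketched in Python as follows. The function names are made up for this illustration, and the code assumes the input is already known to be a letter: it ignores the upper three bits, so it cannot tell upper case from lower case or letters from other characters.

```python
def letter_from_bits(bits):
    """Decode one 8-bit ASCII letter by reading only its last five bits."""
    position = int(bits[-5:], 2)  # 1-26, exactly as in A1Z26
    if not 1 <= position <= 26:
        raise ValueError(f"{bits} does not end in a letter position")
    return chr(ord("A") + position - 1)


def letter_from_hex(pair):
    """Same shortcut for a 2-digit hexadecimal code."""
    return letter_from_bits(format(int(pair, 16), "08b"))


word = ["01000011", "01001111", "01000100", "01000101"]
print("".join(letter_from_bits(b) for b in word))  # CODE
print(letter_from_hex("5A"))  # Z
print(letter_from_hex("61"))  # A (lowercase a has the same last five bits)
```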
Notable Examples
Played Straight
- Blather (MITMH 2007) (web) - A puzzle that showed a way to do things the long way with ASCII. Consisting entirely of strings of the word 'blather', this puzzle involved counting the number of letters in each string, converting that number to an ASCII character, and running the entire thing as a Java program.
- ASCII Characters (MITMH 2016) (web) - A rare case of ASCII art being a functional part of a puzzle. In this one, solvers are given a string of characters that must be converted to even-width font and arranged to the correct width in order to form images. Notably, each image can have part deleted and the rest arranged into another image, with this process happening multiple times for each.
Notable Twists
- Whoa -- I have a Migraine! (MITMH 2003) (web) - Uses ASCII characters...but not for art, or for any kind of encryption. Instead, it presents a set of text that functions as a stereogram or Magic Eye image.
- Lost (MITMH 2004) (web) - Used Extended ASCII characters (specifically DOS line-drawing characters) to create a solvable maze. Doing so correctly lets solvers get their final answer from numbers hidden in the path.
- The Sun Will Come Out (MITMH 2015) (web) - This puzzle plays extensively with the binary values of ASCII characters, using an OR function to combine them into new characters (which is fitting as the puzzle also deals with words that contain the substring OR, like TOM[OR]ROW).
See Also
- Puzzles involving ASCII
- ASCII to Hex, an ASCII translator that includes Hexadecimal, Decimal, Binary, and Base 64 translations of ASCII characters, as well as a ROT13 translator.