As part of a coding challenge within our in-house SAS users group at Nationwide Building Society, I created a SAS program to emulate the Enigma machine. The Enigma machine is a cipher device famously used by the German military command in World War II to encrypt messages to the field.
>> Learn more about the in-house community at Nationwide Building Society
In this recorded presentation I explain how the Enigma machine works, and how I used SAS features like custom formats and informats, arrays, and specialized SAS functions in my approach to solving the challenge.
The follow code emulates the Enigma machine and includes a simple example of encoding and decoding a small message.
/* ENIGMA MACHINE EMULATOR */
/* Code sections:
1) Create ALPHABET format and informats, mapping ABCD... to 1234...
and vice-versa.
2) Build 3 'randomized' wheels - these permute the alphabet by sending each letter
to a random position. These wheels will turn as further characters are read in,
altering the permutation. The randomized build process is seeded, so SHOULD always
produce the same wheels.
3) Build a reflector - this is similar to a wheel, but is reflexive - if A maps to X
then X maps to A. It is required to allow decryption as well as encryption.
4) Build the ENIGMA macro to take a string, pass it through Wheel1 - Wheel2 - Wheel3 -
Reflector - Wheel3 - Wheel2 - Wheel1, and print the output. This will also apply the
appropriate wheel rotations.
5) Run the ENIGMA macro on as many strings as desired.
6) Use the PROC GANNO procedure to plot a diagram of the first character's encryption,
indicating the wheel positions, input character and output character.
*/
/* 1) Create ALPHABET format and informats, mapping ABCD... to 1234...*/
/* and vice-versa.*/
/* First, create an informat to map the alphabet to its equivalent */
/* numerical rank (A=1, B=2, ...). Create an input dataset for PROC FORMAT: */
data Alphabet_Format (drop = alphabet);
alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
fmtname = 'alphabet';
do label=1 to length(alphabet);
start=substr(alphabet, label, 1);
/* Type = 'i' for the Text-to-Number informat, and */
/* Type = 'n' for the Number-to-Text format. */
type = 'i'; output;
type = 'n'; output;
end;
run;
/* Now feed the dataset into PROC FORMAT to build the format/informat */
/* (we have to select the correct rows, and swap around the label/start vars for the format) */
proc format cntlin=Alphabet_Format (where = (type = 'i')); run;
proc format cntlin=Alphabet_Format (where = (type = 'n') rename = (start = label label = start)); run;
/* 2) Build 3 'randomized' wheels - these permute the alphabet by sending each letter*/
/* to a random position. These wheels will turn as further characters are read in,*/
/* altering the permutation. The randomized build process is seeded, so SHOULD always*/
/* produce the same wheels.*/
/* List every letter, and assign a random number to sort by */
data temp;
alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
length char $1;
do i=1 to length(alphabet);
char = substr(alphabet, i, 1);
/* Assign each character A-Z four random numbers, to */
/* determine its mapping in each of the three wheels + */
/* the reflector */
wheel_sort1 = ranuni(1);
wheel_sort2 = ranuni(2);
wheel_sort3 = ranuni(3);
reflector_sort = ranuni(4);
output;
end;
run;
/* The wheels do not require reflexivity (A>X & X>A), so we can */
/* just take the above order & store these in macro variables for later. */
proc sql noprint;
select input(char,alphabet.) into: wheel1 separated by ' '
from temp
order by wheel_sort1;
select input(char,alphabet.) into: wheel2 separated by ' '
from temp
order by wheel_sort2;
select input(char,alphabet.) into: wheel3 separated by ' '
from temp
order by wheel_sort3;
quit;
/* 3) Build a reflector - this is similar to a wheel, but is reflexive - if A maps to X*/
/* then X maps to A. It is required to allow decryption as well as encryption.*/
/* Sort A-Z by the random reflector number */
proc sort data=temp;
by reflector_sort;
run;
/* Now map the 1st to the 14th letter, 2nd to the 15th, etc. */
data temp;
length output_mvar $26;
retain output_mvar;
merge temp (firstobs=14)
temp (obs=13 rename = (char = end));
substr(output_mvar, input(char, alphabet.), 1) = end;
merge temp (firstobs=14 rename = (char = end))
temp (obs=13) end=eof;
substr(output_mvar, input(char, alphabet.), 1) = end;
if eof then output;
run;
/* And store the reflector mappings in a mvar */
proc sql noprint;
select output_mvar into: REFLECTOR
from temp
order by char;
quit;
/* 4) Build the ENIGMA macro to take a string, pass it through Wheel1 - Wheel2 - Wheel3 -*/
/* Reflector - Wheel3 - Wheel2 - Wheel1, and print the output. This will also apply the*/
/* appropriate wheel rotations.*/
%macro ENIGMA(Input, /* INPUT: A string to code/decode (no quotes/special characters please). */
/*PlugBoard = Not implemented yet */
StartPoint_Wheel1 = %sysfunc(putn(%scan(&wheel1,1),alphabet.)), /* Starting letter for Wheel1 */
StartPoint_Wheel2 = %sysfunc(putn(%scan(&wheel2,1),alphabet.)), /* Starting letter for Wheel2 */
StartPoint_Wheel3 = %sysfunc(putn(%scan(&wheel3,1),alphabet.)) /* Starting letter for Wheel3 */
);
/* Wheel start points are 1 (the first character when sorted by the random ordering) by default. */
/* When alternative start points are specified in the parameters above, we need to figure out the */
/* correct start point as a number (StartPoint_Wheel&i._Num) */
%do i=1 %to 3;
%put StartPoint_Wheel&i = &&StartPoint_Wheel&i;
%let StartPoint_Wheel&i._Num = %sysfunc(inputn(&&StartPoint_Wheel&i,alphabet.));
%put StartPoint_Wheel&i._Num = &&StartPoint_Wheel&i._Num;
%end;
/* We do the processing in a dataset. We produce an output table, work.Output_Temp, which */
/* can be used for fun plotting later. */
data output_Temp;
/* Put each wheel into one of three arrays, and adjust to correctly account for alternate starting position */
%do i=1 %to 3;
array wheel&i(26) /*_temporary_*/ w&i._1-w&i._26 (&&wheel&i);
wheel&i._adjustment = whichn(&&StartPoint_Wheel&i._Num, of wheel&i(*)) - 1;
do i=1 to 26;
wheel&i(i) = wheel&i(i) - wheel&i._adjustment + 26*(wheel&i(i) - wheel&i._adjustment < 0);
end;
%end;
/* The input string is read in, and compress to only include spaces and alphanumeric chars */
Input = upcase(compress(symget('Input'), '', 'ksad'));
/* Mirror it with an Output variable that will be permuted */
Output = Input;
/* Now just iterate through each character */
do i=1 to length(Output);
character = char(Output, i);
/* Find the numeric version of the character (A=1, etc.) */
char_num = input(character, alphabet.);
/* If char_num (i.e. if the character is in the alphabet, so it returned a value when converted), */
/* we we initiate the encryption process */
if char_num then
do;
/* If char_num = A = 1, we extract the first element in the Wheel1 array (with adjustments for wheel turning) */
char_numw1 = wheel1(char_num + wheel1_adjustment - 26*(char_num + wheel1_adjustment > 26));
/* Now repeat for wheels 2 and 3 */
char_numw2 = wheel2(char_numw1 + wheel2_adjustment - 26*(char_numw1 + wheel2_adjustment > 26));
char_numw3 = wheel3(char_numw2 + wheel3_adjustment - 26*(char_numw2 + wheel3_adjustment > 26));
/* And pass through the reflector */
char_numw4 = find("&REFLECTOR", put(char_numw3, alphabet.));
/* And pass BACK through the wheels (this uses a slightly different method - we have */
/* to look up the index of the current character in the relevant array) */
char_numw5 = whichn(char_numw4, of wheel3(*)) - wheel3_adjustment;
if char_numw5 < 0 then char_numw5 = char_numw5 + 26;
char_numw6 = whichn(char_numw5, of wheel2(*)) - wheel2_adjustment;
if char_numw6 < 0 then char_numw6 = char_numw6 + 26;
char_numw7 = whichn(char_numw6, of wheel1(*)) - wheel1_adjustment;
if char_numw7 < 0 then char_numw7 = char_numw7 + 26;
/* Now convert from numeric back to character and permute the Output variable! */
final_char = put(char_numw7, alphabet.);
substr(Output, i, 1) = put(char_numw7, alphabet.);
output;
/* Following each character encyption, we turn the wheels. This involves incrementing */
/* the wheelx_adjustments variables, and ALSO incrementing the numbers present in the */
/* arrays. Don't worry, it makes sense if you draw it all out. */
wheel1_adjustment = mod(wheel1_adjustment + 1, 26);
if mod(i, 26) = 0 then wheel2_adjustment = mod(wheel2_adjustment + 1, 26);
if mod(i, 26**2) = 0 then wheel3_adjustment = mod(wheel3_adjustment + 1, 26);
do j=1 to 26;
wheel1(j) = wheel1(j) - 1 + 26*(wheel1(j) - 1 < 0);
if mod(i, 26) = 0 then wheel2(j) = wheel2(j) - 1 + 26*(wheel2(j) - 1 < 0);
if mod(i, 26**2) = 0 then wheel3(j) = wheel3(j) - 1 + 26*(wheel3(j) - 1 < 0);
end;
end;
end;
run;
/* This gives a little snapshot of the progress after each character encryption */
proc sql;
select Input, Output from output_Temp;
quit;
%mend ENIGMA;
options mprint;
/* 5) Run the ENIGMA macro on as many strings as desired.*/
/* When I encrypt ABCD, I get YMLH back - and encrypting YMLH gives ABCD! */
/* The system works. If the seeding has changed on your machine, you might not get YMLH back. */
%ENIGMA(abcd);
%ENIGMA(ymlh);
options nomprint;
/* 6) Use the PROC GANNO procedure to plot a diagram of the first character's encryption, */
/* indicating the wheel positions, input character and output character.*/
/* We write a bunch of useful helper macros here, to tidy up the actual */
/* datastep that will produce the PROC GANNO input. */
/* Move the cursor to (x,y). */
%macro move(x,y);
function = 'MOVE';
x=&x; y=&y;
output;
%mend move;
/* Draw a line from the current cursor position to (x,y). */
%macro draw(x,y, color="black");
function = 'DRAW';
x=&x; y=&y;
color = &color;
size = 1;
style = '';
output;
%mend draw;
/* Draw a rectangle with corners at the current cursor position & (x,y). */
%macro rectangle(x,y, color="black");
function = 'BAR';
_x = x; _y = y;
x=&x; y=&y;
color = &color;
size = 1;
style = '';
output;
x=_x; y=_y;
output;
%mend rectangle;
/* Write some text at (x,y). */
%macro text(text, x, y, color="black");
function = 'TEXT';
x=&x; y=&y;
color = &color;
size = 1;
text = &text;
style = '';
output;
%mend text;
/* Draw A-Z in a vertical column at the specified X position. */
%macro draw_alphabet(x, draw_right_lines = Y, alphabet_no = 1);
%do i=1 %to 26;
/* If the letter was actually selected in the encryption, colour it red! */
if (&alphabet_no = 1 and (&i = char_num or &i = char_numw7)) or
(&alphabet_no = 2 and (&i = char_numw1 or &i = char_numw6)) or
(&alphabet_no = 3 and (&i = char_numw2 or &i = char_numw5)) or
(&alphabet_no = 4 and (&i = char_numw3 or &i = char_numw4)) then color = 'red';
else color = "black";
%text("%substr(ABCDEFGHIJKLMNOPQRSTUVWXYZ,&i,1)", &x, 75 - 2*&i + 2, color=color)
%move(x - 2, y)
%draw(x - 4, y, color=color)
%if &draw_right_lines = Y %then
%do;
%move(x + 8, y)
%draw(x + 4, y, color=color)
%end;
%end;
%mend draw_alphabet;
/* Draw lines mapping two pairs of A-Z columns, indicating the permutations */
/* given by the requested &wheelno. */
%macro draw_wheel(x, wheelno);
%do i=1 %to 26;
%move(&x, 75 - 2*&i + 2)
/* If the connection was actually used in the encryption, colour it red! */
if (&wheelno = 1 and (&i = char_num or &i = char_numw7)) or
(&wheelno = 2 and (&i = char_numw1 or &i = char_numw6)) or
(&wheelno = 3 and (&i = char_numw2 or &i = char_numw5)) then color = 'red';
else color = "black";
%draw(&x - 8, 75 - 2*(w&wheelno._[&i]) + 2, color = color)
%end;
%mend draw_wheel;
/* Now we just create the Enigma_Anno input, using the output_temp dataset generated earlier. */
data Enigma_Anno (keep = function x y color style size text xsys ysys zsys);
set output_Temp (obs=1);
/* Set up our wheel arrays. */
array w1_ (26);
array w2_ (26);
array w3_ (26);
/* These are the variables required for GANNO to do its stuff. */
length function $32 x y 8 color $32 style $32 size 8 text $100;
/* The coordinate system is chosen - 1 means percentage, so we can specify */
/* x/y coords from 0 to 100. */
xsys = '1'; ysys = '1'; zsys = '1';
/* First we draw some pretty boxes. */
/* Border: */
%move(0,0)
%rectangle(100,100)
/* Draw the ENIGMA machine: */
%move(10,23)
%rectangle(20,77)
%rectangle(11,77)
%rectangle(30,77)
%rectangle(40,77)
%rectangle(50,77)
%rectangle(60,77)
%rectangle(70,77)
%rectangle(80,77)
%rectangle(90,77)
/* Label the boxes: */
%text("Wheel 1", 75, 79)
%text("Wheel 2", 55, 79)
%text("Wheel 3", 35, 79)
%text("Reflector", 15, 79)
/* Draw the alphabets: */
%draw_alphabet(25, alphabet_no = 4)
%draw_alphabet(45, alphabet_no = 3)
%draw_alphabet(65, alphabet_no = 2)
%draw_alphabet(85, draw_right_lines = N, alphabet_no = 1)
/* Draw the 3 wheels: */
%draw_wheel(79, 1)
%draw_wheel(59, 2)
%draw_wheel(39, 3)
/* and finally, draw the reflector: */
do i=1 to length("&REFLECTOR");
c_num = input(char("&REFLECTOR", i),alphabet.);
%move(19, 75 - 2*i + 2)
if c_num = char_numw3 or c_num = char_numw4 then color = 'red';
else color = "black";
%draw(11, 75 - (c_num + i) + 2, color=color)
%draw(19, 75 - 2*c_num + 2, color=color)
end;
run;
/* Now just give the annotation dataset to PROC GANNO! */
proc ganno annotate=work.Enigma_Anno;
run;
Not read it yet.
But put it on my to-do list.
Seems very interesting !
Cool project!
Thanks to @TomWheatley for completing the challenge.
I had multiple attempts at doing it but couldn't solve it and it was great to see someone take on the challenge and finish it.
This is a great example of how to use the WHICHN() function that I've used before.
have a look not only at the Enigma Machine, but also at the SAS community at Nationwide mentioned above. Very impressing,
Nationwide Building Society GB | SAS Ireland
Cheers
Markus
Super interesting!
Related with the cryptography world, I have implemented the AES encryption and decryption algorithms only with Base SAS (SAS macrofunctions) for encrypting variable values. Maybe it might be of your interest: Variable-value encryption in SAS
I haven't had the time to dig into this yet. However, a few years back I googled if anyone had done this in SAS with no luck. So I look forward to reading / watching this 🙂
I can recommend the video (Enigma Machine) - Numberphile to anyone who wants an introduction to the enigma.
Thanks for the video link. Really interesting video!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning and boost your career prospects.