How to program the Wordle game in SAS

Started 03-18-2022
Modified 04-01-2022

This article describes my approach to implementing Wordle in the SAS programming language. You can use my version in SAS for Windows, SAS Enterprise Guide, and SAS Studio (on SAS 9.4 or SAS Viya). I've shared the story about my experience in this blog article. I've also shared the complete code on GitHub.

wordle-sas on GitHub

Here's how the code works.

Get "official" Wordle game words

The "pool" of Wordle words is set by the game maintainer (currently the New York Times). It's a subset of valid words (English language) that contain exactly 5 letters. Also, there is a larger set of words that are valid "guess" words. The original game does not allow a guess of just any 5 characters. For example, you cannot guess "AEIOU" if you want to find/eliminate vowels. (But you can guess "ADIEU", a popular start word that is actually not currently a solution word.)

 

Thanks to GitHub user cfreshman for curating and sharing these "official" lists. In this step, I used PROC HTTP to pull those lists into SAS, noted the count of "game" words, and then created a data set that combines the game words with the other allowed guesses.

 

/* Get word list */
filename words temp;
filename words_ok temp;

/* "Official" word lists from NYT, via cfreshman GitHub sharing */
proc http url="https://gist.githubusercontent.com/cfreshman/a7b776506c73284511034e63af1017ee/raw/845966807347a7b857d53294525263408be967ce/wordle-nyt-answers-alphabetical.txt"
 out=words;
run;

proc http url="https://gist.githubusercontent.com/cfreshman/40608e78e83eb4e1d60b285eb7e9732f/raw/2f51b4f2bb96c02e1dee37808b2eed4ef23a3150/wordle-nyt-allowed-guesses.txt"
 out=words_ok;
run;

data words;
  infile words;
  length word $ 5;
  input word;
run;

/* save this count for our upper range of random selection */
%let wordcount = &sysnobs.;

/* valid guesses that aren't necessarily in word list
   via cfreshman GitHub sharing
*/
data allowed_words;
  infile words_ok;
  length word $ 5;
  input word;
run;

/* allowed guesses plus game words => universe of allowed guesses */
data allowed_words;
  set allowed_words words;
run;

 

Select a game word: random or known

To start a game we need to pick a solution word. I used the RAND function with the 'Integer' method to select a record from the game words. I also wanted to provide an option for you to seed a game word, making it easy to test the game play with a known solution. I store the game word in a macro variable &gamepick -- no peeking!

 

This startGame macro picks the word (or sets it from your supplied seed word). It also establishes the data set for guess tracking, which I'll describe next.

%macro startGame(seed);
  %global gamepick;

  %if %length(&seed) = 5 %then
    %let gamepick=&seed;
  %else
    %do;
      %let pick = %sysfunc(rand(Integer,1,&wordcount.));
      data _null_;
        set words (obs=&pick. firstobs=&pick.);
        call symput('gamepick',word);
      run;
    %end;
  data status;
    array check{5}  $ 1 checked1-checked5;
    length status $ 5;
    stop;
  run;
%mend;
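
As an aside, the same random pick can be sketched in a single DATA step with the POINT= and NOBS= options on the SET statement. This is an alternative for illustration, not the code in the repository. The words_demo data set and the demo_pick macro variable are made-up names, and I'm assuming a SAS release recent enough to support the 'Integer' method of RAND.

```sas
/* Stand-in word list; the real one comes from PROC HTTP */
data words_demo;
  length word $ 5;
  input word;
  datalines;
adieu
crane
qualm
;
run;

data _null_;
  /* n is filled in from NOBS= at compile time, so it is usable here */
  pick = rand('Integer', 1, n);
  /* POINT= reads just the one observation we picked */
  set words_demo point=pick nobs=n;
  call symputx('demo_pick', word, 'G');
  stop; /* required with POINT=, otherwise the step never ends */
run;

%put NOTE: picked a word of length %length(&demo_pick.);
```

The trade-off is small: this version avoids the separate %SYSFUNC call, while the macro version above keeps the pick number visible as a macro variable.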

 

Data structure to track guesses and status

We need a data set structure to track the guesses so far as well as their evaluated status. Since I plan to use arrays to check the guesses, I decided to track each guess letter in its own SAS variable. For the status, I used a single 5-character flag variable where each character represents the status of a guess letter.

  • G = correct letter in correct spot
  • Y = correct letter but wrong spot
  • B = letter not in word

 Here's an example data set for a 6-guess game (solution is "QUALM"):

Obs    checked1    checked2    checked3    checked4    checked5    status

 1        a           d           i           e           u        YBBBY 
 2        t           a           u           n           t        BYYBB 
 3        f           r           a           u           d        BBGYB 
 4        g           u           a           n           o        BGGBB 
 5        q           u           a           r           t        GGGBB 
 6        q           u           a           l           m        GGGGG 
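
If you want to experiment with this structure outside the game, here's a minimal sketch that rebuilds the example above with DATALINES. The data set name example_status and the greens and yellows counters are my own additions for illustration; the game's status data set doesn't have them.

```sas
data example_status;
  length checked1-checked5 $ 1 status $ 5;
  /* five one-character letters, skip a blank, then the 5-character flag */
  input (checked1-checked5) ($1.) +1 status $5.;
  /* illustrative extras: count each status code in the flag */
  greens  = countc(status, 'G');
  yellows = countc(status, 'Y');
  datalines;
adieu YBBBY
taunt BYYBB
fraud BBGYB
guano BGGBB
quart GGGBB
qualm GGGGG
;
run;

proc print data=example_status;
run;
```

Keeping the status as one string makes the "solved" test a simple comparison against 'GGGGG', while the separate letter variables map directly onto a DATA step array.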

 

Check a guess and update status

Checking a guess has two main steps: first, check that the guess is a valid guess word. Then, check the guess word against the solution word and note which letters are correctly placed, present but in the wrong spot, or not in the word at all.

 

Check for a valid guess

We have a list of allowed guesses in the allowed_words data set, so I used PROC SQL to create an "is_valid" flag in a macro variable. If the guess is not valid, I used PROC ODSTEXT to report this in the results window.

%let guess = %sysfunc(lowcase(&guess));
 /* Check to see if guess is valid */
 proc sql noprint;
  select count(word) into :is_valid 
   from allowed_words
     where word="&guess.";
  quit;

 %if &is_valid. eq 0 %then
   %do; 
     proc odstext;
     p "%sysfunc(upcase(&guess.)) is not a valid guess." 
      / style=[color=blue just=c fontweight=bold];
     run;
     %put &guess. is not a valid guess.;
   %end;
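
To try the validity check on its own, here's a small self-contained sketch. It assumes a three-word stand-in list (allowed_demo, a name I made up) instead of the real allowed_words data set, and it adds the TRIMMED keyword so the macro variable holds the count without leading blanks. That keyword is my addition, not part of the game code.

```sas
/* Tiny stand-in for allowed_words (the real list comes from PROC HTTP) */
data allowed_demo;
  length word $ 5;
  input word;
  datalines;
adieu
crane
qualm
;
run;

/* AEIOU is five letters but not a word, so it should be rejected */
%let guess = %sysfunc(lowcase(AEIOU));

proc sql noprint;
  select count(word) into :is_valid trimmed
    from allowed_demo
    where word = "&guess.";
quit;

%put NOTE: guess=&guess. is_valid=&is_valid.;
```

With this stand-in list the count comes back as 0, which is the condition the %IF test above uses to report an invalid guess.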

 

Check guess against solution word

This step is the crux of the game and where different creative solutions might work. I used DATA step arrays, but other programmers might use the hash object or SAS/IML with its matrix operations.

 

The algorithm goes like this. First, check each guess letter to see if it's correct and in the proper position. If so, store that status ('G') in an indexed array and then remove that guess letter from further checking.

 

Then run a second pass to find guess letters that might appear in the solution but are in the wrong spot. For each one found, note that status ('Y') in the indexed array and remove it from further checking. This "nulling out" of a found letter is important, and it's where I've seen some other implementations fall short, because they fail to handle cases where a letter occurs multiple times in the guess or the solution.

 

If a guessed letter doesn't appear in the solution, note that status ('B') in the array.

 

When we have a complete guess "record" (letters and the status flag), output to a one-row data set. Then append this to the running data set of guesses so far.

 

I've peppered this code with lots of comments to explain.

data _status(keep=checked: status);
  /* checked array will output our guessed letters */
	array check{5}  $ 1 checked1-checked5;
  /* stat array will track guess status per position */
  /* will output to a status var at the end          */
	array stat{5} $ 1;
	length status $ 5;
  /* these arrays are solution word letters, guess letters */
	array word{5} $ 1;
	array guess{5} $ 1;

	do i = 1 to 5;
		word[i] = char("&gamepick.",i);
		guess[i] = char("&guess",i);
	end;

	/* Better check for any in the correct spot first */
	do i = 1 to 5;
		/* if the guess letter in this position matches */
		if guess[i]=word[i] then
			do;
				stat[i] = 'G';
				check[i]=guess[i];
				word[i]='0'; /* null out so we don't find again */
			end;
	end;

  /* Now check for right letter, wrong spot */
	do i=1 to 5;
    /* skip those we already determined are correct */
		if stat[i] ^= 'G' then
			do;
				c = whichc(guess[i], of word[*]);

				/* if the guess letter is in another position */
				if c>0 then
					do;
						check[i]=guess[i];
						stat[i] = 'Y';
            /* if there was a letter found, null it out so we can't find again */
            word[c]='0';
					end;
				/* else no match, whichc() returned 0 */
				else
					do;
						check[i]=guess[i];
						stat[i] = 'B';
					end;
			end;
	end;

  /* Save string of guess status */
	status = catt(of stat[*]);
run;

/* append to game status thus far */
data status;
	set status _status;
run;

/* cleanup temp data */
proc delete data=_status;
run;
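
To see why the nulling-out step matters, here's a compact sketch of the same two-pass idea wrapped in a throwaway macro. The names scoreDemo and demo_status are hypothetical and exist only for this example. The test case is a guess with a repeated letter: "eerie" against the solution "crane", which contains only one "e".

```sas
/* Hypothetical helper: score one guess and write the flag to the log */
%macro scoreDemo(solution, guess);
  data _null_;
    array word{5}  $ 1;
    array guess{5} $ 1;
    array stat{5}  $ 1;
    length status $ 5;

    do i = 1 to 5;
      word[i]  = char("&solution.", i);
      guess[i] = char("&guess.", i);
    end;

    /* Pass 1: exact matches; consume the matched solution letter */
    do i = 1 to 5;
      if guess[i] = word[i] then do;
        stat[i] = 'G';
        word[i] = '0';
      end;
    end;

    /* Pass 2: remaining letters; each solution letter is usable once */
    do i = 1 to 5;
      if stat[i] ne 'G' then do;
        c = whichc(guess[i], of word[*]);
        if c > 0 then do;
          stat[i] = 'Y';
          word[c] = '0';
        end;
        else stat[i] = 'B';
      end;
    end;

    status = cats(of stat[*]);
    put "&guess. vs &solution.: " status;
    call symputx('demo_status', status, 'G');
  run;
%mend scoreDemo;

%scoreDemo(crane, eerie);
```

The log shows BBYBG. The final "e" is green and consumes the only "e" in the solution, so the first two "e" letters are marked as absent. Without the nulling-out they would both be flagged yellow, hinting at extra "e" letters that aren't there.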

 

Report game status in Wordle "grid"

In my first pass for this game I used PROC REPORT to display the game grid.

ods escapechar='^';

data _report(keep=checked1-checked5);
 array c[5]  $ 40 checked1-checked5;
 length bg $ 6 message $ 40 solved 8;
 set status(obs=6) end=last;
 do i=1 to 5;
  if char(status,i) = 'G' then
    bg="00FF00";
  if char(status,i) = 'Y' then
    bg="FFFF00";
  if char(status,i) = 'B' then
    bg="CCCCCC";
   c[i]=catt("^S={background=#",bg,"}",upcase(c[i]),"^S={}");
  end;
run;
proc report data=_report noheader;
 column checked1-checked5;
run;		

But then a SAS user in Japan gave me the idea to use the DATA step Report Writing Interface to "draw" a gridded output, and it looks much better.

data _null_;
  length background $ 50 message $ 40;
  array c[5]  $ 40 checked1-checked5;
  set status(obs=6) end=last;
  /* Credit for this approach goes to my SAS friends in Japan!                          */
  /*  http://sas-tumesas.blogspot.com/2022/03/wordlesasdo-overhash-iterator-object.html */
  dcl odsout ob ();
    ob.layout_gridded (columns: 5, rows: 1, column_gutter: '2mm');
    do i=1 to 5;
      if char(status,i) = 'G' then
        background = "green";
      else if char(status,i) = 'Y' then
        background = "darkyellow";
      else if char(status,i) = 'B' then
        background = "gray";
      text = cats ("color = white height = 1cm width = 1cm fontsize = 4 vjust = center background =", background);
      ob.region ();
      ob.table_start ();
        ob.row_start ();
          ob.format_cell (data: upcase(c[i]), style_attr: text);
        ob.row_end ();
      ob.table_end ();
      call missing (background);
    end;
  ob.layout_end ();

 

The output grid looks like this:

 

[Image: the output grid]

 

Finish game: solved or out of guesses

The game ends when you solve it with the correct guess or when you run out of guesses (you're allowed 6). I used the DATA step to check the status for all Gs ('GGGGG') and then issue the familiar congratulatory word for the player. Again I used PROC ODSTEXT for the message, as it allows for some flexibility in text formatting.

/* ...continued within reporting DATA step */ 
if status='GGGGG' then do;
 if _n_ = 1 then message = "GENIUS!";
 if _n_ = 2 then message = "MAGNIFICENT!";
 if _n_ = 3 then message = "IMPRESSIVE!";
 if _n_ = 4 then message = "SPLENDID!";
 if _n_ = 5 then message = "GREAT!";
 if _n_ = 6 then message = "PHEW!";
end;
if last then do;
  if status ^= 'GGGGG' and _n_=6 then message="Missed it (%sysfunc(upcase(&gamepick)))!";
  message=catx(' ',message,"Guess",_n_,"of 6");
  call symputx('statmsg',message);
end;
run;

proc odstext;
 p "&statmsg." 
 / style=[color=green font_size=4 just=c fontweight=bold];
run;

 

Play the game

Submit the wordle-sas.sas program in your SAS session. This program should work in PC SAS, SAS OnDemand for Academics, SAS Enterprise Guide, and SAS Viya. The program will fetch the word lists from GitHub and load them into data sets. It will also define the two macros you will use to play the game.

 

Start a game by running:

%startGame;

This will select a random word from the word list as the "puzzle" word and store it in a SAS macro variable (don't peek!).

To seed a game with a known word instead, pass the optional 5-character word parameter:

%startGame(crane);

This will seed the puzzle word ("crane" in this example). It's useful for testing. See a battery of test "sessions" in wordle-sas-tests.sas.

Submit a first guess by running:

%guess(adieu);

This will check the guess against the puzzle word, and it will output a report with the familiar "status" - letters that appear in the word (yellow) and that are in the correct position (green). It will also report if the guess is not a valid guess word, and it won't count that against you as one of your 6 permitted guesses.

 

Example game play in SAS Enterprise Guide

 

If you don't want to look at or copy/paste the game code, you can use Git functions in SAS to bring the program into your SAS session and play. (These Git functions require at least SAS 9.4 Maint 6 or SAS Viya.)

options dlcreatedir;
%let repopath=%sysfunc(getoption(WORK))/wordle-sas;
libname repo "&repopath.";
data _null_;
    rc = gitfn_clone( 
      "https://github.com/sascommunities/wordle-sas", 
      "&repoPath." 
    ); 
    put 'Git repo cloned ' rc=; 
run;
%include "&repopath./wordle-sas.sas";
 
/* start a game and submit first guess */
%startGame;
%guess(adieu);
Comments
Tom

Why not use the WINDOW or %WINDOW command to take in the guess?

@Tom That works only in Display Manager, not in any of the ways that most people are running SAS right now (in SAS EG, SAS Studio). However, a SAS user in Japan did implement a version of this. In fact, they even made a SAS/AF version. Credit to @japelin for this, which I cited in my blog post on this topic.

My current client uses SAS/AF a lot.  They also use display manager.

 

I missed the keyboard showing the used keys, Chris, so I added it.

 

[Image: game output with the added keyboard showing used keys]

 

/* Enhancement of SAS Wordle program on
     https://github.com/sascommunities/wordle-sas
   written by C Graffeuille.
   Displays the used keys on a QWERTY keyboard
     in green, yellow or dark grey.
*/

/* Get word list */
filename words temp;
filename words_ok temp;
options center;

/* "Official" word lists from NYT, via cfreshman GitHub sharing */
proc http
	url       ="https://gist.githubusercontent.com/cfreshman/a7b776506c73284511034e63af1017ee/raw/845966807347a7b857d53294525263408be967ce/wordle-nyt-answers-alphabetical.txt"
  proxyhost ="xx"
  proxyport =8080
	out       =words;
run;

proc http
	url="https://gist.githubusercontent.com/cfreshman/40608e78e83eb4e1d60b285eb7e9732f/raw/2f51b4f2bb96c02e1dee37808b2eed4ef23a3150/wordle-nyt-allowed-guesses.txt"
  proxyhost ="xx"
  proxyport =8080
  out=words_ok;
run;

data words;
	infile words;
	length word $ 5;
	input word;
  WORD=upcase(WORD);
run;

%let wordcount = &sysnobs.;

/* valid guesses that aren't necessarily in word list via cfreshman GitHub sharing */
data allowed_words;
	infile words_ok;
	length word $ 5;
	input word;
  WORD=upcase(WORD);
run;

/* allowed guesses plus game words => universe of allowed guesses */
data allowed_words;
	set allowed_words words;
run;

/*
use this to seed a new game. Will create a macro variable with the word - don't peek!
supply 'seed' value to set the word explicitly, good for testing
*/
%macro startGame(seed);
  %global gamepick;

  %if %length(&seed) = 5 %then
    %let gamepick=%upcase(&seed);
  %else
    %do;
      %let pick = %sysfunc(rand(Integer,1,&wordcount.));
      data _null_;
        set words (obs=&pick. firstobs=&pick.);
        call symput('gamepick',word);
      run;
    %end;
  data STATUS;
    length CHECKED1-CHECKED5 $1 STATUS $5 ;
    call missing(of _ALL_);
    stop;
  run;
  
  data PLAYED; length PLAYED $100; 
    PLAYED= ' '; 
  run;
%mend;

/* create a gridded output with the guesses so far */
%macro reportStatus;

  data PLAYED(keep=PLAYED);
    length BACKGROUND $16 TEXT $160 CHAR $1 ;
    LETTERS='QWERTYUIOPASDFGHJKLZXCVBNM'; 
    array C[5] $40 CHECKED1-CHECKED5;
    set PLAYED;

    dcl odsout ob ();                      
    ob.layout_gridded (columns:2, column_gutter:'20mm');
    ob.region ();
    do while (^LASTOBS);
      set STATUS(obs=6) end=LASTOBS;
      OBS+1;                      
      ob.table_start ();
      ob.row_start ();

      do I=1 to 5;
        CHAR=char(STATUS,I); 
        RANK=rank(char("&guess",I));   
        if CHAR= 'G' then do;
          BACKGROUND = "green";     
          if LASTOBS then substr(PLAYED,RANK,1)='G';    
        end;
        else if CHAR = 'Y' then do;
          BACKGROUND = "darkyellow";            
          if LASTOBS & char(PLAYED,RANK) ne 'G' then substr(PLAYED,RANK,1)='Y';  
        end;
        else if CHAR = 'B' then do;
          BACKGROUND = "gray";               
          if LASTOBS & char(PLAYED,RANK) =' ' then substr(PLAYED,RANK,1)='B'; 
        end;
        TEXT = cats ("color=white height=.8cm width=.8cm fontsize=3 vjust=center background=", BACKGROUND);
        ob.format_cell (data: C[I], style_attr: TEXT);
      end;
      ob.row_end ();
      ob.table_end ();        
    end;                     
    output PLAYED; 

    TEXT= ifc( ^&is_valid.             , "&guess is not a valid guess."
         ,ifc( STATUS='GGGGG' and OBS=1, "GENIUS!"
         ,ifc( STATUS='GGGGG' and OBS=2, "MAGNIFICENT!"
         ,ifc( STATUS='GGGGG' and OBS=3, "IMPRESSIVE!"
         ,ifc( STATUS='GGGGG' and OBS=4, "SPLENDID!"
         ,ifc( STATUS='GGGGG' and OBS=5, "GREAT!"
         ,ifc( STATUS='GGGGG' and OBS=6, "PHEW!"
         ,ifc( OBS=6                   , "Missed it (&gamepick)!"
         ,                               catx(' ',"Guess",OBS,"of 6")))))))));

    ob.format_text (data: ' ' ,style_attr: ' fontsize= '||cats(7.5-OBS) );
    ob.format_text (data: TEXT ,style_attr: 'color=green fontsize=3.5 fontweight=bold just=c');

    ob.region ();
    ob.table_start (); 
    ob.row_start ();
    do I=1 to 26;
      CHAR=char(LETTERS,I);            
      /* colour each key by its best status so far; unused keys stay light gray */
      BACKGROUND=ifc(char(PLAYED,rank(CHAR))='G','green     '
                ,ifc(char(PLAYED,rank(CHAR))='Y','darkyellow'
                ,ifc(char(PLAYED,rank(CHAR))='B','darkgray  '
                ,                                'lightgray ')));
      TEXT = cats ("color=white height=.7cm width=.4cm fontsize=3 vjust=center background=", BACKGROUND);
      ob.format_cell (data: CHAR, style_attr: TEXT); 
      if I in (10,19) then do;    
        ob.row_end (); 
        ob.table_end ();
        ob.table_start ();   
        ob.row_start ();
      end;
    end;                       
    ob.row_end ();
    ob.table_end ();
    ob.layout_end ();                                     
    stop;
  run;

%mend;

/* process a word guess */
%macro guess(guess);
  %let guess = %upcase(&guess);
	/* Check to see if guess is valid */
	proc sql noprint;
		select count(word) into :is_valid 
			from allowed_words
				where word="&guess.";
	quit;

	%if &is_valid. %then %do; 

			data _status(keep=checked: status);
        /* checked array will output our guessed letters */
				array check{5}  $ 1 checked1-checked5;
        /* stat array will track guess status per position */
        /* will output to a status var at the end          */
				array stat{5} $ 1;
				length status $ 5;
        /* these arrays are solution word letters, guess letters */
				array word{5} $ 1;
				array guess{5} $ 1;

				do i = 1 to 5;
					word[i] = char("&gamepick.",i);
					guess[i] = char("&guess",i);
				end;

				/* Better check for any in the correct spot first */
				do i = 1 to 5;
					/* if the guess letter in this position matches */
					if guess[i]=word[i] then
						do;
							stat[i] = 'G';
							check[i]=guess[i];
							word[i]='0'; /* null out so we don't find again */
						end;
				end;

        /* Now check for right letter, wrong spot */
				do i=1 to 5;
          /* skip those we already determined are correct */
					if stat[i] ^= 'G' then
						do;
							c = whichc(guess[i], of word[*]);

							/* if the guess letter is in another position */
							if c>0 then
								do;
									check[i]=guess[i];
									stat[i] = 'Y';
                  /* if there was a letter found, null it out so we can't find again */
                  word[c]='0';
								end;
							/* else no match, whichc() returned 0 */
							else
								do;
									check[i]=guess[i];
									stat[i] = 'B';
								end;
						end;
				end;

        /* Save string of guess status */
				status = catt(of stat[*]);
			run;

      /* append to game status thus far */
			data status;
				set status _status;
			run;

      /* cleanup temp data */
			proc delete data=_status;
			run;

		%end;

  /* output the report of game so far */
	%reportStatus;
%mend;

/* 
 sample usage - start game and then guess 
 with your favorite start word

 %startGame;
 %guess(adieu);

 Then submit more guesses using the %guess macro until 
 you solve it...or run out of guesses.

*/
  

 

 

@ChrisNZ Love this! Would you be open to adding as a pull request in the GitHub project? Or I'm happy to merge.

There's a small bug where invalid words update the keyboard colours.
A revamped version is coming this weekend. 🙂

Chris, I couldn't attach the file here, so I created a new post here. Merge if you want but a lot has changed! 🙂

 

Version history
Last update:
‎04-01-2022 07:05 PM