Richann Watson, DataRich Consulting; Louise Hadden, Abt Associates Inc.
A typical task for a SAS® practitioner is the creation of a new variable that is based on the value of another variable or string. This task is frequently accomplished by the use of IF-THEN-ELSE statements. However, manually typing a series of IF-THEN-ELSE statements can be time-consuming and tedious, as well as prone to typos or cut-and-paste errors. Serendipitously, SAS has provided us with an easier way to assign values to a new variable. The WHICH and CHOOSE functions provide a convenient and efficient method for data-driven variable creation.
SAS functions are powerful tools that transform variables, create variables, or provide valuable information regarding a variable. There are four basic categories that functions fall into: arithmetic (e.g., MIN, MEAN, MAX); date / time (e.g., TODAY, DATETIME); truncation (e.g., FUZZ, ROUND, TRUNC); and last but not least, string or character functions, such as CAT, TRIM, and FIND. The WHICH and CHOOSE functions are valuable members of the string function category, whose functions are generally used to clean and analyze string variables. The WHICH and CHOOSE functions are relatively new, and each comes in a character and a numeric version: WHICHC, WHICHN, CHOOSEC, and CHOOSEN.
The WHICH and CHOOSE functions are frequently used in conjunction with other character functions to streamline verbose code. Functions can greatly reduce the amount of programming required to achieve the desired results compared to DATA step logic, formats, and other techniques. We will explore several time- and effort-saving applications for the WHICH and CHOOSE functions below.
Note that data used in this paper are fictitious and do not represent any subject level data associated with a particular study. This paper and presentation are intended for all proficiency levels and all industries. Code samples were run on the Windows operating system using SAS 9.4 Maintenance Release 5.
The WHICH functions return the index of the first value in the list of values that matches the string. Table 1 provides the syntax and description of the two WHICH functions.
WHICH Functions* | Description
WHICHC(string, value-1 <, … value-n>) | Returns the index of the first item in the character value list that matches the string
WHICHN(string, value-1 <, … value-n>) | Returns the index of the first item in the numeric value list that matches the string
Table 1: Syntax and Description of WHICH Functions
Both WHICHC and WHICHN require at a minimum two arguments: string and value-1.
String is a constant, variable, or expression that evaluates to the value that will be searched for in the value list (a character value for WHICHC, a numeric value for WHICHN).
Value-n is a constant, variable, or expression that evaluates to a value to be searched. Supply one value for each item that is to be searched, with the values separated by commas.
The WHICH functions return the positive integer i that corresponds to the ith value in the list that matches the string. Note that i corresponds to the argument number minus 1. Recall that the first argument is the string, so the list of values does not start until the second argument.
The following section goes through an example to illustrate the use of these functions.
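As a minimal sketch of the indexing described above (not taken from the paper; the variable names and values are made up for illustration), the DATA step below looks up a character value with WHICHC and a numeric value with WHICHN. It also covers a case the text does not mention: according to the SAS documentation, the functions return 0 when the searched-for value does not appear in the list.

```sas
/* Illustrative sketch only: variable names and values are hypothetical. */
data _null_;
   length visit $12;
   visit = 'Week 4';
   dose  = 40;

   /* 'Week 4' is the 3rd value in the list (the 4th argument overall) */
   visit_idx = whichc(visit, 'Screening', 'Baseline', 'Week 4', 'Week 8');

   /* 40 is the 4th value in the numeric list */
   dose_idx = whichn(dose, 0, 10, 20, 40);

   /* A value that is not in the list yields 0 */
   nomatch = whichc('Unscheduled', 'Screening', 'Baseline', 'Week 4', 'Week 8');

   put visit_idx= dose_idx= nomatch=;
run;
```

The log line should read visit_idx=3 dose_idx=4 nomatch=0, which reflects the "argument number minus 1" rule: the returned index counts positions in the value list, not in the full argument list.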
The CHOOSE functions return either a character or numeric value based on the item selected from the selection list. Table 2 provides the syntax and description of the CHOOSE functions.
CHOOSE Functions* | Description
CHOOSEC(index-expression, selection-1 <, … selection-n>) | Returns the character value from the selection list that is associated with the index-expression
CHOOSEN(index-expression, selection-1 <, … selection-n>) | Returns the numeric value from the selection list that is associated with the index-expression
Table 2: Syntax and Description of CHOOSE Functions
Both CHOOSEC and CHOOSEN require at least two arguments: index-expression and selection-1.
The first argument is the index-expression, which determines which item to select from the selection list. The index-expression can be a numeric constant, a variable that holds a numeric value, or an expression that evaluates to a numeric value.
The second argument is the first item in the selection list. You can add an argument for each additional item in the selection list, separated by commas. Each item in the selection list can be a constant, variable, or expression that evaluates to a value.
The CHOOSE functions return the value in the comma-separated selection list whose position corresponds to the value of the index-expression.
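The selection rule can be sketched as follows. This is an illustrative example rather than code from the paper, and the variable names (arm_n, arm, dose, last_arm) are hypothetical. The last assignment relies on behavior described in the SAS documentation, not in the text above: a negative index-expression counts backward from the end of the selection list.

```sas
/* Illustrative sketch only: variable names and values are hypothetical. */
data _null_;
   /* Declare lengths up front so CHOOSEC results are not left at a default length */
   length arm last_arm $10;
   arm_n = 2;

   /* Pick the 2nd item from a character selection list */
   arm = choosec(arm_n, 'Placebo', 'Low Dose', 'High Dose');

   /* Pick the 2nd item from a numeric selection list */
   dose = choosen(arm_n, 0, 10, 20);

   /* A negative index counts backward: -1 selects the last item */
   last_arm = choosec(-1, 'Placebo', 'Low Dose', 'High Dose');

   put arm= dose= last_arm=;
run;
```

The log line should read arm=Low Dose dose=10 last_arm=High Dose. An index-expression that is 0 or falls outside the list is an invalid argument, so it is worth guarding against such values before calling either function.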
The best way to understand these functions is through the use of examples.
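One such example, offered as a sketch under stated assumptions and not as the authors' own code, pairs the two families: WHICHC turns a code into a position, and CHOOSEC turns that position into a label. The data set name recode_demo and the variables sexcd, sex_if, and sex_fn are hypothetical. Because WHICHC returns 0 when nothing matches, the sketch adds 1 to the index and places a default label first in the selection list.

```sas
/* Illustrative sketch only: data set and variable names are hypothetical. */
data recode_demo;
   length sexcd $1 sex_if sex_fn $7;
   do sexcd = 'M', 'F', 'U';

      /* Conventional IF-THEN-ELSE recode */
      if sexcd = 'M' then sex_if = 'Male';
      else if sexcd = 'F' then sex_if = 'Female';
      else sex_if = 'Unknown';

      /* Same recode in one assignment:                      */
      /* WHICHC gives 1 for 'M', 2 for 'F', 0 for no match;  */
      /* adding 1 maps no match to the first item, 'Unknown' */
      sex_fn = choosec(whichc(sexcd, 'M', 'F') + 1,
                       'Unknown', 'Male', 'Female');

      output;
   end;
run;

proc print data=recode_demo noobs;
run;
```

The two derived columns should agree on every row. The single-assignment form keeps the codes and their labels side by side, which makes the mapping easier to extend than a growing chain of ELSE IF branches.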
SAS Institute Inc. (n.d.). Dictionary of Functions and CALL Routines. Retrieved from https://documentation.sas.com/?docsetId=lefunctionsref&docsetTarget=p1q8bq2v0o11n6n1gpij335fqpph.htm...
Chauhan, Balram. “Alternative programming approach for Conditional processing in SAS”. Proceedings of the PhUSE 2018 Conference. Raleigh, NC: PhUSE. https://www.lexjansen.com/phuse-us/2018/ct/CT11_ppt.pdf
Horstman, J. “Beyond IF THEN ELSE: Techniques for Conditional Execution of SAS® Code”. Proceedings of the SAS Global Forum 2017 Conference. Orlando, FL: SAS Global Forum. https://support.sas.com/resources/papers/proceedings17/0326-2017.pdf
Horstman, J. “Fifteen Functions to Supercharge Your SAS® Code”. Proceedings of the PharmaSUG 2018 Conference. Seattle, WA: PharmaSUG. https://www.pharmasug.org/proceedings/2018/BB/PharmaSUG-2018-BB17.pdf
Su, Jason J. “A Game Plan for Beating the IF-THEN-ELSE Overhead in DATA Steps”. Proceedings of the SESUG 2020 Conference. Virtual: SESUG. https://www.lexjansen.com/sesug/2020/SESUG2020_Paper_152_Final_PDF.pdf
Please refer to the attached PDF document for illustrations of these functions.