Paper 1093-2021
Authors
Richann Watson, DataRich Consulting; Louise Hadden, Abt Associates Inc.
Abstract
A typical task for a SAS® practitioner is the creation of a new variable that is based on the value of another variable or string. This task is frequently accomplished with IF-THEN-ELSE statements. However, manually typing a series of IF-THEN-ELSE statements can be time-consuming and tedious, as well as prone to typos or cut-and-paste errors. Serendipitously, SAS has provided us with an easier way to assign values to a new variable. The WHICH and CHOOSE functions provide a convenient and efficient method for data-driven variable creation.
Introduction
SAS functions are powerful tools that transform variables, create variables, or provide valuable information regarding a variable. Functions fall into four basic categories: arithmetic (e.g., MIN, MEAN, MAX); date / time (e.g., TODAY, DATETIME); truncation (e.g., FUZZ, ROUND, TRUNC); and last but not least, string or character functions, such as CAT, TRIM, and FIND. The WHICH and CHOOSE functions are valuable members of the string function category, whose functions are generally used to clean and analyze string variables. The WHICH and CHOOSE functions are relatively new and are available for both character and numeric processing: WHICHC, WHICHN, CHOOSEC, and CHOOSEN.
The WHICH and CHOOSE functions are frequently used in conjunction with other character functions to streamline verbose coding. Functions can greatly reduce the amount of programming required to achieve desired results compared to the data step, formats, and other techniques. We will explore several time- and effort-saving applications for the WHICH and CHOOSE functions below.
Note that data used in this paper are fictitious and do not represent any subject level data associated with a particular study. This paper and presentation are intended for all proficiency levels and all industries. Code samples were run on the Windows operating system using SAS 9.4 Maintenance Release 5.
Which Functions
The WHICH functions return the index number of the first value in the list of values that matches the string. Table 1 provides the syntax and description of the two WHICH functions.
| WHICH Functions* | Description |
|---|---|
| WHICHC(string, value-1 <, … value-n>) | Returns the index of the first item in the character value list that matches the string |
| WHICHN(string, value-1 <, … value-n>) | Returns the index of the first item in the numeric value list that matches the string |

Table 1: Syntax and Description of WHICH Functions
Both WHICHC and WHICHN require at a minimum two arguments: string and value-1.
String is a constant, variable or an expression that evaluates to a value that will be searched for in the value list.
Value-n is a constant, variable or an expression that evaluates to a value to be searched. There should be a value for each item that is to be searched with the values separated by commas.
The WHICH functions return the positive integer i that corresponds to the ith value in the list that matches the string. Note that i equals the argument number minus 1: the first argument is the string, so the list of values does not start until the second argument. If no value in the list matches the string, the functions return 0.
The following section goes through an example to illustrate the use of these functions.
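As a minimal sketch of the behavior described above, the DATA step below applies WHICHC and WHICHN to a small, made-up data set. The data set name, variable names, visit labels, and dose values are all hypothetical and are not taken from the paper.

```sas
/* Hypothetical data: visit labels and dose levels are illustrative only */
data visits;
   infile datalines dlm=',';
   length visit $12;
   input visit $ dose;

   /* Position of VISIT in the character list; 0 when nothing matches */
   visitnum = whichc(visit, 'Screening', 'Baseline', 'Week 4', 'Week 8');

   /* Position of DOSE in the numeric list; 0 when nothing matches */
   dosepos = whichn(dose, 10, 20, 40);
   datalines;
Screening,10
Baseline,20
Week 8,40
Unscheduled,25
;
run;

proc print data=visits noobs;
run;
```

In this sketch, VISITNUM should come out as 1, 2, 4, and 0, and DOSEPOS as 1, 2, 3, and 0. The last row shows the no-match case for both functions.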
Choose Functions
The CHOOSE functions return either a character or numeric value based on the item selected from the selection list. Table 2 provides the syntax and description of the CHOOSE functions.
| CHOOSE Functions* | Description |
|---|---|
| CHOOSEC(index-expression, selection-1 <, … selection-n>) | Returns the character value from the selection list that is associated with the index-expression |
| CHOOSEN(index-expression, selection-1 <, … selection-n>) | Returns the numeric value from the selection list that is associated with the index-expression |

Table 2: Syntax and Description of CHOOSE Functions
Both functions require at least two arguments.
The first argument is the index-expression, which is used to determine which item to select from the selection list. The index-expression can be a numeric constant, a variable that represents a numeric value, or an expression that evaluates to a numeric value.
The second argument is the first item in the selection list. You can add additional arguments for each item in the selection list, separated by commas. Each item in the selection list is a constant, a variable, or an expression that evaluates to a value.
The CHOOSE functions return the value in the comma-separated list whose position corresponds to the value of the index-expression.
The best way to understand these functions is through the use of examples.
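The following sketch is one possible illustration, not an example from the paper. The data set, the variable names, and the score banding rule are hypothetical. It uses CHOOSEC and CHOOSEN with a computed index, then nests WHICHC inside CHOOSEC to translate a code into a label in a single statement.

```sas
/* Hypothetical data: codes, scores, and banding rule are illustrative only */
data grades;
   /* Declare lengths up front; otherwise CHOOSEC results default to $200 */
   length sex $1 sex_label $6 grade $6;
   input sex $ score;

   /* Build a 1-based index: 1 = below 60, 2 = 60 to 79, 3 = 80 and above */
   idx = 1 + (score >= 60) + (score >= 80);

   /* Pick the character and numeric values at position IDX */
   grade  = choosec(idx, 'Low', 'Medium', 'High');
   weight = choosen(idx, 0.5, 1.0, 1.5);

   /* WHICHC finds the position of the code; CHOOSEC returns the label there */
   sex_label = choosec(whichc(sex, 'M', 'F'), 'Male', 'Female');
   datalines;
M 45
F 72
M 91
;
run;

proc print data=grades noobs;
run;
```

Here the three rows should yield GRADE values of Low, Medium, and High, with WEIGHT values of 0.5, 1.0, and 1.5. When nesting the functions this way, keep in mind that an unmatched code makes WHICHC return 0, which is not a valid position in the selection list, so unmatched values need separate handling.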
References
SAS Institute Inc. (n.d.). Dictionary of Functions and CALL Routines. Retrieved from https://documentation.sas.com/?docsetId=lefunctionsref&docsetTarget=p1q8bq2v0o11n6n1gpij335fqpph.htm&docsetVersion=9.4&locale=en
Acknowledgements
Chauhan, Balram. “Alternative programming approach for Conditional processing in SAS”. Proceedings of the PhUSE 2018 Conference. Raleigh, NC: PhUSE. https://www.lexjansen.com/phuse-us/2018/ct/CT11_ppt.pdf
Horstman, J. “Beyond IF THEN ELSE: Techniques for Conditional Execution of SAS® Code”. Proceedings of the SAS Global Forum 2017 Conference. Orlando, FL: SAS Global Forum. https://support.sas.com/resources/papers/proceedings17/0326-2017.pdf
Horstman, J. “Fifteen Functions to Supercharge Your SAS® Code”. Proceedings of the PharmaSUG 2018 Conference. Seattle, WA: PharmaSUG. https://www.pharmasug.org/proceedings/2018/BB/PharmaSUG-2018-BB17.pdf
Su, Jason J. “A Game Plan for Beating the IF-THEN-ELSE Overhead in DATA Steps”. Proceedings of the SESUG 2020 Conference. Virtual: SESUG. https://www.lexjansen.com/sesug/2020/SESUG2020_Paper_152_Final_PDF.pdf
Examples
Please refer to attached PDF document for illustrations of these functions.