Obsidian | Level 7

## Creating a new variable based on a string of similar variables with identical responses

I have five survey variables, similar in naming convention:  ABC1, ABC2, ABC3, ABC4, and ABC5.  Responses for these variables are either 1 or 2.

I want to shorten/simplify my code of:

If ABC1=1 and ABC2=1 and ABC3=1 and ABC4=1 and ABC5 =1 then CODE=8;

How do I simplify that code?  In my real scenario, my last variable is ABC100.

Thank you!

8 REPLIES 8
Super User

## Re: Creating a new variable based on a string of similar variables with identical responses

It depends a bit on the full logic. An alternative equivalent calculation would be that the SUM() of all those items would be 100.

/*all values are 1 - assumes no missing values*/
if sum(of abc1-abc100) = 100 then code=8;

/*checks if any value of 2 is in the list of variables*/
IF WHICHN('2', of ABC1-ABC100) =0 then code=8
Obsidian | Level 7

## Re: Creating a new variable based on a string of similar variables with identical responses

My variables ABC1 through ABC100 are defined as character so I get an error.
Super User

## Re: Creating a new variable based on a string of similar variables with identical responses

The code you posted treats your data as numeric not character.

Use WHICHC() instead of WHICHN() then.

Documentations on functions are available here @Phil_NZ

You may also want ti book mark the formats and informats by category page (find them from the main documentation.sas.com page).

https://documentation.sas.com/?docsetId=lefunctionsref&docsetTarget=n01f5qrjoh9h4hn1olbdpb5pr2td.htm...

## Re: Creating a new variable based on a string of similar variables with identical responses

@Bluekeys49 wrote:
My variables ABC1 through ABC100 are defined as character ...

Then why not use character functions (this code works for numeric variables as well):

``if cats(of abc:)=repeat('1',99) then code=8;``

[Edit: Actually it's more robust for numeric variables because then the effect cats('11',' ')=cats('1','1') could not happen, assuming the standard setting of system option MISSING.]

Super User

## Re: Creating a new variable based on a string of similar variables with identical responses

@Bluekeys49 wrote:
My variables ABC1 through ABC100 are defined as character so I get an error.

PROC Star

## Re: Creating a new variable based on a string of similar variables with identical responses

If ABC1=1 and ABC2=1 and ABC3=1 and ABC4=1 and ABC5 =1 then CODE=8;

How do I simplify that code?  In my real scenario, my last variable is ABC100.

``````  if nmiss(of abc:)=0 and min(of abc:)=1 and max(of abc:)=1;
``````
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Barite | Level 11

## Re: Creating a new variable based on a string of similar variables with identical responses

Sorry, @Bluekeys49 , I just stop by and feel interesting in the code

Hi @mkeintz and @Reeza

Could you please tell me how to search the documents to learn about

sum(of...) or nmiss(of...), or is there any document in your head that you think can help me to learn about these nice functions (with of inside bracket)?

Thank you!

Thank you for your help, have a fabulous and productive day! I am a novice today, but someday when I accumulate enough knowledge, I can help others in my capacity.
Super User

## Re: Creating a new variable based on a string of similar variables with identical responses

If your variables are numeric you can test the RANGE of values. If the Range is 0 all of the values are the same.

So you test the range and a single (non-missing) variable for its value. If it is 1 then all the values are 1.

if range (of ABC: )=0 and Abc1=1 then code=8;

The ABC: uses all the variables whose names start with ABC. Or use range(of Abc1-Abc100) to use sequentially numbered variables. If there is a gap in the numbers, maybe you don't want to use ABC49 for some particular reason, you can provide two (or more) similar lists:  range(of abc1-abc48, of abc50-abc100);

Note: Range will ignore missing values in the calculation. So if the requirement is to have no missing value you could add:

and nmiss(of abc:)=0.

You might also consider coding things as 1/0 instead and treat 1 as the condition most often of interest (Yes) perhaps.

Then an "all 100 variables must be exactly 1" becomes:   sum(of abc1-abc100)=100

If you want to know that at least one variables has the value 1 (in this 1/0) coding scheme: max(of abc1-abc100)=1

At least one of the variables has a 0  : min (of abc1-abc100)=0.

Percent of variables with responses that answered 1 : mean(of abc1-abc100)  yields the decimal percentage such as .125 =12.5%

Discussion stats
• 8 replies
• 917 views
• 5 likes
• 6 in conversation