Hi,
I want to use case-when logic to condtionally create a new variable - in the example below, I want to assign all observations with value below 10 to the first group an all other to the second group. Can I do it without if-then/else block? I've found the examples of using select statement inside the data step but it does not work as expected - only the first record is assingned properly. What am I doing wrong?
data test;
input var;
datalines;
1
2
3
30
40
50
;
run;
data test;
set test;
select (var);
when (var < 10) target_group = 1;
otherwise target_group = 2;
end;
run;
The page https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lestmtsref/p09213s9jc2t99n1vx0omk2rh9ps.htm#p... indicates the correct use.
You should use the code:
select; when (var < 10) target_group = 1; otherwise target_group = 2; end;
The page https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lestmtsref/p09213s9jc2t99n1vx0omk2rh9ps.htm#p... indicates the correct use.
You should use the code:
select; when (var < 10) target_group = 1; otherwise target_group = 2; end;
See this article about the SELECT-WHEN logic in the DATA step.
The SELECT-WHEN statement supports two syntaxes. The first is when you are matching a discrete set of values that you know about in advance. For example:
data test2;
set test;
condition = (var < 10);
select (condition);
when (1) target_group = 1;
otherwise target_group = 2;
end;
run;
In the other syntax, you do not specify the name of the variable on the SELECT statement, and you put the logic in the WHEN statement, like this:
data test3;
set test;
select;
when (var < 10) target_group = 1;
otherwise target_group = 2;
end;
run;
Although you can use the SELECT-WHEN statement to create a new variable, you can also use a SAS FORMAT to create the values that you want.
Hi @chris2377,
Such a simple conditional assignment can indeed be written without any additional statements:
data want;
set test;
target_group=2-(var < 10);
run;
The Boolean expression var < 10 is evaluated to 1 if the inequality is true (which includes the case that var contains a missing value!), else to 0. So, two minus the result is exactly the desired value for target_group. This also explains the result of your SELECT statement: the "Boolean" 1 (for TRUE) happened to match your var value 1.
Also note that I used a different output dataset name (WANT) to avoid overwriting the input dataset in case of incorrect logic.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.