Hi,
I want to use case-when logic to conditionally create a new variable. In the example below, I want to assign all observations with a value below 10 to the first group and all others to the second group. Can I do it without an if-then/else block? I've found examples of using a SELECT statement inside the DATA step, but it does not work as expected: only the first record is assigned properly. What am I doing wrong?
data test;
input var;
datalines;
1
2
3
30
40
50
;
run;
data test;
set test;
select (var);
when (var < 10) target_group = 1;
otherwise target_group = 2;
end;
run;
The page https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lestmtsref/p09213s9jc2t99n1vx0omk2rh9ps.htm#p... indicates the correct use.
You should use the code:
select;
when (var < 10) target_group = 1;
otherwise target_group = 2;
end;
See this article about the SELECT-WHEN logic in the DATA step.
The SELECT-WHEN statement supports two syntaxes. The first, SELECT (expression), compares the value of the expression with the value in each WHEN clause; use it when you are matching a discrete set of values that you know about in advance. For example:
data test2;
set test;
condition = (var < 10);
select (condition);
when (1) target_group = 1;
otherwise target_group = 2;
end;
run;
In the other syntax, you do not specify the name of the variable on the SELECT statement, and you put the logic in the WHEN statement, like this:
data test3;
set test;
select;
when (var < 10) target_group = 1;
otherwise target_group = 2;
end;
run;
Although you can use the SELECT-WHEN statement to create a new variable, you can also use a SAS FORMAT to create the values that you want.
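The format approach could look roughly like the sketch below. The format name GRPFMT and the output dataset name TEST4 are made up for illustration; the range simply mirrors the var < 10 rule used above.

```sas
/* Hypothetical user-defined format: values below 10 map to '1', everything else to '2'. */
proc format;
   value grpfmt
      low -< 10 = '1'
      other     = '2';
run;

data test4;
   set test;
   /* PUT applies the format and returns a character value; INPUT converts it back to numeric. */
   target_group = input(put(var, grpfmt.), 8.);
run;
```

One difference to keep in mind: in a numeric format, LOW should not include missing values, so a missing var would fall into OTHER (group 2) here, whereas the expression var < 10 is true for a missing value. If you only need the grouping for reporting, you may not need a new variable at all and can apply the format directly with a FORMAT statement.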
Hi @chris2377,
Such a simple conditional assignment can indeed be written without any additional statements:
data want;
set test;
target_group=2-(var < 10);
run;
The Boolean expression var < 10 is evaluated to 1 if the inequality is true (which includes the case that var contains a missing value!), else to 0. So, two minus the result is exactly the desired value for target_group. This also explains the result of your SELECT statement: the "Boolean" 1 (for TRUE) happened to match your var value 1.
Also note that I used a different output dataset name (WANT) to avoid overwriting the input dataset in case of incorrect logic.
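If missing values should not end up in the first group, an extra WHEN clause in the expression-less SELECT is one way to handle them. This is only a sketch: the dataset name WANT_CHECKED is hypothetical, and leaving target_group missing is an assumption about what you would want in that case.

```sas
data want_checked;
   set test;
   select;
      /* WHEN clauses are tested in order, so the missing check must come first. */
      when (missing(var)) target_group = .;
      when (var < 10)     target_group = 1;
      otherwise           target_group = 2;
   end;
run;
```

With the sample data above, which contains no missing values, this produces the same result as the one-line assignment.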