Hello,
I have a requirement to convert IF statements to CASE expressions in PROC SQL.
My data looks like this:
var1 | var2 | var3 | sum_var | condition |
a | b | c | 10 | If sum_var <5 |
a | b | c | 20 | If sum_var > 15 |
a | b | c | 30 | If sum_var > 35 |
d | e | f | 40 | If sum_var > 50 |
d | e | g | 50 | If sum_var <80 |
Now I need to create an output table that keeps only the rows, within each group, that satisfy their own condition. This has to be done in PROC SQL because the code will run in DI Studio.
var1, var2 and var3 are combinations of values, so they can't be hardcoded. I need output something like this:
var1 | var2 | var3 | sum_var | condition |
a | b | c | 10 | If sum_var <5 |
a | b | c | 20 | If sum_var > 15 |
d | e | g | 50 | If sum_var <80 |
Need your advice.
P.S. Sent from a BlackBerry. Please ignore spelling mistakes and typos.
Assuming condition is always a simple inequality comparison, you could use something like this:
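data have;
  length var1 var2 var3 $4 condition $20;
  /* The & modifier lets condition contain single embedded blanks */
  input var1 var2 var3 sum_var condition &;
  datalines;
a b c 10 If sum_var <5
a b c 20 If sum_var > 15
a b c 30 If sum_var > 35
d e f 40 If sum_var > 50
d e g 50 If sum_var <80
;

proc sql;
  /* target and keep are helper columns, dropped from the output table */
  create table want(drop=target keep) as
  select *,
         /* The third word of condition is the value to compare against */
         input(scan(condition, 3, ' <>='), best.) as target,
         /* Test two-character operators before one-character ones */
         case
           when index(condition, '<=') > 0
             then sum_var <= calculated target
           when index(condition, '>=') > 0
             then sum_var >= calculated target
           when index(condition, '<') > 0
             then sum_var < calculated target
           when index(condition, '>') > 0
             then sum_var > calculated target
           else 0
         end as keep
  from have
  /* Keep only the rows whose own condition evaluates to true */
  where calculated keep;
quit;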
PG
You need to clarify your output and the rules you're trying to follow.
Is the condition a character variable in the data set?
I don't understand the SQL restriction, though. Even in DI Studio, if it's a code node or a User Written transformation, you can still use data step code, can't you?
Completely agree with you, Reeza. But since this piece of code is going to be used in an Extract transformation, my client is adamant about using SQL code.
Also, you understood the requirement correctly. I have combinations of these 3 variables. The conditions are stored in another variable, so we need to output only the observations that satisfy the condition stored on their own row.
Hope I am making sense.
Your output doesn't seem to match your data requirements.
Can you create a format that would do it, or do you have to use straight SQL? I think formats would be better, or macro variable generation for data step IF code.
You need to post all possible conditions though, otherwise this will be a long back and forth.
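For illustration, the macro variable idea could also be kept inside PROC SQL by generating the WHERE clause from the data itself. This is only a minimal sketch, not something confirmed in this thread: it assumes every condition value is the literal text "If " followed by a valid SAS expression, that no condition contains macro triggers (& or %) or exceeds what fits in one macro variable, and that have is not empty. The names where_clause and want_macro are hypothetical, and whether DI Studio's Extract transformation accepts this is not something the thread establishes.

```sas
/* Sketch only: build one "(condition = "..." and <expression>)" term   */
/* per distinct condition and join the terms with OR.                   */
/* Assumption: condition always starts with "If ", so the expression    */
/* itself begins at character 4.                                        */
proc sql noprint;
  select distinct
         catx(' ', '(condition =', quote(trim(condition)),
                   'and', substr(condition, 4), ')')
    into :where_clause separated by ' or '   /* hypothetical macro variable */
    from have;

  /* Each row is kept only if it satisfies the condition stored on it */
  create table want_macro as                /* hypothetical table name */
  select *
    from have
   where &where_clause;
quit;
```

Because the stored text is used as the expression itself, this variant would not need a separate branch per operator, at the cost of depending on the macro facility.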
Worked like a charm. I have several other conditions besides inequalities, but your logic has helped a lot. Thanks.
var1 | var2 | var3 | sum_var | condition |
a | b | c | 20 | If sum_var > 15 |
d | e | g | 50 | If sum_var <80 |
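The other conditions mentioned above are not listed in the thread, so the following is only a guess at how PG's CASE logic might be extended. It is a sketch under the assumption that the extra conditions are simple equality ("If sum_var = 20") and inequality ("If sum_var ^= 20") tests against a single number. The table name want_ext is hypothetical.

```sas
/* Sketch only: PG's query with two assumed extra operators, = and ^=.  */
proc sql;
  create table want_ext(drop=target keep) as   /* hypothetical table name */
  select *,
         /* ^ is added to the delimiter list so "^=" does not end up    */
         /* inside the third word                                       */
         input(scan(condition, 3, ' <>=^'), best.) as target,
         case
           /* Two-character operators must be tested first, because     */
           /* each of them also contains a one-character operator       */
           when index(condition, '<=') > 0 then sum_var <= calculated target
           when index(condition, '>=') > 0 then sum_var >= calculated target
           when index(condition, '^=') > 0 then sum_var ne calculated target
           when index(condition, '<')  > 0 then sum_var <  calculated target
           when index(condition, '>')  > 0 then sum_var >  calculated target
           when index(condition, '=')  > 0 then sum_var =  calculated target
           else 0
         end as keep
  from have
  where calculated keep;
quit;
```

Conditions that combine several tests (AND, OR, ranges) would need a different parsing approach than a single SCAN call.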