I am trying to convert the below BASE sas code into proc sql. Kindly let me know the best possible solution in PROC SQL to get the similar output.
data sample;
input by_va1 by_va2 col1 col2;
infile datalines;
datalines;
1 2 1 10
1 2 1 11
1 2 1 12
1 2 3 9
;
run;
data output;
set sample;
by by_va1 by_va2;
retain flag1 flag2 flag3 0;
if first.by_va1 then do;
flag1 = 0;
flag2 = 0;
flag3 = 0;
end;
if col1 = 1 and col2 = 9 then do;
flag1 = 1;
end;
if col1 = 1 and col2 = 13 then do;
flag2 = 1;
end;
if col1 = 1 and col2 = 14 then do;
flag3 = 1;
end;
if last.by_va1 then do;
output;
end;
run;
SQL does not have a concept of row order - in this respect is the exact opposite of datastep. You cannot "retain" data down. An example method would be. And note your test data wouldn't trigger the flags, so I changed it.
data sample;
input by_va1 by_va2 col1 col2;
infile datalines;
datalines;
1 2 1 9
1 2 1 11
1 2 1 13
1 2 3 9
;
run;
proc sql;
create table WANT as
select *,
case when sum(case when COL1=1 and COL2=9 then 1 else 0 end) > 0 then 1 else 0 end as FLAG1,
case when sum(case when COL1=1 and COL2=13 then 1 else 0 end) > 0 then 1 else 0 end as FLAG2,
case when sum(case when COL1=1 and COL2=14 then 1 else 0 end) > 0 then 1 else 0 end as FLAG3
from SAMPLE A;
quit;
The question is why, if you have working code go through the whole validation, documentation process, just to do it in SQL?
Thanks for the perfect answer. I am trying to convert BASE SAS code into SAS DI and the PROC SQL code will be executed over the ORACLE server. Hence trying for the best possible way.
To achieve a single output row like the original data step, a group by is necessary in the SQL:
create table WANT as
select by_va1, by_va2,
case when sum(case when COL1=1 and COL2=9 then 1 else 0 end) > 0 then 1 else 0 end as FLAG1,
case when sum(case when COL1=1 and COL2=13 then 1 else 0 end) > 0 then 1 else 0 end as FLAG2,
case when sum(case when COL1=1 and COL2=14 then 1 else 0 end) > 0 then 1 else 0 end as FLAG3
from SAMPLE A
group by by_va1, by_va2;
In general you cannot since PROC SQL doesn't know anything about order.
But your example seems to just be using RETAIN to generate a group level condition.
It is easy to do this with binary variables since to test if any observation met the condition you can just take the MAX() over the whole group.
proc sql ;
create table output as
select by_va1
, by_va2
, max(col1 = 1 and col2 = 9) as flag1
, max(col1 = 1 and col2 = 13) as flag2
, max(col1 = 1 and col2 = 14) as flag3
from sample
group by by_va1, by_va2
;
quit;
If you need to translate to ORACLE code you can replace the boolean expressions with equivalent CASE clause instead.
, max(case when (col1 = 1 and col2 = 9) then 1 else 0 end) as flag1
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.