How to obtain the right table as output when the left table is given as input. I want to group by prod and geo columns and fetch the first not null value for all the rules columns. Is there any way to do this using SAS Programming or Proc SQL query?
This step should be closer to your expectations, as @Patrick already said: if you want tested code, provide data in usable form.
data want;
set have;
by prod geo;
array r[4] _temporary_;
array rules rule_1 - rule_4;
if first.geo then do;
call missing(of r[*]);
end;
do i = 1 to dim(rules);
r[i] = coalesce(r[i], rules[i]);
end;
if last.geo then do;
do i = 1 to dim(rules);
rules[i] = r[i];
end;
output;
end;
drop i;
run;
In the data you have show, the value of geo is constant for one value of prod, if this is not always the case, the result may differ from your expectations.
Untested:
proc sql;
create table want as
select prod, geo, max(rule_1) as rule_1, max(rule_2) as rule_2 ....
from have
group by prod, geo;
quit;
This step should be closer to your expectations, as @Patrick already said: if you want tested code, provide data in usable form.
data want;
set have;
by prod geo;
array r[4] _temporary_;
array rules rule_1 - rule_4;
if first.geo then do;
call missing(of r[*]);
end;
do i = 1 to dim(rules);
r[i] = coalesce(r[i], rules[i]);
end;
if last.geo then do;
do i = 1 to dim(rules);
rules[i] = r[i];
end;
output;
end;
drop i;
run;
In the data you have show, the value of geo is constant for one value of prod, if this is not always the case, the result may differ from your expectations.
@Vivi1997 If you're after tested code as answer then please provide sample data in the form of SAS data steps that create this data.
Ideally also provide the desired result based on the logic explained and the source sample data provided.
/*Plz post the data step code, Nobody would like to type it for you !*/
data have;
input prod geo r1-r4;
cards;
301 3001 . 300 . 600
301 3001 100 . 500 700
301 3001 200 400 . .
302 3002 10 . 50 .
302 3002 . 20 . 80
302 3002 . 30 40 70
;
data temp;
set have;
by prod geo ;
if first.geo then n=0;
n+1;
run;
proc sort data=temp;by prod geo descending n;
data want(drop=n);
update temp(obs=0) temp;
by prod geo ;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.