Can someone please help me write code that says if the grouper field is populated then Sum paid amount based on ID number.
Here's the data set that I have.
Data Have
OBS | ID | Grouper | Paid |
1 | 111 | 100 | |
2 | 111 | 0 | 100 |
3 | 111 | 100 | |
4 | 122 | 100 | |
5 | 122 | 7 | 100 |
6 | 123 | 1 | 100 |
7 | 125 | 100 | |
8 | 125 | 100 | |
9 | 125 | 100 | |
10 | 125 | 100 |
Data Want
OBS | ID | Grouper | Paid |
1 | 111 | 0 | 300 |
2 | 122 | 7 | 200 |
3 | 123 | 1 | 100 |
4 | 125 | 100 | |
5 | 125 | 100 | |
6 | 125 | 100 | |
7 | 125 | 100 |
@Alexjaco1 wrote:
Can someone please help me write code that says if the grouper field is populated then Sum paid amount based on ID number.
Here's the data set that I have.
Data Have
OBS ID Grouper Paid 1 111 100 2 111 0 100 3 111 100 4 122 100 5 122 7 100 6 123 1 100 7 125 100 8 125 100 9 125 100 10 125 100
Data Want
OBS ID Grouper Paid 1 111 0 300 2 122 7 200 3 123 1 100 4 125 100 5 125 100 6 125 100 7 125 100
If your requirement is "sum if grouper field is populated " how does your want have anything for ID 125 since none of the example data in either example has any value for the "grouper" variable? Your rule needs some additional clarification.
Sum if grouper field is greater than or equal to 0, If grouper field is missing then keep that ID as is i.e. not rolled up to the ID Var.
data have;
input OBS ID Grouper Paid;
drop obs;
cards;
1 111 . 100
2 111 0 100
3 111 . 100
4 122 . 100
5 122 7 100
6 123 1 100
7 125 . 100
8 125 . 100
9 125 . 100
10 125 . 100
;
data want;
do until(last.id);
set have;
by id;
if Grouper ne . then do;
_k=1;
_g=Grouper;
end;
end;
do until(last.id);
set have;
by id;
if _k then sum=sum(paid,sum);
else do; sum=paid;output;end;
end;
if _k then do;
Grouper=_g;
output;
end;
drop _: paid;
run;
Please do me a favor. Post a good representative of what you HAVE and your WANT. Take your time, get the requirements in full and I will give you the solution in full. Thank you!
I Apologize, I thought I could figure it out on my own...
Hey @Alexjaco1 Nothing to be apologize or formal about. I suggested the above just to save each other time. That's all
EDIT: Plus the fact, I consider all this as playing video game and so I am not a person who fits into advisory. Pardon me.
Well, if you are still up to the challenge here is what I have
OBS | ID | Grouper | Paid | VAR 2 | VAR 3 |
1 | 111 | 100 | 150 | 200 | |
2 | 111 | 0 | 100 | 150 | 200 |
3 | 111 | 100 | 150 | 200 | |
4 | 122 | 100 | 150 | 200 | |
5 | 122 | 7 | 100 | 150 | 200 |
6 | 123 | 1 | 100 | 150 | 200 |
7 | 125 | 100 | 150 | 200 | |
8 | 125 | 100 | 150 | 200 | |
9 | 125 | 100 | 150 | 200 | |
10 | 125 | 100 | 150 | 200 |
Here is what I want
OBS | ID | Grouper | Paid | VAR 2 | VAR 3 |
1 | 111 | 0 | 300 | 450 | 600 |
2 | 122 | 7 | 200 | 300 | 400 |
3 | 123 | 1 | 100 | 150 | 200 |
4 | 125 | 100 | 150 | 200 | |
5 | 125 | 100 | 150 | 200 | |
6 | 125 | 100 | 150 | 200 | |
7 | 125 | 100 | 150 | 200 |
is that final one? lol 🙂
Yes sir.
Brb sir. and thank you. Need caffeine and will jump back from starbucks
Hey @Alexjaco1 Please find the new that handles the updated HAVE. This may look a little more complicated, so I will come back to you with a much simpler one using proc means and append shortly after I finish my coffee.
data have;
input OBS ID Grouper Paid VAR2 VAR3;
drop obs;
cards;
1 111 . 100 150 200
2 111 0 100 150 200
3 111 . 100 150 200
4 122 . 100 150 200
5 122 7 100 150 200
6 123 1 100 150 200
7 125 . 100 150 200
8 125 . 100 150 200
9 125 . 100 150 200
10 125 . 100 150 200
;
data want;
do until(last.id);
set have;
by id;
if Grouper ne . then do;
_k=1;
_g=Grouper;
end;
end;
do until(last.id);
set have;
by id;
array p(*) paid--var3;
array s(3);
if _k then do;
do _i=1 to dim(p);
s(_i)=sum(s(_i),p(_i));
end;
end;
else do;
do _i=1 to dim(p);
s(_i)=p(_i);
end;
output;
end;
end;
if _k then do;
Grouper=_g;
output;
end;
drop _: paid var:;
run;
Although long, but very easy, simple and convenient:
data have;
input OBS ID Grouper Paid VAR2 VAR3;
drop obs;
cards;
1 111 . 100 150 200
2 111 0 100 150 200
3 111 . 100 150 200
4 122 . 100 150 200
5 122 7 100 150 200
6 123 1 100 150 200
7 125 . 100 150 200
8 125 . 100 150 200
9 125 . 100 150 200
10 125 . 100 150 200
;
proc sql;
create table w as
select *
from have
where id in (select id from have where grouper ne .);
quit;
proc sql;
create table w1 as
select *
from have
where id in (select id from have group by id having max(grouper) eq .);
quit;
proc means data=w nway noprint;
class id;
var paid--var3;
output out=w2(drop=_:) sum=;
run;
data want;
retain id grouper paid var2 var3;
set w2 w1;
run;
data final_want;
merge want have(keep=id grouper where=(grouper ne .));
by id;
run;
Here's a simplified approach using SQL.
First generate a list of ID's that have a grouper value available. Note this assumes you'll have only a single value for grouper per ID. If that can vary, it needs to be modified.
1. Generate list of ID with grouper value
2. For list of ID from #1, summarize data
3. Union records where grouper is missing
4. Note that UNION ALL is used to handle duplicate data.
data have;
infile cards dlm=',' dsd truncover;
input OBS ID Grouper Paid VAR2 VAR3;
cards;
1, 111, , 100, 150, 200
2, 111, 0, 100, 150, 200
3, 111, , 100, 150, 200
4, 122, , 100, 150, 200
5, 122, 7, 100, 150, 200
6, 123, 1, 100, 150, 200
7, 125, , 100, 150, 200
8, 125, , 100, 150, 200
9, 125, , 100, 150, 200
10, 125, , 100, 150, 200
;;;;
run;
proc sql;
create table list1 as select ID from have where not missing(grouper);
quit;
proc sql;
create table want as
select ID, max(grouper) as grouper, sum(paid)as paid, sum(var2) as var2, sum(var3) as var3
from have
where id in (select id from list1)
group by id
union all
select *
from have (drop=obs) where id not in (select ID from list1);
quit;
proc print data=want;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.