BookmarkSubscribeRSS Feed
Alexjaco1
Calcite | Level 5

Can someone please help me write code that says if the grouper field is populated then Sum paid amount based on ID number.

Here's the data set that I have.

 

Data Have

OBS IDGrouperPaid
1111 100
21110100
3111 100
4122 100
51227100
61231100
7125 100
8125 100
9125 100
10125 100

 

Data Want

OBS IDGrouperPaid
11110300
21227200
31231100
4125 100
5125 100
6125 100
7125 100
17 REPLIES 17
ballardw
Super User

@Alexjaco1 wrote:

Can someone please help me write code that says if the grouper field is populated then Sum paid amount based on ID number.

Here's the data set that I have.

 

Data Have

OBS  ID Grouper Paid
1 111   100
2 111 0 100
3 111   100
4 122   100
5 122 7 100
6 123 1 100
7 125   100
8 125   100
9 125   100
10 125   100

 

Data Want

OBS  ID Grouper Paid
1 111 0 300
2 122 7 200
3 123 1 100
4 125   100
5 125   100
6 125   100
7 125   100

If your requirement is "sum if grouper field is populated " how does your want have anything for ID 125 since none of the example data in either example has any value for the "grouper" variable? Your rule needs some additional clarification.

Alexjaco1
Calcite | Level 5

Sum if grouper field is greater than or equal to 0, If grouper field is missing then keep that ID as is i.e. not rolled up to the ID Var.  

novinosrin
Tourmaline | Level 20
data have;
input OBS 	ID	Grouper	Paid;
drop obs;
cards;
1	111	 .	100
2	111	0	100
3	111	 .	100
4	122	 .	100
5	122	7	100
6	123	1	100
7	125	 .	100
8	125	 .	100
9	125	 .	100
10	125	 .	100
;

data want;
do until(last.id);
set have;
by id;
if Grouper ne . then do;
_k=1;
_g=Grouper;
end;
end;
do until(last.id);
set have;
by id;
if _k then sum=sum(paid,sum);
else do; sum=paid;output;end;
end;
if _k then do;
Grouper=_g;
output;
end;
drop _: paid;
run;
Alexjaco1
Calcite | Level 5
How would this code alter if I have a second and third Variable that needs to be summed up with Paid amount, let's call them Var2 and Var 3.
novinosrin
Tourmaline | Level 20

Please do me a favor. Post a good representative of what you HAVE and your WANT. Take your time, get the requirements in full and I will give you the solution in full.  Thank you!

Alexjaco1
Calcite | Level 5

I Apologize, I thought I could figure it out on my own...

novinosrin
Tourmaline | Level 20

Hey @Alexjaco1  Nothing to be apologize or formal about. I suggested the above just to save each other time. That's all

 

EDIT: Plus the fact, I consider all this as playing video game and so I am not a person who fits into advisory. Pardon me. 

Alexjaco1
Calcite | Level 5

Well, if you are still up to the challenge here is what I have

 

OBS IDGrouperPaidVAR 2 VAR 3
1111 100150200
21110100150200
3111 100150200
4122 100150200
51227100150200
61231100150200
7125 100150200
8125 100150200
9125 100150200
10125 100150200

 

Here is what I want

 

OBS IDGrouperPaidVAR 2 VAR 3
11110300450600
21227200300400
31231100150200
4125 100150200
5125 100150200
6125 100150200
7125 100150200
novinosrin
Tourmaline | Level 20

is that final one? lol 🙂 

Alexjaco1
Calcite | Level 5

Yes sir.

novinosrin
Tourmaline | Level 20

Brb sir. and thank you. Need caffeine and will jump back from starbucks

novinosrin
Tourmaline | Level 20

Hey @Alexjaco1  Please find the new that handles the updated HAVE. This may look a little more complicated, so I will come back to you with a much simpler one using proc means and append shortly after I finish my coffee.

 

data have;
input OBS 	ID	Grouper	Paid	VAR2 	VAR3;
drop obs;
cards;
1	111	 .	100	150	200
2	111	0	100	150	200
3	111	 .	100	150	200
4	122	 .	100	150	200
5	122	7	100	150	200
6	123	1	100	150	200
7	125	 .	100	150	200
8	125	 .	100	150	200
9	125	 .	100	150	200
10	125	 .	100	150	200
;

data want;
do until(last.id);
	set have;
	by id;
	if Grouper ne . then do;
	_k=1;
	_g=Grouper;
	end;
end;
do until(last.id);
	set have;
	by id;
	array p(*) paid--var3;
	array s(3);
	if _k then do;
		do _i=1 to dim(p);
		s(_i)=sum(s(_i),p(_i));
		end;
	end;
	else do; 
		do _i=1 to dim(p);
		s(_i)=p(_i);
		end;
		output;
	end;
end;
if _k then do;
	Grouper=_g;
	output;
end;
drop _: paid var:;
run;
novinosrin
Tourmaline | Level 20

Although long, but very easy, simple and convenient:

 


data have;
input OBS 	ID	Grouper	Paid	VAR2 	VAR3;
drop obs;
cards;
1	111	 .	100	150	200
2	111	0	100	150	200
3	111	 .	100	150	200
4	122	 .	100	150	200
5	122	7	100	150	200
6	123	1	100	150	200
7	125	 .	100	150	200
8	125	 .	100	150	200
9	125	 .	100	150	200
10	125	 .	100	150	200
;

proc sql;
create table w as
select *
from have 
where id in (select id from have where grouper ne .);
quit;

proc sql;
create table w1 as
select *
from have 
where id in (select id from have group by id having max(grouper) eq .);
quit;

proc means data=w nway noprint;
class id;
var paid--var3;
output out=w2(drop=_:) sum=;
run;

data want;
retain id grouper paid var2 var3;
set w2 w1;
run;

data final_want;
merge want have(keep=id grouper where=(grouper ne .));
by id;
run;
Reeza
Super User

Here's a simplified approach using SQL. 

First generate a list of ID's that have a grouper value available. Note this assumes you'll have only a single value for grouper per ID. If that can vary, it needs to be modified. 

 

1. Generate list of ID with grouper value

2. For list of ID from #1, summarize data

3. Union records where grouper is missing

4. Note that UNION ALL is used to handle duplicate data.

 

 

data have;
infile cards dlm=',' dsd truncover;
input  OBS  ID	 Grouper	 Paid	 VAR2 	 VAR3;
cards;
1,	111,	 ,	100,	150,	200
2,	111,	0,	100,	150,	200
3,	111,	 ,	100,	150,	200
4,	122,	 ,	100,	150,	200
5,	122,	7,	100,	150,	200
6,	123,	1,	100,	150,	200
7,	125,	 ,	100,	150,	200
8,	125,	 ,	100,	150,	200
9,	125,	 ,	100,	150,	200
10,	125,	 ,	100,	150,	200
;;;;
run;

proc sql;
create table list1 as select ID from have where not missing(grouper);
quit;

proc sql;
create table want as
select ID, max(grouper) as grouper, sum(paid)as paid, sum(var2) as var2, sum(var3) as var3
from have
where id in (select id from list1)
group by id
union all
select *
from have (drop=obs) where id not in (select ID from list1);
quit;

proc print data=want;
run;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 17 replies
  • 1811 views
  • 2 likes
  • 5 in conversation