BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
bknitch
Quartz | Level 8

I have a large data set that has multiple tiers to a C_ID. The C_ID serves as the foundation of the detail, from there a P_ID is associated and finally a segment_id. I'm trying to find the weighted average PART_C_D_PREMIUM of each P_ID that is associated to the parent C_ID by rolling up the segment_id's premiums.

 

I need to calculate weights based on frequency or number of segment_id's associated to each p_id and c_id. Not sure if a weighted mean is possible using this methodology, or if anyone has a better approach. If its not possible, what is the best way to find the mean premium and rolling up the segment_id's so that its just c_id and p_id with the corresponding means for each premium. 


Example below of what I want my output to look like

C_IDP_IDparentorgplan_typeplan_namePART_C_D_PremiumMean
H05 067HM BC00
H15 006LPAM47,44,4746

 

Here is my data set 

 

data have;
	infile datalines dlm="09"x;
	input C_ID $ P_ID $ parentorg $ plan_type $ plan_name $ segment_id p_c_premium p_d_premium part_c_d_prem_combine;
    datalines;
    H05 	067	A 	HM 	BC	1	0	0	0
    H05 	067	A 	HM 	BC	2	0	0	0
    H10 	237	H 	HM 	HG	1	0	0	0
    H10 	237	H 	HM 	HG	2	0	0	0
    H15 	006	A 	LP	AM	1	0	47	47
    H15 	006	A 	LP	AM	2	0	44	44
    H15 	006	A 	LP	AM	4	0	47	47
    H19	    014	P 	HM 	PH	1	0	0	0
    H19	    014	P 	HM 	PH	2	25	0	25
    H20 	099	H 	HM 	HG	1	0	0	0
    H20 	099	H 	HM 	HG	2	0.6	22.4	23
    H20 	100	H 	HM 	HG	4	0	0	0
    H20 	100	H 	HM 	HG	5	19	0	19
    H39 	059	G 	LP	GG	3	36.4	33.6	70
    H39 	059	G 	LP	GG	4	35.9	39.1	75
    H39 	060	G 	LP	GG	3	0	0	0
    H39 	060	G 	LP	GG	4	0	0	0
    H39 	061	G 	LP	GG	1	122.8	56.2	179
    H39 	061	G 	LP	GG	2	102.8	56.2	159
    H39 	061	G 	LP	GG	5	62.8	56.2	119
	;
	run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

Just to make sure I am understanding this properly 

 

For example C_ID= H15 P_ID= 006 has three rows, with values 47, 44 and 47 and segment IDs of 1, 2 and 4, and the value you want is (47*1+44*2+47*4)/(1+2+4) = 46.14...

 

As I said, PROC SUMMARY makes this easy to do. Although your PROC SQL also gives this answer, PROC SQL will give an incorrect answer in the presence of missing values, while PROC SUMMARY does not have this problem.

 

proc summary data=have nway;
    class contract_id plan_id;
    var part_c_d_prem_combine /weight=segment_id1;
    output out=want mean=;
run;
--
Paige Miller

View solution in original post

4 REPLIES 4
PaigeMiller
Diamond | Level 26

Actually its not hard to do using PROC SUMMARY, but its not clear from your description what the weights are. The weights have to be a variable in your data set.

--
Paige Miller
bknitch
Quartz | Level 8
The weighted average, should be the frequency of segment_id's under a specific c_id and p_id combination. For example C_ID= H15 P_ID= 006 has 3 segment_id's (1,2,4) I think i figured it out using a proc sql; It seems to work accordingly once i built the variable of segment_id freq in.

proc sql;
title 'WEIGHTED AVERAGES';
select c_id, p_id,
sum(part_c_d_prem_combine * segment_id1)/sum(segment_id1) as weighted_average
from work.plan15
group by contract_id, plan_id;
run;

PaigeMiller
Diamond | Level 26

Just to make sure I am understanding this properly 

 

For example C_ID= H15 P_ID= 006 has three rows, with values 47, 44 and 47 and segment IDs of 1, 2 and 4, and the value you want is (47*1+44*2+47*4)/(1+2+4) = 46.14...

 

As I said, PROC SUMMARY makes this easy to do. Although your PROC SQL also gives this answer, PROC SQL will give an incorrect answer in the presence of missing values, while PROC SUMMARY does not have this problem.

 

proc summary data=have nway;
    class contract_id plan_id;
    var part_c_d_prem_combine /weight=segment_id1;
    output out=want mean=;
run;
--
Paige Miller
bknitch
Quartz | Level 8

@PaigeMiller  thanks for the information! 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1125 views
  • 2 likes
  • 2 in conversation