Greetings!
I am trying to output average of a variable within a conditional do loop.
I have a dataset with 5 variables visitID, msdrg, diagnosis, cost1 cost2
My do loop is that if if the msdrg and diagnosis is the same, then new cost=average of all the cost (within that same msdrg and diagnosis).
My program is the following:
data want;
set have;
by DiagnosisDSC MSDRGWeightNBR;
if first.MSDRGWeightNBR and first.DiagnosisDSC then do;
new_cost=mean(cost1);
end;
run;
I know there is no "mean" function within a data statement and I will have to proc means but how do I do it within an ongoing do loop.
The above program returns the new_cost as the first value of cost1 for all the records where the msdrg and diagnosis is the same and blank for the others.
Whereas I want to have the average of all the cost1 (5 or 6 or however many observations) where msdrg and diagnosis are the same.
Thanks so much!
The MEAN(,) function takes the mean of a list of values. So MEAN(1,2,3) would return 2. You cannot use it to get the mean over a series of observations.
You should just use PROC MEANS and re-merge.
proc means noprint nway data=have;
by DiagnosisDSC MSDRGWeightNBR;
var cost1 ;
output out=means(drop=_type_ _freq_) mean=new_cost;
run;
data want;
merge have means;
by DiagnosisDSC MSDRGWeightNBR;
run;
Or you could use PROC SQL and let it do the re-merge for you. Note that the MEAN() aggregate function in the SQL language is a totally different function than the MEAN(,...) function of the SAS language.
proc sql ;
create table want as
select *, mean(cost1) as new_cost
from have
group by DiagnosisDSC , MSDRGWeightNBR
;
quit;
Thanks for reply. My program is either giving me the first cost value for the same msdrg and diagnosis and giving me blanks for the rest of the costs for the same msdrg and diagnosis.
For example:
MSDRG DIAGNOSIS COST1 NEW_COST
1.8 fracture 6801 6801
1.8 fracture 6801 ...
1.8 fracture 6801 ...
1.8 fracture 6801 ...
1.8 fracture 6801 ...
And the same for other same msdrg and diagnosis.
While I want the result to be 6801/5..
Does it make sense?
I am sorry. Here is my dataset "Have"
Obs MSDRG DIAGNOSIS COST1
1 1.8 fracture 1000
2 1.8 fracture 2000
3 1.8 fracture 3000
4 1.8 fracture 4000
5 1.8 fracture 5000
6 3.8 diabetes 1000
7 3.8 diabetes 2000
8 3.8 diabetes 3000
9 3.8 diabetes 4000
10 3.8 diabetes 5000
Data want:
Obs MSDRG DIAGNOSIS COST1 Cost_new(average)
1 1.8 fracture 1000 3000 (average of same drg and diagnosis(1000+2000+3000+4000+5000)/5
2 1.8 fracture 2000
3 1.8 fracture 3000
4 1.8 fracture 4000
5 1.8 fracture 5000
6 3.8 diabetes 5000 3800(average of same drg and diagnosis(5000+2000+3000+4000+5000)/5
7 3.8 diabetes 2000
8 3.8 diabetes 3000
9 3.8 diabetes 4000
10 3.8 diabetes 5000
Code tried using:
data want;
set margin_match;
by DiagnosisDSC MSDRGWeightNBR;
if first.MSDRGWeightNBR and first.DiagnosisDSC then do;
DirectMarginAMT_bwh=mean(DirectMarginAMT);
end;
run;
But did not work. as mean is used in proc means...
No warning or error in log but the output is this:
Obs MSDRG DIAGNOSIS COST1 cost_new(average)
1 1.8 fracture 1000 1000
2 1.8 fracture 2000 ......
3 1.8 fracture 3000 ......
4 1.8 fracture 4000 ......
5 1.8 fracture 5000 ......
6 3.8 diabetes 1000 1000
7 3.8 diabetes 2000 ......
8 3.8 diabetes 3000 .......
9 3.8 diabetes 4000 ......
10 3.8 diabetes 5000 .....
Thanks!
The MEAN(,) function takes the mean of a list of values. So MEAN(1,2,3) would return 2. You cannot use it to get the mean over a series of observations.
You should just use PROC MEANS and re-merge.
proc means noprint nway data=have;
by DiagnosisDSC MSDRGWeightNBR;
var cost1 ;
output out=means(drop=_type_ _freq_) mean=new_cost;
run;
data want;
merge have means;
by DiagnosisDSC MSDRGWeightNBR;
run;
Or you could use PROC SQL and let it do the re-merge for you. Note that the MEAN() aggregate function in the SQL language is a totally different function than the MEAN(,...) function of the SAS language.
proc sql ;
create table want as
select *, mean(cost1) as new_cost
from have
group by DiagnosisDSC , MSDRGWeightNBR
;
quit;
Thank you so much. I tried proc sql and it worked!!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.