Hi all,
I am new to SAS programming, and I want to calculate the composite mean score of these 4 variables by pid. I am able to obtain the mean of each variable; below is a screenshot of my code:
Thanks for the help!
You will have to create a single variable for Proc Means/summary to get a "composite" statistic.
One way, which is a manual transpose of the data to a different form:
data want;
   set bldata_fmt;
   array p (*) promis_pa_scale1 promis_pa_scale3 promis_pa_scale4 promis_pa_scale5;
   do i=1 to dim(p);
      name=vname(p[i]);
      value = p[i];
      output;
   end;
   keep pid name value;
run;
proc means data=want;
   var value;
   class pid;
run;
The data step creates 4 records from each one in the original data set with the name of the original variable and the value of that variable in the new variable named Value. If you have not used Arrays they are a way to create temporary short cuts to reference multiple variables, typically to do similar operations on all of them. The VNAME function returns the name of a variable and was done for reference.
Then run statistics on Value.
You could run the data with class statement using PID and NAME to verify that you get the same result as your initial Proc Means output.
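A minimal sketch of that check, assuming the WANT data set created by the data step above. Each PID and NAME combination should show the same mean as the per-variable output from the original Proc Means:

```sas
/* One row of statistics per PID and original variable name. */
/* Compare these means with the per-variable means from the  */
/* Proc Means run on the original data.                      */
proc means data=want;
   class pid name;
   var value;
run;
```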
Warning: if you ever use this technique with the WEIGHT or FREQ statements, remember that each original record now appears 4 times, so the total of the weights or frequencies is multiplied accordingly and you need to adjust them.
For future questions: Please post code or log entries as TEXT. Copy the text from the editor or log, open a text box on the forum by clicking on the </> icon above the message window and paste the text.
Retyping significant amounts of text is a burden. The more we have to type manually, the more likely we are to introduce spelling errors. If your code only needs a small change, it is much easier for us to copy, paste, and edit it. If your code is long, many of us aren't going to retype it, which may delay a usable answer. Picture resolution can also make the text extremely difficult to read, depending on the monitor.
Here's an approach that is similar to how you began. However, there is something to be careful of; see the warning at the end:
proc summary data=BLdata_fmt nway;
class pid;
var promis_pa_scale1 promis_pa_scale3 promis_pa_scale4 promis_pa_scale5;
output out=totals (drop=_type_) sum=;
run;
This gives you a summary data set named TOTALS with one observation per PID, with these variables:
PID (obviously)
_freq_ = the number of original observations for that PID
promis_pa_scale1 = total of all promis_pa_scale1 values for that PID
promis_pa_scale3 = total of all promis_pa_scale3 values for that PID
promis_pa_scale4 = total of all promis_pa_scale4 values for that PID
promis_pa_scale5 = total of all promis_pa_scale5 values for that PID
From these variables, you should easily be able to calculate any sort of mean you want (but see warning below). For example, you might use:
composite_mean = (promis_pa_scale1 + promis_pa_scale3 + promis_pa_scale4 + promis_pa_scale5) / (4 * _freq_);
HOWEVER ... this approach is WRONG if your data contains any missing values. The number of observations for the PID would not be the correct denominator in that case, and the formula would become overly complex. But if you know you have no missing values, this is easy to understand and follow.
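If missing values are possible, one way to keep the denominator honest is to ask PROC SUMMARY for the N statistic (the count of nonmissing values) alongside each SUM, and divide by the total of those counts instead of 4 * _freq_. The following is only a sketch under assumptions: the output data set name COMPOSITE and the variable TOTAL_N are hypothetical, and the _Sum / _N suffixes rely on the AUTONAME option naming the output variables as variable_statistic.

```sas
/* Request both the sum and the nonmissing count for each variable. */
/* AUTONAME names the results e.g. promis_pa_scale1_Sum and         */
/* promis_pa_scale1_N.                                              */
proc summary data=BLdata_fmt nway;
   class pid;
   var promis_pa_scale1 promis_pa_scale3 promis_pa_scale4 promis_pa_scale5;
   output out=totals (drop=_type_ _freq_) sum= n= / autoname;
run;

/* COMPOSITE and TOTAL_N are hypothetical names for illustration. */
data composite;
   set totals;
   /* Total number of nonmissing values across the 4 variables. */
   total_n = sum(of promis_pa_scale1_N promis_pa_scale3_N
                    promis_pa_scale4_N promis_pa_scale5_N);
   /* The SUM function ignores missing totals; skip PIDs with no data. */
   if total_n > 0 then
      composite_mean = sum(of promis_pa_scale1_Sum promis_pa_scale3_Sum
                              promis_pa_scale4_Sum promis_pa_scale5_Sum) / total_n;
run;
```

With no missing values, total_n equals 4 * _freq_, so this reduces to the simpler formula above.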