Hello,
I know that with PROC MEANS I can find summary statistics for my data, such as the mean and N per variable. However, how can I then use these summary statistics in my data step, so that I can do something like find the distance of each observation from the mean, and then finally add up these distances?
I guess that sometimes I want to operate at the per-observation level, and sometimes at the aggregate level, and I'm not quite sure what the approach is for switching back and forth between the two. Hope I'm making some sense.
Is this example helpful?
proc sql noprint;
/* store the overall mean height in macro variable &mn */
select mean(height) into :mn
from sashelp.class;
quit;
data class;
set sashelp.class;
mean_height=&mn;
diff=height-&mn;
run;
proc print;run;
Obs  Name     Sex  Age  Height  Weight  mean_height      diff
  1  Alfred   M     14    69.0   112.5      62.3368    6.6632
  2  Alice    F     13    56.5    84.0      62.3368   -5.8368
  3  Barbara  F     13    65.3    98.0      62.3368    2.9632
  4  Carol    F     14    62.8   102.5      62.3368    0.4632
  5  Henry    M     14    63.5   102.5      62.3368    1.1632
  6  James    M     12    57.3    83.0      62.3368   -5.0368
  7  Jane     F     12    59.8    84.5      62.3368   -2.5368
  8  Janet    F     15    62.5   112.5      62.3368    0.1632
  9  Jeffrey  M     13    62.5    84.0      62.3368    0.1632
 10  John     M     12    59.0    99.5      62.3368   -3.3368
 11  Joyce    F     11    51.3    50.5      62.3368  -11.0368
 12  Judy     F     14    64.3    90.0      62.3368    1.9632
 13  Louise   F     12    56.3    77.0      62.3368   -6.0368
 14  Mary     F     15    66.5   112.0      62.3368    4.1632
 15  Philip   M     16    72.0   150.0      62.3368    9.6632
 16  Robert   M     12    64.8   128.0      62.3368    2.4632
 17  Ronald   M     15    67.0   133.0      62.3368    4.6632
 18  Thomas   M     11    57.5    85.0      62.3368   -4.8368
 19  William  M     15    66.5   112.0      62.3368    4.1632
Merge the statistics back into your data in a data step. Searching the forum will turn up many ways to do that. This is especially useful if you have statistics at a group level.
You can also look at some of the other statistics that PROC MEANS gives you, because they can be useful.
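For the group-level case, one possible sketch is below. It assumes sashelp.class as the input and uses sex as the grouping variable; the data set names stat_by_sex, class_sorted and want_by_sex and the variable mean_height are made up for illustration.

```sas
/* Sketch: one mean per group, written to a small data set */
proc means data=sashelp.class noprint nway;
  class sex;
  var height;
  output out=stat_by_sex(drop=_type_ _freq_) mean=mean_height;
run;

/* MERGE with BY needs the detail data sorted by the group variable */
proc sort data=sashelp.class out=class_sorted;
  by sex;
run;

/* Each observation picks up the mean of its own group */
data want_by_sex;
  merge class_sorted stat_by_sex;
  by sex;
  diff = height - mean_height;
run;
```

The merge is one-to-many: each row of stat_by_sex is matched to every observation with the same value of sex.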
Or something like:
proc means;
..
output out=stat .....;
run;
data want;
set have;
if _n_ eq 1 then set stat ;
...........
Ksharp
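A filled-in version of that outline might look like the sketch below. It assumes sashelp.class as the input and height as the analysis variable; the names stat, want and mean_height are only illustrative.

```sas
/* Sketch: overall mean of HEIGHT written to a one-row data set */
proc means data=sashelp.class noprint;
  var height;
  output out=stat mean=mean_height;
run;

data want;
  set sashelp.class;
  /* This SET runs only on the first iteration. Variables read by SET
     are retained, so mean_height stays available on every row. */
  if _n_ = 1 then set stat(keep=mean_height);
  diff = height - mean_height;
run;
```

Compared with a macro variable, this keeps the statistic at full numeric precision, since it is never converted to text.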
You can add an ODS OUTPUT statement to collect the statistics into a new data set, which might be helpful to you.
ODS OUTPUT summary=summary_means; /* summary_means is the new dataset */
PROC MEANS DATA= <DATASET_NAME>;
VAR years_on_farm;
RUN;
ODS OUTPUT CLOSE;
Instead of MEANS, use SQL to calculate statistics and then load them into macro variables that you reference in the data step.
E.g.:
PROC SQL noprint;
select mean(age), mean(weight) into :AverageAge, :AverageWeight from your_data_set;
quit;
Data your_data_set2;
set your_data_set;
Age_variance=Age - &AverageAge;
Weight_variance=weight - &AverageWeight;
run;
Forgive me if I need another cup of coffee on this one, but ...
If you compute the difference from the mean on each observation, then add up all the differences, doesn't the total have to be zero?
No, because of rounding/floating-point error.
Yes otherwise.
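If you want to check this yourself, here is a small sketch, again assuming sashelp.class and height. The table name dev and the column names are made up for illustration. It totals both the signed differences, which should come out at zero up to floating-point error, and the absolute differences, which is probably closer to what "adding up the distances" is meant to do.

```sas
/* Sketch: total of signed and absolute deviations from the mean height */
proc sql;
  /* PROC SQL remerges the summary statistic mean(height) onto each row */
  create table dev as
  select name, height, height - mean(height) as diff
  from sashelp.class;

  select sum(diff)      as sum_diff,
         sum(abs(diff)) as sum_abs_diff
  from dev;
quit;
```

The first step writes a note about remerging summary statistics to the log; that is expected here.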