I use the following code to calculate the mean and quartiles of var1 by class and add them to a dataset:
proc means data=mydata;
by class;
var var1;
output out=avg_value mean=average Q1=q25 Q3=q75;
run;
data new_data;
set mydata avg_value;
run;
What I get is a "stacked" dataset such as this:
Class | Var1 | _TYPE_ | _FREQ_ | average | q25 | q75 |
a | 2 | . | . | . | . | . |
a | 4 | . | . | . | . | . |
a | 3 | . | . | . | . | . |
a | 4 | . | . | . | . | . |
b | 9 | . | . | . | . | . |
b | 8 | . | . | . | . | . |
b | 6 | . | . | . | . | . |
b | 7 | . | . | . | . | . |
a | . | 0 | 4 | 3.25 | 2.75 | 4 |
b | . | 0 | 4 | 7.5 | 6.75 | 8.25 |
What I need is a dataset with those statistics repeated on every row of the corresponding class, such as this:
Class | Var1 | average | q25 | q75 |
a | 2 | 3.25 | 2.75 | 4 |
a | 4 | 3.25 | 2.75 | 4 |
a | 3 | 3.25 | 2.75 | 4 |
a | 4 | 3.25 | 2.75 | 4 |
b | 9 | 7.5 | 6.75 | 8.25 |
b | 8 | 7.5 | 6.75 | 8.25 |
b | 6 | 7.5 | 6.75 | 8.25 |
b | 7 | 7.5 | 6.75 | 8.25 |
Thank you.
Accepted Solutions
Try this:
data new_data;
merge mydata avg_value;
by class;
drop _type_ _freq_;
run;
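For comparison, here is a minimal sketch of the same one-to-many match written as a PROC SQL join. It assumes the dataset and variable names from the question (mydata, avg_value, class, average, q25, q75); the aliases m and s are only illustrative. Unlike the MERGE with a BY statement, the join does not need its inputs sorted by class, but it also does not promise to keep the original row order.

```sas
/* Sketch only: attach the per-class statistics to every detail row. */
/* Assumes avg_value holds one row per class, as PROC MEANS with a   */
/* BY statement produces.                                            */
proc sql;
    create table new_data as
    select m.*,          /* all original columns from mydata         */
           s.average,    /* per-class mean                           */
           s.q25,        /* per-class first quartile                 */
           s.q75         /* per-class third quartile                 */
    from mydata as m
         left join avg_value as s
         on m.class = s.class;
quit;
```

Selecting the statistics by name leaves _TYPE_ and _FREQ_ behind, so no DROP statement is needed.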
That said, why create this data set at all? What are you going to do next once you have it? Are you going to determine where each value of VAR1 falls relative to q25 and q75? Please explain. Very often this data set is not necessary, and there may be an easier way to get to the next step, if only we knew what that next step is.
Paige Miller
I want to identify various outliers (1.5*IQR, 3*IQR) among other things.
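As a rough sketch of what that could look like with the merged dataset, the step below flags values outside the usual quartile-based fences. It assumes new_data already carries q25 and q75 on every row; the names flagged, iqr, mild_outlier, and extreme_outlier are made up for illustration.

```sas
/* Sketch only: flag values outside the 1.5*IQR and 3*IQR fences. */
data flagged;
    set new_data;
    iqr = q75 - q25;   /* interquartile range of the row's class */

    /* Skip rows with missing inputs: SAS sorts missing values below  */
    /* every number, so they would otherwise be flagged as low.       */
    if not missing(var1) and not missing(iqr) then do;
        /* 1 if var1 lies more than 1.5*IQR outside the quartiles */
        mild_outlier    = (var1 < q25 - 1.5*iqr) or (var1 > q75 + 1.5*iqr);
        /* 1 if var1 lies more than 3*IQR outside the quartiles */
        extreme_outlier = (var1 < q25 - 3*iqr) or (var1 > q75 + 3*iqr);
    end;
run;
```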
@Vic3 wrote:
I want to identify various outliers (1.5*IQR, 3*IQR) among other things.
You don't need PROC MEANS or this data set NEW_DATA. SAS has a PROC that will do exactly what you want. This should work for you:
proc stdize data=mydata method=iqr out=want;
by class;
var var1;
run;
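One thing to keep in mind: the OUT= data set holds the standardized values in place of the original var1. A possible variation, sketched here under the assumption that you want both side by side, standardizes a copy of the variable instead. The names mydata2 and var1_std are hypothetical.

```sas
/* Sketch only: keep var1 and standardize a copy of it. */
data mydata2;
    set mydata;
    var1_std = var1;   /* copy that PROC STDIZE will overwrite */
run;

proc stdize data=mydata2 method=iqr out=want;
    by class;          /* mydata2 must be sorted by class */
    var var1_std;
run;
```

As far as I understand METHOD=IQR, it centers each class on its median and scales by the interquartile range, so var1_std measures the distance from the median in IQR units. That is close to, but not the same as, fences measured from Q1 and Q3, so check which definition of outlier you need before choosing cutoffs.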
Paige Miller