09-28-2012 02:37 PM
Need some advice here. I realized that the PROC MEANS command removes all other variables from my list. For instance, if I have variables A, B, C, D, E on my list, and I applied PROC MEANS on variable A, the rest of the variables (B,C,D,E) will be dropped from the output. This is the code I used:
PROC MEANS DATA=work.trial1 NOPRINT;
BY DATENEW DIST;
May I know, how can I write the code, such that Variables B, C, D, E show up in the output too.
Thank you guys so much for your help!
09-28-2012 02:53 PM
You can try the code below to see if you can get what you want:
proc sort data=trial2;by BY DATENEW DIST;run;
proc sort data=trial1;by BY DATENEW DIST;run;
by DATENEW DIST;
09-28-2012 03:19 PM
First, you have to realize that the output from PROC MEANS contains far fewer observations than you started with. So if you are looking to add the summarized results to your original data set, then Linlin has the right idea. However ...
If you are looking for the output from PROC MEANS to contain additional variables, which values do you want? SAS would have to choose B, C, D, and E from the values across many observations that all went into the calculations for that summary line. If you don't care which of those values become part of the output, the ID statement would do it for you:
ID B C D E;
There is a rhyme and reason as to which values SAS selects, but it's a little tricky to explain. And there are a couple of choices as well. Best to study up on the ID statement.
09-29-2012 02:12 PM
thanks for the advice. The code got me very close to what I need. I am hoping that you could help me a little more.
The code that you provided drew out all the variables for all the observations from the input file. However. I am only looking for that row of variables observed for that observation.
E.g these are the following variables
Carmake Price Mileage Insurance
BMW $1000 200 $10
BMW $1200 80 $10
MERC $1500 50 $50
PROC SQL NOPRINT;
CREATE TABLE TRY2 AS SELECT MAX(PRICE) AS MAX_PRICE, FROM CARDATA GROUP BY CARMAKE;
and I was hoping that only the accompanying variables show up in the output. So for BMW, the variables that is accompanied by observation 2 will show up. (But now, the codes you provided, shows observation 1 as well- and maps the $1200 to it).
Hope I am not too confusing.
Thanks a lot Ksharp.
09-29-2012 02:45 PM
what you are asking is remerging summary statistics back to the original data.
Hope the following code helps to get the desired results.
create table test
select a.*,max(age) max_age_by_gender
from sashelp.class as a
group by sex
Obs Name Sex Age Height Weight max_age_by_gender
1 Carol F 14 62.8 102.5 15
2 Alice F 13 56.5 84.0 15
3 Joyce F 11 51.3 50.5 15
4 Louise F 12 56.3 77.0 15
5 Barbara F 13 65.3 98.0 15
6 Janet F 15 62.5 112.5 15
7 Judy F 14 64.3 90.0 15
8 Jane F 12 59.8 84.5 15
9 Mary F 15 66.5 112.0 15
10 Robert M 12 64.8 128.0 16
11 James M 12 57.3 83.0 16
12 Henry M 14 63.5 102.5 16
13 Alfred M 14 69.0 112.5 16
14 John M 12 59.0 99.5 16
15 Jeffrey M 13 62.5 84.0 16
16 Thomas M 11 57.5 85.0 16
17 Philip M 16 72.0 150.0 16
18 William M 15 66.5 112.0 16
19 Ronald M 15 67.0 133.0 16
09-30-2012 04:51 AM
thank you so much for your code! It gave me what I needed. I was wondering if you could help me a little more.
Basically, I have used your code to retrieve a set of data. Then I used the PROC RANK function to form deciles within dates.
DATA=work.data OUT=data1 Groups=10;
Then now I am trying to obtain the median within each decile at each date (and at the same time having all the variables associated with that median observation displayed in that output as well). I modified and used the following code:
CREATE TABLE data2 AS SELECT*
FROM work.data1 GROUP BY DATE,R_PRICE
However, the following error showed up,
WARNING: A GROUP BY clause has been transformed into an ORDER BY clause because neither the SELECT clause nor the optional HAVING clause of the associated table-expression referenced a summary function."
Thanks a lot Arthur!
09-30-2012 09:59 AM
Median is a a row-wise not an aggregate function in proc sql. Take a look at: http://www.sascommunity.org/wiki/Fun_with_PROC_SQL_Summary_Functions
If you need to use median values, calculate them with proc means or proc summary, and merge the results with your original data.
Need further help from the community? Please ask a new question.