BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Benn
Calcite | Level 5

Hi Guys,

Need some advice here. I realized that the PROC MEANS command removes all other variables from my list. For instance, if I have variables A, B, C, D, E on my list, and I applied PROC MEANS on variable A, the rest of the variables  (B,C,D,E) will be dropped from the output. This is the code I used:

PROC MEANS DATA=work.trial1 NOPRINT;

VAR RET;

OUTPUT OUT=trial2

N=N

MAX(RET)=MAX_RET;

BY DATENEW DIST;

RUN;

May I know, how can I write the code, such that Variables B, C, D, E show up in the output too.

Thank you guys so much for your help!

Cheers.

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

How about something like?:

proc sql;

  create table test as

    select *

      from sashelp.class

        group by sex

          having age=max(age)

  ;

quit;

View solution in original post

8 REPLIES 8
Linlin
Lapis Lazuli | Level 10

You can try the code below to see if you can get what you want:

proc sort data=trial2;by BY DATENEW DIST;run;
proc sort data=trial1;by BY DATENEW DIST;run;

data want;
  merge trial2(in=a)trial1;
  by DATENEW DIST;
if a;
run;

Astounding
PROC Star

Benn,

First, you have to realize that the output from PROC MEANS contains far fewer observations than you started with.  So if you are looking to add the summarized results to your original data set, then Linlin has the right idea.  However ...

If you are looking for the output from PROC MEANS to contain additional variables, which values do you want?  SAS would have to choose B, C, D, and E from the values across many observations that all went into the calculations for that summary line.  If you don't care which of those values become part of the output, the ID statement would do it for you:

ID B C D E;

There is a rhyme and reason as to which values SAS selects, but it's a little tricky to explain.  And there are a couple of choices as well.  Best to study up on the ID statement.

Good luck.

Ksharp
Super User

SQL is good for this scenario .

proc sql;

select mean(a) as mean_a,b,c,d

from have

group by DATENEW,DIST

;quit;

Ksharp

Benn
Calcite | Level 5

Hi Ksharp,

thanks for the advice. The code got me very close to what I need. I am hoping that you could help me a little more.

The code that you provided drew out all the variables for all the observations from the input file. However. I am only looking for that row of variables observed for that observation.

E.g these are the following variables

Carmake  Price   Mileage  Insurance

BMW       $1000    200          $10

BMW       $1200     80            $10

MERC      $1500     50          $50

PROC SQL NOPRINT;

CREATE TABLE TRY2 AS SELECT MAX(PRICE) AS MAX_PRICE, FROM CARDATA GROUP BY CARMAKE;

and I was hoping that only the accompanying variables show up in the output. So for BMW, the variables that is accompanied by observation 2 will show up. (But now, the codes you provided, shows observation 1 as well- and maps the $1200 to it).

Hope I am not too confusing.

Thanks a lot Ksharp.

jonam
Calcite | Level 5

Hi Benn,

what you are asking is remerging summary statistics back to the original data.

Hope the following code helps to get the desired results.

proc sql;

create table test

as

select a.*,max(age) max_age_by_gender

from sashelp.class as a

group by sex

;quit;

proc print;

run; 

Obs    Name       Sex    Age    Height    Weight    max_age_by_gender

                     1    Carol       F      14     62.8      102.5       15

                     2    Alice       F      13     56.5       84.0       15

                     3    Joyce       F      11     51.3       50.5       15

                     4    Louise      F      12     56.3       77.0       15

                     5    Barbara     F      13     65.3       98.0       15

                     6    Janet       F      15     62.5      112.5       15

                     7    Judy        F      14     64.3       90.0       15

                     8    Jane        F      12     59.8       84.5       15

                     9    Mary        F      15     66.5      112.0       15

                    10    Robert      M      12     64.8      128.0       16

                    11    James       M      12     57.3       83.0       16

                    12    Henry       M      14     63.5      102.5       16

                    13    Alfred      M      14     69.0      112.5       16

                    14    John        M      12     59.0       99.5       16

                    15    Jeffrey     M      13     62.5       84.0       16

                    16    Thomas      M      11     57.5       85.0       16

                    17    Philip      M      16     72.0      150.0       16

                    18    William     M      15     66.5      112.0       16

                    19    Ronald      M      15     67.0      133.0       16

art297
Opal | Level 21

How about something like?:

proc sql;

  create table test as

    select *

      from sashelp.class

        group by sex

          having age=max(age)

  ;

quit;

Benn
Calcite | Level 5

Hi Arthur,

thank you so much for your code! It gave me what I needed. I was wondering if you could help me a little more.

Basically, I have used your code to retrieve a set of data. Then I used the PROC RANK function to form deciles within dates.

PROC RANK

DATA=work.data OUT=data1 Groups=10;

BY DATE;

VAR PRICE;

RANKS R_PRICE;

RUN;

Then now I am trying to obtain the median within  each decile at each date (and at the same time having all the variables associated with that median observation displayed in that output as well). I modified and used the following code:

PROC SQL;

CREATE TABLE data2 AS SELECT*

FROM work.data1 GROUP BY DATE,R_PRICE

HAVING PRICE=MEDIAN(PRICE);

QUIT;

However, the following error showed up,

"

WARNING: A GROUP BY clause has been transformed into an ORDER BY clause because neither the SELECT clause nor the optional HAVING clause of the associated table-expression referenced a summary function."

Thanks a lot Arthur!

art297
Opal | Level 21

Median is a a row-wise not an aggregate function in proc sql.  Take a look at: http://www.sascommunity.org/wiki/Fun_with_PROC_SQL_Summary_Functions

If you need to use median values, calculate them with proc means or proc summary, and merge the results with your original data.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 3893 views
  • 11 likes
  • 6 in conversation