BookmarkSubscribeRSS Feed
kajal_30
Quartz | Level 8

Hi All,

 

I have below data set (have) and need output like below(need) :

TIA 

 

DATA HAVE;
INPUT ID PRICE YEARMONTH;
CARDS;
1 23 2312
1 24 2401
1 25 2402
1 36 2403
2 45 2312
2 34 2401
2 53 2402
2 23 2403
2 664 2404
;
RUN;

 


DATA need;
INPUT ID PRICE YEARMONTH;
CARDS;
1 36 2403
2 664 2404
;
RUN;

PROC SQL;
SELECT * FROM HAVE 
GROUP BY ID ,YEARMONTH
HAVING MAX(YEARMONTH) ;
QUIT;

 

6 REPLIES 6
PaigeMiller
Diamond | Level 26

PROC SUMMARY is the tool you want to use, not PROC SQL. See Maxim 7 and Maxim 14 and Maxim 10.

 

proc summary data=have nway;
    class id;
    var price;
    output out=want max= maxid(price(yearmonth))=yearmonth;
run;

 

As a general piece of advice, you want to compute statistics in the proper PROC, in this case PROC SUMMARY but in other cases PROC FREQ. Using these PROCs will provide you with better and more powerful tools (that take less time to program and less time to execute) than PROC SQL. This is particularly useful if you have lots of variables, or lots of slices (here the only slice is by ID). The idea of doing your statistics in PROC SQL is an idea which you should avoid. For small problems like this, probably there is not really a noticeable difference between PROC SQL and PROC SUMMARY, but for larger real world problems, the difference can be dramatic (in favor of PROC SUMMARY or PROC FREQ).

--
Paige Miller
kajal_30
Quartz | Level 8

can I please get in terms of proc sql as I have many more columns to add along with the join .

 

Regards

kajal

PaigeMiller
Diamond | Level 26

Many more columns is a situation that is handled much easier in PROC SUMMARY

--
Paige Miller
ballardw
Super User

To demonstrate a bit more complex example of @PaigeMiller's advice please run the following code. You should have the SASHELP.CLASS data set provided by SAS for training and practice.

Proc summary data=sashelp.class;
    class sex age;
    var height weight;
    output out=work.class_summary min= max= mean= std= /autoname;
run;

The output data set contains multiple summaries of the variables height and weight with the requested statistics for height and weight. Note the _type_ variable provides information about combinations of the class variables. So we get a summary overall ( type=0), by age only (type=1), by sex only (type=2) and Sex crossed with age categories (type=3 ).

To get this in Proc SQL you would have to write 4 separate Proc SQL selects with different group by . Also the handy option AUTONAME, which appends the statistic to the variable name, means you don't have to expressly provide a written out name for each stat replacing multiple

mean(var) as var_mean

, max(var) as var_max

etc.

 

One project I have requires reporting on pretty much every crossing or single level of 7 different class variables plus the overall stats for 30 or so report variables. ONE call to proc summary creates all of the values in one data set. Select based on the value of _TYPE_ for specific report tables. About 2 pages (without the comments) of code generates about 4,000 pages of documents for the user to reference as needed. 

kajal_30
Quartz | Level 8

I also need to do a join so I am not able to use the join with the proc that's why It will easier for me to use Proc sql

 

Thanks 

PaigeMiller
Diamond | Level 26

You can  run a join on the output of PROC SUMMARY

--
Paige Miller

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 520 views
  • 0 likes
  • 3 in conversation