BookmarkSubscribeRSS Feed
Dor
Calcite | Level 5 Dor
Calcite | Level 5

Hello.

 

My question is how to create a macro/loop that prints the 20 observations with highest value for each variable in a dataset.

Obviously this can be done for each variable separately using sort and print, but seems to me that there should be a more elegant way.

 

Thanks!

5 REPLIES 5
andreas_lds
PROC Star

Using macro makes code hardly more elegant, in most cases macro code is not necessary at all, but seems to be easier than trying to understand basic group-procession. Could you please post an example dataset and explain what you exactly expect as result?

Dor
Calcite | Level 5 Dor
Calcite | Level 5

Sure.

 

For example, a dataset with two variables, age and income, with 100 observations.

 

DATA Example;
    INPUT age income;
   DATALINES;
22 1000
23 1200
…
;

I want to print the 20 observations with highest values, in this example highest, 2 outputs once for age and then for income, each time with the 20 highest values in each.

 

Hope this makes it clear, and thanks.

andreas_lds
PROC Star

Well, not that clear ... the following steps list the top 5 values of the numeric vars in sashelp.class:

proc summary data=sashelp.class;
   output out=top5(drop=_type_ _freq_) idgroup(max(Age) max(Weight) max(Height) out[5](Age Weight Height)=);
run;

proc transpose data=work.top5 out=work.transposed;
run;

data work.optimized;
   set work.transposed(rename=(_name_ = varName col1=value));
   
   length position 8;
   label varName = " ";

   position = input(compress(varName,,'dk'), 2.);
   varName = scan(varName, 1, '_');
run;

proc sort data=work.optimized out=work.sorted;
   by varName position;
run;


options nobyline;

title "Top 5 of ""#byval1""";

proc print data=work.sorted noobs;
   by varName;
run;

title;
Shmuel
Garnet | Level 18

Which observation you want in case there is more than one with save value of a variable ?

Are variables all numeric? do you want to include alphabetic variables ?

How many variables do you have ?

 

You can use proc summary to create a temporary dataset with all max values, then subset those observations

with equal values to the temporary file variables, finaly select 20 observations. It may come out with less then 20 obs.

 

proc summary data=have missing noprint;
var <variable names separated by space, numeric only >;
output out=temp(drop=count _type_) max=;
run;

proc sql;
   create table want as
   select a.* from have as a
   left join temp as b on
       a.var1 = b.var1 or
       a.var2 = b.var2 or
       ..... /* up to last variable in temp */
; quit;

 

Rick_SAS
SAS Super FREQ

You can get this information from PROC UNIVARIATE. If you don't mind also getting the lowest values, use

proc univariate data=Sashelp.Cars nextrobs=20;
   ods select ExtremeObs;
run;

If you don't want the low values, write the ExtremeObs table to a data set and use PROC PRINT:

ods select none;
proc univariate data=Sashelp.Cars nextrobs=20;
   ods output ExtremeObs=XOBs;
run;
ods select all;

proc print data=XObs noobs;
  by VarName notsorted;
  var High HighObs;
run;

 

SAS INNOVATE 2024

Innovate_SAS_Blue.png

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Get the $99 certification deal.jpg

 

 

Back in the Classroom!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 341 views
  • 6 likes
  • 4 in conversation