My question is how to create a macro/loop that prints the 20 observations with highest value for each variable in a dataset.
Obviously this can be done for each variable separately using sort and print, but seems to me that there should be a more elegant way.
Using macro makes code hardly more elegant, in most cases macro code is not necessary at all, but seems to be easier than trying to understand basic group-procession. Could you please post an example dataset and explain what you exactly expect as result?
For example, a dataset with two variables, age and income, with 100 observations.
DATA Example; INPUT age income; DATALINES; 22 1000 23 1200 … ;
I want to print the 20 observations with highest values, in this example highest, 2 outputs once for age and then for income, each time with the 20 highest values in each.
Hope this makes it clear, and thanks.
Well, not that clear ... the following steps list the top 5 values of the numeric vars in sashelp.class:
proc summary data=sashelp.class; output out=top5(drop=_type_ _freq_) idgroup(max(Age) max(Weight) max(Height) out(Age Weight Height)=); run; proc transpose data=work.top5 out=work.transposed; run; data work.optimized; set work.transposed(rename=(_name_ = varName col1=value)); length position 8; label varName = " "; position = input(compress(varName,,'dk'), 2.); varName = scan(varName, 1, '_'); run; proc sort data=work.optimized out=work.sorted; by varName position; run; options nobyline; title "Top 5 of ""#byval1"""; proc print data=work.sorted noobs; by varName; run; title;
Which observation you want in case there is more than one with save value of a variable ?
Are variables all numeric? do you want to include alphabetic variables ?
How many variables do you have ?
You can use proc summary to create a temporary dataset with all max values, then subset those observations
with equal values to the temporary file variables, finaly select 20 observations. It may come out with less then 20 obs.
proc summary data=have missing noprint; var <variable names separated by space, numeric only >; output out=temp(drop=count _type_) max=; run; proc sql; create table want as select a.* from have as a left join temp as b on a.var1 = b.var1 or a.var2 = b.var2 or ..... /* up to last variable in temp */ ; quit;
You can get this information from PROC UNIVARIATE. If you don't mind also getting the lowest values, use
proc univariate data=Sashelp.Cars nextrobs=20; ods select ExtremeObs; run;
If you don't want the low values, write the ExtremeObs table to a data set and use PROC PRINT:
ods select none; proc univariate data=Sashelp.Cars nextrobs=20; ods output ExtremeObs=XOBs; run; ods select all; proc print data=XObs noobs; by VarName notsorted; var High HighObs; run;
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.