I want to select the top 10% of my data based on several variables (that is, several subsets, each based on one variable). My idea is the code below.
Is this the best way to do it, or is there a better one?
Thanks in advance!
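/* first define a macro that keeps the first 10% of rows of an already sorted dataset */
%macro top10pct(lib, dataset, var);
%local N_top10pct;
/* number of rows that make up 10% of the dataset, rounded up */
proc sql noprint;
select ceil(0.1*count(*)) into :N_top10pct
from &lib..&dataset
;
quit;
/* keep only the first N rows; the output dataset is named after the variable */
data &lib..&var;
set &lib..&dataset;
if _n_ <= &N_top10pct;
run;
%mend top10pct;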
/* top 10 % of var1: sort descending so the largest values come first */
proc sort data=mylib.mydata;
by descending var1;
run;
%top10pct(mylib, mydata, var1);
/* top 10 % of var2 */
proc sort data=mylib.mydata;
by descending var2;
run;
%top10pct(mylib, mydata, var2);
/* top 10 % of var3 */
proc sort data=mylib.mydata;
by descending var3;
run;
%top10pct(mylib, mydata, var3);