Junyong
Pyrite | Level 9

I want to use PROC UNIVARIATE PCTLPTS to find the 200/4 (50th), 200/5 (40th), 200/6 (33.33th), ..., 200/400 (0.5th) percentiles as follows.

data have;
do t=1 to 10;
do i=1 to 10000;
x=rannor(1);
output;
end;
end;
run;

proc iml;
i=200/(4:400);
call symputx("i",rowcat(char(i,16,14)+" "));
quit;

proc univariate noprint;
var x;
output pctlpre=x pctlpts=&i out=want;
run;

I then realized that PCTLPTS accepts at most two decimal places. For example, it cannot separately find the 200/150 (1.333th) and 200/151 (1.325th) percentiles because it treats both as 1.33. Are there any alternatives?

1 ACCEPTED SOLUTION

PaigeMiller
Diamond | Level 26

This may not work exactly as you want, since UNIVARIATE allows only two decimal places in the output percentile variable names. Maybe the PCTLNDEC= option fixes the problem, but I'll let you play with that.

However, here I create a macro variable &P in PROC SQL which does what you want.

data have;
do i=1 to 100000;
x=rannor(1);
output;
end;
run;

data pctlpts;
do i=4 to 400;
fraction = 200/i;
output;
end;
run;

proc sql noprint;
    select fraction into :p separated by ' ' from pctlpts;
quit;
%put &=p;

proc univariate noprint data=have;
var x;
output pctlpre=x pctlpts=&p out=want;
run;

 

--
Paige Miller
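
A note on precision, offered as a hedged sketch: as far as I know, PROC SQL writes numeric values into a macro variable using the BEST8. format unless told otherwise, so fractions such as 200/7 would be stored with roughly six decimal places. If more digits are wanted, an explicit FORMAT= on the selected column is one option. The BEST16. width below is an arbitrary choice for illustration, not something from the post above.

```sas
/* Assumes the PCTLPTS data set created above (variable FRACTION). */
proc sql noprint;
    /* FORMAT= controls how each number is written into the macro variable */
    select fraction format=best16. into :p separated by ' ' from pctlpts;
quit;
%put &=p;
```

The %PUT line makes it easy to check in the log how many digits actually ended up in &P.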


4 REPLIES
ballardw
Super User

I do not understand what the division you show means in terms of percentiles.

 

If you see the same output for 1.325 and 1.333 then you are doing something else, probably in your MACRO variable.

When I use:

proc univariate noprint data=have;
   var x;
   output pctlpre=x pctlpts=1.333, 1.325 out=want;
run;

The output variable names are x1_33 and x1_32 and the associated values are different. With a small data set multiple percentiles will often have the same values if the values in the data repeat. Consider a very small set:

data have2;
  do i=1 to 5;
     t=1;
     output;
     t=3;
     output;
  end;
run;

proc univariate  data=have2;
   var t;
   output pctlpre=t pctlpts=5 to 95 by 5 out=want;
run;

There are only two distinct values in the data set, so the percentiles are all 1 or 3, except where a percentile falls exactly on the tie between them and is reported as 2.

 

If you think that Univariate can't do the percentile you want then

1) sort the data by the variable of interest

2) run it through a data step

3) add a "percentile" position for each record.

data have3;
  do t= 1 to 13;
    output;
  end;
run;

proc sort data=have3;
  by t;
run;

data want;
  set have3 nobs=cnt;
  pct = 100*(_n_/cnt);
run;

So any "percentile" in the above that you request below 7.6923076923 is going to return 1.

For any given "percentile" you would look for the record with the first pct value greater than or equal to your value of interest.
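
The lookup described in that last step could be sketched as follows. This assumes the WANT data set built above (sorted by t, with pct on each record); the macro variable TARGET and the data set name LOOKUP are made up for illustration.

```sas
%let target = 5;   /* hypothetical percentile of interest */

data lookup;
   set want;
   /* keep the first record whose pct reaches the target, then stop reading */
   if pct >= &target then do;
      output;
      stop;
   end;
run;

proc print data=lookup;
run;
```

With the 13-record example, any target below 7.69 should return the record with t=1, matching the remark above.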

 

 

Junyong
Pyrite | Level 9

My previous example was incorrect—let me further simplify the problem.

data have;
do t=1 to 5;
do i=1 to 200000;
x=rannor(1);
output;
end;
end;
run;

proc univariate noprint;
var x;
output pctlpre=x pctlpts=1.001 1.002 out=want;
run;

If the data set is large enough, the 1.001th and 1.002th percentiles will differ, but SAS returns an error message instead of the second percentile. Is it impossible to calculate these with PROC UNIVARIATE?

Junyong
Pyrite | Level 9

PCTLNDEC was the solution—thanks a lot!
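
For later readers, a minimal sketch of how PCTLNDEC= might be applied to the simplified example above. PCTLNDEC= sets the number of decimal places used in the generated percentile variable names (the default is 2), so with three places the two requests should no longer map to the same name. The value 3 is an assumption chosen to fit 1.001 and 1.002; the exact call Junyong used is not shown in the thread.

```sas
proc univariate noprint data=have;
   var x;
   /* PCTLNDEC=3: keep three decimal places in the output variable names */
   output out=want pctlpre=x pctlpts=1.001 1.002 pctlndec=3;
run;

/* Inspect the generated variable names and values */
proc print data=want;
run;
```

The same option can be added to the OUTPUT statement that uses &p; for the 200/4 through 200/400 list, a larger value may be needed to keep every name distinct.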
