Statistical Procedures

Rick_SAS · Posted 07-12-2021 10:33 AM

Let me say it again: quantiles are statistics, which means that they estimate underlying parameters in the population (in this case, the quantiles of the population). The values you say are "correct" are merely the sample quantiles that R computes by default. There are many definitions of sample quantiles. None are more correct than the others.
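To make the point concrete, here is a minimal Python sketch (not part of the SAS program) that applies two of those definitions to the five-value sample posted later in this thread. It assumes the usual correspondence: R's default is type 7, and the default in PROC UNIVARIATE (PCTLDEF=5, the empirical distribution function with averaging) matches R's type 2. The function names are hypothetical and exist only for this illustration.

```python
import math

# Five-value sample posted later in this thread
sample = [19967.95, 19271.69, 16525.2, 6885.5, 3442.75]

def quantile_type7(values, p):
    """Type 7 (R default): linear interpolation between order statistics."""
    x = sorted(values)
    n = len(x)
    h = (n - 1) * p            # 0-based fractional position
    j = math.floor(h)
    g = h - j
    if j + 1 >= n:             # p == 1, or a one-observation sample
        return x[-1]
    return (1 - g) * x[j] + g * x[j + 1]

def quantile_type2(values, p):
    """Type 2 (assumed equal to PCTLDEF=5): empirical CDF with averaging."""
    x = sorted(values)
    n = len(x)
    h = n * p
    j = math.floor(h)          # 1-based index of the lower order statistic
    g = h - j
    if j == 0:
        return x[0]
    if j >= n:
        return x[-1]
    if g == 0:                 # exact comparison is a simplification of this sketch
        return (x[j - 1] + x[j]) / 2
    return x[j]

for p in (0.125, 0.25, 0.5):
    print(p, quantile_type2(sample, p), quantile_type7(sample, p))
```

Under these assumptions, the two definitions agree at the 25th and 50th percentiles for this sample but differ at the 12.5th percentile (3442.75 versus 5164.125). Neither value is wrong; they are different estimates.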

The following SAS/IML program simplifies the program in my blog post and computes only the TYPE=7 definition, which is the default in R. You can run this program to obtain the sample quantiles that you want.

/* Compute the sample quantiles that R computes by default */
proc iml;
/* Define function that returns the TYPE=7 sample quantiles. For more info, see   https://blogs.sas.com/content/iml/2017/05/24/definitions-sample-quantiles.html*/
start GetRQuantiles(y, probs);
   x = colvec(y);
   call sort(x);
   N = nrow(x);       /* assume all values are nonmissing */
   
   p = colvec(probs);
   m = 1-p;
   j = floor(N*p + m);
   g = N*p + m - j;

   q = j(nrow(p), 1, x[N]);    /* if p=1, return x[N] = max(x) */
   idx = loc(p < 1);
   if ncol(idx) >0 then do;
      j = j[idx]; g = g[idx];
      q[idx] = (1-g)#x[j] + g#x[j+1];
   end;   
   return q;
finish;

use Q; read all var "x"; close;       /* read sample into x */
p = {12.5, 25, 37.5, 50, 62.5, 75, 87.5, 100} / 100;  /* define probabilities */
q = GetRQuantiles(x, p);  /* sample quantiles */
print p q;

molla · Posted 07-12-2021 03:26 PM

I tried the code above, but I do not understand how to use the following part in SAS:

use Q; read all var "x"; close; /* read sample into x */
p = {12.5, 25, 37.5, 50, 62.5, 75, 87.5, 100} / 100; /* define probabilities */
q = GetRQuantiles(x, p); /* sample quantiles */
print p q;

Tom · Posted 07-12-2021 03:42 PM

So did you create a dataset named Q with a variable named X for the IML code to read in?

data q;
  input x;
cards;
19967.95
19271.69
16525.2
6885.5
3442.75
;



proc iml;
/* Define function that returns the TYPE=7 sample quantiles. For more info, see   https://blogs.sas.com/content/iml/2017/05/24/definitions-sample-quantiles.html*/
start GetRQuantiles(y, probs);
   x = colvec(y);
   call sort(x);
   N = nrow(x);       /* assume all values are nonmissing */
   
   p = colvec(probs);
   m = 1-p;
   j = floor(N*p + m);
   g = N*p + m - j;

   q = j(nrow(p), 1, x[N]);    /* if p=1, return x[N] = max(x) */
   idx = loc(p < 1);
   if ncol(idx) >0 then do;
      j = j[idx]; g = g[idx];
      q[idx] = (1-g)#x[j] + g#x[j+1];
   end;   
   return q;
finish;

use q; read all var "x"; close;       /* read sample into x */
p = {12.5, 25, 37.5, 50, 62.5, 75, 87.5, 100} / 100;  /* define probabilities */
q = GetRQuantiles(x, p);  /* sample quantiles */
print p q;

quit;

Output

molla · Posted 07-14-2021 11:21 AM

Hi,

I used the above logic to calculate quantiles for 5 data sets, and I wrote a loop to do the same for each one.
For the data sets that have only one observation, I am getting this error:

ERROR: (execution) Invalid subscript or subscript out of range

What does this error actually mean, and how can we resolve it?

Rick_SAS · Posted 07-14-2021 11:40 AM

For a degenerate sample that has one observation, the empirical CDF is a vertical line and all quantiles are equal to the value of the observation. For example, if the sample is {7}, then

min = 0th percentile = 7

10th percentile = 7

...

90th percentile = 7

max = 100th percentile = 7
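As for what the error means: a minimal Python sketch of the index arithmetic in the program above suggests the cause. For N=1 and any p < 1, the program computes j=1 and then reads x[j+1], which is a second element that a one-observation sample does not have. The helper names below are hypothetical, and the guard shown (return the maximum whenever j reaches N) is only one possible way to handle the degenerate case.

```python
import math

def type7_indices(n, p):
    """Return the 1-based index j and weight g, as computed in the SAS/IML program."""
    m = 1 - p
    j = math.floor(n * p + m)
    g = n * p + m - j
    return j, g

def needs_missing_element(n, p):
    """True when the interpolation formula would read x[j+1] beyond the sample."""
    j, _ = type7_indices(n, p)
    return p < 1 and j + 1 > n

def quantile_type7_guarded(values, p):
    """Type 7 estimate that never reads past the last order statistic."""
    x = sorted(values)
    n = len(x)
    j, g = type7_indices(n, p)
    if j >= n:                 # covers p == 1 and the one-observation sample
        return x[-1]
    # Convert the 1-based positions j and j+1 to 0-based list indices.
    return (1 - g) * x[j - 1] + g * x[j]

print(type7_indices(1, 0.5))          # j = 1, so x[j+1] would be x[2]
print(needs_missing_element(1, 0.5))  # the subscript is out of range
print(quantile_type7_guarded([7], 0.5))
```

With the guard in place, every quantile of the sample {7} is 7, which agrees with the values listed above.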

molla · Posted 07-14-2021 11:43 PM

I am still getting that error. Can’t we avoid it?

molla · Posted 07-12-2021 03:28 PM

As I understand it, that code is in R. How can we use the same in SAS?
Is there any other way to compute the quantiles and get accurate results?

Rick_SAS · Posted 07-12-2021 03:44 PM

I am trying to complete a project, so I will let others help you. Good luck!

molla · Posted 07-13-2021 01:33 AM

Thanks a lot

molla · Posted 07-12-2021 03:47 PM

I understood the logic and was also able to run the code. Thanks a lot!

molla · Posted 07-12-2021 03:48 PM

I just want to know: is there any other procedure that can calculate the quantiles?

Tom · Posted 07-12-2021 03:51 PM

As Rick stated, you can use the standard SAS procedures to calculate the standard estimates of quantiles.

If you want some special algorithm then you can write your own code to perform it, like the way Rick showed with the IML code to reproduce the numbers the R function you used was producing.

Rick_SAS · Posted 07-15-2021 06:17 AM

The OP has complained that the original program I posted did not handle the degenerate case of a sample that has only one observation. Here is the modification that handles N=1:

data q;
  input x;
cards;
19967.95
19271.69
16525.2
6885.5
3442.75
;


proc iml;
/* Define function that returns the TYPE=7 sample quantiles. For more info, see   https://blogs.sas.com/content/iml/2017/05/24/definitions-sample-quantiles.html*/
start GetRQuantiles(y, probs);
   x = colvec(y);
   call sort(x);
   N = nrow(x);       /* assume all values are nonmissing */
   
   p = colvec(probs);
   m = 1-p;
   j = floor(N*p + m);
   g = N*p + m - j;

   q = j(nrow(p), 1, x[N]);    /* if p=1, return x[N] = max(x) */
   if N=1 then return q;
   idx = loc(p < 1);
   if ncol(idx) >0 then do;
      j = j[idx]; g = g[idx];
      q[idx] = (1-g)#x[j] + g#x[j+1];
   end;   
   return q;
finish;

/* TEST the function on an example */
use q; read all var "x"; close;       /* read sample into x */
p = {12.5, 25, 37.5, 50, 62.5, 75, 87.5, 100} / 100;  /* define probabilities */
q = GetRQuantiles(x, p);              /* get type=7 sample quantiles */
print p q;

/* for a degenerate sample (N=1), all estimates are equal to x[1] */
q = GetRQuantiles(123.45, p);  
print p q;
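For readers who want an independent check of the first PRINT statement, here is a minimal sketch that uses Python's standard library. It assumes that `statistics.quantiles` with `method="inclusive"` applies the same interpolation as the TYPE=7 definition; the sample and probabilities are the ones used in the test above.

```python
import statistics

sample = [19967.95, 19271.69, 16525.2, 6885.5, 3442.75]

# n=8 requests the seven cut points at p = 1/8, 2/8, ..., 7/8.
octiles = statistics.quantiles(sample, n=8, method="inclusive")

# The function does not return the p=1 quantile, so append the maximum.
estimates = octiles + [max(sample)]
probs = [k / 8 for k in range(1, 9)]

for p, q in zip(probs, estimates):
    print(f"{p:5.3f}  {q:.3f}")
```

If the assumption holds, the printed estimates should match the q column from the SAS/IML program, starting with 5164.125 for p=0.125.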

Re: Percentiles with PROC UNIVARIATE