Rick_SAS
SAS Super FREQ

Let me say it again: quantiles are statistics, which means that they estimate underlying parameters in the population (in this case, the quantiles of the population).  The values you say are "correct" are merely the sample quantiles that R computes by default. There are many definitions of sample quantiles. None are more correct than the others.

 

The following SAS/IML program simplifies the program in my blog post and computes only the TYPE=7 definition, which is the default in R. You can run this program to obtain the sample quantiles that you want.

 

/* Compute the sample quantiles that R computes by default */
proc iml;
/* Define function that returns the TYPE=7 sample quantiles. For more info, see
   https://blogs.sas.com/content/iml/2017/05/24/definitions-sample-quantiles.html
*/
start GetRQuantiles(y, probs);
   x = colvec(y);
   call sort(x);
   N = nrow(x);       /* assume all values are nonmissing */

   p = colvec(probs);
   m = 1-p;
   j = floor(N*p + m);
   g = N*p + m - j;

   q = j(nrow(p), 1, x[N]);    /* initialize to x[N] = max(x), the result when p=1 */
   idx = loc(p < 1);
   if ncol(idx) >0 then do;
      j = j[idx]; g = g[idx];
      q[idx] = (1-g)#x[j] + g#x[j+1];
   end;
   return q;
finish;

use Q; read all var "x"; close;       /* read sample into x */
p = {12.5, 25, 37.5, 50, 62.5, 75, 87.5, 100} / 100;  /* define probabilities */
q = GetRQuantiles(x, p);  /* sample quantiles */
print p q;
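
For readers who prefer a formula, the function above can be read as the following rule. This is a transcription of the code, not an independent derivation; the symbols h and x_(j) (the j-th smallest value) are notation introduced here for illustration.

```latex
% Sorted sample: x_{(1)} \le x_{(2)} \le \dots \le x_{(N)}, probability 0 \le p < 1
\begin{align*}
h &= Np + (1 - p) = (N-1)\,p + 1, \\
j &= \lfloor h \rfloor, \qquad g = h - j, \\
\hat{q}(p) &= (1-g)\,x_{(j)} + g\,x_{(j+1)}, \\
\hat{q}(1) &= x_{(N)} .
\end{align*}
```

In words: the estimate is a linear interpolation between two adjacent order statistics, and p = 1 is handled separately by returning the maximum.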
molla
Fluorite | Level 6
I tried using the above code, but I don't understand how to use the code below in SAS:


use Q; read all var "x"; close; /* read sample into x */
p = {12.5, 25, 37.5, 50, 62.5, 75, 87.5, 100} / 100; /* define probabilities */
q = GetRQuantiles(x, p); /* sample quantiles */
print p q;
Tom
Super User

So did you create a dataset named Q with a variable named X for the IML code to read in?

 

data q;
  input x;
cards;
19967.95
19271.69
16525.2
6885.5
3442.75
;



proc iml;
/* Define function that returns the TYPE=7 sample quantiles. For more info, see   https://blogs.sas.com/content/iml/2017/05/24/definitions-sample-quantiles.html*/
start GetRQuantiles(y, probs);
   x = colvec(y);
   call sort(x);
   N = nrow(x);       /* assume all values are nonmissing */
   
   p = colvec(probs);
   m = 1-p;
   j = floor(N*p + m);
   g = N*p + m - j;

   q = j(nrow(p), 1, x[N]);    /* initialize to x[N] = max(x), the result when p=1 */
   idx = loc(p < 1);
   if ncol(idx) >0 then do;
      j = j[idx]; g = g[idx];
      q[idx] = (1-g)#x[j] + g#x[j+1];
   end;   
   return q;
finish;

use q; read all var "x"; close;       /* read sample into x */
p = {12.5, 25, 37.5, 50, 62.5, 75, 87.5, 100} / 100;  /* define probabilities */
q = GetRQuantiles(x, p);  /* sample quantiles */
print p q;

quit;

Output
image.png
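
Because the output survives only as an image, the values can be checked by hand. The following is a worked sketch, assuming the TYPE=7 rule as coded in GetRQuantiles with N = 5, so that the position is h = N*p + (1-p) = 4p + 1 and the estimate interpolates between the two sorted values on either side of h. The printed table may show these numbers with display rounding.

```latex
% Sorted sample: 3442.75,\ 6885.5,\ 16525.2,\ 19271.69,\ 19967.95
\begin{align*}
p = 0.125:&\quad h = 1.5, & q &= \tfrac12(3442.75 + 6885.5) = 5164.125 \\
p = 0.25:&\quad h = 2, & q &= 6885.5 \\
p = 0.375:&\quad h = 2.5, & q &= \tfrac12(6885.5 + 16525.2) = 11705.35 \\
p = 0.5:&\quad h = 3, & q &= 16525.2 \\
p = 0.625:&\quad h = 3.5, & q &= \tfrac12(16525.2 + 19271.69) = 17898.445 \\
p = 0.75:&\quad h = 4, & q &= 19271.69 \\
p = 0.875:&\quad h = 4.5, & q &= \tfrac12(19271.69 + 19967.95) = 19619.82 \\
p = 1:&\quad & q &= 19967.95
\end{align*}
```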

 

molla
Fluorite | Level 6
Hi,

I used the above logic to calculate quantiles for 5 data sets, with a loop to process each one.
For the data sets that have only one observation, I am getting this error:

ERROR: (execution) Invalid subscript or subscript out of range

What does this error actually mean, and how can we resolve it?
Rick_SAS
SAS Super FREQ

For a degenerate sample that has one observation, the empirical CDF is a vertical line and all quantiles are equal to the value of the observation. For example, if the sample is {7}, then 

min = 0th percentile = 7

10th percentile = 7

...

90th percentile = 7

max = 100th percentile = 7
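
As for where the "Invalid subscript or subscript out of range" message comes from: the likely cause is the interpolation step, which reads x[j+1]. Here is a minimal sketch of the index arithmetic for a one-observation sample, assuming the same formula as in GetRQuantiles. The variable name jPlus1 is introduced only for this illustration, and the code prints the indices without triggering the error.

```sas
proc iml;
x = {7};                      /* degenerate sample: N = 1 */
N = nrow(x);
p = {0.1, 0.5, 0.9};          /* any probabilities below 1 */
j = floor(N*p + (1-p));       /* same index formula as in GetRQuantiles */
jPlus1 = j + 1;               /* index of the upper neighbor, x[j+1] */
print p j jPlus1 N;           /* compare jPlus1 with N */
quit;
```

When N = 1 the formula gives j = 1 for every p, so x[j+1] asks for a second element that does not exist. Returning the single observation for every probability before the interpolation step avoids the out-of-range subscript.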

 

molla
Fluorite | Level 6
I am getting the same, but I am still getting that error as well. Can't we avoid that error?
molla
Fluorite | Level 6
As I understand it, that code is in R. How can we use the same in SAS?
Is there any other way to compute the quantiles that gives accurate results?
Rick_SAS
SAS Super FREQ

I am trying to complete a project, so I will let others help you. Good luck!

molla
Fluorite | Level 6
Thanks a lot
molla
Fluorite | Level 6
I understand the logic now and was also able to run the code. Thanks a lot.
molla
Fluorite | Level 6
I just want to know whether there is any other procedure we can use to calculate the quantiles.
Tom
Super User

As Rick stated, you can use the standard SAS procedures to calculate the usual estimates of quantiles.

If you want some special algorithm, then you can write your own code to perform it, the way Rick did with the IML code that reproduces the numbers from the R function you used.
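
To make the first option concrete, here is a minimal sketch with PROC UNIVARIATE, assuming the data set q and the variable x used earlier in this thread. The output data set name pctl and the prefix P_ are placeholders chosen for illustration. The PCTLDEF= option selects one of the five built-in definitions (5 is the default). As far as I know, none of the five is R's default type 7, so on small samples the numbers can differ from the IML results.

```sas
/* Percentiles from a built-in procedure (not the TYPE=7 definition) */
proc univariate data=q pctldef=5 noprint;
   var x;
   /* request the same probabilities as in the IML example, as percents */
   output out=pctl pctlpts=12.5 25 37.5 50 62.5 75 87.5 100
                   pctlpre=P_;
run;

proc print data=pctl noobs;
run;
```

Changing the PCTLDEF= value and rerunning is a quick way to see how much the definitions disagree for a given sample.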

Rick_SAS
SAS Super FREQ

The OP has complained that the original program I posted did not handle the degenerate case of a sample that has only one observation. Here is the modification that handles N=1:

 

data q;
  input x;
cards;
19967.95
19271.69
16525.2
6885.5
3442.75
;


proc iml;
/* Define function that returns the TYPE=7 sample quantiles. For more info, see   https://blogs.sas.com/content/iml/2017/05/24/definitions-sample-quantiles.html*/
start GetRQuantiles(y, probs);
   x = colvec(y);
   call sort(x);
   N = nrow(x);       /* assume all values are nonmissing */
   
   p = colvec(probs);
   m = 1-p;
   j = floor(N*p + m);
   g = N*p + m - j;

   q = j(nrow(p), 1, x[N]);    /* initialize to x[N] = max(x), the result when p=1 */
   if N=1 then return q;
   idx = loc(p < 1);
   if ncol(idx) >0 then do;
      j = j[idx]; g = g[idx];
      q[idx] = (1-g)#x[j] + g#x[j+1];
   end;   
   return q;
finish;

/* TEST the function on an example */
use q; read all var "x"; close;       /* read sample into x */
p = {12.5, 25, 37.5, 50, 62.5, 75, 87.5, 100} / 100;  /* define probabilities */
q = GetRQuantiles(x, p);              /* get type=7 sample quantiles */
print p q;

/* for a degenerate sample (N=1), all estimates are equal to x[1] */
q = GetRQuantiles(123.45, p);  
print p q;

quit;

