Let me say it again: quantiles are statistics, which means that they estimate underlying parameters in the population (in this case, the quantiles of the population). The values you say are "correct" are merely the sample quantiles that R computes by default. There are many definitions of sample quantiles. None are more correct than the others.
The following SAS/IML program simplifies the program in my blog post and computes only the TYPE=7 definition, which is the default in R. You can run this program to obtain the sample quantiles that you want.
/* Compute the sample quantiles that R computes by default */
proc iml;
/* Define function that returns the TYPE=7 sample quantiles. For more info, see https://blogs.sas.com/content/iml/2017/05/24/definitions-sample-quantiles.html*/
start GetRQuantiles(y, probs);
x = colvec(y);
call sort(x);
N = nrow(x); /* assume all values are nonmissing */
p = colvec(probs);
m = 1-p;
j = floor(N*p + m);
g = N*p + m - j;
q = j(nrow(p), 1, x[N]); /* initialize to max(x), the value returned when p=1 */
idx = loc(p < 1);
if ncol(idx) >0 then do;
j = j[idx]; g = g[idx];
q[idx] = (1-g)#x[j] + g#x[j+1];
end;
return q;
finish;
use Q; read all var "x"; close; /* read sample into x */
p = {12.5, 25, 37.5, 50, 62.5, 75, 87.5, 100} / 100; /* define probabilities */
q = GetRQuantiles(x, p); /* sample quantiles */
print p q;
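For readers who want the formula behind the function, here is a sketch of what the code above computes, written in my own notation (the symbol h is not in the program; it stands for the quantity N*p + m). Here x_(1) <= ... <= x_(N) are the sorted sample values.

```latex
\begin{aligned}
h &= Np + (1-p), \qquad j = \lfloor h \rfloor, \qquad g = h - j, \\
\hat{q}(p) &= (1-g)\,x_{(j)} + g\,x_{(j+1)} \quad \text{for } p < 1, \\
\hat{q}(1) &= x_{(N)}.
\end{aligned}
```

In words: the estimate is a linear interpolation between two adjacent order statistics, and p=1 is handled separately because x_(N+1) does not exist.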
But I am not able to understand how I can use the code below in SAS:
use Q; read all var "x"; close; /* read sample into x */
p = {12.5, 25, 37.5, 50, 62.5, 75, 87.5, 100} / 100; /* define probabilities */
q = GetRQuantiles(x, p); /* sample quantiles */
print p q;
So did you create a dataset named Q with a variable named X for the IML code to read in?
data q;
input x;
cards;
19967.95
19271.69
16525.2
6885.5
3442.75
;
proc iml;
/* Define function that returns the TYPE=7 sample quantiles. For more info, see https://blogs.sas.com/content/iml/2017/05/24/definitions-sample-quantiles.html*/
start GetRQuantiles(y, probs);
x = colvec(y);
call sort(x);
N = nrow(x); /* assume all values are nonmissing */
p = colvec(probs);
m = 1-p;
j = floor(N*p + m);
g = N*p + m - j;
q = j(nrow(p), 1, x[N]); /* initialize to max(x), the value returned when p=1 */
idx = loc(p < 1);
if ncol(idx) >0 then do;
j = j[idx]; g = g[idx];
q[idx] = (1-g)#x[j] + g#x[j+1];
end;
return q;
finish;
use q; read all var "x"; close; /* read sample into x */
p = {12.5, 25, 37.5, 50, 62.5, 75, 87.5, 100} / 100; /* define probabilities */
q = GetRQuantiles(x, p); /* sample quantiles */
print p q;
quit;
Output
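As a hand check on two of the entries, assuming the five values in data set q above (so N = 5, sorted as 3442.75, 6885.5, 16525.2, 19271.69, 19967.95), the arithmetic inside the function works out as follows. The symbol h is my shorthand for N*p + m and does not appear in the program.

```latex
\begin{aligned}
p = 0.125:&\quad h = 5(0.125) + 0.875 = 1.5,\; j = 1,\; g = 0.5, \\
          &\quad \hat{q} = 0.5\,(3442.75) + 0.5\,(6885.5) = 5164.125 \\
p = 0.5:  &\quad h = 5(0.5) + 0.5 = 3,\; j = 3,\; g = 0, \\
          &\quad \hat{q} = x_{(3)} = 16525.2
\end{aligned}
```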
I used the above logic to calculate quantiles for 5 data sets, and wrote a loop to do the same for each one.
For the data sets that have only one observation, I am getting the error:
ERROR: (execution) Invalid subscript or subscript out of range
What does this error actually mean, and how can we resolve it?
For a degenerate sample that has one observation, the empirical CDF is a vertical line and all quantiles are equal to the value of the observation. For example, if the sample is {7}, then
min = 0th percentile = 7
10th percentile = 7
...
90th percentile = 7
max = 100th percentile = 7
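As for what the error means in this case, here is my reading of the function rather than anything stated in the log: when N = 1, the interpolation step asks for a second sorted value that does not exist. Using h as shorthand for N*p + m:

```latex
\begin{aligned}
N = 1:\quad h &= p + (1 - p) = 1, \qquad j = \lfloor h \rfloor = 1, \qquad g = h - j = 0, \\
\hat{q}(p) &= (1-g)\,x_{(1)} + g\,x_{(2)} \quad \text{for every } p < 1.
\end{aligned}
```

Even though the weight g is zero, the expression x[j+1] is still evaluated as x[2], and a one-element vector has no second element, hence "subscript out of range".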
Is there any other way to compute the quantiles, other than this, to get accurate results?
I am trying to complete a project, so I will let others help you. Good luck!
As Rick stated, you can use the standard SAS procedures to calculate the usual estimates of quantiles.
If you want some special algorithm, then you can write your own code to perform it, as Rick did with the IML code that reproduces the numbers from the R function you used.
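For the first option, a minimal sketch with PROC UNIVARIATE might look like the following. It assumes the data set q with variable x from earlier in the thread; the output data set name pctl and the prefix P_ are just illustrative choices. PCTLDEF=5 is the SAS default definition, and values 1 through 5 select the other built-in definitions. As far as I know, none of the five reproduces R's default (type 7), so expect the numbers to differ from the IML function for small samples.

```sas
/* Percentiles from a built-in procedure (not R's type=7 definition) */
proc univariate data=q pctldef=5 noprint;
   var x;
   /* write the requested percentiles to a data set named pctl */
   output out=pctl pctlpts=12.5 25 37.5 50 62.5 75 87.5 100 pctlpre=P_;
run;

proc print data=pctl noobs;
run;
```

A one-observation data set is not a problem here: every percentile comes back equal to that single value.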
The OP has complained that the original program I posted did not handle the degenerate case of a sample that has only one observation. Here is the modification that handles N=1:
data q;
input x;
cards;
19967.95
19271.69
16525.2
6885.5
3442.75
;
proc iml;
/* Define function that returns the TYPE=7 sample quantiles. For more info, see https://blogs.sas.com/content/iml/2017/05/24/definitions-sample-quantiles.html*/
start GetRQuantiles(y, probs);
x = colvec(y);
call sort(x);
N = nrow(x); /* assume all values are nonmissing */
p = colvec(probs);
m = 1-p;
j = floor(N*p + m);
g = N*p + m - j;
q = j(nrow(p), 1, x[N]); /* initialize to max(x), the value returned when p=1 */
if N=1 then return q;
idx = loc(p < 1);
if ncol(idx) >0 then do;
j = j[idx]; g = g[idx];
q[idx] = (1-g)#x[j] + g#x[j+1];
end;
return q;
finish;
/* TEST the function on an example */
use q; read all var "x"; close; /* read sample into x */
p = {12.5, 25, 37.5, 50, 62.5, 75, 87.5, 100} / 100; /* define probabilities */
q = GetRQuantiles(x, p); /* get type=7 sample quantiles */
print p q;
/* for a degenerate sample (N=1), all estimates are equal to x[1] */
q = GetRQuantiles(123.45, p);
print p q;
quit;