Hello
Let's say that I have a data set with 3 columns: rank,min,max(Note- this table was created by proc rank)
I wanted to calculated percentiles 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25
But in the output i didn't get all the percentiles I have asked.
My question;
What is the way to add rows of percentiles that doesn't exist and they will get values of columns Min,MAX from the upper row.
For example:
There is no row for P6 so we will add row with P6 and it will get value of min and max from row P5
There is no row fro P19 till P24 so we will add these rows and min,max values will come from row P18
What is the way to add the missing rows?
Data have;
infile datalines dlm=',';
input rank min max;
cards;
0, 0,800
1,1000,1901
2,2000,2425
3,2500,3000
4,3001,4000
5,4015,4935
7,5000,5000
9,5001,6000
10,6001,6480
13,6500,6500
16,6501,6934
17,7000,7200
18,7244,7400
25,7500,7500
;
Run;
Data want;
infile datalines dlm=',';
input rank min max;
cards;
0, 0,800
1,1000,1901
2,2000,2425
3,2500,3000
4,3001,4000
5,4015,4935
6,4015,4935
7,5000,5000
8,5000,5000
9,5001,6000
10,6001,6480
11,6001,6480
12,6001,6480
13,6500,6500
14,6500,6500
15,6500,6500
16,6501,6934
17,7000,7200
18,7244,7400
19,7244,7400
20,7244,7400
21,7244,7400
22,7244,7400
23,7244,7400
24,7244,7400
25,7500,7500
;
Run;
Not sure why you haven't got all percentiles you have asked for ... but here's a way that is waterproof.
proc univariate data=sashelp.cars noprint;
var mpg_city mpg_highway;
output out=UniWidePctls
pctlpre=CityP_ HwyP_
pctlpts=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25;
run;
proc print data=UniWidePctls noobs; run;
/* end of program */
See :
Output percentiles of multiple variables in a tabular format
By Rick Wicklin on The DO Loop October 23, 2013
https://blogs.sas.com/content/iml/2013/10/23/percentiles-in-a-tabular-format.html
for even more ways to achieve the same.
BR, Koen
Hello ,
Koen
@Ronein wrote:
Thank you,
However the question was how to add the missing rows and get values from upper row
So let's see, you have recently posted three different threads on subjects related to coding percentiles. And it feels as if no progress has been made. How about you take a different approach? How about, instead of focusing on how to write specific pieces of code, you go ahead and describe the big problem you are trying to solve.
For example:
I want to do an analysis/table/chart/graph/report (pick the appropriate one) that performs the task of ____________________ leading to the desired end analysis/table/chart/graph/report.
Note how coding is not yet mentioned. Note how this is a more wholistic description of your problem. If you had provided this (and ideally sample data), instead of getting bogged down in certain specific coding issues (which is where we are now), the people here could go ahead and think up a good way to get to the desired results, and show you example code that gets you there (and that works). But we don't really know where you want to end up, and so we can't get you there.
Please read up on the XY Problem and X-Y problem. Please approach asking questions in a more wholistic way, and less in a way asking about specific pieces of code. (Or at the very least, describe the big picture in a wholistic way and show us the code you are having trouble with)
@PaigeMiller wrote:
So let's see, you have recently posted three different threads on subjects related to coding percentiles. And it feels as if no progress has been made. How about you take a different approach? How about, instead of focusing on how to write specific pieces of code, you go ahead and describe the big problem you are trying to solve.
and
If you had provided this (and ideally sample data), instead of getting bogged down in certain specific coding issues (which is where we are now), the people here could go ahead and think up a good way to get to the desired results, and show you example code that gets you there (and that works). But we don't really know where you want to end up, and so we can't get you there.
In short @Ronein: Help us help you.
There is probably an easier solution for creating the percentile dataset that does not require this post processing. But if we just look at the task you asked for help with:
If you just want to generate all integer ranks between the two values you could use some type of simple look ahead to tell if the next RANK is the current RANK+1 or not. So something like this:
data have;
input rank min max;
cards;
0 0 800
1 1000 1901
2 2000 2425
3 2500 3000
4 3001 4000
5 4015 4935
7 5000 5000
9 5001 6000
10 6001 6480
13 6500 6500
16 6501 6934
17 7000 7200
18 7244 7400
25 7500 7500
;
data want;
set have end=eof;
if not eof then set have(firstobs=2 keep=rank rename=(rank=next_rank));
output;
do rank=rank+1 to next_rank-1;
output;
end;
drop next_rank;
run;
data want;
merge
have
have (firstobs=2 keep=rank rename=(rank=_rank))
;
output;
do rank = rank + 1 to _rank - 1;
output;
end;
drop _rank;
run;
Untested, posted from my tablet.
I suggest that you go back to your proc rank step, share the code you ran and the input data set.
What are the "min" "max" supposed to represent? Perhaps you need some additional options on the Proc Rank code.
How did you handle ties for example?
"missing" percentiles I would generally expect to occur if the input data set did not have at least 100 non-missing non-tied values.
If you ask for percentiles from an input data set with 3 observations of non-tied values you tend to get percentiles of 25, 50 and 75
I would almost certainly if completing "ranks" have duplicate ranges of values, i.e. your copy the previous min max. I might INTERPOLATE values but copy, no.
What is this data set to be used for?
A minor tweak of the code provided by @ballardw, in the pursuit of tighter code:
data have;
input rank min max;
cards;
0 0 800
1 1000 1901
2 2000 2425
3 2500 3000
4 3001 4000
5 4015 4935
7 5000 5000
9 5001 6000
10 6001 6480
13 6500 6500
16 6501 6934
17 7000 7200
18 7244 7400
25 7500 7500
run;
data want (drop=nxt:);
merge have
have (firstobs=2 keep=rank rename=(rank=nxt_rank)) ;
do rank=rank by 1 until (rank=coalesce(nxt_rank,26)-1);
output;
end;
run;
Catch the best of SAS Innovate 2025 — anytime, anywhere. Stream powerful keynotes, real-world demos, and game-changing insights from the world’s leading data and AI minds.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.