Hi,
I have a dataset and for a variable (e.g. application_scores), is there a way of finding out the lower and upper bound numbers that would split the total number of records in the dataset into 10 equal deciles? The example below was the upper and lower bound numbers that were previously obtained but can someone please provide some code to obtain the correct scores (i.e. instead of 370 and 474, what would it be now?) that would split the records into 10 bands in my updated dataset?
else if 370<=application_scores<=474 then band='10';
else if 475<=application_scores<=539 then band='09';
else if 540<=application_scores<=616 then band='08';
else if 617<=application_scores<=686 then band='07';
else if 687<=application_scores<=815 then band='06';
else if 816<=application_scores<=929 then band='05';
else if 930<=application_scores<=975 then band='04';
else if 976<=application_scores<=1011 then band='03';
else if 1012<=application_scores<=1120 then band='02';
else if 1121<=application_scores then band='01';
I think this is what you want
data have;
do x=1 to 1e4;
y=rand('integer', 1, 100);
output;
end;
run;
proc rank data=have groups=10 descending out=temp;
var y;
ranks band;
run;
proc sql;
create table want as
select band,
min(y) as lower,
max(y) as upper,
count(y) as numrecords
from temp
group by band;
quit;
Maybe proc rank is useful to build a solution, but without seeing data and the expected result, it is difficult to suggest something.
You can do something like this
data have;
do x=1 to 1e4;
y=rand('integer', 1, 100);
output;
end;
run;
proc rank data=have groups=10 descending out=want;
var y;
ranks decile;
run;
Thanks for your reply, your code works and I can see which band each record would fit into.
However, is there code available to look at all of the records for that one variable, application_score, and tell me what the lower and upper bound limits were to put an equal amount of records to make 10 bands (ie. in the Results tab, there would be 10 numbers of lower limits (one for each band), and 10 numbers of upper limits (one for each band)?
Example:
There are 100 records, so there should be 10 records in each band and it would pick out the lower and upper bound scores for each band so that this would be satisfied.
The code would identify the numbers e.g. 0 and 9 for band 1, 10 and 19 for band 2 etc..
Band 1: 0<=application_score<=9
Band 2: 10<=application_score<=19
: :
: :
Band 10: 90<=application_score<=99
I think this is what you want
data have;
do x=1 to 1e4;
y=rand('integer', 1, 100);
output;
end;
run;
proc rank data=have groups=10 descending out=temp;
var y;
ranks band;
run;
proc sql;
create table want as
select band,
min(y) as lower,
max(y) as upper,
count(y) as numrecords
from temp
group by band;
quit;
Anytime, glad to help 🙂
Hi,
Is there a reason why the band lower and upper limits have an equal width for all of the bands?
However, when I change the line of code that you provided: DSC_OPTIN_91BWN=rand('integer', 1, 100);, i.e. change the 1 to -447 and the 100 to 1246, the numrecords changes to 10% in each band out of the 93,260 records in my dataset, t
However, the problem is that when I change the minimum to -999, it creates band 9 (-999 to -779) and has 9349 records. I have checked the minimum value of the variable and it is only -447, sow can the table say that there would be 9349 records in band 9 and even 9288 in band 8, despite the two bands being beyond the minimum?
proc rank data=have groups=10 ties=high out=want;
var y;
ranks decile;
run;
Thanks for your code. However, is there a way of identifying the lower and upper limits for each of the band in the Results/Output data tab, similar to an example I have created below? If there were 100 records in total in my dataset, the code would split it into 10 groups of 10 records and identify the band's lower and upper bounds for each band (e.g. 500 as the lower bound and 575 as the upper bound for band 1).
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.