BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jeremy4
Quartz | Level 8

Hi,

 

I have a dataset and for a variable (e.g. application_scores), is there a way of finding out the lower and upper bound numbers that would split the total number of records in the dataset into 10 equal deciles? The example below was the upper and lower bound numbers that were previously obtained but can someone please provide some code to obtain the correct scores (i.e. instead of 370 and 474, what would it be now?) that would split the records into 10 bands in my updated dataset?

 

else if 370<=application_scores<=474 then band='10';
else if 475<=application_scores<=539 then band='09';
else if 540<=application_scores<=616 then band='08';
else if 617<=application_scores<=686 then band='07';
else if 687<=application_scores<=815 then band='06';
else if 816<=application_scores<=929 then band='05';
else if 930<=application_scores<=975 then band='04';

else if 976<=application_scores<=1011 then band='03';
else if 1012<=application_scores<=1120 then band='02';
else if 1121<=application_scores then band='01';

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

I think this is what you want

 

data have;
   do x=1 to 1e4;
      y=rand('integer', 1, 100);
      output;
   end;
run;

proc rank data=have groups=10 descending out=temp;
   var y;
   ranks band;
run;

proc sql;
   create table want as
   select band,
          min(y) as lower,
          max(y) as upper,
          count(y) as numrecords
   from temp
   group by band;
quit;

View solution in original post

9 REPLIES 9
andreas_lds
Jade | Level 19

Maybe proc rank is useful to build a solution, but without seeing data and the expected result, it is difficult to suggest something.

PeterClemmensen
Tourmaline | Level 20

You can do something like this

 

data have;
   do x=1 to 1e4;
      y=rand('integer', 1, 100);
      output;
   end;
run;

proc rank data=have groups=10 descending out=want;
   var y;
   ranks decile;
run;
jeremy4
Quartz | Level 8

Thanks for your reply, your code works and I can see which band each record would fit into.

 

However, is there code available to look at all of the records for that one variable, application_score, and tell me what the lower and upper bound limits were to put an equal amount of records to make 10 bands (ie. in the Results tab, there would be 10 numbers of lower limits (one for each band), and 10 numbers of upper limits (one for each band)?

 

Example:

There are 100 records, so there should be 10 records in each band and it would pick out the lower and upper bound scores for each band so that this would be satisfied.

 

The code would identify the numbers e.g. 0 and 9 for band 1, 10 and 19 for band 2 etc.. 

 

Band 1:    0<=application_score<=9

Band 2:   10<=application_score<=19

:                      :

:                      :

Band 10: 90<=application_score<=99

PeterClemmensen
Tourmaline | Level 20

I think this is what you want

 

data have;
   do x=1 to 1e4;
      y=rand('integer', 1, 100);
      output;
   end;
run;

proc rank data=have groups=10 descending out=temp;
   var y;
   ranks band;
run;

proc sql;
   create table want as
   select band,
          min(y) as lower,
          max(y) as upper,
          count(y) as numrecords
   from temp
   group by band;
quit;
jeremy4
Quartz | Level 8

Hi,

 

Is there a reason why the band lower and upper limits have an equal width for all of the bands?

SAS deciles output.PNG

 

However, when I change the line of code that you provided: DSC_OPTIN_91BWN=rand('integer', 1, 100);, i.e. change the 1 to -447 and the 100 to 1246, the numrecords changes to 10% in each band out of the 93,260 records in my dataset, t

 

However, the problem is that when I change the minimum to -999, it creates band 9 (-999 to -779) and has 9349 records. I have checked the minimum value of the variable and it is only -447, sow can the table say that there would be 9349 records in band 9 and even 9288 in band 8, despite the two bands being beyond the minimum?SAS -999.PNG

 

 

 

Ksharp
Super User

proc rank data=have groups=10 ties=high out=want;
   var y;
   ranks decile;
run;
jeremy4
Quartz | Level 8

Thanks for your code. However, is there a way of identifying the lower and upper limits for each of the band in the Results/Output data tab, similar to an example I have created below? If there were 100 records in total in my dataset, the code would split it into 10 groups of 10 records and identify the band's lower and upper bounds for each band (e.g. 500 as the lower bound and 575 as the upper bound for band 1).

 

SAS deciles.PNG

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 4835 views
  • 2 likes
  • 4 in conversation