Help using Base SAS procedures

scoring based on a range of values

Reply
Frequent Contributor
Posts: 101

scoring based on a range of values

Hi all—

I have to assign a score based on a range of values. I don’t think I’m doing this the best way. For some of the cases a missing value is returned.  I have attached a sample data set. 

Also, how would this translate into base SAS?

Any help is greatly appreciated.

Matt--

proc sql;

create table want as

select distinct

AGENCY_NAME,

type_new,

Eight_Months_Total_sum,

Number_Active,

percent_18,

case when (Type_NEW='GP') and  0 le percent_18 le .12   then 1

     when (Type_NEW='GP') and .13 le percent_18 le .17  then .8

     when (Type_NEW='GP') and .18 le percent_18 le .24  then .6

  when (Type_NEW='GP') and .25 le  percent_18 le .41  then .4

     when (Type_NEW='GP') and    percent_18 gt .40 then 0

  when (Type_NEW='FTR') and  0 le percent_18 le .11   then  1

     when (Type_NEW='FTR') and .12 le percent_18 le .14  then .8

     when (Type_NEW='FTR') and .15 le percent_18 le .22  then .6

  when (Type_NEW='FTR') and .23 le  percent_18 le .34  then .4

     when (Type_NEW='FTR') and  percent_18 gt .4 then 0

  end as LOS_Score  format=percent6. 

  from

have

group by AGENCY_NAME,Type_NEW  ;

quit;

Trusted Advisor
Posts: 2,116

Re: scoring based on a range of values

Posted in reply to RobertNYC

Matt,

The reason that you are getting missing results is that you have some rounding in values that are on the boundary.  For instance, data that are entered as 0.12 in your spreadsheet may be expressed as 0.120000000001 in SAS and the compare will fail.  You need to change one edge of the boundaries to be a "lt" and use a matching value to make sure that things are inclusive.  If your data are actually recorded to just 2 decimals, you can also put the cut-points at 3 decimals to avoid the problem.  E.g.

case when (Type_NEW='GP') and  0 le percent_18 le .125   then 1

     when (Type_NEW='GP') and .125 le percent_18 le .175 then .8

...

(not the changes in bold face).

Doc Muhlbaierr

Duke

Respected Advisor
Posts: 4,935

Re: scoring based on a range of values

Posted in reply to RobertNYC

I think you would do better with something like this :

data ranges;
input type_new $ pctMin pctMax LOS_Score;
format LOS_Score percent6.;
datalines;
GP 0.00 0.12 1.0
GP 0.13 0.17 0.8
GP 0.18 0.24 0.6
GP 0.25 0.39 0.4
GP 0.40 1.00 0.0
FTR 0.00 0.11 1.0
FTR 0.12 0.14 0.8
FTR 0.15 0.22 0.6
FTR 0.23 0.39 0.4
FTR 0.40 1.00 0.0
;

proc sql;
create table want as
select
AGENCY_NAME,
H.type_new,
Eight_Months_Total_sum,
Number_Active,
percent_18,
LOS_score
  from have as H inner join ranges as R
on H.type_new=R.type_new and percent_18 between pctMin and pctMax
order by AGENCY_NAME, h.Type_NEW;
quit;

PG

PG
Super User
Posts: 10,046

Re: scoring based on a range of values

Posted in reply to RobertNYC

I would like to use proc format.

proc format;

value $ type

'GP'='gp.'

'FTR'='ftr.';

invalue    gp

     0-.12=1

     .13-.17=.8

     .18-.24=.6

     .25-.41=.4

     .41<-high=0 ;

invalue    ftr

     0-.11=1

     .12-.14=.8

     .15-.22=.6

     .23-.34=.4

     .4<-hight=0 ;

run;

data want;

set have;

LOS_Score=inputn(percent_18,put(type_new,$type.));

run;

Ksharp

Frequent Contributor
Posts: 101

Re: scoring based on a range of values

Thanks Ksharp. Quick question: is there a way to add an additional format to this. I need to have the scores in percents.

so 100%, 7%, 5%, 0%

proc format;

value Utl

     .97-1=1

     .93-.96=.7

     .90-.92=.5

     low-.89=0;

run;

Super User
Posts: 10,046

Re: scoring based on a range of values

Posted in reply to RobertNYC

That is easy. Just add a format statement into data step.

format LOS_Score  percent. ;

Frequent Contributor
Posts: 101

Re: scoring based on a range of values

Hi Ksharp--

Thank you for your help. adding a percent does not seem to be working because im trying to modify a format  not change the format.  I've been trying to achive this by using a picture format but I can't seem to get it right.

This is what I have:

proc format;

picture  Utl

     .97-1='100%'

     .93-.96='70%'

     .90-.92='50%

     low-.89='0%';

run;

Your feedback is appreciated.

Super User
Posts: 10,046

Re: scoring based on a range of values

Posted in reply to RobertNYC

OK . That would be more simple .

proc format;

value $ type

'GP'='gp.'

'FTR'='ftr.';

value  gp

     0-.12='100%'

     .13-.17='80%'

     .18-.24='60%'

     .25-.41='40%'

     .41<-high='0%' ;

value  ftr

     0-.11='100%'

     .12-.14='80%'

     .15-.22='60%'

     .23-.34='40%'

     .4<-high='0%' ;

run;

data want;

set have;

length LOS_Score  $ 4;

LOS_Score=putn(percent_18,put(type_new,$type.));

run;

Ksharp

Super User
Super User
Posts: 7,078

Re: scoring based on a range of values

Posted in reply to RobertNYC

Formats will handle the imprecision of storing decimal fractions in binary computers for you.

What you need to do is use the same value for the lower limit of one range and the upper limit of the next. 

Check the manual for details.

value Utl

   .97-1  ='1'

   .93-.97='.7'

   .90-.93='.5'

   low-.90='0'

;

Ask a Question
Discussion stats
  • 8 replies
  • 268 views
  • 3 likes
  • 5 in conversation