BookmarkSubscribeRSS Feed
RobertNYC
Obsidian | Level 7

Hi all—

I have to assign a score based on a range of values. I don’t think I’m doing this the best way. For some of the cases a missing value is returned.  I have attached a sample data set. 

Also, how would this translate into base SAS?

Any help is greatly appreciated.

Matt--

proc sql;

create table want as

select distinct

AGENCY_NAME,

type_new,

Eight_Months_Total_sum,

Number_Active,

percent_18,

case when (Type_NEW='GP') and  0 le percent_18 le .12   then 1

     when (Type_NEW='GP') and .13 le percent_18 le .17  then .8

     when (Type_NEW='GP') and .18 le percent_18 le .24  then .6

  when (Type_NEW='GP') and .25 le  percent_18 le .41  then .4

     when (Type_NEW='GP') and    percent_18 gt .40 then 0

  when (Type_NEW='FTR') and  0 le percent_18 le .11   then  1

     when (Type_NEW='FTR') and .12 le percent_18 le .14  then .8

     when (Type_NEW='FTR') and .15 le percent_18 le .22  then .6

  when (Type_NEW='FTR') and .23 le  percent_18 le .34  then .4

     when (Type_NEW='FTR') and  percent_18 gt .4 then 0

  end as LOS_Score  format=percent6. 

  from

have

group by AGENCY_NAME,Type_NEW  ;

quit;

8 REPLIES 8
Doc_Duke
Rhodochrosite | Level 12

Matt,

The reason that you are getting missing results is that you have some rounding in values that are on the boundary.  For instance, data that are entered as 0.12 in your spreadsheet may be expressed as 0.120000000001 in SAS and the compare will fail.  You need to change one edge of the boundaries to be a "lt" and use a matching value to make sure that things are inclusive.  If your data are actually recorded to just 2 decimals, you can also put the cut-points at 3 decimals to avoid the problem.  E.g.

case when (Type_NEW='GP') and  0 le percent_18 le .125   then 1

     when (Type_NEW='GP') and .125 le percent_18 le .175 then .8

...

(not the changes in bold face).

Doc Muhlbaierr

Duke

PGStats
Opal | Level 21

I think you would do better with something like this :

data ranges;
input type_new $ pctMin pctMax LOS_Score;
format LOS_Score percent6.;
datalines;
GP 0.00 0.12 1.0
GP 0.13 0.17 0.8
GP 0.18 0.24 0.6
GP 0.25 0.39 0.4
GP 0.40 1.00 0.0
FTR 0.00 0.11 1.0
FTR 0.12 0.14 0.8
FTR 0.15 0.22 0.6
FTR 0.23 0.39 0.4
FTR 0.40 1.00 0.0
;

proc sql;
create table want as
select
AGENCY_NAME,
H.type_new,
Eight_Months_Total_sum,
Number_Active,
percent_18,
LOS_score
  from have as H inner join ranges as R
on H.type_new=R.type_new and percent_18 between pctMin and pctMax
order by AGENCY_NAME, h.Type_NEW;
quit;

PG

PG
Ksharp
Super User

I would like to use proc format.

proc format;

value $ type

'GP'='gp.'

'FTR'='ftr.';

invalue    gp

     0-.12=1

     .13-.17=.8

     .18-.24=.6

     .25-.41=.4

     .41<-high=0 ;

invalue    ftr

     0-.11=1

     .12-.14=.8

     .15-.22=.6

     .23-.34=.4

     .4<-hight=0 ;

run;

data want;

set have;

LOS_Score=inputn(percent_18,put(type_new,$type.));

run;

Ksharp

RobertNYC
Obsidian | Level 7

Thanks Ksharp. Quick question: is there a way to add an additional format to this. I need to have the scores in percents.

so 100%, 7%, 5%, 0%

proc format;

value Utl

     .97-1=1

     .93-.96=.7

     .90-.92=.5

     low-.89=0;

run;

Ksharp
Super User

That is easy. Just add a format statement into data step.

format LOS_Score  percent. ;

RobertNYC
Obsidian | Level 7

Hi Ksharp--

Thank you for your help. adding a percent does not seem to be working because im trying to modify a format  not change the format.  I've been trying to achive this by using a picture format but I can't seem to get it right.

This is what I have:

proc format;

picture  Utl

     .97-1='100%'

     .93-.96='70%'

     .90-.92='50%

     low-.89='0%';

run;

Your feedback is appreciated.

Ksharp
Super User

OK . That would be more simple .

proc format;

value $ type

'GP'='gp.'

'FTR'='ftr.';

value  gp

     0-.12='100%'

     .13-.17='80%'

     .18-.24='60%'

     .25-.41='40%'

     .41<-high='0%' ;

value  ftr

     0-.11='100%'

     .12-.14='80%'

     .15-.22='60%'

     .23-.34='40%'

     .4<-high='0%' ;

run;

data want;

set have;

length LOS_Score  $ 4;

LOS_Score=putn(percent_18,put(type_new,$type.));

run;

Ksharp

Tom
Super User Tom
Super User

Formats will handle the imprecision of storing decimal fractions in binary computers for you.

What you need to do is use the same value for the lower limit of one range and the upper limit of the next. 

Check the manual for details.

value Utl

   .97-1  ='1'

   .93-.97='.7'

   .90-.93='.5'

   low-.90='0'

;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1157 views
  • 3 likes
  • 5 in conversation