BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Lapis Lazuli | Level 10

## Case when statement is not showing the correct result.

Hi all,

I am trying the case when statement to categorize the numbers to Low, medium and high but when my code runs, it shows all the results as low. Can you please suggest how to correct this?

Here is the sample dataset:

Data Test;
input debt_code Pence_in_pound credit_score;
Datalines;
1001452 3.01% 640
129979530 0.25% 603
291138113 76.00% 598
294432885 3.95% 552
350203659 8.74% 591
;

run;

``````Proc sql;
create table Willingness_to_pay as
select a.*,
case
when Pence_in_Pound between 0 and 5 then 'Low'
when Pence_in_Pound between 5.001 and 15 then 'Medium'
when Pence_in_Pound > 15.001 then 'High'
end as Pence_pound_metrics
from Book_on as a;
quit;``````
1 ACCEPTED SOLUTION

Accepted Solutions
Super User

## Re: Case when statement is not showing the correct result.

Most likely it's a numeric variable with a percentage format? If that's the case the values are actually 0.00 and 0.05 not 0 and 5.

A value of 0.05 with a percent format will show as 5%. So either scale your numbers by multiplying by 100 or change your ranges.
5 REPLIES 5
Diamond | Level 26

## Re: Case when statement is not showing the correct result.

The data set you provided has Pence_in_Pound always missing, so you will have to modify it to give correct data. Thank you.

P.S. When you provide data as a SAS data step, please test your code from now on before you post it.

--
Paige Miller
Lapis Lazuli | Level 10

## Re: Case when statement is not showing the correct result.

The data in percentage is the Pence_in_pound (3.01%,0.25% etc).
Diamond | Level 26

## Re: Case when statement is not showing the correct result.

@Sandeep77 wrote:
The data in percentage is the Pence_in_pound (3.01%,0.25% etc).

It's a character string? Or not? It seems as if you want to perform numerical calculations on this. Can you fix the code you provided?

--
Paige Miller
Super User

## Re: Case when statement is not showing the correct result.

Most likely it's a numeric variable with a percentage format? If that's the case the values are actually 0.00 and 0.05 not 0 and 5.

A value of 0.05 with a percent format will show as 5%. So either scale your numbers by multiplying by 100 or change your ranges.
Super User

## Re: Case when statement is not showing the correct result.

@Sandeep77 wrote:
The data in percentage is the Pence_in_pound (3.01%,0.25% etc).

The syntax for the data step is incorrect and does not read that variable correctly as shown:

```265  Data Test;
266  input debt_code Pence_in_pound credit_score;
267  Datalines;

NOTE: Invalid data for Pence_in_pound in line 268 9-13.
RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+--
268        1001452 3.01% 640
debt_code=1001452 Pence_in_pound=. credit_score=640 _ERROR_=1 _N_=1
NOTE: Invalid data for Pence_in_pound in line 269 11-15.
269        129979530 0.25% 603
debt_code=129979530 Pence_in_pound=. credit_score=603 _ERROR_=1 _N_=2
NOTE: Invalid data for Pence_in_pound in line 270 11-16.
270        291138113 76.00% 598
debt_code=291138113 Pence_in_pound=. credit_score=598 _ERROR_=1 _N_=3
NOTE: Invalid data for Pence_in_pound in line 271 11-15.
271        294432885 3.95% 552
debt_code=294432885 Pence_in_pound=. credit_score=552 _ERROR_=1 _N_=4
NOTE: Invalid data for Pence_in_pound in line 272 11-15.
272        350203659 8.74% 591
debt_code=350203659 Pence_in_pound=. credit_score=591 _ERROR_=1 _N_=5
NOTE: The data set WORK.TEST has 5 observations and 3 variables.
NOTE: DATA statement used (Total process time):
real time           0.01 seconds
cpu time            0.00 seconds

273  ;
274
```

If values that should be numeric have any special characters other than -, for negative values, or decimal you need to provide a proper informat that will read the value as intended.

Such as

```Data Test;
input debt_code Pence_in_pound :percent. credit_score;
Datalines;
1001452 3.01% 640
129979530 0.25% 603
291138113 76.00% 598
294432885 3.95% 552
350203659 8.74% 591
;

run;```

At which point a short examination of the example data shows that NONE of your values are actually greater than 1. The largest percentage shown, 76.00% is a numeric value of 0.76. In this code where you are using numeric ranges:

```Proc sql;
create table Willingness_to_pay as
select a.*,
case
when Pence_in_Pound between 0 and 5 then 'Low'
when Pence_in_Pound between 5.001 and 15 then 'Medium'
when Pence_in_Pound > 15.001 then 'High'
end as Pence_pound_metrics
from Book_on as a;
quit;```

will therefore have ALL values of Pence_pound_metrics as 'Low'.

Really no reason to even run that as a Format would do that if the proper ranges are provided in the definition and then just print the data set with the format applied.

```proc format;
value metric
0 - 0.05     = 'Low'
0.05< - 0.15 = 'Medium'
0.15< -high  = 'High'
;
run;

proc print data=test noobs;
format pence_in_pound metric.;
run;```

Also note that your between code would have missed, if the values were not actual percentages, 5.0004 for example if it occurred in the data.

From SAS Users blog
Want more? Visit our blog for more articles like these.
Discussion stats
• 5 replies
• 203 views
• 1 like
• 4 in conversation