turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- Data Management
- /
- Forum
- /
- DATA REPLACE VALUES WITH SCORES

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

07-21-2017 07:05 AM

Hi SAS,

I have a data set which contains over 30 character and numeric variables. Based on this dataset, I generate a scorecard. Now, I want to replace each variable's value with the score I got.

Here is part of my dataset:

ID | DAYS | GRADE | REMAINING_PERIOD |

1 | 5 | 2A | 5.23 |

2 | 6 | 2C | 10.95 |

3 | 8 | 2D | 20.22 |

4 | 3 | 1C | . |

5 | 4 | 2B | 8.66 |

Here is part of my scorecard:

Variable | Group | Lable | Scorecard Points |

DAYS | 2 | 2<= DAYS< 5 | 12 |

DAYS | 3 | 5<= DAYS< 6 | -4 |

DAYS | 4 | 6<= DAYS< 9 | 7 |

GRADE | 1 | 2A, 2C, 3A | 6 |

GRADE | 2 | 2D | 55 |

GRADE | 3 | 1C, 2B, 3B, _MISSING_, _UNKNOWN_ | 6 |

GRADE | 4 | 0, 4 | 8 |

REMAINING_PERIOD | 3 | 4.66<= REMAINING_PERIOD< 9.28 | 13 |

REMAINING_PERIOD | 4 | 9.28<= REMAINING_PERIOD< 11.65 | 18 |

REMAINING_PERIOD | 5 | 11.65<= REMAINING_PERIOD< 23.32, _MISSING_ | 9 |

The below one is the result I want to generate:

SCORE TABLE | |||

ID | DAYS | GRADE | REMAINING_PERIOD |

1 | -4 | 6 | 13 |

2 | 7 | 6 | 18 |

3 | 7 | 55 | 9 |

4 | 12 | 6 | 9 |

5 | 12 | 6 | 13 |

I can do it using DATA, IF, ELSE IF, ELSE, RUN. But since I have more than 150 groups in my scorecard, it is really a time-consuming work.

Could any expert suggest me an easier and time-consuming way? I am thing maybe I can use some REPLACE procedure, but not familiar with it. Could you help me?

Accepted Solutions

Solution

08-03-2017
04:13 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to JinboZhao

07-23-2017 11:17 PM

It looks to me like you have code generation problem.

Do you really want to generate variables that use the same names as the input data? Because it looks like all of your scores are numeric and your input variables are mixed.

You probably just need to beef up your metadata (scorecard) so that it contains all of the information you need to generate the code to compute the score. For example you might want to either include a variable to tell if the source variable is character or numeric. Or at least populate the values of the character variables with quotes so that the generated code will be valid SAS syntax. So perhaps something like this:

```
data score_meta ;
length var $32 order 8 condition $200 points 8;
infile cards dsd dlm='|' ;
input var order condition points;
cards;
DAYS|2|2<= DAYS< 5|12
DAYS|3|5<= DAYS< 6|-4
DAYS|4|6<= DAYS< 9|7
GRADE|1|grade in ('2A' '2C' '3A')|6
GRADE|2|grade='2D'|55
GRADE|3|grade in ('1C' '2B' '3B' '_MISSING_' '_UNKNOWN_')|6
GRADE|4|grade in ('0' '4')|8
REMAINING_PERIOD|3|4.66<= REMAINING_PERIOD< 9.28|13
REMAINING_PERIOD|4|9.28<= REMAINING_PERIOD< 11.65|18
REMAINING_PERIOD|5|11.65<= REMAINING_PERIOD< 23.32|9
REMAINING_PERIOD|5|REMAINING_PERIOD is null|9
;
```

Then you could use that to generate an SQL statement use a simple data _null_ step.

```
filename code temp;
data _null_;
set score_meta end=eof;
by var order ;
file code ;
if _n_=1 then put
/ 'create table want as'
/ ' select id'
;
if first.var then put ' , case' ;
put ' when (' condition ') then ' points ;
if last.var then put ' end as ' var ;
if eof then put
' from have'
/ ';'
;
run;
```

Which you could then execute using %INCLUDE.

```
proc sql;
%include code / source2 ;
quit;
```

So for the metadata I posted above the SQL code would look like this:

```
create table want as
select id
, case
when (2<= DAYS< 5 ) then 12
when (5<= DAYS< 6 ) then -4
when (6<= DAYS< 9 ) then 7
end as DAYS
, case
when (grade in ('2A' '2C' '3A') ) then 6
when (grade='2D' ) then 55
when (grade in ('1C' '2B' '3B' '_MISSING_' '_UNKNOWN_') ) then 6
when (grade in ('0' '4') ) then 8
end as GRADE
, case
when (4.66<= REMAINING_PERIOD< 9.28 ) then 13
when (9.28<= REMAINING_PERIOD< 11.65 ) then 18
when (11.65<= REMAINING_PERIOD< 23.32 ) then 9
when (REMAINING_PERIOD is null ) then 9
end as REMAINING_PERIOD
from have
;
```

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to JinboZhao

07-21-2017 07:20 AM

Unfortunately I don't see a simpler way. Your logical conditions are quite variaed and use multiple variables, if it was just one variable, then you could apply a format, however that is not the case. You could perhaps save a bit of typing by using select syntax:

data want; set have; select; when(2 <= input(days,best.) < 5) then scorecard="12"; ... when(grade in ("2A","2C","3A") then scorecard=6; ... otherwise; end; run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to JinboZhao

07-21-2017 09:41 AM

You could conceivably create a set of formats based on your SCORECARD data. That's work, but perhaps not as much work. You would begin by creating two SAS data sets (one for numeric values, one for character values). For numeric variables, LABEL would be numeric:

Variable Start End Label

DAYS 2 4 12

DAYS 5 5 -4

DAYS 6 8 7

REMAINING_PERIOD 4.66 9.28 13

REMAINING_PERIOD 9.28 11.65 18

I'm not sure if that would be simpler than what you have now. It's actually more complex than this, since a fifth variable needs to be added. SAS needs to be told that 9.28 exactly belongs in the "18" category, not the "13" category. And when it comes to the character fields, separate lines are required for each value (2A, 2C, and 3A for example all need to go on separate lines as the value of START).

If this is simpler than what you have now (difficult to tell from here), the remaining programming would be relatively easy. But at least y ou have a choice to think about.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Astounding

07-22-2017 01:50 PM

Now THAT's an interesting one!

Here's an option, very complicated, but might be both reasonably time-saving and able to be changed.

First, assume we input your data, I'm naming the dataset "Have":

**data** Have;

length ID **8** DAYS **8** GRADE $**5** REMAINING_PERIOD **8**;

input ID DAYS GRADE REMAINING_PERIOD;

cards;

1 5 2A 5.23

2 6 2C 10.95

3 8 2D 20.22

4 3 1C .

5 4 2B 8.66

**run**;

Next, we input your specs for transformation:

**data** ScorecardSpecs;

length Variable $**32** Group **8** Label $**128** ScorecardPoints **8**;

input Variable Group Label && ScorecardPoints;

cards;

DAYS 2 2<= DAYS< 5 12

DAYS 3 5<= DAYS< 6 -4

DAYS 4 6<= DAYS< 9 7

GRADE 1 2A, 2C, 3A 6

GRADE 2 2D 55

GRADE 3 1C, 2B, 3B, _MISSING_, _UNKNOWN_ 6

GRADE 4 0, 4 8

REMAINING_PERIOD 3 4.66<= REMAINING_PERIOD< 9.28 13

REMAINING_PERIOD 4 9.28<= REMAINING_PERIOD< 11.65 18

REMAINING_PERIOD 5 11.65<= REMAINING_PERIOD< 23.32 9

REMAINING_PERIOD 5 REMAINING_PERIOD = _MISSING_ 9

**run**;

Note the two spaces after each value of "Label".

Now this is where things get weird. I'm just doing a first-level implementation, as a proof of concept. In this, the macro Days writes out the transformations needed for variable "DAYS". Then the macro is run, and generates the code. Finally, the data step creates dataset Want by including in the transformation. (I'm creating a variable NewDays, so you can see the original and the derived, but this is easy to change).

%let SpecVar = DAYS;

**%macro** ** Days**;

data _null_;

file 'C:...\GenDays.sas';

length CodeLine $**512**;

set ScoreCardSpecs;

if variable = "&SpecVar."

then do;

CodeLine = catx(" ", "if", Label, "then NewDays =", ScoreCardPoints, ";");

put CodeLine;

end;

run;

**%mend**;

%** Days**;

**data** want;

set have;

%include 'C:...\GenDays.sas';

**run**;

It seems to work for this variable.

To make this reasonable efficient to code, I was thinking the second-level implementation would be to have a macro that would generate all of the transformations of one type, another to do the transformations of another type, etc. This is where I don't know how many transformation "types" you have. In your post, it looks like DAYS and REMAINING_PERIOD are one type, and GRADE is another.

Second question, are you comfortable with this level of complexity?

Before I do any more, let me know if this is going in a reasonable direction.

Tom

Solution

08-03-2017
04:13 AM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to JinboZhao

07-23-2017 11:17 PM

It looks to me like you have code generation problem.

Do you really want to generate variables that use the same names as the input data? Because it looks like all of your scores are numeric and your input variables are mixed.

You probably just need to beef up your metadata (scorecard) so that it contains all of the information you need to generate the code to compute the score. For example you might want to either include a variable to tell if the source variable is character or numeric. Or at least populate the values of the character variables with quotes so that the generated code will be valid SAS syntax. So perhaps something like this:

```
data score_meta ;
length var $32 order 8 condition $200 points 8;
infile cards dsd dlm='|' ;
input var order condition points;
cards;
DAYS|2|2<= DAYS< 5|12
DAYS|3|5<= DAYS< 6|-4
DAYS|4|6<= DAYS< 9|7
GRADE|1|grade in ('2A' '2C' '3A')|6
GRADE|2|grade='2D'|55
GRADE|3|grade in ('1C' '2B' '3B' '_MISSING_' '_UNKNOWN_')|6
GRADE|4|grade in ('0' '4')|8
REMAINING_PERIOD|3|4.66<= REMAINING_PERIOD< 9.28|13
REMAINING_PERIOD|4|9.28<= REMAINING_PERIOD< 11.65|18
REMAINING_PERIOD|5|11.65<= REMAINING_PERIOD< 23.32|9
REMAINING_PERIOD|5|REMAINING_PERIOD is null|9
;
```

Then you could use that to generate an SQL statement use a simple data _null_ step.

```
filename code temp;
data _null_;
set score_meta end=eof;
by var order ;
file code ;
if _n_=1 then put
/ 'create table want as'
/ ' select id'
;
if first.var then put ' , case' ;
put ' when (' condition ') then ' points ;
if last.var then put ' end as ' var ;
if eof then put
' from have'
/ ';'
;
run;
```

Which you could then execute using %INCLUDE.

```
proc sql;
%include code / source2 ;
quit;
```

So for the metadata I posted above the SQL code would look like this:

```
create table want as
select id
, case
when (2<= DAYS< 5 ) then 12
when (5<= DAYS< 6 ) then -4
when (6<= DAYS< 9 ) then 7
end as DAYS
, case
when (grade in ('2A' '2C' '3A') ) then 6
when (grade='2D' ) then 55
when (grade in ('1C' '2B' '3B' '_MISSING_' '_UNKNOWN_') ) then 6
when (grade in ('0' '4') ) then 8
end as GRADE
, case
when (4.66<= REMAINING_PERIOD< 9.28 ) then 13
when (9.28<= REMAINING_PERIOD< 11.65 ) then 18
when (11.65<= REMAINING_PERIOD< 23.32 ) then 9
when (REMAINING_PERIOD is null ) then 9
end as REMAINING_PERIOD
from have
;
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-01-2017 10:20 AM

Hi, sorry to reply you so late.

This code does not work well on my SAS, it always say, the FILE WORK.HAVE.DATA does not exist.

I could not find the reason, could you help me?

Thank you.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

Posted in reply to JinboZhao

08-01-2017 10:46 AM

If you have an error you need to post the log so that the error can be seen in context.

Most likely you have copied example code from the discussion and not replaced the generic dataset name that was used to show the structure of the code you need with the name of your actual dataset.

If is common to post code in the form.

```
data want;
set have ;
...
```

since we are not running SAS on your machine and so do not have access to your datasets. You just need to adapt the code to work for your situation.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

08-01-2017 12:06 PM

Thank you very much. The exercise code works well in my SAS. It should work on my data.