turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- Lookup rows in a different table based on column v...

Topic Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-28-2014 10:59 AM

Hello there,

I've got a question I've yet to find an answer to. Let's say I have a table with columns containing a 1 or 0, something like this:

Value1 | Value2 | Value3 | Value4 |
---|---|---|---|

1 | 0 | 0 | 0 |

0 | 1 | 1 | 0 |

0 | 0 | 1 | 1 |

1 | 0 | 1 | 0 |

1 | 1 | 1 | 1 |

And another table that contains lookup values:

Value | Return |
---|---|

1 | 5 |

2 | 18 |

3 | 2 |

4 | 1 |

What I want to do is add a field which is the sum of the associated value in the lookup table if the indicator is a 1. So something like this:

Value1 | Value2 | Value3 | Value4 | Total |
---|---|---|---|---|

1 | 0 | 0 | 0 | 5 |

0 | 1 | 1 | 0 | 20 |

0 | 0 | 1 | 1 | 3 |

1 | 0 | 1 | 0 | 7 |

1 | 1 | 1 | 1 | 26 |

Thanks in advance

Accepted Solutions

Solution

02-28-2014
07:10 PM

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to HSARAI

02-28-2014 07:10 PM

Hash seems pretty handy for this kind of lookup:

**data** source;

input value1 value2 value3 value4;

cards;

1 0 0 0

0 1 1 0

0 0 1 1

1 0 1 0

1 1 1 1

;;;;

**run**;

**data** lookup;

input value return;

cards;

1 5

2 18

3 2

4 1

;;;;

**data** want;

if _n_=**1** then do;

declare hash h(dataset:'lookup');

h.definekey('value');

h.definedata('return');

h.definedone();

call missing (value,return);

end;

set source;

array _v value1-value4;

do over _v;

if _v=**1** then do;

rc=h.find(key:_i_);

total=sum(total,return);

end;

end;

drop value return rc;

**run**;

Haikuo

All Replies

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to HSARAI

02-28-2014 11:04 AM

Are your tables that small?

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Reeza

02-28-2014 11:05 AM

No. In reality there are 200+ indicator variables with close to a million rows.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to HSARAI

02-28-2014 11:06 AM

How big is the lookup table?

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Reeza

02-28-2014 11:08 AM

the lookup table has the same number of rows as the indicator variables, so 200+ rows.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to HSARAI

02-28-2014 11:41 AM

You need to know the number of rows vars ahead of time, but you can easily put those into macro variables.

This solutions loads the lookup table into a temporary array and then filters through it. A Hash solution may even be more efficient.

data source;

input value1 value2 value3 value4;

cards;

1 0 0 0

0 1 1 0

0 0 1 1

1 0 1 0

1 1 1 1

;

run;

data lookup;

input value return;

cards;

1 5

2 18

3 2

4 1

;

run;

data want;

array lookup(4) _temporary_;

do i=1 to 4;

set lookup;

lookup(i)=return;

end;

do j=1 to 5;

array obs(4) value1-value4;

set source;

running_total=0;

do i=1 to 4;

running_total=running_total+obs(i)*lookup(i);

end;

output;

drop i j value return;

end;

run;

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to HSARAI

02-28-2014 11:33 AM

Given your examples can you explain how you got the value of total for each row?

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to art297

02-28-2014 11:38 AM

For the first row, just the value of Value1 is equal to 1. Therefore you look up the rows in the lookup table where Value=1, whose value of the Result column is 5.

For the second row, the value of Value2 and Value3 are equal to 1. Therefore you look up the rows in the lookup table where Value=2 and Value=3 and add the values of the Result column for those rows which are 18 and 2, sum is 20.

For the third row, the value of Value3 and Value4 are equal to 1. Therefore you look up the rows in the lookup table where Value=3 and Value=4 and add the values of the Result column for those rows which are 2 and 1, sum is 3.

And so on...

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to HSARAI

02-28-2014 11:52 AM

Why not just transpose the data ? So table one becomes:

RowId Value YesNo

1 1 1

1 2 0

1 3 0

1 4 0

1 5 0

2 1 0

2 2 1

....

Then you can just join with the lookup values an sum up RETURN to get TOTAL.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

02-28-2014 11:58 AM

I considered a transpose solution too, but the problem is that the indicator variable isn't always the same name as the lookup table. So there could be some values where the column name is ValueABC and I would do a translation to look up the value where the Value is 'XYZ' in the lookup table. I should have specified this, but even though a transpose solution would work with the example I provided it wouldn't unfortunately work for the actual problem I am trying to solve.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to HSARAI

02-28-2014 12:05 PM

You could consider PROC SCORE. You again need to process the look up table and name the "VALUE" to match the name in the master. But you don't have to know how many.

input value1 value2 value3 value4;

cards;

1 0 0 0

0 1 1 0

0 0 1 1

1 0 1 0

1 1 1 1

;;;;

input value return;

cards;

1 5

2 18

3 2

4 1

;;;;

id value;

var return;

retain _type_ 'SCORE' _name_ 'SUM';

set score(drop=_name_);

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to HSARAI

02-28-2014 12:08 PM

This is also basic matrix math, so IML will probably handle it very easily.

Lots of options

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Reeza

02-28-2014 12:12 PM

I'm sure there are a lot of options since I can't imagine this being uncommon. I just need to try understand your proposed solution with my minimal experience of the SAS language, but I'm hoping I can let you know soon if I worked something out.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to HSARAI

02-28-2014 12:36 PM

Below is a fairly simple way. The first line creates a macro variable called &n. and sets it to the number of rows that you have, thus you would just have to change it from 4 to 200 (or however number of values that you actually have). _n_ is an automatic counter that keeps track of the number of the row you are actually reading. However, within an interation _n_ can be reused and it won't forget its correct value when it gets to the next iteration.

data table1;

input Value1-Value4;

cards;

1 0 0 0

0 1 1 0

0 0 1 1

1 0 1 0

1 1 1 1

;

run;

data table2;

input Value Return;

cards;

1 5

2 18

3 2

4 1

;

run;

%let n=4;

data want (drop=Value Return);

array recodes(&n.)_temporary_;

array values(*) value1-value&n.;

if _n_ eq 1 then do;

do _n_=1 to &n.;

set table2;

recodes(value)=Return;

end;

end;

set table1;

do _n_=1 to dim(values);

total=sum(total,values(_n_)*recodes(_n_));

end;

run;

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to HSARAI

02-28-2014 02:01 PM

You will need to get a solution to matching the 0/1 variables to the proper weight value from your lookup table.

The right answer probably depends on why the names are not matching now. If it is just because users are entering names manually perhaps you can use the order of the variables in the dataset to drive the match?

Call your first dataset HAVE and your value lookup table LOOKUP. Then this code should create new dataset WANT that is a copy of HAVE with the new TOTAL variable appended on the right.

data want ;

set have ;

array all _numeric_;

do _i_ = 1 to dim(all) ;

if all(_i_) then do;

set lookup (keep=return) point=_i_ ;

total = sum(total,return);

end;

end;

drop return ;

run;