BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
HSARAI
Calcite | Level 5

Hello there,

I've got a question I've yet to find an answer to. Let's say I have a table with columns containing a 1 or 0, something like this:

Value1Value2Value3Value4
1000
0110
0011
1010
1111

And another table that contains lookup values:

ValueReturn
15
218
32
41

What I want to do is add a field which is the sum of the associated value in the lookup table if the indicator is a 1. So something like this:

Value1Value2Value3Value4Total
10005
011020
00113
10107
111126

Thanks in advance

1 ACCEPTED SOLUTION

Accepted Solutions
Haikuo
Onyx | Level 15

Hash seems pretty handy for this kind of lookup:

data source;

   input value1 value2 value3 value4;

   cards;

1    0 0    0

0    1 1    0

0    0 1    1

1    0 1    0

1    1 1    1

;;;;

   run;

data lookup;

   input value return;

   cards;

1    5

2    18

3    2

4    1

;;;;

data want;

   if _n_=1 then do;

      declare hash h(dataset:'lookup');

      h.definekey('value');

      h.definedata('return');

      h.definedone();

      call missing (value,return);

    end;

    set source;

    array _v value1-value4;

      do over _v;

        if _v=1 then do;

          rc=h.find(key:_i_);

          total=sum(total,return);

         end;

      end;

        drop value return rc;

run;


Haikuo

View solution in original post

20 REPLIES 20
Reeza
Super User

Are your tables that small?

HSARAI
Calcite | Level 5

No. In reality there are 200+ indicator variables with close to a million rows.

Reeza
Super User

How big is the lookup table?

HSARAI
Calcite | Level 5

the lookup table has the same number of rows as the indicator variables, so 200+ rows.

Reeza
Super User

You need to know the number of rows vars ahead of time, but you can easily put those into macro variables.

This solutions loads the lookup table into a temporary array and then filters through it. A Hash solution may even be more efficient.

data source;

input value1 value2 value3 value4;

cards;

1    0    0    0

0    1    1    0

0    0    1    1

1    0    1    0

1    1    1    1

;

run;

data lookup;

input value return;

cards;

1    5

2    18

3    2

4    1

;

run;

data want;

array lookup(4) _temporary_;

do i=1 to 4;

set lookup;

lookup(i)=return;

end;

do j=1 to 5;

array obs(4) value1-value4;

set source;

running_total=0;

do i=1 to 4;

    running_total=running_total+obs(i)*lookup(i);

end;

output;

drop i j value return;

end;

run;

art297
Opal | Level 21

Given your examples can you explain how you got the value of total for each row?

HSARAI
Calcite | Level 5

For the first row, just the value of Value1 is equal to 1. Therefore you look up the rows in the lookup table where Value=1, whose value of the Result column is 5.

For the second row, the value of Value2 and Value3 are equal to 1. Therefore you look up the rows in the lookup table where Value=2 and Value=3 and add the values of the Result column for those rows which are 18 and 2, sum is 20.

For the third row, the value of Value3 and Value4 are equal to 1. Therefore you look up the rows in the lookup table where Value=3 and Value=4 and add the values of the Result column for those rows which are 2 and 1, sum is 3.

And so on...

Tom
Super User Tom
Super User

Why not just transpose the data ? So table one becomes:

RowId Value YesNo

1 1 1

1 2 0

1 3 0

1 4 0

1 5 0

2 1 0

2 2 1

....

Then you can just join with the lookup values an sum up RETURN to get TOTAL.

HSARAI
Calcite | Level 5

I considered a transpose solution too, but the problem is that the indicator variable isn't always the same name as the lookup table. So there could be some values where the column name is ValueABC and I would do a translation to look up the value where the Value is 'XYZ' in the lookup table. I should have specified this, but even though a transpose solution would work with the example I provided it wouldn't unfortunately work for the actual problem I am trying to solve.

data_null__
Jade | Level 19

You could consider PROC SCORE.  You again need to process the look up table and name the "VALUE" to match the name in the master.  But you don't have to know how many.

data source;
   input value1 value2 value3 value4;
   cards;
1    0    0    0
0    1    1    0
0    0    1    1
1    0    1    0
1    1    1    1
;;;;
   run;


data lookup;
   input value return;
   cards;
1    5
2    18
3    2
4    1
;;;;
   run;
proc transpose data=lookup prefix=value out=score;
   id value;
   var return;
   run;
data score;
   retain _type_ 'SCORE' _name_ 'SUM';
  
set score(drop=_name_);
   run;
proc print;
  
run;
proc score data=source score=score out=scored;
   run;
proc print;
  
run;

Reeza
Super User

This is also basic matrix math, so IML will probably handle it very easily.

Lots of options Smiley Happy

HSARAI
Calcite | Level 5

I'm sure there are a lot of options since I can't imagine this being uncommon. I just need to try understand your proposed solution with my minimal experience of the SAS language, but I'm hoping I can let you know soon if I worked something out.

art297
Opal | Level 21

Below is a fairly simple way.  The first line creates a macro variable called &n. and sets it to the number of rows that you have, thus you would just have to change it from 4 to 200 (or however number of values that you actually have).  _n_ is an automatic counter that keeps track of the number of the row you are actually reading.  However, within an interation _n_ can be reused and it won't forget its correct value when it gets to the next iteration.

data table1;

  input Value1-Value4;

  cards;

1 0 0 0

0 1 1 0

0 0 1 1

1 0 1 0

1 1 1 1

;

run;

data table2;

  input Value Return;

  cards;

1 5

2 18

3 2

4 1

;

run;

%let n=4;

data want (drop=Value Return);

  array recodes(&n.)_temporary_;

  array values(*) value1-value&n.;

  if _n_ eq 1 then do;

    do _n_=1 to &n.;

      set table2;

      recodes(value)=Return;

    end;

  end;

  set table1;  

  do _n_=1 to dim(values);

    total=sum(total,values(_n_)*recodes(_n_));

  end;

run;

Tom
Super User Tom
Super User

You will need to get a solution to matching the 0/1 variables to the proper weight value from your lookup table.

The right answer probably depends on why the names are not matching now.  If it is just because users are entering names manually perhaps you can use the order of the variables in the dataset to drive the match?

Call your first dataset HAVE and your value lookup table LOOKUP.  Then this code should create new dataset WANT that is a copy of HAVE with the new TOTAL variable appended on the right.

data want ;

   set have ;

   array all _numeric_;

   do _i_ = 1 to dim(all) ;

      if all(_i_) then do;

         set lookup (keep=return) point=_i_ ;

         total = sum(total,return);

     end;

   end;

   drop return ;

run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 20 replies
  • 4845 views
  • 8 likes
  • 7 in conversation