Contrast two sets of variables

Solved
Frequent Contributor
Posts: 75

Contrast two sets of variables

My question is as follows. Suppose the data looks like:

ID----X1----X2----X3----Z1----Z2----Z3

For each ID, consider two sets of variables {X1, X2, X3} and {Z1, Z2, Z3} that:

- The numbers of Xs and Zs may be equal or not. They may also have missing values.

- Values of variables in each set is unique. That is, for each ID, X1 not equal X2 not equal X3. The same applies for Zs.

- Values of Xs and Zs can be equal, and there comes the question. How can I create a new data that retains equal values of Xs and Zs and exclude unequal values. For example, if X1 is equal to any Zs, then X1 will be retained together with the Z.

I write array statements but get stuck at a point (after then). I'm not sure if it can work with these statements.

data arr;

set arr;

array a1(3) X1 X2 X3;

do i=1 to 3;

array a2(3) Z1 Z2 Z3;

do i=1 to 3;

if a1 = a2 then ???;

end;

end;

run;

Accepted Solutions
Solution
‎06-28-2013 04:17 AM
Super User
Posts: 10,784

Re: Contrast two sets of variables

Posted in reply to NonSleeper

Try this one:

```data temp;
input ID x1 x2 x3 z1 z2 z3;
datalines;
1001      11      12      13     .      12      11
1002      21      22      23    24     25     26
1003      31      32      33    31     32      .
1004      41      .        43    41     44     45
;
run;
data temp1(keep=id _:);
set temp;
array one x1 x2 x3 ;      array _one _x1 _x2 _x3 ;
array two z1 z2 z3 ;      array _two _z1 _z2 _z3 ;
do i=1 to dim(one);
do j=1 to dim(two);
if one{i}=two{j} then do;_one{i}=one{i};_two{j}=two{j};end;
end;
end;
run;
```

Ksharp

All Replies
Posts: 3,167

Re: Contrast two sets of variables

Posted in reply to NonSleeper

data want;

set have;

array x x:;

array z z:;

do over x;

if x in z then do; output;return;end;

end;

run;

Haikuo

update: if you don't want missing=missing scenario, then try the following the tweaked version:

data want;

set have;

array x x:;

array z z:;

do over x;

if not missing(x) then if x in z then do; output;return;end;

end;

run;

Frequent Contributor
Posts: 75

Re: Contrast two sets of variables

I'm not sure where I did it wrong, but I couldn't make it work. Consider a hypothetical data:

data temp;

input ID x1 x2 x3 z1 z2 z3;

datalines;

1001      11      12      13     .      12      11

1002      21      22      23    24     25     26

1003      31      32      33    31     32      .

1004      41      .        43    41     44     45

;

run;

data temp1;

set temp;

array one x1 x2 x3 ;

array two z1 z2 z3 ;

do over one;

if one in two then do;

output;

return;

end;

end;

run;

The output temp1 is the same as the original data temp, while it should look like:

1001     11     12     .     .     12     11

1002      .        .       .     .      .        .

1003     31     32     .    31   32      .

1004     41      .       .    41    .        .

Solution
‎06-28-2013 04:17 AM
Super User
Posts: 10,784

Re: Contrast two sets of variables

Posted in reply to NonSleeper

Try this one:

```data temp;
input ID x1 x2 x3 z1 z2 z3;
datalines;
1001      11      12      13     .      12      11
1002      21      22      23    24     25     26
1003      31      32      33    31     32      .
1004      41      .        43    41     44     45
;
run;
data temp1(keep=id _:);
set temp;
array one x1 x2 x3 ;      array _one _x1 _x2 _x3 ;
array two z1 z2 z3 ;      array _two _z1 _z2 _z3 ;
do i=1 to dim(one);
do j=1 to dim(two);
if one{i}=two{j} then do;_one{i}=one{i};_two{j}=two{j};end;
end;
end;
run;
```

Ksharp

Frequent Contributor
Posts: 75

Re: Contrast two sets of variables

Can you explain what the _: does in the KEEP statement? The program works, I just want to know.

Super User
Posts: 10,784

Re: Contrast two sets of variables

Posted in reply to NonSleeper

Keep all of variables which start with _    (i.e.  _x1 ....._z1 ...).

Or use HaiKuo's code .

Posts: 3,167

Re: Contrast two sets of variables

Posted in reply to NonSleeper

Without seeing your sample output, I misunderstood your requirement, thought you want to only output ( you use retain) those records with matched values, instead, you want output all of the records but with only matched values filled. In that case you will need a minor tweak:

data temp;

input ID x1 x2 x3 z1 z2 z3;

datalines;

1001      11      12      13     .      12      11

1002      21      22      23    24     25     26

1003      31      32      33    31     32      .

1004      41      .        43    41     44     45

;

run;

data temp1;

set temp;

array one x1 x2 x3 ;

array two z1 z2 z3 ;

do over one;

if one not in two then call missing(one);

end;

do over two;

if two not in one then call missing(two);

end;

run;

BTW, this is exactly why Reeza came up with the thread: https://communities.sas.com/docs/DOC-2263

Had I seen your desired output, you would have your question answered more promptly.

HTH,

Haikuo

Posts: 3,852

Re: Contrast two sets of variables

```
do over one;
if one not in two then call missing(one);
end;
do over two;
if two not in one then call missing(two);
end;
run;

```

Can you look both ways at once and only have to DO OVER one time?

Posts: 3,167

Re: Contrast two sets of variables

Posted in reply to data_null__

Oh, Boy. Thank you so much, DN! I am very lucky to have you mentoring me over my slow growth.

Yes, we can, and of course you knew the answer all along. This is amazing trait of the array(). So after one array cycle is kicked off, the other defined arrays will syn at the same time? How does it work?

To get back to OP, you don't need 2X do loops, just one will do:

data temp1;

set temp;

array one x1 x2 x3 ;

array two z1 z2 z3 ;

do over one;

if one not in two then call missing(one);

if two not in one then call missing(two);

end;

run;

Haikuo

Update: Dug a little deeper, the picture is getting clearer. It works the best if all of the arrays have the same dimensions, otherwise, you will either miss some variable in the bigger arrays or get some error in the smaller ones.

Posts: 3,852

Re: Contrast two sets of variables

`So after one array cycle is kicked off, the other defined arrays will syn at the same time? How does it work?`

There is no magic.  Both arrays have implied index variable _I_ therefore DO OVER "either" will increment the common index variable.  If the arrays had different number of elements there would be a problem.

The syntax for implicit arrays is something this.

array array-name(index-variable-name) sas-variable-list;

where the default index-variable-name is _I_.

Posts: 3,167

Re: Contrast two sets of variables

Posted in reply to data_null__

Thanks, DN.

Posts: 5,535

Re: Contrast two sets of variables

Hi! Is there a good reason why DO OVER is not documented? I for one will not use it until it is. - PG

PG
SAS Super FREQ
Posts: 4,242

Re: Contrast two sets of variables

The DO OVER syntax is not documented because it was depricated in SAS Version 7. See

1780 - Implicit ARRAY support in Version 7 and beyond

Posts: 3,852

Re: Contrast two sets of variables

I was once ask not to use PROC DELETE because it was not documented, anymore, at that time, 10 years ago or so.  I see that it has been "reinstated".

I suppose the same could be said for Implicit ARRAYs and I see them used not infrequently in programs that I look at from time to time.  If I recall correctly SAS has had explicitly subscripted arrays since version 6 that's well over 20 years ago.  It time to let go.:smileycry:

Posts: 5,535

Re: Contrast two sets of variables

Thanks Rick, the Note doesn't tell why the feature would be discouraged. I would like to think that SAS language(s) is evolving toward more explicit coding. Implicit features are more difficult to learn, master and read, which explains also how attached we become once we grasp them. - PG

PG
🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
• 15 replies
• 532 views
• 6 likes
• 6 in conversation