ilikesas
Barite | Level 11

Hi,

 

Suppose I have the following data:

 

var1 var2 var3 var4 var5 var6
a a b a b b
a b b a b a
a b a a b a

 

var1 - var3 are considered one group, and var4 - var6 are considered a second group.

 

What I would like to do is select observations where identical entries are adjacent to each other within the same group.

So the first row will be selected because the value "a" is present for var1 and var2 (and they are adjacent within the first group), and also the value "b" is present in var5 and var6 (and they are adjacent within the second group).

Likewise the second row will also be selected because "b" is in var2 and var3. The third row will NOT be selected.

I guess that the tricky part here is to code for "adjacentness within group".

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

This is getting very close to the solution I pictured for hundreds of variables.  There are two changes to consider:

 

  • Must both groups contain the same number of variables?
  • Should processing be cut short once a match is found?

Here would be the result:

 

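data want;
   set have;
   /* Replace var1-var3 and var4-var6 with the full variable list for each group */
   array group1 {*} var1-var3;
   array group2 {*} var4-var6;
   flag=0;
   /* Stop scanning the first group as soon as an adjacent match is found */
   do i=1 to dim(group1)-1 until (flag=1);
      if group1{i}=group1{i+1} then flag=1;
   end;
   /* Scan the second group only if the first group had no match */
   if flag=0 then do i=1 to dim(group2)-1 until (flag=1);
      if group2{i}=group2{i+1} then flag=1;
   end;
   if flag;   /* keep only observations with an adjacent match */
run;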


8 REPLIES
Astounding
PROC Star

If this really represents your problem, it's easy enough:

 

data want;
   set have;
   if var1=var2 or var2=var3 or var4=var5 or var5=var6;
run;

 

If you actually have hundreds of variables instead, the solution would use arrays.  But let's not go there unless it's needed.

Reeza
Super User

Another option: define two different arrays and check for adjacent matches within each array.
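
A minimal sketch of that idea, assuming the six character variables var1-var6 from the question. The array names g1 and g2 and the helper variable match are just illustrative names, not anything from the original posts:

```sas
/* Sample data from the question */
data have;
  input (var1-var6) (:$1.);
  datalines;
a a b a b b
a b b a b a
a b a a b a
;
run;

data want;
  set have;
  array g1 {*} var1-var3;   /* first group  */
  array g2 {*} var4-var6;   /* second group */
  match = 0;
  /* compare each element with its right-hand neighbour, one group at a time */
  do i = 1 to dim(g1) - 1;
    if g1{i} = g1{i+1} then match = 1;
  end;
  do i = 1 to dim(g2) - 1;
    if g2{i} = g2{i+1} then match = 1;
  end;
  if match;        /* keep rows with at least one adjacent pair inside a group */
  drop i match;
run;
```

Because each array has its own loop, var3 and var4 are never compared with each other, so the third row (where var3 and var4 are both "a") is not selected. Rows 1 and 2 are kept.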

art297
Opal | Level 21

I'm not sure if I understand the requirements but here is the solution for what I think is being asked:

 

data have;
  infile cards dlm=',';
  informat var1-var6 $1.;
  length test $1;
  input var1-var6;
  array b $ var4-var6;
  call missing(test);
  if var1 eq var2 then test=var1;
  else if var2 eq var3 then test=var2;
  if not missing(test) and
     var1 in b and
     var3 in b;
  cards;
a,a,b,a,b,b
a,b,b,a,b,a
a,b,a,a,b,a
;

Art, CEO, AnalystFinder.com

 

ilikesas
Barite | Level 11

Hi art297,

 

Thanks for the code. I ran it and it gave me the first 2 rows. But then, to test it further, I made a small change to the second observation:

 

a,b,a,a,b,b --> I made the b's adjacent in the second group. When I ran the code again I got only the first row and not the second row as well, so it's as if the second group isn't included in the selection process.

art297
Opal | Level 21

Then try it with this modification:

 

data have (drop=test:);
  infile cards dlm=',';
  informat var1-var6 $1.;
  length test1 test2 $1;
  input var1-var6;
  array a $ var1-var3;
  array b $ var4-var6;
  call missing(test1);
  call missing(test2);
  if var1 eq var2 then test1=var1;
  else if var2 eq var3 then test1=var2;
  else if var4 eq var5 then test2=var4;
  else if var5 eq var6 then test2=var5;
  if not missing(test1) then do;
    if var1 in b and var3 in b then output;
  end;
  else if not missing(test2) then do;
    if var4 in a and var6 in a then output;
  end;
  cards;
a,a,b,a,b,b
a,b,b,a,b,a
a,b,a,a,b,a
a,b,a,a,b,b
;


 

stat_sas
Ammonite | Level 13

Hi,

 

Define two separate arrays based on the two groups of variables. Please try this.

 

data want;
set have;
array v13(*) var1-var3;
array v46(*) var4-var6;
flag=0;
do i=1 to dim(v13)-1;
   if v13{i}=v13{i+1} then flag+1;
   if v46{i}=v46{i+1} then flag+1;
end;
if flag;
run;

mkeintz
PROC Star

If there are multiple groups of varying sizes, there is still a way to avoid multiple arrays and multiple DO groups.  Say you have 100 vars in 6 groups of size 10, 20, 30, 20, 10, 10.  Placed in a single array of 100 vars, the groups would have "upper bounds" at elements 10, 30, 60, 80, 90, and 100 respectively:

 

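data want;
  set have;
  array var{*}   var1-var100;
  array upbnds{6} _temporary_ (10 30 60 80 90 100);  /* last element of each group */

  flag=0;
  U=1;   /* index of the group currently being scanned */
  do V=1 to dim(var)-1 until (flag=1);
    if V=upbnds{U} then U=U+1;              /* group boundary: do not compare across groups */
    else if var{V}=var{V+1} then flag=1;
  end;
  if flag;
run;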
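
For the six-variable layout in the original question, the same upper-bounds idea could be sketched as follows. This is only an illustration: it assumes a data set have with character variables var1-var6, and the names v, u, and i are placeholders chosen for this example.

```sas
data want;
  set have;                              /* assumes var1-var6 as in the question */
  array v {*} var1-var6;
  array upbnds {2} _temporary_ (3 6);    /* last position of each group */
  flag = 0;
  u = 1;
  do i = 1 to dim(v) - 1 until (flag = 1);
    if i = upbnds{u} then u = u + 1;     /* boundary: skip the var3/var4 comparison */
    else if v{i} = v{i+1} then flag = 1;
  end;
  if flag;
  drop i u flag;
run;
```

With the three sample rows from the question, rows 1 and 2 are kept. Row 3 is dropped because its only adjacent pair of equal values (var3 and var4) straddles the group boundary.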
