Solved: Create a new variable based on other variables

ANKH1 · Posted 02-24-2023 10:49 AM

There is this dataset. Var1 shows the place within a group. Var2 shows the ID to which the member is related to.

site ID var1 var2
1 11 2 .
1 12 1 11
2 21 2 .
2 22 1 21
2 31 2 .
2 33 1 31

We need to create a variable that shows that two IDs belong to a group.

site ID var1 var2 group
1 11 2 . 1
1 12 1 11 1
2 21 2 . 2
2 22 1 21 2
2 31 2 . 3
2 33 1 31 3

PaigeMiller · Posted 02-24-2023 02:31 PM

I still think there is a piece of logic you haven't explained, but based upon what you have said, this works:

data want;
    set have;
    group=1+floor((_n_-1)/2);
run;

I contend something hasn't been explained, because the above solution doesn't even look at the value of var1 and var2.

--
Paige Miller

View solution in original post

PaigeMiller · Posted 02-24-2023 12:46 PM

Please explain the logic used here to derive variable GROUP.

--
Paige Miller

ANKH1 · Posted 02-24-2023 01:24 PM

group indicates that ID 11 and ID 12 are associated. var2 indicates that ID 12 is related to ID 11 (value 11).

PaigeMiller · Posted 02-24-2023 02:01 PM

is VAR1 always 2 in the first record and VAR1 always 1 in the second record? Is VAR2 always present when var1=1 and missing when var1^=1? Are there ever more than 2 records in each GROUP?

--
Paige Miller

ANKH1 · Posted 02-24-2023 02:14 PM

Yes to the first two questions. No, only 2 records per group.

PaigeMiller · Posted 02-24-2023 02:31 PM

I still think there is a piece of logic you haven't explained, but based upon what you have said, this works:

data want;
    set have;
    group=1+floor((_n_-1)/2);
run;

I contend something hasn't been explained, because the above solution doesn't even look at the value of var1 and var2.

--
Paige Miller

ANKH1 · Posted 02-24-2023 02:43 PM

Thanks. We need for ID 11 and ID 12 to belong to the same unique group. ID 12 shows that it is related to ID 11 by showing in var2 that 11 is the ID it is attached to. var1 shows that ID 11 is number 1 in the list and ID 12 is number 2.

PaigeMiller · Posted 02-24-2023 02:45 PM

Yes, I think I understand that, but what is wrong with my solution that doesn't use VAR1 or VAR2? I get the proper values of GROUP.

--
Paige Miller

ANKH1 · Posted 02-24-2023 03:56 PM

Your code worked perfectly. I was just wondering if a proc sort will be necessary in order not to mess up the pairings? Since the group variable is based on observation number?

PaigeMiller · Posted 02-24-2023 04:54 PM

What is not sorted properly?

--
Paige Miller

ANKH1 · Posted 02-27-2023 11:11 AM

I was just wondering if there should be prior step but it works great. Thanks!

Ksharp · Posted 02-25-2023 07:22 AM

data x;
infile cards expandtabs;
input site ID var1 var2;
from=catx('|',site,id);
to=catx('|',site,var2);
cards;
1    11    2       .
1    12   1      11
2    21   2       .
2    22   1      21
2    31   2       .
2    33   1     31
;



data have;
set x;
if not missing(id) and not missing(var2);
keep from to;
run;



data full;
  set have end=last;
  if _n_ eq 1 then do;
   declare hash h();
    h.definekey('node');
     h.definedata('node');
     h.definedone();
  end;
  output;
  node=from; h.replace();
  from=to; to=node; output;
  node=from; h.replace();
  if last then h.output(dataset:'node');
  drop node;
run;


data want(keep=node household);
declare hash ha(ordered:'a');
declare hiter hi('ha');
ha.definekey('count');
ha.definedata('last');
ha.definedone();
declare hash _ha(hashexp: 16);
_ha.definekey('key');
_ha.definedone();

if 0 then set full;
declare hash from_to(dataset:'full',hashexp:20,multidata:'y');
 from_to.definekey('from');
 from_to.definedata('to');
 from_to.definedone();

if 0 then set node;
declare hash no(dataset:'node');
declare hiter hi_no('no');
 no.definekey('node');
 no.definedata('node');
 no.definedone();
 

do while(hi_no.next()=0);
 household+1; output;
 count=1;
 key=node;_ha.add();
 last=node;ha.add();
 rc=hi.first();
 do while(rc=0);
   from=last;rx=from_to.find();
   do while(rx=0);
     key=to;ry=_ha.check();
      if ry ne 0 then do;
       node=to;output;rr=no.remove(key:node);
       key=to;_ha.add();
       count+1;
       last=to;ha.add();
      end;
      rx=from_to.find_next();
   end;
   rc=hi.next();
end;
ha.clear();_ha.clear();
end;
stop;
run;

proc sql;
create table final_want as
select a.site,a.id,a.var1,a.var2,b.household as group 
 from x as a left join want as b
  on catx('|',a.site,a.id)=b.node
   order by site,id;
quit;

Create a new variable based on other variables

Re: Create a new variable based on other variables

Re: Create a new variable based on other variables

Re: Create a new variable based on other variables

Re: Create a new variable based on other variables

Re: Create a new variable based on other variables

Re: Create a new variable based on other variables

Re: Create a new variable based on other variables

Re: Create a new variable based on other variables

Re: Create a new variable based on other variables

Re: Create a new variable based on other variables

Re: Create a new variable based on other variables

Re: Create a new variable based on other variables

Ready to join fellow brilliant minds for the SAS Hackathon?

Classroom Training Available!