BookmarkSubscribeRSS Feed
sas_Forum
Calcite | Level 5

Grouping and Giving one id
Hi i am having pan1,pan2,pan3 and add1
now i want to generate household id on that

data test;
input pan1 pan2 pan3 add1
cards;
1 2 3 4
5 6 7 8
. . . .
1 2 7 8
1 5 9 12
13 14 15 16
12 18 19 .
run;

Here i want to create one unique number as Household id
first it should check the pan1 if any of the pan are in pan2 or pan3 or add1 it should give one
household id like versa pan2 should check the observations matching in pan1 or pan3 or add1
and pan3 and add1 to do the same like pan1 and pan2..
my final createria is to get one number unique if any of the pan1,pan2,pan3 and add1 obs matching
any where and i should generate one new number if all the observations are blank.
These is the process of creating household in Dataflux i am doing this in baase sas.

output:
pan1 pan2 pan3 add1 Household
1 2 3 4  1    
5 6 7 8  1
. . . .  2
1 2 7 8  1
1 5 9 12 1
13 14 15 16 3
12 18 19 . 1

4 REPLIES 4
art297
Opal | Level 21

You've posted this twice but haven't provided enough information for one to understand what you are trying to do.  You seem to want to increase Household by 1 each time something does or doesn't happen, but I wasn't able to understand what that event might be.

Also, you will have to clearly indicate if your actual data are just going to be numbers, like in your example and, if so, how large those numbers can be.

sas_Forum
Calcite | Level 5

I want the Household id for the if  there is a observation 123 in pan1 and if this observations again accor in pan2 at any place and if in pan3 and add1 then these all should get one Household id not only that

pan1 pan2 pan3 add1

123   456   789 345

567 123  098 067

In this example if 123 is repeating on pan1 and pan2 then these two get one H_id not only that the number corresponding to

123 if occuring any where like(456 789 345 in obs1 and 567 098 067 in obs 2) like that they should

get same H_id...

Tom
Super User Tom
Super User

It sounds like you want to split your ids into connected subgraphs.  If you have SAS/OR licensed I have had success using PROC NETDRAW to do this.  Otherwise you can code your own logic.

I would start by converting your data from the current four variables to two variables and populate it with pairs of values from your four variables.

data relations;

  set test;

  array x pan1-pan3 add1 ;

  do i=1 to dim(x)-1;

     do j=i+1 to dim(x);

       id1=x(i); id2=x(j); output;

       id1=x(j); id2=x(i); output;

     end;

  end;

  keep id1 id2 ;

run;

proc sort data=relations nodupkey;

  by id1 id2;

run;

You can grow your own subnets by looping and adding in all of the connected ids from the current list until the current subnet stops growing.

Ksharp
Super User

Still not understand what is your mean totally.

How about:

data test;
input pan1 pan2 pan3 add1 ;
cards;
1 2 3 4
5 6 7 8
. . . .
1 2 7 8
1 5 9 12
13 14 15 16
12 18 19 .
;
run;

data want(keep=pan1 pan2 pan3 add1 household);
 set test;
 retain first;
 count+1;
 _count=0;
 do i=1 to _nobs;
  set test(rename=(pan1=_pan1 pan2=_pan2 pan3=_pan3 add1=_add1)) nobs=_nobs point=i;
  _count+1;
  if count ne _count and
   ((pan1=_pan1 and not missing(_pan1))  or  (pan1=_pan2 and not missing(_pan2))  or (pan1=_pan3 and not missing(_pan3)) or (pan1=_add1 and not missing(_add1)) or 
    (pan2=_pan1 and not missing(_pan1))  or  (pan2=_pan2 and not missing(_pan2))  or (pan2=_pan3 and not missing(_pan3)) or (pan2=_add1 and not missing(_add1)) or 
    (pan3=_pan1 and not missing(_pan1))  or  (pan3=_pan2 and not missing(_pan2))  or (pan3=_pan3 and not missing(_pan3)) or (pan3=_add1 and not missing(_add1)) or 
    (add1=_pan1 and not missing(_pan1))  or  (add1=_pan2 and not missing(_pan2))  or (add1=_pan3 and not missing(_pan3)) or (add1=_add1 and not missing(_add1))  )
    then do; _n+1;
           if _n eq 1 then do;
                             n+1; first=n;
                             household=n;
                           end;
             else household=first;
           found=1;leave;
          end;
 end;
 if not found then do; n+1; household=n; end;
run;

Ksharp

Message was edited by: xia keshan

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1822 views
  • 0 likes
  • 4 in conversation