BookmarkSubscribeRSS Feed
sas_Forum
Calcite | Level 5

Grouping and Giving one id
Hi i am having pan1,pan2,pan3 and add1
now i want to generate household id on that

data test;
input pan1 pan2 pan3 add1
cards;
1 2 3 4
5 6 7 8
. . . .
1 2 7 8
1 5 9 12
13 14 15 16
12 18 19 .
run;

Here i want to create one unique number as Household id
first it should check the pan1 if any of the pan are in pan2 or pan3 or add1 it should give one
household id like versa pan2 should check the observations matching in pan1 or pan3 or add1
and pan3 and add1 to do the same like pan1 and pan2..
my final createria is to get one number unique if any of the pan1,pan2,pan3 and add1 obs matching
any where and i should generate one new number if all the observations are blank.
These is the process of creating household in Dataflux i am doing this in baase sas.

output:
pan1 pan2 pan3 add1 Household
1 2 3 4  1    
5 6 7 8  1
. . . .  2
1 2 7 8  1
1 5 9 12 1
13 14 15 16 3
12 18 19 . 1

4 REPLIES 4
art297
Opal | Level 21

You've posted this twice but haven't provided enough information for one to understand what you are trying to do.  You seem to want to increase Household by 1 each time something does or doesn't happen, but I wasn't able to understand what that event might be.

Also, you will have to clearly indicate if your actual data are just going to be numbers, like in your example and, if so, how large those numbers can be.

sas_Forum
Calcite | Level 5

I want the Household id for the if  there is a observation 123 in pan1 and if this observations again accor in pan2 at any place and if in pan3 and add1 then these all should get one Household id not only that

pan1 pan2 pan3 add1

123   456   789 345

567 123  098 067

In this example if 123 is repeating on pan1 and pan2 then these two get one H_id not only that the number corresponding to

123 if occuring any where like(456 789 345 in obs1 and 567 098 067 in obs 2) like that they should

get same H_id...

Tom
Super User Tom
Super User

It sounds like you want to split your ids into connected subgraphs.  If you have SAS/OR licensed I have had success using PROC NETDRAW to do this.  Otherwise you can code your own logic.

I would start by converting your data from the current four variables to two variables and populate it with pairs of values from your four variables.

data relations;

  set test;

  array x pan1-pan3 add1 ;

  do i=1 to dim(x)-1;

     do j=i+1 to dim(x);

       id1=x(i); id2=x(j); output;

       id1=x(j); id2=x(i); output;

     end;

  end;

  keep id1 id2 ;

run;

proc sort data=relations nodupkey;

  by id1 id2;

run;

You can grow your own subnets by looping and adding in all of the connected ids from the current list until the current subnet stops growing.

Ksharp
Super User

Still not understand what is your mean totally.

How about:

data test;
input pan1 pan2 pan3 add1 ;
cards;
1 2 3 4
5 6 7 8
. . . .
1 2 7 8
1 5 9 12
13 14 15 16
12 18 19 .
;
run;

data want(keep=pan1 pan2 pan3 add1 household);
 set test;
 retain first;
 count+1;
 _count=0;
 do i=1 to _nobs;
  set test(rename=(pan1=_pan1 pan2=_pan2 pan3=_pan3 add1=_add1)) nobs=_nobs point=i;
  _count+1;
  if count ne _count and
   ((pan1=_pan1 and not missing(_pan1))  or  (pan1=_pan2 and not missing(_pan2))  or (pan1=_pan3 and not missing(_pan3)) or (pan1=_add1 and not missing(_add1)) or 
    (pan2=_pan1 and not missing(_pan1))  or  (pan2=_pan2 and not missing(_pan2))  or (pan2=_pan3 and not missing(_pan3)) or (pan2=_add1 and not missing(_add1)) or 
    (pan3=_pan1 and not missing(_pan1))  or  (pan3=_pan2 and not missing(_pan2))  or (pan3=_pan3 and not missing(_pan3)) or (pan3=_add1 and not missing(_add1)) or 
    (add1=_pan1 and not missing(_pan1))  or  (add1=_pan2 and not missing(_pan2))  or (add1=_pan3 and not missing(_pan3)) or (add1=_add1 and not missing(_add1))  )
    then do; _n+1;
           if _n eq 1 then do;
                             n+1; first=n;
                             household=n;
                           end;
             else household=first;
           found=1;leave;
          end;
 end;
 if not found then do; n+1; household=n; end;
run;

Ksharp

Message was edited by: xia keshan

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 932 views
  • 0 likes
  • 4 in conversation