DATA Step, Macro, Functions and more

Giving Unique Number

Reply
Frequent Contributor
Posts: 137

Giving Unique Number

Grouping and Giving one id
Hi i am having pan1,pan2,pan3 and add1
now i want to generate household id on that

data test;
input pan1 pan2 pan3 add1
cards;
1 2 3 4
5 6 7 8
. . . .
1 2 7 8
1 5 9 12
13 14 15 16
12 18 19 .
run;

Here i want to create one unique number as Household id
first it should check the pan1 if any of the pan are in pan2 or pan3 or add1 it should give one
household id like versa pan2 should check the observations matching in pan1 or pan3 or add1
and pan3 and add1 to do the same like pan1 and pan2..
my final createria is to get one number unique if any of the pan1,pan2,pan3 and add1 obs matching
any where and i should generate one new number if all the observations are blank.
These is the process of creating household in Dataflux i am doing this in baase sas.

output:
pan1 pan2 pan3 add1 Household
1 2 3 4  1    
5 6 7 8  1
. . . .  2
1 2 7 8  1
1 5 9 12 1
13 14 15 16 3
12 18 19 . 1

Super Contributor
Posts: 5,768

Giving Unique Number

You've posted this twice but haven't provided enough information for one to understand what you are trying to do.  You seem to want to increase Household by 1 each time something does or doesn't happen, but I wasn't able to understand what that event might be.

Also, you will have to clearly indicate if your actual data are just going to be numbers, like in your example and, if so, how large those numbers can be.

Frequent Contributor
Posts: 137

Giving Unique Number

I want the Household id for the if  there is a observation 123 in pan1 and if this observations again accor in pan2 at any place and if in pan3 and add1 then these all should get one Household id not only that

pan1 pan2 pan3 add1

123   456   789 345

567 123  098 067

In this example if 123 is repeating on pan1 and pan2 then these two get one H_id not only that the number corresponding to

123 if occuring any where like(456 789 345 in obs1 and 567 098 067 in obs 2) like that they should

get same H_id...

Esteemed Advisor
Esteemed Advisor
Posts: 5,380

Giving Unique Number

It sounds like you want to split your ids into connected subgraphs.  If you have SAS/OR licensed I have had success using PROC NETDRAW to do this.  Otherwise you can code your own logic.

I would start by converting your data from the current four variables to two variables and populate it with pairs of values from your four variables.

data relations;

  set test;

  array x pan1-pan3 add1 ;

  do i=1 to dim(x)-1;

     do j=i+1 to dim(x);

       id1=x(i); id2=x(j); output;

       id1=x(j); id2=x(i); output;

     end;

  end;

  keep id1 id2 ;

run;

proc sort data=relations nodupkey;

  by id1 id2;

run;

You can grow your own subnets by looping and adding in all of the connected ids from the current list until the current subnet stops growing.

Super User
Posts: 8,527

Re: Giving Unique Number

Still not understand what is your mean totally.

How about:

data test;
input pan1 pan2 pan3 add1 ;
cards;
1 2 3 4
5 6 7 8
. . . .
1 2 7 8
1 5 9 12
13 14 15 16
12 18 19 .
;
run;

data want(keep=pan1 pan2 pan3 add1 household);
 set test;
 retain first;
 count+1;
 _count=0;
 do i=1 to _nobs;
  set test(rename=(pan1=_pan1 pan2=_pan2 pan3=_pan3 add1=_add1)) nobs=_nobs point=i;
  _count+1;
  if count ne _count and
   ((pan1=_pan1 and not missing(_pan1))  or  (pan1=_pan2 and not missing(_pan2))  or (pan1=_pan3 and not missing(_pan3)) or (pan1=_add1 and not missing(_add1)) or 
    (pan2=_pan1 and not missing(_pan1))  or  (pan2=_pan2 and not missing(_pan2))  or (pan2=_pan3 and not missing(_pan3)) or (pan2=_add1 and not missing(_add1)) or 
    (pan3=_pan1 and not missing(_pan1))  or  (pan3=_pan2 and not missing(_pan2))  or (pan3=_pan3 and not missing(_pan3)) or (pan3=_add1 and not missing(_add1)) or 
    (add1=_pan1 and not missing(_pan1))  or  (add1=_pan2 and not missing(_pan2))  or (add1=_pan3 and not missing(_pan3)) or (add1=_add1 and not missing(_add1))  )
    then do; _n+1;
           if _n eq 1 then do;
                             n+1; first=n;
                             household=n;
                           end;
             else household=first;
           found=1;leave;
          end;
 end;
 if not found then do; n+1; household=n; end;
run;

Ksharp

Message was edited by: xia keshan

Post a Question
Discussion Stats
  • 4 replies
  • 106 views
  • 0 likes
  • 4 in conversation