BookmarkSubscribeRSS Feed
art297
Opal | Level 21

I may have corrected the code myself, but would definitely appreciate feedback regarding whether I did it correctly.  The following code appears to accomplish what the OP had actually wanted:

data want (keep=pan1 pan2 pan3 add1 household);

  if _n_ eq 1 then do;

    declare hash ha(hashexp: 16);

    ha.definekey('key');

    ha.definedata('hhold');

    ha.definedone();

  end;

  set test;

  array _house{*} $ 40 pan1--pan3;

  do i=1 to dim(_house);

    key=_house{i};

    call missing(hhold);

    rc=ha.find();

    if rc=0 then do;

      found=1;

      household=hhold;

      leave;

    end;

  end;

  if not found then do;

    n+1;

    household=n;

  end;

  do j=1 to dim(_house);

    if not missing(_house{j}) then do;

      key=_house{j};

      hhold=household;

      ha.replace();

    end;

  end;

run;

Ksharp
Super User

Art297.

I have already written a code like you before at this forum.

And I also remener the OP is R_Win.

But your code is not what sas_Forum want.

I ran your code based on the following data, but the result is wrong.Plz see the following.

aaa   bbb    ccc     ddd   

qqq   rrr       www   aaa  

rrr     ppp    mmm lll      

uuu   zzz    ffff      ppp    

p       l        m          n     

jjjj    eee     rrr       ooo   

.    .   .   .            

sss   www  .    .            

.         .        .        eee

Your output:

pan1    pan2    pan3    add1    household

aaa    bbb    ccc    ddd    1

qqq    rrr    www    aaa    2

rrr    ppp    mmm    lll    2

uuu    zzz    ffff    ppp    3

p    l    m    n    4

jjjj    eee    rrr    ooo    2

                5

sss    www            2

            eee    6

Regards.

Ksharp

Smiley Happy

art297
Opal | Level 21

Ksharp,

I'm not sure what you are saying.  The results you showed are NOT the results of the code I posted.  I just ran the code against your data as shown below:

The results I got were:

pan1 add1 pan2 pan3    household

aaa   bbb    ccc     ddd     1

qqq   rrr       www   aaa     1

rrr     ppp    mmm lll          1

uuu   zzz    ffff      ppp       1

p       l        m          n       2

jjjj    eee     rrr       ooo      1

.    .   .   .                         3

sss   www  .    .                1

.         .        .        eee      1

As I understood sas_Forum's original request (at the top of this thread), that was precisely the desired result.

Records 1-4,6,8 and 9 all get a 1 because they all share something in common.

Record 5 gets a 2 because all of its values are unique as compared to the previous values;

And, finally, record 7 gets a 3 because all of its values were missing.

I'm not sure what you ran, but the above is what I got using the code that I posted (and shown again below).

Art

p.s. Congratulations on attaining the fourth highest point total on the forum.

data test;

  input (pan1 add1 pan2 pan3) ($);

  cards;

aaa   bbb    ccc     ddd  

qqq   rrr       www   aaa 

rrr     ppp    mmm lll     

uuu   zzz    ffff      ppp   

p       l        m          n    

jjjj    eee     rrr       ooo  

.    .   .   .           

sss   www  .    .           

.         .        .        eee

data want (keep=pan1 pan2 pan3 add1 household);

  if _n_ eq 1 then do;

    declare hash ha(hashexp: 16);

    ha.definekey('key');

    ha.definedata('hhold');

    ha.definedone();

  end;

  set test;

  array _house{*} $ 40 pan1--pan3;

  do i=1 to dim(_house);

    key=_house{i};

    call missing(hhold);

    rc=ha.find();

    if rc=0 then do;

      found=1;

      household=hhold;

      leave;

    end;

  end;

  if not found then do;

    n+1;

    household=n;

  end;

  do j=1 to dim(_house);

    if not missing(_house{j}) then do;

      key=_house{j};

      hhold=household;

      ha.replace();

    end;

  end;

run;

sas_Forum
Calcite | Level 5

Art thqs it working but u did not added add1 in the array ,but that may create a probelm i hope

Ksharp
Super User

Art.T already add add1 into array.

Check  whether the code is what your need.

sas_Forum
Calcite | Level 5

ksharp i did not find the add1 added in the array  ..

Ksharp
Super User

pan1 -- pan3

It is a range, already include add1,since in the dataset order of variable is pan1 add1 pan2 pan3.

Actually I wrote this code before ,at the R_Win post.

http://communities.sas.com/thread/30690

Ksharp

sas_Forum
Calcite | Level 5

ya i checked it Thqs Ksharp and art for your Help...

art297
Opal | Level 21

Ksharp,

I didn't check it line by line but, yes, the code is either VERY similar if not exactly the same as the code you suggested to the other thread.  The code you suggested in the current thread, though, assigned a "1" for household each time a match was found.

Did you ever discover why you thought that the code I presented didn't work correctly?

Ksharp
Super User

Because I test the data is:

aaa   bbb    ccc     ddd 

. . . .

qqq   rrr       www   aaa

. . . .

not like:

aaa   bbb    ccc     ddd  

qqq   rrr       www   aaa 

data copied  have a missing line between every two obs.

Art297.

I have a question. Consider the following situation.

pan1 pan2 pan3 add1  household

1         2         3       4           1

5         6         7        8           2

10        2         7      80        ?

How do you count the third obs?

in third obs, 2 is in obs1, however 7 is in obs2 .

Ksharp

sas_Forum
Calcite | Level 5

HI Ksharp  i have tested art code e with 8 lakh obs but the output is getting wrong the pan1,add1,pan2,pan3 is moving to 2-11 differnt household id ideally one pan should be in one Household id only it should not move to anothe household id..

for exx pan1 --xyz  

           pan2--abc

           pan3 --lpo

In the above exaple xyz is having Household id like 200,678,12,34,89  like that...

Ksharp
Super User

Your origin data not include pan4,

And what is your logic? Your explaination is ambiguous.

Can you post some more data and what output you would like.

Give some more sample data will be more helpful.

Ksharp

sas_Forum
Calcite | Level 5

sorry in place of it pan 4 is nothing but add1 i have renamed it ,ok finally in pan1,add1,pan2 and pan3 but regaring data i can not post that much i am having huge data i have posted 100 records ,but my logic is if any pan is in one household it should be in that household only it should not move to another household id's ..

sas_Forum
Calcite | Level 5

Art i have tested the code but for pan1 is fine it was not going in to two houshold ids but add1,pan2 and pan3 are going to two or three household id ...ex pan2-aaa the household should be single it should not have two or more household ids it is having the Household is like --23,45,999..

please can u help me in the remaining for the add1,pan2 and pan3 it should not go to more than  one householdids..

art297
Opal | Level 21

Ksharp,

I tested it with missing values in every other record and still appear to get the correct results.

The code was not designed to account for crossover household members.  Doing so would require knowing the maximum number of potential crossovers.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 124 replies
  • 3699 views
  • 4 likes
  • 7 in conversation