BookmarkSubscribeRSS Feed
HarryB
Obsidian | Level 7

Hi,

 

I would truly appreciate anyone's help on the following issue.

 

I would like to flag if two or more different clients share the same household ID. In other words, I would like to know who is with their family.

 

My data structure:

 

ClientID          HouseholdID

9804                     4182

10230                   4461

10230                  4461

10230                   4461

10230                   4461

10230                  4461

10230                 4461

10230                 4461

10230                 4461

190928                120137

190928                 120137

190928                120137

190928               120137

190928                120137

193909                120137

114631                 77320

114631                  77320

114631                 77320

186869                  77320

114631                 77320

186869                    77320

114631                 77320

114631                 77320

11635                   5390

11635                  5390

11635                   5390

11635                     5390

 

 

Want:

ClientID          HouseholdID   Flag

9804       4182       0

10230    4461       0

10230    4461       0

10230    4461       0

10230    4461       0

10230    4461       0

10230    4461       0

10230    4461       0

10230    4461       0

190928  120137  1

190928  120137  1

190928  120137  1

190928  120137  1

190928  120137  1

193909  120137  1

114631  77320    1

114631  77320    1

114631  77320    1

186869  77320    1

114631  77320    1

186869  77320    1

114631  77320    1

114631  77320    1

11635    5390       0

11635    5390       0

11635    5390       0

11635    5390       0

 

Thank you in advance!

5 REPLIES 5
ballardw
Super User

Are you sure you only want a single value? The flag as shown doesn't differentiate between the 120137 and 77320 houses. How do you actually expect to use that flag value?

HarryB
Obsidian | Level 7
The flag value would be a binary variable that would say whether the client has a household (family). If one household ID is not shared by at least two clients, that means the client is not with the family. For example, the Household ID 4461 is only associated with one client 10230, which means this client is by him/herself.
Hope, this is clear. Thanks
Astounding
PROC Star

One approach:

 

data want;
   flag = 0;
   do until (last.HouseholdID);
      set have;
      by HouseholdID notsorted;
      prior_client = lag(clientID);
      if first.HouseholdID = 0 and prior_client ne clientID then flag=1;
   end;
   do until (last.HouseholdID);
      set have;
      by HouseholdID notsorted;
      output;
   end;
   drop prior_client;
run;

It looks right, but hasn't been tested.  You have the data, so that part is up to you.

FreelanceReinh
Jade | Level 19

Hi @HarryB,

 

Here's another approach:

proc sql;
create table want as
select *, (count(distinct ClientID)>1) as Flag
from have
group by HouseholdID;
quit;

Or, if you want to preserve the original sort order (rather than sort by HouseholdID), but you don't have a suitable sort key:

data _temp / view=_temp;
set have;
seqno=_n_;
run;

proc sql;
create table want(drop=seqno) as
select *, count(distinct ClientID)>1 as Flag
from _temp
group by HouseholdID
order by seqno;
quit;
HarryB
Obsidian | Level 7

Thank you @FreelanceReinh  and @Astounding 

 

I made a minor edits on @Astounding 's code and made it work.

proc sort data=have;

by Household_ID;

run;

data want;

 Group = 0;

do until (last.Household_ID);

set have;

by Household_ID notsorted;

 

prior_client = lag(client_uid);

if not missing(household_Id) and first.Household_ID = 0 and prior_client ne client_uid then Group=1;

end;

do until (last.Household_ID);

set have;

by Household_ID notsorted;

output;

end;

drop prior_client;

run;

 

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 926 views
  • 0 likes
  • 4 in conversation