Tita
Fluorite | Level 6

I am trying to remove duplicate HHID rows so that the data set keeps one row per household, not one row per household member. I can't just blindly delete the duplicates, though. I need a condition: if person #1 is the financial respondent of the family, delete the other person; if not, delete person #1. How can I remove duplicates with an embedded IF statement?

Thanks,

Tita

Steelers_In_DC
Barite | Level 11

It's hard to guess what your data looks like without an example. You'd be better off posting a small sample of what you have and another of what you want. Something like this:

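data have;
  input hhid $ person $ head;   /* head is numeric: 1 = responsible person, 0 = other */
  cards;
01 lammar 0
01 sally 0
01 renoldo 1
02 Judy 1
02 Leroy 0
02 Jane 0
;
run;

/* Split the rows: responsible persons go to RESPONSIBLE, everyone else to NOT */
data responsible not;
  set have;
  if head = 1 then output responsible;
  else output not;
run;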
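To go one step further and keep a single row per household, one option is to sort so that the flagged person comes first within each HHID and then keep only the first row. This is a minimal sketch, not a definitive solution. It assumes head is read as a numeric 0/1 flag. The ORDER helper variable, household 03, and the fallback to the first listed person when nobody is flagged are assumptions added for illustration.

```sas
/* Hypothetical sample data: head is numeric, 1 = financial respondent */
data have;
  input hhid $ person $ head;
  order = _n_;   /* remember the original row order for the fallback */
  datalines;
01 lammar 0
01 sally 0
01 renoldo 1
02 Judy 1
02 Leroy 0
03 Ann 0
03 Bob 0
;
run;

/* Within each household, put the head=1 row first, then original order */
proc sort data=have out=sorted;
  by hhid descending head order;
run;

/* Keep only the first row of each household */
data one_per_hh;
  set sorted;
  by hhid;
  if first.hhid;
  drop order;
run;
```

Because the choice is made by sort order rather than by a chain of IF conditions, every household contributes exactly one row, even one where no person is flagged.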

Tita
Fluorite | Level 6

Hi Mark, Thanks for the response. I have the following dataset:

HHID  RespID  SpouseID  Couple  Respondent(Fin resp)  Spouse(Fin Resp)  (and 600 other variables)
20    2020    2010      1       1                     0                 ...
20    2010    2020      1       0                     1                 ...
10    1010    0         0       1                     0                 ...
30    3010    3020      1       1                     0                 ...
30    3020    3010      1       0                     1                 ...

I want to have the following:

HHID  RespID  SpouseID  Couple  Respondent(Fin resp)  Spouse(Fin Resp)  (and 600 other variables)
20    2020    2010      1       1                     0                 ...
10    1010    0         0       1                     0                 ...
30    3010    3020      1       1                     0                 ...

I want to keep only one row per household; for a couple, that row should be the financial respondent's. So I will have two conditions:

if Couple=0, the row is not a duplicate, so leave it as is;

else if Couple=1 and Respondent(Fin resp)=1, then delete the spouse's row (the duplicate one);

run;

Steelers_In_DC
Barite | Level 11

Here is a solution for you:

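/* Allows name literals with spaces and parentheses (needed when VALIDVARNAME=V7 is in effect) */
options validvarname=any;

data have;
  input HHID RespID SpouseID Couple 'Respondent(Fin resp)'n 'Spouse(Fin Resp)'n;
  cards;
20       2020      2010         1                 1                               0
20       2010      2020         1                 0                               1
10       1010      0            0                 1                               0
30       3010      3020         1                 1                               0
30       3020      3010         1                 0                               1
;
run;

/* Keep singles and, for couples, the financial respondent; everything else goes to REST */
data want rest;
  set have;
  if couple = 0 then output want;
  else if couple = 1 and 'Respondent(Fin resp)'n = 1 then output want;
  else output rest;
run;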
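This filter leaves exactly one row per household only if every coupled household has exactly one row flagged as the financial respondent. A quick way to check that on the real data could look like the sketch below. It uses a hypothetical short variable name, FinResp, in place of 'Respondent(Fin resp)'n so that it runs without name literals, and the ROWS_KEPT column name is also made up for illustration.

```sas
/* Hypothetical sample data; FinResp stands in for 'Respondent(Fin resp)'n */
data have;
  input HHID RespID SpouseID Couple FinResp;
  datalines;
20 2020 2010 1 1
20 2010 2020 1 0
10 1010 0 0 1
30 3010 3020 1 1
30 3020 3010 1 0
;
run;

/* Same rule as above: keep singles and coupled financial respondents */
data want rest;
  set have;
  if Couple = 0 then output want;
  else if Couple = 1 and FinResp = 1 then output want;
  else output rest;
run;

/* List households that ended up with zero rows or several rows in WANT */
proc sql;
  select h.HHID, count(w.HHID) as rows_kept
  from (select distinct HHID from have) as h
       left join want as w
       on h.HHID = w.HHID
  group by h.HHID
  having count(w.HHID) ne 1;
quit;
```

If the query lists no households, every HHID appears exactly once in WANT. Any HHID it does list either lost all of its rows (no one flagged) or kept more than one (several people flagged) and needs a closer look.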

Go Pirates

